arviz.ColumnDataSource
- arviz.ColumnDataSource(*args, **kwargs)
Wrap bokeh.models.ColumnDataSource.
Maps names of columns to sequences or arrays.
The ColumnDataSource is a fundamental data structure of Bokeh. Most plots, data tables, etc. will be driven by a ColumnDataSource.
If the ColumnDataSource initializer is called with a single argument, that argument can be any of the following:
A Python dict that maps string names to sequences of values, e.g. lists, arrays, etc.
    import numpy as np
    from bokeh.models import ColumnDataSource
    data = {'x': [1,2,3,4], 'y': np.array([10.0, 20.0, 30.0, 40.0])}
    source = ColumnDataSource(data)
Note
ColumnDataSource only creates a shallow copy of data. Use e.g. ColumnDataSource(copy.deepcopy(data)) if initializing from another ColumnDataSource.data object that you want to keep independent.
A Pandas DataFrame object
    source = ColumnDataSource(df)
In this case the CDS will have columns corresponding to the columns of the DataFrame. If the DataFrame columns have multiple levels, they will be flattened using an underscore (e.g. level_0_col_level_1_col). The index of the DataFrame will be flattened to an Index of tuples if it’s a MultiIndex, and then reset using reset_index. The result will be a column with the same name if the index was named, or level_0_name_level_1_name if it was a named MultiIndex. If the Index did not have a name or the MultiIndex name could not be flattened/determined, the reset_index function will name the index column 'index', or 'level_0' if the name 'index' is not available.
A Pandas GroupBy object
    group = df.groupby(['colA', 'ColB'])
    source = ColumnDataSource(group)
In this case the CDS will have columns corresponding to the result of calling group.describe(). The describe method generates columns for statistical measures such as mean and count for all the non-grouped original columns. The CDS columns are formed by joining original column names with the computed measure. For example, if a DataFrame has columns 'year' and 'mpg', then passing df.groupby('year') to a CDS will result in columns such as 'mpg_mean'.
If the GroupBy.describe result has a named index column, then the CDS will also have a column with this name. However, if the index name (or any subname of a MultiIndex) is None, then the CDS will have a column generically named 'index' for the index.
Note that this capability to adapt GroupBy objects may only work with Pandas >= 0.20.0.
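The GroupBy column naming described above can be previewed with pandas alone. The following is a minimal sketch, not Bokeh's actual implementation: it assumes a small made-up DataFrame with the 'year' and 'mpg' columns from the example, and it joins the two-level labels produced by describe() with an underscore to mimic the documented names. The variable names (described, flat_names, index_name) are illustrative only.

```python
import pandas as pd

# Hypothetical data using the 'year' and 'mpg' columns from the example above.
df = pd.DataFrame(
    {"year": [2000, 2000, 2001, 2001], "mpg": [20.0, 24.0, 30.0, 34.0]}
)

# describe() computes summary statistics for every non-grouped column.
described = df.groupby("year").describe()

# The result has two-level column labels such as ('mpg', 'mean');
# joining the levels with an underscore mimics the documented CDS names.
flat_names = ["_".join(label) for label in described.columns]

# The grouping column becomes the named index of the describe() result,
# which is the name the CDS is documented to reuse for its index column.
index_name = described.index.name

print(flat_names)
print(index_name)
```

Inspecting flat_names before building a data source is a cheap way to find out which column names a glyph or data table should refer to.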
Note
There is an implicit assumption that all the columns in a given ColumnDataSource have the same length at all times. For this reason, it is usually preferable to update the .data property of a data source “all at once”.
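The equal-length assumption can be illustrated without Bokeh. In the sketch below a plain dict stands in for the .data property, and check_equal_lengths is a hypothetical helper written for this example, not part of Bokeh or ArviZ. It contrasts growing the columns one at a time, which passes through a state with mismatched lengths, with building the complete mapping first and swapping it in with a single assignment.

```python
def check_equal_lengths(columns):
    """Return True when every column holds the same number of values."""
    lengths = {len(values) for values in columns.values()}
    return len(lengths) <= 1


# A plain dict standing in for source.data (assumption for illustration).
data = {"x": [1, 2, 3], "y": [10.0, 20.0, 30.0]}

# Column-by-column update: after the first assignment the lengths disagree.
piecewise = dict(data)
piecewise["x"] = data["x"] + [4]
mismatch_midway = not check_equal_lengths(piecewise)
piecewise["y"] = data["y"] + [40.0]

# All-at-once update: build the full mapping, validate it, then replace
# the old one in a single assignment.
new_data = {"x": data["x"] + [4], "y": data["y"] + [40.0]}
if not check_equal_lengths(new_data):
    raise ValueError("columns must all have the same length")
data = new_data  # with a real data source: source.data = new_data
```

With an actual data source the last step would be the single assignment source.data = new_data, so the source never holds columns of different lengths.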