tascCODA - Tree-aggregated compositional analysis

tascCODA - Tree-aggregated compositional analysis#

This notebook is a tutorial on how to use tascCODA Ostner et al., 2021 for tree-aggregated compositional analysis of high-throughput sequencing (HTS) data.

For this example, we use single-cell RNA sequencing data [OCM21]. However, there are no limitations to use tascCODA with other HTS data, such as 16S rRNA sequencing.

The particular dataset for this analysis was generated by Smillie et al., 2019. It contains samples from two different regions in the small intestine of mice - Epithelium and Lamina Propria - and three different inflammation conditions - healthy, non-inflamed and inflamed. In total, we have 365.492 cells from 51 cell types in 133 samples.

import matplotlib.pyplot as plt
import pertpy as pt

Dataset#

First, we read in the per-cell data. In this dataset, samples are specified as “Subject”, and cell types are denoted as “Cluster”. Interesting covariates are the samples’ location and health status. The columns “Major_l1” - “Major_l4” and “Cluster” describe a lineage tree over the cell types.

smillie_counts = pt.dt.smillie_2019()
smillie_counts.obs

	Origin	Subject	Sample	Location	Replicate	Health	Cluster	nGene	nUMI	Major_l1	Major_l2	Major_l3	Major_l4
N7.EpiA.AAGGCTACCCTTTA	Imm	N7	EpiA	Epi	A	Non-inflamed	Plasma	624.0	7433.0	Immune	Lymphoid	B cells	Plasma4
N7.EpiA.AAGGTGCTACGGAG	Imm	N7	EpiA	Epi	A	Non-inflamed	CD8+ IELs	558.0	1904.0	Immune	Lymphoid	T cells	CD8+ T
N7.EpiA.AAGTAACTTGCTTT	Imm	N7	EpiA	Epi	A	Non-inflamed	CD8+ IELs	437.0	1366.0	Immune	Lymphoid	T cells	CD8+ T
N7.EpiA.ACAATAACCCTCAC	Imm	N7	EpiA	Epi	A	Non-inflamed	Plasma	484.0	5161.0	Immune	Lymphoid	B cells	Plasma4
N7.EpiA.ACAGTTCTTCTACT	Imm	N7	EpiA	Epi	A	Non-inflamed	CD8+ IELs	470.0	1408.0	Immune	Lymphoid	T cells	CD8+ T
...	...	...	...	...	...	...	...	...	...	...	...	...	...
N110.LPB.TTTGGTTGTGTGGCTC	Epi	N110	LPB	LP	B	Inflamed	Immature Enterocytes 2	2553.0	11705.0	Epithelial	Epithelial	Absorptive	Immature cells
N110.LPB.TTTGGTTTCCTTAATC	Epi	N110	LPB	LP	B	Inflamed	TA 2	3234.0	16164.0	Epithelial	Epithelial	Absorptive	TA cells
N110.LPB.TTTGGTTTCTTACCTA	Epi	N110	LPB	LP	B	Inflamed	Enterocyte Progenitors	258.0	384.0	Epithelial	Epithelial	Absorptive	Immature cells
N110.LPB.TTTGTCAAGGATGGAA	Epi	N110	LPB	LP	B	Inflamed	TA 1	487.0	772.0	Epithelial	Epithelial	Absorptive	TA cells
N110.LPB.TTTGTCAGTTGTTTGG	Epi	N110	LPB	LP	B	Inflamed	TA 1	363.0	747.0	Epithelial	Epithelial	Absorptive	TA cells

365492 rows × 13 columns

smillie_counts

AnnData object with n_obs × n_vars = 365492 × 18172
    obs: 'Origin', 'Subject', 'Sample', 'Location', 'Replicate', 'Health', 'Cluster', 'nGene', 'nUMI', 'Major_l1', 'Major_l2', 'Major_l3', 'Major_l4'

Next, we convert the data to a (samples x cell types) object. To identify the statistical samples, we need to combine the “Subject” and “Sample” columns. We also need to extract the tree-structured cell lineage information (Lineages Major_l1 - Major_l4) from the data, generate a Newick string through tasccoda.tree_utils.df2newick() and convert it into a ete3 tree object. This is all done within the load function by specifying.

tasccoda_model = pt.tl.Tasccoda()
smillie_data = tasccoda_model.load(
    smillie_counts,
    type="cell_level",
    cell_type_identifier="Cluster",
    sample_identifier=["Subject", "Sample"],
    covariate_obs=["Location", "Health"],
    levels_orig=["Major_l1", "Major_l2", "Major_l3", "Major_l4", "Cluster"],
    add_level_name=True,
)
smillie_data

MuData object with n_obs × n_vars = 365625 × 18223
  2 modalities
    rna:	365492 x 18172
      obs:	'Origin', 'Subject', 'Sample', 'Location', 'Replicate', 'Health', 'Cluster', 'nGene', 'nUMI', 'Major_l1', 'Major_l2', 'Major_l3', 'Major_l4', 'scCODA_sample_id'
    coda:	133 x 51
      obs:	'Sample', 'Location', 'Health', 'Subject'
      var:	'n_cells'
      uns:	'tree'

We can also visualize the tree:

tasccoda_model.plot_draw_tree(smillie_data["coda"])

../../_images/ce60af58bd92c0be068cfafead5b5bcf16e53bc10be26819b16baab5467d7525.png

For this tutorial, we focus on finding changes between healthy and non-inflamed tissue in the Lamina Propria. Therefore, we subset the data accordingly, and end up with 48 samples:

smillie_data.mod["coda_LP"] = smillie_data["coda"][
    (smillie_data["coda"].obs["Health"].isin(["Healthy", "Non-inflamed"]))
    & (smillie_data["coda"].obs["Location"] == "LP")
]
smillie_data

MuData object with n_obs × n_vars = 365625 × 18223
  3 modalities
    rna:	365492 x 18172
      obs:	'Origin', 'Subject', 'Sample', 'Location', 'Replicate', 'Health', 'Cluster', 'nGene', 'nUMI', 'Major_l1', 'Major_l2', 'Major_l3', 'Major_l4', 'scCODA_sample_id'
    coda:	133 x 51
      obs:	'Sample', 'Location', 'Health', 'Subject'
      var:	'n_cells'
      uns:	'tree'
    coda_LP:	48 x 51
      obs:	'Sample', 'Location', 'Health', 'Subject'
      var:	'n_cells'
      uns:	'tree'

A quick plot of the data with the sccoda.util.data_visualization module shows us that there are some changes in relative abundance in the data:

tasccoda_model.plot_boxplots(smillie_data, modality_key="coda_LP", feature_name="Health", figsize=(20, 8))
plt.show()

../../_images/e10a4e42b7284693afded691d47f18b4696964a59e0105daca91cb9346a25fc8.png

Run tascCODA#

Inferring credible effects with tascCODA on our data is now simply a matter of running the sampling process. The model creation and inference works analogous to scCODA (see the scCODA quickstart tutorial)

As hyperparameters, we have to specify in the prepare function:

The reference cell type (which is assumed to be unchanged between the conditions): We use the automatic setting here, which chooses a reference that induces minimal compositional effects (here we use the automatic suggestion - NK cells)
The model formula (R-style formula string, just as in scCODA)
The aggregation bias \(\phi\), defined in the tascCODA paper. We go with an unbiased aggregation (pen_args={"phi": 0}) here.
The key in .uns of our MuData modality where the tree is saved.

Then, we run No-U-turn sampling via run_nuts with the default settings of 11.000 samples, of which 1.000 are discarded as burn-in

smillie_data = tasccoda_model.prepare(
    smillie_data,
    modality_key="coda_LP",
    tree_key="tree",
    reference_cell_type="automatic",
    formula="Health",
    pen_args={"phi": 0},
)
smillie_data

• Automatic reference selection! Reference cell type set to NKs
• Zero counts encountered in data! Added a pseudocount of 0.5.

/home/lukas/miniforge3/envs/pertpy/lib/python3.12/site-packages/anndata/_core/anndata.py:618: FutureWarning: You are attempting to set `X` to a matrix on a view which has non-unique indices. The resulting `adata.X` will likely not equal the value to which you set it. To avoid this potential issue, please make a copy of the data first. In the future, this operation will throw an error.
  warnings.warn(msg, FutureWarning, stacklevel=1)
/home/lukas/code/pertpy/pertpy/tools/_coda/_base_coda.py:123: ImplicitModificationWarning: Modifying `X` on a view results in data being overridden
  sample_adata.X = sample_adata.X.astype(dtype)
/home/lukas/code/pertpy/pertpy/tools/_coda/_base_coda.py:130: ImplicitModificationWarning: Setting element `.obsm['covariate_matrix']` of view, initializing view as actual.
  sample_adata.obsm["covariate_matrix"] = np.array(covariate_matrix[:, 1:]).astype(dtype)

MuData object with n_obs × n_vars = 365625 × 18223
  3 modalities
    rna:	365492 x 18172
      obs:	'Origin', 'Subject', 'Sample', 'Location', 'Replicate', 'Health', 'Cluster', 'nGene', 'nUMI', 'Major_l1', 'Major_l2', 'Major_l3', 'Major_l4', 'scCODA_sample_id'
    coda:	133 x 51
      obs:	'Sample', 'Location', 'Health', 'Subject'
      var:	'n_cells'
      uns:	'tree'
    coda_LP:	48 x 51
      obs:	'Sample', 'Location', 'Health', 'Subject'
      var:	'n_cells'
      uns:	'tree', 'scCODA_params'
      obsm:	'covariate_matrix', 'sample_counts'

tasccoda_model.run_nuts(smillie_data, modality_key="coda_LP")

sample: 100%|██████████| 11000/11000 [02:48<00:00, 65.43it/s, 63 steps of size 7.85e-02. acc. prob=0.91]

Result analysis#

Calling summary, we can see the most relevant information for further analysis:

tasccoda_model.summary(smillie_data, modality_key="coda_LP")

                                          Compositional Analysis summary                                           
┌────────────────────────────────────────────┬────────────────────────────────────────────────────────────────────┐
│ Name                                       │ Value                                                              │
├────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────┤
│ Data                                       │ Data: 48 samples, 51 cell types                                    │
│ Reference cell type                        │ NKs                                                                │
│ Formula                                    │ Health                                                             │
└────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Intercepts                                                                                                      │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│                           Final Parameter  Expected Sample                                                      │
│ Cluster                                                                                                         │
│ Best4+ Enterocytes            -1.182            10.358                                                          │
│ CD4+ Activated Fos-hi          1.560           160.732                                                          │
│ CD4+ Activated Fos-lo          1.609           168.804                                                          │
│ CD4+ Memory                    1.319           126.310                                                          │
│ CD4+ PD1+                     -0.932            13.300                                                          │
│ CD8+ IELs                     -0.112            30.197                                                          │
│ CD8+ IL17+                    -0.803            15.131                                                          │
│ CD8+ LP                        1.602           167.626                                                          │
│ CD69+ Mast                     0.590            60.931                                                          │
│ CD69- Mast                    -1.357             8.695                                                          │
│ Cycling B                     -0.478            20.942                                                          │
│ Cycling Monocytes             -0.565            19.197                                                          │
│ Cycling T                     -0.702            16.739                                                          │
│ Cycling TA                     0.089            36.919                                                          │
│ DC1                           -0.603            18.481                                                          │
│ DC2                            0.353            48.074                                                          │
│ Endothelial                    0.120            38.082                                                          │
│ Enterocyte Progenitors        -0.906            13.650                                                          │
│ Enterocytes                   -1.223             9.942                                                          │
│ Enteroendocrine               -1.264             9.542                                                          │
│ Follicular                     0.830            77.458                                                          │
│ GC                            -0.912            13.568                                                          │
│ Glia                          -0.104            30.439                                                          │
│ Goblet                        -1.218             9.992                                                          │
│ ILCs                          -0.846            14.494                                                          │
│ Immature Enterocytes 1        -1.065            11.643                                                          │
│ Immature Enterocytes 2        -0.826            14.787                                                          │
│ Immature Goblet               -0.329            24.306                                                          │
│ Inflammatory Fibroblasts      -0.970            12.804                                                          │
│ Inflammatory Monocytes        -0.538            19.722                                                          │
│ M cells                       -1.401             8.321                                                          │
│ MT-hi                         -0.716            16.506                                                          │
│ Macrophages                    1.571           162.510                                                          │
│ Microvascular                 -0.299            25.047                                                          │
│ Myofibroblasts                -0.015            33.273                                                          │
│ NKs                           -0.390            22.868                                                          │
│ Pericytes                     -0.684            17.043                                                          │
│ Plasma                         3.579          1210.437                                                          │
│ Post-capillary Venules         0.009            34.081                                                          │
│ RSPO3+                        -0.616            18.242                                                          │
│ Secretory TA                  -0.914            13.541                                                          │
│ Stem                          -0.800            15.176                                                          │
│ TA 1                          -0.016            33.239                                                          │
│ TA 2                          -0.174            28.381                                                          │
│ Tregs                          0.684            66.936                                                          │
│ Tuft                          -1.443             7.978                                                          │
│ WNT2B+ Fos-hi                  0.738            70.650                                                          │
│ WNT2B+ Fos-lo 1                1.117           103.207                                                          │
│ WNT2B+ Fos-lo 2                0.553            58.717                                                          │
│ WNT5B+ 1                       0.382            49.488                                                          │
│ WNT5B+ 2                       0.783            73.902                                                          │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Effects                                                                                                         │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│                                                Effect  Expected Sample  log2-fold change                        │
│ Covariate            Cell Type                                                                                  │
│ HealthT.Non-inflamed Best4+ Enterocytes         0.000      19.415             0.906                             │
│                      CD4+ Activated Fos-hi     -0.676     153.247            -0.069                             │
│                      CD4+ Activated Fos-lo     -1.031     112.849            -0.581                             │
│                      CD4+ Memory               -0.676     120.428            -0.069                             │
│                      CD4+ PD1+                 -0.676      12.680            -0.069                             │
│                      CD8+ IELs                 -0.676      28.791            -0.069                             │
│                      CD8+ IL17+                -0.676      14.426            -0.069                             │
│                      CD8+ LP                   -0.676     159.820            -0.069                             │
│                      CD69+ Mast                -0.252      88.770             0.543                             │
│                      CD69- Mast                -0.252      12.668             0.543                             │
│                      Cycling B                 -0.707      19.357            -0.114                             │
│                      Cycling Monocytes         -0.607      19.610             0.031                             │
│                      Cycling T                 -0.676      15.959            -0.069                             │
│                      Cycling TA                 0.000      69.203             0.906                             │
│                      DC1                       -0.607      18.879             0.031                             │
│                      DC2                       -0.607      49.109             0.031                             │
│                      Endothelial               -0.365      49.553             0.380                             │
│                      Enterocyte Progenitors     0.000      25.586             0.906                             │
│                      Enterocytes                0.000      18.635             0.906                             │
│                      Enteroendocrine            0.000      17.887             0.906                             │
│                      Follicular                -0.707      71.597            -0.114                             │
│                      GC                        -0.707      12.542            -0.114                             │
│                      Glia                      -0.365      39.609             0.380                             │
│                      Goblet                     0.000      18.729             0.906                             │
│                      ILCs                       0.000      27.168             0.906                             │
│                      Immature Enterocytes 1     0.000      21.825             0.906                             │
│                      Immature Enterocytes 2     0.000      27.717             0.906                             │
│                      Immature Goblet            0.000      45.561             0.906                             │
│                      Inflammatory Fibroblasts  -0.365      16.661             0.380                             │
│                      Inflammatory Monocytes    -0.607      20.147             0.031                             │
│                      M cells                    0.000      15.597             0.906                             │
│                      MT-hi                     -0.676      15.738            -0.069                             │
│                      Macrophages               -0.607     166.010             0.031                             │
│                      Microvascular             -0.365      32.591             0.380                             │
│                      Myofibroblasts            -0.365      43.295             0.380                             │
│                      NKs                        0.000      42.865             0.906                             │
│                      Pericytes                 -0.365      22.177             0.380                             │
│                      Plasma                    -1.043     799.550            -0.598                             │
│                      Post-capillary Venules    -0.365      44.347             0.380                             │
│                      RSPO3+                    -0.365      23.737             0.380                             │
│                      Secretory TA               0.000      25.382             0.906                             │
│                      Stem                       0.000      28.447             0.906                             │
│                      TA 1                       0.340      87.536             1.397                             │
│                      TA 2                       0.340      74.742             1.397                             │
│                      Tregs                     -0.676      63.819            -0.069                             │
│                      Tuft                       0.000      14.955             0.906                             │
│                      WNT2B+ Fos-hi             -0.365      91.932             0.380                             │
│                      WNT2B+ Fos-lo 1           -0.365     134.296             0.380                             │
│                      WNT2B+ Fos-lo 2           -0.365      76.405             0.380                             │
│                      WNT5B+ 1                  -0.365      64.396             0.380                             │
│                      WNT5B+ 2                  -0.365      96.163             0.380                             │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Nodes                                                                                                           │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Covariate=Health[T.Non-inflamed]_node                                                                           │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│                                   Final Parameter  Is credible                                                  │
│ Node                                                                                                            │
│ Major_l1_Immune                        0.00           False                                                     │
│ Major_l2_Stromal                      -0.36            True                                                     │
│ Major_l2_Epithelial                    0.00           False                                                     │
│ Major_l2_Lymphoid                      0.00           False                                                     │
│ Major_l2_Myeloid                      -0.25            True                                                     │
│ Major_l3_Fibroblasts                   0.00           False                                                     │
│ Major_l4_EndothelialCells              0.00           False                                                     │
│ Glia                                   0.00           False                                                     │
│ Major_l3_Absorptive                    0.00           False                                                     │
│ Major_l3_Secretory                     0.00           False                                                     │
│ Cycling TA                             0.00           False                                                     │
│ M cells                                0.00           False                                                     │
│ Stem                                   0.00           False                                                     │
│ Major_l3_B cells                      -0.71            True                                                     │
│ Major_l3_T cells                      -0.68            True                                                     │
│ ILCs                                   0.00           False                                                     │
│ NKs                                    0.00           False                                                     │
│ Major_l3_Monocytes                    -0.35            True                                                     │
│ Major_l3_Mast                          0.00           False                                                     │
│ Major_l4_WNT2B+                        0.00           False                                                     │
│ Major_l4_WNT5B+                        0.00           False                                                     │
│ Myofibroblasts                         0.00           False                                                     │
│ Inflammatory Fibroblasts               0.00           False                                                     │
│ Pericytes                              0.00           False                                                     │
│ Microvascular                          0.00           False                                                     │
│ Endothelial                            0.00           False                                                     │
│ Post-capillary Venules                 0.00           False                                                     │
│ Major_l4_TA cells                      0.34            True                                                     │
│ Major_l4_Immature cells                0.00           False                                                     │
│ Major_l4_Absorptive Mature cells       0.00           False                                                     │
│ Major_l4_Progenitor cells              0.00           False                                                     │
│ Major_l4_Secretory Mature cells        0.00           False                                                     │
│ Plasma                                -0.34            True                                                     │
│ Cycling B                              0.00           False                                                     │
│ Follicular                             0.00           False                                                     │
│ GC                                     0.00           False                                                     │
│ Major_l4_CD8+ T                        0.00           False                                                     │
│ Major_l4_CD4+ T                        0.00           False                                                     │
│ Major_l4_DCs                           0.00           False                                                     │
│ Macrophages                            0.00           False                                                     │
│ Cycling Monocytes                      0.00           False                                                     │
│ Inflammatory Monocytes                 0.00           False                                                     │
│ CD69+ Mast                             0.00           False                                                     │
│ CD69- Mast                             0.00           False                                                     │
│ WNT2B+ Fos-lo 1                        0.00           False                                                     │
│ WNT2B+ Fos-hi                          0.00           False                                                     │
│ WNT2B+ Fos-lo 2                        0.00           False                                                     │
│ RSPO3+                                 0.00           False                                                     │
│ WNT5B+ 2                               0.00           False                                                     │
│ WNT5B+ 1                               0.00           False                                                     │
│ TA 1                                   0.00           False                                                     │
│ TA 2                                   0.00           False                                                     │
│ Enterocyte Progenitors                 0.00           False                                                     │
│ Immature Enterocytes 2                 0.00           False                                                     │
│ Immature Enterocytes 1                 0.00           False                                                     │
│ Best4+ Enterocytes                     0.00           False                                                     │
│ Enterocytes                            0.00           False                                                     │
│ Immature Goblet                        0.00           False                                                     │
│ Secretory TA                           0.00           False                                                     │
│ Goblet                                 0.00           False                                                     │
│ Enteroendocrine                        0.00           False                                                     │
│ Tuft                                   0.00           False                                                     │
│ CD8+ IELs                              0.00           False                                                     │
│ Cycling T                              0.00           False                                                     │
│ CD8+ IL17+                             0.00           False                                                     │
│ CD8+ LP                                0.00           False                                                     │
│ Tregs                                  0.00           False                                                     │
│ CD4+ Activated Fos-hi                  0.00           False                                                     │
│ CD4+ Activated Fos-lo                 -0.35            True                                                     │
│ CD4+ PD1+                              0.00           False                                                     │
│ CD4+ Memory                            0.00           False                                                     │
│ MT-hi                                  0.00           False                                                     │
│ DC2                                    0.00           False                                                     │
│ DC1                                    0.00           False                                                     │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Model properties

First, the summary shows an overview over the model properties:

Number of samples/cell types
The reference cell type.
The formula used

The model has three types of parameters that are relevant for analysis - intercepts, feature-level effects and node-wise effects. These can be interpreted like in a standard regression model: Intercepts show how the cell types are distributed without any active covariates, effects show how the covariates influence the cell types.

Intercepts

The first column of the intercept summary shows the parameters determined by the MCMC inference.

The “Expected sample” column gives some context to the numerical values. If we had a new sample (with no active covariates) with a total number of cells equal to the mean sampling depth of the dataset, then this distribution over the cell types would be most likely.

Feature-level Effects

For the feature-level effect summary, the first column again shows the inferred parameters for all combinations of covariates and cell types, as sums over node-level effects on all parent nodes. Most important is the distinction between zero and non-zero entries A value of zero means that no statistically credible effect was detected. For a value other than zero, a credible change was detected. A positive sign indicates an increase, a negative sign a decrease in abundance.

Since the numerical values of the “Effect” column are not straightforward to interpret, the “Expected sample” and “log2-fold change” columns give us an idea on the magnitude of the change. The expected sample is calculated for each covariate separately (covariate value = 1, all other covariates = 0), with the same method as for the intercepts. The log-fold change is then calculated between this expected sample and the expected sample with no active covariates from the intercept section. Since the data is compositional, cell types for which no credible change was detected, will still change in abundance as well, as soon as a credible effect is detected on another cell type due to the sum-to-one constraint. If there are no credible effects for a covariate, its expected sample will be identical to the intercept sample, therefore the log2-fold change is 0.

Node-level effects

These parameters are the most important ones. They describe, at which points in the tree a credible change in abundance was detected. The data frame just has two columns: The “Final Parameter” column shows the effect values, the “Is credible” column simply depicts whether the inferred effect is different from 0, i.e. credible. In a normal tascCODA analysis, we are interested in which subtrees (i.e. nodes) have a nonzero effect, which we can easily extract:

tasccoda_model.credible_effects(smillie_data, modality_key="coda_LP")

Covariate                    Node               
Health[T.Non-inflamed]_node  Major_l1_Immune        False
                             Major_l2_Stromal        True
                             Major_l2_Epithelial    False
                             Major_l2_Lymphoid      False
                             Major_l2_Myeloid        True
                                                    ...  
                             CD4+ PD1+              False
                             CD4+ Memory            False
                             MT-hi                  False
                             DC2                    False
                             DC1                    False
Name: Final Parameter, Length: 74, dtype: bool

Interpretation

We see that most credible effects are on intermediate nodes. The only slight increase in abundance is on TA cells, while there are decreases on Stromal and some Immune cell types. The most important decreases (i.e. with the largest effect sizes) can be found on B- and T-cells. Plasma cells have an additional decrease, meaning that they change even stronger in abundance than the rest of the B-cell lineage.

We can also easily plot the credible effects as nodes on the tree for better visualization.

tasccoda_model.plot_draw_effects(
    smillie_data,
    modality_key="coda_LP",
    tree=smillie_data["coda_LP"].uns["tree"],
    covariate="Health[T.Non-inflamed]",
)

../../_images/4ef19cb0f38e0bf3f14581e8f0c5b00e1fcc5f77488661b4d31e6861ba4d1f1e.png

To better see the size of aggregated effects on the individual cell types, we can plot them at the side of the tree plot:

tasccoda_model.plot_draw_effects(
    smillie_data,
    modality_key="coda_LP",
    tree=smillie_data["coda_LP"].uns["tree"],
    covariate="Health[T.Non-inflamed]",
    show_legend=False,
    show_leaf_effects=True,
)

Wrote file: /home/lukas/code/pertpy-tutorials/tree_effect.png

/home/lukas/code/pertpy/pertpy/tools/_coda/_base_coda.py:2072: FutureWarning: 

Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `y` variable to `hue` and set `legend=False` for the same effect.

  sns.barplot(data=leaf_effs, x="Effect", y="Cell Type", palette=palette, ax=ax[1])

../../_images/6b0935cd8fa4bd5dada18efb5b6c041e15feffafa4b47fcd3bcd25931201f8df.png

tascCODA - Tree-aggregated compositional analysis

Contents

tascCODA - Tree-aggregated compositional analysis#

Dataset#

Run tascCODA#

Result analysis#