pertpy.tools.Cinemaot

class pertpy.tools.Cinemaot[source]

CINEMA-OT is a causal framework for perturbation effect analysis to identify individual treatment effects and synergy.

Methods table

attribution_scatter(adata, pert_key, control)

A simple function for computing confounder-specific genes.

causaleffect(adata, pert_key, control[, ...])

Calculate the confounding variation, optimal transport counterfactual pairs, and single-cell level treatment effects.

causaleffect_weighted(adata, pert_key, control)

The resampling variant of the CINEMA-OT algorithm, which addresses differential abundance in an unsupervised manner.

generate_pseudobulk(adata, de, pert_key, ...)

Generate a pseudobulk AnnData object that accounts for the differential response behaviors revealed by CINEMA-OT.

get_dim(adata[, c, use_rep])

Estimate the rank of the count matrix.

get_weightidx(adata, pert_key, control[, ...])

Generate resampled indices that balance covariates across treatment conditions.

synergy(adata, pert_key, base, A, B, AB[, ...])

A wrapper for computing the synergy matrices.

plot_vis_matching(adata, de, pert_key, ...)

Visualize the CINEMA-OT matching matrix.

Methods

attribution_scatter

Cinemaot.attribution_scatter(adata, pert_key, control, cf_rep='cf', use_raw=False)[source]

A simple function for computing confounder-specific genes.

Parameters:
  • adata (AnnData) – The annotated data object.

  • pert_key (str) – The column of .obs with perturbation categories, should also contain control.

  • control (str) – Control category from the pert_key column.

  • cf_rep (str) – the place to put the confounder embedding in the original adata.obsm.

  • use_raw (bool) – If true, use adata.raw.X to aggregate the pseudobulk profiles. Otherwise use adata.X.

Returns:

Returns the confounder effect (c_effect) and the residual effect (s_effect).

Examples

>>> import pertpy as pt
>>> adata = pt.dt.cinemaot_example()
>>> model = pt.tl.Cinemaot()
>>> c_effect, s_effect = model.attribution_scatter(adata, pert_key="perturbation", control="No stimulation")

causaleffect

Cinemaot.causaleffect(adata, pert_key, control, return_matching=False, cf_rep='cf', use_rep='X_pca', batch_size=None, dim=20, thres=0.15, smoothness=0.0001, rank=200, eps=0.001, solver='Sinkhorn', preweight_label=None)[source]

Calculate the confounding variation, optimal transport counterfactual pairs, and single-cell level treatment effects.

Parameters:
  • adata (AnnData) – The annotated data object.

  • pert_key (str) – The column of .obs with perturbation categories, should also contain control.

  • control (str) – Control category from the pert_key column.

  • return_matching (bool) – Whether to store the matching matrix in de.obsm['ot'] of the returned object.

  • cf_rep (str) – Key in adata.obsm under which the confounder embedding is stored.

  • use_rep (str) – Use the indicated representation. 'X' or any key for .obsm is valid.

  • batch_size (int | None) – Size of batch to calculate the optimal transport map.

  • dim (int | None) – Use the first dim components in use_rep. If None, use a biwhitening procedure on the raw count matrix to derive a reasonable rank.

  • thres (float) – the threshold for the rank-dependence metric.

  • smoothness (float) – the coefficient determining the degree of smoothing in the entropic optimal transport problem.

  • rank (int) – Only used if the solver "LRSinkhorn" is used. Specifies the rank of the transport map.

  • eps (float) – Error tolerance of the optimal transport.

  • solver (str) – Either "Sinkhorn" or "LRSinkhorn". The ott-jax solver used.

  • preweight_label (str | None) – The annotated label (e.g. cell type) that is used to assign weights for treated and control cells to balance across the label. Helps overcome the differential abundance issue.

Returns:

Returns an AnnData object that contains the single-cell level treatment effect as de.X and the corresponding low-dimensional embedding in de.obsm['X_embedding'], plus the optional matching matrix in de.obsm['ot']. Also stores the confounding variation in adata.obsm[cf_rep].

Examples

>>> import pertpy as pt
>>> adata = pt.dt.cinemaot_example()
>>> model = pt.tl.Cinemaot()
>>> out_adata = model.causaleffect(
...         adata, pert_key="perturbation", control="No stimulation", return_matching=True,
...         thres=0.5, smoothness=1e-5, eps=1e-3, solver="Sinkhorn", preweight_label="cell_type0528")
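
For intuition about the smoothness and eps parameters, the following is a minimal NumPy sketch of entropic optimal transport solved with Sinkhorn iterations. It is not pertpy's implementation, which delegates to ott-jax. The helper name sinkhorn_plan, the squared Euclidean cost, the uniform marginals, and the reading of smoothness as the entropic regularization strength and eps as a convergence tolerance are all assumptions made for illustration.

```python
import numpy as np


def sinkhorn_plan(x, y, smoothness=0.1, eps=1e-3, max_iter=10_000):
    """Toy entropic OT between point clouds x (n, d) and y (m, d).

    Hypothetical helper for illustration only; not part of pertpy.
    """
    n, m = x.shape[0], y.shape[0]
    a = np.full(n, 1.0 / n)  # uniform source marginal (assumption)
    b = np.full(m, 1.0 / m)  # uniform target marginal (assumption)

    # Squared Euclidean cost, rescaled to [0, 1] for numerical stability.
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    cost = cost / cost.max()

    kernel = np.exp(-cost / smoothness)  # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(max_iter):
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
        plan = u[:, None] * kernel * v[None, :]
        # Stop once the row marginals are within the tolerance eps.
        if np.abs(plan.sum(axis=1) - a).sum() < eps:
            break
    return plan


rng = np.random.default_rng(0)
control = rng.normal(size=(30, 5))
treated = rng.normal(loc=0.5, size=(40, 5))
plan = sinkhorn_plan(control, treated)  # soft matching, shape (30, 40)
```

In this sketch a smaller smoothness makes the plan closer to a hard one-to-one matching, while a larger value spreads each cell's mass over more counterfactual partners.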

causaleffect_weighted

Cinemaot.causaleffect_weighted(adata, pert_key, control, return_matching=False, cf_rep='cf', use_rep='X_pca', batch_size=None, k=20, dim=20, thres=0.15, smoothness=0.0001, rank=200, eps=0.001, solver='Sinkhorn', resolution=1.0)[source]

The resampling variant of the CINEMA-OT algorithm, which addresses differential abundance in an unsupervised manner.

Parameters:
  • adata (AnnData) – The annotated data object.

  • pert_key (str) – The column of .obs with perturbation categories, should also contain control.

  • control (str) – Control category from the pert_key column.

  • return_matching (bool) – Whether to store the matching matrix in de.obsm['ot'] of the returned object.

  • cf_rep (str) – Key in adata.obsm under which the confounder embedding is stored.

  • use_rep (str) – Use the indicated representation. 'X' or any key for .obsm is valid.

  • batch_size (int | None) – Size of batch to calculate the optimal transport map.

  • k (int) – the number of neighbors used in the k-NN matching phase.

  • dim (int | None) – Use the first dim components in use_rep. If None, use a biwhitening procedure on the raw count matrix to derive a reasonable rank.

  • thres (float) – the threshold for the rank-dependence metric.

  • smoothness (float) – the coefficient determining the degree of smoothing in the entropic optimal transport problem.

  • rank (int) – Only used if the solver "LRSinkhorn" is used. Specifies the rank of the transport map.

  • eps (float) – Error tolerance of the optimal transport.

  • solver (str) – Either "Sinkhorn" or "LRSinkhorn". The ott-jax solver used.

  • resolution (float) – the clustering resolution used in the sampling phase.

Returns:

Returns an AnnData object that contains the single-cell level treatment effect as de.X and the corresponding low-dimensional embedding in de.obsm['X_embedding'], plus the optional matching matrix in de.obsm['ot']. Also stores the confounding variation in adata.obsm[cf_rep].

Examples

>>> import pertpy as pt
>>> adata = pt.dt.cinemaot_example()
>>> model = pt.tl.Cinemaot()
>>> de = model.causaleffect_weighted(
...         adata, pert_key="perturbation", control="No stimulation", return_matching=True,
...         thres=0.5, smoothness=1e-5, eps=1e-3, solver="Sinkhorn")

generate_pseudobulk

Cinemaot.generate_pseudobulk(adata, de, pert_key, control, label_list, cf_rep='cf', de_rep='X_embedding', cf_resolution=0.5, de_resolution=0.5, use_raw=True)[source]

Generate a pseudobulk AnnData object that accounts for the differential response behaviors revealed by CINEMA-OT.

Requires running Cinemaot.causaleffect() or Cinemaot.causaleffect_weighted() first.

Parameters:
  • adata (AnnData) – The annotated data object.

  • de (AnnData) – The anndata output from Cinemaot.causaleffect() or Cinemaot.causaleffect_weighted().

  • pert_key (str) – The column of .obs with perturbation categories, should also contain control.

  • control (str) – Control category from the pert_key column.

  • label_list (list) – Additional covariate labels used to segregate the pseudobulk profiles. Should at least contain sample information (sample 1, sample 2, etc.).

  • cf_rep (str) – the place to put the confounder embedding in the original adata.obsm.

  • de_rep (str) – Use the indicated representation in de.obsm.

  • assign_cf – If a str is passed, a label in adata.obs instead of confounder Leiden label is used.

  • cf_resolution (float) – The leiden clustering resolution for the confounder.

  • de_resolution (float) – The leiden clustering resolution for the differential response.

  • use_raw (bool) – If true, use adata.raw.X to aggregate the pseudobulk profiles. Otherwise use adata.X.

Returns:

Returns an AnnData object that contains the aggregated pseudobulk profiles and associated metadata.

Examples

>>> import pertpy as pt
>>> adata = pt.dt.cinemaot_example()
>>> model = pt.tl.Cinemaot()
>>> de = model.causaleffect(
...         adata, pert_key="perturbation", control="No stimulation", return_matching=True, thres=0.5,
...         smoothness=1e-5, eps=1e-3, solver="Sinkhorn", preweight_label="cell_type0528")
>>> adata_pb = model.generate_pseudobulk(
...         adata, de, pert_key="perturbation", control="No stimulation", label_list=None)
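
The aggregation step itself can be pictured with a small NumPy sketch: counts from all cells that share the same label are summed into one profile. This is only an illustration of the pseudobulk idea, not the pertpy code path. The helper pseudobulk_sum and the example labels are hypothetical, and the sketch omits the Leiden clustering of the confounder and response embeddings that the method description mentions.

```python
import numpy as np


def pseudobulk_sum(counts, labels):
    """Sum rows of counts (cells x genes) that share the same label.

    Hypothetical helper for illustration only; not part of pertpy.
    """
    groups, inverse = np.unique(labels, return_inverse=True)
    profiles = np.zeros((groups.size, counts.shape[1]), dtype=counts.dtype)
    np.add.at(profiles, inverse, counts)  # accumulate each cell into its group
    return groups, profiles


# Four cells, two genes; the labels combine sample and condition (made up).
counts = np.array([[1, 0], [2, 1], [0, 3], [4, 4]])
labels = np.array(["s1_ctrl", "s1_ctrl", "s2_stim", "s1_ctrl"])
groups, profiles = pseudobulk_sum(counts, labels)
```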

get_dim

Cinemaot.get_dim(adata, c=0.5, use_rep='X_pca')[source]

Estimate the rank of the count matrix. This method always uses adata.raw.X, so make sure it contains the raw count matrix.

Parameters:
  • adata (AnnData) – The annotated data object.

  • c (float) – the parameter regarding the quadratic variance distribution. c=0 means Poisson count matrices.

  • use_rep (str) – the embedding used to give an upper bound for the estimated rank.

Returns:

Returns the estimated dimension number.

Examples

>>> import pertpy as pt
>>> adata = pt.dt.cinemaot_example()
>>> model = pt.tl.Cinemaot()
>>> dim = model.get_dim(adata)

get_weightidx

Cinemaot.get_weightidx(adata, pert_key, control, use_rep='X_pca', k=20, resolution=1.0)[source]

Generate resampled indices that balance covariates across treatment conditions.

Parameters:
  • adata (AnnData) – The annotated data object.

  • pert_key (str) – The column of .obs with perturbation categories, should also contain control.

  • control (str) – Control category from the pert_key column.

  • use_rep (str) – the embedding used to give an upper bound for the estimated rank.

  • k (int) – the number of neighbors used in the k-NN matching phase.

  • resolution (float) – the clustering resolution used in the sampling phase.

Returns:

Returns the indices.

Examples

>>> import pertpy as pt
>>> adata = pt.dt.cinemaot_example()
>>> model = pt.tl.Cinemaot()
>>> idx = model.get_weightidx(adata, pert_key="perturbation", control="No stimulation")
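
As a rough picture of what balancing by resampling means, the sketch below reweights treated cells so that their cluster proportions match those of the control cells, then draws indices with replacement. It is a simplified stand-in and not the algorithm used by get_weightidx, which according to its parameters involves a k-NN matching phase and a clustering-based sampling phase. The helper balanced_resample and the toy cluster labels are hypothetical.

```python
import numpy as np


def balanced_resample(clusters, is_treated, seed=0):
    """Resample treated cells so cluster proportions match the control group.

    Hypothetical helper for illustration only; not part of pertpy.
    Returns indices into the original arrays (sampled with replacement).
    """
    rng = np.random.default_rng(seed)
    clusters = np.asarray(clusters)
    is_treated = np.asarray(is_treated, dtype=bool)
    treated_idx = np.flatnonzero(is_treated)
    control_idx = np.flatnonzero(~is_treated)

    # Target proportion of each cluster, taken from the control cells.
    names, counts = np.unique(clusters[control_idx], return_counts=True)
    target = dict(zip(names, counts / counts.sum()))

    # Observed proportion of each cluster among the treated cells.
    t_names, t_counts = np.unique(clusters[treated_idx], return_counts=True)
    observed = dict(zip(t_names, t_counts / t_counts.sum()))

    # Weight each treated cell by target proportion / observed proportion.
    weights = np.array(
        [target.get(c, 0.0) / observed[c] for c in clusters[treated_idx]]
    )
    weights = weights / weights.sum()
    return rng.choice(treated_idx, size=treated_idx.size, replace=True, p=weights)


# Control: 50 T cells and 50 B cells. Treated: 90 T cells and 10 B cells.
clusters = np.array(["T"] * 50 + ["B"] * 50 + ["T"] * 90 + ["B"] * 10)
is_treated = np.array([False] * 100 + [True] * 100)
idx = balanced_resample(clusters, is_treated)
```

After resampling, roughly half of the selected treated cells are B cells, mirroring the control composition.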

plot_vis_matching

Cinemaot.plot_vis_matching(adata, de, pert_key, control, de_label, source_label, matching_rep='ot', resolution=0.5, normalize='col', title='CINEMA-OT matching matrix', min_val=0.01, show=True, save=None, ax=None, **kwargs)[source]

Visualize the CINEMA-OT matching matrix.

Parameters:
  • adata (AnnData) – the original AnnData object after running Cinemaot.causaleffect() or Cinemaot.causaleffect_weighted().

  • de (AnnData) – The anndata output from Cinemaot.causaleffect() or Cinemaot.causaleffect_weighted().

  • pert_key (str) – The column of .obs with perturbation categories, should also contain control.

  • control (str) – Control category from the pert_key column.

  • de_label (str) – the label for differential response. If None, use leiden cluster labels at resolution 1.0.

  • source_label (str) – the confounder / cell type label.

  • matching_rep (str) – the key in de.obsm that stores the matching matrix. Defaults to de.obsm['ot'].

  • normalize (str) – normalize the coarse-grained matching matrix by row / column.

  • title (str) – the title for the figure.

  • min_val (float) – The min value to truncate the matching matrix.

  • show (bool) – Show the plot, do not return axis.

  • save (str | None) – If True or a str, save the figure. A string is appended to the default filename. Infer the filetype if ending on {'.pdf', '.png', '.svg'}.

  • **kwargs – Other parameters to input for seaborn.heatmap.

Return type:

None

Examples

>>> import pertpy as pt
>>> adata = pt.dt.cinemaot_example()
>>> cot = pt.tl.Cinemaot()
>>> de = cot.causaleffect(
...         adata, pert_key="perturbation", control="No stimulation", return_matching=True,
...         thres=0.5, smoothness=1e-5, eps=1e-3, solver="Sinkhorn", preweight_label="cell_type0528")
>>> cot.plot_vis_matching(
...         adata, de, pert_key="perturbation", control="No stimulation", de_label=None, source_label="cell_type0528")
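
The effect of the normalize option can be illustrated with a small NumPy sketch that aggregates a cell-by-cell matching matrix into a label-by-label matrix and then rescales it by column or by row. This is an assumption-laden illustration rather than the plotting code itself: the helper coarse_grain is hypothetical, and the orientation (rows as response labels, columns as source labels) is chosen only for the example.

```python
import numpy as np


def coarse_grain(matching, row_labels, col_labels, normalize="col"):
    """Aggregate a cell-by-cell matching matrix into a label-by-label matrix.

    Hypothetical helper for illustration only; not part of pertpy.
    """
    row_groups, row_idx = np.unique(row_labels, return_inverse=True)
    col_groups, col_idx = np.unique(col_labels, return_inverse=True)
    # One-hot membership matrices with shape (cells, groups).
    row_onehot = np.eye(row_groups.size)[row_idx]
    col_onehot = np.eye(col_groups.size)[col_idx]
    coarse = row_onehot.T @ matching @ col_onehot
    if normalize == "col":
        coarse = coarse / coarse.sum(axis=0, keepdims=True)  # columns sum to 1
    elif normalize == "row":
        coarse = coarse / coarse.sum(axis=1, keepdims=True)  # rows sum to 1
    return row_groups, col_groups, coarse


# Toy matching between 3 treated cells (rows) and 3 control cells (columns).
matching = np.array([[0.2, 0.0, 0.1], [0.0, 0.3, 0.1], [0.1, 0.1, 0.1]])
response_labels = np.array(["r0", "r1", "r0"])
source_labels = np.array(["T", "B", "T"])
rows, cols, by_col = coarse_grain(matching, response_labels, source_labels, "col")
_, _, by_row = coarse_grain(matching, response_labels, source_labels, "row")
```

With column normalization each source label's mass is distributed over the response labels. With row normalization each response label's mass is distributed over the source labels.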

synergy

Cinemaot.synergy(adata, pert_key, base, A, B, AB, dim=20, thres=0.15, smoothness=0.0001, preweight_label=None, **kwargs)[source]

A wrapper for computing the synergy matrices.

Parameters:
  • adata (AnnData) – The annotated data object.

  • pert_key (str) – The column of .obs with perturbation categories, should also contain control.

  • base (str) – Control category from the pert_key column.

  • A (str) – the category for perturbation A.

  • B (str) – the category for perturbation B.

  • AB (str) – the category for the combinatorial perturbation A+B.

  • dim (int | None) – Use the first dim components in use_rep. If None, use a biwhitening procedure on the raw count matrix to derive a reasonable rank.

  • thres (float) – the threshold for the rank-dependence metric.

  • smoothness (float) – the coefficient determining the degree of smoothing in the entropic optimal transport problem.

  • eps – Error tolerance of the optimal transport.

  • preweight_label (str | None) – the annotated label (e.g. cell type) that is used to assign weights for treated and control cells to balance across the label. Helps overcome the differential abundance issue.

  • **kwargs – other parameters that can be passed to Cinemaot.causaleffect()

Returns:

Returns an AnnData object that contains the single-cell level synergy matrix de.X and the embedding.

Examples

>>> import pertpy as pt
>>> import scanpy as sc
>>> adata = pt.dt.dong_2023()
>>> sc.pp.pca(adata)
>>> model = pt.tl.Cinemaot()
>>> combo = model.synergy(adata, pert_key='perturbation', base='No stimulation', A='IFNb', B='IFNg',
...                   AB='IFNb+ IFNg', thres=0.5, smoothness=1e-5, eps=1e-3, solver='Sinkhorn')
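
One common way to think about synergy is as the part of the combined effect that is not explained by adding the two single-perturbation effects. The toy arithmetic below illustrates that reading on made-up expression vectors for one matched cell. It is a conceptual sketch under that additive assumption, and it does not reproduce how Cinemaot.synergy chains optimal transport matchings internally.

```python
import numpy as np

# Made-up expression of three genes for one matched cell in each condition.
base = np.array([1.0, 2.0, 0.5])
a = np.array([2.0, 2.0, 0.5])   # perturbation A only
b = np.array([1.0, 3.0, 0.5])   # perturbation B only
ab = np.array([2.0, 3.0, 2.5])  # combined perturbation A+B

effect_a = a - base
effect_b = b - base
effect_ab = ab - base

# Non-additive remainder: zero wherever the combined effect is purely additive.
synergy = effect_ab - (effect_a + effect_b)
```

Here the first two genes respond additively, so their synergy is zero, while the third gene changes only under the combined perturbation and carries the entire synergy signal.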