pertpy.preprocessing.GuideAssignment

class GuideAssignment

Assign cells to guide RNAs.

Methods table

assign_by_threshold(data, /, *, ...[, ...])

Simple threshold-based gRNA assignment function.

assign_mixture_model(adata[, model, ...])

Assigns gRNAs to cells using a mixture model.

assign_to_max_guide(data, /, *, ...[, ...])

Simple threshold-based max gRNA assignment function.

assign_to_max_guide_anndata(adata, /, *, ...)

assign_to_max_guide_numpy(X, /, *, var, ...)

assign_to_max_guide_sparse(X, /, *, var, ...)

plot_heatmap(adata, *[, layer, order_by, ...])

Heatmap plotting of guide RNA expression matrix.

Methods

GuideAssignment.assign_by_threshold(data, /, *, assignment_threshold, layer=None, output_layer='assigned_guides')

Simple threshold-based gRNA assignment function.

Each cell is assigned to every gRNA that has at least assignment_threshold counts in that cell. This function expects unnormalized data as input.

Parameters:
  • data (AnnData | ndarray | csr_matrix) – The (annotated) data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.

  • assignment_threshold (float) – The count threshold that is required for an assignment to be viable.

  • layer (str | None, default: None) – Key to the layer containing raw count values of the gRNAs. adata.X is used if layer is None. Expects count data.

  • output_layer (str, default: 'assigned_guides') – The assigned guides are saved to adata.layers[output_layer].
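
For intuition, the thresholding rule can be sketched in plain NumPy. This is a minimal illustration that assumes the result is a binary cells × guides matrix; the helper name threshold_assign and the toy counts are hypothetical and not part of pertpy.

```python
import numpy as np


def threshold_assign(counts: np.ndarray, assignment_threshold: float) -> np.ndarray:
    """Return a binary cells x guides matrix: 1 where counts >= threshold, else 0."""
    return (counts >= assignment_threshold).astype(int)


# Toy raw counts: 3 cells (rows) x 3 gRNAs (columns).
counts = np.array(
    [
        [7, 0, 1],  # cell 0: only gRNA 0 reaches the threshold
        [5, 9, 0],  # cell 1: gRNAs 0 and 1 both reach the threshold
        [2, 3, 4],  # cell 2: no gRNA reaches the threshold
    ]
)
assigned = threshold_assign(counts, assignment_threshold=5)
print(assigned)
```

Note that a cell can end up with several guides or with none, which is why the result is stored as a layer rather than as a single label per cell.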

Examples

Each cell is assigned to every gRNA that occurs at least 5 times in that cell.

>>> import pertpy as pt
>>> mdata = pt.data.papalexi_2021()
>>> gdo = mdata.mod["gdo"]
>>> ga = pt.pp.GuideAssignment()
>>> ga.assign_by_threshold(gdo, assignment_threshold=5)

GuideAssignment.assign_mixture_model(adata, model='poisson_gauss_mixture', assigned_guides_key='assigned_guide', no_grna_assigned_key='negative', max_assignments_per_cell=5, multiple_grna_assigned_key='multiple', multiple_grna_assignment_string='+', only_return_results=False, show_progress=False, **mixture_model_kwargs)

Assigns gRNAs to cells using a mixture model.

Parameters:
  • adata (AnnData) – AnnData object containing gRNA values.

  • model (Literal['poisson_gauss_mixture'], default: 'poisson_gauss_mixture') – The mixture model to use. Currently only 'poisson_gauss_mixture' is supported.

  • assigned_guides_key (str, default: 'assigned_guide') – The assigned guide is saved to adata.obs[assigned_guides_key].

  • no_grna_assigned_key (str, default: 'negative') – The key to return if a cell is negative for all gRNAs.

  • max_assignments_per_cell (int, default: 5) – The maximum number of gRNAs that can be assigned to a cell.

  • multiple_grna_assigned_key (str, default: 'multiple') – The key to return if multiple gRNAs are assigned to a cell.

  • multiple_grna_assignment_string (str, default: '+') – The string to use to join multiple gRNAs assigned to a cell.

  • only_return_results (bool, default: False) – If True, the input AnnData is not modified and the result is returned as an np.ndarray.

  • show_progress (bool, default: False) – Whether to show a progress bar.

  • mixture_model_kwargs – Keyword arguments passed to the mixture model.

Return type:

ndarray | None
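
The labelling parameters above can be read as a small post-processing step on per-cell guide calls. The following sketch shows one plausible way they combine. It is not pertpy's implementation: the helper labels_from_calls is hypothetical, and treating max_assignments_per_cell as an inclusive limit is an assumption.

```python
def labels_from_calls(
    calls,
    guide_names,
    no_grna_assigned_key="negative",
    max_assignments_per_cell=5,
    multiple_grna_assigned_key="multiple",
    multiple_grna_assignment_string="+",
):
    """Turn per-cell positive/negative guide calls into one label per cell."""
    labels = []
    for row in calls:
        positive = [name for name, hit in zip(guide_names, row) if hit]
        if not positive:
            # No guide was called for this cell.
            labels.append(no_grna_assigned_key)
        elif len(positive) > max_assignments_per_cell:
            # Too many guides: collapse into a single catch-all label (assumed rule).
            labels.append(multiple_grna_assigned_key)
        else:
            # One or a few guides: join their names.
            labels.append(multiple_grna_assignment_string.join(positive))
    return labels


guide_names = ["g1", "g2", "g3"]
calls = [
    [True, False, False],
    [True, False, True],
    [False, False, False],
    [True, True, True],
]
labels = labels_from_calls(calls, guide_names, max_assignments_per_cell=2)
print(labels)  # ['g1', 'g1+g3', 'negative', 'multiple']
```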

Examples

>>> import pertpy as pt
>>> mdata = pt.dt.papalexi_2021()
>>> gdo = mdata.mod["gdo"]
>>> ga = pt.pp.GuideAssignment()
>>> ga.assign_mixture_model(gdo)

GuideAssignment.assign_to_max_guide(data, /, *, assignment_threshold, layer=None, obs_key='assigned_guide', no_grna_assigned_key='Negative')

Simple threshold-based max gRNA assignment function.

Each cell is assigned to the most expressed gRNA if it has at least assignment_threshold counts. This function expects unnormalized data as input.

Parameters:
  • data (AnnData | ndarray | csr_matrix) – The (annotated) data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.

  • assignment_threshold (float) – The count threshold that is required for an assignment to be viable.

  • layer (str | None, default: None) – Key to the layer containing raw count values of the gRNAs. adata.X is used if layer is None. Expects count data.

  • obs_key (str, default: 'assigned_guide') – The assigned guide is saved to adata.obs[obs_key].

  • no_grna_assigned_key (str, default: 'Negative') – The key to return if no gRNA is expressed enough.

Return type:

ndarray | None
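
The max-guide rule can be sketched for a dense matrix as follows. This is a minimal illustration, not the library's code: the helper max_guide_assign and the toy data are hypothetical, and ties are resolved by NumPy's argmax, which picks the first maximal column.

```python
import numpy as np


def max_guide_assign(counts, guide_names, assignment_threshold, no_grna_assigned_key="Negative"):
    """Assign each cell its highest-count guide if that count reaches the threshold."""
    counts = np.asarray(counts)
    guide_names = np.asarray(guide_names, dtype=object)
    best = counts.argmax(axis=1)      # column index of the top guide per cell
    best_counts = counts.max(axis=1)  # the count of that top guide
    # Keep the top guide only where its count reaches the threshold.
    return np.where(best_counts >= assignment_threshold, guide_names[best], no_grna_assigned_key)


counts = np.array(
    [
        [7, 0, 1],  # top guide g1 with 7 counts
        [5, 9, 0],  # top guide g2 with 9 counts
        [2, 3, 4],  # top guide g3 with 4 counts, below the threshold
    ]
)
assigned_guide = max_guide_assign(counts, ["g1", "g2", "g3"], assignment_threshold=5)
print(list(assigned_guide))  # ['g1', 'g2', 'Negative']
```

Unlike plain thresholding, this rule yields exactly one label per cell, so the result fits in a single obs column.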

Examples

Each cell is assigned to the most expressed gRNA if it has at least 5 counts.

>>> import pertpy as pt
>>> mdata = pt.dt.papalexi_2021()
>>> gdo = mdata.mod["gdo"]
>>> ga = pt.pp.GuideAssignment()
>>> ga.assign_to_max_guide(gdo, assignment_threshold=5)

GuideAssignment.assign_to_max_guide_anndata(adata, /, *, assignment_threshold, layer=None, obs_key='assigned_guide', no_grna_assigned_key='Negative')

Return type:

None

GuideAssignment.assign_to_max_guide_numpy(X, /, *, var, assignment_threshold, no_grna_assigned_key='Negative')

Return type:

ndarray

GuideAssignment.assign_to_max_guide_sparse(X, /, *, var, assignment_threshold, no_grna_assigned_key='Negative')

Return type:

ndarray

GuideAssignment.plot_heatmap(adata, *, layer=None, order_by=None, key_to_save_order=None, return_fig=False, **kwargs)

Heatmap plotting of guide RNA expression matrix.

Assuming guides have sparse expression, this function reorders cells and plots guide RNA expression so that the sparse structure is easy to see. The cell ordering can be stored and reused in future plots to obtain consistent plots before and after analysis of the guide RNA expression. Note: This function expects log-normalized or binary data.

Parameters:
  • adata (AnnData) – Annotated data matrix containing gRNA values.

  • layer (str | None, default: None) – Key to the layer containing log-normalized count values of the gRNAs. adata.X is used if layer is None.

  • order_by (ndarray | str | None, default: None) – The order of cells on the y-axis. If None, cells are reordered to give a sparse-looking representation. If a string is provided, adata.obs[order_by] is used as the order. If a numpy array is provided, the array is used for ordering.

  • key_to_save_order (str | None, default: None) – The obs key under which to save the cell order of the current plot. Only saved if not None.

  • return_fig (bool, default: False) – If True, returns the figure of the plot, which can be used for saving.

  • kwargs – Keyword arguments passed to sc.pl.heatmap.

Return type:

Figure | None

Returns:

If return_fig is True, returns the figure, otherwise None. Order of cells in the y-axis will be saved on adata.obs[key_to_save_order] if provided.
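
To illustrate what a cell ordering that exposes sparse structure can look like, the sketch below groups cells by their dominant guide and puts cells without any signal last. This is one possible ordering and not necessarily the one plot_heatmap computes; the helper sparse_friendly_order and the toy values are hypothetical.

```python
import numpy as np


def sparse_friendly_order(values: np.ndarray) -> np.ndarray:
    """Order cells by their dominant guide; cells without any signal go last."""
    dominant = values.argmax(axis=1)
    has_signal = values.max(axis=1) > 0
    # Cells with no expressed guide get a sort key past the last guide index.
    key = np.where(has_signal, dominant, values.shape[1])
    # A stable sort keeps the original cell order within each guide group.
    return np.argsort(key, kind="stable")


# Toy log-normalized values: 5 cells (rows) x 3 gRNAs (columns).
values = np.array(
    [
        [0.0, 2.1, 0.0],  # cell 0: dominant guide 1
        [1.5, 0.0, 0.0],  # cell 1: dominant guide 0
        [0.0, 0.0, 0.0],  # cell 2: no guide expressed
        [0.0, 0.0, 3.0],  # cell 3: dominant guide 2
        [0.9, 0.0, 0.2],  # cell 4: dominant guide 0
    ]
)
order = sparse_friendly_order(values)
print(order)  # [1 4 0 3 2]
```

Plotting the rows in this order places each guide's cells in a contiguous block, which gives the heatmap a roughly diagonal appearance.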

Examples

Each cell is assigned to every gRNA that occurs at least 5 times in that cell, and the assignment is then visualized using a heatmap.

>>> import pertpy as pt
>>> mdata = pt.dt.papalexi_2021()
>>> gdo = mdata.mod["gdo"]
>>> ga = pt.pp.GuideAssignment()
>>> ga.assign_by_threshold(gdo, assignment_threshold=5)
>>> ga.plot_heatmap(gdo)