pertpy.preprocessing.GuideAssignment
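Methods table
| Method | Description |
| --- | --- |
| assign_by_threshold | Simple threshold based gRNA assignment function. |
| assign_mixture_model | Assigns gRNAs to cells using a mixture model. |
| assign_to_max_guide | Simple threshold based max gRNA assignment function. |
| assign_to_max_guide_anndata | |
| assign_to_max_guide_numpy | |
| assign_to_max_guide_sparse | |
| plot_heatmap | Heatmap plotting of guide RNA expression matrix. |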
Methods
- GuideAssignment.assign_by_threshold(data, /, *, assignment_threshold, layer=None, output_layer='assigned_guides')
Simple threshold based gRNA assignment function.
Each cell is assigned to every gRNA that has at least assignment_threshold counts in that cell. This function expects unnormalized data as input.
- Parameters:
  - data (AnnData | ndarray | csr_matrix) – The (annotated) data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.
  - assignment_threshold (float) – The count threshold that is required for an assignment to be viable.
  - layer (str | None, default: None) – Key to the layer containing raw count values of the gRNAs. adata.X is used if layer is None. Expects count data.
  - output_layer (str, default: 'assigned_guides') – The assigned guides will be saved in adata.layers[output_layer].
Examples
Each cell is assigned to every gRNA that has at least 5 counts in that cell.
>>> import pertpy as pt
>>> mdata = pt.data.papalexi_2021()
>>> gdo = mdata.mod["gdo"]
>>> ga = pt.pp.GuideAssignment()
>>> ga.assign_by_threshold(gdo, assignment_threshold=5)
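The thresholding rule described above can be illustrated without pertpy. The following is a minimal NumPy sketch, not the library's implementation: the toy count matrix and the variable names are hypothetical, and it assumes that "at least assignment_threshold counts" means a `>=` comparison on raw counts.

```python
import numpy as np

# Hypothetical toy count matrix: 3 cells (rows) x 3 gRNAs (columns).
counts = np.array([
    [7, 0, 1],
    [5, 9, 0],
    [2, 4, 3],
])
assignment_threshold = 5

# A gRNA is marked as assigned (1) in a cell when its raw count reaches
# the threshold, otherwise it is marked as not assigned (0).
assigned = (counts >= assignment_threshold).astype(int)
print(assigned)
```

In this sketch the second cell is assigned to two guides and the third cell to none, which is why the result is a cell × guide matrix rather than a single label per cell.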
- GuideAssignment.assign_mixture_model(adata, model='poisson_gauss_mixture', assigned_guides_key='assigned_guide', no_grna_assigned_key='negative', max_assignments_per_cell=5, multiple_grna_assigned_key='multiple', multiple_grna_assignment_string='+', only_return_results=False, show_progress=False, **mixture_model_kwargs)
Assigns gRNAs to cells using a mixture model.
- Parameters:
  - adata (AnnData) – AnnData object containing gRNA values.
  - model (Literal['poisson_gauss_mixture'], default: 'poisson_gauss_mixture') – The model to use for the mixture model. Currently only Poisson_Gauss_Mixture is supported.
  - assigned_guides_key (str, default: 'assigned_guide') – The assigned guide will be saved in adata.obs[assigned_guides_key].
  - no_grna_assigned_key (str, default: 'negative') – The key to return if a cell is negative for all gRNAs.
  - max_assignments_per_cell (int, default: 5) – The maximum number of gRNAs that can be assigned to a cell.
  - multiple_grna_assigned_key (str, default: 'multiple') – The key to return if multiple gRNAs are assigned to a cell.
  - multiple_grna_assignment_string (str, default: '+') – The string used to join multiple gRNAs assigned to a cell.
  - only_return_results (bool, default: False) – If True, the input AnnData is not modified and the result is returned as an np.ndarray.
  - show_progress (bool, default: False) – Whether to show a progress bar.
  - mixture_model_kwargs – Passed to the mixture model.
- Return type:
Examples
>>> import pertpy as pt
>>> mdata = pt.dt.papalexi_2021()
>>> gdo = mdata.mod["gdo"]
>>> ga = pt.pp.GuideAssignment()
>>> ga.assign_mixture_model(gdo)
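The labelling parameters above interact in a way that is easier to see in code. The sketch below shows one plausible reading of them and is not taken from the pertpy source: `collapse_assignments` is a hypothetical helper, the boolean matrix stands in for the per-guide output of a mixture model, and the assumption is that cells exceeding max_assignments_per_cell receive multiple_grna_assigned_key while cells at or below the limit receive the joined guide names.

```python
import numpy as np


def collapse_assignments(
    is_positive,
    guide_names,
    *,
    no_grna_assigned_key="negative",
    max_assignments_per_cell=5,
    multiple_grna_assigned_key="multiple",
    multiple_grna_assignment_string="+",
):
    """Hypothetical helper: turn a cell x guide boolean matrix into one label per cell."""
    labels = []
    for row in is_positive:
        # Names of all guides flagged as positive in this cell, in column order.
        hits = [name for name, flag in zip(guide_names, row) if flag]
        if not hits:
            labels.append(no_grna_assigned_key)
        elif len(hits) > max_assignments_per_cell:
            # Assumption: too many positive guides collapse to a single key.
            labels.append(multiple_grna_assigned_key)
        else:
            labels.append(multiple_grna_assignment_string.join(hits))
    return labels


# Hypothetical mixture-model output for 3 cells and 3 guides.
is_positive = np.array([
    [False, False, False],
    [True, False, True],
    [True, True, True],
])
labels = collapse_assignments(
    is_positive, ["g1", "g2", "g3"], max_assignments_per_cell=2
)
print(labels)
```

With the limit set to 2 for illustration, the first cell has no positive guide, the second has two guides joined by the separator, and the third exceeds the limit.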
- GuideAssignment.assign_to_max_guide(data, /, *, assignment_threshold, layer=None, obs_key='assigned_guide', no_grna_assigned_key='Negative')
Simple threshold based max gRNA assignment function.
Each cell is assigned to the most expressed gRNA if it has at least assignment_threshold counts. This function expects unnormalized data as input.
- Parameters:
  - data (AnnData | ndarray | csr_matrix) – The (annotated) data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.
  - assignment_threshold (float) – The count threshold that is required for an assignment to be viable.
  - layer (str | None, default: None) – Key to the layer containing raw count values of the gRNAs. adata.X is used if layer is None. Expects count data.
  - obs_key (str, default: 'assigned_guide') – The assigned guide will be saved in adata.obs[obs_key].
  - no_grna_assigned_key (str, default: 'Negative') – The key to return if no gRNA is expressed enough.
- Return type:
Examples
Each cell is assigned to the most expressed gRNA if it has at least 5 counts.
>>> import pertpy as pt
>>> mdata = pt.dt.papalexi_2021()
>>> gdo = mdata.mod["gdo"]
>>> ga = pt.pp.GuideAssignment()
>>> ga.assign_to_max_guide(gdo, assignment_threshold=5)
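The max-guide rule can likewise be sketched in plain NumPy. This is an illustration under stated assumptions rather than pertpy's code: the count matrix and guide names are hypothetical, the threshold comparison is assumed to be `>=`, and ties are resolved by `argmax`, which picks the first maximal column.

```python
import numpy as np

# Hypothetical guide names and toy count matrix (3 cells x 3 gRNAs).
guide_names = np.array(["g1", "g2", "g3"])
counts = np.array([
    [7, 0, 1],
    [5, 9, 0],
    [2, 4, 3],
])
assignment_threshold = 5
no_grna_assigned_key = "Negative"

# Index and count of the most expressed guide in each cell.
best_guide = counts.argmax(axis=1)
best_count = counts.max(axis=1)

# Keep the top guide only if its count reaches the threshold.
labels = np.where(
    best_count >= assignment_threshold,
    guide_names[best_guide],
    no_grna_assigned_key,
)
print(labels.tolist())
```

Unlike plain thresholding, this yields exactly one label per cell: the second cell has two guides above the threshold but is assigned only to the stronger one.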
- GuideAssignment.assign_to_max_guide_anndata(adata, /, *, assignment_threshold, layer=None, obs_key='assigned_guide', no_grna_assigned_key='Negative')
- Return type:
- GuideAssignment.assign_to_max_guide_numpy(X, /, *, var, assignment_threshold, no_grna_assigned_key='Negative')
- Return type:
- GuideAssignment.assign_to_max_guide_sparse(X, /, *, var, assignment_threshold, no_grna_assigned_key='Negative')
- Return type:
- GuideAssignment.plot_heatmap(adata, *, layer=None, order_by=None, key_to_save_order=None, return_fig=False, **kwargs)
Heatmap plotting of guide RNA expression matrix.
Assuming guides have sparse expression, this function reorders cells and plots guide RNA expression so that cells expressing the same guide are grouped together, giving a compact sparse representation. The cell ordering can be stored and reused in later plots to keep them consistent before and after analysis of the guide RNA expression. Note: This function expects log-normalized or binary data.
- Parameters:
  - adata (AnnData) – Annotated data matrix containing gRNA values.
  - layer (str | None, default: None) – Key to the layer containing log normalized count values of the gRNAs. adata.X is used if layer is None.
  - order_by (ndarray | str | None, default: None) – The order of cells on the y-axis. If None, cells will be reordered to have a nice sparse representation. If a string is provided, adata.obs[order_by] will be used as the order. If a numpy array is provided, the array will be used for ordering.
  - key_to_save_order (str, default: None) – The obs key under which to save the cell order of the current plot. Only saves if not None.
  - return_fig (bool, default: False) – If True, returns the figure of the plot, which can be used for saving.
  - kwargs – Passed to sc.pl.heatmap.
- Return type:
- Returns:
If return_fig is True, returns the figure, otherwise None. Order of cells in the y-axis will be saved on adata.obs[key_to_save_order] if provided.
Examples
Each cell is assigned to every gRNA that has at least 5 counts in that cell, and the result is then visualized as a heatmap.
>>> import pertpy as pt
>>> mdata = pt.dt.papalexi_2021()
>>> gdo = mdata.mod["gdo"]
>>> ga = pt.pp.GuideAssignment()
>>> ga.assign_by_threshold(gdo, assignment_threshold=5)
>>> ga.plot_heatmap(gdo)
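The cell reordering that plot_heatmap performs when order_by is None is not spelled out here. One simple way to obtain a block-like layout is sketched below, purely as an assumption and not as the function's actual algorithm: cells are grouped by their dominant guide and sorted by descending expression within each group. The expression matrix is a hypothetical toy example.

```python
import numpy as np

# Hypothetical log-normalized expression: 4 cells (rows) x 3 gRNAs (columns).
expr = np.array([
    [0.0, 2.0, 0.0],
    [3.0, 0.0, 0.0],
    [0.0, 0.0, 1.5],
    [1.0, 0.0, 0.0],
])

# Dominant guide per cell and how strongly it is expressed.
dominant = expr.argmax(axis=1)
strength = expr.max(axis=1)

# np.lexsort uses the LAST key as the primary sort key, so cells are grouped
# by dominant guide first and ordered by descending strength within a group.
order = np.lexsort((-strength, dominant))
print(order.tolist())

# Rows rearranged so that each guide's cells form a contiguous block.
print(expr[order])
```

Since order_by is documented as accepting a numpy array, an ordering computed this way could presumably be supplied through that parameter, although the exact semantics expected by the function should be checked against the library.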