⌂ / Overview / Single-cell & Spatial / Pooled CRISPR Screens

View companion source

Pooled CRISPR Screens

Analyse pooled CRISPR screens with scRNA-seq readout.

Overview

Problem. Perturb many genes; read each effect transcriptomically.

Use when: Perturb-seq / CROP-seq data

Avoid when: Insufficient library coverage

Learning goals

Functional genomics: perturb and observe
Assigning guides to cells is the hard part

Figures

Tutorial

Analyze pooled CRISPR screens with single-cell RNA-seq readout using a tiered workflow: fast screening → target validation → rigorous differential expression.

When to Use This Skill

Use this skill when you have:

✅ Pooled CRISPR screens with scRNA-seq (Perturb-seq, CROP-seq, CRISPRi/a)
✅ 10X Feature Barcoding data (sgRNA captured as feature barcodes)
✅ Multi-library experiments with biological replicates
✅ sgRNA-to-cell mapping files (already assigned)

Don't use this skill for:

❌ Arrayed CRISPR screens (separate wells per perturbation) → use bulk RNA-seq DE skills
❌ Non-transcriptional readouts (e.g., protein, flow cytometry)
❌ Data without sgRNA assignments → use CellRanger or CROP-seq pipeline first

Quick Start (Example Data)

Test this skill with a real CRISPRi Perturb-seq dataset (~10 minutes):

from scripts.load_example_data import load_example_data
data = load_example_data()  # Downloads Papalexi 2021 (~140MB, cached after first run)

adata_list = data['adata_list']  # List of AnnData objects (one per batch)
mapping_files = data['mapping_files']  # sgRNA mapping files

What you get:

Dataset: Papalexi & Satija 2021 ECCITE-seq CRISPRi screen (THP-1 cells)
Size: ~20,700 cells x 18,649 genes across 4 batches
Perturbations: 25 target genes (~4 guides each) targeting immune checkpoint regulators + non-targeting controls
Screen type: CRISPRi (expected knockdown direction: down)
Reference: Papalexi et al. (2021) Nature Genetics 53:322-331

For offline testing: Use load_example_data(dataset='demo') for a small synthetic dataset.

For your own data: Replace with your 10X feature-barcode matrices and sgRNA mapping files (see Inputs).

Installation

Core packages (required):

# Create conda environment
conda create -n crispr-screen python=3.8
conda activate crispr-screen

# Install packages
pip install scanpy==1.9+ anndata==0.8+ pandas numpy scipy
pip install scikit-learn  # For outlier detection
pip install diffxpy  # For differential expression

Visualization packages (required):

pip install matplotlib seaborn

Optional packages:

# For glmGamPoi (requires R ≥4.0)
# Install R, then:
# install.packages("BiocManager")
# BiocManager::install("glmGamPoi")
pip install rpy2  # Python-R interface

# For PDF report generation (optional, recommended)
pip install reportlab

Version requirements:

Software	Version	License	Commercial Use	Install
scanpy	≥1.9	BSD-3	✅ Permitted	`pip install scanpy`
anndata	≥0.8	BSD-3	✅ Permitted	`pip install anndata`
matplotlib	≥3.5	PSF	✅ Permitted	`pip install matplotlib`
seaborn	≥0.12	BSD-3	✅ Permitted	`pip install seaborn`
scikit-learn	≥1.0	BSD-3	✅ Permitted	`pip install scikit-learn`
diffxpy	≥0.7	BSD-3	✅ Permitted	`pip install diffxpy`
reportlab	≥3.6	BSD	✅ Permitted	`pip install reportlab` (optional)
rpy2	≥3.5	GPL-2	✅ Permitted	`pip install rpy2` (optional)

Inputs

Required:

10X Feature-Barcode matrices (.h5 format, one per library)
Contains gene expression + sgRNA counts
From CellRanger with Feature Barcoding
sgRNA mapping files (TSV with cell-sgRNA assignments)
Format: cell_barcode\tsgRNA_id
One file per library
Should filter to single sgRNA assignments (doublets removed)

sgRNA naming conventions:

With gene names: GENE_sgRNA1, GENE_sgRNA2 (delimiter: _)
Without gene names: Requires separate gene lookup table

Alternative inputs: CROP-seq pipeline output, custom sgRNA assignment files

See references/crispr_screen_best_practices.md for data format details.

Outputs

Primary results:

screening_summary.txt - Per-perturbation DE statistics from t-test
validation_results.csv - Target gene knockdown/activation validation
glmgampoi/ - Batch-corrected DE results (top hits)
hit_list.csv - Final validated hits with DE gene counts

Processed data:

adata_normalized.h5ad - Normalized AnnData object for downstream analysis
Load with: adata = sc.read_h5ad('adata_normalized.h5ad')
Required for: Clustering, pseudotime, trajectory analysis
adata_processed.h5ad - Final processed data with all annotations

Analysis objects (pickle):

analysis_objects.pkl - Analysis results for downstream skills
Load with: import pickle; objs = pickle.load(open('analysis_objects.pkl', 'rb'))
Contains: perturbation_summary, de_results, outlier_cells
Required for: functional-enrichment-from-degs, coexpression-network

Report:

crispr_screen_report.pdf - Publication-quality PDF with Introduction, Methods, Results, Conclusions
Requires: pip install reportlab (optional — text report generated regardless)

QC plots (PNG + SVG format):

qc_metrics/ - Cell counts, mapping rates, doublet rates per library
volcano_plots/ - Per-perturbation volcano plots (top hits)
target_validation/ - Target gene expression heatmaps

DE results (per perturbation):

initial_screening/ - Fast t-test results (CSV per perturbation)
glmgampoi/ - Rigorous DE with batch correction (top 50-100 hits)

Clarification Questions

🚨 ALWAYS ask Question 1 FIRST. Do not ask about screen type, library details, or analysis goals before the user has answered Question 1.

1. Input Files (ASK THIS FIRST):

Do you have 10X Feature-Barcode matrix files (.h5) to analyze?
- If uploaded: How many libraries/replicates?
- Expected format: raw_feature_bc_matrix.h5 from CellRanger
Do you have sgRNA-to-cell mapping files?
- If yes: Format? (TSV with barcode-sgRNA pairs). Doublets already filtered?
Or use example data for testing? (Papalexi 2021 CRISPRi dataset — real data, ~140MB download)

🚨 IF EXAMPLE DATA SELECTED: All parameters are pre-defined (CRISPRi, THP-1 immune cells, 25 target genes). DO NOT ask questions 2-6. Proceed directly to Step 1.

Questions 2-6 are ONLY for users providing their own data:

2. Screen Type: CRISPRi (knockdown), CRISPRa (activation), CRISPRko (knockout), or other?

3. Library Information: How many biological replicates? Non-targeting controls present? Approximate perturbations (10-100, 100-1000, 1000+)? sgRNAs per gene (1-2, 3-5, 5+)?

4. Cell Type: Primary neurons, iPSC-derived, immune cells, cancer cell lines, other? Expected phenotype (cell death, differentiation, activation, subtle)?

5. sgRNA Naming: Do sgRNA IDs include gene names (e.g., "GENE_sgRNA1")? If not, need gene lookup table?

6. Analysis Goals (skip for example data):

Primary goal?
- a) Identify all hits with transcriptional phenotypes (standard)
- b) Compare specific perturbations against each other
- c) Focus on a known gene set
Run downstream pathway enrichment?
- a) Yes (recommended)
- b) No, just hit identification

Typical Complete Workflow

This skill performs core Perturb-seq analysis: screening, validation, and hit identification. For a complete CRISPR screen workflow:

This skill: Screen analysis → validated hits with transcriptional phenotypes (H5AD + hit lists)
functional-enrichment-from-degs: Pathway enrichment on hit perturbations → biological interpretation
coexpression-network: Network analysis from perturbed cells → regulatory relationships
bulk-omics-clustering: Cluster perturbations by signature → functional groups

Quick start: "Analyze my pooled CRISPR screen and identify hits with pathway enrichment"

Why separate skills? Modular design works across screen types (CRISPRi/a/ko) and readouts (scRNA-seq, bulk). See Suggested Next Steps for details.

Standard Workflow

🚨 MANDATORY: USE SCRIPTS EXACTLY AS SHOWN - DO NOT WRITE INLINE CODE 🚨

This skill uses low-freedom script execution. Pooled CRISPR screen analysis requires careful multi-step processing with QC checkpoints. Scripts handle:

Multi-library loading and concatenation
sgRNA doublet filtering
Cell-type specific QC thresholds
Batch effect handling
Target validation checks

Step 1 - Load data and map sgRNAs:

For example data:

from scripts.load_example_data import load_example_data
data = load_example_data()  # Papalexi 2021 CRISPRi (~140MB, cached)
adata_list = data['adata_list']
mapping_files = data['mapping_files']

For your own data:

from scripts.load_10x_libraries import load_multiple_libraries
from scripts.map_sgrna_to_cells import map_sgrna_to_adata
from scripts.qc_filtering import apply_qc_filters
from scripts.concatenate_libraries import concatenate_libraries

# Load, map, filter, concatenate
adata_list = load_multiple_libraries(["lib1.h5", "lib2.h5"])
adata_mapped = [map_sgrna_to_adata(ad, mf, sgrna_delimiter="_")
                for ad, mf in zip(adata_list, mapping_files)]
adata_filtered = [apply_qc_filters(ad, min_genes=2000, min_counts=8000, max_mito_pct=0.15)
                  for ad in adata_mapped]
adata = concatenate_libraries(adata_filtered, batch_labels=["lib1", "lib2"])

DO NOT write inline loading/QC code. Use the functions above.

✅ VERIFICATION: You should see "✓ sgRNA mapping complete: [N] cells retained" for each library, then "✓ Libraries concatenated: [N] total cells"

Step 2 - Preprocess and normalize:

from scripts.gene_name_corrections import correct_gene_names
from scripts.expression_filtering import filter_by_expression
from scripts.normalize_and_scale import normalize_only

# Preprocessing pipeline
adata = correct_gene_names(adata, corrections={}, gene_col='gene')
adata = filter_by_expression(adata, group_key='sgRNA', min_mean_expression=0.5)
adata = normalize_only(adata, target_sum=1e6, exclude_highly_expressed=True)

DO NOT write inline preprocessing code. Use the functions above.

✅ VERIFICATION: You should see "✓ Gene name corrections applied", then "✓ Expression filtering: [N] genes retained", then "✓ Normalization complete"

Step 3 - Run tiered analysis:

from scripts.screen_all_perturbations import screen_all_perturbations, call_hits
from scripts.validate_perturbations import validate_target_effect
from scripts.differential_expression_glmgampoi import run_glmgampoi_batch

# Fast screening → preliminary hits → validation
de_results = screen_all_perturbations(adata, control_group='non-targeting',
                                       gene_col='gene', test_method='t-test',
                                       output_dir='results/initial_screening/')
preliminary_hits = call_hits(de_results, min_de_genes=10)
validation = validate_target_effect(de_results, expected_direction='down',  # 'up' for CRISPRa
                                     min_log2fc=-0.5)
validated_hits = preliminary_hits[
    preliminary_hits['perturbation'].isin(
        validation[validation['target_affected']]['perturbation']
    )
]

# Rigorous DE for top hits (optional, requires R + glmGamPoi)
top_hits = validated_hits.head(50)
glmgampoi_results = run_glmgampoi_batch(adata, perturbations=top_hits['perturbation'].tolist(),
                                         donor_col='batch', output_dir='results/glmgampoi/')

DO NOT write inline DE code. Use the functions above.

✅ VERIFICATION: You should see "✓ Screening complete: [N] perturbations tested", then "Validated hits: [N]/[M]", then (if glmGamPoi runs) "✓ glmGamPoi complete"

Step 4 - Visualize and export:

from scripts.visualize_perturbations import create_volcano_plots
from scripts.export_results import export_all_results

# Volcano plots for top hits
for gene in top_hits['perturbation'].head(10):
    create_volcano_plots(de_results[gene], perturbation=gene, output_dir='figures/')

# Export all results (CSV, H5AD, pickle, PDF report, hit lists)
export_all_results(adata=adata, perturbation_summary=validated_hits,
                   de_results=de_results, output_dir='results/')

DO NOT write custom export code. Use export_all_results().

✅ VERIFICATION: You should see "✓ Volcano plots generated: [N] perturbations", then "=== Export Complete ===" with list of files saved

❌ IF YOU DON'T SEE VERIFICATION MESSAGES: You wrote inline code instead of using the functions. Stop and use the provided functions.

⚠️ CRITICAL - DO NOT:

❌ Write inline data loading code → STOP: Use load_multiple_libraries() and map_sgrna_to_adata()
❌ Write inline QC/filtering code → STOP: Use apply_qc_filters() and filter_by_expression()
❌ Write inline DE screening code → STOP: Use screen_all_perturbations() and validate_target_effect()
❌ Write custom export code → STOP: Use export_all_results()

⚠️ IF SCRIPTS FAIL - Script Failure Hierarchy:

Fix and Retry (90%) - Install missing package, re-run script
Modify Script (5%) - Edit the script file itself, document changes
Use as Reference (4%) - Read script, adapt approach, cite source
Write from Scratch (1%) - Only if genuinely impossible, explain why

NEVER skip directly to writing inline code without trying the script first.

📁 QC Thresholds by Cell Type:

Cell Type	min_genes	min_counts	max_mito_pct
Primary neurons	2000-3000	8000-10000	0.10-0.15
iPSC-derived	1500-2500	5000-8000	0.15-0.20
Immune cells	1000-1500	3000-5000	0.15-0.20
Macrophages	1000	3000-5000	0.18-0.20
Cancer cell lines	1500-2500	5000-8000	0.15-0.25

See references/qc_guidelines.md for detailed thresholds.

Common Issues

Issue	Cause	Solution
Low sgRNA mapping rate (<30%)	Poor capture efficiency, wrong mapping file	Check feature barcode library prep, verify sgRNA reference matches
High doublet rate (>10%)	High MOI, barcode collisions	Lower viral MOI in future screens, filter doublets computationally
Target gene not DE	Incomplete knockdown, ineffective sgRNA, compensation	Check sgRNA design, validate with protein, check for paralogs
Low validation rate (<50%)	Weak perturbations, wrong expected direction	Check CRISPRi vs CRISPRa, adjust log2fc threshold, check positive controls
Memory errors	Large dataset (>100k cells)	Process libraries separately, use backed mode (`adata = sc.read_h5ad('file.h5ad', backed='r')`)
Inconsistent replicates	Batch effects, population drift	Check batch balance, perform batch correction, analyze replicates separately first
glmGamPoi fails	R not installed, missing package	Install R ≥4.0 and glmGamPoi, or skip and use t-test results
SVG export failed	Missing SVG backend	Normal — script falls back automatically. Both PNG and SVG attempted.

See references/troubleshooting_guide.md for detailed solutions.

Suggested Next Steps

After running this skill:

functional-enrichment-from-degs - Pathway enrichment on hit perturbations
coexpression-network - Build gene regulatory networks from perturbed cells
bulk-omics-clustering - Cluster perturbations by transcriptional signature

Related analyses:

Compare hit genes to GWAS loci (genetic-variant-annotation)
Investigate heterogeneous responses (use Step 12 optional heterogeneity analysis in references/statistical_methods.md)

Related Skills

functional-enrichment-from-degs - Pathway enrichment on DE genes
de-results-to-plots - Advanced visualization (heatmaps, custom plots)
coexpression-network - Gene regulatory network inference
bulk-omics-clustering - Cluster samples/perturbations by expression

References

Key Papers:

Dixit et al. (2016) "Perturb-Seq" Cell 167:1853-1866
Datlinger et al. (2017) "Pooled CRISPR screening with single-cell readout" Nature Methods 14:297-301
Papalexi et al. (2021) "Characterizing the molecular regulation of inhibitory immune checkpoints" Nature Genetics 53:322-331 (example dataset)
Replogle et al. (2020) "Direct guide RNA capture" Nature Biotechnology 38:954-961
Peidli et al. (2024) "scPerturb: harmonized single-cell perturbation data" Nature Methods 21:531-540 (data source)

Online Resources:

10X Feature Barcoding: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/algorithms/crispr
CROP-seq pipeline: https://github.com/argschwind/CROP-seq-pipeline
Alector Perturb-seq analysis: https://github.com/Alector-BIO/CRISPR_PerturbSeq_pHM_publication

See references/crispr_screen_best_practices.md for comprehensive guidelines.

Code preview

scripts/concatenate_libraries.py

"""
Concatenate Multiple Libraries

Merge filtered libraries into a single AnnData object with batch labels.
"""

import scanpy as sc
import anndata as ad
from typing import List, Optional
import warnings


def concatenate_libraries(
    adata_list: List[ad.AnnData],
    batch_labels: Optional[List[str]] = None,
    batch_key: str = 'batch'
) -> ad.AnnData:
    """
    Concatenate multiple AnnData objects (libraries) into one.

    Parameters
    ----------
    adata_list : list of AnnData
        List of filtered AnnData objects to concatenate
    batch_labels : list of str, optional
        Labels for each batch (e.g., ["lib1", "lib2", "lib3"])
        If None, uses numeric labels (0, 1, 2, ...)
    batch_key : str, default='batch'
        Column name for batch labels in obs

    Returns
    -------
    adata_combined : AnnData
        Concatenated AnnData object with batch labels

    Example
    -------
    >>> adata_combined = concatenate_libraries(
    ...     [adata1_filtered, adata2_filtered, adata3_filtered, adata4_filtered],
    ...     batch_labels=["lib1", "lib2", "lib3", "lib4"]
    ... )
    """
    print(f"Concatenating {len(adata_list)} libraries...")

    # Print summary of each library
    for i, adata in enumerate(adata_list):
        label = batch_labels[i] if batch_labels else f"batch_{i}"
        print(f"  {label}: {adata.n_obs} cells x {adata.n_vars} genes")

        # Check for required columns
        if 'gene' in adata.obs.columns:
            n_genes = adata.obs['gene'].nunique()
            print(f"    Unique perturbations: {n_genes}")

    # Concatenate using modern anndata.concat (not deprecated concatenate method)
    if batch_labels:
        # Add batch labels to each AnnData
        for i, adata in enumerate(adata_list):
            adata.obs[batch_key] = batch_labels[i]

        adata_combined = ad.concat(
            adata_list,
            join='outer',
            label=None,  # We already added batch labels above
            keys=None,
            index_unique=None
        )
    else:
        # Simple concatenation without batch labels
        adata_combined = ad.concat(
            adata_list,
            join='outer'
        )

    print(f"\nCombined dataset:")
    print(f"  Total cells: {adata_combined.n_obs}")
    print(f"  Total genes: {adata_combined.n_vars}")

    if 'gene' in adata_combined.obs.columns:
        n_unique_genes = adata_combined.obs['gene'].nunique()

scripts/detect_perturbed_cells.py

"""
Detect Perturbed Cells Using Outlier Detection

For each perturbation, identify cells with significant phenotypic changes using
outlier detection methods in PCA space computed on differentially expressed genes.
"""

import numpy as np
import pandas as pd
import scanpy as sc
import anndata as ad
from sklearn.svm import OneClassSVM
from sklearn.neighbors import LocalOutlierFactor
from sklearn.ensemble import IsolationForest
import diffxpy.api as de
from typing import Dict, List, Optional, Literal


def detect_perturbed_cells(
    adata: ad.AnnData,
    control_group: str = 'non-targeting',
    gene_col: str = 'gene',
    sgrna_col: str = 'sgRNA',
    n_pcs: int = 4,
    method: Literal['LocalOutlierFactor', 'IsolationForest', 'OneClassSVM'] = 'LocalOutlierFactor',
    min_de_genes: int = 10,
    min_outliers: int = 10,
    pval_threshold: float = 0.05,
    save_plots: bool = True,
    plot_dir: str = 'figures/'
) -> Dict:
    """
    Detect perturbed cells with significant phenotypes using outlier detection.

    For each perturbation vs control:
    1. Run differential expression (t-test)
    2. Select top DE genes (p < pval_threshold)
    3. If ≥min_de_genes found, compute PCA on DE genes
    4. Train outlier detector on control cells
    5. Predict outliers in perturbed cells
    6. If ≥min_outliers detected, flag as "hit"

    Parameters
    ----------
    adata : AnnData
        Log-normalized AnnData (not scaled)
    control_group : str, default='non-targeting'
        Label for non-targeting control
    gene_col : str, default='gene'
        Column in obs with gene/perturbation labels
    sgrna_col : str, default='sgRNA'
        Column in obs with sgRNA labels
    n_pcs : int, default=4
        Number of PCs to compute on DE genes
    method : str, default='LocalOutlierFactor'
        Outlier detection method: 'LocalOutlierFactor', 'IsolationForest', 'OneClassSVM'
    min_de_genes : int, default=10
        Minimum DE genes required to proceed with outlier detection
    min_outliers : int, default=10
        Minimum outliers to call perturbation as "hit"
    pval_threshold : float, default=0.05
        P-value threshold for calling DE genes
    save_plots : bool, default=True
        Generate UMAP/PCA plots for each perturbation
    plot_dir : str, default='figures/'
        Directory to save plots

    Returns
    -------
    results : dict
        Dictionary containing:
        - 'outlier_cells': list of cell barcodes flagged as outliers
        - 'perturbation_summary': DataFrame with hit calls per perturbation
        - 'de_results': dict of DE results per perturbation
        - 'adata_dict': dict of per-perturbation AnnData objects with classifications

    Example
    -------
    >>> results = detect_perturbed_cells(
    ...     adata,

scripts/differential_expression.py

"""
Differential Expression Analysis per Perturbation

For each perturbation, perform differential expression analysis comparing
perturbed cells to non-targeting controls.
"""

import os
import pandas as pd
import anndata as ad
import diffxpy.api as de
from typing import Dict, Optional, Literal


def run_de_analysis(
    adata: ad.AnnData,
    control_group: str = 'non-targeting',
    gene_col: str = 'gene',
    test_method: Literal['t-test', 'wilcoxon', 'negative_binomial'] = 't-test',
    use_outliers_only: bool = True,
    outlier_cells: Optional[list] = None,
    top_n_genes: int = 50,
    output_dir: str = 'DEG/rank_test/',
    is_logged: bool = True
) -> Dict[str, pd.DataFrame]:
    """
    Run differential expression analysis for each perturbation vs control.

    Parameters
    ----------
    adata : AnnData
        Log-normalized AnnData (use adata.raw or normalized data)
    control_group : str, default='non-targeting'
        Label for control group
    gene_col : str, default='gene'
        Column in obs with perturbation labels
    test_method : str, default='t-test'
        Statistical test: 't-test', 'wilcoxon', 'negative_binomial'
    use_outliers_only : bool, default=True
        If True and outlier_cells provided, use only outlier cells for DE
    outlier_cells : list, optional
        List of cell barcodes identified as outliers (from detect_perturbed_cells)
    top_n_genes : int, default=50
        Number of top genes to highlight in output
    output_dir : str, default='DEG/rank_test/'
        Directory to save per-perturbation DE results
    is_logged : bool, default=True
        Whether data is log-transformed

    Returns
    -------
    de_results : dict
        Dictionary mapping gene -> DE results DataFrame

    Example
    -------
    >>> de_results = run_de_analysis(
    ...     adata,
    ...     control_group='non-targeting',
    ...     test_method='t-test',
    ...     use_outliers_only=True,
    ...     outlier_cells=results['outlier_cells'],
    ...     top_n_genes=50
    ... )
    """
    os.makedirs(output_dir, exist_ok=True)

    # Subset to outliers + controls if requested
    if use_outliers_only and outlier_cells is not None:
        print(f"Subsetting to outlier cells + controls...")
        control_cell_barcodes = adata[adata.obs[gene_col] == control_group].obs_names.tolist()
        kept_cells = list(set(outlier_cells + control_cell_barcodes))
        adata_selected = adata[adata.obs_names.isin(kept_cells)].copy()
        print(f"  Using {len(outlier_cells)} outlier cells + {len(control_cell_barcodes)} control cells")
    else:
        adata_selected = adata.copy()
        print("Using all cells for DE analysis")

    # Get unique perturbations
    genes = adata_selected.obs[gene_col].unique().tolist()

Companion files

Type	Path	Bytes
Markdown	references/crispr_screen_best_practices.md	10,033
Markdown	references/qc_guidelines.md	19,037
Markdown	references/statistical_methods.md	16,067
Markdown	references/troubleshooting_guide.md	17,341
Markdown	references/umi_optimization.md	8,274
Python	scripts/concatenate_libraries.py	4,154
Python	scripts/detect_perturbed_cells.py	9,271
Python	scripts/differential_expression.py	6,364
Python	scripts/differential_expression_glmgampoi.py	9,804
Python	scripts/export_results.py	8,954
Python	scripts/expression_filtering.py	4,425
Python	scripts/gene_name_corrections.py	5,335
Python	scripts/generate_report.py	19,776
Python	scripts/load_10x_libraries.py	1,613
Python	scripts/load_example_data.py	15,150
Python	scripts/map_sgrna_to_cells.py	3,454
Python	scripts/normalize_and_scale.py	3,845
Python	scripts/qc_filtering.py	5,267
R	scripts/run_glmgampoi.R	5,862
Python	scripts/screen_all_perturbations.py	9,968
Python	scripts/validate_perturbations.py	11,843
Python	scripts/visualize_perturbations.py	9,421
Markdown	SKILL.md	16,975
JSON	skill.meta.json	4,359