CRISPROutcome

CRISPR Repair Prediction

Task Overview

CRISPR-Cas9 is a gene editing technology that allows targeted deletion or modification of specific regions of an organism's DNA. A guide RNA is designed to bind the target site, which Cas9 then cleaves, producing a double-stranded DNA break. The cell responds with DNA repair mechanisms (such as non-homologous end joining) that yield heterogeneous outcomes, including insertion or deletion mutations (indels) of varying lengths and frequencies. This task aims to predict the repair outcome given a DNA sequence.
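To make the cleavage and repair steps concrete, here is a minimal sketch that locates the expected break position in a toy sequence and applies one hypothetical deletion outcome. It assumes the commonly used SpCas9 enzyme, which recognizes an NGG PAM next to a 20-nt protospacer and cuts 3 bp upstream of that PAM. The function names `find_cut_sites` and `apply_deletion` and the example sequence are illustrative only and are not part of the TDC library. Only the forward strand is scanned.

```python
import re


def find_cut_sites(target: str) -> list[int]:
    """Return 0-based cut indices for each 20-nt protospacer followed by an NGG PAM.

    An index i means the break falls between target[i-1] and target[i].
    Assumes SpCas9 (NGG PAM, cut 3 bp upstream of the PAM); forward strand only.
    """
    target = target.upper()
    sites = []
    # A lookahead is used so that overlapping candidate sites are all reported.
    for match in re.finditer(r"(?=[ACGT]{20}[ACGT]GG)", target):
        pam_start = match.start() + 20
        sites.append(pam_start - 3)
    return sites


def apply_deletion(target: str, cut: int, left: int, right: int) -> str:
    """Simulate one repair outcome: drop `left` bases before and `right` bases after the cut."""
    return target[: cut - left] + target[cut + right:]


# Toy sequence (hypothetical): 2-nt flank + 20-nt protospacer + TGG PAM + 3-nt flank.
example = "ATGACGTTAGCATCGATACCTATGGCAT"
cuts = find_cut_sites(example)
edited = apply_deletion(example, cuts[0], left=2, right=1)
print(cuts, edited)
```

A real repair profile is a distribution over many such indels. The prediction task summarizes that distribution with numeric labels rather than enumerating individual edited sequences.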

Impact

Gene editing offers a powerful new avenue of research for tackling intractable illnesses that are infeasible to treat using conventional approaches. For example, the FDA recently approved engineering of T-cells using gene editing to treat patients with acute lymphoblastic leukemia. However, since many human genetic variants associated with disease arise from insertions and deletions, it is critical to be able to better predict gene editing outcomes to ensure efficacy and avoid unwanted pathogenic mutations.

Generalization

The distribution of Cas9-mediated editing products at a given target site is reproducible and depends on local sequence context. Well-trained models are therefore expected to generalize across cell lines and reagent delivery methods.

Product

Cell and gene therapy.

Pipeline Stage

Efficacy and safety.

Available Datasets

Usage Example

You can access these datasets using the PyTDC library:

from tdc_ml.single_pred import CRISPROutcome

# Load a dataset
data = CRISPROutcome(name='Leenay')

# Access the data
df = data.get_data()
print(df.head())
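Once the sequences are loaded, they usually have to be turned into numeric features before a model can be fit. The following is a minimal sketch of one common choice, one-hot encoding, written in plain Python. The helper names `one_hot` and `flatten` are hypothetical and not part of PyTDC, and the input is a literal string rather than a column from the loaded DataFrame, since the column layout is not shown here.

```python
BASES = "ACGT"


def one_hot(seq: str) -> list[list[int]]:
    """Encode a DNA string as one 4-element indicator row (A, C, G, T) per base."""
    rows = []
    for base in seq.upper():
        # Any symbol outside ACGT (for example N) becomes an all-zero row.
        rows.append([int(base == b) for b in BASES])
    return rows


def flatten(rows: list[list[int]]) -> list[int]:
    """Concatenate the per-base rows into a single feature vector."""
    return [value for row in rows for value in row]


features = flatten(one_hot("ACGTN"))
print(len(features), features)
```

Each sequence of length n yields a vector of length 4n, so sequences must share a length (or be padded) before being stacked into a feature matrix.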