CRISPROutcome
CRISPR Repair Prediction
Task Overview
CRISPR-Cas9 is a gene-editing technology that enables targeted deletion or modification of specific regions of an organism's DNA. A guide RNA is designed to bind upstream of the target site, which Cas9 then cleaves by introducing a double-stranded DNA break. The cell responds with DNA repair mechanisms (such as non-homologous end joining) that produce heterogeneous outcomes, including insertion and deletion mutations (indels) of varying lengths and frequencies. This task aims to predict the repair outcome given a DNA sequence.
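As a minimal sketch of how a DNA sequence might be presented to a model for this task, the snippet below one-hot encodes a sequence string. The function name `one_hot_encode`, the A/C/G/T channel order, and the example sequence are illustrative assumptions and are not part of the TDC API.

```python
# Illustrative sketch only: channel order and example sequence are assumptions.
NUCLEOTIDES = "ACGT"


def one_hot_encode(sequence):
    """Return one 4-element row per base, in A, C, G, T order.

    Bases outside ACGT (for example N) become an all-zero row.
    """
    rows = []
    for base in sequence.upper():
        rows.append([1 if base == nt else 0 for nt in NUCLEOTIDES])
    return rows


example = "GACGTN"  # hypothetical sequence, not taken from any dataset
encoded = one_hot_encode(example)
for base, row in zip(example, encoded):
    print(base, row)
```

A fixed-width numeric encoding like this is one common input format for sequence models; other featurizations (for example k-mer counts) may suit a given model equally well.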
Impact
Gene editing offers a powerful new avenue of research for tackling illnesses that are intractable to conventional approaches. For example, the FDA recently approved engineering of T cells using gene editing to treat patients with acute lymphoblastic leukemia. However, since many human genetic variants associated with disease arise from insertions and deletions, it is critical to predict gene editing outcomes more accurately, both to ensure efficacy and to avoid unwanted pathogenic mutations.
Generalization
The distribution of Cas9-mediated editing products at a given target site is reproducible and depends on local sequence context. Thus, well-trained models are expected to predict repair outcomes that generalize across cell lines and reagent delivery methods.
Product
Cell and gene therapy.
Pipeline Stage
Efficacy and safety.
Available Datasets
Usage Example
You can access these datasets using the PyTDC library:
from tdc_ml.single_pred import CRISPROutcome
# Load a dataset
data = CRISPROutcome(name='Leenay')
# Access the data
df = data.get_data()
print(df.head())
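
Before fitting a model, the loaded rows are usually divided into training, validation, and test sets. The snippet below is a minimal standard-library sketch of a seeded random split. The function `random_split`, the 70/10/20 fractions, and the placeholder records are assumptions for illustration, not the library's own splitting routine.

```python
import random


def random_split(records, frac_train=0.7, frac_valid=0.1, seed=42):
    """Shuffle records with a fixed seed and cut them into three parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    # round() avoids off-by-one sizes caused by floating-point products.
    n_train = round(len(shuffled) * frac_train)
    n_valid = round(len(shuffled) * frac_valid)
    return {
        "train": shuffled[:n_train],
        "valid": shuffled[n_train:n_train + n_valid],
        "test": shuffled[n_train + n_valid:],
    }


# Placeholder identifiers standing in for dataset rows (hypothetical).
records = [f"seq_{i}" for i in range(10)]
split = random_split(records)
print({name: len(rows) for name, rows in split.items()})
```

Fixing the seed keeps the split reproducible across runs, which makes results easier to compare between models.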