CYP2C9 Substrate, Carbon-Mangels et al.
ADME
Dataset Description
Cytochrome P450 2C9 (CYP2C9) plays a major role in the oxidation of both xenobiotic and endogenous compounds. Substrates are drugs that the enzyme metabolizes. TDC uses the dataset from [1], which merges information on substrates and nonsubstrates from six publications.
Task Description
Binary classification. Given a drug's SMILES string, predict whether the drug is a substrate of the CYP2C9 enzyme.
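As a rough illustration of the task's input/output contract, the sketch below fits a majority-class baseline on (SMILES, label) pairs and scores it by accuracy. The toy records, the helper names `fit_majority_baseline` and `accuracy`, and the label convention (1 = substrate, 0 = nonsubstrate) are assumptions made for this example; the labels shown are placeholders, not values from the dataset.

```python
from collections import Counter

# Toy records in the (SMILES, label) shape the task assumes.
# The labels are placeholders for illustration only, NOT dataset values.
# Assumed convention: 1 = substrate, 0 = nonsubstrate.
toy_records = [
    ("CCO", 0),
    ("c1ccccc1", 0),
    ("CC(=O)O", 1),
    ("CCN", 0),
]


def fit_majority_baseline(records):
    """Return a predictor that always outputs the most common label."""
    counts = Counter(label for _, label in records)
    majority_label = counts.most_common(1)[0][0]

    def predict(smiles):
        # A real model would featurize the SMILES string here;
        # this baseline ignores its input entirely.
        return majority_label

    return predict


def accuracy(predict, records):
    """Fraction of records whose predicted label matches the true label."""
    correct = sum(predict(smiles) == label for smiles, label in records)
    return correct / len(records)


predict = fit_majority_baseline(toy_records)
print(predict("CCC"), accuracy(predict, toy_records))
```

Any real model for this task should beat such a baseline; if it does not, the features or the training setup probably need another look.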
Dataset Statistics
666 drugs.
Available Splits
- Random Split
- Scaffold Split
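The difference between the two strategies can be sketched in plain Python. This is a minimal illustration, not TDC's implementation: the function names, the 70/10/20 fractions, and the seed are assumptions, and the scaffold keys are taken as precomputed strings (in practice they would typically be derived from each SMILES string with a cheminformatics toolkit). The property worth noticing is that a scaffold split assigns every molecule sharing a scaffold to the same partition, so the test set contains structurally unseen chemotypes.

```python
import random
from collections import defaultdict


def random_split(n_items, frac=(0.7, 0.1, 0.2), seed=42):
    """Shuffle indices 0..n_items-1 and cut them into train/valid/test."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = int(frac[0] * n_items)
    n_valid = int(frac[1] * n_items)
    return {
        "train": indices[:n_train],
        "valid": indices[n_train:n_train + n_valid],
        "test": indices[n_train + n_valid:],
    }


def scaffold_split(scaffolds, frac=(0.7, 0.1, 0.2)):
    """Assign whole scaffold groups to a single partition.

    `scaffolds` is a list with one precomputed scaffold key per molecule
    (hypothetical input for this sketch).
    """
    groups = defaultdict(list)
    for idx, key in enumerate(scaffolds):
        groups[key].append(idx)

    # Place the largest scaffold families first, filling train, then valid;
    # whatever does not fit goes to test.
    ordered = sorted(groups.values(), key=len, reverse=True)
    n_items = len(scaffolds)
    train_cap = frac[0] * n_items
    valid_cap = frac[1] * n_items

    split = {"train": [], "valid": [], "test": []}
    for group in ordered:
        if len(split["train"]) + len(group) <= train_cap:
            split["train"].extend(group)
        elif len(split["valid"]) + len(group) <= valid_cap:
            split["valid"].extend(group)
        else:
            split["test"].extend(group)
    return split


# Usage with made-up scaffold keys
toy_scaffolds = ["A", "A", "A", "B", "B", "C", "A", "D", "B", "A"]
print(scaffold_split(toy_scaffolds))
print({name: len(idx) for name, idx in random_split(666).items()})
```

A random split tends to give more optimistic scores because close analogues of test molecules can appear in training; a scaffold split is the stricter test of generalization.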
Usage Example
from tdc_ml.single_pred import ADME

data = ADME(name='CYP2C9_Substrate_CarbonMangels')

# Access the data
df = data.get_data()
print(df.head())

# Get train/val/test splits
split = data.get_split()
print(split)
References
- [1] Carbon-Mangels, Miriam, and Michael C. Hutter. “Selecting relevant descriptors for classification by Bayesian estimates: a comparison with decision trees and support vector machines approaches for disparate data sets.” Molecular Informatics 30.10 (2011): 885-895.
- [2] Cheng, Feixiong, et al. “admetSAR: a comprehensive source and free tool for assessment of chemical ADMET properties.” Journal of Chemical Information and Modeling 52.11 (2012): 3099-3105.
License
This dataset is licensed under CC BY 4.0