CYP2D6 Substrate, Carbon-Mangels et al.
ADME
Dataset Description
CYP2D6 is a cytochrome P450 enzyme expressed primarily in the liver. It is also highly expressed in areas of the central nervous system, including the substantia nigra. TDC uses the dataset from [1], which merged information on substrates and nonsubstrates from six publications.
Task Description
Binary Classification. Given a drug SMILES string, predict whether the drug is a substrate of CYP2D6.
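To make the task concrete, here is a minimal sketch of how predictions for a binary task like this one could be scored. The source does not name an evaluation metric, so AUROC is an assumption chosen for illustration, as is the label encoding (1 = substrate, 0 = nonsubstrate). The labels and scores are made-up placeholders, not values from the dataset.

```python
def auroc(y_true, y_score):
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as half a win).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Placeholder data: assumed encoding 1 = substrate, 0 = nonsubstrate.
y_true = [1, 0, 1, 0, 0]
# Hypothetical model scores for the same five molecules.
y_score = [0.9, 0.2, 0.4, 0.6, 0.1]

print(f"AUROC: {auroc(y_true, y_score):.3f}")
```

The pairwise form is quadratic in the number of examples, which is acceptable for a dataset of a few hundred drugs. A rank-based implementation would be preferable at larger scale.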
Dataset Statistics
664 drugs.
Available Splits
Random Split, Scaffold Split
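The idea behind a scaffold split can be sketched as follows: molecules that share a core scaffold are kept in the same subset, so the test set contains chemotypes the model has not seen in training. This is an illustration under stated assumptions, not TDC's implementation. The scaffold keys are hypothetical strings assumed to be precomputed elsewhere (in practice usually Bemis-Murcko scaffolds from a cheminformatics toolkit). The greedy largest-group-first assignment and the 70/10/20 fractions are also assumptions.

```python
from collections import defaultdict


def scaffold_split(scaffolds, frac=(0.7, 0.1, 0.2)):
    """Split molecule indices so that no scaffold group spans two subsets.

    scaffolds: one scaffold key per molecule (assumed precomputed).
    frac: target (train, valid, test) fractions; test takes the remainder.
    Returns a dict with index lists under "train", "valid" and "test".
    """
    # Group molecule indices by their scaffold key.
    groups = defaultdict(list)
    for idx, key in enumerate(scaffolds):
        groups[key].append(idx)

    # Largest groups first; ties broken by first index for determinism.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    n = len(scaffolds)
    train_cap = frac[0] * n
    valid_cap = frac[1] * n

    split = {"train": [], "valid": [], "test": []}
    for group in ordered:
        # Assign each whole group to the first subset with room for it.
        if len(split["train"]) + len(group) <= train_cap:
            split["train"].extend(group)
        elif len(split["valid"]) + len(group) <= valid_cap:
            split["valid"].extend(group)
        else:
            split["test"].extend(group)
    return split


# Hypothetical scaffold keys for ten molecules.
example_scaffolds = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "D"]
example_split = scaffold_split(example_scaffolds)
print(example_split)
```

Because whole groups are assigned at once, the resulting subset sizes only approximate the requested fractions. A random split, by contrast, would hit the fractions exactly but could place near-identical molecules on both sides of the train/test boundary.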
Usage Example
from tdc_ml.single_pred import ADME

data = ADME(name='CYP2D6_Substrate_CarbonMangels')

# Access the data
df = data.get_data()
print(df.head())

# Get train/val/test splits
split = data.get_split()
print(split)
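After splitting, it can be useful to check subset sizes and class balance before training. The helper below is a hypothetical sketch, not part of the library. It assumes the labels of each subset are available as plain lists of 0/1 values. If the split is a mapping from subset names to tables with a label column (assumed here to be named Y, which the source does not state), the labels would need to be extracted first. The example labels are placeholders.

```python
def summarize_split(labels_by_subset):
    """Report the size and positive-class fraction of each subset.

    labels_by_subset: mapping from a subset name (e.g. "train") to an
    iterable of binary labels, with 1 assumed to mean "substrate".
    """
    summary = {}
    for name, labels in labels_by_subset.items():
        labels = list(labels)
        n = len(labels)
        positives = sum(1 for y in labels if y == 1)
        summary[name] = {
            "n": n,
            # Guard against empty subsets to avoid division by zero.
            "positive_fraction": positives / n if n else float("nan"),
        }
    return summary


# Placeholder labels standing in for the label column of each subset.
toy_labels = {
    "train": [1, 0, 0, 1],
    "valid": [0, 1],
    "test": [0, 0, 1, 0],
}

for subset, stats in summarize_split(toy_labels).items():
    print(subset, stats)
```

A large gap in positive fraction between subsets, which is more likely with a scaffold split than a random one, is worth knowing about when interpreting validation and test scores.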
References
- [1] Carbon-Mangels, Miriam, and Michael C. Hutter. “Selecting relevant descriptors for classification by Bayesian estimates: a comparison with decision trees and support vector machines approaches for disparate data sets.” Molecular Informatics 30.10 (2011): 885-895.
- [2] Cheng, Feixiong, et al. “admetSAR: a comprehensive source and free tool for assessment of chemical ADMET properties.” Journal of Chemical Information and Modeling 52.11 (2012): 3099-3105.
License
This dataset is licensed under CC BY 4.0