
CYP3A4 Substrate, Carbon-Mangels et al.

ADME

Dataset Description

CYP3A4 is an important enzyme in the body, mainly found in the liver and in the intestine. It oxidizes small foreign organic molecules (xenobiotics), such as toxins or drugs, so that they can be removed from the body. TDC used a dataset from [1], which merged information on substrates and nonsubstrates from six publications.

Task Description

Binary Classification. Given a drug SMILES string, predict whether it is a substrate of the CYP3A4 enzyme.
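A binary classifier for this task is often scored with a ranking metric such as AUROC. The sketch below computes AUROC in plain Python from pairwise comparisons. It assumes labels are encoded as 1 (substrate) and 0 (nonsubstrate), which this page does not state, and the function name `auroc` and the toy scores are purely illustrative; it is not presented as TDC's own evaluator.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Assumes labels are 1 (substrate) or 0 (nonsubstrate); this encoding
    is an assumption, not something the dataset page specifies.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")

    # Count how often a positive example outranks a negative one;
    # ties contribute half a win.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and predicted scores (hypothetical, not taken from the dataset)
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(auroc(y_true, y_score))  # 0.75
```

The pairwise form is quadratic in the number of examples, which is acceptable for a dataset of a few hundred drugs.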

Dataset Statistics

667 drugs.

Available Splits

Random Split, Scaffold Split
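To illustrate what a random split does, here is a minimal, self-contained sketch in plain Python. The 70/10/20 fractions, the seed, the `random_split` helper, and the placeholder records are all assumptions made for illustration; this is not TDC's implementation, and a scaffold split would instead group molecules by chemical scaffold before assigning them to partitions.

```python
import random


def random_split(records, frac=(0.7, 0.1, 0.2), seed=42):
    """Shuffle records and cut them into train/valid/test partitions.

    `frac` and `seed` are illustrative defaults, not values taken from
    the dataset page.
    """
    if abs(sum(frac) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")

    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility

    n_train = int(len(shuffled) * frac[0])
    n_valid = int(len(shuffled) * frac[1])
    return {
        "train": shuffled[:n_train],
        "valid": shuffled[n_train:n_train + n_valid],
        "test": shuffled[n_train + n_valid:],  # remainder goes to test
    }


# Placeholder identifiers standing in for the 667 drugs in the dataset
drugs = [f"drug_{i}" for i in range(667)]
parts = random_split(drugs)
print({name: len(items) for name, items in parts.items()})
```

Fixing the seed makes the partition reproducible, so results from different models can be compared on the same test set.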

Usage Example

from tdc_ml.single_pred import ADME

data = ADME(name='CYP3A4_Substrate_CarbonMangels')

# Access the data
df = data.get_data()
print(df.head())

# Get train/val/test splits
split = data.get_split()
print(split)

License

This dataset is licensed under CC BY 4.0.