Carcinogens
Tox
Dataset Description
A carcinogen is any substance, radionuclide, or radiation that promotes carcinogenesis, the formation of cancer. This may be due to the ability to damage the genome or to the disruption of cellular metabolic processes.
Task Description
Binary classification. Given a drug SMILES string, predict whether the drug is carcinogenic.
Dataset Statistics
278 drugs.
Available Splits
Random Split, Scaffold Split
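To illustrate how a scaffold split differs from a random one, here is a minimal sketch of group-wise splitting. It assumes each molecule already has a scaffold key (for example a Bemis-Murcko scaffold computed with a cheminformatics toolkit). The function name `scaffold_split`, the fractions, and the toy records are hypothetical and are not taken from the dataset or from the library, whose own implementation may differ.

```python
from collections import defaultdict


def scaffold_split(records, frac=(0.7, 0.1, 0.2)):
    """Assign whole scaffold groups to train, then valid, then test.

    records: iterable of (smiles, scaffold) pairs. The scaffold key is
    assumed to be precomputed; this sketch does not compute it.
    """
    groups = defaultdict(list)
    for smiles, scaffold in records:
        groups[scaffold].append(smiles)

    # Largest groups first, so rarer scaffolds tend to land in valid/test.
    ordered = sorted(groups.values(), key=len, reverse=True)

    n = sum(len(g) for g in ordered)
    train_cap = frac[0] * n
    valid_cap = frac[1] * n
    split = {"train": [], "valid": [], "test": []}
    for group in ordered:
        # A group is never divided: it goes entirely into one partition.
        if len(split["train"]) + len(group) <= train_cap:
            split["train"].extend(group)
        elif len(split["valid"]) + len(group) <= valid_cap:
            split["valid"].extend(group)
        else:
            split["test"].extend(group)
    return split


# Hypothetical toy records: (SMILES, scaffold key). Not dataset entries.
toy = [
    ("c1ccccc1O", "c1ccccc1"),
    ("c1ccccc1N", "c1ccccc1"),
    ("c1ccccc1C", "c1ccccc1"),
    ("c1ccccc1Cl", "c1ccccc1"),
    ("c1ccncc1C", "c1ccncc1"),
    ("c1ccncc1O", "c1ccncc1"),
    ("C1CCCCC1O", "C1CCCCC1"),
    ("C1CCCCC1N", "C1CCCCC1"),
    ("C1CCOC1C", "C1CCOC1"),
    ("C1CCNC1C", "C1CCNC1"),
]
parts = scaffold_split(toy)
for name, smiles_list in parts.items():
    print(name, len(smiles_list))
```

Because no scaffold is shared between partitions, the test set contains structural families the model has not seen during training. This usually makes a scaffold split a harder and more realistic benchmark than a random split, which matters for a dataset of only 278 drugs.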
Usage Example
from tdc_ml.single_pred import Tox

data = Tox(name='Carcinogens_Lagunin')

# Access the data
df = data.get_data()
print(df.head())

# Get train/val/test splits
split = data.get_split()
print(split)
License
This dataset is licensed under CC BY 4.0.