Carcinogens

Tox

Dataset Description

A carcinogen is any substance, radionuclide, or radiation that promotes carcinogenesis, the formation of cancer. This may be due to the ability to damage the genome or to the disruption of cellular metabolic processes.

Task Description

Binary classification. Given a drug SMILES string, predict whether the drug is carcinogenic.
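To make the task concrete, here is a minimal sketch of how binary predictions might be scored. The helper names `threshold` and `accuracy`, the 0.5 cutoff, the toy values, and the convention that label 1 means carcinogenic are all assumptions for illustration; they are not taken from the dataset or the library.

```python
def threshold(scores, cutoff=0.5):
    """Turn predicted probabilities into 0/1 labels (cutoff is an assumption)."""
    return [1 if s >= cutoff else 0 for s in scores]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty label list")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values only, not real dataset entries.
# Assumed convention: 1 = carcinogenic, 0 = not carcinogenic.
toy_labels = [1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.4]

toy_preds = threshold(toy_scores)
print(accuracy(toy_labels, toy_preds))
```

With only 278 drugs, a single accuracy figure can vary noticeably between splits, so it may be worth repeating the evaluation over several seeds.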

Dataset Statistics

278 drugs.

Available Splits

Random Split, Scaffold Split
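As a rough illustration of what a random split does, the sketch below shuffles row indices with a fixed seed and cuts them into three parts. The function name `random_split`, the 70/10/20 fractions, and the seed are assumptions chosen for the example, not values confirmed by this page. A scaffold split instead groups molecules by their core structure so that test molecules have scaffolds unseen in training; that needs a cheminformatics toolkit and is not shown here.

```python
import random


def random_split(n_items, frac=(0.7, 0.1, 0.2), seed=42):
    """Shuffle indices 0..n_items-1 and cut them into train/valid/test.

    `frac` and `seed` are illustrative defaults (assumptions).
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded, so the split is reproducible

    n_train = int(n_items * frac[0])
    n_valid = int(n_items * frac[1])
    return {
        "train": indices[:n_train],
        "valid": indices[n_train:n_train + n_valid],
        "test": indices[n_train + n_valid:],  # remainder goes to test
    }


# 278 matches the dataset size stated on this page.
split_indices = random_split(278)
print({name: len(idx) for name, idx in split_indices.items()})
```

Assigning the remainder to the test set ensures every row lands in exactly one part, even when the fractions do not divide the dataset evenly.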

Usage Example

from tdc_ml.single_pred import Tox

data = Tox(name='Carcinogens_Lagunin')

# Access the data
df = data.get_data()
print(df.head())

# Get train/val/test splits
split = data.get_split()
print(split)

License

This dataset is licensed under CC BY 4.0.