hERG Karim et al.

Tox

Dataset Description

An integrated Ether-a-go-go-related gene (hERG) dataset of molecular structures, given as SMILES strings and labelled as hERG blockers (<10uM) or non-blockers (>=10uM). The data was collected from DeepHIT, the BindingDB database, the ChEMBL bioactivity database, and other literature.

Task Description

Binary classification. Given a drug SMILES string, predict whether it blocks hERG (1, <10uM) or does not block it (0, >=10uM).
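The labelling rule above can be sketched in a few lines. This is only an illustration of the 10uM cutoff stated in the task description; the function name `label_herg_blocker` and the sample potency values are hypothetical, and the dataset itself ships with labels already assigned.

```python
def label_herg_blocker(potency_um: float, threshold_um: float = 10.0) -> int:
    """Return 1 (blocker) if the potency is below the threshold, else 0.

    The 10 uM threshold comes from the dataset description; the function
    itself is a hypothetical helper, not part of any library.
    """
    if potency_um < 0:
        raise ValueError("potency must be non-negative")
    return 1 if potency_um < threshold_um else 0


# Hypothetical potency values in uM, chosen to show the boundary behaviour.
examples = [0.5, 9.99, 10.0, 35.0]
labels = [label_herg_blocker(v) for v in examples]
print(labels)
```

Note that a value of exactly 10uM falls in the non-blocker class, because the blocker condition is strictly less than 10uM.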

Dataset Statistics

13,445 drugs.

Available Splits

Random Split, Scaffold Split

Usage Example

from tdc_ml.single_pred import Tox

data = Tox(name='hERG_Karim')

# Access the data
df = data.get_data()
print(df.head())

# Get train/val/test splits
split = data.get_split()
print(split)

License

This dataset is licensed under CC BY 4.0.