hERG Karim et al.
Tox
Dataset Description
An integrated Ether-a-go-go-related gene (hERG) dataset of molecular structures, given as SMILES strings and labelled as hERG blockers (<10uM) or non-blockers (>=10uM). The data were compiled from DeepHIT, the BindingDB database, the ChEMBL bioactivity database, and other literature.
Task Description
Binary classification. Given a drug SMILES string, predict whether it blocks hERG (1, <10uM) or does not block hERG (0, >=10uM).
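The labelling rule above can be sketched in a few lines. This is only an illustration of the 10 uM cutoff, not the curation code used by the dataset authors; the function name `label_herg_blocker` and the idea that the input is a single potency value in micromolar are assumptions made for the example.

```python
# Cutoff stated in the dataset description (micromolar).
HERG_THRESHOLD_UM = 10.0


def label_herg_blocker(potency_um: float) -> int:
    """Return 1 (blocker) if the potency is below 10 uM, else 0 (non-blocker).

    `potency_um` is a hypothetical input: one measured potency in micromolar.
    """
    return 1 if potency_um < HERG_THRESHOLD_UM else 0


# Values exactly at the cutoff fall in the non-blocker class (>=10uM).
example_potencies = [0.5, 9.99, 10.0, 35.0]
labels = [label_herg_blocker(v) for v in example_potencies]
print(labels)  # [1, 1, 0, 0]
```

Note that the boundary value of exactly 10 uM is assigned label 0, matching the ">=10uM" wording in the description.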
Dataset Statistics
13,445 drugs.
Available Splits
Random Split
Scaffold Split
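To illustrate how a scaffold split differs from a random split, here is a minimal sketch of the general idea: molecules that share a scaffold are kept in the same subset, so the test set contains scaffolds unseen during training. It is not necessarily how the library implements its split. The helper `scaffold_split`, the toy scaffold strings, and the fractions are all assumptions, and the scaffolds are taken as precomputed (in practice they are typically derived from the SMILES with a cheminformatics toolkit).

```python
from collections import defaultdict


def scaffold_split(scaffolds, frac=(0.7, 0.1, 0.2)):
    """Split row indices so that no scaffold appears in more than one subset.

    `scaffolds` holds one precomputed scaffold string per molecule
    (hypothetical input for this sketch). Returns (train, valid, test)
    lists of row indices.
    """
    # Group molecule indices by their scaffold.
    groups = defaultdict(list)
    for idx, scaffold in enumerate(scaffolds):
        groups[scaffold].append(idx)

    # Place the largest groups first; break ties by name for determinism.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))

    n = len(scaffolds)
    train_cap = frac[0] * n
    valid_cap = frac[1] * n

    train, valid, test = [], [], []
    for _, members in ordered:
        # A whole group goes to the first subset that still has room.
        if len(train) + len(members) <= train_cap:
            train.extend(members)
        elif len(valid) + len(members) <= valid_cap:
            valid.extend(members)
        else:
            test.extend(members)
    return train, valid, test


# Toy data: three scaffolds shared by 5, 3, and 2 molecules.
toy_scaffolds = ["c1ccccc1"] * 5 + ["c1ccncc1"] * 3 + ["C1CCCCC1"] * 2
train, valid, test = scaffold_split(toy_scaffolds, frac=(0.5, 0.3, 0.2))

for name, subset in [("train", train), ("valid", valid), ("test", test)]:
    print(name, sorted({toy_scaffolds[i] for i in subset}))
```

Because whole groups are assigned at once, the realized subset sizes only approximate the requested fractions; that is the usual trade-off for evaluating generalization to new chemotypes.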
Usage Example
from tdc_ml.single_pred import Tox

data = Tox(name='hERG_Karim')

# Access the data
df = data.get_data()
print(df.head())

# Get train/val/test splits
split = data.get_split()
print(split)
License
This dataset is licensed under CC BY 4.0.