PDB, Jespersen et al.
Epitope
Dataset Description
Epitope prediction aims to identify the region of an antigen that is active in antibody binding. This dataset comes from BepiPred, which curated it from the PDB. It contains B-cell epitope and non-epitope amino acids determined from crystal structures.
Task Description
Token-level classification. Given an antigen's amino acid sequence, predict which amino acid tokens are active in binding. That is, X is an amino acid sequence and Y is a list of indices of the active tokens in X.
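To make the X/Y format concrete, the sketch below expands a list of active indices into one 0/1 label per residue, which is the shape most token-level classifiers expect. The helper name `indices_to_mask` and the toy sequence are hypothetical, and the code assumes the indices in Y are 0-based; check this against the actual data before relying on it.

```python
from typing import List


def indices_to_mask(sequence: str, active_indices: List[int]) -> List[int]:
    """Return one 0/1 label per residue; 1 marks an active (epitope) position.

    Assumption: indices are 0-based positions into `sequence`.
    """
    mask = [0] * len(sequence)
    for idx in active_indices:
        if not 0 <= idx < len(sequence):
            raise ValueError(f"index {idx} is outside a sequence of length {len(sequence)}")
        mask[idx] = 1
    return mask


# Hypothetical toy example, not taken from the dataset.
toy_sequence = "MKTAYIAK"
toy_active = [1, 2, 5]
toy_mask = indices_to_mask(toy_sequence, toy_active)
print(toy_mask)
```

The mask has the same length as the sequence, so per-residue predictions and labels can be compared position by position.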
Dataset Statistics
447 antigens.
Available Splits
Random Split
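As a rough illustration of what a random split does, the sketch below shuffles antigen indices with a fixed seed and cuts them into train, validation, and test subsets. The function name `random_split`, the 70/10/20 fractions, and the seed are assumptions made for this example, not confirmed settings of the dataset loader; splitting at the antigen level (rather than per residue) is likewise an assumption.

```python
import random
from typing import Dict, List, Sequence


def random_split(
    n_items: int,
    frac: Sequence[float] = (0.7, 0.1, 0.2),  # assumed fractions, for illustration only
    seed: int = 42,                            # assumed seed, for reproducibility
) -> Dict[str, List[int]]:
    """Shuffle item indices and partition them into train/valid/test."""
    if len(frac) != 3 or abs(sum(frac) - 1.0) > 1e-9:
        raise ValueError("frac must contain three values that sum to 1")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # local RNG, so global random state is untouched
    n_train = int(frac[0] * n_items)
    n_valid = int(frac[1] * n_items)
    return {
        "train": indices[:n_train],
        "valid": indices[n_train:n_train + n_valid],
        "test": indices[n_train + n_valid:],  # remainder goes to the test set
    }


# 447 is the number of antigens reported in the dataset statistics.
split_indices = random_split(447)
print({name: len(idx) for name, idx in split_indices.items()})
```

Because whole antigens are assigned to a subset, all residues of one sequence stay together in the same subset.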
Usage Example
    from tdc_ml.single_pred import Epitope

    data = Epitope(name='PDB_Jespersen')

    # Access the data
    df = data.get_data()
    print(df.head())

    # Get train/val/test splits
    split = data.get_split()
    print(split)
References
License
This dataset is licensed under CC BY 4.0.