PDB, Jespersen et al.

Epitope

Dataset Description

Epitope prediction aims to identify the active region of an antigen, that is, the part involved in binding. This dataset comes from BepiPred (Jespersen et al.), which curated it from the Protein Data Bank (PDB). It contains B-cell epitope and non-epitope amino acids determined from crystal structures.

Task Description

Token-level classification. Given an antigen's amino acid sequence, predict which amino acid tokens are active in binding. That is, X is an amino acid sequence and Y is a list of the indices of the active tokens in X.
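Because Y is a list of indices and not one label per residue, it can help to expand it into a 0/1 vector aligned with X before training a token-level classifier. The following is a minimal sketch and not part of the dataset's API: the helper name `indices_to_labels` and the toy sequence are hypothetical, and it assumes the indices are 0-based, which should be checked against the loaded data.

```python
from typing import List


def indices_to_labels(sequence: str, active_indices: List[int]) -> List[int]:
    """Expand a list of active positions into one 0/1 label per residue.

    Assumes 0-based indices (an assumption, not confirmed by the dataset card).
    """
    labels = [0] * len(sequence)
    for i in active_indices:
        if not 0 <= i < len(sequence):
            raise ValueError(f"index {i} is outside a sequence of length {len(sequence)}")
        labels[i] = 1
    return labels


# Toy example with a made-up sequence (not taken from the dataset)
sequence = "MKTAYIAK"
active = [1, 2, 6]
labels = indices_to_labels(sequence, active)
print(labels)  # [0, 1, 1, 0, 0, 0, 1, 0]
```

Each position in `labels` lines up with the residue at the same position in `sequence`, so the pair can be passed directly to a per-token loss.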

Dataset Statistics

447 antigens.

Available Splits

Random Split

Usage Example

from tdc_ml.single_pred import Epitope

# Load the PDB_Jespersen epitope dataset
data = Epitope(name='PDB_Jespersen')

# Access the data
df = data.get_data()
print(df.head())

# Get train/val/test splits
split = data.get_split()
print(split)

License

This dataset is licensed under CC BY 4.0.