IEDB, Jespersen et al.

Epitope

Dataset Description

Epitope prediction aims to identify the region of an antigen that is active in binding. This dataset comes from BepiPred-2.0 [2], which curated it from IEDB [1]. It contains B-cell epitope and non-epitope amino acids determined from crystal structures.

Task Description

Token-level classification. Given an amino acid sequence, predict which amino acid tokens are active in binding. X is the amino acid sequence, and Y is a list of indices of the active positions in X.
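Because Y is given as a list of indices while token-level classifiers usually expect one label per residue, a small conversion step is often useful. The following is a minimal sketch, not part of the library: the helper name indices_to_labels and the toy sequence are hypothetical, and the indices are assumed to be 0-based, which should be checked against the loaded data.

```python
from typing import List


def indices_to_labels(sequence: str, active_positions: List[int]) -> List[int]:
    """Return one 0/1 label per residue; 1 marks an active (epitope) position.

    Assumption: positions are 0-based offsets into `sequence`.
    """
    labels = [0] * len(sequence)
    for pos in active_positions:
        if not 0 <= pos < len(sequence):
            raise ValueError(f"position {pos} is outside the sequence")
        labels[pos] = 1
    return labels


# Hypothetical toy example, not a record from the dataset.
toy_sequence = "MKTAYIAK"
toy_active = [1, 2, 5]
toy_labels = indices_to_labels(toy_sequence, toy_active)
print(toy_labels)
```

The resulting list has the same length as the sequence, so it can be aligned directly with per-residue model outputs.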

Dataset Statistics

3,159 antigens.

Available Splits

Random Split

Usage Example

from tdc_ml.single_pred import Epitope

data = Epitope(name='IEDB_Jespersen')

# Access the data
df = data.get_data()
print(df.head())

# Get train/val/test splits
split = data.get_split()
print(split)

References

[1] Vita, Randi, et al. “The Immune Epitope Database (IEDB): 2018 update.” Nucleic Acids Research 47.D1 (2019): D339-D343.
[2] Jespersen, Martin Closter, et al. “BepiPred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes.” Nucleic Acids Research 45.W1 (2017): W24-W29.

License

This dataset is licensed under CC BY 4.0.