MHC Class II, IEDB, Jensen et al.
PeptideMHC
Dataset Description
Major histocompatibility complex class II (MHC-II) molecules are found on the surface of antigen-presenting cells, where they present peptides derived from extracellular proteins to T helper cells. Predicting which peptides bind MHC-II is useful for identifying T-cell epitopes. This dataset was organized by NetMHCIIpan from MHC class II binding data collected from the IEDB database.
Task Description
Regression. Given the amino acid sequence of a peptide and the pseudo amino acid sequence of an MHC molecule, predict their binding affinity.
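To make the input format concrete, here is a minimal sketch of one way a peptide and MHC pseudo-sequence pair could be turned into a numeric feature vector for a regression model. The one-hot scheme, the padding lengths (25 and 34), and the example sequences are all assumptions chosen for illustration. They are not taken from the dataset or from any particular published model.

```python
# Illustrative only: the alphabet, padding lengths, and example sequences
# below are assumptions, not values taken from the dataset.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence, max_len):
    """Return a flat one-hot vector of length max_len * 20.

    Positions past the end of the sequence, and residues outside the
    20-letter alphabet (for example 'X'), are left as all zeros.
    Sequences longer than max_len are truncated.
    """
    vector = [0] * (max_len * len(AMINO_ACIDS))
    for pos, aa in enumerate(sequence[:max_len]):
        idx = AA_INDEX.get(aa)
        if idx is not None:
            vector[pos * len(AMINO_ACIDS) + idx] = 1
    return vector


def encode_pair(peptide, mhc_pseudo, peptide_len=25, mhc_len=34):
    """Concatenate the encodings of a peptide and an MHC pseudo-sequence."""
    return one_hot(peptide, peptide_len) + one_hot(mhc_pseudo, mhc_len)


# Hypothetical inputs, made up for this example
features = encode_pair("AAYSDQATPLLLSPR", "QEFFIASGAAVDAIMWLFLECYDLQRATYHVGFT")
print(len(features))
```

A vector like this, paired with the binding affinity as the target, could be passed to any standard regressor. Learned embeddings or substitution-matrix encodings are common alternatives to one-hot features.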
Dataset Statistics
134,281 pairs, 17,003 peptides, and 75 MHC class II molecules
Available Splits
Random Split
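As a rough illustration of what a random split does, the sketch below shuffles peptide-MHC pairs with a fixed seed and partitions them into train, validation, and test sets. The function name `random_split`, the 0.7/0.1/0.2 fractions, the seed, and the toy pairs are assumptions made for this example. It is not the library's own implementation.

```python
import random


def random_split(items, frac=(0.7, 0.1, 0.2), seed=42):
    """Shuffle items reproducibly and cut them into train/valid/test lists.

    Hypothetical helper for illustration; fractions must sum to 1.
    """
    if abs(sum(frac) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched
    n_train = int(frac[0] * len(items))
    n_valid = int(frac[1] * len(items))
    return {
        "train": [items[i] for i in indices[:n_train]],
        "valid": [items[i] for i in indices[n_train:n_train + n_valid]],
        "test": [items[i] for i in indices[n_train + n_valid:]],
    }


# Toy (peptide, MHC) pairs, made up for this example
pairs = [("PEPTIDE%d" % i, "MHC%d" % (i % 3)) for i in range(100)]
toy_split = random_split(pairs)
print({name: len(subset) for name, subset in toy_split.items()})
```

Because a random split assigns pairs independently, the same peptide or the same MHC molecule can appear in both the training and test sets. Test performance therefore does not necessarily reflect generalization to unseen peptides or alleles.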
Usage Example
from tdc_ml.multi_pred import PeptideMHC

data = PeptideMHC(name='MHC2_IEDB_Jensen')

# Access the data
df = data.get_data()
print(df.head())

# Get train/val/test splits
split = data.get_split()
print(split)
References
- [1] Jensen, Kamilla Kjaergaard, et al. “Improved methods for predicting peptide binding affinity to MHC class II molecules.” Immunology 154.3 (2018): 394-406.
- [2] Vita, Randi, et al. “The immune epitope database (IEDB): 2018 update.” Nucleic acids research 47.D1 (2019): D339-D343.
- [3] Zeng, Haoyang, and David K. Gifford. “Quantification of uncertainty in peptide-MHC binding prediction improves high-affinity peptide selection for therapeutic design.” Cell systems 9.2 (2019): 159-166.
License
This dataset is licensed under CC BY 4.0.