GDA
Gene-Disease Association
Task Overview
Many diseases are driven by gene aberrations. A gene-disease association (GDA) quantifies the relation between a gene and a disease. GDA data is usually structured as a network, in which gene-disease mechanisms can be probed by taking multiple gene and disease factors into account. The task is to predict the association of any gene-disease pair from both a biochemical modeling perspective and a network edge classification perspective.
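To make the network view concrete, here is a minimal sketch (not TDC code) that stores a few gene-disease scores as a bipartite edge list and enumerates the unobserved pairs a model would be asked to score. The gene names, disease names, and scores are invented placeholders, and `build_adjacency` is a hypothetical helper introduced only for illustration.

```python
from collections import defaultdict

# Toy edge list: (gene, disease, association score). All values are
# invented placeholders, not taken from any real dataset.
edges = [
    ("GENE_A", "disease_1", 0.9),
    ("GENE_A", "disease_2", 0.3),
    ("GENE_B", "disease_1", 0.6),
    ("GENE_C", "disease_3", 0.4),
]


def build_adjacency(edge_list):
    """Return gene -> {disease: score} and disease -> {gene: score} maps."""
    gene_to_disease = defaultdict(dict)
    disease_to_gene = defaultdict(dict)
    for gene, disease, score in edge_list:
        gene_to_disease[gene][disease] = score
        disease_to_gene[disease][gene] = score
    return gene_to_disease, disease_to_gene


gene_to_disease, disease_to_gene = build_adjacency(edges)

# Observed pairs are edges of the network; every other (gene, disease)
# pair is a candidate edge whose association would have to be predicted.
observed = {(gene, disease) for gene, disease, _ in edges}
candidates = [
    (gene, disease)
    for gene in sorted(gene_to_disease)
    for disease in sorted(disease_to_gene)
    if (gene, disease) not in observed
]

print(sorted(disease_to_gene["disease_1"]))  # genes linked to disease_1
print(len(candidates))                       # number of unobserved pairs
```

In this toy network most possible pairs are unobserved, which mirrors the incompleteness that motivates the prediction task.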
Impact
A high association between a gene and a disease could hint at a potential therapeutic target for the disease. Thus, accurately filling in the vastly incomplete GDA network using machine learning could bring numerous therapeutic opportunities.
Generalization
Extrapolating to unseen gene-disease pairs with accurate association prediction.
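One way to probe this kind of generalization is to hold out whole entities rather than random pairs. The sketch below is an illustrative assumption and not the benchmark's prescribed protocol: it holds out a fraction of diseases so that no test disease appears in training. The function name `cold_split_by_disease`, its parameters, and the toy records are all hypothetical.

```python
import random


def cold_split_by_disease(pairs, test_frac=0.25, seed=0):
    """Split (gene, disease, score) records so test diseases are unseen in train."""
    diseases = sorted({disease for _, disease, _ in pairs})
    rng = random.Random(seed)  # fixed seed for a reproducible split
    rng.shuffle(diseases)
    n_test = max(1, round(len(diseases) * test_frac))
    held_out = set(diseases[:n_test])
    train = [p for p in pairs if p[1] not in held_out]
    test = [p for p in pairs if p[1] in held_out]
    return train, test


# Toy records with invented names and scores.
pairs = [
    ("GENE_A", "disease_1", 0.9),
    ("GENE_A", "disease_2", 0.3),
    ("GENE_B", "disease_1", 0.6),
    ("GENE_B", "disease_3", 0.5),
    ("GENE_C", "disease_3", 0.4),
    ("GENE_C", "disease_4", 0.2),
]

train, test = cold_split_by_disease(pairs)
train_diseases = {disease for _, disease, _ in train}
test_diseases = {disease for _, disease, _ in test}
print(train_diseases & test_diseases)  # empty set: no disease is shared
```

The same idea applies to holding out genes, or both genes and diseases, depending on which kind of unseen pair matters for the application.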
Product
Any therapeutics.
Pipeline Stage
Basic biomedical research, target discovery.
Available Datasets
Usage Example
You can access these datasets using the PyTDC library:
from tdc_ml.multi_pred import GDA
# Load a dataset
data = GDA(name='DisGeNET')
# Access the data
df = data.get_data()
print(df.head())
# Get train/val/test splits
split = data.get_split()
print(split)