GDA

Gene-Disease Association

Task Overview

Many diseases are driven by gene aberrations. A gene-disease association (GDA) quantifies the relation between a gene and a disease. GDAs are usually represented as a network, which lets us probe gene-disease mechanisms while taking multiple genes and diseases into account. The task is to predict the association between any gene and any disease, from both a biochemical modeling perspective and a network edge classification perspective.
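To make the network view concrete, here is a minimal sketch that stores associations as weighted edges of a gene-disease bipartite graph and scores an unobserved edge with a naive baseline. The records, the identifiers, and the helper names `build_bipartite` and `baseline_score` are hypothetical and chosen only for illustration; the averaging baseline is not a method prescribed by this task.

```python
from collections import defaultdict

# Toy association records: (gene, disease, score). The identifiers and scores
# are made up for illustration and are not taken from any dataset.
records = [
    ("GENE_A", "DISEASE_X", 0.8),
    ("GENE_A", "DISEASE_Y", 0.2),
    ("GENE_B", "DISEASE_X", 0.6),
    ("GENE_C", "DISEASE_Y", 0.4),
]


def build_bipartite(records):
    """Index the weighted edges by gene and by disease."""
    by_gene = defaultdict(dict)
    by_disease = defaultdict(dict)
    for gene, disease, score in records:
        by_gene[gene][disease] = score
        by_disease[disease][gene] = score
    return by_gene, by_disease


def baseline_score(gene, disease, by_gene, by_disease):
    """Naive baseline: average the gene's and the disease's mean known scores.

    Returns None when either node has no known edges.
    """
    gene_scores = list(by_gene.get(gene, {}).values())
    disease_scores = list(by_disease.get(disease, {}).values())
    if not gene_scores or not disease_scores:
        return None
    gene_mean = sum(gene_scores) / len(gene_scores)
    disease_mean = sum(disease_scores) / len(disease_scores)
    return (gene_mean + disease_mean) / 2


by_gene, by_disease = build_bipartite(records)

# Score the unobserved edge between GENE_B and DISEASE_Y.
pred = baseline_score("GENE_B", "DISEASE_Y", by_gene, by_disease)
print(pred)
```

A learned model would replace `baseline_score`, but the input and output stay the same: a (gene, disease) pair goes in, and a predicted association strength for that edge comes out.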

Impact

A high association between a gene and a disease could hint at a potential therapeutic target for that disease. Accurately filling in the vastly incomplete GDA network with machine learning could therefore open up numerous therapeutic opportunities.

Generalization

Extrapolating to unseen gene-disease pairs with accurate association prediction.
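One way to test this kind of extrapolation is to hold out whole entities, so that every test pair involves a gene or disease the model never saw during training. The sketch below is a standalone illustration in plain Python, not the library's own splitting routine; the function name `cold_split`, its parameters, and the toy pairs are all assumptions.

```python
import random


def cold_split(pairs, key_index, test_frac=0.2, seed=0):
    """Hold out a fraction of entities so test pairs contain only unseen ones.

    pairs: list of (gene, disease, score) tuples.
    key_index: 0 to hold out genes, 1 to hold out diseases.
    """
    # Sort first so the shuffle is reproducible for a given seed.
    entities = sorted({p[key_index] for p in pairs})
    rng = random.Random(seed)
    rng.shuffle(entities)
    n_test = max(1, int(len(entities) * test_frac))
    held_out = set(entities[:n_test])
    train = [p for p in pairs if p[key_index] not in held_out]
    test = [p for p in pairs if p[key_index] in held_out]
    return train, test


# Toy data: 5 genes x 4 diseases with made-up scores.
pairs = [
    (f"GENE_{g}", f"DISEASE_{d}", 0.1 * (g + d))
    for g in range(5)
    for d in range(4)
]

# Hold out 25% of the diseases (one disease out of four).
train, test = cold_split(pairs, key_index=1, test_frac=0.25)
train_diseases = {d for _, d, _ in train}
test_diseases = {d for _, d, _ in test}
print(len(train), len(test), train_diseases & test_diseases)
```

Splitting on entities rather than on rows is the point here: a random row-level split would let the same disease appear in both train and test, which measures interpolation rather than extrapolation.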

Product

Any therapeutics.

Pipeline Stage

Basic biomedical research, target discovery.

Available Datasets

Usage Example

You can access these datasets using the PyTDC library:

from tdc_ml.multi_pred import GDA

# Load a dataset
data = GDA(name='DisGeNET')

# Access the data
df = data.get_data()
print(df.head())

# Get train/val/test splits
split = data.get_split()
print(split)