TY - UNPB
T1 - Machine learning reveals novel targets for both glioblastoma and osteosarcoma
AU - Li, Nan
AU - Ward, Max
AU - Bashir, Muniba
AU - Cao, Yunpeng
AU - Datta, Amitava
AU - Li, Zhaoyu
AU - Zhang, Shuang
PY - 2024
Y1 - 2024
N2 - Glioblastoma and osteosarcoma originate from the same lineage, yet patients with these two tumour types show significant differences in survival outcomes. Transcriptomic analysis comparing these tumours reveals that over 65% of genes show similar expression patterns. Principal component analysis further demonstrates substantial similarities between these two tumour types, albeit with discernible differences. Deep learning analysis employing an autoencoder reveals nuanced distinctions and similarities between these two tumours at high resolution. A classification model built with eXtreme Gradient Boosting (XGBoost) achieves high accuracy in distinguishing between these two tumour types. SHapley Additive exPlanations (SHAP) identify the key contributors to the model’s performance, yielding two lists of top target genes, one with and one without considering gender. Notably, these SHAP targets tend to cluster within one or two networks of signalling pathways. Remarkably, the gene expression levels of many of these SHAP targets alone can recapitulate the survival differences between glioblastoma and osteosarcoma patients that are observed in clinical data. Of particular interest, C2ORF72 emerges as a common target from both lists; it is an uncharacterised protein with promising potential as a novel diagnostic, prognostic, and therapeutic target for both glioblastoma and osteosarcoma.
AB - Glioblastoma and osteosarcoma originate from the same lineage, yet patients with these two tumour types show significant differences in survival outcomes. Transcriptomic analysis comparing these tumours reveals that over 65% of genes show similar expression patterns. Principal component analysis further demonstrates substantial similarities between these two tumour types, albeit with discernible differences. Deep learning analysis employing an autoencoder reveals nuanced distinctions and similarities between these two tumours at high resolution. A classification model built with eXtreme Gradient Boosting (XGBoost) achieves high accuracy in distinguishing between these two tumour types. SHapley Additive exPlanations (SHAP) identify the key contributors to the model’s performance, yielding two lists of top target genes, one with and one without considering gender. Notably, these SHAP targets tend to cluster within one or two networks of signalling pathways. Remarkably, the gene expression levels of many of these SHAP targets alone can recapitulate the survival differences between glioblastoma and osteosarcoma patients that are observed in clinical data. Of particular interest, C2ORF72 emerges as a common target from both lists; it is an uncharacterised protein with promising potential as a novel diagnostic, prognostic, and therapeutic target for both glioblastoma and osteosarcoma.
U2 - 10.1101/2024.11.05.622056
DO - 10.1101/2024.11.05.622056
M3 - Preprint
BT - Machine learning reveals novel targets for both glioblastoma and osteosarcoma
PB - bioRxiv
CY - USA
ER -