LOTUS: a single- and multitask machine learning algorithm for the prediction of cancer driver genes
Abstract
Cancer driver genes, i.e., oncogenes and tumor suppressor genes, are involved in the
acquisition of important functions in tumors, providing a selective growth advantage,
allowing uncontrolled proliferation and avoiding apoptosis. It is therefore important to
identify these driver genes, both for the fundamental understanding of cancer and to
help find new therapeutic targets. Although the most frequently mutated driver
genes have been identified, it is believed that many more remain to be discovered,
particularly for driver genes specific to some cancer types.
In this paper we propose a new computational method called LOTUS to predict new
driver genes. LOTUS is a machine learning-based approach that integrates
various types of data in a versatile manner, including information about gene
mutations and protein-protein interactions. In addition, LOTUS can predict cancer
driver genes in a pan-cancer setting as well as for specific cancer types, using a
multitask learning strategy to share information across cancer types.
We empirically show that LOTUS outperforms three other state-of-the-art driver
gene prediction methods, both in terms of intrinsic consistency and prediction accuracy,
and provide predictions of new cancer genes across many cancer types.