Motivation: The relationship between genomic factors and the response to diverse cancer drugs remains unclear. A number of studies have shown that patients' heterogeneous responses to anticancer treatments are partly associated with specific changes in gene expression and somatic alterations. Emerging large-scale pharmacogenomic data provide valuable opportunities to improve existing therapies and to guide early-phase clinical trials of compounds under development. However, identifying the underlying combinatorial patterns in pharmacogenomic data is still a challenging issue.
Results: In this study, we adopted a sparse network-regularized partial least squares (SNPLS) method to identify joint modular patterns using large-scale paired gene-expression and drug-response data. We incorporated a molecular network into the (sparse) partial least squares model to improve module accuracy via a network-based penalty. We first demonstrated the effectiveness of SNPLS on a set of simulated data and compared it with two typical methods. We then applied it to gene expression profiles for 13 321 genes and pharmacological profiles for 98 anticancer drugs across 641 cancer cell lines spanning diverse types of human cancers. We identified 20 gene-drug co-modules, which contain 30 cell lines, 137 genes and 2 drugs each on average. The majority of the identified co-modules have significant functional implications and coordinated gene-drug associations. This modular analysis provides new insights into the molecular mechanisms of drug action and suggests new drug targets for the therapy of certain types of cancers.
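As an illustration of what a sparse, network-regularized partial least squares objective can look like, one generic formulation is sketched below. The notation is an assumption made for this sketch and is not taken from the text above: X is the cell line by gene expression matrix, Y is the cell line by drug response matrix, L is a (normalized) Laplacian of the gene network, u and v are gene and drug weight vectors, and the lambdas are tuning parameters.

```latex
% Hypothetical notation; a generic sketch, not necessarily the exact SNPLS objective.
\begin{aligned}
\max_{u,\,v}\quad & u^{\top} X^{\top} Y v
  \;-\; \lambda_{1}\, u^{\top} L u
  \;-\; \lambda_{2}\, \lVert u \rVert_{1}
  \;-\; \lambda_{3}\, \lVert v \rVert_{1} \\
\text{s.t.}\quad & \lVert u \rVert_{2}^{2} \le 1, \qquad \lVert v \rVert_{2}^{2} \le 1
\end{aligned}
```

In a formulation of this kind, the first term rewards covariance between the gene and drug components, the Laplacian term encourages genes that are connected in the network to receive similar weights, and the L1 terms drive most weights to zero so that the nonzero entries of u and v can be read as the genes and drugs of a co-module.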
Availability and implementation: A MATLAB package of SNPLS is available at http://page.amss.ac.cn/shihua.zhang/
Supplementary information: Supplementary data are available at Bioinformatics online.