1 Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
2 Department of Computer Science, University of Pittsburgh, Pittsburgh, PA, USA
Motivation: Learning probabilistic graphs over mixed data is an important way to combine gene expression and clinical disease data. Leveraging the existing, yet imperfect, information in pathway databases for mixed graphical model (MGM) learning is an understudied problem with tremendous potential for applications in systems medicine, where problems often involve high-dimensional data.

Results: We present a new method, piMGM, which accurately learns the structure of probabilistic graphs over mixed data by appropriately incorporating priors from multiple experts with different degrees of reliability. We show that piMGM accurately scores the reliability of prior information from a given expert even at low sample sizes. The reliability scores can be used to determine active pathways in healthy and disease samples. We tested piMGM on both simulated data and real data from TCGA and found that its performance is not affected by unreliable priors. We demonstrate the applicability of piMGM by successfully using prior information to identify pathway components that are important in breast cancer and to improve cancer subtype classification.

Availability and implementation: http://www.benoslab.pitt.edu/manatakisECCB2018.html

Supplementary information: Supplementary data are available at Bioinformatics online.