1 McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
2 Lieber Institute for Brain Development, Baltimore, MD, USA
3 Department of Oncology and Division of Biostatistics and Bioinformatics, Johns Hopkins School of Medicine, Baltimore, MD, USA
4 Vavilov Institute of General Genetics, Moscow, Russia
5 Research Institute of Genetics and Selection of Industrial Microorganisms, Moscow, Russia
6 Department of Otolaryngology-Head and Neck Surgery, Johns Hopkins School of Medicine, Baltimore, MD, USA
7 Department of Mathematics and Statistics, The College of New Jersey, Ewing Township, NJ, USA
8 Department of Neurology and Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA
9 Institute for Genome Sciences, University of Maryland School of Medicine
Summary: Non-negative Matrix Factorization (NMF) algorithms associate gene expression with biological processes (e.g. time-course dynamics or disease subtypes). Compared with univariate associations, the relative weights of NMF solutions can obscure biomarkers. Therefore, we developed a novel patternMarkers statistic to extract genes for biological validation and enhanced visualization of NMF results. Finding novel and unbiased gene markers with patternMarkers requires whole-genome data. Therefore, we also developed Genome-Wide CoGAPS Analysis in Parallel Sets (GWCoGAPS), the first robust whole genome Bayesian NMF using the sparse, MCMC algorithm, CoGAPS. Additionally, a manual version of the GWCoGAPS algorithm contains analytic and visualization tools including patternMatcher, a Shiny web application. The decomposition in the manual pipeline can be replaced with any NMF algorithm, for further generalization of the software. Using these tools, we find granular brain-region and cell-type specific signatures with corresponding biomarkers in GTEx data, illustrating GWCoGAPS and patternMarkers ascertainment of data-driven biomarkers from whole-genome data.

Availability and Implementation: PatternMarkers & GWCoGAPS are in the CoGAPS Bioconductor package (3.5) under the GPL license.

Contact: firstname.lastname@example.org or email@example.com or firstname.lastname@example.org

Supplementary information: Supplementary data are available at Bioinformatics online.
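
The idea of extracting pattern-specific marker genes from NMF weights can be illustrated with a small sketch. The abstract does not define the patternMarkers statistic, so the scoring rule below is an assumption made for illustration: each gene's weights are scaled by their maximum, and the gene is assigned to the pattern whose idealized "exclusive" profile (weight 1 in that pattern, 0 elsewhere) is closest in Euclidean distance. The function name `pattern_markers` and the toy matrix are hypothetical and are not the CoGAPS API.

```python
import math


def pattern_markers(amplitudes, gene_names):
    """Assign each gene to the one pattern it loads on most exclusively.

    amplitudes: gene-by-pattern matrix of non-negative weights (list of rows),
                e.g. the gene-weight factor from any NMF.
    Returns {pattern_index: [(gene, score), ...]}, lowest score first.
    NOTE: the scoring rule is an illustrative assumption, not taken from
    the abstract.
    """
    n_patterns = len(amplitudes[0])
    markers = {k: [] for k in range(n_patterns)}
    for gene, row in zip(gene_names, amplitudes):
        peak = max(row)
        if peak <= 0:
            continue  # gene carries no weight in any pattern
        scaled = [w / peak for w in row]
        # Distance from the scaled row to each ideal profile e_k
        # (1 in pattern k, 0 in every other pattern).
        dists = [
            math.sqrt(sum((s - (1.0 if j == k else 0.0)) ** 2
                          for j, s in enumerate(scaled)))
            for k in range(n_patterns)
        ]
        best = min(range(n_patterns), key=dists.__getitem__)
        markers[best].append((gene, dists[best]))
    for k in markers:
        markers[k].sort(key=lambda pair: pair[1])  # most exclusive first
    return markers


# Toy example: 4 genes, 3 patterns (values are made up).
toy = [[5.0, 0.0, 0.0],
       [1.0, 4.0, 1.0],
       [0.0, 0.1, 2.0],
       [3.0, 3.0, 0.0]]
result = pattern_markers(toy, ["g1", "g2", "g3", "g4"])
```

Under this rule a gene shared equally between two patterns (such as `g4` above) receives a poor score, which is one way a ranking can separate pattern-specific genes from genes with large but non-specific weights.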