1 Department of Biomedical Informatics (DBMI), 2 Department of Systems Biology, 3 Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY, USA; 4 Scuola Superiore Sant’Anna, Pisa, Italy; 5 Department of Biochemistry and Molecular Biophysics, 6 Institute for Cancer Genetics, 7 Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, USA
Motivation: Multiplex readout assays are increasingly performed using microfluidic automation in multiwell format. For instance, the Library of Integrated Network-based Cellular Signatures (LINCS) has produced gene expression measurements for tens of thousands of distinct cell perturbations using a 384-well plate format. This dataset is by far the largest 384-well gene expression measurement assay ever performed. We investigated the gene expression profiles of a million samples from the LINCS dataset and found that the vast majority (96%) of the tested plates were affected by a significant 2D spatial bias.
Results: Using a novel algorithm that combines spatial autocorrelation detection and principal component analysis, we were able to remove most of the spatial bias from the LINCS dataset and, in parallel, show a dramatic improvement in similarity between biological replicates assayed in different plates. The proposed methodology is fully general and can be applied to any highly multiplexed assay performed in multiwell format.
Contact: firstname.lastname@example.org
Supplementary information: Supplementary data are available at Bioinformatics online.
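As an illustration of how spatial autocorrelation detection and principal component analysis can be combined, the following is a minimal sketch and not the algorithm used in this work. It assumes a single 384-well plate stored as a wells-by-genes matrix in row-major plate order (16 rows by 24 columns), uses Moran's I with rook adjacency as the autocorrelation statistic, and removes every principal component whose well scores exceed an arbitrary cutoff. The function names `morans_i` and `remove_spatial_components` and the `threshold` value of 0.3 are hypothetical choices made for this example.

```python
import numpy as np

ROWS, COLS = 16, 24  # assumed 384-well plate layout


def morans_i(grid):
    """Moran's I of a 2D grid with rook (up/down/left/right) adjacency."""
    z = grid - grid.mean()
    denom = (z ** 2).sum()
    if denom == 0:
        return 0.0
    # Cross-products over each unordered pair of adjacent wells.
    horiz = (z[:, :-1] * z[:, 1:]).sum()
    vert = (z[:-1, :] * z[1:, :]).sum()
    n_rows, n_cols = grid.shape
    n_pairs = n_rows * (n_cols - 1) + (n_rows - 1) * n_cols
    # I = (N / W) * sum_ij w_ij z_i z_j / sum_i z_i^2, with W = 2 * n_pairs
    # and the double sum equal to 2 * (horiz + vert) for symmetric weights.
    return (grid.size / n_pairs) * (horiz + vert) / denom


def remove_spatial_components(plate, threshold=0.3):
    """Drop principal components whose well scores are spatially autocorrelated.

    plate: array of shape (ROWS * COLS, n_genes), wells in row-major order.
    threshold: hypothetical Moran's I cutoff (an assumption of this sketch).
    Returns the corrected matrix and the indices of the removed components.
    """
    mean = plate.mean(axis=0, keepdims=True)
    centered = plate - mean
    # PCA via SVD: columns of u hold the per-well scores of each component.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    flagged = []
    for k in range(len(s)):
        if s[0] == 0 or s[k] <= 1e-10 * s[0]:
            continue  # skip numerically empty components
        if morans_i(u[:, k].reshape(ROWS, COLS)) > threshold:
            flagged.append(k)
    s_kept = s.copy()
    s_kept[flagged] = 0.0
    # Rebuild the plate without the flagged components.
    corrected = (u * s_kept) @ vt + mean
    return corrected, flagged
```

Calling `remove_spatial_components(plate)` on a plate matrix returns the corrected values together with the list of removed component indices, so the amount of variance discarded can be inspected. In practice the cutoff would more reasonably come from a permutation test on well positions than from a fixed value.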