Motivation: Multiplex readout assays are now increasingly being performed using microfluidic automation in multiwell format. For instance, the Library of Integrated Network-based Cellular Signatures (LINCS) has produced gene expression measurements for tens of thousands of distinct cell perturbations using a 384-well plate format. This dataset is by far the largest 384-well gene expression measurement assay ever performed. We investigated the gene expression profiles of a million samples from the LINCS dataset and found that the vast majority (96%) of the tested plates were affected by a significant 2D spatial bias.
Results: Using a novel algorithm combining spatial autocorrelation detection and principal component analysis, we could remove most of the spatial bias from the LINCS dataset and show in parallel a dramatic improvement of similarity between biological replicates assayed in different plates. The proposed methodology is fully general and can be applied to any highly multiplexed assay performed in multiwell format.
Supplementary information: Supplementary data are available at Bioinformatics online.