Multicohort Analysis of Whole-Blood Gene Expression Data Does Not Form a Robust Diagnostic for Acute Respiratory Distress Syndrome
Objectives:
To identify a novel, generalizable diagnostic for acute respiratory distress syndrome using whole-blood gene expression arrays from multiple acute respiratory distress syndrome cohorts of varying etiologies.
Data Sources:
We performed a systematic search for human whole-blood gene expression arrays of acute respiratory distress syndrome in National Institutes of Health Gene Expression Omnibus and ArrayExpress. We also included the Glue Grant gene expression cohorts.
Study Selection:
We included investigator-defined acute respiratory distress syndrome within 48 hours of diagnosis and compared these with relevant critically ill controls.
Data Extraction:
We used multicohort analysis of gene expression to identify genes significantly associated with acute respiratory distress syndrome, both with and without adjustment for clinical severity score. We performed gene ontology enrichment using Database for Annotation, Visualization and Integrated Discovery and cell type enrichment tests for both immune cells and pneumocyte gene expression. Finally, we selected a gene set optimized for diagnostic power across the datasets and used leave-one-dataset-out cross validation to assess robustness of the model.
Data Synthesis:
We identified datasets from three adult cohorts with sepsis, one pediatric cohort with acute respiratory failure, and two datasets of adult patients with trauma and burns, for a total of 148 acute respiratory distress syndrome cases and 268 critically ill controls. We identified 30 genes that were significantly associated with acute respiratory distress syndrome (false discovery rate < 20% and effect size > 1.3), many of which had been previously associated with sepsis. When metaregression was used to adjust for clinical severity scores, none of these genes remained significant. Cell type enrichment was notable for bands and neutrophils, suggesting that the gene expression signature is one of acute inflammation rather than lung injury per se. Finally, an attempt to develop a generalizable diagnostic gene set for acute respiratory distress syndrome showed a mean area under the receiver-operating characteristic curve of only 0.63 on leave-one-dataset-out cross validation.
Conclusions:
The whole-blood gene expression signature across a wide clinical spectrum of acute respiratory distress syndrome is likely confounded by systemic inflammation, limiting the utility of whole-blood gene expression studies for uncovering a generalizable diagnostic gene signature.
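
For readers unfamiliar with leave-one-dataset-out cross-validation, the following is a minimal sketch of the general idea, not the pipeline used in this study. It assumes a simplified stand-in for the multicohort analysis (an unweighted average of per-dataset standardized mean differences rather than a random-effects meta-analysis), an arbitrary selection threshold, and a simple score (mean of up-regulated genes minus mean of down-regulated genes). All function names, the data layout, and the toy values are hypothetical.

```python
from statistics import mean, stdev

def auc(scores, labels):
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def effect_size(values, labels):
    """Standardized mean difference (cases minus controls) for one gene in one dataset."""
    cases = [v for v, y in zip(values, labels) if y == 1]
    ctrls = [v for v, y in zip(values, labels) if y == 0]
    n1, n0 = len(cases), len(ctrls)
    pooled_var = ((n1 - 1) * stdev(cases) ** 2 + (n0 - 1) * stdev(ctrls) ** 2) / (n1 + n0 - 2)
    return (mean(cases) - mean(ctrls)) / pooled_var ** 0.5 if pooled_var > 0 else 0.0

def select_genes(train_sets, threshold):
    """Assumption: average effect sizes across training datasets, then threshold."""
    up, down = [], []
    for gene in train_sets[0]["expr"]:
        es = mean(effect_size(d["expr"][gene], d["labels"]) for d in train_sets)
        if es > threshold:
            up.append(gene)
        elif es < -threshold:
            down.append(gene)
    return up, down

def score_samples(dataset, up, down):
    """Per-sample score: mean of up-regulated genes minus mean of down-regulated genes."""
    scores = []
    for i in range(len(dataset["labels"])):
        u = mean(dataset["expr"][g][i] for g in up) if up else 0.0
        d = mean(dataset["expr"][g][i] for g in down) if down else 0.0
        scores.append(u - d)
    return scores

def leave_one_dataset_out(datasets, threshold=0.8):
    """Select genes on all datasets but one, then test on the held-out dataset."""
    aucs = {}
    for held_out in datasets:
        train = [d for name, d in datasets.items() if name != held_out]
        up, down = select_genes(train, threshold)
        scores = score_samples(datasets[held_out], up, down)
        aucs[held_out] = auc(scores, datasets[held_out]["labels"])
    return aucs

# Toy data (fabricated for illustration only): label 1 = case, 0 = control.
toy = {
    "A": {"labels": [1, 1, 1, 0, 0, 0],
          "expr": {"G1": [5, 6, 7, 1, 2, 3], "G2": [1, 2, 3, 1, 2, 3]}},
    "B": {"labels": [1, 1, 1, 0, 0, 0],
          "expr": {"G1": [6, 7, 8, 2, 3, 4], "G2": [2, 1, 3, 3, 1, 2]}},
    "C": {"labels": [1, 1, 1, 0, 0, 0],
          "expr": {"G1": [4, 5, 6, 2, 3, 1], "G2": [2, 3, 1, 1, 3, 2]}},
}

per_dataset_auc = leave_one_dataset_out(toy)
print(per_dataset_auc, mean(per_dataset_auc.values()))
```

The key design point is that gene selection is repeated inside each fold using only the training datasets, so the held-out dataset never influences which genes are chosen. The toy data separate perfectly by construction and say nothing about real cohorts; the mean of the per-dataset values corresponds to the summary statistic reported above.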