Volatile organic compounds present in exhaled breath have shown promise as biomarkers of lung cancer. Advances in colorimetric sensor array technology, breath collection methods, and clinical phenotyping may lead to the development of a more accurate breath biomarker.
Objectives: Perform a discovery-level assessment of the accuracy of a colorimetric sensor array-based volatile breath biomarker.
Methods: Subjects with biopsy-confirmed untreated lung cancer, and others at risk for developing lung cancer, performed tidal breathing into a breath collection instrument designed to expose a colorimetric sensor array to the alveolar portion of the breath. Random forest models were built from the sensor output of 70% of the study subjects and were tested against the remaining 30%. Models were developed to separate cancer and subgroups from control, and to characterize the cancer. Additional models were developed after matching the clinical phenotypes of cancer and control subjects.
Measurements and Main Results: Ninety-seven subjects with lung cancer and 182 control subjects participated. The accuracies, reported as C-statistics, for models of cancer and subgroups versus control ranged from 0.794 to 0.861. The accuracy was improved by developing models for cancer and control groups selected through propensity matching for clinical variables. A model built using only subjects from the largest available clinical subgroup (49 subjects) had a C-statistic of 0.982. Models developed and tested to characterize cancer histology, and to compare early- with late-stage cancer, had C-statistics of 0.881–0.960.
Conclusions: The colorimetric sensor array signature of exhaled breath volatile organic compounds was capable of distinguishing patients with lung cancer from clinically relevant control subjects in a discovery-level trial. The incorporation of clinical phenotypes into the further development of this biomarker may optimize its accuracy.
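
For readers less familiar with the accuracy measure used here, the C-statistic of a binary classifier is the fraction of (cancer, control) subject pairs in which the cancer subject receives the higher model score, with ties counted as one half. The Python sketch below illustrates that definition together with a 70%/30% partition of subjects. It is an illustration under stated assumptions, not the study's analysis code: the function names `c_statistic` and `split_70_30`, the unstratified random shuffle, and the seed are all hypothetical, and the abstract does not say how the partition was drawn.

```python
import random


def c_statistic(scores, labels):
    """Concordance statistic for binary labels (1 = cancer, 0 = control).

    Returns the fraction of (cancer, control) pairs in which the cancer
    subject has the higher score; tied scores contribute one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one cancer and one control subject")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def split_70_30(subject_ids, seed=0):
    """Shuffle subjects and return (training 70%, held-out 30%).

    Assumption: a simple unstratified random split; the study's actual
    partitioning procedure is not described in the abstract.
    """
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    cut = round(0.7 * len(ids))
    return ids[:cut], ids[cut:]


if __name__ == "__main__":
    # 97 cancer + 182 control subjects = 279 in total, as reported above.
    train_ids, test_ids = split_70_30(range(279))
    print(len(train_ids), len(test_ids))
    # Toy scores only, not study data.
    print(c_statistic([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0]))
```

A C-statistic of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the values of 0.794 to 0.982 reported above.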