Usual Interstitial Pneumonia Can Be Detected in Transbronchial Biopsies Using Machine Learning
Usual interstitial pneumonia (UIP) is the histopathologic hallmark of idiopathic pulmonary fibrosis. Although UIP can be detected by high-resolution computed tomography of the chest, the results are frequently inconclusive, and pathology from transbronchial biopsy (TBB) has poor sensitivity. Surgical lung biopsy may be necessary for a definitive diagnosis.
Objectives:
To develop a genomic classifier in tissue obtained by TBB that distinguishes UIP from non-UIP, trained against central pathology as the reference standard.
Methods:
Exome-enriched RNA sequencing was performed on 283 TBBs from 84 subjects. Machine learning was used to train an algorithm with high rule-in (specificity) performance using specimens from 53 subjects. Performance was evaluated by cross-validation and on an independent test set of specimens from 31 subjects. We explored the feasibility of a single molecular test per subject by combining multiple TBBs from upper and lower lobes. To address whether classifier accuracy depends upon adequate alveolar sampling, we tested for correlation between classifier accuracy and expression of alveolar-specific genes.
Results:
The top-performing algorithm distinguishes UIP from non-UIP conditions in single TBB samples with an area under the receiver operating characteristic curve (AUC) of 0.86, with specificity of 86% (confidence interval = 71-95%) and sensitivity of 63% (confidence interval = 51-74%) (31 test subjects). Performance improves to an AUC of 0.92 when three to five TBB samples per subject are combined at the RNA level for testing. Although we observed a wide range of type I and II alveolar-specific gene expression in TBBs, expression of these transcripts did not correlate with classifier accuracy.
Conclusions:
We demonstrate proof of principle that genomic analysis and machine learning improve the utility of TBB for the diagnosis of UIP, with greater sensitivity and specificity than pathology in TBB alone. Combining multiple samples from the same subject increases test accuracy over single-sample testing. This approach requires validation in an independent cohort of subjects before application in the clinic.
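The performance measures reported above (AUC, and sensitivity at a decision cutoff chosen for high rule-in specificity) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `auc`, `threshold_for_specificity`, and `sens_spec` are hypothetical, the scores and labels are toy values invented purely for illustration, and the convention that a score at or above the cutoff is called UIP is an assumption.

```python
from typing import Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one UIP and one non-UIP sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold_for_specificity(
    scores: Sequence[float], labels: Sequence[int], target: float
) -> float:
    """Lowest cutoff whose specificity meets the target (UIP call: score >= cutoff)."""
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not neg:
        raise ValueError("need at least one non-UIP sample")
    for cutoff in sorted(set(scores)) + [float("inf")]:
        specificity = sum(1 for s in neg if s < cutoff) / len(neg)
        if specificity >= target:
            return cutoff
    return float("inf")


def sens_spec(
    scores: Sequence[float], labels: Sequence[int], cutoff: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when samples with score >= cutoff are called UIP."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one UIP and one non-UIP sample")
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    return tp / n_pos, tn / n_neg


# Toy classifier scores (invented for illustration); label 1 = UIP, 0 = non-UIP.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

cutoff = threshold_for_specificity(toy_scores, toy_labels, target=1.0)
sensitivity, specificity = sens_spec(toy_scores, toy_labels, cutoff)
print(auc(toy_scores, toy_labels), cutoff, sensitivity, specificity)
```

Fixing the cutoff from a specificity target, as sketched here, trades sensitivity for fewer false-positive UIP calls, which matches the rule-in emphasis described in the Methods. In practice the cutoff would be set on training data only and then applied unchanged to the independent test set.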