Introduction: The change in cerebrospinal fluid (CSF) volume within the first 24 hours after ischemic stroke (ΔCSF) may be an early imaging biomarker of cerebral edema. Accurate and rapid quantification of ΔCSF in large multicenter stroke populations would facilitate large-scale studies of the complex dynamics of, and genetic influences on, cerebral edema.
Hypothesis: An automated machine-learning approach to CSF segmentation based on a random forest classifier will be superior to standard Hounsfield unit thresholding for measuring ΔCSF on CT scans from different medical centers.
Methods: Manual CSF delineations on both the baseline (within 6 hours) and 24-hour follow-up head CT scans of 26 ischemic stroke patients imaged at center A were used as training samples for random forest classifiers that segment CSF. The trained classifiers were then used to segment baseline and 24-hour scans from 12 new patients at center B. Correlations between random forest CSF volumes and manually segmented volumes (correlation coefficient and p-value) were then compared with those obtained by thresholding, in which the optimal CT threshold derived from the training scans was applied to the validation cohort from center B.
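The thresholding comparison described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the abstract does not say how the "optimal" threshold was chosen, so the sketch assumes it is the upper Hounsfield cutoff that maximizes mean Dice overlap with the manual masks on the training scans. The function names, the lower bound `lo`, the candidate list, and the toy voxel values are all hypothetical.

```python
from math import sqrt

def threshold_mask(hu_values, lo, hi):
    """Label a voxel as CSF when its Hounsfield value lies in [lo, hi]."""
    return [lo <= v <= hi for v in hu_values]

def dice(pred, truth):
    """Dice overlap between two boolean masks of equal length."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(pred) + sum(truth)
    return 2.0 * inter / total if total else 1.0

def best_upper_threshold(scans, masks, lo, candidates):
    """Pick the upper cutoff with the highest mean Dice on training scans.

    Assumption: Dice is the selection criterion; the source does not specify it.
    """
    def mean_dice(hi):
        scores = [dice(threshold_mask(s, lo, hi), m) for s, m in zip(scans, masks)]
        return sum(scores) / len(scores)
    return max(candidates, key=mean_dice)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Toy "training" data: flattened HU values and manual CSF labels (hypothetical).
train_scans = [[2, 8, 14, 30, 40, 5], [1, 12, 18, 35, 9, 45]]
train_masks = [
    [True, True, True, False, False, True],
    [True, True, False, False, True, False],
]
cutoff = best_upper_threshold(train_scans, train_masks, lo=0, candidates=[10, 15, 20, 25])

# Apply the training-derived cutoff to a toy "validation" scan.
validation_mask = threshold_mask([3, 11, 22, 50], lo=0, hi=cutoff)
```

In the same spirit, `pearson_r` would be applied to per-scan automated and manual CSF volumes in the validation cohort, once for each segmentation method, to obtain the correlation coefficients being compared.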
Results: Random forest segmentation was efficient (run time: 23 min/scan) and accurately quantified CSF volumes and volumetric changes in the validation cohort. As shown in the table, random forest segmentation correlated more strongly with manual delineation than thresholding did.
Conclusion: We have developed a robust, efficient, and accurate automated approach to quantifying both absolute and relative CSF volume changes on multicenter head CT images. This approach is ready to be employed in large multicenter analyses of the genetic influences on edema formation following acute ischemic stroke.
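For readers who want the absolute and relative volume changes made concrete, the short sketch below computes both from a pair of binary CSF masks. It is illustrative only: the sign convention (follow-up minus baseline), the percentage normalization by baseline volume, and the function names are assumptions, since the abstract does not define them.

```python
def csf_volume_ml(mask, voxel_mm3):
    """Volume of a binary CSF mask in millilitres (1 mL = 1000 mm^3)."""
    return sum(1 for v in mask if v) * voxel_mm3 / 1000.0

def delta_csf(baseline_ml, followup_ml):
    """Return (absolute change in mL, relative change in % of baseline).

    Assumed convention: change = follow-up minus baseline.
    """
    if baseline_ml <= 0:
        raise ValueError("baseline CSF volume must be positive")
    absolute = followup_ml - baseline_ml
    relative = 100.0 * absolute / baseline_ml
    return absolute, relative

# Hypothetical volumes for one patient, in mL.
absolute_change, relative_change = delta_csf(200.0, 150.0)
```

Under this assumed convention, a negative value indicates that CSF volume fell between the baseline and the 24-hour scan.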