The purpose of this study is to determine the optimal representative reconstruction and quantitative image feature set for a computer-aided diagnosis (CADx) scheme for dedicated breast computed tomography (bCT).
Method
We used 93 bCT scans containing 102 breast lesions (62 malignant, 40 benign). Using an iterative image reconstruction (IIR) algorithm, we created 37 reconstructions with different image appearances for each case and added a clinical reconstruction for comparison, giving 38 reconstructions in total. As a surrogate measure of the image quality and appearance of each reconstruction, we used image sharpness, determined from the gray-value gradient in a parenchymal portion of the reconstructed breast. After segmenting each breast lesion, we extracted 23 quantitative image features. Feature selection, classifier training, and testing were conducted under leave-one-out cross-validation (LOOCV) with a linear discriminant analysis classifier. We then selected, as the representative reconstruction and feature set, the combination for which the classifier achieved the best diagnostic performance among all reconstructions and feature sets. Finally, we conducted an observer study with six radiologists on a subset of the breast lesions (N = 50) and, using 1000 bootstrap samples, compared the diagnostic performance of the trained classifier with that of the radiologists.
Result
The diagnostic performance of the trained classifier increased with the image sharpness of the reconstruction. Among the combinations of reconstructions and quantitative image feature sets, we selected one of the sharp reconstructions and the three quantitative image feature sets with the three highest diagnostic performances under LOOCV as the representative reconstruction and feature set for the classifier. On the representative reconstruction and feature set, the classifier achieved an area under the ROC curve (AUC) of 0.94 (95% CI = [0.81, 0.98]), higher than that of any radiologist (maximum AUC = 0.78, 95% CI = [0.63, 0.90]). Moreover, the classifier's partial AUC at 90% sensitivity or higher (pAUC = 0.085, 95% CI = [0.063, 0.094]) was statistically significantly better (P < 0.0001) than those of the radiologists (maximum pAUC = 0.009, 95% CI = [0.003, 0.024]).
Conclusion
We found that an image sharpness measure can be a good candidate for estimating the diagnostic performance of a given CADx algorithm. In addition, we found that there exist a reconstruction (i.e., a sharp reconstruction) and a feature set that maximize the diagnostic performance of a CADx algorithm. On this optimal representative reconstruction and feature set, the CADx algorithm outperformed the radiologists.
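For readers unfamiliar with the partial AUC reported above, the following minimal sketch shows one common way to compute it. It assumes the pAUC is the area between the empirical ROC curve and the horizontal line at 90% sensitivity, which gives a maximum possible value of 0.1; the abstract does not state the exact estimator used in the study. The function names `roc_points` and `area_above_sensitivity` and the toy scores are hypothetical and serve only as illustration.

```python
def roc_points(scores, labels):
    """Empirical ROC points (FPR, TPR), running from (0, 0) to (1, 1).

    labels: 1 = malignant, 0 = benign; a higher score means more suspicious.
    Assumes at least one case of each class.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Cases with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def area_above_sensitivity(points, min_sens=0.0):
    """Trapezoidal area between the ROC curve and the line TPR = min_sens.

    min_sens = 0.0 gives the full AUC; min_sens = 0.9 gives the partial AUC
    at 90% sensitivity or higher (assumed definition, maximum value 0.1).
    """
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y1 <= min_sens:
            continue  # segment lies entirely at or below the sensitivity line
        if y0 < min_sens:
            # Clip the segment to the point where it crosses the line.
            t = (min_sens - y0) / (y1 - y0)
            x0 = x0 + t * (x1 - x0)
            y0 = min_sens
        area += (x1 - x0) * ((y0 - min_sens) + (y1 - min_sens)) / 2.0
    return area


# Toy data (not from the study): classifier outputs and ground truth.
toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
toy_roc = roc_points(toy_scores, toy_labels)
toy_auc = area_above_sensitivity(toy_roc)
toy_pauc = area_above_sensitivity(toy_roc, min_sens=0.9)
```

Under this definition, a pAUC of 0.085 means the classifier retains most of the attainable area in the high-sensitivity region. Confidence intervals like those reported could be approximated by recomputing these quantities on resampled cases, although the study's bootstrap procedure is not detailed here.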