To improve the precision of multicenter clinical trials, several efforts are underway to determine scanner-specific parameters for harmonization using standardized phantom measurements. The goal of this study was to test the correspondence between quantification in phantom and patient images and to validate the use of phantoms for harmonization of patient images.
Methods
The National Electrical Manufacturers' Association image quality phantom with hot spheres was scanned on two time-of-flight PET scanners. Whole-body [18F]-fluorodeoxyglucose (FDG)-PET scans of subjects were acquired on the same systems. List-mode events from spheres (diameter: 10–28 mm) measured in air on each scanner were embedded into the phantom and subject list-mode data from the same scanner. This created lesions with known uptake relative to the local background in the phantom and in each subject's liver and lung regions, serving as a proxy to characterize true lesion quantification. Images were analyzed using the contrast recovery coefficient (CRC), which is typically used in phantom studies and serves as a surrogate for the standardized uptake value used clinically. Postreconstruction filtering (resolution recovery and Gaussian smoothing) was applied to determine whether its effect on the phantom images translates equivalently to subject images. Three postfiltering strategies were selected to harmonize the CRCmean or CRCmax values between the two scanners based on the phantom measurements and were then applied to the subject images.
Results
Both the average CRCmean and CRCmax values for lesions embedded in the lung and liver in four subjects (BMI range 25–38) agreed to within 5% with the CRC values for lesions embedded in the phantom for all lesion sizes. In addition, the relative changes in CRCmean and CRCmax resulting from the application of the postfilters on the subject and phantom images were consistent within measurement uncertainty. Further, the root mean squared percent difference (RMSpd) between CRC values on the two scanners, calculated over the three sphere sizes, was significantly reduced in the subjects using postfiltering strategies chosen to harmonize CRCmean or CRCmax based on phantom measurements: RMSpd of the CRCmean values in subjects was reduced from 36% to < 8% after harmonizing CRCmean, while RMSpd for CRCmax was reduced from ~33% to < 6% after harmonizing CRCmax with a different strategy. However, with this strategy designed to harmonize CRCmax, the RMSpd for CRCmean only improved to ~14% in subjects.
Conclusions
The consistency of the CRC measurements between the phantom and subject data demonstrates that harmonization strategies defined with phantom studies track well to patient images. However, quantitative agreement between different scanners as represented by the RMSpd depends on the metric chosen for harmonization.
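As an illustration of the two metrics named above, the following is a minimal sketch and not the authors' analysis code. It assumes the NEMA-style hot-sphere definition of contrast recovery, and it assumes that the percent difference in RMSpd is taken relative to the mean of the two scanners' CRC values; the abstract does not state either definition explicitly. The function names and parameters are hypothetical.

```python
import math
from typing import Sequence


def contrast_recovery(lesion_value: float, background_mean: float,
                      true_ratio: float) -> float:
    """Contrast recovery coefficient for a hot lesion (assumed NEMA-style form).

    lesion_value    : mean (for CRCmean) or max (for CRCmax) in the lesion ROI
    background_mean : mean in the local background ROI
    true_ratio      : known lesion-to-background activity ratio
    """
    if background_mean <= 0 or true_ratio == 1:
        raise ValueError("background must be positive and true_ratio must differ from 1")
    return (lesion_value / background_mean - 1.0) / (true_ratio - 1.0)


def rms_percent_difference(crc_a: Sequence[float],
                           crc_b: Sequence[float]) -> float:
    """Root mean squared percent difference between two scanners' CRC values.

    Assumption: each percent difference is normalized by the mean of the
    two values for that sphere size.
    """
    if len(crc_a) != len(crc_b) or len(crc_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = []
    for a, b in zip(crc_a, crc_b):
        reference = 0.5 * (a + b)  # assumed normalization
        squared.append((100.0 * (a - b) / reference) ** 2)
    return math.sqrt(sum(squared) / len(squared))
```

With one CRC value per sphere size on each scanner, `rms_percent_difference` returns a single number in percent that can be compared before and after postfiltering. Any input values used with this sketch would be illustrative and are not taken from the study.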