1. Introduction
The standardized uptake value (SUV) is the most widely utilized quantitative metric in positron emission tomography (PET) [1]. In oncologic imaging, the vast majority of image reconstructions are based on the SUV, which reflects the number of decay events in a given volume of tissue, regardless of whether the tracer is bound or unbound. As such, physiologic tracer uptake by normal organs on SUV images can obscure tracer-avid lesions, resulting in inaccurate tumor burden assessments [2]. The Patlak model attempts to address this shortcoming [3]. This model assumes that the circulating tracer is trapped irreversibly, allowing the tracer’s net uptake rate to be estimated via the Patlak slope (PS) [4]. Several clinically utilized oncologic PET tracers generally exhibit this behavior, thereby allowing Patlak modeling of dynamic whole-body (WB) PET data [5,6]. Importantly, PS images, by removing the signal derived from the unbound tracer, have the potential to improve lesion conspicuity in organs with relatively high background parenchymal activity (e.g., liver) [7].
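The slope estimation can be illustrated with a minimal sketch. It assumes the standard graphical form of the Patlak model, in which, after a steady-state time t*, the ratio of tissue to plasma activity is linear in the ratio of the time-integrated plasma activity to the instantaneous plasma activity; the slope of that line is the PS. The function names, the `t_star` parameter, and the synthetic curves are illustrative assumptions and do not represent the scanner software's implementation.

```python
def cumulative_trapezoid(times, values):
    """Running trapezoidal integral of `values` over `times`, starting at 0."""
    integral = [0.0]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        integral.append(integral[-1] + 0.5 * dt * (values[i] + values[i - 1]))
    return integral


def patlak_fit(times, plasma, tissue, t_star):
    """Least-squares line through the Patlak plot for frames with time >= t_star.

    x = integral of plasma activity up to t, divided by plasma activity at t
    y = tissue activity at t, divided by plasma activity at t
    Returns (slope, intercept); the slope is the net uptake rate.
    Assumes times[0] is the injection time.
    """
    cum_plasma = cumulative_trapezoid(times, plasma)
    xs, ys = [], []
    for t, cp, ct, integ in zip(times, plasma, tissue, cum_plasma):
        if t >= t_star and cp > 0:
            xs.append(integ / cp)
            ys.append(ct / cp)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two frames after t_star")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Synthetic, noise-free curves (arbitrary units), for illustration only.
    times = [0, 1, 2, 5, 10, 20, 30, 40, 50]           # minutes
    plasma = [0, 100, 60, 40, 30, 20, 15, 12, 10]
    integ = cumulative_trapezoid(times, plasma)
    tissue = [0.02 * i + 0.5 * cp for i, cp in zip(integ, plasma)]
    print(patlak_fit(times, plasma, tissue, t_star=10))
```

Because the intercept absorbs the unbound (reversible) component, an image of the slope alone omits that signal, which is the property the paragraph above refers to.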
Despite many publications on the Patlak method, only two have evaluated the clinical utility of WB PS images [8,9]. These studies reported fewer false positive and false negative findings and better tumor-to-background ratios (TBRs) on PS images relative to SUV images but did not utilize identical post-injection intervals for SUV and PS reconstructions. As such, disparate effective uptake times may have contributed to differences in SUV versus PS image quality. Thus, the aim of our study was to compare the visual quality and hepatic TBRs of SUV images to PS images at equivalent post-injection time points.
2. Materials and Methods
2.1. Study Design
This prospective study, which occurred at a single tertiary care center, was approved by our local Institutional Review Board and complied with the standards of the Health Insurance Portability and Accountability Act. We enrolled subjects already scheduled to undergo standard-of-care (SOC) oncologic PET/computed tomography (CT) examinations with one of the following tracers, all of which satisfy the assumptions of the Patlak model [5,6,10,11]: 2-deoxy-2-[18F]fluoro-D-glucose ([18F]FDG); [68Ga]Ga-DOTATATE or [64Cu]Cu-DOTATATE (hereafter collectively called DOTATATE, given our analytic pooling of 64Cu and 68Ga cases); or [18F]piflufolastat ([18F]DCFPyL). The inclusion criteria were as follows: age of 18 years or older; ability to provide written informed consent; and ability (self-reported) to undergo approximately 90 min of supine imaging with minimal motion. Informed consent was obtained from all subjects involved in the study. Study imaging occurred immediately prior to and following the SOC PET/CT acquisition in a single session.
2.2. Imaging Protocol
The imaging protocol is captured in Figure 1. All study imaging occurred on a single Biograph Vision 600 PET/CT scanner (Siemens Healthineers; Knoxville, TN, USA) utilizing Food and Drug Administration-approved, commercially available software for on-line reconstruction of multiparametric PET images (FlowMotion Multiparametric PET Suite; Siemens Healthineers; Knoxville, TN, USA). Subjects undergoing [18F]FDG imaging were required to fast for at least 4 h prior to [18F]FDG injection; a blood glucose of 200 mg/dL or less was required at the time of tracer administration. [18F]FDG dosing followed a weight-based schema: <54 kg, 370 MBq; 55–113 kg, 555 MBq; >113 kg, 740 MBq. [68Ga]Ga-DOTATATE dosing was also weight-based: 2.0 MBq/kg. In contrast, [64Cu]Cu-DOTATATE (333 MBq) and [18F]DCFPyL (148 MBq) doses were identical across all weights. The tracer was injected intravenously, with the patient positioned supine within the scanner bore. A 6-min dynamic PET acquisition, centered about the heart, was performed, ensuring that the initial intravascular bolus arrival was captured. The subsequent variable-duration ‘whole-body’ PET passes utilized continuous bed motion and list mode acquisition. Five 2-min passes, followed by five 5-min passes, were performed before SOC imaging. Three additional 5-min passes were performed after SOC imaging. The craniocaudal range of these ‘whole-body’ passes was determined by the clinical indication, most commonly extending from the skull base through the proximal thighs. Note that subjects left the scanner table to empty their urinary bladder immediately before the SOC acquisition. Both low-dose CT scans utilized the following parameters: CARE Dose4D, 111 mAs (reference); CARE kV, 120 kV (reference); ADMIRE strength of 2.
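The dosing rules above can be written out as a short lookup, given here only as an illustrative sketch. The function name and tracer labels are hypothetical. Weights from 54 kg up to (but not including) 55 kg fall between the stated tiers, so the sketch raises an error for them rather than guessing at the intended boundary.

```python
def tracer_dose_mbq(tracer, weight_kg):
    """Return the nominal injected activity (MBq) for a tracer and body weight.

    Tracer labels are hypothetical shorthand:
    "FDG", "68Ga-DOTATATE", "64Cu-DOTATATE", "DCFPyL".
    """
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if tracer == "FDG":
        # Tiers exactly as stated in the protocol text.
        if weight_kg < 54:
            return 370.0
        if 55 <= weight_kg <= 113:
            return 555.0
        if weight_kg > 113:
            return 740.0
        # 54 <= weight < 55 kg is not assigned to a tier in the text.
        raise ValueError("weight falls between the stated FDG dosing tiers")
    if tracer == "68Ga-DOTATATE":
        return 2.0 * weight_kg      # 2.0 MBq/kg
    if tracer == "64Cu-DOTATATE":
        return 333.0                # fixed dose
    if tracer == "DCFPyL":
        return 148.0                # fixed dose
    raise ValueError(f"unknown tracer: {tracer}")
```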
2.3. PET Image Reconstruction
Cylindrical volumes of interest (VOIs) were automatically placed by the scanner software in the descending thoracic aorta on all PET acquisitions. Blood activity concentrations were extracted at each time point to derive a time-activity curve (i.e., arterial input function), as required for Patlak modeling [12]. Prior to Patlak reconstruction, the ‘whole-body’ passes were dynamically reviewed by one of the study investigators to ensure that the images were not substantially degraded by bulk body motion. SUV and PS image reconstructions utilized the following parameters per manufacturer recommendations: SUV: time-of-flight, point-spread-function, 4 iterations, 5 subsets, 440 × 440 matrix, all-pass filter; PS: time-of-flight, point-spread-function, 8 iterations, 5 subsets, 220 × 220 matrix, 2 mm Gaussian filter. Note that PS images are intrinsically noisier than SUV images and that the differences in these parameters (e.g., smaller matrix size) were intended to achieve a similar level of image noise across reconstructions. Time-matched PS and SUV images were reconstructed from three 5-min ‘whole-body’ passes performed approximately 35–50 min (early) or 75–90 min (late) following tracer injection. We utilized the three latest passes before the SOC imaging for reconstruction of the early images to ensure adequate time for steady-state conditions to be achieved.
The SUV calculation utilized actual body weight; SUV had units of g/mL. For [18F]FDG studies, PS had units of mg/min/100 mL, as the scanner-derived PS values were multiplied by the patient’s blood glucose level at the time of tracer injection; this approach accounts for the effects of large differences in blood glucose levels on the rate of irreversible [18F]FDG trapping. This version of the PS is equivalent to the metabolic rate of [18F]FDG (MRFDG). For DOTATATE or [18F]DCFPyL studies, PS had units of mL/min/100 mL. This version of the PS is equivalent to the influx constant (Ki). The MRFDG and the Ki are collectively called the PS in this study. Additionally, as DOTATATE and [18F]DCFPyL remain in the blood plasma (i.e., do not equilibrate with the red blood cell cytoplasm like [18F]FDG), the PS values for cases performed with these tracers were retrospectively corrected for the patient’s hematocrit as follows [13,14]:
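A plausible form of this correction can be sketched under two assumptions: that the image-derived input function measures whole-blood activity, C_b(t), and that the tracer is confined to plasma, so that the plasma concentration C_p(t) exceeds the whole-blood concentration by a factor of 1/(1 − Hct). The symbols and subscripts below are illustrative and are not taken from the source.

```latex
\begin{align}
  C_b(t) &= (1 - \mathrm{Hct})\, C_p(t) \\
  C_T(t) &= K_{i,p} \int_0^t C_p(\tau)\, d\tau + V\, C_p(t)
          = \frac{K_{i,p}}{1 - \mathrm{Hct}} \int_0^t C_b(\tau)\, d\tau
            + \frac{V}{1 - \mathrm{Hct}}\, C_b(t) \\
  \mathrm{PS}_{\text{corrected}} &= K_{i,p}
          = (1 - \mathrm{Hct}) \cdot \mathrm{PS}_{\text{scanner}}
\end{align}
```

Here C_T(t) is the tissue activity, V is the Patlak intercept, and PS_scanner is the slope obtained with the whole-blood input function. Under these assumptions, a hematocrit of 0.40 would scale the scanner-derived slope by 0.60.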
2.4. Quantitative Analysis
Tracer-avid hepatic lesions felt to represent sites of malignancy on the SOC PET/CT interpretation were identified. In the case of numerous lesions, the largest and/or most tracer-avid (up to a maximum of 5 per patient) were selected. Utilizing co-registered CT images for guidance, each lesion was manually segmented in MIM version 7.1.5 (MIM Software; Cleveland, OH, USA) on four PET image sets (PS-early, SUV-early, PS-late, SUV-late), thereby generating 4 separate VOIs for each lesion. Maximum (max) and peak values were extracted for each lesion [1]. Additionally, for each PET reconstruction, a spherical VOI of 3 cm diameter was placed in the right hemiliver (avoiding areas of pathology) to extract early and late mean hepatic SUVs and PS values. Early and late TBRs, defined as the ratio of a lesion’s maximum or peak value to the background liver mean value, were calculated.
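The TBR definition reduces to a single division per lesion, metric, and reconstruction. The following minimal sketch shows that calculation; the function names, the dictionary layout, and the numeric values in the usage note are hypothetical and are not study data.

```python
def tumor_to_background_ratio(lesion_value, liver_mean):
    """Ratio of a lesion's max or peak value to the mean background liver value."""
    if liver_mean <= 0:
        raise ValueError("background liver mean must be positive")
    return lesion_value / liver_mean


def lesion_tbrs(lesion, liver_means):
    """Compute TBRs for one lesion across reconstructions.

    lesion:      {reconstruction: {"max": value, "peak": value}}
    liver_means: {reconstruction: mean value in the 3 cm liver VOI}
    Returns      {reconstruction: {"max": TBR, "peak": TBR}}
    """
    return {
        recon: {
            metric: tumor_to_background_ratio(value, liver_means[recon])
            for metric, value in metrics.items()
        }
        for recon, metrics in lesion.items()
    }
```

For instance, a hypothetical lesion with an SUV-max of 9.0 against a liver mean of 3.0 yields a TBR of 3.0. Lesion and background values must come from the same reconstruction, since SUV and PS carry different units.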
2.5. Qualitative Analysis
Two independent readers blinded to reconstruction type assessed each PET image set with co-registered CT images for a given participant in a single session. The assessment order was randomized on a per-subject basis to mitigate the systematic effects of recall bias. Overall image quality, image noise, artifact freeness, and lesion conspicuity were scored via a 4-point Likert scale (1 = worst; 4 = best). Furthermore, readers recorded the number of presumably malignant tracer-avid lesions for each reconstruction, using the reconstruction with the fewest such lesions as the reference (i.e., relative lesion number). For a given subject, the reconstruction with the fewest lesions was assigned a score of 0. The other three reconstructions were assigned a number indicating how many more lesions were apparent on that reconstruction. For example, if the early PS images showed 8 lesions, the early SUV images showed 9 lesions, and the late PS and late SUV images each showed 11 lesions, the relative lesion number was scored as follows: early PS—0; early SUV—1; late PS—3; late SUV—3.
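The relative lesion number scoring can be expressed in a few lines. This sketch uses a hypothetical function name and reproduces the worked example from the text.

```python
def relative_lesion_numbers(lesion_counts):
    """Score each reconstruction by how many more lesions it shows than the
    reconstruction with the fewest lesions (which scores 0)."""
    fewest = min(lesion_counts.values())
    return {recon: count - fewest for recon, count in lesion_counts.items()}


# Worked example from the text.
counts = {"PS-early": 8, "SUV-early": 9, "PS-late": 11, "SUV-late": 11}
print(relative_lesion_numbers(counts))
# {'PS-early': 0, 'SUV-early': 1, 'PS-late': 3, 'SUV-late': 3}
```

Because the score is anchored to each subject's own minimum, it compares reconstructions within a subject and is unaffected by how many lesions that subject has overall.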
2.6. Statistical Analysis
Prism 9 (GraphPad Software; San Diego, CA, USA) and Excel 2016 (Microsoft, Inc.; Redmond, WA, USA) were utilized for statistical analysis. Demographic, oncologic, and PET/CT characteristics were summarized descriptively. Pairwise comparisons of quantitative and qualitative variables, many of which were deemed to be non-normal via the Shapiro-Wilk test, were performed via the two-tailed Wilcoxon signed-rank test. Separate analyses were performed for all cases, for the [18F]FDG subgroup, and for the DOTATATE subgroup. There were insufficient [18F]DCFPyL cases for subgroup analysis. To assess inter-reader agreement for the qualitative analysis, we utilized percent agreement rather than kappa, which could not be estimated because, in multiple comparisons, both readers preferred the same reconstruction for all or nearly all cases. For a given pairwise comparison, percent agreement was defined as the number of cases in which (A) both readers preferred the same reconstruction (regardless of magnitude) or (B) both readers had a lack of preference, divided by the total number of cases (n = 43). A 95% confidence interval (CI) for the percent agreement was calculated via binomial exact proportions due to the relatively small sample size. Statistical significance was defined as p < 0.05.
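The percent agreement definition and its exact binomial interval can be sketched as follows. The sketch assumes that "binomial exact" refers to the Clopper-Pearson interval, which is the usual meaning but is not stated in the text; the function names are hypothetical, and this is not the Prism or Excel implementation.

```python
from math import comb


def preference(score_a, score_b):
    """1 if reconstruction A scored higher, -1 if B did, 0 for no preference."""
    return (score_a > score_b) - (score_a < score_b)


def agreement_count(reader1_pairs, reader2_pairs):
    """Count cases where both readers share a preference (or both have none).

    Each argument is a list of (score_a, score_b) tuples, one per case.
    Returns (number of agreeing cases, total cases).
    """
    agree = sum(
        preference(*p1) == preference(*p2)
        for p1, p2 in zip(reader1_pairs, reader2_pairs)
    )
    return agree, len(reader1_pairs)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) CI for a proportion k/n, found by bisection."""

    def solve(cdf_at, target):
        # cdf_at(p) decreases as p increases; find p where it equals target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if cdf_at(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper
```

With `agree, n = agreement_count(r1, r2)`, the percent agreement is `100 * agree / n` and its interval is `clopper_pearson(agree, n)`. When both readers agree on every case, the upper bound is exactly 1 and the lower bound is (α/2)^(1/n).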
3. Results
3.1. Study Cohort
Seventy-eight patients were enrolled in the study. Forty-three subjects (33 [18F]FDG, 8 DOTATATE, and 2 [18F]DCFPyL) were deemed to have tracer-avid, presumably malignant lesions on the clinical reports for the SOC portions of their PET/CT examinations and were included in the analysis (Figure 2). This study cohort was 60.5% male (26/43) with a mean age of 63.3 years. Additional patient and scan characteristics are summarized in Table 1.
3.2. Image Quality of SUV vs. PS Reconstructions
Supplemental Tables S1–S5 show the results of the qualitative analyses across all tracers and for both subgroups. These data are also summarized visually for all tracers (Figure 3), as well as for the [18F]FDG (Figure 4) and DOTATATE (Figure 5) subgroups. An example case is shown in Figure 6. The values in this subsection are means.
For overall image quality (PS-late vs. PS-early, R1: 3.95 vs. 1.19; R2: 3.95 vs. 2.14), image noise, and artifact freeness, both readers rated the PS-early images as inferior (all p values < 0.001) to the other reconstructions across all tracers, with high agreement (range: 81.4–100%). PS-early images also had significantly lower lesion conspicuity than the other reconstructions for both readers, though with lower agreement (range: 44.2–69.8%). PS-early images had significantly fewer tracer-avid lesions relative to the other reconstructions for both readers (with the exception of reader 1 when comparing with SUV-early images). The [18F]FDG subgroup analysis produced similar results. The DOTATATE subgroup analysis had too few lesions to assess the relative lesion number statistically.
The relationships among the PS-late, SUV-early, and SUV-late reconstructions were more heterogeneous. For example, across all tracers, both readers preferred PS-late images to SUV-late images (p ≤ 0.002) for overall image quality, image noise, and artifact freeness, though some of these relationships did not persist for reader 2 in the subgroup analysis. However, across all tracers, there were no significant differences between PS-late images and SUV-late images in terms of lesion conspicuity and relative lesion number, with the exception of a slightly higher relative lesion number for reader 1 on the SUV-late images (2.09 vs. 1.35; p = 0.04). For overall image quality and image noise (across all tracers), both readers preferred SUV-early images to SUV-late images (all p ≤ 0.03), but with different results for each reader for the other qualitative features. Finally, across all tracers and for both subgroups, SUV-early and PS-late images scored similarly in overall image quality and image noise; however, both readers reported significantly higher lesion conspicuity and relative lesion number for PS-late images across all tracers and for the [18F]FDG subgroup (p ≤ 0.02).
3.3. Hepatic TBRs on SUV vs. PS Images
Among the 43 subjects included in the qualitative analysis, 15 subjects (7 [18F]FDG, 8 DOTATATE) had a total of 36 tracer-avid liver lesions (18 [18F]FDG, 18 DOTATATE). Table 2 shows the results of the TBR analysis. Values in this subsection are medians. Across all tracers, hepatic TBRs were slightly but significantly higher at the early versus late time point when based on PS-max (3.87 vs. 3.57; p < 0.001) and PS-peak (2.90 vs. 2.80; p = 0.03), though with opposite trends for the [18F]FDG and DOTATATE subgroups. In contrast, across all tracers, hepatic TBRs were significantly lower at the early versus late time point when based on SUV-max (3.09 vs. 5.29; p < 0.001) and SUV-peak (2.28 vs. 3.10; p < 0.001), with mostly similar findings in the subgroup analyses.
Across all tracers, hepatic TBRs were significantly higher at the early time point for PS-max versus SUV-max (3.87 vs. 3.09; p = 0.006) and for PS-peak versus SUV-peak (2.90 vs. 2.28; p = 0.003). Similar findings were observed for the [18F]FDG subgroup; however, for the DOTATATE subgroup, there were no significant differences in early TBRs between SUV-based metrics and PS-based metrics. In contrast, across all tracers, hepatic TBRs were significantly lower at the late time point for PS-max versus SUV-max (3.57 vs. 5.29; p < 0.001) and for PS-peak versus SUV-peak (2.80 vs. 3.10; p < 0.001). Equivalent, statistically significant late hepatic TBR findings were also observed for the [18F]FDG and DOTATATE subgroups. An example case is shown in Figure 7.
4. Discussion
In this study, we examined PS versus SUV overall image quality and found that late PS images (R1: 3.95, R2: 3.95) were similar (p > 0.05) to early SUV images (R1: 3.88, R2: 3.84) but slightly superior (p ≤ 0.002) to late SUV images (R1: 2.97, R2: 3.44), with more pronounced superiority (p < 0.001) relative to early PS images (R1: 1.19, R2: 2.14). In terms of relative lesion number, late PS images outperformed early SUV images for both readers but were slightly inferior to late SUV images for one reader only; again, early PS images were generally inferior to other reconstructions. Finally, among hepatic lesions, early TBRs were higher for PS images, whereas late TBRs were higher for SUV images.
Our finding that late PS images are similar (or sometimes superior) to SUV images in terms of multiple qualitative metrics agrees with previously published data. For example, a study of 18 patients undergoing oncologic [18F]FDG-PET/CT found that PS images were of similar or slightly inferior image quality relative to SUV images [9]. However, PS images were reconstructed from earlier post-injection time points than SUV images, possibly contributing to lower PS image quality. A similar study of 109 patients undergoing oncologic [18F]FDG-PET/CT reported that PS and SUV reconstructions were subjectively of comparable quality, though PS images were again derived from earlier post-injection time points than SUV images [8]. These same two studies also reported that Patlak-derived reconstructions (including PS images) may occasionally identify malignant lesions not seen on SUV images or allow lesions that appear suspicious on SUV images to be dismissed as benign [8,9]. Although our study did not entail the use of a reference standard to compare the diagnostic accuracy of PS versus SUV images, we did find that late PS images allow for the identification of a slightly higher number of tracer-avid lesions than early SUV images but similar to slightly fewer tracer-avid lesions than late SUV images.
Regarding quantitative assessments of lesion conspicuity, which focus specifically on liver lesions, we found that PS images provide a higher TBR than SUV images at early post-injection time points, with opposite findings at late post-injection time points. Prior studies reported higher TBRs for PS images than for SUV images, though this finding was confounded by the different uptake times for SUV and PS images [8,9]. We observed that the benefits of PS images in terms of hepatic lesion conspicuity are highly dependent on the uptake times; PS images are unlikely to offer much (if any) added value for hepatic lesion detection relative to SUV images, provided that the PET data are acquired after a sufficiently long delay (75 min in our study). For SUV images, the higher late hepatic TBRs reflect continuous accumulation of tracer by malignant lesions during the post-injection period; significant early-to-late decreases in background hepatic tracer activity also contributed to this finding for the [18F]FDG subgroup, likely reflecting hepatic [18F]FDG efflux related to physiological dephosphorylation [15,16].
Our study has several limitations. First, there may be biases related to our single-center design and utilization of a single PET scanner. Our results should be confirmed at other institutions and on other PET scanner models. Second, due to the heterogeneity of the patient cohort and the relatively small sample size, it was not possible to perform subgroup analysis based on particular tumor types or clinical indications. Third, the SUV and PS reconstructions utilized parameters specifically recommended by the scanner’s manufacturer. We did not independently optimize these settings for image quality. Therefore, the observed qualitative inferiority of the PS-early images might be overcome by future modifications to the reconstruction parameters. Fourth, our study did not utilize a reference standard to adjudicate the diagnostic accuracy of PS versus SUV images. Larger studies focusing on particular cancer types and imaging indications will be needed to achieve sufficient power to address questions of clinical impact. Fifth, due to our relatively small sample sizes, cases/lesions were pooled across tracer types for some analyses, though tracer-specific analyses were also performed to evaluate for any effects related to differences in tracer behavior. Finally, our study excluded subjects who reported potential difficulties tolerating a 90-min imaging period. Although data acquisition for PS reconstruction can be accomplished much faster than in our study (i.e., only three WB passes), the recruitment strategy may have enriched our cohort for patients capable of remaining relatively motionless during imaging. Consequently, PS images, the reconstruction of which is based on the expectation that patients remain nearly motionless throughout the dynamic PET imaging period, may be of lower quality in an unselected oncologic population.