Systematic Review 2023
Summary
eClinicalMedicine 2023;59: 101960. Published Online 6 April 2023. https://doi.org/10.1016/j.eclinm.2023.101960
Background The accuracy of diagnostic tests available in primary care to identify the disc, sacroiliac joint, and facet
joint as the source of low back pain is uncertain.
Methods Systematic review of diagnostic tests available in primary care. MEDLINE, CINAHL, and EMBASE were
searched between March 2006 and 25th January 2023. Pairs of reviewers independently screened all studies, extracted
data, and assessed risk of bias using QUADAS-2. Pooling was performed for homogenous studies. Positive likelihood
ratios (+LR) ≥2 and negative likelihood ratios (−LR) ≤0.5 were considered informative. This review is registered with
PROSPERO (CRD42020169828).
Findings We included 62 studies: 35 investigated the disc, 14 the facet joint, 11 the sacroiliac joint, and 2 investigated
all three structures in patients with persistent low back pain. For risk of bias, the domain ‘reference standard’ scored
worst; however, approximately half the studies were of low risk of bias for every other domain. For the disc, pooling
demonstrated that MRI findings of disc degeneration and annular fissure resulted in informative +LRs: 2.53 (95% CI:
1.57–4.07) and 2.88 (95% CI: 2.02–4.10) and −LRs: 0.15 (95% CI: 0.09–0.24) and 0.24 (95% CI: 0.10–0.55) respectively.
Pooled results for Modic type 1, Modic type 2, and HIZ on MRI, and centralisation phenomenon yielded
informative +LRs: 10.00 (95% CI: 4.20–23.82), 8.03 (95% CI: 3.23–19.97), 3.10 (95% CI: 2.27–4.25), and 3.06 (95% CI:
1.44–6.50) respectively, but uninformative −LRs: 0.84 (95% CI: 0.74–0.96), 0.88 (95% CI: 0.80–0.96), 0.61 (95% CI:
0.48–0.77), and 0.66 (95% CI: 0.52–0.84) respectively. For the facet joint, pooling demonstrated facet joint uptake on
SPECT resulted in informative +LRs: 2.80 (95% CI: 1.82–4.31) and −LRs: 0.44 (95% CI: 0.25–0.77). For the sacroiliac
joint, a combination of pain provocation tests and absence of midline low back pain resulted in informative +LRs of
2.41 (95% CI: 1.89–3.07) and 2.44 (95% CI: 1.50–3.98) and −LRs of 0.35 (95% CI: 0.12–1.01) and 0.31 (95% CI:
0.21–0.47) respectively. Radionuclide imaging yielded an informative +LR 7.33 (95% CI: 1.42–37.80) but an
uninformative −LR 0.74 (95% CI: 0.41–1.34).
Interpretation There are informative diagnostic tests for the disc, sacroiliac joint, and facet joint (only one test). The
evidence suggests a diagnosis may be possible for some patients with low back pain, potentially guiding targeted and
specific treatment approaches.
Copyright © 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND
license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords: Diagnosis; Pathoanatomical; Low back pain; Reference standard; Index test
*Corresponding author.
E-mail address: christopher.han@sydney.edu.au (C.S. Han).
Research in context
Evidence before this study
Our previous review conducted in 2007 found relatively few studies had investigated the diagnostic accuracy of tests to identify the disc, facet joint or sacroiliac joint as the source of low back pain. Pooling was limited due to heterogeneity between studies and poor study quality. The diagnostic accuracy of index tests was unclear.
Added value of this study
In this new study a greater number of studies were able to be pooled, resolving some uncertainty about diagnostic accuracy. Considering an informative diagnostic test as one with a likelihood ratio ≥2.0 or ≤0.5, there were informative tests for the disc, sacroiliac joint, and the facet joint. The review provides evidence that a diagnosis may be possible for some patients with low back pain, potentially guiding clinical management.
Implications of all the available evidence
Our study identifies tests available to primary care clinicians to identify the disc and sacroiliac joint as the source of low back pain and creates opportunities for more targeted and specific treatment approaches. Future research should investigate if this targeted approach provides better outcomes than generic, symptomatic treatment of low back pain.
Search strategy
MEDLINE, CINAHL, and EMBASE were searched for the period between March 2006 and 25th January 2023 with the search strategy used in the Hancock et al., 2007 review.7 The complete search strategies from all databases are presented in Supplementary Appendix S1. We also screened reference lists of included studies. Forward citation searching was also performed. We included studies in all languages.
Eligibility criteria
Studies were required to meet the following criteria (also used in the previous review7) to be eligible.
i) Included participants with low back pain without serious pathology such as cancer, infection, or fracture
ii) Used a reference standard test advocated by the International Association for the Study of Pain.10 These were: discography for discogenic pain (with a minimum of two levels tested per patient), intra-articular local anaesthetic blocks for SIJ pain, and either intra-articular blocks or medial branch blocks for facet joint pain
iii) Assessed at least one index test available to primary care clinicians
iv) Presented a 2 × 2 contingency table, or data allowing development of contingency table(s).
One review author (SwS) excluded clearly irrelevant titles. Two review authors (SwS, AT, or SaS) independently screened abstracts to exclude irrelevant studies. Two reviewers (SwS, AT, SaS, or CSH) then independently reviewed full texts for eligibility based on the inclusion criteria. Studies in languages other than English were screened by a researcher fluent in the appropriate language. Disagreements were resolved through discussion and a third reviewer (CGM or MJH).
Data and statistical analysis
Two review authors independently extracted data into the data extraction form. Any disagreements or discrepancies in the data extraction were resolved through discussion and a third reviewer (CGM or MJH) if necessary. This form was used to record study population, hypothesised nociceptive/tissue source of low back pain, index tests, sensitivity, specificity, the positive (+LR) and negative (−LR) likelihood ratios and 95% confidence intervals (95% CI). We prioritised using raw data when available to calculate the sensitivity, specificity, +LR, −LR, and 95% CIs.
Two review authors (SwS, AT, SaS, or CH) used the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) scale11 independently to rate risk of bias. This tool evaluates four domains: patient selection; index test; reference standard; and flow and timing.11 Each author independently rated the risk of bias in each of the four domains (guided by signalling questions). Further details on the decision-making process of study quality are presented in Supplementary Appendix S5. Any disagreements were resolved through discussion and a third reviewer (CGM or MJH).
Following data extraction, data were pooled whenever possible using random effects models. The software used for meta-analysis and to calculate sensitivities, specificities, and likelihood ratios from raw data (2 × 2 tables) was Meta-DiSc v1.4. Due to limited data for most index tests, pooling was not always possible, and results of individual tests were also presented descriptively.
Potential publication bias was assessed post-hoc using the midas command12 in Stata version 17/BE. To evaluate potential publication bias, funnel plots were explored, and Deeks’ Funnel Plot Asymmetry Test was used to explore funnel plot asymmetry. Publication bias was deemed statistically significant if the p value was <0.05. As per the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy Version 2.0, 2022 (Chapter 11.5.5) we only assessed for publication bias if ten or more studies were included in the analysis for a single test.13
For index tests scored as either positive or negative we calculated the sensitivity, specificity, and likelihood ratios. We also presented post-test probabilities using standard methods with the “95% confidence interval for post-test probability determined with point estimate of pre-test probability and the 95% confidence interval of the likelihood ratio.”14 For index tests with different thresholds to score as positive or negative, these results are presented descriptively to show the estimates of sensitivities and specificities for each test cut-off.
We focused on likelihood ratios in our results; however, sensitivity and specificity are presented in Table 1 and Supplementary Appendix S6. We considered the index test to be informative if the positive likelihood ratios (+LR) were ≥2 and negative likelihood ratios (−LR) were ≤0.5.49,50
We pre-specified that we would examine the effect of employing stricter reference standards.51 The stricter reference standards were:51
i) Discogenic pain studies: discography with a concordant pain provocation score of 6 out of 10 or greater and an adjacent pain-free control disc;
ii) Facet joint pain studies: greater than or equal to 80% pain relief with double blocks (using a placebo control or comparator local anaesthetic) to account for concurrent pain generators;
iii) Sacroiliac joint pain studies: greater than or equal to 50% pain relief with double blocks (using a placebo control or comparator local anaesthetic) to account for concurrent pain generators.
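As a rough illustration of the calculations described in this section, the sketch below derives positive and negative likelihood ratios from 2 × 2 tables and pools them on the log scale with a DerSimonian and Laird random-effects model, reporting I². It is not the Meta-DiSc implementation: the choice of DerSimonian and Laird, the function names, and the study counts are all assumptions made for illustration. No continuity correction is applied, so tables containing a zero cell would need separate handling.

```python
import math

# Hypothetical 2 x 2 tables as (tp, fp, fn, tn); these are NOT data from the review.
STUDIES = [(30, 12, 20, 58), (18, 5, 22, 45), (44, 20, 36, 90)]


def log_lr(tp, fp, fn, tn, positive=True):
    """Return (log likelihood ratio, variance of the log) for one 2 x 2 table."""
    diseased, healthy = tp + fn, fp + tn
    if positive:  # +LR = sensitivity / (1 - specificity)
        lr = (tp / diseased) / (fp / healthy)
        var = 1 / tp - 1 / diseased + 1 / fp - 1 / healthy
    else:  # -LR = (1 - sensitivity) / specificity
        lr = (fn / diseased) / (tn / healthy)
        var = 1 / fn - 1 / diseased + 1 / tn - 1 / healthy
    return math.log(lr), var


def pool_random_effects(effects, z=1.96):
    """DerSimonian-Laird pooling of (log effect, variance) pairs."""
    y = [effect for effect, _ in effects]
    v = [var for _, var in effects]
    w = [1 / vi for vi in v]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if df > 0 else 0.0  # between-study variance
    w_star = [1 / (vi + tau2) for vi in v]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I-squared, in percent
    return {
        "estimate": math.exp(pooled),
        "ci": (math.exp(pooled - z * se), math.exp(pooled + z * se)),
        "tau2": tau2,
        "i2": i2,
    }


def pooled_lrs(studies):
    """Pool +LR and -LR across a list of 2 x 2 tables."""
    pos = pool_random_effects([log_lr(*s, positive=True) for s in studies])
    neg = pool_random_effects([log_lr(*s, positive=False) for s in studies])
    return pos, neg


def informative(lr_pos, lr_neg):
    """Apply the review's cut-offs (+LR >= 2, -LR <= 0.5), judged separately."""
    return lr_pos >= 2.0, lr_neg <= 0.5


if __name__ == "__main__":
    pos, neg = pooled_lrs(STUDIES)
    for label, res in (("+LR", pos), ("-LR", neg)):
        low, high = res["ci"]
        print(f"{label}: {res['estimate']:.2f} ({low:.2f} to {high:.2f}), I2 = {res['i2']:.1f}%")
    print("informative (+LR, -LR):", informative(pos["estimate"], neg["estimate"]))
```

Pooling on the log scale keeps the confidence interval asymmetric around the ratio, which matches how the intervals in this review are reported; the two cut-offs are evaluated separately because a test can have an informative +LR and an uninformative −LR.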
| Index test | Refs | Studies/total sample | Sensitivity, % (95% CI) | Specificity, % (95% CI) | +LR (95% CI) | −LR (95% CI) |
|---|---|---|---|---|---|---|
| Disc | | | | | | |
| Disc degeneration (Grade ≥3) | 15–18 | 4/381 | 91.0 (85.7–94.7) | 61.3 (54.2–68.0) | 2.53 (1.57–4.07) | 0.15 (0.09–0.24) |
| Heterogeneity (I²) | | | 50.1% | 82.7% | 83.4% | 0.0% |
| Disc degeneration (Grade ≥4)ᵇ | 16,18,19 | 3/288 | 70.7 (60.7–79.4) | 66.7 (59.5–73.3) | 2.20 (1.61–3.01) | 0.37 (0.19–0.73) |
| Heterogeneity (I²) | | | 84.7% | 87.7% | 44.4% | 59.9% |
| HIZ | 18–29 | 12/1817 | 50.1 (46.7–53.4) | 86.2 (83.9–88.4) | 3.10 (2.27–4.25) | 0.61 (0.48–0.77) |
| Heterogeneity (I²) | | | 95.4% | 65.6% | 63.6% | 92.2% |
| Annular fissure | 22,25,30–32 | 5/920 | 61.2 (56.3–66.0) | 73.8 (69.8–77.5) | 2.88 (2.02–4.10) | 0.24 (0.10–0.55) |
| Heterogeneity (I²) | | | 96.0% | 94.3% | 79.5% | 90.3% |
| Modic type 1 | 18,24,25,33 | 4/803 | 12.9 (9.7–16.6) | 98.7 (97.0–99.5) | 10.00 (4.20–23.82) | 0.84 (0.74–0.96) |
| Heterogeneity (I²) | | | 91.0% | 0.0% | 0.0% | 89.2% |
| Modic type 2 | 18,25,34 | 3/706 | 12.0 (8.9–15.9) | 98.6 (96.7–99.5) | 8.03 (3.23–19.97) | 0.88 (0.80–0.96) |
| Heterogeneity (I²) | | | 77.3% | 0.0% | 0.0% | 65.8% |
| Centralisation phenomenonᵃ | 33,35–37 | 4/218 | 41.2 (33.2–49.6) | 85.9 (75.6–93.0) | 3.06 (1.44–6.50) | 0.66 (0.52–0.84) |
| Heterogeneity (I²) | | | 79.9% | 66.1% | 10.8% | 57.9% |
| SIJ | | | | | | |
| Radionuclide imaging (bone scan) | 38,39 | 2/82 | 22.7 (11.5–37.8) | 96.1 (84.4–99.7) | 7.33 (1.42–37.8) | 0.74 (0.41–1.34) |
| Heterogeneity (I²) | | | 81.4% | 0.0% | 0.0% | 80.2% |
| Gaenslen’s testᵃ | 40–43 | 4/213 | 47.9 (38.7–57.2) | 47.9 (37.5–58.4) | 0.85 (0.56–1.28) | 1.12 (0.77–1.62) |
| Heterogeneity (I²) | | | 72.3% | 73.3% | 46.5% | 34.3% |
| Sacral thrust testᵃ | 40,41,43 | 3/168 | 57.3 (45.9–68.2) | 48.8 (37.9–59.9) | 1.13 (0.73–1.75) | 0.87 (0.52–1.44) |
| Heterogeneity (I²) | | | 42.5% | 84.1% | 59.7% | 47.7% |
| Thigh thrust testᵃ | 40–44 | 5/415 | 54.1 (48.1–60.1) | 53.7 (44.9–62.3) | 1.13 (0.83–1.55) | 0.91 (0.67–1.22) |
| Heterogeneity (I²) | | | 74.6% | 31.0% | 42.5% | 42.1% |
| Compression testᵃ | 41,43 | 2/83 | 48.6 (31.9–65.6) | 71.7 (56.5–84.0) | 1.79 (1.03–3.11) | 0.74 (0.52–1.05) |
| Heterogeneity (I²) | | | 56.0% | 0.0% | 0.0% | 0.0% |
| Patrick’s test (FABER test)ᵃ | 40,43,44 | 3/319 | 76.4 (70.2–81.8) | 32.3 (23.3–42.5) | 1.05 (0.69–1.60) | 0.86 (0.30–2.48) |
| Heterogeneity (I²) | | | 80.3% | 55.0% | 80.7% | 84.9% |
| Distraction testᵃ | 41,43 | 2/82 | 41.7 (25.5–59.2) | 80.4 (66.1–90.6) | 2.18 (1.08–4.38) | 0.73 (0.54–0.99) |
| Heterogeneity (I²) | | | 0.0% | 0.0% | 0.0% | 0.0% |
| Gillet’s testᵃ | 40,42 | 2/134 | 67.5 (56.4–77.3) | 45.5 (31.2–60.2) | 1.01 (0.80–1.28) | 1.08 (0.75–1.55) |
| Heterogeneity (I²) | | | 97.5% | 89.2% | 23.8% | 0.0% |
| Absence of midline LBPᵃ | 37,45 | 2/226 | 24.6 (14.1–37.8) | 34.3 (27.2–42.0) | 2.41 (1.89–3.07) | 0.35 (0.12–1.01) |
| Heterogeneity (I²) | | | 80.2% | 0.0% | 0.0% | 75.6% |
| 3 or more positive SIJ pain provocation testsᵃ | 37,41,43,46–48 | 6/276 | 80.5 (72.0–87.4) | 68.1 (60.4–75.2) | 2.44 (1.50–3.98) | 0.31 (0.21–0.47) |
| Heterogeneity (I²) | | | 0.0% | 69.1% | 76.6% | 0.0% |
| Facet joint | | | | | | |
| Uptake on SPECT | | 3/121 | 72.6 (59.1–83.6) | 72.3 (59.8–82.7) | 2.80 (1.82–4.31) | 0.44 (0.25–0.77) |
| Heterogeneity (I²) | | | 31.1% | 0.0% | 0.0% | 7.0% |

All magnetic resonance imaging (MRI) findings (index tests) are compared to the absence of the respective MRI finding unless indicated. No index tests based on imaging were able to be pooled for Facet joint. ᵃAll index tests are based on MRI unless indicated. ᵇMRI finding index compared to disc degeneration (Grade ≤3). SIJ = Sacroiliac joint, LR = likelihood ratio, HIZ = high intensity zone, FABER = Flexion abduction external rotation, SPECT = single photon emission computed tomography.
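To illustrate how the likelihood ratios in Table 1 translate into post-test probabilities, the following sketch applies the standard odds conversion and, following the approach quoted in the Methods, carries the 95% CI of the likelihood ratio through a point estimate of the pre-test probability. The pre-test probability of 0.46 is the median prevalence of discogenic pain reported across the included studies; using it here is purely illustrative, and the function names are hypothetical.

```python
def post_test_probability(pre_test, lr):
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0 < pre_test < 1:
        raise ValueError("pre_test must lie strictly between 0 and 1")
    pre_odds = pre_test / (1 - pre_test)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)


def post_test_with_ci(pre_test, lr, lr_low, lr_high):
    """Post-test probability with limits taken from the CI of the likelihood ratio.

    The pre-test probability is treated as a fixed point estimate.
    """
    return (
        post_test_probability(pre_test, lr),
        post_test_probability(pre_test, lr_low),
        post_test_probability(pre_test, lr_high),
    )


# Pooled values copied from Table 1 (Modic type 1 changes on MRI).
MODIC1_POS = (10.00, 4.20, 23.82)  # +LR, lower limit, upper limit
MODIC1_NEG = (0.84, 0.74, 0.96)    # -LR, lower limit, upper limit
PRE_TEST = 0.46                    # illustrative pre-test probability (assumption)

if __name__ == "__main__":
    for label, lrs in (("positive test", MODIC1_POS), ("negative test", MODIC1_NEG)):
        est, low, high = post_test_with_ci(PRE_TEST, *lrs)
        print(f"Modic type 1, {label}: {est:.1%} ({low:.1%} to {high:.1%})")
```

Under these assumptions, a positive Modic type 1 finding moves the probability of discogenic pain from 46% to roughly 89% (about 78% to 95%), whereas a negative finding lowers it only to about 42%. That pattern is what an informative +LR paired with an uninformative −LR would be expected to produce.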
(Fig. 1). Excluded studies and primary reasons for exclusion are presented in Supplementary Appendix S2. Six authors16,47,52–55 were contacted for additional data and/or discrepancies, with one author16 able to provide additional data.
The results of the quality assessment using the QUADAS-2 scale11 are presented in Supplementary Appendix S3, Fig. 2 (studies reporting on the disc) and Fig. 3 (studies reporting on the SIJ and facet). The domain which scored worst was ‘reference standard’ where only 17 studies (27%) demonstrated low risk of bias. Low risk of bias scores in the remaining domains were as follows: patient selection (26/62 studies; 42%), index test (34/62 studies; 55%), and flow and timing (30/62 studies; 48%). Seventeen of the 62 (27%) studies demonstrated low concerns for applicability across all domains. Low concerns of applicability scores in the remaining domains were as follows: patient selection (33/62 studies; 53%), index test (54/62 studies; 87%), and reference standard (31/62 studies; 50%). Further details regarding scoring of the domains are presented in Supplementary Appendix S4 and S5.
Of the 62 included studies,15–42,44–48,52–76 35 studies15–36,52,54,55,59,61,66–68,72–76 investigated the disc, 14 studies56–58,60,62–65,70,71 investigated the facet joint and 11 studies38–42,44,46–48,53 the sacroiliac joint. Two studies37,45 investigated all three sources. Studies investigated from 1 to 40 index tests. Sample sizes of the 62 included studies ranged from 15 to 736 (median: 55) and they were conducted in 13 countries. Forty-five studies were conducted in tertiary care, 3 studies in secondary care, and 14 studies were not clear on the healthcare setting. No studies were conducted in primary care. All studies included patients with persistent LBP. The estimated
prevalence of pain originating from discs, facet joints, and the sacroiliac joint (according to a positive reference test) across all studies demonstrated a median of 46% (IQR: 26%) for discogenic pain, 53% (IQR: 7%) for sacroiliac joint pain, and 42% (IQR: 20%) for facet joint pain. The characteristics of the 60 studies are provided in Table 2.
The pooled results are provided in Table 1 and Supplementary Appendix S7. Table 1 and Supplementary Appendix S7 also include heterogeneity statistics (I²) for the pooled results. Results for individual studies are provided in Supplementary Appendix S6.
Results for index tests for discogenic pain studies that could be pooled across at least two studies are presented below. All MRI studies investigating discogenic pain calculated diagnostic accuracy at the level of the disc, while studies investigating centralisation always calculated diagnostic accuracy at the level of the patient.
Ten studies15–19,22,25,52,54,55 investigated MRI evidence of disc degeneration. Five studies15–19 used the Pfirrmann scale69 and seven studies measured disc degeneration, four using disc nuclear signal22,25,68,72 and three using disc height.19,22,25 Only studies investigating disc degeneration using the Pfirrmann scale (i.e., ≥ grade 3 or ≥ grade 4) were pooled for the main analysis.
For studies measuring disc degeneration using the Pfirrmann scale, we were able to pool four studies15–18 investigating disc degeneration with a threshold of ≥ grade 3 and three studies16,18,19 investigating disc degeneration with a threshold of ≥ grade 4 (Table 1). Pooling demonstrated informative +LRs (2.53; 95% CI: 1.57–4.07, I² = 83.4% and 2.20; 95% CI: 1.61–3.01, I² = 44.4% for ≥ grade 3 and ≥ grade 4 respectively) and −LRs (0.15; 95% CI: 0.09–0.24, I² = 0.0% and 0.37; 95% CI: 0.19–0.73, I² = 59.9% respectively) (Table 1).
Fourteen studies18–29,52,54 investigated MRI evidence of a high intensity zone (HIZ) and we were able to pool 12 studies18–29 (Table 1). Pooling demonstrated informative +LRs (3.10; 95% CI: 2.27–4.25, I² = 63.6%), but uninformative −LRs (0.61; 95% CI: 0.48–0.77, I² = 92.2%) (Table 1).
Six studies22,25,30–32,74 investigated the MRI evidence of an annular fissure and we were able to pool five studies22,25,30–32 (Table 1). Pooling demonstrated informative +LRs (2.88; 95% CI: 2.02–4.10, I² = 79.5%) and −LRs (0.24; 95% CI: 0.10–0.55, I² = 90.3%) (Table 1).
Four studies15,18,24,25 investigated MRI evidence of Type 1 Modic changes and we were able to pool all 4 studies (Table 1). Pooling demonstrated informative +LRs (10.00; 95% CI: 4.20–23.82, I² = 0.0%), but uninformative −LRs (0.84; 95% CI: 0.74–0.96, I² = 89.2%) (Table 1).
Three studies15,18,25 investigated MRI evidence of Type 2 Modic changes and we were able to pool all 3 studies (Table 1). Pooling demonstrated informative +LRs (8.03; 95% CI: 3.23–19.97, I² = 0.0%), but uninformative −LRs (0.88; 95% CI: 0.80–0.96, I² = 65.8%) (Table 1).
Four studies33,35–37 investigated the centralisation phenomenon to identify discogenic pain and we were able to pool all four studies (Table 1). Pooling demonstrated informative +LRs (3.06; 95% CI: 1.44–6.50, I² = 10.8%), but uninformative −LRs (0.66; 95% CI: 0.52–0.84, I² = 57.9%) (Table 1).
Index tests for sacroiliac joint pain studies that could be pooled across at least 2 studies are presented below. Index tests investigated included a range of pain provocation tests (clinical examination) and bone scan.
Two studies38,39 investigated radionuclide imaging (i.e., bone scan) to identify the sacroiliac joint as a source of low back pain and we were able to pool both studies. Pooling demonstrated informative +LRs (7.33; 95% CI: 1.42–37.8, I² = 0.0%), but uninformative −LRs (0.74; 95% CI: 0.41–1.34, I² = 80.2%) (Table 1).
For clinical examination-based index tests to identify the sacroiliac joint, our meta-analysis demonstrated informative +LR for the distraction test41,43 (2.18; 95% CI: 1.08–4.38, I² = 0.0%), but uninformative −LR (0.73; 95% CI: 0.54–0.99, I² = 0.0%). Absence of midline low back pain37,45 demonstrated informative +LR (2.41; 95% CI: 1.89–3.07, I² = 0.0%) and −LR (0.35; 95% CI: 0.12–1.01, I² = 75.6%). Pooling for all other index tests for the sacroiliac joint demonstrated that no test in isolation provided informative +LRs and −LRs (Table 1). Seven studies37,41–43,46–48 investigated a composite of pain provocation tests (3 or more positive sacroiliac joint provocation tests) and we were able to pool six studies.37,41,43,46–48 Pooling demonstrated informative +LRs (2.44; 95% CI: 1.50–3.98, I² = 76.6%) and −LRs (0.31; 95% CI: 0.21–0.47, I² = 0.0%) (Table 1).
Index tests for facet joint pain studies evaluated in two or more individual studies are presented below. Only one index test for the facet joint was able to be pooled. The remaining index tests are presented in Supplementary Appendix S6. Index tests presented below include imaging tests (single-photon emission computed tomography, SPECT), Revel’s criteria (5 or more of 7 clinical characteristics: age ≥65, pain relieved by recumbency, pain not exacerbated by coughing, pain not exacerbated by extension/rotation, pain not exacerbated by forward flexion, pain not exacerbated by hyperextension, pain not worse with rising), paraspinal pain, and midline spinal pain. Index tests investigated in single studies included a range of clinical examination findings, and clinical prediction rules (Supplementary Appendix S6).
Four studies56,58,60,77 investigated evidence of facet joint uptake on SPECT and we were able to pool three studies56,58,77 (Table 1). Pooling demonstrated informative +LRs (2.80; 95% CI: 1.82–4.31, I² = 0.0%) and −LRs (0.44; 95% CI: 0.25–0.77, I² = 7.0%) (Table 1).
Studies investigating Revel’s criteria were not pooled due to heterogeneity in diagnostic values between studies.63,64,70,71 The diagnostic accuracy of Revel’s criteria was inconsistent.63,64,70,71 Two studies70,71 found
informative +LRs and −LRs and two other studies63,64 found uninformative +LRs and −LRs (Supplementary Appendix S6). None of the seven individual items comprising Revel’s criteria produced informative +LRs and −LRs in more than one study (Supplementary Appendix S6). Paraspinal pain and midline spinal pain both had uninformative +LRs and −LRs (Supplementary Appendix S6).
We also pre-planned sensitivity analysis to investigate the influence of the reference test quality on estimates of the diagnostic accuracy of index tests, by limiting the studies to those meeting our higher threshold for the reference test quality.
Five studies20,21,25,26,28 investigating HIZ met our criteria for a higher quality reference standard. When pooling only these studies20,21,25,26,28 the results demonstrated slightly higher informativeness for the +LRs (3.54; 95% CI: 2.03–6.20) compared to when pooling all studies regardless of the reference standard quality (+LR: 3.10; 95% CI: 2.27–4.25) (Table 1). The pooled −LR (0.48; 95% CI: 0.28–0.81) met our cut-off for informativeness unlike when all studies were pooled regardless of reference standard quality (−LR: 0.61; 95% CI: 0.48–0.77) (Supplementary Appendix S8).
Two studies46,48 investigating a combination of sacroiliac joint pain provocation tests met our criteria for a higher quality reference standard. When pooling only these studies46,48 the results demonstrated higher informativeness for the +LR (4.09; 95% CI: 2.53–6.60) compared to when pooling all studies regardless of reference standard quality (+LR 2.44; 95% CI: 1.50–3.98) (Table 1). The pooled −LR (0.17; 95% CI: 0.08–0.39) met our cut-off for informativeness unlike when all studies were pooled irrespective of reference standard quality (−LR 0.31; 95% CI: 0.21–0.47) (Supplementary Appendix S8). However, these results should be interpreted with caution due to the small number of studies.
We also performed sensitivity analyses post-hoc to examine the influence of risk of bias on estimates of the diagnostic accuracy of index tests by excluding studies with high risk of bias in each of the four domains of the QUADAS-2 scale. For a combination of sacroiliac pain provocation tests, removing studies of high risk of bias for the domain ‘patient selection’ reduced the +LR from 2.44 (95% CI: 1.50–3.98) to 1.69 (95% CI: 0.98–2.90) and for ‘reference standard’ increased the +LR from 2.44 (95% CI: 1.50–3.98) to 3.06 (95% CI: 1.75–5.33). For HIZ and the remaining domains for combination of sacroiliac joint pain provocation tests, the removal of studies with high risk of bias had little to no effect on the diagnostic accuracy values. Further details are presented in Supplementary Appendix S8.
Only one index test had more than ten studies in the meta-analysis to explore publication bias. No publication bias was found for the meta-analysis of the index test ‘MRI evidence of high intensity zone’ (12 studies) based on Deeks’ test (p = 0.53).
Discussion
We located 60 studies investigating the diagnostic accuracy of tests to identify the source of low back pain. To our knowledge this is the most comprehensive analysis of diagnostic tests for low back pain to date. Most studies focussed on the disc (35 studies) with fewer studies considering the facet joint (14 studies) or the sacroiliac joint (11 studies). We found evidence that some index tests had informative +LRs and −LRs for the disc and sacroiliac joint, and the facet joint. In studies of people with persistent pain, MRI findings of disc degeneration, HIZ, annular fissure, Modic type 1, Modic type 2, and uptake of facet joint on SPECT increased the likelihood of the disc being a nociceptive source of low back pain. The only informative physical examination-based index tests were a positive centralisation phenomenon to identify discogenic pain, and the absence of midline low back pain and a combination of sacroiliac joint pain provocation tests to identify sacroiliac joint pain.
In the previous review, MRI evidence of a high intensity zone and disc degeneration, and centralisation phenomenon were found to be informative to identify the disc as the source of low back pain and a combination of sacroiliac joint provocation tests were found to be informative to identify the SIJ as the source of low back pain.7 The results of our review reinforced the results of the previous review as a larger number of studies were able to be pooled. The previous review found there were no informative index tests to identify the facet joint as the source of low back pain; however, our review found that facet joint uptake on SPECT was informative.
Strengths of this study include using a sensitive search strategy and following a strict pre-specified protocol. Another strength is that we were able to pool a greater number of studies compared to the previous review resulting in more confidence in the results.7 A limitation of our review is that most studies included in our review included convenience samples of patients referred to tertiary settings for further diagnostic testing, despite testing index tests more commonly used in primary care. This sampling approach may inflate prevalence estimates, but its impact on diagnostic accuracy of the index tests is unclear. Risk of bias was generally high for the domain ‘reference standard’ as most studies used less strict criteria for a positive test, which may reflect the realities of how these tests are used in clinical practice. Approximately half of the studies in the other three domains demonstrated low risk of bias, which should provide clinicians with some confidence in our results. Our review found that 21 new studies investigating the source of low back pain have been published in the past 15 years. While this substantially increased the available evidence, some index tests still lack adequate evidence and demonstrate values too imprecise to draw confident conclusions regarding diagnostic accuracy.
Our results challenge the dominant view in the low back pain field, that a pathoanatomical diagnosis is usually not possible and so the label non-specific low back pain should instead be used for most patients. Our review provides preliminary evidence that a diagnosis may be possible for a subgroup of patients with low back pain, potentially moving beyond the non-specific low back pain label. The ability to form a diagnosis in people with persistent low back pain is an important step towards developing new, more targeted, and specific treatment approaches. The pathology and causal mechanism driving Modic type 1 changes have, for example, been linked to bacterial infection, inflammation, and bone marrow oedema.78 For annular fissures, zones of granulation tissue have been proposed to be the causal mechanism driving discogenic low back pain.79 Treatments that aim to address these causes and mechanisms (e.g., antibiotics, Zoledronic acid or Denosumab for Modic type 1 changes) require much further research regarding their safety and effectiveness; however, these research approaches present an opportunity to develop targeted, more effective treatments.
Our results align with a recent systematic review80 that investigated the diagnostic accuracy of clusters of pain provocation tests for the sacroiliac joint as the source of LBP. The review found similar LRs to our review for 3 or more sacroiliac joint provocation tests (+LR: 2.13; 95% CI: 1.20–3.90 and −LR: 0.33; 95% CI: 0.11–0.72).80 The review also found a positive or negative test was associated with an increase or decrease in the probability of the sacroiliac joint being the source of pain.80 Our results challenge and support some of the clinical features that are thought to be associated with discogenic and SIJ pain. For example, a recent seminar in The Lancet6 advocated the MRI findings that our review showed were informative for discogenic LBP, but it also advocated clinical features which lack evidence of diagnostic accuracy. Our review will thus have value for informing clinical assessment of low back pain.
Studies investigating index tests with different thresholds or defined by different measures (e.g., disc degeneration) did not always make it clear which thresholds or definitions were being compared. It is possible this may have affected diagnostic accuracy values; however, these studies were reviewed independently by two reviewers (CSH, MJH, or CGM) to minimise errors. Possible sources of heterogeneity in our review could be from the population (e.g., whether patients were surgical candidates), healthcare setting (e.g., primary care versus secondary care), the index test performed, the reference standard, and methods of the study (e.g., defined thresholds for a positive test). Unfortunately, it was not possible to explore heterogeneity between studies due to a lack of data and studies that could be pooled in the analysis.
There is a need for more diagnostic research evaluating tests to identify the pathoanatomical source of low back pain. To date the diagnostic research has focussed on the facet joint, disc and sacroiliac joint as sources of low back pain, but other structures of the lumbosacral spine (e.g., muscles, fascia, ligaments and vertebral body)6 are potential nociceptive sources and there are treatments targeting these structures. As there are now informative tests for the disc, sacroiliac joint, and facet joint (only one test), it should be a priority to investigate whether patients judged to have these conditions based upon validated index tests have different prognoses or responses to treatment compared to patients considered not to have those conditions. Ultimately, the diagnoses based upon these index tests will only have utility if they predict prognosis or response to treatment. A related issue is how informative a test needs to be to justify shifting from generic treatment of cases of non-specific low back pain to more targeted treatments directed towards the structure putatively responsible for a patient’s symptoms. This issue is of interest as the +LRs for the
disc tests ranged from 2.20 to 10.00 and so yield quite different certainties in a particular diagnosis. Some of the index tests investigated in our review are continuous (e.g., disc degeneration) and the diagnostic accuracy may be influenced by the thresholds used to define a positive test. Future research should more carefully investigate the optimal thresholds for a positive test.
Not all patients with low back pain would potentially benefit from a pathoanatomical diagnosis as many acute cases settle rapidly with little or no formal treatment.81 The potential benefit is more likely to be seen in those
Data sharing statement
The original data will be shared with researchers upon request.
Declaration of interests
We (author team) declare no competing interests.
Acknowledgements
CGM is funded by an NHMRC Research Fellowship (APP 1194283). No other funding was provided to complete this study.
Appendix A. Supplementary data
Supplementary data related to this article can be found at https://doi.org/10.1016/j.eclinm.2023.101960.
20 Aprill C, Bogduk N. High-intensity zone: a diagnostic sign of painful lumbar disc on magnetic resonance imaging. Br J Radiol. 1992;65(773):361–369.
21 Hanna H, Tommy H, Hebelka H, Hansson T. HIZ’s relation to axial load and low back pain: investigated with axial loaded MRI and pressure controlled discography. Eur Spine J. 2013;22(4):734–739.
22 Ito M, Incorvaia KM, Yu SF, Fredrickson BE, Yuan HA, Rosenbaum AE. Predictive signs of discogenic lumbar pain on magnetic resonance imaging with discography correlation. Spine (Phila Pa 1976). 1998;23(11):1252–1258; discussion 1259–1260.
23 Lam KS, Carlin D, Mulholland RC. Lumbar disc high-intensity zone: the value and significance of provocative discography in the determination of the discogenic pain source. Eur Spine J. 2000;9(1):36–41.
24 Lei D, Rege A, Koti M, Smith FW, Wardlaw D. Painful disc lesion: can modern biplanar magnetic resonance imaging replace discography? J Spinal Disord Tech. 2008;21(6):430–435.
25 O’Neill C, Kurgansky M, Kaiser J, Lau W. Accuracy of MRI for diagnosis of discogenic pain. Pain Physician. 2008;11(3):311–326.
26 Ricketson R, Simmons JW, Hauser BO. The prolapsed intervertebral disc. The high-intensity zone with discography correlation. Spine (Phila Pa 1976). 1996;21(23):2758–2762.
27 Saifuddin A, Braithwaite I, White J, Taylor BA, Renton P. The value of lumbar spine magnetic resonance imaging in the demonstration of anular tears. Spine (Phila Pa 1976). 1998;23(4):453–457.
28 Schellhas KP, Pollei SR, Gundry CR, Heithoff KB. Lumbar disc high-intensity zone. Correlation of magnetic resonance imaging and discography. Spine (Phila Pa 1976). 1996;21(1):79–86.
29 Smith BM, Hurwitz EL, Solsberg D, et al. Interobserver reliability of detecting lumbar intervertebral disc high-intensity zone on magnetic resonance imaging and association of high-intensity zone with pain and anular disruption. Spine (Phila Pa 1976). 1998;23(19):2074–2080.
30 Bartynski WS, Agarwal V, Trang H, et al. Enhancing annular fissures and high-intensity zones: pain, internal derangement, and anesthetic response at provocation lumbar discography. AJNR Am J Neuroradiol. 2023;44(1):95–104.
31 Vanharanta H, Ohnmeiss DD, Aprill CN. Vibration pain provocation can improve the specificity of MRI in the diagnosis of symptomatic lumbar disc rupture. Clin J Pain. 1998;14(3):239–247.
32 Yoshida H, Fujiwara A, Tamai K, Kobayashi N, Saiki K, Saotome K. Diagnosis of symptomatic disc by magnetic resonance imaging: T2-weighted and gadolinium-DTPA-enhanced T1-weighted magnetic resonance imaging. J Spinal Disord Tech. 2002;15(3):193–198.
33 Donelson R, Aprill C, Medcalf R, Grant W. A prospective study of centralization of lumbar and referred pain. A predictor of symptomatic discs and anular competence. Spine (Phila Pa 1976). 1997;22(10):1115–1122.
34 Horton WC, Daftari TK. Which disc as visualized by magnetic resonance imaging is actually a source of pain? A correlation between magnetic resonance imaging and discography. Spine (Phila Pa 1976). 1992;17(6 Suppl):S164–S171.
35 Laslett M, Aprill CN, McDonald B, Oberg B. Clinical predictors of lumbar provocation discography: a study of clinical predictors of lumbar provocation discography. Eur Spine J. 2006;15(10):1473–1484.
36 Laslett M, Oberg B, Aprill CN, McDonald B. Centralization as a predictor of provocation discography results in chronic low back pain, and the influence of disability and distress on diagnostic power. Spine J. 2005;5(4):370–380.
37 Young S, Aprill C, Laslett M. Correlation of clinical examination characteristics with three sources of chronic low back pain. Spine J. 2003;3(6):460–465.
38 Maigne JY, Boulahdour H, Chatellier G. Value of quantitative radionuclide bone scanning in the diagnosis of sacroiliac joint syndrome in 32 patients with low back pain. Eur Spine J. 1998;7(4):328–331.
39 Slipman CW, Sterenfeld EB, Chou LH, Herzog R, Vresilovic E. The value of radionuclide imaging in the diagnosis of sacroiliac joint syndrome. Spine (Phila Pa 1976). 1996;21(19):2251–2254.
40 Dreyfuss P, Michaelsen M, Pauza K, McLarty J, Bogduk N. The value of medical history and physical examination in diagnosing sacroiliac joint pain. Spine (Phila Pa 1976). 1996;21(22):2594–2602.
41 Laslett M, Aprill CN, McDonald B, Young SB. Diagnosis of sacroiliac joint pain: validity of individual provocation tests and composites of tests. Man Ther. 2005;10(3):207–218.
42 Nejati P, Sartaj E, Imani F, Moeineddin R, Nejati L, Safavi M. Accuracy of the diagnostic tests of sacroiliac joint dysfunction. J Chiropr Med. 2020;19(1):28–37.
43 Schneider BJ, Ehsanian R, Rosati R, Huynh L, Levin J, Kennedy DJ. Validity of physical exam maneuvers in the diagnosis of sacroiliac joint pathology. Pain Med. 2020;21(2):255–260.
44 Mekhail N, Saweris Y, Sue Mehanny D, Makarova N, Guirguis M, Costandi S. Diagnosis of sacroiliac joint pain: predictive value of three diagnostic clinical tests. Pain Pract. 2021;21(2):204–214.
45 Depalma MJ, Ketchum JM, Trussell BS, Saullo TR, Slipman CW. Does the location of low back pain predict its source? PM R. 2011;3(1):33–39.
46 Laslett M, Young SB, Aprill CN, McDonald B. Diagnosing painful sacroiliac joints: a validity study of a McKenzie evaluation and sacroiliac provocation tests. Aust J Physiother. 2003;49(2):89–97.
47 Stanford G, Burnham RS. Is it useful to repeat sacroiliac joint provocative tests post-block? Pain Med. 2010;11(12):1774–1776.
48 van der Wurff P, Buijs EJ, Groen GJ. A multitest regimen of pain provocation tests as an aid to reduce unnecessary minimally invasive sacroiliac joint procedures. Arch Phys Med Rehabil. 2006;87(1):10–14.
49 Deeks JJ, Altman DG. Diagnostic tests 4: likelihood ratios. BMJ. 2004;329(7458):168–169.
50 Jaeschke R, Guyatt GH, Sackett DL. Users’ guides to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? The evidence-based medicine working group. JAMA. 1994;271(9):703–707.
51 Petersen T, Laslett M, Juhl C. Clinical classification in low back pain: best-evidence diagnostic rules based on systematic reviews. BMC Musculoskelet Disord. 2017;18(1):188.
52 Chelala L, Trent G, Waldrop G, Dagher AP, Reinig JW. Positive predictive values of lumbar spine magnetic resonance imaging findings for provocative discography. J Comput Assist Tomogr. 2019;43(4):568–571.
53 Eskander JP, Ripoll JG, Calixto F, et al. Value of examination under fluoroscopy for the assessment of sacroiliac joint dysfunction. Pain Physician. 2015;18(5):E781–E786.
54 Kang CH, Kim YH, Lee SH, et al. Can magnetic resonance imaging accurately predict concordant pain provocation during provocative disc injection? Skeletal Radiol. 2009;38(9):877–885.
55 Waldrop G, Trent G, Dagher AP, Reinig J, Thompson KJ. The association between magnetic resonance imaging disc pathology and provocative discography at the lumbar level. J Comput Assist Tomogr. 2021;45(1):146–150.
56 Akgun E, Akgun MY. The effectiveness of bone scintigraphy in the management of low back pain. Clin Neurol Neurosurg. 2022;222:107440.
57 Carrera GF. Lumbar facet joint injection in low back pain and sciatica: preliminary results. Radiology. 1980;137(3):665–667.
58 Freiermuth D, Kretzschmar M, Bilecen D, et al. Correlation of (99m) Tc-DPD SPECT/CT scan findings and diagnostic blockades of lumbar medial branches in patients with unspecific low back pain in a randomized-controlled trial. Pain Med. 2015;16(10):1916–1922.
59 Gornet MG, Peacock J, Claude J, et al. Magnetic resonance spectroscopy (MRS) can identify painful lumbar discs and may facilitate improved clinical outcomes of lumbar surgeries for discogenic pain. Eur Spine J. 2019;28(4):674–687.
60 Koh WU, Kim SH, Hwang BY, et al. Value of bone scintigraphy and single photon emission computed tomography (SPECT) in lumbar facet disease and prediction of short-term outcome of ultrasound guided medial branch block with bone SPECT. Korean J Pain. 2011;24(2):81–86.
61 Kokkonen SM, Kurunlahti M, Tervonen O, Ilkko E, Vanharanta H. Endplate degeneration observed on magnetic resonance imaging of the lumbar spine: correlation with pain provocation and disc changes observed on computed tomography diskography. Spine (Phila Pa 1976). 2002;27(20):2274–2278.
62 Laslett M, McDonald B, Aprill CN, Tropp H, Oberg B. Clinical predictors of screening lumbar zygapophyseal joint blocks: development of clinical prediction rules. Spine J. 2006;6(4):370–379.
63 Laslett M, Oberg B, Aprill CN, McDonald B. Zygapophysial joint blocks in chronic low back pain: a test of Revel’s model as a screening test. BMC Musculoskelet Disord. 2004;5:43.
64 Manchikanti L, Pampati V, Fellows B, Baha AG. The inability of the clinical picture to characterize pain from facet joints. Pain Physician. 2000;3(2):158–166.
65 Manchikanti L, Pampati V, Fellows B, Bakhit CE. Prevalence of lumbar facet joint pain in chronic low back pain. Pain Physician. 1999;2(3):59–64.
66 Milette PC, Raymond J, Fontaine S. Comparison of high-resolution computed tomography with discography in the evaluation of lumbar disc herniations. Spine (Phila Pa 1976). 1990;15(6):525–533.
67 Ohnmeiss DD, Vanharanta H, Ekholm J. Relationship of pain drawings to invasive tests assessing intervertebral disc pathology. Eur Spine J. 1999;8(2):126–131.
68 Osti OL, Fraser RD. MRI and discography of annular tears and intervertebral disc degeneration. A prospective clinical comparison. J Bone Joint Surg Br. 1992;74(3):431–435.
69 Pfirrmann CW, Metzdorf A, Zanetti M, Hodler J, Boos N. Magnetic resonance classification of lumbar intervertebral disc degeneration. Spine (Phila Pa 1976). 2001;26(17):1873–1878.
70 Revel M, Poiraudeau S, Auleley GR, et al. Capacity of the clinical picture to characterize low back pain relieved by facet joint anesthesia. Proposed criteria to identify patients with painful facet joints. Spine (Phila Pa 1976). 1998;23(18):1972–1976; discussion 1977.
71 Revel ME, Listrat VM, Chevalier XJ, et al. Facet joint block for low back pain: identifying predictors of a good response. Arch Phys Med Rehabil. 1992;73(9):824–828.
72 Simmons JW, Emery SF, McMillin JN, Landa D, Kimmich SJ. Awake discography. A comparison study with magnetic resonance imaging. Spine (Phila Pa 1976). 1991;16(6 Suppl):S216–S221.
73 Vanharanta H, Sachs BL, Spivey M, et al. A comparison of CT/discography, pain response and radiographic disc height. Spine (Phila Pa 1976). 1988;13(3):321–324.
74 Yrjama M, Tervonen O, Kurunlahti M, Vanharanta H. Bony vibration stimulation test combined with magnetic resonance imaging. Can discography be replaced? Spine (Phila Pa 1976). 1997;22(7):808–813.
75 Yrjama M, Tervonen O, Vanharanta H. Ultrasonic imaging of lumbar discs combined with vibration pain provocation compared with discography in the diagnosis of internal anular fissures of the lumbar spine. Spine (Phila Pa 1976). 1996;21(5):571–575.
76 Yrjama M, Vanharanta H. Bony vibration stimulation: a new, non-invasive method for examining intradiscal pain. Eur Spine J. 1994;3(4):233–235.
77 Holder LE, Machin JL, Asdourian PL, Links JM, Sexton CC. Planar and high-resolution SPECT bone imaging in the diagnosis of facet syndrome. J Nucl Med. 1995;36(1):37–44.
78 Cai G, Laslett LL, Aitken D, et al. Effect of zoledronic acid and Denosumab in patients with low back pain and modic change: a proof-of-principle trial. J Bone Miner Res. 2018;33(5):773–782.
79 Peng B, Pang X, Wu Y, Zhao C, Song X. A randomized placebo-controlled trial of intradiscal methylene blue injection for the treatment of chronic discogenic low back pain. Pain. 2010;149(1):124–129.
80 Saueressig T, Owen PJ, Diemer F, Zebisch J, Belavy DL. Diagnostic accuracy of clusters of pain provocation tests for detecting sacroiliac joint pain: systematic review with meta-analysis. J Orthop Sports Phys Ther. 2021;51(9):422–431.
81 da C Menezes Costa L, Maher CG, Hancock MJ, McAuley JH, Herbert RD, Costa LO. The prognosis of acute and persistent low-back pain: a meta-analysis. CMAJ. 2012;184(11):E613–E624.
82 Jacobs JC, Jarvik JG, Chou R, et al. Observational study of the downstream consequences of inappropriate MRI of the lumbar spine. J Gen Intern Med. 2020;35(12):3605–3612.