Evidence and recommendations for the use of segmental motion testing for patients with LBP: a systematic review
Our take
Are hands-on segmental motion tests of the lumbar spine valid and reliable enough to guide clinical decisions in patients with low back pain?
The evidence on lumbar segmental motion tests is poor overall, and no single test can be strongly recommended in isolation. Although specificity is generally high, sensitivity is too low to rule out pathology, and agreement between examiners is mostly below clinically acceptable thresholds.
Systematic review, 13 studies, limited evidence
Key points
- 13 studies covered three test types: PAIVMs, PPIVMs, and the prone instability test (PIT)
- Specificity was generally high for PAIVMs and PPIVMs, but sensitivity was consistently poor, making them weak for ruling out lumbar instability
- Inter-rater reliability for mobility testing was overwhelmingly poor (most kappa values below 0.4) across both PAIVM and PPIVM studies
- Pain provocation as a test outcome showed better reliability than mobility judgement in several studies
- The PIT showed the most consistent reliability, with four of six studies exceeding the clinical relevance threshold (kappa 0.54 to 0.87)
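The kappa values quoted above measure agreement between two examiners after correcting for agreement expected by chance. A minimal sketch of Cohen's kappa follows, assuming two examiners who each rate the same segments as "hypo" or "normal". The function name `cohen_kappa` and the ratings are hypothetical illustrations and are not data from the review.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of each rater's marginal proportions.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of 10 segments (illustration only, not from the review).
examiner_a = ["hypo"] * 3 + ["normal"] * 7
examiner_b = ["hypo", "hypo", "normal", "hypo", "hypo"] + ["normal"] * 5

kappa = cohen_kappa(examiner_a, examiner_b)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the examiners agree on 7 of 10 segments, yet kappa is well below that raw figure because much of the agreement would be expected by chance alone.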
How it was conducted
- Design: Systematic review (PRISMA-DTA)
- Databases: PubMed, LIVIVO, and Cochrane Library (searched September 2019)
- Studies included: 13 primary studies
- Tests evaluated: PAIVMs, PPIVMs, and the prone instability test (PIT)
- Quality appraisal: QUADAS-2 for diagnostic accuracy studies; adapted QAREL for reliability studies
- Meta-analysis: Not conducted due to clinical and statistical heterogeneity across studies
What they found
- PPIVM specificity for detecting lumbar instability was 0.99-1.00; sensitivity was extremely poor at 0.03-0.07; positive likelihood ratios ranged from 4.82 to 26.80 and negative likelihood ratios from 0.93 to 0.98 (Abbott et al. 2005)
- PAIVM specificity for detecting lumbar instability ranged from 0.81 to 0.95; sensitivity ranged from 0.17 to 0.46; positive likelihood ratios were predominantly moderate (2.42-9.00); negative likelihood ratios were poor (0.60-0.88) (Abbott et al. 2005; Fritz et al. 2005)
- Combined PAIVMs and PPIVMs for detecting painful segments: sensitivity 0.94, specificity 1.00, yielding an excellent positive and near-zero negative likelihood ratio when verbal pain response was permitted; mobility judgement alone gave sensitivity 0.53, specificity 0.80 (Phillips and Twomey 1996)
- PAIVM inter-rater reliability for mobility was overwhelmingly poor: kappa values ranged from -0.02 to 0.48 across 10 studies; the exception was Landel et al. (kappa 0.71), where examiners only had to agree on the least mobile segment
- PPIVM inter-rater reliability ranged from kappa -0.11 to 0.32; intra-rater reliability for flexion testing was kappa 0.31 (Qvistgaard et al. 2007)
- PIT inter-rater reliability ranged from kappa 0.27 to 0.87 across six studies; four of six studies exceeded the clinical relevance threshold; studies with current-complaint LBP patients achieved kappa 0.67-0.87, while chronic or recurrent LBP populations yielded kappa 0.27-0.71
- PAIVMs for detecting painful segments showed inter-rater kappa ranging from -0.14 to 0.69 across studies; ICC values for pain intensity at individual lumbar levels ranged from 0.61 to 0.69 (Maher and Adams 1994)
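The likelihood ratios reported above follow from sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below applies those definitions to the two Phillips and Twomey (1996) scenarios as summarised here. The function name `likelihood_ratios` is a hypothetical helper, and the results are approximations from the rounded figures above, not values taken from the paper.

```python
import math


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity, both given as proportions."""
    for value in (sensitivity, specificity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("Sensitivity and specificity must lie between 0 and 1")
    false_positive_rate = 1.0 - specificity
    false_negative_rate = 1.0 - sensitivity
    # With perfect specificity there are no false positives, so LR+ is unbounded
    # (or undefined if sensitivity is also zero).
    if false_positive_rate == 0.0:
        lr_pos = math.inf if sensitivity > 0.0 else math.nan
    else:
        lr_pos = sensitivity / false_positive_rate
    # With zero specificity, LR- is unbounded (or undefined if sensitivity is 1).
    if specificity == 0.0:
        lr_neg = math.inf if false_negative_rate > 0.0 else math.nan
    else:
        lr_neg = false_negative_rate / specificity
    return lr_pos, lr_neg


# Sensitivity and specificity as summarised above (Phillips and Twomey 1996).
scenarios = {
    "mobility judgement alone": (0.53, 0.80),
    "verbal pain response permitted": (0.94, 1.00),
}

for label, (sens, spec) in scenarios.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{label}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

A specificity of exactly 1.00 gives an unbounded LR+, which is why that result is described qualitatively above rather than as a number.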
Limitations
- Abstract screening was conducted by only one reviewer, increasing the risk of missed studies
- Electronic search was restricted to three databases and four languages, potentially excluding relevant literature
- The adapted QAREL domains modelled on QUADAS-2 lack validated evidence for their structure, which may affect risk-of-bias ratings
- A high proportion of the included reliability studies (7 of 12) received an 'unclear risk of bias' rating for rater blinding because they did not report whether examiners were blinded to clinical information
Why it matters
- For patients: Patients should be aware that a clinician's hands-on finding of a stiff or unstable spinal level may not reliably reflect what is actually happening in their spine, and treatment decisions based solely on these tests should be interpreted cautiously.
- For clinicians: Clinicians should avoid making diagnostic or management decisions based on a single segmental motion test in isolation, and should instead integrate findings with other clinical information or use test batteries, particularly the PIT, which showed the most consistent inter-rater agreement.
- For readers: This review highlights a persistent gap between widespread clinical use of lumbar segmental motion tests and the weak psychometric evidence supporting them, calling for better-designed studies and standardised testing protocols.
Source
doi:10.1016/j.msksp.2019.102076