- Systematic review
- Open Access
Reliability and validity of manual palpation for the assessment of patients with low back pain: a systematic and critical review
Chiropractic & Manual Therapies volume 29, Article number: 33 (2021)
Static or motion manual palpation of the low back is commonly used to assess pain location and reproduction in low back pain (LBP) patients. The purpose of this study is to review the reliability and validity of manual palpation used for the assessment of LBP in adults.
We systematically searched five databases from 2000 to 2019. We critically appraised internal validity of studies using QAREL and QUADAS-2 instruments. We stratified results using best-evidence synthesis. Validity studies were classified according to Sackett and Haynes.
We screened 2023 articles for eligibility, of which 14 had a low risk of bias. Evidence suggests that the reliability of soft tissue palpation is inconsistent, and that the reliability of bony structure and joint mobility palpation is poor. We found preliminary evidence that gluteal muscle palpation for tenderness may be valid in differentiating LBP patients with and without radiculopathy.
Reliability of manual palpation tests in the assessment of LBP patients varies greatly. This is problematic because these tests are commonly used by manual therapists and clinicians. Little is known about the validity of these tests; therefore, their clinical utility is uncertain. High quality validity studies are needed to inform the clinical use of manual palpation tests.
Low back pain (LBP) is the most prevalent musculoskeletal condition in the general population [1, 2]. The point prevalence of LBP ranges from 1 to 58.1% and the one-year prevalence ranges from 0.8 to 82.5%, depending on the definition of LBP and the population studied. LBP is the leading cause of years lived with disability and the sixth leading cause of disability-adjusted life years globally [4, 5]. It is associated with poor health-related quality of life and imposes a substantial economic burden on society [6, 7]. Non-specific LBP, which cannot be attributed to a specific underlying pathology, is more common than specific LBP (e.g., cancer, fractures, infectious disorders, or ankylosing spondylitis).
The clinical assessment of low back pain involves completing a physical examination. Manual palpation is a common tool used to assess patients with LBP. It includes static and dynamic palpation of soft tissue or joints and aims to identify painful structures and biomechanical dysfunction of the spine. However, the clinical utility of these tests is controversial.
Previous systematic reviews have investigated the reliability and validity of manual palpation for the assessment of patients with LBP [9, 11,12,13]. According to these reviews, the inter-rater reliability of static joint and soft-tissue palpation to locate pain is poor (kappa (k) ≤ 0.40), and the inter-rater reliability of static palpation for soft tissue changes (e.g., tension) is inconsistent [9, 11, 13]. Furthermore, one review reported that motion palpation may be valid in detecting decreased motion, or lack of end-play, in the lumbar spine. However, motion palpation may not be valid to detect aberrant motion of the sacroiliac joints. These reviews are outdated and there is a need for an up-to-date systematic review. The purpose of our systematic review was to determine the reliability and validity of manual palpation used to assess adult patients with LBP.
We included studies of adults (≥18 years) with LBP. LBP refers to pain or discomfort below the costal margin and above the inferior gluteal folds and can be with or without referred leg pain. Our systematic review includes patients with non-radicular low back pain, radicular low back pain, spinal stenosis, degenerative or isthmic spondylolisthesis, and failed back surgery syndrome.
Our review focuses on studies assessing the reliability or validity of manual palpation for the assessment of patients with LBP. Reliability describes the consistency of measurements across people or instruments. Validity is the degree to which a test measures what it is intended to measure.
Manual palpation is a diagnostic procedure in which the examiner feels with their hands to assess the mobility and state of the soft and bony tissues. Palpation techniques include both static and dynamic (motion) methods, which are often used to identify areas of tissue pain and dysfunction, to target manual and manipulative therapies, and to determine the effectiveness of the intervention. Static palpation is used to identify asymmetry of bony landmarks, tender points, and trigger points, and to evaluate tissue texture, temperature and tone. Motion palpation is used to assess the quantity and quality of movement through the lumbar spine and pelvis. Motion palpation assessment can be continuous within the normal range of motion with joint play, or dynamic soft tissue palpation, or end range assessment for end-feel or joint springing. Palpation involving devices such as pressure algometry was excluded.
We aimed to evaluate clinical outcomes assessed by palpation. Outcomes included pain, segmental mobility and stiffness for static joint palpation; joint movement and position for motion joint palpation; and pain, tenderness, trigger points and muscle contraction for static soft tissue palpation.
Eligible studies met the following inclusion criteria: 1) English or French language; 2) published in peer reviewed journals between January 1, 2000 and July 11, 2019; 3) assessing the reliability or validity of manual palpation. Previously published systematic reviews on this topic were considered in our review: comparing our systematic review with these previous reviews allowed us to examine the findings of studies published before 2000. We excluded: 1) letters, guidelines, editorials, commentaries, unpublished manuscripts, dissertations, reports, book chapters, conference proceedings and abstracts, lectures, addresses, and consensus statements; 2) cadaveric and animal studies; 3) literature reviews and case studies; 4) studies targeting individuals with serious pathology (e.g., fractures, dislocations, systemic disease, myelopathy, neoplasm and infection); and 5) studies with sample size < 20 per group.
Search strategy and data sources
The search strategy was developed in consultation with a health sciences librarian, and a second librarian was consulted to ensure accuracy and completeness using the Peer Review of Electronic Search Strategies (PRESS) checklist. We systematically searched the following electronic databases: MEDLINE, CINAHL, PubMed, Cochrane Central Register of Controlled Trials, and SPORTDiscus. Search terms consisted of subject headings specific to each database (e.g., MeSH in MEDLINE) and free text words relevant to LBP, diagnosis, reliability, validity, and palpation (Additional file 1).
Identified citations were exported into EndNote for reference management and tracking of the screening process. We screened articles in two stages. In stage one, titles and abstracts were screened for their relevance by pairs of independent reviewers (NL, PN, ALM). Stage two involved screening the full text article of all possibly relevant citations from stage one. Disagreements on screening stages were discussed between reviewers to reach consensus. When consensus could not be reached, a third reviewer independently screened the citation and discussed with the two reviewers to reach consensus.
Assessment of risk of bias
Three reviewers (NL, PN, ALM) critically appraised all relevant studies (Tables 1 and 2) using the modified Quality Appraisal Tool for Studies of Diagnostic Reliability (QAREL) criteria to assess the internal validity of the diagnostic reliability studies and the modified Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) criteria to assess diagnostic accuracy/validity studies (Additional files 2 and 3). The original QAREL and QUADAS-2 instruments were modified to include: 1) not applicable options; 2) a question regarding the clarity of the study objective; and 3) the Sackett and Haynes classification (phases of validity studies in the QUADAS-2 instrument). If a study was judged as “low” on all domains relating to bias or applicability, then it was appropriate to have an overall judgment of “low risk of bias” or “low concern regarding applicability” for that study. If a study was judged “high” or “unclear” on one or more domains, then it may be judged “at risk of bias” or as having “concerns regarding applicability” [33, 34]. We included low risk of bias studies in our best evidence synthesis.
Validity studies with low risk of bias were classified into one of four phases of investigation following the recommendation of Sackett and Haynes. The purpose of phase I studies is to determine whether test results differ between LBP patients and healthy controls. This information is useful to justify phase II studies. Phase II studies aim to determine whether patients with a positive palpation result are more likely to have decreased function, severe disability or structural changes (e.g., spinal stenosis) than patients with a negative result. On their own, results from phase I and II studies cannot be used to confirm the validity of a test. However, according to the Sackett and Haynes classification, they provide preliminary evidence that a test should be investigated further in phase III studies. Phase III studies aim to determine whether a test result can distinguish between LBP patients with and without a suspected condition (e.g., radiculopathy). Finally, phase IV studies aim to determine whether patients who undergo a manual palpation test have a better prognosis than similar patients who were not tested. Phase IV studies differ from phase I–III studies, which examine diagnostic accuracy. The risk of bias of a phase IV study would be assessed using the Scottish Intercollegiate Guidelines Network (SIGN) criteria.
Data extraction and synthesis of results
One reviewer (PN) extracted data from low risk of bias studies and built evidence tables (Tables 3 and 4), and a second reviewer (NL or HY) verified the accuracy and completeness of the data extraction. The reliability and validity studies were stratified according to targeted body structures (joint or soft tissue), technique (static or motion palpation), and clinical outcome (pain provocation, mobility, or stiffness). We used qualitative synthesis to synthesize the best evidence. Eligible statistics include 1) means, median and/or percent in phase I studies; 2) correlations, sensitivity, specificity, positive predictive value, negative predictive value and/or likelihood ratio in phase II or III studies; and 3) prevalence in phase III studies.
No arbitrary classification was used to report the strength of reliability or validity findings. Such classifications use arbitrary cut-points that do not take into account the level of misclassification that is acceptable in a specific context. Rather, values of kappa coefficients, sensitivity, specificity, etc. were reported. The authors interpreted the kappa and measurement errors according to the clinical setting and purpose of each palpation test. Kappa scores of < 0.6 are considered to indicate no, minimal or weak agreement, and kappa scores of > 0.6 are considered to indicate moderate, strong or almost perfect agreement. These thresholds should be used only as a rough guide alongside that context-specific interpretation.
We computed kappa coefficients (k) and 95% confidence intervals (CI) to determine the inter-rater reliability of our screening methodology of articles. We computed the percentage agreement between reviewers for the classification of articles into high or low risk of bias.
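As an illustration of the two agreement statistics named above, the following is a minimal sketch of Cohen's kappa with an approximate 95% CI, plus simple percentage agreement. It is not the authors' analysis code. The function names and the 2 × 2 screening table are hypothetical, and the confidence interval uses a common large-sample approximation for the standard error, which is an assumption rather than the method reported in the review.

```python
import math


def cohen_kappa(table):
    """Return Cohen's kappa and an approximate 95% CI.

    table[i][j] is the number of items that reviewer A placed in
    category i and reviewer B placed in category j (square table).
    """
    n = sum(sum(row) for row in table)
    size = len(table)
    # Observed agreement: proportion of items on the diagonal.
    p_observed = sum(table[i][i] for i in range(size)) / n
    # Expected chance agreement from the marginal proportions.
    row_props = [sum(table[i]) / n for i in range(size)]
    col_props = [sum(table[i][j] for i in range(size)) / n for j in range(size)]
    p_expected = sum(r * c for r, c in zip(row_props, col_props))
    kappa = (p_observed - p_expected) / (1 - p_expected)
    # Large-sample approximation of the standard error (an assumption here).
    se = math.sqrt(p_observed * (1 - p_observed) / (n * (1 - p_expected) ** 2))
    return kappa, (kappa - 1.96 * se, kappa + 1.96 * se)


def percent_agreement(agreements, total):
    """Percentage of items on which both reviewers agreed."""
    return 100.0 * agreements / total


# Hypothetical include/exclude decisions by two reviewers (not study data).
hypothetical_screening = [[20, 5],
                          [10, 65]]
kappa, (ci_low, ci_high) = cohen_kappa(hypothetical_screening)
print(f"kappa = {kappa:.3f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
print(f"agreement = {percent_agreement(85, 100):.0f}%")
```

In this hypothetical table the raw agreement is 85% while kappa is noticeably lower, which is why chance-corrected agreement is usually reported alongside percentage agreement.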
This review complies with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Additional file 4). The Standards for Reporting of Diagnostic Accuracy (STARD) statement was used to inform the critical appraisal with the QAREL and QUADAS-2.
We identified 2307 citations (plus 3 citations from other sources), removed 287 duplicates, and reviewed 2023 articles for eligibility (Fig. 1). In stage 1 screening, 1976 citations were ineligible. Forty-seven papers were reviewed in stage 2, and 31 were excluded: ineligible study population (n = 11) [41,42,43,44,45,46,47,48,49,50,51], inappropriate outcome measure (n = 6) [52,53,54,55,56,57], ineligible publication type (n = 4) [58,59,60,61], ineligible sample size (n = 3) [62,63,64], study design (n = 3) [65,66,67] and did not investigate manual palpation (n = 4) [68,69,70,71]. Two authors were contacted regarding publication type and age range, and both responded [27, 59].
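The screening counts reported above can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in this paragraph; the variable names are illustrative and are not taken from the review.

```python
# Counts as reported in the screening paragraph.
database_citations = 2307
other_sources = 3
duplicates_removed = 287

screened = database_citations + other_sources - duplicates_removed
ineligible_stage1 = 1976
full_text_reviewed = screened - ineligible_stage1

# Reasons for exclusion at stage 2 (full-text screening).
stage2_exclusions = {
    "ineligible study population": 11,
    "inappropriate outcome measure": 6,
    "ineligible publication type": 4,
    "ineligible sample size": 3,
    "study design": 3,
    "did not investigate manual palpation": 4,
}
excluded_stage2 = sum(stage2_exclusions.values())
critically_appraised = full_text_reviewed - excluded_stage2

print(screened, full_text_reviewed, excluded_stage2, critically_appraised)
```

The totals agree with the 2023 articles screened, 47 full texts reviewed, 31 exclusions, and 16 articles critically appraised.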
We critically appraised 16 articles, of which 14 had low risk of bias and were included in our evidence synthesis [19,20,21,22,23,24,25,26,27,28,29,30,31,32] (Fig. 1). These 14 articles reported 17 studies, because three articles included both a reliability and a validity study. The inter-rater agreement for screening of articles was Kappa = 0.86 (95% CI 0.73–0.98). The percentage agreement for the admissibility of studies was 100% (17 agreements/17 studies over the 16 articles appraised).
Fourteen articles had a low risk of bias [19,20,21,22,23,24,25,26,27,28,29,30,31,32]. Of those, 11 reported on the reliability of palpation tests [19,20,21,22,23,24,25,26,27,28,29] and six reported on validity [22, 28,29,30,31,32]. Three articles examined both reliability and validity [22, 28, 29].
The eleven reliability studies with low risk of bias examined inter-rater reliability of manual palpation to assess joint mobility or motion [19,20,21, 23, 26, 27], pain [19, 21, 23,24,25,26, 28, 29] and muscle contraction. Two of the eleven studies also examined intra-rater reliability of manual palpation assessing joint motion and muscle tenderness. The six validity studies included one phase I study on palpation of joints and muscles to assess pain, four phase II studies on palpation of nerves to elicit pain, spinal stiffness, muscle contraction and sacroiliac joint motion, and one phase III study on palpation of gluteal muscle for tenderness and pain.
The 14 low risk of bias articles investigated: 1) static joint palpation (n = 7) [19, 21, 23, 25, 26, 29, 31], 2) motion joint palpation (n = 3) [20, 27, 32], and 3) static soft tissue palpation (n = 5) [22, 24, 28,29,30] (Tables 3 and 4). They assessed various techniques: 1) joint pain provocation [19, 21, 23, 26, 29], 2) pain or tenderness of muscles [24, 29, 30], 3) pain and tenderness of nerves, 4) joint stiffness/mobility [19, 21, 23, 25, 26, 31], 5) joint motion [20, 27, 32], and 6) isometric muscle contraction. Table 5 provides a glossary of definitions for all of the palpation tests included in the articles.
The duration of LBP varied across studies: < 7 weeks (1/14 articles), > 4 weeks (1/14 articles), ≥ l months (1/14 articles), new episode to > 3 months (1/14 articles) and unspecified duration (10/14 articles) [20, 22, 23, 25,26,27,28, 30,31,32]. The studies were conducted in Australia, Canada, Denmark, Iran [20, 32], Ireland, and the United States [19, 22, 23, 25,26,27, 29, 31] between 2003 and 2017.
We did not perform a meta-analysis because of the heterogeneity of studies in symptom duration, palpation technique, and outcome specification.
Assessment of risk of bias
The low risk of bias studies met the following criteria: 1) clearly described objective; 2) representative sample; 3) representative raters; 4) blinding of the test results between raters; 5) appropriate and valid standard test; and 6) appropriate statistical analysis (Tables 1 and 2). However, these studies had the following limitations: 1) unclear time interval between tests (n = 1); 2) no blinding for intra-examiner reliability (n = 2) [20, 24]; 3) a 30 min rest period between repeated tests by the same examiner and no blinding to clinical information (n = 2) [27, 28]; 4) unclear blinding to clinical information or additional clues; 5) no blinding to clinical information and unclear blinding to additional clues (n = 8) [21, 22, 23, 24, 25, 27, 28, 29]; and 6) non-random or unclear administration of tests (n = 5) [19, 23, 27,28,29]. Most validity studies had appropriate exclusion criteria and blinding. However, the validity studies had limitations: 1) four studies did not use a consecutive or random sample [22, 29, 31, 32]; 2) two studies were unclear as to whether an appropriate time interval between tests was used [28, 31]; 3) one study was unclear as to whether an appropriate reference standard (slump test and straight leg raise) was used; 4) in one study the examiner was not blinded to the results of the index or reference test; and 5) in one study it was unclear whether all patients were included in the analysis.
Two validity studies were excluded after critical appraisal. Abbott et al. used flexion/extension radiographs as a reference standard without establishing the test-retest reliability of patient positioning when taking the radiographs. Telli et al. did not use blinding in their reliability study.
Summary of evidence
Reliability of joint and bony structure palpation
Four studies investigated static palpation to elicit pain. Overall, these studies suggest that important measurement error is associated with eliciting pain from: 1) lumbar facet joints (inter-rater reliability 0.38 ≤ k ≤ 0.73); 2) lumbar spinous processes (inter-rater reliability 0.21 ≤ k ≤ 0.57); and 3) sacroiliac (SI) joints (inter-rater reliability 0.14 ≤ k ≤ 0.59) [19, 23, 26, 29] (Table 3). Similarly, the evidence suggests that static palpation used to identify joint segmental mobility has low inter-rater reliability (lumbar facet joints: −0.17 ≤ k ≤ 0.17; lumbar spinous processes: −0.02 ≤ k ≤ 0.26; SI joints: −0.11 ≤ k ≤ −0.10) [19, 23, 26]. The inter-rater reliability of the prone instability test for pain was reported as k = 0.30, 0.41 and 0.54 in the relaxation phase of the test and k = 0.46, 0.71 and 0.87 in the contraction phase of the test. A study that combined the two phases of the test into a single positive or negative finding reported a kappa of 0.10 (Table 3). Furthermore, a study by Downey et al. (2003) reported low inter-rater reliability of static joint palpation to locate the spinal level (0.23 ≤ k ≤ 0.54) and name the spinal level (−0.13 ≤ k ≤ 0.41) in patients with LBP symptoms (Table 3).
We found inconsistent evidence in support of the reliability of motion palpation of the lumbar spine and SI joints to assess joint motion [20, 27]. The reliability of motion palpation of the sacroiliac joint varied (inter-rater reliability 0.14 ≤ k ≤ 0.75 and intra-rater reliability 0.23 ≤ k ≤ 0.73) (Table 3) [20, 27]. Tong et al. (2006) suggested that sacral position cannot be reliably assessed during trunk motion using the sacral base position test (inter-rater reliability: flexion k = 0.37, extension k = 0.05).
Reliability of soft tissue palpation
We found varying levels of reliability for the palpation of the soft tissue structures associated with low back pain [22, 24, 28, 29]. The inter-rater reliability ranged from k = 0.80 for sciatic nerve pain, to 0.51 ≤ k ≤ 0.68 for gluteal tender points and k = 0.34 for lumbar paraspinal muscle pain [24, 28, 29]. One study suggested that abnormal isometric contraction of the multifidus muscle can be reliably assessed by palpating lateral and adjacent to the interspinous space of L4-L5 and L5-S1 during contralateral arm raising, both with and without hand weights (inter-rater reliability 0.75 ≤ k ≤ 0.81). However, it is possible that the multifidus lift test also palpates a more superficial muscle, which raises questions about the validity of this test.
Validity of joint and bony structure palpation
Two studies investigated the validity of static joint palpation [29, 31]. One phase I study found that pain elicited by palpation of the SI joints and lumbar spinous processes was more common in LBP patients compared to healthy controls. One phase II study reported that posterior to anterior palpation used to identify stiffness from L1-L5 had a sensitivity of 38% (95% CI 21–59%), a specificity of 45% (95% CI 28–62%), a positive likelihood ratio of 0.69 (95% CI 0.37–1.31) and a negative likelihood ratio of 1.38 (95% CI 0.82–2.33) when compared to a mechanized indentation device (Table 4).
One phase II study investigated the validity of joint motion palpation tests for the sacroiliac joints. It examined the relationship between sacroiliac tests for joint motion (Gillet test, sitting flexion test and standing flexion test) and sacroiliac pain provocation tests (Faber test, thigh thrust test and resisted abduction test) but did not report validity statistics (Table 4).
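The likelihood ratios reported for posterior to anterior palpation follow directly from the reported sensitivity and specificity. A minimal sketch of that calculation, using the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity, is shown here; the function name is illustrative and not from the review.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test, with inputs given as proportions."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Reported values for posterior to anterior palpation for stiffness (L1-L5).
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.38, specificity=0.45)
print(round(lr_pos, 2), round(lr_neg, 2))  # 0.69 1.38
```

A positive likelihood ratio below 1 together with a negative likelihood ratio above 1 means that, in this sample, a positive palpation finding did not make stiffness on the reference device more likely.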
Validity of soft tissue palpation
Four studies investigated the validity of static soft tissue palpation [22, 28,29,30]. One phase I study found that pain elicited by palpation of the lumbar paraspinal and piriformis muscles was more common in LBP patients than in people without LBP. A phase II study tested the validity of the multifidus lift test, with and without hand weights, to identify abnormal isometric multifidus muscle contraction when compared to measurement of lumbar multifidus muscle thickness with real-time ultrasound imaging (Table 4). The authors reported that the multifidus lift test correlated with ultrasound findings at the L4–5 level (r biserial correlation coefficients of 0.59 and 0.73 across the two hand-weight conditions) and was weakly associated at the L5-S1 level (r biserial correlation coefficients of 0.17 and 0.47) (Table 4). Another phase II study investigated the validity of sciatic nerve palpation for pain between the ischial tuberosity and the greater trochanter, using the straight leg raise and slump test as the reference standard to evaluate mechanosensitivity of the sciatic nerve. The authors found that sciatic nerve palpation had a sensitivity of 85% (95% CI, 75–95%) and a specificity of 60% (95% CI, 46–74%). Finally, one phase III study investigated the validity of static palpation of gluteal muscle for taut band, tenderness and pain recognition compared to an expert panel confirmation of radicular LBP (informed by MRI and electro-diagnostic testing). The authors reported that static palpation of the gluteal muscle had a sensitivity of 74.1% (95% CI, 67.7–80.3%) and a specificity of 91.4% (95% CI, 86.8–96.0%) in identifying radicular pain.
Summary of results
We reviewed the reliability and validity of manual palpation used to assess patients with LBP. We retrieved eleven studies on the reliability of static and motion palpation of joint and soft tissue. Overall, the evidence suggests that static joint palpation is not reliable in identifying pain and segmental mobility of the lumbar facet joints, lumbar spinous processes and SI joints, or in locating the spinal level contributing to LBP symptoms. However, static soft tissue palpation may help reliably identify gluteal tender points, sciatic nerve pain, and multifidus contraction, but not lumbar paraspinal muscle pain. We identified six validity studies for the assessment of LBP using static joint, joint motion and soft tissue palpation. Gluteal muscle palpation for pain was able to help differentiate LBP patients with and without radiculopathy (phase III study). We found preliminary evidence for the validity of piriformis and lumbar paraspinal muscle palpation for pain (phase I study), spinous process and sacroiliac joint palpation for pain (phase I study), sciatic nerve palpation for pain to identify mechanosensitivity of the sciatic nerve as determined by the straight leg raise and slump test (phase II study), and the multifidus lift test to help identify abnormal isometric contraction (phase II study). We found preliminary evidence against posterior to anterior palpation used to identify stiffness at the L1-L5 spinal levels (phase II study). Sacroiliac joint motion tests were not associated with sacroiliac pain provocation tests (phase II study). Overall, very little knowledge is available to support the usefulness of lumbar and sacroiliac palpation tests when examining patients with low back pain.
Comparison with previous systematic reviews
The results of our systematic review differ from previous systematic reviews [9, 11, 13]. Our finding that static joint palpation of the spinous processes, facet and sacroiliac joints is not reliable to identify pain disagrees with previous systematic reviews [9, 11, 13]. Three reviews reported that the reliability of static joint palpation for pain was acceptable, but the kappa threshold used to reach this conclusion is low (k ≥ 0.4) [9, 11, 13]. Our review also disagrees with the previous finding by Stochkendahl et al. that static soft tissue palpation may help reliably identify soft tissue pain (k ≤ 0.4). With the inclusion of three recent studies, our review found inconsistent reliability to identify soft tissue pain [22, 24, 28]. The different conclusions may be due to different search strategies, new evidence, inclusion of small sample studies, use of self-developed checklists, or use of predefined cut-off points to differentiate low and high quality studies in the four systematic reviews. However, our results are consistent with a systematic review published in 2020 focusing only on segmental motion palpation. That review reported poor evidence regarding the reliability and validity of segmental motion testing and concluded that the clinical use of stand-alone tests cannot be recommended.
Strengths and limitations
Our systematic review has several strengths. First, our comprehensive search strategy of multiple databases was developed by a health sciences librarian in consultation with content experts and was then reviewed by an independent health sciences librarian using the PRESS checklist. Second, we used detailed, predefined inclusion and exclusion criteria to capture a broad range of possibly relevant citations. Third, we used paired independent reviewers to screen and critically appraise citations to minimize bias and error. The critical appraisal was completed by trained reviewers using standardized quality assessment tools (QAREL/QUADAS-2). Fourth, bias in reported results was minimized by performing a best-evidence synthesis that included only high-quality studies. Finally, we only included studies that tested subjects with LBP. This makes our results more generalizable to the patients seen by practitioners in clinical practice.
Our review also had limitations. First, our search was limited to studies published in English and French languages. It is possible that relevant studies in other languages may have been excluded. Second, our search may not have retrieved all relevant studies, although our search strategy was comprehensive and the search was conducted in multiple major medical databases. Third, our search was limited to studies published after 2000. Fourth, it is possible that individual differences in scientific judgment could have resulted in varied critical appraisal outcomes among reviewers. This bias was minimized using training with the standardized assessment tools and a consensus process for determining internal validity of studies. Finally, studies examining motion palpation tests had smaller sample sizes (validity studies n = 50; reliability studies n = 49) than studies of static joint or muscle palpation. This may have limited the precision of the results and led to uncertainty in our assessment of motion palpation tests.
Our review found very little evidence for the use of manual palpation to assess low back pain patients. Manual palpation tests suffered from misclassification error in that they were unable to differentiate those with LBP from those without LBP. Soft tissue palpation of the sciatic nerve and gluteal muscles for pain, and of the multifidus muscle for isometric contraction, was reliable but has not been tested sufficiently for validity in clinical practice. We did, however, find that gluteal muscle palpation of trigger points and taut bands may be valid for differentiating LBP patients with and without radiculopathy in a clinical setting. We found very limited evidence to support the use of joint palpation, and clinicians should reconsider its diagnostic value when assessing patients with low back pain.
We synthesized the evidence on the reliability and validity of manual palpation to assess adults with LBP. The evidence does not support the reliability of joint palpation, but static soft tissue palpation is reliable. There is little evidence on motion joint palpation used in LBP patients. Gluteal muscle palpation for pain was able to differentiate LBP patients with and without radiculopathy (phase III study). We found preliminary evidence from phase I and II validity studies for some palpation tests. High quality phase III and IV validity studies are required to understand the diagnostic value of manual palpation tests in the assessment of adults with LBP. Clinicians must reconsider the usefulness of these tests when examining patients.
References
Vos T, Allen C, Arora M, Barber RM, Bhutta ZA, Brown A, et al. Global, regional, and national incidence, prevalence, and years lived with disability for 310 diseases and injuries, 1990–2015: a systematic analysis for the global burden of disease study 2015. Lancet. 2016;388(10053):1545–602. https://doi.org/10.1016/S0140-6736(16)31678-6.
Cassidy JD, Côté P, Carroll LJ, Kristman V. Incidence and course of low back pain episodes in the general population. Spine. 2005;30(24):2817–23. https://doi.org/10.1097/01.brs.0000190448.69091.53.
Hoy D, Brooks P, Blyth F, Buchbinder R. The epidemiology of low back pain. Best Pract Res Clin Rheumatol. 2010;24(6):769–81. https://doi.org/10.1016/j.berh.2010.10.002.
Vos T, Flaxman AD, Naghavi M, Lozano R, Michaud C, Ezzati M, et al. Years lived with disability (YLDs) for 1160 sequelae of 289 diseases and injuries 1990–2010: a systematic analysis for the global burden of disease study 2010. Lancet. 2012;380(9859):2163–96. https://doi.org/10.1016/S0140-6736(12)61729-2.
Murray CJ, Vos T, Lozano R, Naghavi M, Flaxman AD, Michaud C, et al. Disability-adjusted life years (DALYs) for 291 diseases and injuries in 21 regions, 1990–2010: a systematic analysis for the global burden of disease study 2010. Lancet. 2012;380(9859):2197–223. https://doi.org/10.1016/S0140-6736(12)61689-4.
Nolet PS, Kristman VL, Côté P, Carroll LJ, Cassidy JD. Is low back pain associated with worse health-related quality of life 6 months later? Eur Spine J. 2015;24(3):458–66. https://doi.org/10.1007/s00586-014-3649-4.
Hoy D, March L, Brooks P, Blyth F, Woolf A, Bain C, et al. The global burden of low back pain: estimates from the global burden of disease 2010 study. Ann Rheum Dis. 2014;73(6):968–74. https://doi.org/10.1136/annrheumdis-2013-204428.
Savigny P, Watson P, Underwood M. Early management of persistent non-specific low back pain: summary of NICE guidance. BMJ. 2009;338(jun04 3):b1805. https://doi.org/10.1136/bmj.b1805.
Seffinger MA, Najm WI, Mishra SI, Adams A, Dickerson VM, Murphy LS, et al. Reliability of spinal palpation for diagnosis of back and neck pain: a systematic review of the literature. Spine. 2004;29(19):E413–25. https://doi.org/10.1097/01.brs.0000141178.98157.8e.
Najm WI, Seffinger MA, Mishra SI, Dickerson VM, Adams A, Reinsch S, et al. Content validity of manual spinal palpatory exams-a systematic review. BMC Complement Altern Med. 2003;3(1):1–4. https://doi.org/10.1186/1472-6882-3-1.
Stochkendahl MJ, Christensen HW, Hartvigsen J, Vach W, Haas M, Hestbaek L, et al. Manual examination of the spine: a systematic critical literature review of reproducibility. J Manip Physiol Ther. 2006;29(6):475–85. https://doi.org/10.1016/j.jmpt.2006.06.011.
Hestbaek L, Leboeuf-Yde C. Are chiropractic tests for the lumbo-pelvic spine reliable and valid? A systematic critical literature review. J Manip Physiol Ther. 2000;23(4):258–75. https://doi.org/10.1067/mmt.2000.106097.
Haneline MT, Young M. A review of intraexaminer and interexaminer reliability of static spinal palpation: a literature synthesis. J Manip Physiol Ther. 2009;32(5):379–86. https://doi.org/10.1016/j.jmpt.2009.04.010.
Duthey B. Background paper 6.24: low back pain. In: World Health Organization (WHO), editor. Priority medicines for Europe and the world: ‘a public health approach to innovation’. Geneva: WHO; 2012.
Fletcher RH, Fletcher SW, Fletcher GS. Clinical Epidemiology: The essentials. 5th ed. Philadelphia, Pennsylvania: Lippincott Williams & Wilkins; 2012.
Jonas WB. Mosby’s dictionary of complementary and alternative medicine. St. Louis (Mo): Mosby, Elsevier; 2005.
Bergmann TF, Peterson DH. Chiropractic Technique: Principles and Procedures. 3rd ed. St. Louis (Mo): Mosby; 2002.
McGowan J, Sampson M, Salzwedel DM, Cogo E, Foerster V, Lefebvre C. PRESS peer review of electronic search strategies: 2015 guideline statement. J Clin Epidemiol. 2016;75:40–6. https://doi.org/10.1016/j.jclinepi.2016.01.021.
Alyazedi FM, Lohman EB, Wesley Swen R, Bahjri K. The inter-rater reliability of clinical tests that best predict the subclassification of lumbar segmental instability: structural, functional and combined instability. J Man Manip Ther. 2015;23(4):197–204. https://doi.org/10.1179/2042618615Y.0000000002.
Arab AM, Abdollahi I, Joghataei MT, Golafshani Z, Kazemnejad A. Inter- and intra-examiner reliability of single and composites of selected motion palpation and pain provocation tests for sacroiliac joint. Man Ther. 2009;14(2):213–21. https://doi.org/10.1016/j.math.2008.02.004.
Downey B, Taylor N, Niere K. Can manipulative physiotherapists agree on which lumbar level to treat based on palpation? Physiotherapy. 2003;89(2):74–81. https://doi.org/10.1016/S0031-9406(05)60578-0.
Hebert JJ, Koppenhaver SL, Teyhen DS, Walker BF, Fritz JM. The evaluation of lumbar multifidus muscle function via palpation: reliability and validity of a new clinical test. Spine J. 2015;15(6):1196–202. https://doi.org/10.1016/j.spinee.2013.08.056.
Hicks GE, Fritz JM, Delitto A, Mishock J. Interrater reliability of clinical examination measures for identification of lumbar segmental instability. Arch Phys Med Rehabil. 2003;84(12):1858–64. https://doi.org/10.1016/S0003-9993(03)00365-4.
Jensen OK, Callesen J, Nielsen MG, Ellingsen T. Reproducibility of tender point examination in chronic low back pain patients as measured by intrarater and inter-rater reliability and agreement: a validation study. BMJ Open. 2013;3(2):e002532.
Ravenna MM, Hoffman SL, Van Dillen LR. Low interrater reliability of examiners performing the prone instability test: a clinical test for lumbar shear instability. Arch Phys Med Rehabil. 2011;92(6):913–9. https://doi.org/10.1016/j.apmr.2010.12.042.
Schneider M, Erhard R, Brach J, Tellin W, Imbarlina F, Delitto A. Spinal palpation for lumbar segmental mobility and pain provocation: an interexaminer reliability study. J Manip Physiol Ther. 2008;31(6):465–73. https://doi.org/10.1016/j.jmpt.2008.06.004.
Tong HC, Heyman OG, Lado DA, Isser MM. Interexaminer reliability of three methods of combining test results to determine side of sacral restriction, sacral base position, and innominate bone position. J Am Osteopath Assoc. 2006;106(8):464–8.
Walsh J, Hall T. Reliability, validity and diagnostic accuracy of palpation of the sciatic, tibial and common peroneal nerves in the examination of low back related leg pain. Man Ther. 2009;14(6):623–9. https://doi.org/10.1016/j.math.2008.12.007.
Weiner DK, Sakamoto S, Perera S, Breuer P. Chronic low back pain in older adults: prevalence, reliability, and validity of physical examination findings. J Am Geriatr Soc. 2006;54(1):11–20. https://doi.org/10.1111/j.1532-5415.2005.00534.x.
Adelmanesh F, Jalali A, Shirvani A, Pakmanesh K, Pourafkari M, Raissi GR, et al. The diagnostic accuracy of gluteal trigger points to differentiate radicular from nonradicular low back pain. Clin J Pain. 2016;32(8):666–72. https://doi.org/10.1097/AJP.0000000000000311.
Koppenhaver SL, Hebert JJ, Kawchuk GN, Childs JD, Teyhen DS, Croy T, et al. Criterion validity of manual assessment of spinal stiffness. Man Ther. 2014;19(6):589–94. https://doi.org/10.1016/j.math.2014.06.001.
Soleimanifar M, Karimi N, Arab AM. Association between composites of selected motion palpation and pain provocation tests for sacroiliac joint disorders. J Bodyw Mov Ther. 2017;21(2):240–5. https://doi.org/10.1016/j.jbmt.2016.06.003.
Lucas NP, Macaskill P, Irwig L, Bogduk N. The development of a quality appraisal tool for studies of diagnostic reliability (QAREL). J Clin Epidemiol. 2010;63(8):854–61. https://doi.org/10.1016/j.jclinepi.2009.10.002.
Whiting PF, Rutjes AW, Westwood ME, Mallet S, Deeks JJ, Reitsma JB, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155(8):529–36. https://doi.org/10.7326/0003-4819-155-8-201110180-00009.
Sackett DL, Haynes RB. The architecture of diagnostic research. BMJ. 2002;324(7336):539–41. https://doi.org/10.1136/bmj.324.7336.539.
Harbour R, Miller J. A new system for grading recommendations in evidence based guidelines. BMJ. 2001;323(7308):334–6. https://doi.org/10.1136/bmj.323.7308.334.
Slavin RE. Best evidence synthesis: an intelligent alternative to meta-analysis. J Clin Epidemiol. 1995;48(1):9–18. https://doi.org/10.1016/0895-4356(94)00097-A.
McHugh ML. Interrater reliability: the kappa statistic. Biochemia Medica. 2012;22(3):276–82.
Moher D, Liberati A, Tetzlaff J, Altman DG, Prisma Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6(7):e1000097.
Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. Toward complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. Am J Clin Pathol. 2003;119(1):18–22. https://doi.org/10.1309/8EXCCM6YR1THUBAF.
Hsieh CY, Hong CZ, Adams AH, Platt KJ, Danielson CD, Hoehler FK, et al. Interexaminer reliability of the palpation of trigger points in the trunk and lower limb muscles. Arch Phys Med Rehabil. 2000;81(3):258–64. https://doi.org/10.1016/S0003-9993(00)90068-6.
Leboeuf-Yde C, van Dijk J, Franz C, Hustad SA, Olsen D, Pihl T, et al. Motion palpation findings and self-reported low back pain in a population-based study sample. J Manip Physiol Ther. 2002;25(2):80–7. https://doi.org/10.1067/mmt.2002.122330.
Collaer JW, McKeough DM, Boissonnault WG. Lumbar isthmic spondylolisthesis detection with palpation: interrater reliability and concurrent criterion-related validity. J Man Manip Ther. 2006;14(1):22–9. https://doi.org/10.1179/106698106790820917.
Chakraverty R, Pynsent P, Isaacs K. Which spinal levels are identified by palpation of the iliac crests and the posterior superior iliac spines? J Anat. 2007;210(2):232–6. https://doi.org/10.1111/j.1469-7580.2006.00686.x.
Holmgren U, Waling K. Inter-examiner reliability of four static palpation tests used for assessing pelvic dysfunction. Man Ther. 2008;13(1):50–6. https://doi.org/10.1016/j.math.2006.09.009.
Liu Y, Palmer JL. Iliacus tender points in young adults: a pilot study. J Am Osteopath Assoc. 2012;112(5):285–9.
Monnier A, Heuer J, Norman K, Äng BO. Inter- and intra-observer reliability of clinical movement-control tests for marines. BMC Musculoskelet Disord. 2012;13(1):1–1.
Kurosawa D, Murakami E, Ozawa H, Koga H, Isu T, Chiba Y, et al. A diagnostic scoring system for sacroiliac joint pain originating from the posterior ligament. Pain Med. 2017;18(2):228–38. https://doi.org/10.1093/pm/pnw117.
Ferreira AP, Póvoa LC, Zanier JF, Machado DC, Ferreira AS. Sensitivity for palpating lumbopelvic soft-tissues and bony landmarks and its associated factors: a single-blinded diagnostic accuracy study. J Back Musculoskelet Rehabil. 2017;30(4):735–44. https://doi.org/10.3233/BMR-150356.
Holt K, Russell D, Cooperstein R, Young M, Sherson M, Haavik H. Interexaminer reliability of a multidimensional battery of tests used to assess for vertebral subluxations. Chiropr J Aust. 2018;46(1):101–7.
Holt K, Russell D, Cooperstein R, Young M, Sherson M, Haavik H. Interexaminer reliability of seated motion palpation for the stiffest spinal site. J Manip Physiol Ther. 2018;41(7):571–9. https://doi.org/10.1016/j.jmpt.2017.08.009.
Pollard HP, Bablis P, Bonello R. Can the ileocecal valve point predict low back pain using manual muscle testing? Chiropr J Aust. 2006;36(2):58–62.
Robinson HS, Brox JI, Robinson R, Bjelland E, Solem S, Telje T. The reliability of selected motion- and pain provocation tests for the sacroiliac joint. Man Ther. 2007;12(1):72–9. https://doi.org/10.1016/j.math.2005.09.004.
Degenhardt BF, Johnson JC, Snider KT, Snider EJ. Maintenance and improvement of interobserver reliability of osteopathic palpatory tests over a 4-month period. J Am Osteopath Assoc. 2010;110(10):579–86.
Merz O, Wolf U, Robert M, Gesing V, Rominger M. Validity of palpation techniques for the identification of the spinous process L5. Man Ther. 2013;18(4):333–8. https://doi.org/10.1016/j.math.2012.12.003.
Mainka T, Lemburg SP, Heyer CM, Altenscheidt J, Nicolas V, Maier C. Association between clinical signs assessed by manual segmental examination and findings of the lumbar facet joints on magnetic resonance scans in subjects with and without current low back pain: a prospective, single-blind study. Pain. 2013;154(9):1886–95. https://doi.org/10.1016/j.pain.2013.06.018.
Adhia DB, Milosavljevic S, Tumilty S, Bussey MD. Innominate movement patterns, rotation trends and range of motion in individuals with low back pain of sacroiliac joint origin. Man Ther. 2016;21:100–8. https://doi.org/10.1016/j.math.2015.06.004.
Fransoo P, Legand D. Inter-observer reliability of clinical sacroiliac tests. Les Annales Kinesitherapie. 2004;32:33–8.
Sebastian D, Chovvath R. Reliability of palpation assessment in non-neutral dysfunctions of the lumbar spine. Orthop Phys Ther Pract. 2004;16:23–6.
Adelmanesh F, Jalali A, Shirvani A. Comparison between the sensitivity of straight leg raising test and gluteal trigger point to detect radicular low back pain: A diagnostic accuracy study. J Rehabil Med. 2016;48:94.
Ridehalgh C, Moore A, Hough A. Relationship of straight leg raise and slump tests to nerve palpation in individuals with spinally referred leg pain. Man Ther. 2016;100(25):e49.
Abbott JH, Mercer SR. Lumbar segmental hypomobility: criterion-related validity of clinical examination items (a pilot study). N Z J Physiother. 2003;31(1):3–10.
Billis EV, Foster NE, Wright CC. Reproducibility and repeatability: errors of three groups of physiotherapists in locating spinal levels by palpation. Man Ther. 2003;8(4):223–32. https://doi.org/10.1016/S1356-689X(03)00017-1.
Calvo-Lobo C, Diez-Vega I, Martínez-Pascual B, Fernández-Martínez S, de la Cueva-Reguera M, Garrosa-Martín G, et al. Tensiomyography, sonoelastography, and mechanosensitivity differences between active, latent, and control low back myofascial trigger points: a cross-sectional study. Medicine. 2017;96(10):e6287.
Skorupska E. Muscle atrophy measurement as assessment method for low back pain patients. Muscle Atrophy. 2018:437–61. https://doi.org/10.1007/978-981-13-1435-3_20.
Thawrani DP, Agabegi SS, Asghar F. Diagnosing sacroiliac joint pain. J Am Acad Orthop Surg. 2019;27(3):85–93.
Hunter C, Dubois M, Zou S, Oswald W, Coakley K, Shehebar M, et al. A new muscle pain detection device to diagnose muscles as a source of back and/or neck pain. Pain Med. 2010;11(1):35–43. https://doi.org/10.1111/j.1526-4637.2009.00773.x.
Gerhardt A, Eich W, Janke S, Leisner S, Treede RD, Tesarz J. Chronic widespread back pain is distinct from chronic local back pain: evidence from quantitative sensory testing, pain drawings, and psychometrics. Clin J Pain. 2016;32(7):568–79. https://doi.org/10.1097/AJP.0000000000000300.
Adachi S, Nakano A, Kin A, Baba I, Kurokawa Y, Neo M. The tibial nerve compression test for the diagnosis of lumbar spinal canal stenosis—a simple and reliable physical examination for use by primary care physicians. Acta Orthop Traumatol Turc. 2018;52(1):12–6. https://doi.org/10.1016/j.aott.2017.04.007.
Alqarni AM, Manlapaz D, Baxter D, Tumilty S, Mani R. Test procedures to assess somatosensory abnormalities in individuals with back pain: a systematic review of psychometric properties. Phys Ther Rev. 2018;23(3):178–96. https://doi.org/10.1080/10833196.2018.1479212.
Esmailiejah AA, Abbasian M, Bidar R, Esmailiejah N, Safdari F, Amirjamshidi A. Diagnostic efficacy of clinical tests for lumbar spinal instability. Surg Neurol Int. 2018;9:17.
Abbott JH, McCane B, Herbison P, Moginie G, Chapple C, Hogarty T. Lumbar segmental instability: a criterion-related validity study of manual therapy assessment. BMC Musculoskelet Disord. 2005;6(1):1–0.
Telli H, Telli S, Topal M. The validity and reliability of provocation tests in the diagnosis of sacroiliac joint dysfunction. Pain Physician. 2018;21(4):E367–76.
Stolz M, von Piekartz H, Hall T, Schindler A, Ballenberger N. Evidence and recommendations for the use of segmental motion testing for patients with LBP: a systematic review. Musculoskelet Sci Pract. 2020;45:102076. https://doi.org/10.1016/j.msksp.2019.102076.
The authors acknowledge and thank Mrs. Anne Taylor-Vaisey, librarian, for her suggestions and review of the search strategy. This research was undertaken, in part, thanks to funding from the Canada Research Chairs program to Dr. Pierre Côté, Canada Research Chair in Disability Prevention and Rehabilitation at the University of Ontario Institute of Technology.
This study was funded by the Association Française de Chiropraxie in France. This association was not involved in the collection of data, data analysis, interpretation of data, or drafting of the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no conflict of interest.
Cite this article
Nolet, P.S., Yu, H., Côté, P. et al. Reliability and validity of manual palpation for the assessment of patients with low back pain: a systematic and critical review. Chiropr Man Therap 29, 33 (2021). https://doi.org/10.1186/s12998-021-00384-3
- Manual palpation
- Low back pain
- Systematic review