
Reliability and validity of manual palpation for the assessment of patients with low back pain: a systematic and critical review



Static or motion manual palpation of the low back is commonly used to assess pain location and reproduction in low back pain (LBP) patients. The purpose of this study is to review the reliability and validity of manual palpation used for the assessment of LBP in adults.


We systematically searched five databases from 2000 to 2019. We critically appraised internal validity of studies using QAREL and QUADAS-2 instruments. We stratified results using best-evidence synthesis. Validity studies were classified according to Sackett and Haynes.


We identified 2023 eligible articles, of which 14 were low risk of bias. Evidence suggests that reliability of soft tissue structures palpation is inconsistent, and reliability of bony structures and joint mobility palpation is poor. We found preliminary evidence that gluteal muscle palpation for tenderness may be valid in differentiating LBP patients with and without radiculopathy.


Reliability of manual palpation tests in the assessment of LBP patients varies greatly. This is problematic because these tests are commonly used by manual therapists and clinicians. Little is known about the validity of these tests; therefore, their clinical utility is uncertain. High quality validity studies are needed to inform the clinical use of manual palpation tests.


Low back pain (LBP) is the most prevalent musculoskeletal condition in the general population [1, 2]. The point prevalence of LBP ranges from 1 to 58.1% and the one-year prevalence ranges from 0.8 to 82.5% [3], depending on the LBP definition and population. LBP is the leading cause of years lived with disability and the sixth leading cause of disability-adjusted life years globally [4, 5]. It is associated with poor health-related quality of life and imposes a substantial economic burden on society [6, 7]. Non-specific LBP, which cannot be attributed to a specific underlying pathology, is more common than specific LBP (e.g., cancer, fractures, infectious disorders, or ankylosing spondylitis) [8].

The clinical assessment of low back pain involves completing a physical examination [9]. Manual palpation is a common tool used to assess patients with LBP [10]. It includes static and dynamic palpation of soft tissue or joints and aims to identify painful structures and biomechanical dysfunction of the spine [11]. However, the clinical utility of these tests is controversial.

Previous systematic reviews have investigated the reliability and validity of manual palpation for the assessment of patients with LBP [9, 11,12,13]. According to these reviews, the inter-rater reliability of static joint and soft-tissue palpation to locate pain is poor (kappa (k) ≤ 0.40), and the inter-rater reliability of static palpation for soft tissue changes (e.g., tension) is inconsistent [9, 11, 13]. Furthermore, one review reported that motion palpation may be valid in detecting decreased motion, or lack of end-play in the lumbar spine [12]. However, motion palpation may not be valid to detect aberrant motion of the sacroiliac joints [12]. These reviews are outdated and there is a need for an up-to-date systematic review. The purpose of our systematic review was to determine the reliability and validity of manual palpation used to assess adult patients with LBP.


Eligibility criteria


We included studies of adults (≥18 years) with LBP. LBP refers to pain or discomfort below the costal margin and above the inferior gluteal folds and can be with or without referred leg pain [14]. Our systematic review includes patients with non-radicular low back pain, radicular low back pain, spinal stenosis, degenerative or isthmic spondylolisthesis, and failed back surgery syndrome.


Our review focuses on studies assessing the reliability or validity of manual palpation for the assessment of patients with LBP. Reliability describes the consistency of measurements across people or instruments [15]. Validity is the degree to which a test measures what it is intended to measure [15].

Manual palpation is a diagnostic procedure where the examiner feels with their hands to assess the mobility and state of the soft and bony tissues [16]. Palpation techniques include both static and dynamic (motion) methods, which are often used to identify areas of tissue pain and dysfunction, target manual and manipulative therapies, and determine the effectiveness of the intervention [9]. Static palpation is used to identify asymmetry of bony landmarks, tender points, and trigger points, and to evaluate tissue texture, temperature and tone [17]. Motion palpation is used to assess the quantity and quality of movement through the lumbar spine and pelvis [17]. Motion palpation assessment can be continuous within the normal range of motion with joint play, or dynamic soft tissue palpation, or end range assessment for end-feel or joint springing [17]. Palpation involving devices such as pressure algometry was excluded.


We aimed to evaluate clinical outcomes assessed by palpation. Outcomes included pain, segmental mobility and stiffness for static joint palpation; joint movement and position for motion joint palpation; and pain, tenderness, trigger points and muscle contraction for static soft tissue palpation.

Study characteristics

Eligible studies met the following inclusion criteria: 1) English or French language; 2) published in peer reviewed journals between January 1, 2000 and July 11, 2019; 3) assessing the reliability or validity of manual palpation. Previously published systematic reviews on this topic were included in our review. We examined the findings of studies published before 2000 by comparing our systematic review with these previous systematic reviews. We excluded: 1) letters, guidelines, editorials, commentaries, unpublished manuscripts, dissertations, reports, book chapters, conference proceedings and abstracts, lectures, addresses, and consensus statements; 2) cadaveric and animal studies; 3) literature reviews and case studies; 4) studies targeting individuals with serious pathology (e.g., fractures, dislocations, systemic disease, myelopathy, neoplasm and infection); and 5) studies with sample size < 20 per group.

Search strategy and data sources

The search strategy was developed in consultation with a health sciences librarian, and a second librarian was consulted to ensure accuracy and completeness using the Peer Review of Electronic Search Strategies (PRESS) checklist [18]. We systematically searched the following electronic databases: MEDLINE, CINAHL, PubMed, Cochrane Central Register of Controlled Trials, and SPORTDiscus. Search terms consisted of subject headings specific to each database (e.g., MeSH in MEDLINE) and free text words relevant to LBP, diagnosis, reliability, validity, and palpation (Additional file 1).

Study selection

Identified citations were exported into EndNote for reference management and tracking of the screening process. We screened articles in two stages. In stage one, titles and abstracts were screened for their relevance by pairs of independent reviewers (NL, PN, ALM). Stage two involved screening the full text article of all possibly relevant citations from stage one. Disagreements on screening stages were discussed between reviewers to reach consensus. When consensus could not be reached, a third reviewer independently screened the citation and discussed with the two reviewers to reach consensus.

Assessment of risk of Bias

Three reviewers (NL, PN, ALM) critically appraised all relevant studies (Tables 1 and 2) using the modified Quality Appraisal Tool for Studies of Diagnostic Reliability (QAREL) [33] criteria to assess the internal validity of the diagnostic reliability studies and the modified Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) [34] criteria to assess diagnostic accuracy/validity studies (Additional files 2 and 3). The original QAREL and QUADAS-2 instruments were modified to include: 1) not applicable options; 2) a question regarding the clarity of the study objective; and 3) the Sackett and Haynes classification (phases of validity studies in QUADAS-2 instrument). If a study was judged as “low” on all domains relating to bias or applicability then it was appropriate to have an overall judgment of “low risk of bias” or “low concern regarding applicability” for that study. If a study was judged “high” or “unclear” on one or more domains then it may be judged “at risk of bias” or as having “concerns regarding applicability” [33, 34]. We included low risk of bias studies in our best evidence synthesis.

Table 1 Risk of bias for scientifically admissible reliability studies based on the modified QAREL criteria
Table 2 Risk of bias for scientifically admissible validity studies based on the modified QUADAS-2 criteria

Validity studies with low risk of bias were classified into one of four phases of investigation following the recommendation of Sackett and Haynes [35]. The purpose of phase I studies is to determine whether test results differ between LBP patients and healthy controls. This information is useful to justify phase II studies. Phase II studies aim to determine whether patients with a positive palpation result are more likely to have decreased function, severe disability or structural changes (e.g., spinal stenosis) than patients with a negative result. On their own, results from phase I and II studies cannot be used to confirm the validity of a test. However, according to the Sackett and Haynes classification, they provide preliminary evidence that a test should be further investigated in phase III studies. Phase III studies aim to determine whether a test result can distinguish between LBP patients with and without suspected conditions (e.g., radiculopathy). Finally, phase IV studies aim to determine whether patients who undergo a manual palpation test have a better prognosis than similar patients who were not tested [35]. Phase IV studies differ from phase I-III studies in that they do not examine diagnostic accuracy. The risk of bias of phase IV studies would be assessed using the Scottish Intercollegiate Guidelines Network (SIGN) criteria [36].

Data extraction and synthesis of results

One reviewer (PN) extracted data from low risk of bias studies and built evidence tables (Tables 3 and 4); and two reviewers (NL or HY) verified the accuracy and completeness of the data extraction. The reliability and validity studies were stratified according to targeted body structures (joint or soft tissue), technique (static or motion palpation), and clinical outcome (pain provocation, mobility, or stiffness). We used qualitative synthesis to synthesize the best evidence [37]. Eligible statistics include 1) means, median and/or percent in phase I studies; 2) correlations, sensitivity, specificity, positive predictive value, negative predictive value and/or likelihood ratio in phase II or III studies; and 3) prevalence in phase III studies.

Table 3 Evidence table for low risk of bias studies assessing the reliability of manual palpation tests in patients with low back pain
Table 4 Evidence table for low risk of bias studies assessing the validity of manual palpation tests in patients with low back pain

No arbitrary classification was used to report the strength of reliability or validity findings. Such classifications use arbitrary cut-points that do not take into account the level of misclassification that can be acceptable in a specific context. Rather, values of kappa coefficients, sensitivity, specificity, etc. were reported. The authors interpreted the kappa and measurement errors according to the clinical setting and purpose of each palpation test. Kappa scores of < 0.6 are considered to have no, minimal or weak agreement and kappa scores of > 0.6 are considered to have moderate, strong or almost perfect agreement [38]. These thresholds should be used only as a rough guide when interpreting results in an individual context.

Statistical analyses

We computed kappa coefficients (k) and 95% confidence intervals (CI) to determine the inter-rater reliability of our screening methodology of articles. We computed the percentage agreement between reviewers for the classification of articles into high or low risk of bias.
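As an illustration of the two agreement statistics named above, the following minimal sketch computes Cohen's kappa, an approximate 95% CI, and the percentage agreement from a 2 × 2 table of paired screening decisions. The function name, the counts, and the simple large-sample standard error are assumptions made for illustration; they are not the review's data or necessarily the method its authors used.

```python
import math

def cohen_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two reviewers making include/exclude decisions.

    Returns (kappa, (ci_low, ci_high), observed_agreement).
    The 95% CI uses a simple large-sample approximation of the standard error.
    """
    n = both_include + only_a + only_b + both_exclude
    # Proportion of citations on which the two reviewers agreed
    observed = (both_include + both_exclude) / n
    # Marginal proportion of "include" decisions for each reviewer
    a_include = (both_include + only_a) / n
    b_include = (both_include + only_b) / n
    # Agreement expected by chance alone
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    kappa = (observed - expected) / (1 - expected)
    se = math.sqrt(observed * (1 - observed) / (n * (1 - expected) ** 2))
    return kappa, (kappa - 1.96 * se, kappa + 1.96 * se), observed

# Hypothetical counts, chosen only to exercise the function
kappa, ci, agreement = cohen_kappa(both_include=40, only_a=5, only_b=5, both_exclude=50)
print(f"kappa = {kappa:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f}); agreement = {agreement:.0%}")
```

Kappa corrects the observed agreement for the agreement expected by chance, which is why it can be much lower than the raw percentage agreement when one decision dominates.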


This review complies with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Additional file 4) [39]. The Standards for Reporting of Diagnostic Accuracy Studies (STARD) statement was used to inform the critical appraisal with the QAREL and QUADAS-2 [40].


Study selection

We identified 2307 citations (plus 3 citations from other sources), removed 287 duplicates, and reviewed 2023 articles for eligibility (Fig. 1). In stage 1 screening, 1976 citations were ineligible. Forty-seven papers were reviewed in stage 2, and 31 were excluded: ineligible study population (n = 11) [41,42,43,44,45,46,47,48,49,50,51], inappropriate outcome measure (n = 6) [52,53,54,55,56,57], ineligible publication type (n = 4) [58,59,60,61], ineligible sample size (n = 3) [62,63,64], study design (n = 3) [65,66,67] and did not investigate manual palpation (n = 4) [68,69,70,71]. Two authors were contacted to clarify publication type and age range; both responded [27, 59].

Fig. 1

Identification and selection of articles on reliability and validity of manual palpation used to assess patients with low back pain. *Not mutually exclusive

We critically appraised 16 articles, of which 14 had low risk of bias and were included in our evidence synthesis [19,20,21,22,23,24,25,26,27,28,29,30,31,32] (Fig. 1). These 14 articles reported 17 studies (three articles included both reliability and validity in their study). The inter-rater agreement for screening of articles was kappa = 0.86 (95% CI 0.73–0.98). The percentage agreement for the admissibility of studies was 100% (17 agreements/17 studies over the 16 articles appraised).
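The selection counts reported above can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative.

```python
# Counts as reported in the study selection paragraphs
identified = 2307
other_sources = 3
duplicates = 287
stage1_excluded = 1976

# Stage 2 exclusions by reason
stage2_exclusions = {
    "ineligible study population": 11,
    "inappropriate outcome measure": 6,
    "ineligible publication type": 4,
    "ineligible sample size": 3,
    "study design": 3,
    "did not investigate manual palpation": 4,
}

screened = identified + other_sources - duplicates            # articles reviewed for eligibility
full_text = screened - stage1_excluded                        # papers reviewed in stage 2
appraised = full_text - sum(stage2_exclusions.values())       # articles critically appraised

print(screened, full_text, sum(stage2_exclusions.values()), appraised)
# prints: 2023 47 31 16
```

The totals agree with the 2023 articles screened, 47 full-text papers, 31 exclusions, and 16 articles critically appraised.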

Study characteristics

Fourteen articles had a low risk of bias [19,20,21,22,23,24,25,26,27,28,29,30,31,32]. Of those, 11 reported on the reliability of palpation tests [19,20,21,22,23,24,25,26,27,28,29] and six reported on validity [22, 28,29,30,31,32]. Three articles examined both reliability and validity [22, 28, 29].

The eleven reliability studies with low risk of bias examined inter-rater reliability of manual palpation to assess joint mobility or motion [19,20,21, 23, 26, 27], pain [19, 21, 23,24,25,26, 28, 29] and muscle contraction [22]. Two of the eleven studies also examined intra-rater reliability of manual palpation assessing joint motion [20] and muscle tenderness [24]. The six validity studies included one phase I study on palpation of joints and muscles to assess pain [29]; four phase II studies on palpation of nerves to elicit pain [28], spinal stiffness [31], muscle contraction [22] and sacroiliac joint motion [32]; and one phase III study on palpation of gluteal muscle for tenderness and pain [30].

The 14 low risk of bias articles investigated: 1) static joint palpation (n = 7) [19, 21, 23, 25, 26, 29, 31], 2) motion joint palpation (n = 3) [20, 27, 32], and 3) static soft tissue palpation (n = 5) [22, 24, 28,29,30] (Tables 3 and 4). They assessed various techniques: 1) joint pain provocation [19, 21, 23, 26, 29], 2) pain or tenderness of muscles [24, 29, 30], 3) pain and tenderness of nerves [28], 4) joint stiffness/mobility [19, 21, 23, 25, 26, 31], 5) joint motion [20, 27, 32], and 6) isometric muscle contraction [22]. Table 5 provides a glossary of definitions for all of the palpation tests included in the articles.

Table 5 Glossary of Manual Palpation Tests in Accepted Articles

The duration of LBP varied across studies: < 7 weeks (1/14 articles) [21], > 4 weeks (1/14 articles) [24], ≥ l months (1/14 articles) [29], new episode to > 3 months (1/14 articles) [19] and unspecified duration (10/14 articles) [20, 22, 23, 25,26,27,28, 30,31,32]. The studies were conducted in Australia [21], Canada [30], Denmark [24], Iran [20, 32], Ireland [28], and the United States [19, 22, 23, 25,26,27, 29, 31] between 2003 and 2017.

We did not perform a meta-analysis because of the heterogeneity of studies in symptom duration, palpation technique, and outcome specification.

Assessment of risk of Bias

Tables 1 and 2 show the risk of bias for scientifically admissible reliability and validity studies based on the modified QAREL and QUADAS-2 criteria, respectively.

The low risk of bias studies met the following criteria: 1) clearly described objective; 2) representative sample; 3) representative raters; 4) blinding of the test results between raters; 5) appropriate and valid standard test; and 6) appropriate statistical analysis (Tables 1 and 2). However, these studies had the following limitations: 1) unclear time interval between tests (n = 1) [27]; 2) no blinding for intra-examiner reliability (n = 2) [20, 24]; 3) a 30 min rest period between repeat testing by the same examiner and no blinding to clinical information (n = 2) [27, 28]; 4) unclear blinding to clinical information or additional clues [23]; 5) no blinding to clinical information and unclear blinding to additional clues (n = 8) [21, 22, 23, 24, 25, 27, 28, 29]; and 6) non-random or unclear administration of tests (n = 5) [19, 23, 27,28,29]. Most validity studies had appropriate exclusion criteria and blinding. However, validity studies had limitations: 1) four studies did not use a consecutive or random sample [22, 29, 31, 32]; 2) two studies were unclear as to whether an appropriate time interval between tests was used [28, 31]; 3) one study was unclear as to whether an appropriate reference standard (slump test and straight leg raise) was used [28]; 4) in one study the examiner was not blinded to the results of the index or reference test [32]; and 5) in one study it was unclear whether all patients were included in the analysis [22].

Two studies were excluded after critical appraisal. Abbott et al. used flexion/extension radiographs as a reference standard without establishing the test-retest reliability of patient positioning when taking the radiographs [72]. Telli et al. did not use blinding in their reliability study [73].

Summary of evidence

Reliability of joint and bony structure palpation

Static palpation

Four studies investigated static palpation to elicit pain. Overall, these studies suggest that substantial measurement error is associated with eliciting pain from: 1) lumbar facet joints (inter-rater reliability 0.38 ≤ k ≤ 0.73); 2) lumbar spinous processes (inter-rater reliability 0.21 ≤ k ≤ 0.57); and 3) sacroiliac (SI) joints (inter-rater reliability 0.14 ≤ k ≤ 0.59) [19, 23, 26, 29] (Table 3). Similarly, the evidence suggests that static palpation used to identify joint segmental mobility has low inter-rater reliability (i.e., lumbar facet joints: −0.17 ≤ k ≤ 0.17; lumbar spinous processes: −0.02 ≤ k ≤ 0.26; and SI joints: −0.11 ≤ k ≤ −0.10) [19, 23, 26]. The inter-rater reliability of the prone instability test for pain was k = 0.30 [23], 0.41 [19] and 0.54 [26] in the relaxation phase of the test and k = 0.46 [26], 0.71 [19] and 0.87 [23] in the contraction phase of the test. A study that combined the two phases of the test into a positive or negative finding reported a kappa of 0.10 [25] (Table 3). Furthermore, Downey et al. (2003) reported low inter-rater reliability of static joint palpation to locate the spinal level (0.23 ≤ k ≤ 0.54) and name the spinal level (−0.13 ≤ k ≤ 0.41) in patients with LBP symptoms [21] (Table 3).

Motion palpation

We found inconsistent evidence in support of the reliability of motion palpation of the lumbar spine and SI joints to assess joint motion [20, 27]. The reliability of motion palpation of the sacroiliac joint varied (inter-rater reliability 0.14 ≤ k ≤ 0.75 and intra-rater reliability 0.23 ≤ k ≤ 0.73) (Table 3) [20, 27]. Tong et al. (2006) suggested that sacral position cannot be reliably assessed during trunk motion using the sacral base position test (inter-rater reliability: flexion k = 0.37, extension k = 0.05) [27].

Reliability of soft tissue palpation

Static palpation

We found varying levels of reliability for the palpation of soft tissue structures associated with low back pain [22, 24, 28, 29]. The inter-rater reliability ranged from k = 0.80 for sciatic nerve pain, to 0.51 ≤ k ≤ 0.68 for gluteal tender points and k = 0.34 for lumbar paraspinal muscle pain [24, 28, 29]. One study suggested that examiners can reliably assess abnormal isometric contraction of what they believe to be the multifidus muscle. Examiners palpated lateral and adjacent to the interspinous space of L4-L5 and L5-S1 during contralateral arm raising, both with and without hand weights (inter-rater reliability 0.75 ≤ k ≤ 0.81) [22]. It is possible that the multifidus lift test also palpates a more superficial muscle, which raises questions about the validity of this test.

Validity of joint and bony structure palpation

Static palpation

Two studies investigated the validity of static joint palpation [29, 31]. One phase I study found that pain elicited by palpation of the SI joints and lumbar spinous processes was more common in LBP patients compared to healthy controls [29]. One phase II study reported that posterior to anterior palpation used to identify stiffness from L1-L5 had a sensitivity of 38% (95% CI 21–59%), a specificity of 45% (95% CI 28–62%), a positive likelihood ratio of 0.69 (95% CI 0.37–1.31) and a negative likelihood ratio of 1.38 (95% CI 0.82–2.33) when compared to a mechanized indentation device [31] (Table 4).
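The likelihood ratios reported for posterior to anterior palpation follow directly from the sensitivity and specificity point estimates. The sketch below reproduces the point estimates only (not the confidence intervals); the function name is illustrative.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as proportions."""
    lr_positive = sensitivity / (1 - specificity)   # true positive rate / false positive rate
    lr_negative = (1 - sensitivity) / specificity   # false negative rate / true negative rate
    return lr_positive, lr_negative

# Point estimates reported for posterior to anterior palpation of L1-L5 stiffness [31]
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.38, specificity=0.45)
print(round(lr_pos, 2), round(lr_neg, 2))
# prints: 0.69 1.38
```

A positive likelihood ratio below 1 together with a negative likelihood ratio above 1 means that a positive palpation finding did not raise the odds of stiffness as measured by the reference standard.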

Motion palpation

One phase II study investigated the validity of joint motion palpation tests for the sacroiliac joints [32]. They examined the relationship between sacroiliac tests for joint motion (Gillet test, sitting flexion test and standing flexion test) and sacroiliac pain provocation tests (Faber test, thigh thrust test and resisted abduction test) but did not use statistics for validity (Table 4).

Validity of soft tissue palpation

Static palpation

Four studies investigated the validity of static soft tissue palpation [22, 28,29,30]. One phase I study found that pain elicited by palpation of the lumbar paraspinal and piriformis muscles was more common in LBP patients compared to those without LBP [29]. A phase II study tested the validity of the multifidus lift test with and without hand weights to identify abnormal isometric multifidus muscle contraction when compared to measurement with real-time ultrasound imaging of lumbar multifidus muscle thickness [22] (Table 4). The authors reported that the multifidus lift test correlates with ultrasound findings at the L4–5 level (r biserial correlation coefficient: 0.59 and 0.73 across the hand weight conditions) and is weakly associated at the L5-S1 level (r biserial correlation coefficient: 0.17 and 0.47) (Table 4) [22]. Another phase II study investigated the validity of sciatic nerve palpation between the ischial tuberosity and the greater trochanter for pain, using the straight leg raise and slump test as the reference standard to evaluate mechanosensitivity of the sciatic nerve [28]. The authors found that sciatic nerve palpation had a sensitivity of 85% (95% CI, 75–95%) and a specificity of 60% (95% CI, 46–74%) [28]. Finally, one phase III study investigated the validity of static palpation of gluteal muscle for taut band, tenderness and pain recognition compared to an expert panel confirmation of radicular LBP (informed by MRI and electro-diagnostic testing). The authors reported that static palpation of the gluteal muscle had a sensitivity of 74.1% (95% CI, 67.7–80.3%) and a specificity of 91.4% (95% CI, 86.8–96.0%) in identifying radicular pain [30].


Summary of results

We reviewed the reliability and validity of manual palpation used to assess patients with LBP. We retrieved eleven studies on the reliability of static and motion palpation of joint and soft tissue. Overall, the evidence suggests that static joint palpation is not reliable in identifying pain and segmental mobility of the lumbar facet joints, lumbar spinous processes and SI joints, or the location of the spinal level contributing to LBP symptoms. However, static soft tissue palpation may help reliably identify gluteal tender points, sciatic nerve pain, and multifidus contraction but not lumbar paraspinal muscle pain. We identified six validity studies for the assessment of LBP using static joint, joint motion and soft tissue palpation. Gluteal muscle palpation for pain was able to help differentiate LBP patients with or without radiculopathy (phase III study). We found preliminary evidence for the validity of piriformis and lumbar paraspinal muscle palpation for pain (phase I study), spinous and sacroiliac joint palpation for pain (phase I study), sciatic nerve palpation for pain to identify mechanosensitivity of the sciatic nerve as determined by the straight leg raise and slump test (phase II study) and the multifidus lift test to help identify abnormal isometric contraction (phase II study); and against posterior to anterior palpation used to identify stiffness from L1-L5 spine levels (phase II study). Sacroiliac joint motion tests were not associated with sacroiliac pain provocation tests (phase II study). Overall, very little knowledge is available to support the usefulness of lumbar and sacroiliac palpation tests when examining patients with low back pain.

Comparison with previous systematic reviews

The results of our systematic review differ from previous systematic reviews [9, 11, 13]. Our finding that static joint palpation of the spinous processes, facet and sacroiliac joints is not reliable to identify pain disagrees with previous systematic reviews [9, 11, 13]. Three reviews reported that the reliability of static joint palpation for pain was acceptable, but the kappa used to make this conclusion is low (k ≥ 0.4) [9, 11, 13]. Our review also disagrees with the previous finding by Stochkendahl et al. that static soft tissue palpation may help reliably identify soft tissue pain (k ≤ 0.4) [11]. Our review found inconsistent reliability to identify soft tissue pain with the inclusion of three recent studies [22, 24, 28]. The different conclusions may be due to different search strategies, new evidence, inclusion of small sample studies, use of self-developed checklists, or use of predefined cut-off points to differentiate low and high quality studies in the four systematic reviews. However, our results are consistent with a systematic review published in 2020 focusing only on segmental motion palpation [74]. That review reported poor evidence regarding the reliability and validity of segmental motion testing and concluded that clinical use of stand-alone tests cannot be recommended [74].

Strengths and limitations

Our systematic review has several strengths. First, our comprehensive search strategy of multiple databases was developed by a health sciences librarian in consultation with content experts and was then reviewed by an independent health sciences librarian using the PRESS Checklist [18]. Second, we used detailed, predefined inclusion and exclusion criteria to capture a broad range of possibly relevant citations. Third, we used paired independent reviewers to screen and critically appraise citations to minimize bias and error. The critical appraisal was completed by trained reviewers using standardized quality assessment tools (QAREL/QUADAS-2). Fourth, bias in reported results was minimized by performing a best-evidence synthesis that included only high-quality studies. Finally, we only included studies that tested subjects with LBP. This makes our results more generalizable to the patients seen by practitioners in clinical practice.

Our review also had limitations. First, our search was limited to studies published in English and French languages. It is possible that relevant studies in other languages may have been excluded. Second, our search may not have retrieved all relevant studies, although our search strategy was comprehensive and the search was conducted in multiple major medical databases. Third, our search was limited to studies published after 2000. Fourth, it is possible that individual differences in scientific judgment could have resulted in varied critical appraisal outcomes among reviewers. This bias was minimized using training with the standardized assessment tools and a consensus process for determining internal validity of studies. Finally, studies examining motion palpation tests had smaller sample sizes (validity studies n = 50; reliability studies n = 49) than studies of static joint or muscle palpation. This may have limited the precision of the results and led to uncertainty in our assessment of motion palpation tests.

Clinical implications

Our review found very little evidence for the use of manual palpation to assess low back pain patients. Manual palpation tests suffered from misclassification error in that they were unable to differentiate those with LBP from subjects without LBP. Soft tissue palpation of the sciatic nerve and gluteal muscles for pain, and of the multifidus muscle for isometric contraction, was reliable but has not been tested sufficiently for validity for use in clinical practice. However, we did find that gluteal muscle palpation of trigger points and taut bands is valid to differentiate LBP patients with or without radiculopathy in a clinical setting. We found very limited evidence to support the use of joint palpation, and clinicians should reconsider its diagnostic value when assessing patients with low back pain.


We synthesized the evidence on the reliability and validity of manual palpation to assess adults with LBP. The evidence does not support the reliability of joint palpation, but static soft tissue palpation is reliable. There is little evidence on motion joint palpation used in LBP patients. Gluteal muscle palpation for pain was able to differentiate LBP patients with or without radiculopathy (phase III study). We found preliminary evidence from phase I and II validity studies for some palpation tests. High quality phase III and IV validity studies are required to understand the diagnostic value of manual palpation tests in the assessment of adults with LBP. Clinicians must reconsider the usefulness of these tests when examining patients.

Availability of data and materials

Not applicable.


  1. 1.

    Vos T, Allen C, Arora M, Barber RM, Bhutta ZA, Brown A, et al. Global, regional, and national incidence, prevalence, and years lived with disability for 310 diseases and injuries, 1990–2015: a systematic analysis for the global burden of disease study 2015. Lancet. 2016;388(10053):1545–602.

    Article  Google Scholar 

  2. 2.

    Cassidy JD, Côté P, Carroll LJ, Kristman V. Incidence and course of low back pain episodes in the general population. Spine. 2005;30(24):2817–23.

    Article  PubMed  Google Scholar 

  3. 3.

    Hoy D, Brooks P, Blyth F, Buchbinder R. The epidemiology of low back pain. Best Pract Res Clin Rheumatol. 2010;24(6):769–81.

    CAS  Article  PubMed  Google Scholar 

  4. 4.

    Vos T, Flaxman AD, Naghavi M, Lozano R, Michaud C, Ezzati M, et al. Years lived with disability (YLDs) for 1160 sequelae of 289 diseases and injuries 1990–2010: a systematic analysis for the global burden of disease study 2010. Lancet. 2012;380(9859):2163–96.

    Article  PubMed  PubMed Central  Google Scholar 

  5. 5.

    Murray CJ, Vos T, Lozano R, Naghavi M, Flaxman AD, Michaud C, et al. Disability-adjusted life years (DALYs) for 291 diseases and injuries in 21 regions, 1990–2010: a systematic analysis for the global burden of disease study 2010. Lancet. 2012;380(9859):2197–223.

    Article  PubMed  Google Scholar 

  6. 6.

    Nolet PS, Kristman VL, Côté P, Carroll LJ, Cassidy JD. Is low back pain associated with worse health-related quality of life 6 months later? Eur Spine J. 2015;24(3):458–66.

    Article  PubMed  Google Scholar 

  7. 7.

    Hoy D, March L, Brooks P, Blyth F, Woolf A, Bain C, et al. The global burden of low back pain: estimates from the global burden of disease 2010 study. Ann Rheum Dis. 2014;73(6):968–74.

    Article  PubMed  Google Scholar 

  8. 8.

    Savigny P, Watson P, Underwood M. Early management of persistent non-specific low back pain: summary of NICE guidance. Bmj. 2009;338(jun04 3):b1805.

    Article  PubMed  Google Scholar 

  9. 9.

    Seffinger MA, Najm WI, Mishra SI, Adams A, Dickerson VM, Murphy LS, et al. Reliability of spinal palpation for diagnosis of back and neck pain: a systematic review of the literature. Spine. 2004;29(19):E413–25.

    Article  PubMed  PubMed Central  Google Scholar 

  10. 10.

    Najm WI, Seffinger MA, Mishra SI, Dickerson VM, Adams A, Reinsch S, et al. Content validity of manual spinal palpatory exams-a systematic review. BMC Complement Altern Med. 2003;3(1):1–4.

    Article  PubMed  PubMed Central  Google Scholar 

  11. 11.

    Stochkendahl MJ, Christensen HW, Hartvigsen J, Vach W, Haas M, Hestbaek L, et al. Manual examination of the spine: a systematic critical literature review of reproducibility. J Manip Physiol Ther. 2006;29(6):475–85.

    Article  Google Scholar 

  12. 12.

    Hestœk L, Leboeuf-Yde C. Are chiropractic tests for the lumbo-pelvic spine reliable and valid? A systematic critical literature review. J Manip Physiol Ther. 2000;23(4):258–75.

    Article  Google Scholar 

  13. 13.

    Haneline MT, Young M. A review of intraexaminer and interexaminer reliability of static spinal palpation: a literature synthesis. J Manip Physiol Ther. 2009;32(5):379–86.

    Article  Google Scholar 

  14. 14.

    Duthey B. Background paper 6.24 low back pain. World Health Organization (WHO)(ed.) priority medicines for Europe and the world ‘a public health approach to innovation’. Geneva: WHO; 2012.

    Google Scholar 

  15. 15.

    Fletcher RH, Fletcher SW, Fletcher GS. Clinical Epidemiology: The essentials. 5th ed: Philadelphia, Pennsylvania, Lippincott Williams & Williams; 2012.

  16. 16.

    Jonas WB. Mosby’s dictionary of complementary and alternative medicine; St. Louis (Mo), Mosby, Elsevier, 2005.

  17. 17.

    Bergmann TF, Peterson DH. Chiropractic Technique: Principles and Procedures. 3rd ed: St. Louis (Mo) Mosby; 2002.

  18. 18.

    McGowan J, Sampson M, Salzwedel DM, Cogo E, Foerster V, Lefebvre C. PRESS peer review of electronic search strategies: 2015 guideline statement. J Clin Epidemiol. 2016;75:40–6.

    Article  PubMed  PubMed Central  Google Scholar 

  19. 19.

    Alyazedi FM, Lohman EB, Wesley Swen R, Bahjri K. The inter-rater reliability of clinical tests that best predict the subclassification of lumbar segmental instability: structural, functional and combined instability. J Man Manip Ther. 2015;23(4):197–204.

    Article  PubMed  PubMed Central  Google Scholar 

  20. 20.

    Arab AM, Abdollahi I, Joghataei MT, Golafshani Z, Kazemnejad A. Inter-and intra-examiner reliability of single and composites of selected motion palpation and pain provocation tests for sacroiliac joint. Man Ther. 2009;14(2):213–21.

    Article  PubMed  PubMed Central  Google Scholar 

  21. 21.

    Downey B, Taylor N, Niere K. Can manipulative physiotherapists agree on which lumbar level to treat based on palpation? Physiotherapy. 2003;89(2):74–81.

    Article  Google Scholar 

  22. 22.

    Hebert JJ, Koppenhaver SL, Teyhen DS, Walker BF, Fritz JM. The evaluation of lumbar multifidus muscle function via palpation: reliability and validity of a new clinical test. Spine J. 2015;15(6):1196–202.

    Article  PubMed  PubMed Central  Google Scholar 

  23. 23.

    Hicks GE, Fritz JM, Delitto A, Mishock J. Interrater reliability of clinical examination measures for identification of lumbar segmental instability. Arch Phys Med Rehabil. 2003;84(12):1858–64.

    Article  PubMed  PubMed Central  Google Scholar 

  24. 24.

    Jensen OK, Callesen J, Nielsen MG, Ellingsen T. Reproducibility of tender point examination in chronic low back pain patients as measured by intrarater and inter-rater reliability and agreement: a validation study. BMJ Open. 2013;3(2):e002532.

    Article  PubMed Central  Google Scholar 

  25. 25.

    Ravenna MM, Hoffman SL, Van Dillen LR. Low interrater reliability of examiners performing the prone instability test: a clinical test for lumbar shear instability. Arch Phys Med Rehabil. 2011;92(6):913–9.

    Article  PubMed  PubMed Central  Google Scholar 

  26. 26.

    Schneider M, Erhard R, Brach J, Tellin W, Imbarlina F, Delitto A. Spinal palpation for lumbar segmental mobility and pain provocation: an interexaminer reliability study. J Manip Physiol Ther. 2008;31(6):465–73.

    Article  Google Scholar 

  27. 27.

    Tong HC, Heyman OG, Lado DA, Isser MM. Interexaminer reliability of three methods of combining test results to determine side of sacral restriction, sacral base position, and innominate bone position. J Am Osteopath Assoc. 2006;106(8):464–8.

    PubMed  PubMed Central  Google Scholar 

  28. 28.

    Walsh J, Hall T. Reliability, validity and diagnostic accuracy of palpation of the sciatic, tibial and common peroneal nerves in the examination of low back related leg pain. Man Ther. 2009;14(6):623–9.

    Article  PubMed  PubMed Central  Google Scholar 

  29. 29.

    Weiner DK, Sakamoto S, Perera S, Breuer P. Chronic low back pain in older adults: prevalence, reliability, and validity of physical examination findings. J Am Geriatr Soc. 2006;54(1):11–20.

    Article  PubMed  PubMed Central  Google Scholar 

  30. 30.

    Adelmanesh F, Jalali A, Shirvani A, Pakmanesh K, Pourafkari M, Raissi GR, et al. The diagnostic accuracy of gluteal trigger points to differentiate radicular from nonradicular low back pain. Clin J Pain. 2016;32(8):666–72.

    Article  PubMed  PubMed Central  Google Scholar 

  31. 31.

    Koppenhaver SL, Hebert JJ, Kawchuk GN, Childs JD, Teyhen DS, Croy T, et al. Criterion validity of manual assessment of spinal stiffness. Man Ther. 2014;19(6):589–94.

    Article  PubMed  PubMed Central  Google Scholar 

  32. 32.

    Soleimanifar M, Karimi N, Arab AM. Association between composites of selected motion palpation and pain provocation tests for sacroiliac joint disorders. J Bodyw Mov Ther. 2017;21(2):240–5.

    Article  PubMed  PubMed Central  Google Scholar 

  33. 33.

    Lucas NP, Macaskill P, Irwig L, Bogduk N. The development of a quality appraisal tool for studies of diagnostic reliability (QAREL). J Clin Epidemiol. 2010;63(8):854–61.

    Article  PubMed  Google Scholar 

  34. 34.

    Whiting PF, Rutjes AW, Westwood ME, Mallet S, Deeks JJ, Reitsma JB, et al. Research and reporting methods accuracy studies. Ann Intern Med. 2011;155(4):529–36.

    Article  PubMed  Google Scholar 

  35. 35.

    Sackett DL, Haynes RB. The architecture of diagnostic research. BMJ. 2002;324(7336):539–41.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  36. 36.

    Harbour R, Miller J. A new system for grading recommendations in evidence based guidelines. BMJ. 2001;323(7308):334–6.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  37. 37.

    Slavin RE. Best evidence synthesis: an intelligent alternative to meta-analysis. J Clin Epidemiol. 1995;48(1):9–18.

    CAS  Article  PubMed  Google Scholar 

  38. 38.

    McHugh ML. Interrater reliability: the kappa statistic. Biochemia Medica. 2012;22(3):276–82.

    Article  PubMed Central  Google Scholar 

  39. 39.

    Moher D, Liberati A, Tetzlaff J, Altman DG, Prisma Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6(7):e1000097.

    Article  PubMed Central  Google Scholar 

  40. 40.

    Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. Toward complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. Am J Clin Pathol. 2003;119(1):18–22.

    Article  PubMed  PubMed Central  Google Scholar 

  41. 41.

    Hsieh CY, Hong CZ, Adams AH, Platt KJ, Danielson CD, Hoehler FK, et al. Interexaminer reliability of the palpation of trigger points in the trunk and lower limb muscles. Arch Phys Med Rehabil. 2000;81(3):258–64.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  42. 42.

    Leboeuf-Yde C, van Dijk J, Franz C, Hustad SA, Olsen D, Pihl T, et al. Motion palpation findings and self-reported low back pain in a population-based study sample. J Manip Physiol Ther. 2002;25(2):80–7.

    Article  Google Scholar 

  43. 43.

    Collaer JW, McKeough DM, Boissonnault WG. Lumbar isthmic spondylolisthesis detection with palpation: interrater reliability and concurrent criterion-related validity. J Manual Manipulative Ther. 2006;14(1):22–9.

    Article  Google Scholar 

  44. 44.

    Chakraverty R, Pynsent P, Isaacs K. Which spinal levels are identified by palpation of the iliac crests and the posterior superior iliac spines? J Anat. 2007;210(2):232–6.

    Article  PubMed  PubMed Central  Google Scholar 

  45. 45.

    Holmgren U, Waling K. Inter-examiner reliability of four static palpation tests used for assessing pelvic dysfunction. Man Ther. 2008;13(1):50–6.

    Article  PubMed  PubMed Central  Google Scholar 

  46. 46.

    Liu Y, Palmer JL. Iliacus tender points in young adults: a pilot study. J Am Osteopathic Assoc. 2012;112(5):285–9.

    Google Scholar 

  47. 47.

    Monnier A, Heuer J, Norman K, Äng BO. Inter-and intra-observer reliability of clinical movement-control tests for marines. BMC Musculoskelet Disord. 2012;13(1):1–1.

    Article  Google Scholar 

  48. 48.

    Kurosawa D, Murakami E, Ozawa H, Koga H, Isu T, Chiba Y, et al. A diagnostic scoring system for sacroiliac joint pain originating from the posterior ligament. Pain Med. 2017;18(2):228–38.

    Article  PubMed  PubMed Central  Google Scholar 

  49. 49.

    Ferreira AP, Póvoa LC, Zanier JF, Machado DC, Ferreira AS. Sensitivity for palpating lumbopelvic soft-tissues and bony landmarks and its associated factors: a single-blinded diagnostic accuracy study. J Back Musculoskeletal Rehabil. 2017;30(4):735–44.

    CAS  Article  Google Scholar 

  50. 50.

    Holt K, Russell D, Cooperstein R, Young M, Sherson M, Haavik H. Interexaminer reliability of a multidimensional battery of tests used to assess for vertebral subluxations. Chiropractic J Aust. 2018;46(1):101-7.

  51. 51.

    Holt K, Russell D, Cooperstein R, Young M, Sherson M, Haavik H. Interexaminer reliability of seated motion palpation for the stiffest spinal site. J Manip Physiol Ther. 2018;41(7):571–9.

    Article  Google Scholar 

  52. 52.

    Pollard HP, Bablis P, Bonello R. Can the ileocecal valve point predict low back pain using manual muscle testing? Chiropractic J Aust. 2006;36(2):58–62.

    Google Scholar 

  53. 53.

    Robinson HS, Brox JI, Robinson R, Bjelland E, Solem S, Telje T. The reliability of selected motion-and pain provocation tests for the sacroiliac joint. Man Ther. 2007;12(1):72–9.

    Article  PubMed  PubMed Central  Google Scholar 

  54. 54.

    Degenhardt BF, Johnson JC, Snider KT, Snider EJ. Maintenance and improvement of interobserver reliability of osteopathic palpatory tests over a 4-month period. J Am Osteopathic Assoc. 2010;110(10):579–86.

    Google Scholar 

  55. 55.

    Merz O, Wolf U, Robert M, Gesing V, Rominger M. Validity of palpation techniques for the identification of the spinous process L5. Man Ther. 2013;18(4):333–8.

    Article  PubMed  PubMed Central  Google Scholar 

  56. 56.

    Mainka T, Lemburg SP, Heyer CM, Altenscheidt J, Nicolas V, Maier C. Association between clinical signs assessed by manual segmental examination and findings of the lumbar facet joints on magnetic resonance scans in subjects with and without current low back pain: a prospective, single-blind study. PAIN. 2013;154(9):1886–95.

    Article  PubMed  PubMed Central  Google Scholar 

  57. 57.

    Adhia DB, Milosavljevic S, Tumilty S, Bussey MD. Innominate movement patterns, rotation trends and range of motion in individuals with low back pain of sacroiliac joint origin. Man Ther. 2016;21:100–8.

    Article  PubMed  PubMed Central  Google Scholar 

  58. 58.

    Fransoo P, Legand D. Inter-observer reliability of clinical sacroiliac tests. Les Annales Kinesitherapie. 2004;32:33–8.

    Google Scholar 

  59. 59.

    Sebastian D, Chovvath R. Reliability of palpation assessment in non-neutral dysfunctions of the lumbar spine. Orthop Phys Ther Pract. 2004;16:23–6.

    Google Scholar 

  60. 60.

    Adelmanesh F, Jalali A, Shirvani A. Comparison between the sensitivity of straight leg raising test and gluteal trigger point to detect radicular low back pain: A diagnostic accuracy study. J Rehabil Med. 2016;48:94.

    Google Scholar 

  61. 61.

    Ridehalgh C, Moore A, Hough A. Relationship of straight leg raise and slump tests to nerve palpation in individuals with spinally referred leg pain. Man Ther. 2016;100(25):e49.

    Article  Google Scholar 

  62. 62.

    Abbott JH, Mercer SR. Lumbar segmental hypomobility: criterion-related validity of clinical examination items (a pilot study). N Z J Physiother. 2003;31(1):3–10.

    Google Scholar 

  63. 63.

    Billis EV, Foster NE, Wright CC. Reproducibility and repeatability: errors of three groups of physiotherapists in locating spinal levels by palpation. Man Ther. 2003;8(4):223–32.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  64. 64.

    Calvo-Lobo C, Diez-Vega I, Martínez-Pascual B, Fernández-Martínez S, de la Cueva-Reguera M, Garrosa-Martín G, et al. Tensiomyography, sonoelastography, and mechanosensitivity differences between active, latent, and control low back myofascial trigger points: a cross-sectional study. Medicine. 2017;96(10):e6287.

    Article  PubMed Central  Google Scholar 

  65. 65.

    Skorupska E. Muscle atrophy measurement as assessment method for low back pain patients. Muscle Atrophy. 2018:437–61.

  66. 66.

    Thawrani DP, Agabegi SS, Asghar F. Diagnosing sacroiliac joint pain. JAAOS-J Am Acad Orthop Surg. 2019;27(3):85–93.

    Article  Google Scholar 

  67. 67.

    Hunter C, Dubois M, Zou S, Oswald W, Coakley K, Shehebar M, et al. A new muscle pain detection device to diagnose muscles as a source of back and/or neck pain. Pain Med. 2010;11(1):35–43.

    Article  PubMed  PubMed Central  Google Scholar 

  68. 68.

    Gerhardt A, Eich W, Janke S, Leisner S, Treede RD, Tesarz J. Chronic widespread back pain is distinct from chronic local back pain: evidence from quantitative sensory testing, pain drawings, and psychometrics. Clin J Pain. 2016;32(7):568–79.

    Article  PubMed  PubMed Central  Google Scholar 

  69. 69.

    Adachi S, Nakano A, Kin A, Baba I, Kurokawa Y, Neo M. The tibial nerve compression test for the diagnosis of lumbar spinal canal stenosis—a simple and reliable physical examination for use by primary care physicians. Acta Orthop Traumatol Turc. 2018;52(1):12–6.

    Article  PubMed  PubMed Central  Google Scholar 

  70. 70.

    Alqarni AM, Manlapaz D, Baxter D, Tumilty S, Mani R. Test procedures to assess somatosensory abnormalities in individuals with back pain: a systematic review of psychometric properties. Phys Ther Rev. 2018;23(3):178–96.

    Article  Google Scholar 

  71. 71.

    Esmailiejah AA, Abbasian M, Bidar R, Esmailiejah N, Safdari F, Amirjamshidi A. Diagnostic efficacy of clinical tests for lumbar spinal instability. Surg Neurol Int. 2018;9:17.

    Article  PubMed Central  Google Scholar 

  72. 72.

    Abbott JH, McCane B, Herbison P, Moginie G, Chapple C, Hogarty T. Lumbar segmental instability: a criterion-related validity study of manual therapy assessment. BMC Musculoskelet Disord. 2005;6(1):1–0.

    Article  Google Scholar 

  73. 73.

    Telli H, Telli S, Topal M. The validity and reliability of provocation tests in the diagnosis of sacroiliac joint dysfunction. Pain Physician. 2018;21(4):E367–76.

    Article  Google Scholar 

  74. 74.

    Stolz M, von Piekartz H, Hall T, Schindler A, Ballenberger N. Evidence and recommendations for the use of segmental motion testing for patients with LBP–A systematic review. Musculoskeletal Sci Pract. 2020;45:102076.

    Article  Google Scholar 

Download references


The authors acknowledge and thank Mrs. Anne Taylor-Vaisey, librarian, for her suggestions on and review of the search strategy. This research was undertaken, in part, thanks to funding from the Canada Research Chairs program to Dr. Pierre Côté, Canada Research Chair in Disability Prevention and Rehabilitation at the University of Ontario Institute of Technology.


This study was funded by the Association Française de Chiropraxie in France. This association was not involved in the collection of data, data analysis, interpretation of data, or drafting of the manuscript.

Author information




PC and NL developed the research question. KM developed the search strategy. NL, PN, ALM, and HY screened citations and full-text articles and assessed them for risk of bias. PN, NL, and HY extracted the data and built the evidence tables. DS performed the statistical analysis. PN, NL, HY, DS, PC, and VK wrote and edited the manuscript. All authors approved the final manuscript as submitted and agree to be accountable for all aspects of the work.

Corresponding author

Correspondence to Paul S. Nolet.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Nolet, P.S., Yu, H., Côté, P. et al. Reliability and validity of manual palpation for the assessment of patients with low back pain: a systematic and critical review. Chiropr Man Therap 29, 33 (2021).



  • Manual palpation
  • Reliability
  • Validity
  • Assessment
  • Low back pain
  • Systematic review