Reliability and validity of manual palpation for the assessment of patients with low back pain: a systematic and critical review

Abstract Background Static or motion manual palpation of the low back is commonly used to assess pain location and reproduction in low back pain (LBP) patients. The purpose of this study is to review the reliability and validity of manual palpation used for the assessment of LBP in adults. Method We systematically searched five databases from 2000 to 2019. We critically appraised internal validity of studies using QAREL and QUADAS-2 instruments. We stratified results using best-evidence synthesis. Validity studies were classified according to Sackett and Haynes. Results We identified 2023 eligible articles, of which 14 were low risk of bias. Evidence suggests that reliability of soft tissue structures palpation is inconsistent, and reliability of bony structures and joint mobility palpation is poor. We found preliminary evidence that gluteal muscle palpation for tenderness may be valid in differentiating LBP patients with and without radiculopathy. Conclusion Reliability of manual palpation tests in the assessment of LBP patients varies greatly. This is problematic because these tests are commonly used by manual therapists and clinicians. Little is known about the validity of these tests; therefore, their clinical utility is uncertain. High quality validity studies are needed to inform the clinical use of manual palpation tests. Supplementary Information The online version contains supplementary material available at 10.1186/s12998-021-00384-3.


Introduction
Low back pain (LBP) is the most prevalent musculoskeletal condition in the general population [1,2]. The point prevalence of LBP ranges from 1 to 58.1% and the one-year prevalence ranges from 0.8 to 82.5% [3], depending on the LBP definition and population. LBP is the leading cause of years lived with disability and the sixth leading cause of disability-adjusted life years globally [4,5]. It is associated with poor health-related quality of life and imposes a substantial economic burden on society [6,7]. Non-specific LBP, which cannot be attributed to a specific underlying pathology, is more common than specific LBP (e.g., cancer, fractures, infectious disorders, or ankylosing spondylitis) [8].
The clinical assessment of low back pain involves completing a physical examination [9]. Manual palpation is a common tool used to assess patients with LBP [10]. It includes static and dynamic palpation of soft tissue or joints and aims to identify painful structures and biomechanical dysfunction of the spine [11]. However, the clinical utility of these tests is controversial.
Previous systematic reviews have investigated the reliability and validity of manual palpation for the assessment of patients with LBP [9,11-13]. According to these reviews, the inter-rater reliability of static joint and soft-tissue palpation to locate pain is poor (kappa (k) ≤ 0.40), and the inter-rater reliability of static palpation for soft tissue changes (e.g., tension) is inconsistent [9,11,13]. Furthermore, one review reported that motion palpation may be valid in detecting decreased motion, or lack of end-play, in the lumbar spine [12]. However, motion palpation may not be valid for detecting aberrant motion of the sacroiliac joints [12]. These reviews are outdated and there is a need for an up-to-date systematic review. The purpose of our systematic review was to determine the reliability and validity of manual palpation used to assess adult patients with LBP.

Eligibility criteria

Population
We included studies of adults (≥18 years) with LBP. LBP refers to pain or discomfort below the costal margin and above the inferior gluteal folds and can be with or without referred leg pain [14]. Our systematic review includes patients with non-radicular low back pain, radicular low back pain, spinal stenosis, degenerative or isthmic spondylolisthesis, and failed back surgery syndrome.

Definitions
Our review focuses on studies assessing the reliability or validity of manual palpation for the assessment of patients with LBP. Reliability describes the consistency of measurements across people or instruments [15]. Validity is the degree to which a test measures what it is intended to measure [15].
Manual palpation is a diagnostic procedure in which the examiner feels with their hands to assess the mobility and state of the soft and bony tissues [16]. Palpation techniques include both static and dynamic (motion) methods, which are often used to identify areas of tissue pain and dysfunction, target manual and manipulative therapies, and determine the effectiveness of the intervention [9]. Static palpation is used to identify asymmetry of bony landmarks, tender points, and trigger points, and to evaluate tissue texture, temperature and tone [17]. Motion palpation is used to assess the quantity and quality of movement through the lumbar spine and pelvis [17]. Motion palpation assessment can be continuous within the normal range of motion with joint play, dynamic soft tissue palpation, or end range assessment for end-feel or joint springing [17]. Palpation involving devices such as pressure algometry was excluded.

Outcomes
We aimed to evaluate clinical outcomes assessed by palpation. Outcomes include pain, segmental mobility and stiffness for static joint palpation; joint movement and position for motion joint palpation; and pain, tenderness, trigger points and muscle contraction for static soft tissue palpation.

Study characteristics
Eligible studies met the following inclusion criteria: 1) English or French language; 2) published in peer reviewed journals between January 1, 2000 and July 11, 2019; 3) assessing the reliability or validity of manual palpation. Previously published systematic reviews on this topic were included in our review; comparing our systematic review with these previous reviews allowed us to examine findings of studies published before 2000. We excluded: 1) letters, guidelines, editorials, commentaries, unpublished manuscripts, dissertations, reports, book chapters, conference proceedings and abstracts, lectures, addresses, and consensus statements; 2) cadaveric and animal studies; 3) literature reviews and case studies; 4) studies targeting individuals with serious pathology (e.g., fractures, dislocations, systemic disease, myelopathy, neoplasm and infection); and 5) studies with sample size < 20 per group.

Search strategy and data sources
The search strategy was developed in consultation with a health sciences librarian, and a second librarian was consulted to ensure accuracy and completeness using the Peer Review of Electronic Search Strategies (PRESS) checklist [18]. We systematically searched the following electronic databases: MEDLINE, CINAHL, PubMed, Cochrane Central Register of Controlled Trials, and SPORTDiscus. Search terms consisted of subject headings specific to each database (e.g., MeSH in MEDLINE) and free text words relevant to LBP, diagnosis, reliability, validity, and palpation (Additional file 1).

Study selection
Identified citations were exported into EndNote for reference management and tracking of the screening process. We screened articles in two stages. In stage one, titles and abstracts were screened for their relevance by pairs of independent reviewers (NL, PN, ALM). Stage two involved screening the full text article of all possibly relevant citations from stage one. Disagreements on screening stages were discussed between reviewers to reach consensus. When consensus could not be reached, a third reviewer independently screened the citation and discussed with the two reviewers to reach consensus.

Assessment of risk of Bias
Three reviewers (NL, PN, ALM) critically appraised all relevant studies (Tables 1 and 2). They used the modified Quality Appraisal Tool for Studies of Diagnostic Reliability (QAREL) [33] criteria to assess the internal validity of the diagnostic reliability studies, and the modified Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) [34] criteria to assess diagnostic accuracy/validity studies (Additional files 2 and 3). The original QAREL and QUADAS-2 instruments were modified to include: 1) not applicable options; 2) a question regarding the clarity of the study objective; and 3) the Sackett and Haynes classification (phases of validity studies, in the QUADAS-2 instrument). If a study was judged as "low" on all domains relating to bias or applicability, it was appropriate to assign an overall judgment of "low risk of bias" or "low concern regarding applicability" for that study. If a study was judged "high" or "unclear" on one or more domains, it may be judged "at risk of bias" or as having "concerns regarding applicability" [33,34]. We included low risk of bias studies in our best evidence synthesis.
Validity studies with low risk of bias were classified into one of four phases of investigation following the recommendation of Sackett and Haynes [35]. Phase I studies aim to determine whether test results differ between LBP patients and healthy controls. This information is useful to justify Phase II studies. Phase II studies aim to determine whether patients with a positive palpation result are more likely to have decreased function, severe disability or structural changes (e.g., spinal stenosis) than patients with a negative result. Phase I and II studies provide preliminary evidence that a test should be tested in phase III studies. On their own, results from phase I and II studies cannot be used to confirm the validity of tests; however, according to the Sackett and Haynes classification, they justify further investigation of a test. Phase III studies aim to determine whether a test result can distinguish between LBP patients with and without suspected conditions (e.g., radiculopathy). Finally, Phase IV studies aim to determine whether patients who undergo a manual palpation test have a better prognosis than similar patients who were not tested [35]. Phase IV studies are a distinct type of study and differ from the phase I-III studies that examine diagnostic accuracy. The risk of bias of phase IV studies would be assessed using the Scottish Intercollegiate Guidelines Network (SIGN) criteria [36].
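The overall judgment rule described in this section (a study is "low risk of bias" only when every domain is rated "low") can be illustrated with a minimal Python sketch. The function name, the domain names and the decision to skip "not applicable" domains are assumptions made for illustration; they are not taken from the QAREL or QUADAS-2 instruments. The sketch also treats "may be judged at risk" as a firm rule, which is a simplification.

```python
from typing import Mapping

# Ratings accepted by this sketch; "not applicable" mirrors the option
# added to the modified instruments (how it is handled here is an assumption).
ALLOWED = {"low", "high", "unclear", "not applicable"}


def overall_judgment(domain_ratings: Mapping[str, str]) -> str:
    """Return "low" only if every applicable domain is rated "low"."""
    cleaned = [rating.strip().lower() for rating in domain_ratings.values()]
    unknown = set(cleaned) - ALLOWED
    if unknown:
        raise ValueError(f"unrecognized rating(s): {sorted(unknown)}")
    # Assumption: domains marked "not applicable" do not count for or against.
    applicable = [rating for rating in cleaned if rating != "not applicable"]
    if not applicable:
        raise ValueError("no applicable domains to judge")
    return "low" if all(rating == "low" for rating in applicable) else "at risk"


# Hypothetical study with one unclear domain.
example = {
    "rater blinding": "low",
    "order of examination": "unclear",
    "statistical analysis": "low",
}
print(overall_judgment(example))  # prints "at risk"
```

A single "high" or "unclear" rating is enough to move a study out of the low risk of bias group, which is why only a subset of appraised studies enters a best-evidence synthesis.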

Data extraction and synthesis of results
One reviewer (PN) extracted data from low risk of bias studies and built evidence tables (Tables 3 and 4); and two reviewers (NL or HY) verified the accuracy and completeness of the data extraction. The reliability and validity studies were stratified according to targeted body structures (joint or soft tissue), technique (static or motion palpation), and clinical outcome (pain provocation, mobility, or stiffness). We used qualitative synthesis to synthesize the best evidence [37]. Eligible statistics include 1) means, median and/or percent in phase I studies; 2) correlations, sensitivity, specificity, positive predictive value, negative predictive value and/or likelihood ratio in phase II or III studies; and 3) prevalence in phase III studies.
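As an illustration of how the accuracy statistics listed above relate to one another, the following Python sketch derives them from a 2x2 table and from a sensitivity/specificity pair. The function names and the 2x2 counts are hypothetical and are not data from any included study. The second call uses the sensitivity (38%) and specificity (45%) reported in this review for posterior to anterior palpation for stiffness [31].

```python
def accuracy_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Diagnostic accuracy statistics from a 2x2 table (index test vs reference)."""
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    return {
        "sensitivity": tp / (tp + fn),  # positives detected among those with the condition
        "specificity": tn / (tn + fp),  # negatives detected among those without it
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "prevalence": (tp + fn) / (tp + fp + fn + tn),
    }


def likelihood_ratios(sensitivity: float, specificity: float) -> tuple:
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    if not (0 < specificity < 1 and 0 <= sensitivity <= 1):
        raise ValueError("need 0 <= sensitivity <= 1 and 0 < specificity < 1")
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical counts, for illustration only.
stats = accuracy_from_counts(tp=30, fp=10, fn=20, tn=40)

# Values reported in this review for PA stiffness palpation [31].
lr_pos, lr_neg = likelihood_ratios(0.38, 0.45)
print(round(lr_pos, 2), round(lr_neg, 2))  # prints 0.69 1.38
```

The printed likelihood ratios agree with the values reported for that study. A positive likelihood ratio below 1 means a positive palpation finding slightly lowers, rather than raises, the odds of the target condition.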
We did not use an arbitrary classification to report the strength of reliability or validity findings. Such classifications use arbitrary cut-points that do not take into account the level of misclassification that may be acceptable in a specific context. Rather, we reported the values of kappa coefficients, sensitivity, specificity, etc. The authors interpreted the kappa and measurement errors according to the clinical settings and purposes of the palpation tests in their context. As a rough guide, kappa scores of < 0.6 are considered to indicate no, minimal or weak agreement, and kappa scores of > 0.6 to indicate moderate, strong or almost perfect agreement [38].

Statistical analyses
We computed kappa coefficients (k) and 95% confidence intervals (CI) to determine the inter-rater reliability of our screening methodology of articles. We computed the percentage agreement between reviewers for the classification of articles into high or low risk of bias.
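A minimal Python sketch of these two agreement measures is given below, assuming two reviewers who each assign one label per article. The screening decisions are invented for illustration, and the confidence interval uses a simple large-sample approximation of the standard error, which is an assumption here and not necessarily the method used in this review.

```python
import math
from collections import Counter


def percent_agreement(rater_a: list, rater_b: list) -> float:
    """Proportion of items on which the two raters gave the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa_ci(rater_a: list, rater_b: list, z: float = 1.96) -> tuple:
    """Cohen's kappa with an approximate confidence interval (default 95%)."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's marginal proportions.
    expected = sum(counts_a[k] * counts_b[k] for k in set(counts_a) | set(counts_b)) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1 - expected)
    # Simple large-sample standard error (an approximation).
    se = math.sqrt(observed * (1 - observed) / (n * (1 - expected) ** 2))
    return kappa, max(-1.0, kappa - z * se), min(1.0, kappa + z * se)


# Hypothetical screening decisions for ten articles.
reviewer_1 = ["in", "in", "in", "out", "out", "out", "in", "out", "in", "out"]
reviewer_2 = ["in", "in", "out", "out", "out", "out", "in", "out", "in", "in"]

agreement = percent_agreement(reviewer_1, reviewer_2)
kappa, lower, upper = cohen_kappa_ci(reviewer_1, reviewer_2)
print(round(agreement, 2), round(kappa, 2))  # prints 0.8 0.6
```

In this invented example the reviewers agree on 80% of articles, but kappa is 0.6 because half of that agreement would be expected by chance. This is why kappa, not raw agreement, is the usual statistic for reliability.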

Reporting
This review complies with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Additional file 4) [39]. The Statement for Reporting Studies of Diagnostic Accuracy (STARD) was used to inform the critical appraisal with the QAREL and QUADAS-2 [40].
We critically appraised 16 articles; 14 had low risk of bias and were included in our evidence synthesis [19-32] (Fig. 1).
Prone instability test: performed in two parts. 1) Relaxation phase: the subject lay prone on the examination table with feet on the floor, and the examiner performed PA mobility testing to identify painful lumbar segments with the subject's muscles relaxed. 2) Co-contraction phase: the subject then raised their feet off the floor. The test is considered positive if pain identified in the relaxation phase subsides during the co-contraction phase. PA glide test: subjects lay prone and the examiner performed PA glide on the lumbar spinous processes. Lack of segmental hypomobility is considered a positive test. Examiners: two physical therapists certified as Orthopaedic Clinical Specialists. Time between inter-rater assessments: at least 15 min. Inter-rater reliability, prone instability test for pain (relaxation phase): k = 0.41.
Palpation for the spinal level contributing most to the patient's LBP symptoms (abnormal end-feel, abnormal quality of resistance to motion, and reproduction of pain, local or referred): patient prone, posterior to anterior pressure applied to the spinous process, with verbal communication between examiner and patient about reproduction of pain. Examiners: three pairs of manipulative physiotherapists with 7-15 yrs. experience and ≥ 3 yrs. experience after postgraduate qualifications in manipulative physiotherapy. Time between inter-rater assessments unknown. Low back pain without radiation of pain past the knee, symptom duration unknown, 20 to 66 yrs. old.
Prone instability test: the subject lies prone on the examination table with their feet on the floor. The examiner performs passive intervertebral motion testing for pain. The subject then lifts their feet off the floor. A positive test is when pain provoked during the first part of the test disappears when the legs are lifted. Passive intervertebral motion testing: with the subject lying prone, the examiner applies PA pressure with their hypothenar eminence on each lumbar spinous process. Segmental mobility is judged as normal mobility, hypomobility (less motion than normally expected) or hypermobility (more motion than normally expected). Pain provocation is judged as manual pressure producing pain or not producing pain. Examiners: 4 physical therapists with at least 2 yrs. experience, placed in 3 separate pairs. Time between inter-rater assessments: at least 15 min.
Palpation for lumbar segmental mobility, pain provocation and prone instability, patient prone: 1) prone mobility testing: posterior to anterior joint springing palpation by examiners of the SIJs, all lumbar spinous processes and all lumbar facet joints bilaterally; normal or restricted mobility was noted; 2) prone pain provocation testing: the patient reports pain or discomfort provoked while the prone mobility test is repeated; 3) prone instability test. Chronic LBP, ≥3 months duration, ≥60 yrs. old.
Palpation of the SI joints and lumbar spinous processes to identify pain: 1) SI joints: patient standing on floor with shoes removed, examiner standing behind patient exerts firm pressure over the sacroiliac joint, palpating the right joint with the right thumb while standing to the left side of the patient; 2) lumbar spinous processes: examiner behind patient firmly palpates spinous processes L1-L5 using the dominant thumb. The examiners underwent training in the protocol with an expert physical therapist to refine and standardize the physical examination procedures. Time between intra-rater assessments: < 5 min.
Multifidus lift test to identify lumbar multifidus contraction: participants prone and contralateral arm lifted with/without a hand weight while the multifidus muscle is palpated immediately lateral and adjacent to the interspinous space of L4-L5 and L5-S1. Examiners: > 10 yrs. clinical experience and approximately 5 yrs. research experience. Time between intra-rater assessments unknown.
The eleven reliability studies with low risk of bias examined the inter-rater reliability of manual palpation to assess joint mobility or motion [19-21,23,26,27], pain [19,21,23-26,28,29] and muscle contraction [22]. Two of the eleven studies also examined the intra-rater reliability of manual palpation assessing joint motion [20] and muscle tenderness [24]. The six validity studies included one phase I study on palpation of joints and muscles to assess pain [29]; four phase II studies on palpation of nerves to elicit pain [28], spinal stiffness [31], muscle contraction [22] and sacroiliac joint motion [32]; and one phase III study on palpation of gluteal muscle for tenderness and pain [30].
We did not perform a meta-analysis because of the heterogeneity of studies in symptom duration, palpation technique, and outcome specification.

Assessment of risk of Bias
Tables 1 and 2 show the risk of bias for scientifically admissible reliability and validity studies based on the modified QAREL and QUADAS-2 criteria, respectively.
The low risk of bias studies met the following criteria: 1) clearly described objective; 2) representative sample; 3) representative raters; 4) blinding of the test results between raters; 5) appropriate and valid standard test; and 6) appropriate statistical analysis (Tables 1 and 2). However, these studies had the following limitations: 1) unclear time interval between tests (n = 1) [27]; 2) no blinding for intra-examiner reliability (n = 2) [20,24]; 3) 30 min rest period between repeat testing by the same examiner and no blinding to clinical information (n = 2) [27,28]; 4) unclear blinding to clinical information or additional clues [23]; 5) no blinding to clinical information and unclear blinding to additional clues (n = 8) [21-25,27-29]; and 6) non-random or unclear administration of tests (n = 5) [19,23,27-29]. Most validity studies had appropriate exclusion criteria and blinding. However, validity studies had limitations: 1) four studies did not use a consecutive or random sample [22,29,31,32]; 2) two studies were unclear as to whether an appropriate time interval between tests was used [28,31]; and 3) one study was unclear as to [...]
Palpation of the SI joints and lumbar spinous processes to identify pain: 1) SI joints: patient standing on floor with shoes removed, examiner standing behind patient exerts firm pressure over the sacroiliac joint, palpating the right joint with the right thumb while standing to the left side of the patient, repeated on the other side; 2) lumbar spinous processes: examiner behind patient firmly palpates spinous processes L1-L5 using the dominant thumb. The examiners underwent training in the protocol with an expert physical therapist to refine and standardize the physical examination procedures.
The foot is rested on the unaffected knee. A positive test is when buttock or groin pain below L5 is reproduced.
Resisted abduction test: the subject is supine with the leg fully extended and abducted to 30°. The examiner holds the ankle and pushes medially while the subject pushes laterally. The test is positive when familiar pain is produced over the SIJ below L5.
Lumbar multifidus muscle thickness measures at the L4-L5 and L5-S1 spinal levels, at rest and during submaximal contraction with contralateral arm lift, using brightness-mode real-time ultrasound imaging. Examiner: 1 clinician with 5 years of ultrasound experience.
Two validity studies were excluded after critical appraisal. Abbott et al. used flexion/extension radiographs as a reference standard without establishing the test-retest reliability of patient positioning when taking the radiographs [72]. Telli et al. did not use blinding in their reliability study [73].

Motion palpation
We found inconsistent evidence in support of the reliability of motion palpation of the lumbar spine and SI joints to assess joint motion [20,27]. The reliability of motion palpation of the sacroiliac joint varied (inter-rater reliability 0.14 ≤ k ≤ 0.75; intra-rater reliability 0.23 ≤ k ≤ 0.73) (Table 3) [20,27]. Tong et al. (2006) suggested that sacral position cannot be reliably assessed during trunk motion using the sacral base position test (inter-rater reliability: flexion k = 0.37, extension k = 0.05) [27].
Palpation of the lumbar paraspinal muscles and piriformis muscles to identify pain: 1) paralumbar muscles: patient standing on floor with shoes removed, examiner stands behind and to the left side of the patient and braces the patient in front with the left arm; palpate the full extent of the right paravertebral musculature with the right thumb, exerting approximately 4 kgf; 2) piriformis: patient supine flexes the right hip and knee, keeping the sole of the foot on the table; cross the bent leg over the opposite leg, again place the sole on the table, and exert mild medially directed pressure on the lateral aspect of the knee to put the piriformis in stretch; exert firm pressure (4 kg) over the middle extent of the piriformis. The examiners underwent training in the protocol with an expert physical therapist to refine and standardize the physical examination procedures.

Reliability of soft tissue palpation
Static palpation
We found varying levels of reliability for the palpation of the soft tissue structures associated with low back pain [22,24,28,29]. The inter-rater reliability ranged from k = 0.80 for sciatic nerve pain, to 0.51 ≤ k ≤ 0.68 for gluteal tender points and k = 0.34 for lumbar paraspinal muscle pain [24,28,29]. One study suggested that the multifidus muscle can be reliably assessed by examiners who believe they are palpating the multifidus muscle for abnormal isometric contraction by palpating lateral and adjacent to the interspinous space of L4-L5 and L5-S1 with contralateral arm raising both with and without using hand weights (inter-rater reliability 0.75 ≤ k ≤ 0.81) [22]. It is possible that the multifidus lift test is also palpating a more superficial muscle which raises questions about the validity of this test.

Validity of joint and bony structure palpation
Static palpation
Two studies investigated the validity of static joint palpation [29,31]. One phase I study found that pain elicited by palpation of the SI joints and lumbar spinous processes was more common in LBP patients than in healthy controls [29]. One phase II study reported that posterior to anterior palpation used to identify stiffness from L1-L5 had a sensitivity of 38% (95% CI 21-59%), a specificity of 45% (95% CI 28-62%), a positive likelihood ratio of 0.69 (95% CI 0.37-1.31) and a negative likelihood ratio of 1.38 (95% CI 0.82-2.33) when compared to a mechanized indentation device [31] (Table 4).
[29] Palpation to identify pain: The examiner is behind the patient and firmly palpates the spinous processes of L1-L5 using their dominant thumb. A positive test is pain on palpation.
Passive intervertebral motion tests (Hicks et al. 2003 [23]): Lumbar palpation for segmental mobility and pain. With the subject lying prone, the examiner applies PA pressure with their hypothenar eminence on each lumbar spinous process. Segmental mobility is judged as normal, hypomobile or hypermobile. Pain provocation is judged as pressure producing pain or not producing pain.
Posterior/Anterior Glide Test (Alyazedi et al. 2015 [19]): Palpation to identify lumbar spinal mobility. Subjects lie prone and the examiner performs PA glide on the lumbar spinous processes. Lack of segmental hypomobility is considered a positive test.
[29] Palpation of the sacroiliac joints for pain: The patient stands on the floor with shoes removed and the examiner stands behind the patient. The examiner exerts firm pressure over the sacroiliac joint, palpating the right joint with the right thumb while standing to the left side of the patient and the left joint with the left thumb while standing to the right of the patient. A positive test is the patient reporting pain in the back.
Spinous Palpation for Stiffness (Koppenhaver et al., 2014 [31]): Joint springing of the lumbar spinous process. The spinous processes of L1-L5 are palpated with the subject lying prone. The participant was asked to relax as a posterior to anterior (PA) force was applied. Each vertebral segment was judged to be hypermobile, hypomobile or of normal mobility.

Motion Joint Palpation
Gillet Test (Arab et al., 2009 [20]; Soleimanifar et al., 2017 [32]): Palpation for movement at the PSIS while the patient raises the knee. The subject is standing with the examiner palpating the PSIS as the subject raises that knee toward their chest. A positive test is when the PSIS on the side of the knee flexion does not move, moves posterior-inferiorly only minimally, or even paradoxically moves superiorly.
Sacral Base Position Test (Tong et al., 2006 [27]): Palpation of the sacral base for position while the patient flexes then extends their spine. The subject is sitting; the evaluator palpates the sacral base with the subject's trunk forward flexed and backward flexed. A positive test is when one side of the sacrum is more anterior or posterior than the other side during the spine motions.
Seated Flexion Test (Tong et al., 2006 [27]): Bilateral palpation for cephalad movement at the PSIS while the patient forward bends the spine. The evaluator palpates both PSISs. As the subject bends forward, the evaluator's thumbs follow the motion of the PSIS cephalad. If one side moves more cephalad than the other side by more than 1 cm, the side that moves more is considered abnormal.
Sitting Flexion Test (Arab et al., 2009 [20]; Soleimanifar et al., 2017 [32]): Palpation for movement at the PSIS while the patient forward bends the spine. The subject is sitting and the examiner palpates the PSIS as the subject bends forward. A positive result in this test indicates limited movement of the sacrum on the ilium.
Standing flexion test [27]: Palpation for movement at the PSIS while the patient forward bends the spine. The subject is standing and the examiner palpates the PSIS as the subject bends forward. A positive result in a standing flexion test indicates limited movement of the ilium on the sacrum.
Motion palpation
One phase II study investigated the validity of joint motion palpation tests for the sacroiliac joints [32]. The authors examined the relationship between sacroiliac tests for joint motion (Gillet test, sitting flexion test and standing flexion test) and sacroiliac pain provocation tests (Faber test, thigh thrust test and resisted abduction test) but did not report statistics for validity (Table 4).

Validity of soft tissue palpation
Static palpation
Four studies investigated the validity of static soft tissue palpation [22,28-30]. One phase I study found that pain elicited by palpation of the lumbar paraspinal and piriformis muscles was more common in LBP patients than in people without LBP [29]. A phase II study tested the validity of the multifidus lift test, with and without hand weights, to identify abnormal isometric multifidus muscle contraction when compared to measurement of lumbar multifidus muscle thickness with real-time ultrasound imaging [22] (Table 4). The authors reported that the multifidus lift test correlates with ultrasound findings at the L4-5 level (r biserial correlation coefficients: 0.59 and 0.73 for the two hand-weight conditions) and is weakly associated at the L5-S1 level (r biserial correlation coefficients: 0.17 and 0.47) (Table 4) [22]. Another phase II study investigated the validity of sciatic nerve palpation for pain between the ischial tuberosity and the greater trochanter, using the straight leg raise and slump test as the reference standard to evaluate mechanosensitivity of the sciatic nerve [28]. The authors found that sciatic nerve palpation had a sensitivity of 85% (95% CI, 75-95%) and a specificity of 60% (95% CI, 46-74%) [28]. Finally, one phase III study investigated the validity of static palpation of the gluteal muscle for taut band, tenderness and pain recognition compared to an expert panel confirmation of radicular LBP (informed by MRI and electro-diagnostic testing). The authors reported that static palpation of the gluteal muscle had a sensitivity of 74.1% (95% CI, 67.7-80.3%) and a specificity of 91.4% (95% CI, 86.8-96.0%) in identifying radicular pain [30].
Palpation of the multifidus muscle for contraction while the patient lifts the contralateral arm: Participants prone and contralateral arm lifted with/without a hand weight while the multifidus muscle is palpated immediately lateral and adjacent to the interspinous space of L4-L5 and L5-S1. A test was judged as normal or abnormal lumbar multifidus contraction.
Palpation of Gluteal Muscle (Adelmanesh et al., 2016 [30]): Palpation of the superior-lateral quadrant of the gluteal muscle to identify GTrP, representing the combination of tenderness, taut band and pain. With the patient prone, the gluteal muscle was compressed with a flat thumb or index finger against the underlying tissue or bone. The points were considered GTrP when the combination of taut band, tenderness, and pain recognition was present.
Sciatic nerve palpation for pain: With the patient lying prone, they are asked if there is any pain or discomfort when the examiner applies gentle pressure at the sciatic nerve bilaterally at the midway point of a line from the ischial tuberosity to the greater trochanter of the femur. A positive test is pain or discomfort over the sciatic nerve.

Summary of results
We reviewed the reliability and validity of manual palpation used to assess patients with LBP. We retrieved eleven studies on the reliability of static and motion palpation of joint and soft tissue. Overall, the evidence suggests that static joint palpation is not reliable in identifying pain and segmental mobility of the lumbar facet joints, lumbar spinous processes and SI joints, or the location of the spinal level contributing to LBP symptoms. However, static soft tissue palpation may help reliably identify gluteal tender points, sciatic nerve pain, and multifidus contraction, but not lumbar paraspinal muscle pain. We identified six validity studies for the assessment of LBP using static joint, joint motion and soft tissue palpation. Gluteal muscle palpation for pain was able to help differentiate LBP patients with and without radiculopathy (phase III study). We found preliminary evidence for the validity of piriformis and lumbar paraspinal muscle palpation for pain (phase I study), spinous and sacroiliac joint palpation for pain (phase I study), sciatic nerve palpation for pain to identify mechanosensitivity of the sciatic nerve as determined by the straight leg raise and slump test (phase II study), and the multifidus lift test to help identify abnormal isometric contraction (phase II study); and evidence against posterior to anterior palpation used to identify stiffness at the L1-L5 spinal levels (phase II study). Sacroiliac joint motion tests were not associated with sacroiliac pain provocation tests (phase II study). Overall, very little knowledge is available to support the usefulness of lumbar and sacroiliac palpation tests when examining patients with low back pain.

Comparison with previous systematic reviews
The results of our systematic review differ from previous systematic reviews [9,11,13]. Our finding that static joint palpation of the spinous processes, facet and sacroiliac joints is not reliable to identify pain disagrees with previous systematic reviews [9,11,13]. Three reviews reported that the reliability of static joint palpation for pain was acceptable, but the kappa threshold used to reach this conclusion is low (k ≥ 0.4) [9,11,13]. Our review also disagrees with the previous finding by Stochkendahl et al. that static soft tissue palpation may help reliably identify soft tissue pain (k ≤ 0.4) [11]. With the inclusion of three recent studies [22,24,28], our review found inconsistent reliability in identifying soft tissue pain. The different conclusions may be due to different search strategies, new evidence, inclusion of small sample studies, use of self-developed checklists, or use of predefined cut-off points to differentiate low and high quality studies in the four systematic reviews. However, our results are consistent with a systematic review published in 2020 focusing only on segmental motion palpation [74]. That review reported poor evidence regarding the reliability and validity of segmental motion testing and concluded that clinical use of stand-alone tests cannot be recommended [74].

Strengths and limitations
Our systematic review has several strengths. First, our comprehensive search strategy of multiple databases was developed by a health sciences librarian in consultation with content experts and was then reviewed by an independent health sciences librarian using the PRESS Checklist [18]. Second, we used detailed, predefined inclusion and exclusion criteria to capture a broad range of possibly relevant citations. Third, we used paired independent reviewers to screen and critically appraise citations to minimize bias and error. The critical appraisal was completed by trained reviewers using standardized quality assessment tools (QAREL/QUADAS-2). Fourth, bias in reported results was minimized by performing a best-evidence synthesis that included only high-quality studies. Finally, we only included studies that tested subjects with LBP. This makes our results more generalizable to the patients seen by practitioners in clinical practice.

Our review also has limitations. First, our search was limited to studies published in English or French. It is possible that relevant studies in other languages were excluded. Second, our search may not have retrieved all relevant studies, although our search strategy was comprehensive and the search was conducted in multiple major medical databases. Third, our search was limited to studies published from 2000 to 2019. Fourth, it is possible that individual differences in scientific judgment resulted in varied critical appraisal outcomes among reviewers. This bias was minimized by training reviewers in the use of the standardized assessment tools and by using a consensus process to determine the internal validity of studies. Finally, studies examining motion palpation tests had smaller sample sizes (validity studies n = 50; reliability studies n = 49) than studies of static joint or muscle palpation. This may have limited the precision of the results and led to uncertainty in our assessment of motion palpation tests.

Clinical implications
Our review found very little evidence for the use of manual palpation to assess low back pain patients. Manual palpation tests suffered from misclassification error in that they were unable to differentiate subjects with LBP from subjects without LBP. Soft tissue palpation of the sciatic nerve and gluteal muscles for pain, and of the multifidus muscle for isometric contraction, was reliable but has not been tested sufficiently for validity for use in clinical practice. However, we did find that gluteal muscle palpation of trigger points and taut bands is valid to differentiate LBP patients with and without radiculopathy in a clinical setting. We found very limited evidence to support the use of joint palpation, and clinicians should reconsider its diagnostic value when assessing patients with low back pain.

Conclusion
We synthesized the evidence on the reliability and validity of manual palpation to assess adults with LBP. The evidence does not support the reliability of joint palpation, but static soft tissue palpation may be reliable for some structures. There is little evidence on joint motion palpation used in LBP patients. Gluteal muscle palpation for pain was able to differentiate LBP patients with and without radiculopathy (phase III study). We found preliminary evidence from phase I and II validity studies for some palpation tests. High quality phase III and IV validity studies are required to understand the diagnostic value of manual palpation tests in the assessment of adults with LBP. Clinicians must reconsider the usefulness of these tests when examining patients.