
A literature review of clinical tests for lumbar instability in low back pain: validity and applicability in clinical practice



Several clinical tests have been proposed for low back pain (LBP), but their usefulness in detecting lumbar instability is not yet clear. The objective of this literature review was to investigate the clinical validity of the main clinical tests used for the diagnosis of lumbar instability in individuals with LBP and to verify their applicability in everyday clinical practice.


We searched for studies of the accuracy and/or reliability of the Prone Instability Test (PIT), Passive Lumbar Extension Test (PLE), Aberrant Movements Pattern (AMP), Posterior Shear Test (PST), Active Straight Leg Raise Test (ASLR) and Prone and Supine Bridge Tests (PB and SB) in the Medline, Embase, Cinahl, PubMed, and Scopus databases. A test was considered eligible only if it had been investigated by at least one study of its accuracy and at least one study of its reliability. The quality of the studies was evaluated with the QUADAS and QAREL tools.


Six papers considering 333 LBP patients were included. The PLE was the most accurate and informative clinical test, with high sensitivity (0.84, 95% CI: 0.69 - 0.91) and high specificity (0.90, 95% CI: 0.85 - 0.97).

The diagnostic accuracy of AMP depends on each individual test. The PIT and the PST demonstrated fair to moderate sensitivity and specificity [PIT sensitivity = 0.71 (95% CI: 0.51 - 0.83), PIT specificity = 0.57 (95% CI: 0.39 - 0.78); PST sensitivity = 0.50 (95% CI: 0.41 - 0.76), PST specificity = 0.48 (95% CI: 0.22 - 0.58)].

The PLE showed good reliability (k = 0.76), but this result comes from a single study. The inter-rater reliability of the PIT ranged from slight (k = 0.10 and 0.04) to good (k = 0.87).

The inter-rater reliability of the AMP ranged from slight (k = −0.07) to moderate (k = 0.64), whereas the inter-rater reliability of the PST was fair (k = 0.27).


The data from the studies provided information on the methods used and suggest that the PLE is the most appropriate test to detect lumbar instability in specific LBP. However, due to the lack of available papers on other lumbar conditions, these findings should be confirmed with studies on non-specific LBP patients.


Low back pain (LBP) is a growing health problem in the industrialized world. Despite the high medical expenses required for its management, the prevalence of LBP is increasing [1]. LBP is a heterogeneous condition, and the identification of different sub-groups could help the management decisions [2,3]. One of these sub-groups is lumbar segmental instability [4,5].

The radiologically determined instability is characterized by a loss of passive integrity, causing excessive vertebral translation or rotation. The maximum lumbar flexion-extension radiographs in standing position are considered to be a reference standard to detect the function of the passive stabilization system [6,7]. This imaging method is commonly used to evaluate lumbar segmental mobility in isthmic and degenerative spondylolisthesis and degenerative disc dysfunctions. The radiographic diagnosis of spondylolisthesis is considered to be one of the most efficient methods of identifying lumbar instability [8].

Some authors also apply the concept of instability to the so-called “clinical” or “functional” instability, in which neither a defect of the bony architecture of the lumbar spine nor excessive detectable translation or rotation is shown. However, poor trunk muscle function and/or insufficient motor control is believed to be a factor in abnormal inter-segmental movement and LBP [9-11]. Although this type of instability has not been sufficiently demonstrated as a clinical entity and cannot be measured against any gold standard, it is one of the most frequent fields of interest for chiropractors and manual therapists.

Clinicians have used several clinical tests to detect the spinal instability and/or the ability of the muscles to stabilize the lumbar spine [12]. Recently, some of these tests have been suggested in the “Clinical Practice Guidelines linked to the International Classification of Functioning, Disability and Health from the Orthopaedic Section of the American Physical Therapy Association”, to assess the impairments of body functions in LBP [5]. The most commonly used tests are the Prone Instability Test (PIT), the Passive Lumbar Extension (PLE) test, the Aberrant Movements Pattern (AMP), the Posterior Shear Test (PST), the Prone Bridge Test (PBT), the Supine Bridge Test (SBT), and the Active Straight Leg Raise Test.

Previous reviews separately investigated the diagnostic accuracy [13] or the reliability [14] of the instability tests, but a complete picture of their diagnostic validity in detecting lumbar instability is lacking. A single literature review on both the diagnostic accuracy (sensitivity, specificity and likelihood ratios) and the inter-rater reliability of these clinical tests does not exist. More specifically, a researcher could be interested in investigating the reliability of the tests that previously demonstrated sufficient face validity.

The objective of this literature review was to assess the methods used for diagnosis (primarily the accuracy with additional reporting of reliability of these tests) of the clinical tests for lumbar instability in individuals with LBP and investigate their applicability in daily practice.


This is a literature review of all the studies in the literature that evaluated clinical tests for the diagnosis of lumbar instability in individuals with LBP. PRISMA Guidelines [15] were followed during the design, search and reporting stages of this review on diagnostic test studies.

Literature search

A search of the relevant literature was performed from July 2012 to December 2013. A comprehensive search, limited to articles in English, Italian and Spanish, was conducted in the following databases: Medline, Embase, Cinahl, PubMed, Scopus. Diagnostic test studies regarding humans published between 1972 and December 2013 were included. Narrative or systematic reviews, guidelines and meta-analyses were excluded.

Two authors (SF and TM) independently performed two different and parallel searches to avoid leaving out relevant articles. The search strategies are shown in Figure 1.

Figure 1. Flow chart.

The results of these seven searches were unified into a single item set. From the results of the initial search, duplicate citations were removed, and then the titles, abstracts and full texts of the retrieved articles were independently evaluated for definitive inclusion. When the two reviewers were unable to reach a consensus, a third reviewer (CV) was consulted. In addition to the Internet-assisted search, references were pulled from a textbook on diagnostic accuracy of orthopedic clinical tests [16], and from reference lists of included studies. Finally, an independent hand search including scanning of reference lists from other systematic reviews [13,14] was performed.

Study selection

Several criteria were used to select eligible studies. Articles examining clinical tests for lumbar instability were included if they met the following criteria:

  1. Diagnostic accuracy studies on adult population with sub-acute or chronic LBP were considered if clinical instability tests were employed as index tests. Dynamic radiographs were the reference test to diagnose lumbar instability. The subject articles had to report data which would allow computation of parametric statistical tests of diagnostic accuracy [sensitivity, specificity, or positive and negative likelihood ratios (+LR and -LR)].

  2. Reliability studies on healthy or LBP adult population were considered if they concerned the use of clinical tests to diagnose lumbar instability by one or more clinicians. Articles had to report the parametric statistical tests of relationship or agreement.

  3. Finally, only the studies in which each test was investigated by at least one study concerning both the accuracy and the reliability were considered eligible.

Data extraction and quality assessment

One author (TM) gathered data regarding clinical tests, with its description and score, study population (e.g. age, gender, setting, clinical characteristics), inclusion and exclusion criteria, diagnostic reference standard, differences in operationalizing the index tests, study raters. Study results about sensitivity, specificity, LR+, LR-, and reliability were collected (or calculated, if included articles did not provide these data). Other authors (SF and FB) verified data extraction once completed. The methodological quality of included articles was independently assessed by 2 reviewers (TM and FB), using different tools for the 2 types of studies: the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool for diagnostic accuracy articles [17] and the Quality Appraisal of Reliability Studies (QAREL) checklist for diagnostic reliability articles [18].

Data synthesis and analysis

Kappa statistics were used to assess agreement between the 2 raters on article selection and on QUADAS and QAREL ratings [19]. The QUADAS and QAREL tools delineate essential items to be reported in diagnostic test studies (Table 1 and Table 2).

Table 1 QUADAS (Quality Assessment of Diagnostic Accuracy Study) tool results
Table 2 QAREL application results
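
As an illustration of the agreement statistic mentioned above, the following is a minimal Python sketch of Cohen's kappa for two raters. The function name `cohen_kappa` and the include/exclude decisions in the example are hypothetical; they are not data from this review, and the review does not state which software was used for the calculation.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who judged the same items (illustrative sketch)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must judge the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined,
        # and returning 1.0 here is a convention assumed for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical screening decisions on ten articles (not data from this review).
reviewer_1 = ["in", "in", "out", "out", "in", "out", "out", "out", "in", "out"]
reviewer_2 = ["in", "out", "out", "out", "in", "out", "in", "out", "in", "out"]
print(round(cohen_kappa(reviewer_1, reviewer_2), 2))
```

In this hypothetical example the raters agree on 8 of 10 articles, chance agreement is 0.52, and kappa is therefore (0.80 - 0.52) / (1 - 0.52), about 0.58.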

Concerning sensitivity and specificity, the acceptable levels were set between 50% (unacceptable test) and 100% (perfect test) [20]. The diagnostic accuracy was considered satisfactory, thus affecting the probability of lumbar instability, with +LR ≥ 2.0 or -LR ≤ 0.50 [21].
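
The likelihood-ratio criterion above can be sketched in Python as follows. This is an illustration only: it assumes the standard definitions +LR = sensitivity / (1 − specificity) and -LR = (1 − sensitivity) / specificity, the function names are hypothetical, and the input values are the point estimates quoted in this review for the PLE test, the PIT and the PST.

```python
import math


def likelihood_ratios(sensitivity, specificity):
    """Return (+LR, -LR) from sensitivity and specificity, both given as proportions."""
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    false_positive_rate = 1.0 - specificity
    false_negative_rate = 1.0 - sensitivity
    # Degenerate denominators are reported as infinity (an assumption of this
    # sketch) so that the function never divides by zero.
    positive_lr = sensitivity / false_positive_rate if false_positive_rate > 0 else math.inf
    negative_lr = false_negative_rate / specificity if specificity > 0 else math.inf
    return positive_lr, negative_lr


def is_satisfactory(positive_lr, negative_lr, plr_cutoff=2.0, nlr_cutoff=0.50):
    """Apply the review's criterion: +LR >= 2.0 or -LR <= 0.50."""
    return positive_lr >= plr_cutoff or negative_lr <= nlr_cutoff


# Point estimates (sensitivity, specificity) as quoted in this review.
reported = {"PLE": (0.84, 0.90), "PIT": (0.71, 0.57), "PST": (0.50, 0.48)}
for test_name, (sens, spec) in reported.items():
    plr, nlr = likelihood_ratios(sens, spec)
    print(f"{test_name}: +LR = {plr:.2f}, -LR = {nlr:.2f}, "
          f"satisfactory = {is_satisfactory(plr, nlr)}")
```

With these inputs the PLE test gives a +LR of about 8.4 and a -LR of about 0.18, so it meets the criterion, whereas the PST gives values close to 1 on both ratios and does not.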

Concerning reliability, the following criteria were used to determine the strength of the coefficients: ≤ 0.25 = little or no relationship; 0.26 – 0.50 = fair degree of relationship; 0.51 – 0.75 = moderate to good relationship; 0.76 – 1.00 = good to excellent relationship [22].
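
The bands above can be written as a small lookup, sketched here in Python. The function name `reliability_band` is hypothetical, and one assumption is made: a coefficient that falls between two printed bands (for example 0.255) is assigned to the higher band, since the criteria are stated only to two decimals.

```python
def reliability_band(coefficient):
    """Map a reliability coefficient to the descriptive bands listed above."""
    if not -1.0 <= coefficient <= 1.0:
        raise ValueError(f"coefficient must be between -1 and 1, got {coefficient}")
    if coefficient <= 0.25:
        return "little or no relationship"
    if coefficient <= 0.50:
        return "fair degree of relationship"
    if coefficient <= 0.75:
        return "moderate to good relationship"
    return "good to excellent relationship"


# Kappa values quoted in this review, used only as example inputs.
for label, k in [("PLE", 0.76), ("PST", 0.27), ("AMP, lowest", -0.07), ("AMP, highest", 0.64)]:
    print(f"{label}: k = {k:+.2f} -> {reliability_band(k)}")
```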


Figure 1 shows the process of study selection. Initial searching identified 773 citations. Following the first screening, 299 articles were excluded and 474 citations were retained for the second screening. After reviewing the titles, 446 were excluded and 28 were considered of interest; after reviewing the abstracts, 16 were retained and 13 were retrieved in full text. Using the inclusion and exclusion criteria, a further 7 articles were excluded. This study finally included 6 papers, considering 333 LBP patients, for the review [12,23-27].

Quality scores

Two of the 6 studies (33%) were identified as having high methodological rigor according to the QUADAS tool (Table 1). Table 2 shows the distribution of studies according to the scores obtained from the assessment of their methodological quality, following the QAREL tool.

Diagnostic accuracy of the tests

The diagnostic accuracy was investigated by 2 studies only: Fritz et al. [24] and Kasai et al. [25]. Four lumbar instability tests were considered: the PLE test, the PIT, the AMP, and the PST. The main characteristics of the studies on diagnostic accuracy are shown in Table 3, whereas Table 4 shows the results.

Table 3 Summary of the studies on diagnostic accuracy
Table 4 Results of diagnostic accuracy studies

Kasai et al. [25] found that the PLE test was the most accurate clinical test, with high sensitivity (0.84, 95% CI: 0.7 - 0.93) and specificity (0.90, 95% CI: 0.82 - 0.95), in a sample of subjects diagnosed with spinal stenosis or lumbar spondylolisthesis or lumbar degenerative scoliosis. The positive and negative LRs were informative.

The diagnostic accuracy of AMP depends on each individual test. Low sensitivity (0.26, 95% CI: 0.15 - 0.42) and good specificity (0.86, 95% CI: 0.77 - 0.92) were found by Kasai et al. [25] for the Instability Catch Sign. The Painful Catch Sign and the Apprehension Sign showed the same trend: low sensitivity (0.37, 95% CI: 0.24 - 0.54 and 0.18, 95% CI: 0.22 - 0.64 respectively) and good specificity (0.73, 95% CI: 0.61 - 0.8 and 0.88, 95% CI: 0.61 - 0.78 respectively). These tests are included in the AMP, also studied by Fritz et al. [24], who reported low sensitivity (0.18, 95% CI: 0.08 - 0.36) and high specificity (0.95, 95% CI: 0.77 - 0.99) for the AMP test in a cohort of patients with chronic LBP.

The article by Fritz et al. [24] is the only one that studied the diagnostic accuracy of the PIT and the PST. Both tests demonstrated fair to moderate diagnostic accuracy: PIT sensitivity = 0.71 (95% CI: 0.53 - 0.85) and specificity = 0.57 (95% CI: 0.37 - 0.76); PST sensitivity = 0.50 (95% CI: 0.34 - 0.66) and specificity = 0.48 (95% CI: 0.28 - 0.68).

Reliability of the tests

The reliability of the four clinical tests was studied in 5 papers [12,23,24,26,27]. The main characteristics of the studies on reliability and their results are shown in Table 5, whereas Table 6 shows the results in terms of inter-rater reliability.

Table 5 Summary of the articles on reliability
Table 6 Summary of results on reliability

The PLE test showed good inter-rater reliability (k = 0.76), but this result comes from a single study [12].

Five studies investigated the inter-rater reliability of the PIT. This reliability was considered fair by Schneider et al. [27] (k = 0.46) and Ravenna et al. [26] (k = 0.10 and 0.04), moderate by Fritz et al. [24] and Rabin et al. [12] (k = 0.69 and k = 0.67, respectively), and good by Hicks et al. [23] (k = 0.87).

The inter-rater reliability of the AMP was studied by Hicks et al. [23], Fritz et al. [24] and Rabin et al. [12]. Whereas Fritz et al. [24] found poor reproducibility (k = −0.07), Hicks et al. [23] (k = 0.60) and Rabin et al. [12] (k = 0.64) calculated moderate reliability. The inter-rater reliability of the Posterior Shear Test was studied only by Fritz et al. [24], who found poor reliability (k = 0.27).

Implications for clinical practice

The data from the studies provided information on the tests and methods used, the error of measurement and also the validity of the tests. However, only 5 studies (83.3%) provided information concerning the setting and the years of the raters' clinical experience, whereas all studies identified the person performing the assessment and his/her professional competence.


This literature review aimed to identify the most reliable findings concerning the assessment of the methods used for diagnosis with clinical tests for lumbar instability in LBP subjects.

Lumbar instability has traditionally been a matter of debate. Lumbar segmental instability in the absence of defects of the bony architecture of the lumbar spine has also been cited as a significant cause of chronic low back pain [5,28]. The differences between surgical instability criteria and “functional instability” criteria were defined by Panjabi [29] decades ago. Chiropractors and manual therapists are more interested in the loss of motor control than in the hypermobility detectable with flexion/extension radiological imaging, which is more useful to spine surgeons. However, the difficulty of clinically detecting abnormal or excessive inter-segmental motion often makes these tests insensitive and unreliable, which limits the clinical diagnosis of lumbar segmental instability [30,31]. The lack of studies in this field is also apparent from our search, which found many studies on the reliability of the tests used by clinicians but few on their accuracy. Although we are aware that this criterion may be too strict for manual therapists, we chose to be rigorous and took as our reference the best available standard (gold standard) for instability, namely dynamic X-rays. As a result, many other tests used in manual clinical practice to detect lumbar clinical instability (e.g. the active hip abduction test or the hip extension test) were not considered, because no study had investigated their accuracy. Because these tests are absent from this review, our study could ultimately be considered a literature review of the accuracy of lumbar clinical tests with additional reporting of reliability information.

Six high-quality studies were selected and four lumbar clinical instability tests (PLE test, PIT, AMP and PST) satisfied the inclusion criteria.


The characteristics of the samples of the 2 subject studies [24,25] cannot be considered accurate. Fritz et al. [24] studied a population whose majority had a prior history of LBP, and in which only 30.6% (n = 15) of people complained of symptoms distal to the knee. Kasai et al. [25], however, investigated a population with specific lumbar conditions (lumbar spinal canal stenosis, lumbar spondylolisthesis or lumbar scoliosis), most of whom had intermittent claudication, and 42.6% (n = 52) had neurological leg symptoms.

The PLE test was the most accurate and informative test, even though it was measured by only one study, in patients affected by lumbar degenerative diseases. Although the PLE test appears to be a potentially effective clinical test to detect lumbar instability, the characteristics of the investigated sample and the presence of only one study on its diagnostic accuracy suggest the need for studies on non-specific LBP patients.

The PIT demonstrated low to moderate sensitivity and specificity [24] indicating that this test has limited accuracy in diagnosing lumbar instability in patients with LBP.

The PST showed relatively poor sensitivity and specificity [24], indicating that this test is less accurate than the PLE test and the PIT to detect lumbar instability.

The Instability Catch Sign, the Painful Catch Sign and the Apprehension Sign are three of the five signs included in the AMP investigated by Fritz et al. [24]. The relatively low sensitivity and high specificity resulting from the study of Kasai et al. [25] suggest caution in the use of these tests to diagnose lumbar instability. According to Hicks et al. [23], these 5 tests should be used together, as a complete observation of the trunk movement and the 5 signs could be considered as only one comprehensive test. However, positive results on AMP and PIT, which demonstrated moderate sensitivity and specificity, were considered predictive for a favorable response to stabilization exercises [32].


The characteristics of the samples were not always well explained or were not reliable. The PLE test [12] and the PIT [12,23,24] demonstrated good inter-rater reliability. The reliability of the PLE test is evident in younger subjects referred to outpatient physical therapy [12]. Five studies on the PIT demonstrated very different inter-rater reliability scores. Nevertheless, the 2 studies showing fair reliability [26,27] are affected by possible bias: in the first case [27] due to a very limited sample size, and in the second case [26] due to procedural and methodological weaknesses such as the involvement of novice raters and the use of a modified test. The main statistical problem was the small size of the samples, which could invalidate the k score. Although the other 4 studies adopting the PIT closely followed its original description, some differences in the positivity criteria were found. Hicks et al. [23] and Schneider et al. [27] judged the test positive when the pain disappeared in the second part of the test; Fritz et al. [24] when the pain decreased, whilst for Rabin et al. [12] the pain had to be either relieved or abolished.

After the two studies with the main methodological weaknesses had been excluded, the reliability of the PIT ranged from moderate to good.

The AMP reliability was investigated in three studies [12,23,24] but their results were not similar and ranged from insufficient reliability [24] to moderate reliability [12,23]. The PST was investigated by only one study and scored the lowest reliability [24], which is insufficient to recommend its use.

Implications for clinical practice

After an initial inspection of the articles, it appears that the information derived from the studies could provide a useful picture of the items that contribute to the definition of “applicability in rehabilitation practice”. Sufficient information was provided on the execution of the tests, whereas little information was given on their duration and on the time needed to process data. Considering that in clinical practice a standard manual therapy session normally lasts 30 minutes, it may be the case that a series of tests proposed in the literature cannot be repeated by clinicians due to lack of time. The attempt to identify methods for the evaluation of lumbar instability in patients with LBP allowed us to select some tests that are suitable for clinicians in everyday clinical practice. The time needed to perform the tests and process the data is compatible with clinical practice and research purposes. Starting from the same key-words used for the search of the articles of the literature review, 4 clinical tests (PIT, PLE, AMP and PST) investigated by 2 studies [24,25] met the criteria of applicability in clinical practice.


The main limitation of this review is the small number of articles found on any single test. Only 2 studies concerned diagnostic accuracy, while the results of the studies investigating reliability are limited by statistical or methodological weaknesses. For example, the conclusions of Ravenna et al. [26] should be interpreted cautiously, also because of some significant modifications made to standardize the PIT, such as the different hip and knee positions and the use of a scapular stabilization belt and a stool for foot placement.

The average age and the characteristics of the spinal dysfunctions of the samples were not homogeneous in the different studies, thus reducing the external validity of the results. Another limitation of this review concerns the insufficient homogeneity regarding the execution and interpretation of the tests. As already mentioned, a lack of standardization of a test affects comparative analyses among different studies and the implementation of that test in clinical practice.


The current state of the art of clinical tests for lumbar instability includes 6 studies of 333 patients and 4 clinical tests. Our data suggest that the PLE test is the most suitable test for detecting lumbar instability, thanks to its excellent diagnostic accuracy and good reliability. Further studies on the diagnostic properties of the PLE test to detect lumbar instability among different populations with LBP are suggested.

More than 20 years after the importance of diagnostic clinical tests for lumbar instability in individuals with LBP was defined, clinicians can use some tests showing encouraging results in terms of accuracy and reliability. Nevertheless, their application in daily practice might be affected by insufficient research and evidence on their performance. Future research should compare different assessment methods in the same study and on the same sample, in order to evaluate their reliability and validity.


  1. Martin BI, Deyo RA, Mirza SK, Turner JA, Comstock BA, Hollingworth W, et al. Expenditures and health status among adults with back and neck problems. JAMA. 2008;299(6):656–64. doi:10.1001/jama.299.6.656.

  2. Childs JD, Fritz JM, Piva SR, Erhard RE. Clinical decision making in the identification of patients likely to benefit from spinal manipulation: a traditional versus an evidence-based approach. J Orthop Sports Phys Ther. 2003;33(5):259–72.

  3. Hall H, McIntosh G, Boyle C. Effectiveness of a low back pain classification system. Spine J. 2009;9(8):648–57. doi:10.1016/j.spinee.2009.04.017.

  4. Abbott JH, McCane B, Herbison P, Moginie G, Chapple C, Hogarty T. Lumbar segmental instability: a criterion-related validity study of manual therapy assessment. BMC Musculoskelet Disord. 2005;6:56.

  5. Delitto A, George SZ, Van Dillen LR, Whitman JM, Sowa G, Shekelle P, et al. Low back pain. J Orthop Sports Phys Ther. 2012;42(4):A1–57. doi:10.2519/jospt.2012.0301.

  6. Dupuis PR, Yong-Hing K, Cassidy JD, Kirkaldy-Willis WH. Radiologic diagnosis of degenerative lumbar spinal instability. Spine. 1985;10(3):262–76.

  7. Nizard RS, Wybier M, Laredo JD. Radiologic assessment of lumbar intervertebral instability and degenerative spondylolisthesis. Radiol Clin North Am. 2001;39(1):55–71. 1.

  8. O’Sullivan PB, Phyty GD, Twomey LT, Allison GT. Evaluation of specific stabilizing exercise in the treatment of chronic low back pain with radiologic diagnosis of spondylolysis or spondylolisthesis. Spine. 1997;22(24):2959–67.

  9. Hodges PW, Moseley GL. Pain and motor control of the lumbopelvic region: effect and possible mechanisms. J Electromyogr Kinesiol. 2003;13(4):361–70.

  10. Lee P, Helewa A, Goldsmith CH, Smythe HA, Stitt LW. Low back pain: prevalence and risk factors in an industrial setting. J Rheumatol. 2001;28(2):346–51.

  11. Macedo LG, Latimer J, Maher CG, Hodges PW, Nicholas M, Tonkin L, et al. Motor control or graded activity exercises for chronic low back pain? A randomised controlled trial. BMC Musculoskelet Disord. 2008;9:65. doi:10.1186/1471-2474-9-65.

  12. Rabin A, Shashua A, Pizem K, Dar G. The interrater reliability of physical examination tests that may predict the outcome or suggest the need for lumbar stabilization exercises. J Orthop Sports Phys Ther. 2013;43(2):83–90. doi:10.2519/jospt.2013.4310.

  13. Alqarni AM, Schneiders AG, Hendrick PA. Clinical tests to diagnose lumbar segmental instability: a systematic review. J Orthop Sports Phys Ther. 2011;41(3):130–40. doi:10.2519/jospt.2011.3457.

  14. May S, Littlewood C, Bishop A. Reliability of procedures used in the physical examination of non-specific low back pain: a systematic review. Aust J Physiother. 2006;52(2):91–102.

  15. Moher D, Liberati A, Tetzlaff J, Altman DG, Group P. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Int J Surg. 2010;8(5):336–41. doi:10.1016/j.ijsu.2010.02.007.

  16. Cleland JA. Orthopaedic Clinical Examination: An Evidence-Based Approach for Physical Therapists. Carlstadt, NJ: Icon Learning Systems; 2005. p. 516. ISBN 1-929007-87-6.

  17. Whiting P, Rutjes AW, Reitsma JB, Bossuyt PM, Kleijnen J. The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Med Res Methodol. 2003;3:25.

  18. Lucas NP, Macaskill P, Irwig L, Bogduk N. The development of a quality appraisal tool for studies of diagnostic reliability (QAREL). J Clin Epidemiol. 2010;63(8):854–61. doi:10.1016/j.jclinepi.2009.10.002.

  19. Villafañe JH, Zanetti L, Isgrò M, Cleland JA, Bertozzi L, Gobbo M, et al. Methods for the assessment of neuromotor capacity in non-specific low back pain: Validity and applicability in everyday clinical practice. J Back Musculoskelet Rehabil. 2014. In Press.

  20. van der Wurff P, Meyne W, Hagmeijer RH. Clinical tests of the sacroiliac joint. Man Ther. 2000;5(2):89–96.

  21. Vanti C, Bonfiglioli R, Calabrese M, Marinelli F, Guccione A, Violante FS, et al. Upper Limb Neurodynamic Test 1 and symptoms reproduction in carpal tunnel syndrome. A validity study. Man Ther. 2011;16(3):258–63. doi:10.1016/j.math.2010.11.003.

  22. Jewell D. Guide to Evidence-Based Physical Therapist Practice. 2nd ed. Sudbury MA: Jones & Bartlett Learning; 2011. p. 230.

  23. Hicks GE, Fritz JM, Delitto A, Mishock J. Interrater reliability of clinical examination measures for identification of lumbar segmental instability. Arch Phys Med Rehabil. 2003;84(12):1858–64.

  24. Fritz JM, Whitman JM, Childs JD. Lumbar spine segmental mobility assessment: an examination of validity for determining intervention strategies in patients with low back pain. Arch Phys Med Rehabil. 2005;86(9):1745–52.

  25. Kasai Y, Morishita K, Kawakita E, Kondo T, Uchida A. A new evaluation method for lumbar spinal instability: passive lumbar extension test. Phys Ther. 2006;86(12):1661–7.

  26. Ravenna MM, Hoffman SL, Van Dillen LR. Low interrater reliability of examiners performing the prone instability test: a clinical test for lumbar shear instability. Arch Phys Med Rehabil. 2011;92(6):913–9.

  27. Schneider M, Erhard R, Brach J, Tellin W, Imbarlina F, Delitto A. Spinal palpation for lumbar segmental mobility and pain provocation: an interexaminer reliability study. J Manipulative Physiol Ther. 2008;31(6):465–73. doi:10.1016/j.jmpt.2008.06.004.

  28. Long DM, BenDebba M, Torgerson WS, Boyd RJ, Dawson EG, Hardy RW, et al. Persistent back pain and sciatica in the United States: patient characteristics. J Spinal Disord. 1996;9(1):40–58.

  29. Panjabi MM. The stabilizing system of the spine. Part II. Neutral zone and instability hypothesis. J Spinal Disord. 1992;5(4):390–6. discussion 7.

  30. Dvorak J, Panjabi MM, Novotny JE, Chang DG, Grob D. Clinical validation of functional flexion-extension roentgenograms of the lumbar spine. Spine. 1991;16(8):943–50.

  31. Pope MH, Frymoyer JW, Krag MH. Diagnosing instability. Clin Orthop Relat Res. 1992;279:60–7.

  32. Hicks GE, Fritz JM, Delitto A, McGill SM. Preliminary development of a clinical prediction rule for determining which patients with low back pain will respond to a stabilization exercise program. Arch Phys Med Rehabil. 2005;86(9):1753–62.



We would like to thank Anna Trevisan for assisting with the literature search, Giovanni Gobitti for helping us in the statistical analysis, Fabio Cassola and Paola D’Ovidio for their help in the language review.

Author information



Corresponding author

Correspondence to Silvano Ferrari.

Additional information

Competing interests

The authors declare that they have no competing interests. The authors disclose any financial and personal relationships with other people or organizations that could inappropriately influence this work.

Authors’ contributions

SF was the main supervisor on location and oversaw every step of the process. SF and CV conceived the study and were responsible for drafting the manuscript. All authors were responsible for the study design. TM and FB performed the data collection. JUV undertook the analyses. All authors read and approved the final manuscript.

Rights and permissions

This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.


About this article


Cite this article

Ferrari, S., Manni, T., Bonetti, F. et al. A literature review of clinical tests for lumbar instability in low back pain: validity and applicability in clinical practice. Chiropr Man Therap 23, 14 (2015).



  • Joint instability
  • Lumbar instability
  • Low back pain
  • Physical examination
  • Reproducibility of results
  • Prone instability test
  • Passive lumbar extension test
  • Aberrant movements pattern
  • Posterior shear test