A diagnosis-based clinical decision rule for spinal pain part 2: review of the literature
Chiropractic & Osteopathy volume 16, Article number: 7 (2008)
Spinal pain is a common and often disabling problem. The research on various treatments for spinal pain has, for the most part, suggested that while several interventions have demonstrated mild to moderate short-term benefit, no single treatment has a major impact on either pain or disability. There is great need for more accurate diagnosis in patients with spinal pain. In a previous paper, the theoretical model of a diagnosis-based clinical decision rule was presented. The approach is designed to provide the clinician with a strategy for arriving at a specific working diagnosis from which treatment decisions can be made. It is based on three questions of diagnosis. In the current paper, the literature on the reliability and validity of the assessment procedures that are included in the diagnosis-based clinical decision rule is presented.
The databases of Medline, Cinahl, Embase and MANTIS were searched for studies that evaluated the reliability and validity of clinic-based diagnostic procedures for patients with spinal pain that have relevance for questions 2 (which investigates characteristics of the pain source) and 3 (which investigates perpetuating factors of the pain experience). In addition, the reference list of identified papers and authors' libraries were searched.
A total of 1769 articles were retrieved, of which 138 were deemed relevant. Fifty-one studies related to reliability and 76 related to validity. One study evaluated both reliability and validity.
Regarding some aspects of the diagnosis-based clinical decision rule (DBCDR), there are a number of studies that allow the clinician to have a reasonable degree of confidence in his or her findings. This is particularly true for centralization signs, neurodynamic signs and psychological perpetuating factors. There are other aspects of the DBCDR in which a lesser degree of confidence is warranted, and in which further research is needed.
Accurate diagnosis or classification of patients with spinal pain has been identified as a research priority. We presented in Part 1 the theoretical model of an approach to diagnosis in patients with spinal pain. This approach incorporated the various factors that have been found, or in some cases theorized, to be of importance in the generation and perpetuation of neck or back pain into an organized scheme upon which a management strategy can be based. The authors termed this approach a diagnosis-based clinical decision rule (DBCDR). The DBCDR is not a clinical prediction rule. It is an attempt to identify aspects of the clinical picture in each patient that are relevant to the perpetuation of pain and disability so that these factors can be addressed with interventions designed to improve them. The purpose of this paper is to review the literature on the methods involved in the DBCDR regarding reliability and validity and to identify those areas in which the literature is currently lacking.
The Three Essential Questions of Diagnosis
The DBCDR is based on what the authors refer to as the 3 essential questions of diagnosis. The answers to these questions supply the clinician with the most important information that is required to develop an individualized diagnosis from which a management strategy can be derived. The 3 questions are:
1. Are the symptoms with which the patient is presenting reflective of a visceral disorder or a serious or potentially life-threatening disease?
In seeking the answer to this question, history and examination and, when indicated, special tests, are used to detect or raise the level of suspicion for the presence of pathological disorders for which spinal pain may be the first or only symptom. Some examples are gastrointestinal or genitourinary disorders, fracture, infection and malignancy. Potentially serious or life-threatening conditions are sometimes referred to as "red flags".
2. From where is the patient's pain arising?
In seeking the answer to this question, four signs are searched for: (1) centralization signs, (2) segmental pain provocation signs, (3) neurodynamic signs, and (4) muscle palpation signs.
3. What has gone wrong with this person as a whole that would cause the pain experience to develop and persist?
In seeking the answer to this question, perpetuating factors are searched for: (1) dynamic instability (impaired motor control), (2) central pain hypersensitivity, (3) oculomotor dysfunction (in cervical trauma patients), (4) fear, (5) catastrophizing, (6) passive coping, and (7) depression. These latter psychological factors are sometimes referred to as "yellow flags".
An algorithm illustrating the diagnostic strategy of the DBCDR is presented in figure 1. The recommended management strategy based on the DBCDR is presented in figure 2.
The purpose of this paper is to review the literature on the reliability and validity of the detection of the individual diagnostic factors included in the DBCDR, and to present the evidence as it currently exists, for the various aspects of this approach.
Literature search and selection
The following databases were searched up to December 22, 2006: Medline, Cinahl, Embase and MANTIS. Searches of the authors' own libraries were also conducted. Finally, citation searches of relevant articles and texts were conducted manually. The following search terms were used:
Diagnosis AND "low back pain"
Diagnosis AND "neck pain"
Diagnosis AND "low back pain" AND palpation
Diagnosis AND "neck pain" AND palpation
Diagnosis AND "low back pain" AND McKenzie
Diagnosis AND "neck pain" AND McKenzie
Diagnosis AND "low back pain" AND neurodynamics
Diagnosis AND "neck pain" AND neurodynamics
Diagnosis AND "low back pain" AND radiculopathy
Diagnosis AND "neck pain" AND radiculopathy
Diagnosis AND "low back pain" AND trigger points
Diagnosis AND "neck pain" AND trigger points
Diagnosis AND "low back pain" AND muscle
Diagnosis AND "neck pain" AND muscle
Diagnosis AND "low back pain" AND instability
Diagnosis AND "neck pain" AND instability
Diagnosis AND "low back pain" AND "motor control"
Diagnosis AND "neck pain" AND "motor control"
Diagnosis AND "low back pain" AND "central sensitization"
Diagnosis AND "low back pain" AND "central pain hypersensitivity"
Diagnosis AND "neck pain" AND "central sensitization"
Diagnosis AND "neck pain" AND "central pain hypersensitivity"
Diagnosis AND "neck pain" AND oculomotor
Diagnosis AND "low back pain" AND fear
Diagnosis AND "neck pain" AND fear
Diagnosis AND "low back pain" AND catastrophizing
Diagnosis AND "neck pain" AND catastrophizing
Diagnosis AND "low back pain" AND coping
Diagnosis AND "neck pain" AND coping
Diagnosis AND "low back pain" AND depression
Diagnosis AND "neck pain" AND depression
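The search strings listed above follow a regular pattern: the word Diagnosis, one of the two pain regions and, optionally, one topic term. As a minimal sketch only (the names `REGIONS`, `TOPICS`, `NECK_ONLY_TOPICS` and `build_search_terms` are illustrative and are not part of the published search protocol), the list can be regenerated as follows. The oculomotor term was paired with neck pain only, so it is handled separately.

```python
# Illustrative reconstruction of the search-term list; all names are hypothetical.
REGIONS = ['"low back pain"', '"neck pain"']

# Topic terms as they appear in the list (quoted phrases are kept quoted).
TOPICS = [
    "palpation", "McKenzie", "neurodynamics", "radiculopathy",
    "trigger points", "muscle", "instability", '"motor control"',
    '"central sensitization"', '"central pain hypersensitivity"',
    "fear", "catastrophizing", "coping", "depression",
]

# Terms that were combined with one region only.
NECK_ONLY_TOPICS = ["oculomotor"]


def build_search_terms():
    """Return every search string: two base queries plus region/topic pairs."""
    terms = [f"Diagnosis AND {region}" for region in REGIONS]
    for topic in TOPICS:
        for region in REGIONS:
            terms.append(f"Diagnosis AND {region} AND {topic}")
    for topic in NECK_ONLY_TOPICS:
        terms.append(f'Diagnosis AND "neck pain" AND {topic}')
    return terms


if __name__ == "__main__":
    for term in build_search_terms():
        print(term)
```

The sketch produces the same set of strings as the list, though not necessarily in the same order.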
Studies were included if they were in English and provided original, statistically analyzed data regarding the reliability and validity of clinic-based diagnostic procedures used for the identification of relevant factors in the causation or perpetuation of spinal pain. Included studies had to contain data on the assessment of patients with cervical or lumbar pain, including headache related to the cervical spine and spine-related upper or lower extremity pain. Non-English language studies were excluded, as were studies that did not present data on reliability and validity. The search focused on diagnostic procedures that are potentially useful in answering the second or third question of diagnosis. Studies that were potentially useful in answering question 1 were not considered for the purpose of this paper. Diagnostic studies that require special equipment not typically found in the clinic (such as MRI) or that require a laboratory (such as blood tests) were excluded because the purpose of the study was to evaluate clinic-based means by which the DBCDR may be applied. It is recognized that imaging or laboratory tests are often useful in the diagnosis of spinal pain, but the presentation of these procedures was beyond the scope of this paper. In cases in which systematic reviews of the literature were found, the individual studies included in the reviews were not reviewed separately, unless this was necessary to clarify information that was not readily apparent from the systematic review.
Each study was reviewed by two authors (DRM and CFN) and deemed relevant or irrelevant. A study was considered relevant if the information contained in the study indicated that it met the above inclusion/exclusion criteria.
The search strategy identified 1769 articles, and of these, 138 were deemed relevant. Additional files 1 and 2 provide a breakdown of the number of studies in each area of consideration. Additional files 3 and 4 present the data from those studies that met the inclusion criteria. We have divided the presentation of the literature into those studies that apply to patients with neck pain and those that relate to patients with low back pain (LBP).
Neck Pain
Question 1. Are the symptoms with which the patient is presenting reflective of a visceral disorder or a serious or potentially life-threatening disease?
A detailed review of the literature related to this question is beyond the scope of this paper. However, in general, history, focusing on the presence of symptoms such as GI distress, fever or previous history of cancer, and examination, focusing on vital signs, abdominal examination and examination of peripheral pulses, are useful in raising the level of suspicion as to the presence of a visceral disorder or a serious or potentially life-threatening disease. Imaging and/or special tests such as sedimentation rate can be utilized for further confirmation. Details can be found elsewhere [5–7].
Question 2. From where is the patient's pain arising?
Centralization signs are detected through methods originally developed by McKenzie [8, 9]. The examination procedure involves moving the spine to end range in various directions and monitoring the mechanical and symptomatic response to these movements.
Clare, et al used 2 physical therapists trained in the McKenzie method to examine 25 patients with cervical pain. They found good inter-examiner reliability (IER) (kappa [k] = 0.63 and 93% agreement) for the assessment procedure.
No studies were identified that have addressed the validity of centralization signs in the cervical spine.
Segmental pain provocation signs
A number of studies have examined segmental mobility assessment and have generally found poor IER [11–16] and validity. Other studies have examined procedures designed to identify segmental pain (as opposed to mobility impairment).
Hubka and Phelan assessed the IER of palpation for tenderness between 2 practitioners in 30 patients with unilateral neck pain. They found good IER (k = 0.68). Jull, et al assessed IER of segmental palpation using 7 examiners and 40 subjects with or without neck pain and headache. The criteria for a positive test were based on resistance to joint movement and pain provocation in response to palpation. Kappa values indicated excellent to perfect IER (k = 0.78–1.00) in 6 instances, fair to good (k = 0.45–0.65) in 14 instances and poor (k = 0.25–0.34) in 5 instances. They point out that, in the instances of poor agreement, the raw data indicated that the examiners had agreed on 13 of 14 decisions, but the calculations of k were vulnerable because 12 of the 13 agreements were in the same cell of agreed negative finding. Marcus, et al used 4 physical therapists to examine 72 headache patients and 24 controls. The therapists examined all subjects for "cervical synovial joint abnormalities" in the same manner as described in the study by Jull, et al. They found good IER (k = 0.63) between examiners. McPartland and Goodridge assessed IER of the "TART" exam, described as segmental palpation that focused on three parameters: tissue texture change, restriction of vertebral motion and zygapophyseal (z) joint tenderness. They found that the IER of examination that considered all three parameters was poor (k = 0.35 for asymptomatic subjects, k = 0.34 for symptomatic subjects), but for the parameter of tenderness alone, IER improved (k = 0.529). Van Suijlekom, et al used 2 neurologists to examine 24 headache patients and found IER for segmental palpation to be slight to fair (k = 0.14 to 0.37). However, the palpation method was poorly described in this study.
Also, it is not known whether the difference between the findings of this study and those of the other studies reported here relates to the fact that this "negative" IER study used neurologists, whereas the "positive" IER studies used chiropractors or physical therapists. Cleland, et al found highly variable IER between 2 physical therapists examining 22 subjects by palpation for pain provocation, with k ranging from -0.52 to 0.90, depending on the segment involved. They speculated that this high variability related to the clinicians not agreeing on the segmental level being examined, as opposed to lack of agreement on the findings.
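The observation by Jull, et al that kappa can be low despite high raw agreement can be illustrated with a small calculation. The sketch below computes Cohen's kappa for two examiners making positive/negative decisions. The three tables are hypothetical and are not data from any of the studies reviewed here; each has 13 agreements in 14 decisions, and the function name `cohen_kappa_2x2` is an assumption made for illustration.

```python
def cohen_kappa_2x2(both_pos, only_a_pos, only_b_pos, both_neg):
    """Cohen's kappa for two examiners (A and B) making positive/negative calls."""
    n = both_pos + only_a_pos + only_b_pos + both_neg
    agreements = both_pos + both_neg
    # Marginal counts for each examiner.
    a_pos, b_pos = both_pos + only_a_pos, both_pos + only_b_pos
    a_neg, b_neg = n - a_pos, n - b_pos
    # Chance agreement, kept as a count scaled by n to avoid rounding error.
    chance = a_pos * b_pos + a_neg * b_neg
    if chance == n * n:
        return float("nan")  # undefined when chance agreement is already total
    # Equivalent to (observed - expected) / (1 - expected).
    return (n * agreements - chance) / (n * n - chance)


# Three HYPOTHETICAL tables, each with 13 agreements out of 14 decisions.
balanced = cohen_kappa_2x2(both_pos=6, only_a_pos=1, only_b_pos=0, both_neg=7)
skewed = cohen_kappa_2x2(both_pos=1, only_a_pos=1, only_b_pos=0, both_neg=12)
all_negative = cohen_kappa_2x2(both_pos=0, only_a_pos=1, only_b_pos=0, both_neg=13)

print(f"balanced {balanced:.2f}, skewed {skewed:.2f}, all negative {all_negative:.2f}")
```

With identical raw agreement (93%), kappa is about 0.86 for the balanced table, about 0.63 when most agreements fall in the agreed-negative cell, and 0 when every agreement is negative. This is the sense in which the calculation is vulnerable to findings being concentrated in one cell.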
Jull, et al used diagnostic blocks to identify the presence and location of symptomatic z joints in 20 patients with cervical related pain. The patients were examined by a manipulative physiotherapist who also attempted to identify the presence and location of symptomatic z joints. The definition of a symptomatic joint as determined by palpation was based on abnormal "end feel", increased resistance to motion and reproduction of pain. They found that the sensitivity (SE) and specificity (SP) were both 1.00. That is, the examiner was able to identify 100% of the symptomatic segments as well as all of the subjects whose pain was not abolished by diagnostic block. This study used single, rather than double blind, diagnostic blocks. Regardless, as will be discussed below, the use of diagnostic blocks as a Gold Standard for the presence of z joint pain has been questioned. Treleaven, et al assessed 12 patients with postconcussion headache with segmental palpation. The method of palpation was the same as that used by Jull, et al. They found complete agreement between the examiner and independent report of the patient as to which segments were painful and almost complete agreement as to which segment was most painful. Sandmark and Nisell calculated the SE, SP, positive predictive value (PPV) and negative predictive value (NPV) of segmental palpation in the cervical spine relative to reported neck pain. They found these values to be 0.82, 0.79, 0.62 and 0.91, respectively. Lord, et al used a double blind anesthetic block to determine the prevalence of pain arising from the C2-3 z joint in patients with the complaint of chronic headache after cervical trauma. These authors demonstrated that the prevalence of C2-3 z-joint pain was 53%, and the only sign that was associated with these patients was tenderness to palpation over the C2-3 z joint. They calculated that palpation had SE of 0.85, a positive likelihood ratio (PLR) of 1.7 and a negative likelihood ratio (NLR) of 0.3.
The precise method of palpation was not described. Zito, et al, using the palpation method found to be reliable by Jull, et al, found a significantly higher incidence (p < 0.05) of hypomobile and painful z joints in the upper cervical spine of patients classified according to the International Headache Society criteria as having cervicogenic headache compared to those classified as having migraine with aura. King, et al used "controlled, diagnostic blocks" as a Gold Standard against which to compare segmental palpation that was described as being similar to that of Jull, et al. They found the SE to be 0.88, SP to be 0.39 and PLR to be 1.3. Again, using diagnostic block as a Gold Standard may be questionable, leaving open the issue of what should be the Gold Standard for segmental palpation signs. Further work in the area of establishing a true Gold Standard for the identification of zygapophyseal joint pain may be needed before definitive statements regarding the presence or absence of pain from this structure can be made.
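The likelihood ratios quoted in this section are related to SE and SP by the standard definitions PLR = SE / (1 - SP) and NLR = (1 - SE) / SP. The following is a minimal sketch of that arithmetic, using the values reported for palpation over the C2-3 z joint (SE 0.85, PLR 1.7). The function names are illustrative assumptions, and the specificity is back-calculated here rather than taken from the study.

```python
def likelihood_ratios(se, sp):
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    # A perfectly specific test has no false positives, so its PLR is unbounded.
    plr = se / (1 - sp) if sp < 1 else float("inf")
    # A test with zero specificity has an unbounded NLR.
    nlr = (1 - se) / sp if sp > 0 else float("inf")
    return plr, nlr


def specificity_from(se, plr):
    """Specificity implied by a reported sensitivity and PLR (rearranged definition)."""
    return 1 - se / plr


# Reported values: SE 0.85 and PLR 1.7. The implied SP is derived, not reported.
implied_sp = specificity_from(0.85, 1.7)
plr, nlr = likelihood_ratios(0.85, implied_sp)
print(f"implied SP {implied_sp:.2f}, PLR {plr:.1f}, NLR {nlr:.1f}")
```

Under these definitions the reported SE and PLR imply a specificity of 0.50, which in turn gives an NLR of 0.3, consistent with the reported figure. Published ratios computed from raw counts may differ slightly from values recomputed from rounded SE and SP.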
The standard neurodynamic test in the cervical spine is the brachial plexus tension test (also known as the upper limb tension test). Wainner, et al found good to excellent IER of this test (k = 0.76 to 0.81). They also found good to excellent IER of several historical questions of patients with documented cervical radiculopathy (k = 0.53 to 0.82). They found varying IER of neurologic exam findings, but good to excellent IER of Spurling's test (which they described as bending the seated patient's head toward the side of symptoms, rotating and extending slightly, and applying downward pressure), the cervical distraction test and Valsalva's maneuver. The kappa values for these tests ranged from 0.60 to 0.88.
Wainner, et al provide data on the SE, SP, PLR and NLR of a variety of historical factors and examination procedures. They found that the cluster of 4 tests – Spurling's test, the upper limb tension test, the cervical distraction test and limited rotation toward the side of symptoms secondary to pain – carried the greatest diagnostic accuracy as compared to the Gold Standard of electromyography. When 3 of these tests were positive, there was a 65% probability of the presence of cervical radiculopathy; the SE and SP were 0.39 and 0.94, respectively, and the PLR was 6.1. When all 4 tests were positive, there was a 90% probability of the presence of cervical radiculopathy; the SE and SP were 0.24 and 0.99, respectively, and the PLR was 30.3.
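A likelihood ratio converts a pretest probability into a post-test probability by way of odds. The sketch below shows that conversion for the two PLR values quoted above. The pretest probability of 0.23 is an assumption chosen for illustration and is not stated in this review; the function name is likewise illustrative.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Update a pretest probability with a likelihood ratio, working in odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# ASSUMPTION: 0.23 is an illustrative pretest probability, not a figure from this review.
PRETEST = 0.23

for n_positive, plr in [(3, 6.1), (4, 30.3)]:
    p = post_test_probability(PRETEST, plr)
    print(f"{n_positive} positive tests (PLR {plr}): post-test probability {p:.2f}")
```

With that assumed starting point, the calculation gives post-test probabilities of about 0.65 and 0.90, in line with the 65% and 90% figures reported for 3 and 4 positive tests. A different pretest probability would give different post-test values for the same likelihood ratios.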
Shah and Rajshekhar also used Spurling's test, the description of which was the same as that in the Wainner, et al study, and found it to be useful in identifying "soft disc prolapse" as opposed to "hard disc" (i.e., osteophyte). They calculated the SE and SP to be 0.90 and 1.00, respectively, compared to the Gold Standard of operative findings. The PPV was calculated to be 1.00 and the NPV to be 0.71. In patients treated non-surgically, they used MRI as the Gold Standard and calculated the SE and SP to be 0.90 and 0.93, respectively. The PPV was calculated to be 0.90 and the NPV to be 0.93.
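Unlike SE and SP, predictive values depend on how common the condition is in the group being tested, which is one reason the same test can show different PPV and NPV in different samples. The sketch below applies the standard formulas to the SE and SP reported against MRI (0.90 and 0.93). The prevalence values are hypothetical and the function name is an illustrative assumption.

```python
def predictive_values(se, sp, prevalence):
    """PPV and NPV for a test with the given SE and SP (0 < prevalence < 1)."""
    true_pos = se * prevalence
    false_pos = (1 - sp) * (1 - prevalence)
    true_neg = sp * (1 - prevalence)
    false_neg = (1 - se) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# SE and SP as reported against MRI; the prevalence values are HYPOTHETICAL.
for prevalence in (0.2, 0.4, 0.8):
    ppv, npv = predictive_values(0.90, 0.93, prevalence)
    print(f"prevalence {prevalence:.1f}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

As prevalence rises, PPV increases and NPV falls even though SE and SP are unchanged, so predictive values from one study population should be applied to another with caution.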
Muscle palpation signs
Marcus, et al, in the same study cited above, found good to perfect IER of trigger point (TrP) palpation in the cervical spine (k = 0.74), head (k = 0.81) and shoulder (k = 1.00). Van Suijlekom, et al, in the study cited above, found variable IER (k = 0.0–1.00) of TrP palpation in patients with headache. As was the case with segmental palpation, the method of TrP examination was poorly described. Gerwin, et al performed 2 different experiments to assess IER. In the first, 4 examiners assessed 20 different muscles on each of 25 patients with various symptom presentations. They used a general observer-agreement statistic called the "S av", which they defined as "a generalized version of the Cohen's kappa which reports pairwise judge agreement corrected for chance agreement." They found poor IER (S av = 0.0–1.0). They then repeated the study after a 3-hour session in which the examiners discussed positive findings and palpation techniques. They found good to excellent IER (S av = 0.65–0.95) after the training session. Sciotti, et al found good IER (Generalizability coefficient = 0.83–0.92) between 2 examiners looking for latent TrPs in the upper trapezius muscle. However, the subjects were asymptomatic. On the other hand, Lew, et al found poor IER for TrP palpation in the upper trapezius, although the subjects in that study were also asymptomatic.
The validity of muscle palpation signs is unknown, largely due to lack of an appropriate Gold or reference standard.
Question 3. What has gone wrong with this person as a whole that would cause the pain experience to develop and persist?
As was discussed in the earlier paper describing the DBCDR, this third question attempts to identify those factors that may be placing the patient at risk of developing persistent or recurrent spinal pain, or, in the case of chronic patients, have contributed to the establishment of the chronic or recurrent problem. There are a number of factors that have been suggested to be of importance in the perpetuation of chronic spinal pain, although research investigating this area is ongoing.
Dynamic instability (impaired motor control)
In the cervical spine, the Craniocervical Flexion (CF) test [37, 38] is designed to detect decreased activity in the deep cervical flexor muscles and hyperactivity in the sternocleidomastoid muscles. It is thought that, as the deep cervical flexors are important for stability of the intersegmental joints of the cervical spine, this imbalance in muscle activation compromises cervical spine stability. The CF test measures the motor control capacity of the deep cervical flexors. Jull, et al found good IER (ICC = 0.81 to 0.93) in 50 asymptomatic subjects; Chiu, et al found good IER (k = 0.72) in 10 asymptomatic subjects.
Recently, 3 studies [23, 40, 41] have demonstrated IER of a test that uses a similar positioning but, rather than using a pressure cuff, involves practitioner observation of the ability of patients to maintain a position of slight upper cervical flexion in the supine position. Cleland, et al used 2 examiners and 22 subjects and found moderate IER (ICC = 0.57). Harris, et al used 2 examiners and 40 subjects and found moderate IER (ICC = 0.67); Olson, et al, using a test almost identical to that of Harris, et al, found excellent IER (k = 0.83 to 0.88) between 2 examiners in 27 subjects without neck pain.
Treleaven, et al compared 12 patients with postconcussion headache with asymptomatic controls using the CF test. They found a significant (p = 0.02) decrease in the duration of time that the test position could be held in patients compared to controls. Jull, et al compared 15 patients with cervicogenic headache with 15 controls. They found significantly (p < 0.001) poorer performance on the CF test in the patients compared to controls. Jull, et al compared patients with neck pain after whiplash, patients with insidious onset neck pain and normal controls in the performance of the CF test. They found significantly poorer performance (p < 0.05) in both neck pain groups than in controls. There was no difference between the post-whiplash patients and the insidious onset patients. Falla, et al used the CF test and electromyography (EMG) to demonstrate reduced activity in the deep cervical flexor muscles in patients with chronic neck pain compared to controls. There was also a trend toward increased activity in the sternocleidomastoid and scalene muscles in patients compared to controls. With regard to increased activity in the sternocleidomastoid muscle during the performance of the CF test, this replicated the findings of Jull.
Central Pain Hypersensitivity (CPH)
As will be discussed below, there is good evidence that the presence of nonorganic signs is reflective of increased pain perception. 
Sobel, et al developed nonorganic signs for patients with neck pain and found excellent to perfect (k = 0.80 to 1.00) IER in 26 patients.
The validity of cervical nonorganic signs is unknown.
Imaging modalities like functional MRI and SPECT have promise in the diagnosis of CPH [47, 48]; however, it is not clear as to whether these are viable tools for common use.
Oculomotor dysfunction has been found in patients with chronic neck pain after whiplash as well as in patients with chronic tension type headache. Gimse, et al compared 26 patients who had chronic (average 4.7 years) neck pain after whiplash and complaints of visual problems or vertigo with 26 matched controls. They found significantly (p < 0.001) poorer performance on tests of oculomotor function in the whiplash group. Tjell, et al compared 160 chronic (a minimum of 6 months) neck pain patients whose pain was attributed to whiplash with 122 patients with either non-traumatic neck pain, dizziness related to the cervical spine or fibromyalgia. Using the same method of measurement of oculomotor function used by Gimse, et al, they found significantly (p < 0.05 to p < 0.0001) poorer performance on tests of oculomotor function in the whiplash patients compared to the other groups. There currently are no simple tests for oculomotor reflex function that are practical for the typical clinical setting. However, Heikkilla and Wenngren found significant correlation between the finding of poor performance on oculomotor tests and on a test for head repositioning accuracy, which can be measured in the clinic using Revel's test.
Revel, et al originally demonstrated that patients with chronic neck pain had significantly (p < 0.01) poorer repositioning accuracy compared to a group of 30 asymptomatic controls. Loudon, et al also found significantly (p < 0.05) poorer repositioning accuracy in patients with chronic neck pain after whiplash compared to healthy controls; however, the small sample size (11 subjects in each group) makes interpretation problematic. Heikkilla and Wenngren found significantly greater error in patients (n = 27) with chronic neck pain after whiplash compared to 39 controls. As was stated earlier, Heikkilla and Wenngren found close correlation (p = 0.007) between poor head repositioning accuracy and dysfunction of oculomotor reflexes.
Treleaven, et al also found close correlation between head repositioning accuracy (which they termed "joint position error") and oculomotor function. They calculated the SE and SP of using head repositioning accuracy to predict oculomotor dysfunction to be 0.60 and 0.54, respectively, and the PPV to be 0.88.
Fear and Catastrophizing
Several instruments have been used to measure fear and catastrophizing. Regarding fear, the best studied are the Fear-Avoidance Beliefs Questionnaire, the Tampa Scale for Kinesiophobia and the Fear-Avoidance Pain Scale.
In patients with neck pain, measures of fear have been found to predict future chronicity in both non-traumatic neck pain and neck pain after whiplash [61, 62], although there is some conflicting evidence.
The Vanderbilt Pain Management Inventory has been demonstrated to be a reliable and valid measure of passive coping, and this measure has been found to predict slower recovery from whiplash injury.
The Center for Epidemiologic Studies Depression (CES-D) Scale has been found to have good internal consistency and responsiveness to change over time as well as validity as compared to clinical criteria, self-report criteria, need for services and association with life events. Depressive symptoms as measured by the CES-D have been found to contribute to slower recovery from whiplash injury.
Low Back Pain
Question 1. Are the symptoms with which the patient is presenting reflective of a visceral disorder or a serious or potentially life-threatening disease?
As stated earlier, a detailed review of the literature related to this question is beyond the scope of this paper. The discussion of this question in the neck pain section of the paper applies to this section as well.
Question 2. From where is the patient's pain arising?
Early studies [68, 69] failed to demonstrate adequate IER of the McKenzie assessment in the lumbar spine. For example, Riddle and Rothstein looked at 363 patients with LBP and used 49 physical therapists at 8 different clinics and found poor IER (k = 0.26) of the classification systems of McKenzie. Postgraduate training in the system did not improve IER. However, these studies have been criticized on the grounds that minimally trained therapists were used, the studies failed to consider the classification of patients into subsyndromes and, in the case of Kilby, et al, the protocol included elements that are not a standard part of the McKenzie system. More recent studies have attempted to improve upon the methodology of these earlier studies. Werneke, et al used 5 physical therapists who assessed 289 patients with LBP or neck pain and found IER that ranged from k = 0.917 to 1.0. Fritz, et al used 40 physical therapists in practice and 40 physical therapy students and had them watch a video of 12 examinations using the McKenzie method. They found IER coefficients ranging from k = 0.763 to 0.823. Razmjou, et al used 2 trained McKenzie therapists and 45 patients with acute, subacute or chronic LBP and found good IER (k = 0.70). Kilpikoski, et al looked at 39 patients with low back pain examined by 2 physical therapists trained in the McKenzie method. They found good agreement for the presence of the centralization sign (k = 0.7) and excellent agreement for directional preference (k = 0.9). Clare, et al found perfect IER (k = 1.0) between 2 examiners in 25 patients with LBP.
Donelson, et al found that the McKenzie assessment differentiated discogenic from nondiscogenic pain (p < 0.001), using discogram as the Gold Standard. Young, et al used the Donelson, et al data and calculated the sensitivity (SE) and specificity (SP) to be 0.94 (95% confidence interval [CI] 0.82, 0.99) and 0.52 (95% CI 0.34, 0.69), respectively. Young, et al, using their own original data, calculated the SE and SP of centralization signs to be 0.47 and 1.00, respectively, also using discography as the Gold Standard. They also found that pain upon arising from a sitting position was associated with disc pain (p = 0.017). This historical factor may therefore be useful in identifying the "centralizer", though as will be noted below, pain when arising from sitting is also associated with segmental pain provocation signs in the sacroiliac (SI) area. Laslett, et al also used discogram as the Gold Standard and calculated the SE, SP, positive likelihood ratio (PLR) and negative likelihood ratio (NLR) for centralization signs to be 40%, 94%, 6.9 and 0.63, respectively. They also used the Roland Morris Disability questionnaire to measure disability and the Distress Risk Assessment Method to measure distress, and found that these factors altered the SE, SP, PLR and NLR. In the presence of severe disability, these values were 46%, 80%, 3.2 and 0.63, respectively, and in the presence of severe distress they were 45%, 89%, 4.1 and 0.61, respectively.
It is pointed out by Long, et al that it is not necessary to assume a particular pain generating tissue when using the McKenzie assessment as a means of making treatment decisions. In their study, clinical decisions were made regarding exercise direction based on the findings of the end range loading examination. One group of patients was given exercise maneuvers in the direction of centralization of symptoms, another was given exercises in the direction opposite that of centralization, and a third group was given exercises that did not consider any specific direction. They found significantly greater improvement (p < 0.001) in outcome in the patients who were given exercises in the direction of centralization, suggesting that the McKenzie evaluation in the lumbar spine allows clinicians to make treatment decisions that are of ultimate benefit to patients. This may be a more important measure of "validity" than the identification of a certain pain generating tissue (i.e., using a prognostic criterion as a reference standard for the assessment method).
Centralization signs have also been found to be predictive of long term outcome. Werneke and Hart found that discriminating between patients who exhibit centralization signs and those who do not allows for prediction of pain, disability and return to work at 1 year. In a separate study, Werneke and Hart compared classification according to centralization signs with classification according to the Quebec Task Force (QTF) criteria. They found that examination for centralization signs had greater predictive validity for pain and disability at discharge from care than the QTF criteria. Werneke and Hart have also found that assessing centralization signs over the period of multiple visits allows for more accurate discrimination than a single assessment.
Segmental pain provocation signs
Reliability – lumbar
Similar to what was found for the cervical spine, palpation for movement restriction in the lumbar spine has not been shown to be reliable, though palpation for pain has. Keating, et al  used 3 chiropractors who examined 25 asymptomatic subjects and 21 patients with low back pain. They found marginal to good IER of palpation for pain provocation over bony structures (k = 0.19 to 0.48) and soft tissues (k = 0.10 to 0.59). The strongest IER was found for the L4-5 and L5-S1 segments. Maher and Adams  used 2 examiners to assess 90 subjects with low back pain, allowing each examiner to use whatever palpation method he or she chose. The examiners assessed each patient for pain and stiffness. They found that, while the IER of palpation for stiffness was low (intraclass correlation coefficient [ICC] = 0.03–0.37) the IER for pain was good (ICC = 0.67–0.72). Strender, et al  used 2 medical physicians and 2 physical therapists to evaluate 71 patients with low back pain. They found moderate agreement (k = 0.40) for palpation for tenderness. Lundberg, et al  used 2 examiners to assess 609 female subjects for segmental mobility and pain provocation through palpation. They found good IER (k = 0.67 – 0.71) for this assessment.
Seffinger, et al  systematically reviewed the literature regarding the IER of palpatory diagnosis in both neck and back pain. They concluded that palpatory procedures for pain provocation generally have acceptable IER (k = 0.40 or greater) and that 64% of studies looking at pain provocation found acceptable IER.
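Because most of the reliability findings in this section are expressed as kappa, a short sketch of how Cohen's kappa is obtained for two examiners and a positive/negative finding may be helpful. The counts below are hypothetical and are not drawn from any cited study; the 0.40 threshold is the "acceptable" level used by Seffinger, et al. The second example shows why raw percentage agreement can look high while kappa remains low when positive findings are rare.

```python
def cohens_kappa(both_pos, a_only, b_only, both_neg):
    """Cohen's kappa for two examiners rating each patient positive/negative.

    both_pos / both_neg: patients on whom the examiners agreed.
    a_only / b_only: patients rated positive by only one examiner.
    """
    n = both_pos + a_only + b_only + both_neg
    if n == 0:
        raise ValueError("no observations")
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + a_only) / n  # proportion rated positive by examiner A
    b_pos = (both_pos + b_only) / n  # proportion rated positive by examiner B
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)  # chance agreement
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical data: 50 patients, 80% raw agreement, balanced findings.
balanced = cohens_kappa(both_pos=20, a_only=5, b_only=5, both_neg=20)
# Hypothetical data: 50 patients, 84% raw agreement, positives are rare.
rare = cohens_kappa(both_pos=1, a_only=4, b_only=4, both_neg=41)

print(f"balanced: kappa = {balanced:.2f}, acceptable = {balanced >= 0.40}")
print(f"rare:     kappa = {rare:.2f}, acceptable = {rare >= 0.40}")
```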
Reliability – Sacroiliac area
With regard to the SI area, the earliest study of IER was that of Potter and Rothstein . They did not use the kappa statistic, but they found that tests that attempt to determine movement abnormality had poor reliability (less than 70% agreement) but the 2 tests that relied on patient response had agreement of 70–90%. Carmichael  also found poor IER (k = 0.314) of an SI test that assessed for mobility. Freburger and Riddle  found poor reliability (k = 0.18) of the measurement of SI joint position using handheld calipers. Robinson, et al  evaluated the reliability of various pain and SI joint dysfunction tests. The palpation test for joint play showed very poor reliability (k = -0.06). Other pain provocation tests demonstrated moderate to good reliability (k = 0.43–0.84). When clustered results of three to five pain provocation tests were used there was also good reliability (k = 0.51–0.75). A study by Vincent-Smith and Gibbons  evaluated the IER and intra-examiner reliability of the standing flexion test for SI joint dysfunction. Intra-examiner reliability was moderate (k = 0.46) while IER was very poor (k = 0.052).
Tong, et al tested the hypothesis that combining the test results of various measures of SI joint dysfunction would yield greater reliability than individual tests. They evaluated three methods: Method 1, using the test result with the highest IER; Method 2, requiring at least one test result to be abnormal for the variable to be abnormal; and Method 3, requiring all test results to be abnormal for the variable to be abnormal. Kappa scores were 0.47, 0.08, and 0.32 using Method 1 for the sacral position, innominate bone position, and side of sacroiliac joint dysfunction, respectively. For Method 2 the values were 0.09, 0.4, and 0.16. For Method 3 the values were 0.16, 0.1, and -0.33.
Laslett and Williams used 2 examiners to evaluate 51 patients using 6 tests designed to identify a painful SI joint. They found moderate to high IER (k = 0.69 to 0.82) of several tests. Dreyfuss, et al found moderate IER (k = 0.61 to 0.64) for 3 SI pain provocation tests. Kokmeyer, et al found good IER (k = 0.70) of a cluster of 5 SI pain provocation tests. Studies that have evaluated tests of SI mobility have generally found poor IER.
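The three combination rules evaluated by Tong, et al, described above, can be sketched as simple decision functions. This is an illustration of the logic as summarized in this review, not the authors' own procedure; the test names, findings and kappa values below are hypothetical.

```python
def method_1(results, kappas):
    """Use only the finding from the test with the highest IER (kappa)."""
    if not results:
        raise ValueError("no test results supplied")
    most_reliable = max(kappas, key=kappas.get)
    return results[most_reliable]


def method_2(results):
    """Variable is abnormal if at least one test result is abnormal."""
    if not results:
        raise ValueError("no test results supplied")
    return any(results.values())


def method_3(results):
    """Variable is abnormal only if all test results are abnormal."""
    if not results:
        raise ValueError("no test results supplied")
    return all(results.values())


# Hypothetical findings from one examiner for one variable (True = abnormal).
results = {"test_a": True, "test_b": False, "test_c": True}
# Hypothetical IER (kappa) of each individual test.
kappas = {"test_a": 0.20, "test_b": 0.45, "test_c": 0.10}

print(method_1(results, kappas))  # False: test_b has the highest kappa
print(method_2(results))          # True: at least one test is abnormal
print(method_3(results))          # False: not every test is abnormal
```

Reliability under each rule would then be estimated by applying the same rule to each examiner's findings and computing kappa on the combined results.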
Validity – lumbar
Young, et al found a correlation between abolishment of pain with facet joint blocks and the absence of a historical report of pain when standing from a sitting position. Revel, et al found that the following characteristics were associated with patients whose pain was relieved by 75% or more with facet joint blocks: age over 65, pain not exacerbated by coughing, pain not worsened by hyperextension, pain not worsened by forward flexion, pain not worsened by rising from forward flexion, pain not worsened by extension-rotation and pain well relieved with recumbency. Similar findings have been reported by other authors [98, 99]. Laslett, et al found that these criteria had low SE (< 0.17), though they did have high SP (0.90). Laslett, et al found that 4 or more out of the following 7 signs carried a SE of 1.00 and SP of 0.87 as compared to single facet joint blocks: Age ≥ 50, symptoms best walking, symptoms best sitting, onset pain is paraspinal, Modified Somatic Perception Questionnaire score > 13, positive extension/rotation test, and absence of centralization signs. So, as will be seen in the SI joint area, ruling out centralization signs is necessary to increase the diagnostic yield in identifying segmental pain provocation signs.
Validity – SI joint area
In the SI joint area, Broadhurst and Bond compared 3 pain provocation tests with anesthetic block and found the SE of single tests ranged from 0.77 to 0.87. The SP of each test was 1.00. Slipman, et al used a cluster of pain provocation tests, with the criterion of at least 3 "positive" tests, in 50 consecutive patients with LBP. They compared this examination with the Gold Standard of single anesthetic blocks. They estimated the PPV of the examination to be 60%. van der Wurff, et al assessed 140 patients with chronic LBP with a cluster of 5 pain provocation maneuvers for the SI joint. This cluster was the same as that used in the study by Kokmeyer, et al that had found good IER. They considered that 3 out of the 5 tests being pain-producing constituted a "positive" test. They compared this regimen with the Gold Standard of double anesthetic blocks. They calculated the SE of the regimen as 0.85 (95% CI, 0.72–0.99), the SP as 0.79 (95% CI, 0.65–0.93), and the PPV and NPV as 0.77 (95% CI, 0.62–0.92) and 0.87 (95% CI, 0.74–0.99), respectively. The PLR was 4.02 (95% CI, 2.04–7.89); the NLR was 0.19 (95% CI, 0.07–0.47). Laslett, et al used these same SI provocation tests and compared these to single anesthetic block. They added to the Gold Standard criteria the reproduction of concordant pain upon infiltration, followed by 80% or more reduction of pain as a result of injection. They found that the presence of 3 positive tests carried a SE of 0.94, a SP of 0.78, a PPV of 0.68, and a NPV of 0.96. Young, et al also found significant (p < .001) association between the presence of 3 or more positive pain provocation tests for the SI and positive SI injection and also found positive association between positive SI injection and the following historical factors: pain when arising from a sitting position (p = .02), pain being unilateral (p = .05) and the absence of midline pain (p = .05). They also noted that patients with positive SI injection rarely had pain superior to the L5 level.
Importantly, Laslett, et al  found that performing SI provocation maneuvers in the context of the end range loading exam for centralization signs (see below) increases the diagnostic yield of the SI tests. The SP of the SI provocation tests rose from 0.78 to 0.87 and the PLR rose from 4.16 to 6.97.
Slipman, et al  compared radionuclide imaging to the Gold Standard of single SI joint block and found this test to have high SP (100%) but very low SE (12.9%).
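Unlike SE and SP, the positive and negative predictive values (PPV and NPV) reported above for the SI provocation clusters depend on how common SI joint pain is in the population being examined. The sketch below applies Bayes' theorem to the SE and SP reported by van der Wurff, et al (0.85 and 0.79); the two prevalence figures are assumptions chosen for illustration and are not taken from any cited study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# SE and SP of the 3-of-5 cluster; the prevalences are hypothetical.
for prevalence in (0.10, 0.50):
    ppv, npv = predictive_values(0.85, 0.79, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
# prevalence 0.10: PPV = 0.31, NPV = 0.98
# prevalence 0.50: PPV = 0.80, NPV = 0.84
```

Under these assumptions the same cluster is much better at ruling in SI joint pain where that pain is common, which is one reason predictive values from a single study should be generalized with caution.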
The standard neurodynamic tests in the lumbar spine are the Straight Leg Raise (SLR), Femoral Nerve Stretch test (FNST – also sometimes referred to as the Prone Knee Bend ) and the Slump test. Clinicians will often include Bragard's test (adding ankle dorsiflexion to the SLR) and the Well Leg Raise (WLR) test (eliciting pain on the affected side by performing a SLR on the contralateral limb) to serve as sensitizing and differentiating maneuvers for the purpose of increasing the specificity of the examination for lower lumbar nerve root pain .
Hunt, et al assessed the IER of the SLR using 2 teams of examiners, each team consisting of one physician and one physical therapist. They found fair IER (k = 0.54 for left leg, 0.48 for right leg) but this study used asymptomatic subjects and measured SLR using a goniometer. Vroomen, et al used a neurologist and a neurology resident to assess 338 patients with "sciatica". They calculated the IER of a variety of historical factors and clinical tests in patients with suspected lumbar radiculopathy. For the standard SLR, they found good IER (k = 0.68) when the interpretation of the test findings included the production of "typically dermatomal pain". The k values for the Bragard's and WLR tests were 0.66 and 0.70, respectively. When historical and examination factors were taken into consideration in arriving at a diagnosis of nerve root pain, the k value was 0.66. The historical factors with the greatest IER were increased pain with coughing/sneezing/straining (k = 0.64), increased pain with walking (k = 0.56), coldness in the lower extremity (k = 0.56), urinary incontinence (k = 0.79) and previous back pain episodes (k = 0.67).
McCombe, et al  used 2 surgeons to assess 50 patients and found fair agreement for the FNST (k = 0.3–0.5). Philip, et al  used 6 pairs of physiotherapists to examine 93 patients using the Slump test. They found good to perfect IER (k = 0.72 to 1.00). Gabbe, et al  used a physiotherapist and a research student to assess 15 asymptomatic volunteers using the slump test and found excellent reliability (ICC = 0.92, 95% CI 0.77, 0.97).
Vroomen, et al found that SLR was not predictive of the presence of herniated disc on MRI. They did not assess WLR or Bragard's test. They did note that the historical factors of a dermatomal distribution of pain, increase in pain on coughing, sneezing, or straining, paroxysmal pain, and predominant leg pain were predictive. Using MRI as a "Gold Standard" may be questionable because of the potential for false positive findings. Lurie reviewed the literature on diagnostic tests for LBP and found that the SLR has generally been found to have high SE (0.78 to 0.97) but low SP (0.10 to 0.52) in identifying patients with disc herniation. The opposite is found for the WLR test, which has been found to have low SE (0.22 to 0.52) and high SP (0.85 to 1.0). He does note, however, that "much of the literature is limited by methodological flaws". Many clinicians feel that the combination of the SLR and WLR, along with Bragard's test and other "localizing" and "sensitizing" maneuvers, improves the SE and SP of the examination for pain of neural origin. This has not been specifically evaluated.
The validity of the FNST has not been well studied .
Stankovic, et al  found those patients with the complaint of LBP and/or leg pain whose imaging revealed a herniated disc were more likely to have distal pain in the lower extremity on the performance of the Slump test, although the difference was not statistically significant (p < 0.017). No values with regard to SE, SP and PPV and NPV were calculated.
Nice, et al  used 12 examiners to assess 50 patients with LBP for trigger points, using the standard criteria of the presence of a "taut band" and localized "nodule", the presence of a "twitch response" and the reproduction of familiar pain. They found IER to be poor (k = 0.29 to 0.38). Njoo and Van der Does  also found poor IER when considering all of the standard criteria of TrP presence. However, when considering only tenderness to palpation, particularly when combined with the identification of concordant pain on the part of the patient, IER increased greatly (k > 0.5). Hsieh, et al  used 1 "expert" DC with many years of experience with TrP palpation, 2 DC's with 15 years of practice experience but not with extensive experience with TrP palpation, and several chiropractic and psychiatry residents. They provided all clinicians with 3 2-hour lectures and 3 2-hour hands-on sessions as training in TrP palpation, and compared the agreement between the expert and the others for the presence of taut band, local twitch response and referred pain. They found generally poor IER, concluding that even with experienced clinicians, short term training in TrP palpation is not enough to provide IER.
It would appear that if the examiner places greatest emphasis on tenderness to palpation and reproduction of concordant pain, and less emphasis on the presence of a taut band and a twitch response, the IER of muscle palpation signs will be enhanced. Also, Simons has pointed out  that those studies using untrained and/or inexperienced examiners have generally found poor IER, whereas those using trained and experienced examiners have generally found favorable IER in TrP examination, indicating the importance of examiners having appropriate training and experience with muscle palpation signs.
As with the cervical spine, the validity of myofascial signs in the lumbar spine is unknown due to the absence of a Gold Standard for the identification of myofascial pain.
3. What has gone wrong with this person as a whole that would cause the pain experience to develop and persist?
Dynamic instability (impaired motor control)
There are 3 tests that have been proposed to identify the presence of dynamic instability in the lumbar spine, and for which there are data on IER. One is the Segmental Instability test, which Hicks, et al found to have excellent (k = 0.87) IER between 3 pairs of examiners in 63 subjects. This study found the Standing Flexion test to have moderate IER (k = 0.69). The Hip Extension test was found by Murphy, et al to have good (k = 0.72 to 0.76) IER between 2 examiners in 42 subjects.
Reliability – pelvis
The Active Straight Leg Raise (ASLR) test is designed to assess dynamic stability in the pelvis. IER of the ASLR has not been evaluated; however, Mens, et al found test-retest reliability over the space of one week to be high (Pearson's correlation coefficient = 0.87; ICC = 0.83) in a study of pregnant women.
Validity – lumbar
The only validity study that was found was that of Abbott, et al. This study assessed manual examination using intervertebral motion tests and compared it with a reference standard of flexion-extension radiographs. The authors provided SE, SP and PPV data; however, no data were presented with regard to the IER of the manual examination procedures, making interpretation of the validity data difficult.
Validity – pelvis
Mens, et al  compared the ASLR test with the Posterior Pelvic Pain Provocation (PPPP) test, a test with good reliability and validity  for the identification of painful SI joints. Using the PPPP test as the Gold Standard, they found the ASLR test to have a SE of 0.87 and a SP of 0.94. In another study, Mens, et al  compared the ASLR test to the Quebec Back Pain Disability Scale in 200 pregnant patients with posterior pelvic pain. They found a high correlation between the 2 tests (r = 0.70). O'Sullivan et al  found evidence of altered activity in the diaphragm and the pelvic floor muscles, both of which are thought to play important roles in motor control of the trunk, in patients with a positive ASLR as compared to those with a normal test. No actual measures of pelvic motor control were performed, however.
Central Pain Hypersensitivity (CPH)
There is some evidence for the IER of Waddell's nonorganic signs, although this evidence is inconsistent .
Fishbain, et al  reviewed the literature on the use of Waddell's nonorganic signs and found consistent evidence that they are associated with decreased functional performance, poor treatment outcome and increased pain perception. Whether the relationship between the presence of these signs and increased pain perception means that these signs are an indication of CPH specifically is unknown. However, until further investigation is undertaken, it appears that these signs may be a useful means to identify increased pain perception that may be related to CPH.
Fear and Catastrophizing
The Fear-Avoidance Beliefs Questionnaire, the Tampa Scale for Kinesiophobia and the Fear-Avoidance Pain Scale have been demonstrated to be predictive of present LBP as well as future progression to chronicity [130–134]. Regarding catastrophizing, the Pain Catastrophizing Scale [132, 134] has been found to be useful.
These measures have been found to predict decreased physical performance and perceived disability in patients with acute LBP , current pain intensity and disability in patients with chronic LBP , and reduction in disability after treatment .
The Guarding scale of the Chronic Pain Coping Inventory  and the Coping Strategies Questionnaire  have been found to be predictive, in part, of chronicity in patients with LBP.
The Beck Depression Inventory (BDI) has been used for a number of years in patients with spinal pain, and has been demonstrated to have good utility in identifying significant depressive symptoms in LBP patients . Walsh, et al  found that a Mental Component Summary cutoff score of 35 on the SF-36 instrument carried a SE of 0.80 and a SP of 0.90 compared to the Gold Standard of the CES-D. Low scores on the SF-36 Mental Health Index are associated both cross-sectionally and longitudinally with low-back pain and disability  suggesting that psychological distress may be both a predictor and consequence of spinal pain. The Depression Anxiety Stress Scales (DASS) have been found to have good internal consistency and reliability, and to compare favorably with the BDI , although this study was not performed with spinal pain patients. Haggman, et al  used receiver operating characteristic curves to compare the administration of a 2 question screening ("During the past month, have you often been bothered by feeling down, depressed, or hopeless?" and "During the past month, have you often been bothered by little interest or pleasure in doing things?") with the DASS. They found the screening questions accurately predicted DASS scores (Area Under the Curve [AUC] values of 0.77 to 0.81). The PLR reached as high as 5.40 and the NLRs as low as 0.18. Whether this 2-question screening is useful for research purposes is unclear.
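The clinical meaning of the likelihood ratios reported for the 2-question screening can be illustrated by converting a pre-test probability into a post-test probability by way of odds. The sketch below uses the most favorable PLR and NLR cited above (5.40 and 0.18); the pre-test probability of 0.30 is an assumption chosen for illustration, not a prevalence reported by Haggman, et al.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood ratio cannot be negative")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


pre_test = 0.30  # hypothetical pre-test probability of depressive symptoms
after_positive = post_test_probability(pre_test, 5.40)  # positive screen
after_negative = post_test_probability(pre_test, 0.18)  # negative screen
print(f"positive screen: {after_positive:.2f}")  # 0.70
print(f"negative screen: {after_negative:.2f}")  # 0.07
```

Under this assumed starting point, a positive screen would raise the probability to roughly 70% and a negative screen would lower it to roughly 7%.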
As was stated in Part 1, there is significant overlap and interaction between fear, catastrophizing, passive coping and depression [141, 142]. Thus, from a clinical standpoint, it may be necessary to measure only 1 or 2 of these constructs in spinal pain patients, rather than all of them; however, research is needed to determine this for certain.
In a previous paper the authors presented the conceptual model of a novel approach to the diagnosis and treatment of patients with spinal pain. The specific components of the diagnostic model were described and the decision making process based on the diagnostic approach was discussed. In this paper, the evidence as it currently exists for the reliability and validity of the components of the diagnostic model is presented. Future research will be conducted to investigate those questions that remain unanswered with regard to the ability of clinicians to arrive at a specific diagnosis in patients with spinal pain on which they can base a targeted treatment approach.
Borkan J, Van Tulder M, Reis S, Schoene ML, Croft P, Hermoni D: Advances in the field of low back pain in primary care: a report from the Fourth International Forum. Spine. 2002, 27 (5): E128-E132.
Murphy DR, Hurwitz EL: A theoretical model for the development of a diagnosis-based clinical decision rule for the management of patients with spinal pain. BMC musculoskeletal disorders. 2007, 8: 75.
Bigos S, Bowyer O, Braen G, Brown K, Deyo R, Haldeman S: Acute Low Back Problems in Adults. Clinical Practice Guideline Number 14. AHCPR Pub No 95-0642. 1994, Rockville, MD, Agency for Health Care Policy and Research, Public Health Service, US Department of Health and Human Services
Australian Acute Musculoskeletal Pain Guidelines Group: Evidence-Based Management of Acute Musculoskeletal Pain. 2003, Bowen Hills, QLD
Chou R, Qaseem A, Snow V, Casey D, Cross JT Jr, Shekelle P, Owens DK: Diagnosis and treatment of low back pain: a joint clinical practice guideline from the American College of Physicians and the American Pain Society. Annals of internal medicine. 2007, 147 (7): 478-491.
Ferri FF: Ferri's Differential Diagnosis - A Practical Guide to the Differential Diagnosis of Symptoms, Signs, and Clinical Disorders. 2006, St. Louis, Mosby
Swenson RS: A medical approach to the differential diagnosis of low back pain. J Neuromusculoskel Sys. 1998, 6 (3): 100-113.
McKenzie RA: The Cervical and Thoracic Spine: Mechanical Diagnosis and Therapy. 2006, Waikanae, New Zealand, Spinal Publications
McKenzie RA, May S: The Lumbar Spine: Mechanical Diagnosis and Therapy. 2003, Waikanae, New Zealand, Spinal Publications, 2nd
Clare HA, Adams R, Maher CG: Reliability of McKenzie classification of patients with cervical or lumbar pain. J Manipulative Physiol Ther. 2005, 28 (2): 122-127.
Love RM, Brodeur RR: Inter- and Intra- examiner reliability of motion palpation for the thoracolumbar spine. J Manipulative Physiol Ther. 1987, 10 (1): 1-4.
Boline PD, Keating JC, Brist J, Denver G: Interexaminer reliability of palpatory evaluations of the lumbar spine. AJCM. 1988, 1 (1): 5-11.
Christensen HW, Vach W, Vach K, Manniche C, Haghfelt T, Hartvigsen L, Carlsen PFH: Palpation of the upper thoracic spine an observer reliability study. J Manipulative Physiol Ther. 2002, 25 (5): 285-292.
Mior S, King R, McGregor M, Bernard M: Intra and inter-examiner reliability of motion palpation in the cervical spine. J Can Chiro Assoc. 1985, 29: 195-199.
Binkley J, Stratford PW, Gill C: Interrater reliability of lumbar accessory motion mobility testing. Phys Ther. 1995, 75 (9): 786-792.
Trijffel EV, Anderegg Q, Lucas C: Inter-examiner reliability of passive assessment of intervertebral motion in the cervical and lumbar spine: A systematic review. Manual therapy. 2005, 10: 256-269.
Leboeuf-Yde C, van Dijk J, Franz C, Hustad SA, Olsen D, Pihl T, Robech R, Vendrup SS, Bendix T, Kyvik KO: Motion palpation findings and self-reported low back pain in a population-based study sample. J Manipulative Physiol Ther. 2002, 25 (2): 80-87.
Hubka MJ, Phelan SP: Interexaminer reliability of palpation for cervical spine tenderness. J Manipulative Physiol Ther. 1994, 17 (9): 591-595.
Jull G, Zito G, Trott P, Potter H, Shirley D: Inter-examiner reliability to detect painful upper cervical joint dysfunction. Aust J Physiother. 1997, 43 (2): 125-129.
Marcus DA, Scharff L, Mercer S, Turk DC: Musculoskeletal abnormalities in chronic headache a controlled comparison of headache diagnostic groups. Headache. 1999, 39: 21-27.
McPartland JM, Goodridge JP: Counterstrain and traditional osteopathic examination of the cervical spine compared. J Bodywork Movement Ther. 1997, 1 (3): 173-178.
van Suijlekom HA, de Vet HCW, van den Berg SGM, Weber WEJ: Interobserver reliability on physical examination of the cervical spine in patients with headache. Headache. 2000, 40: 581-586.
Cleland JA, Childs JD, Fritz JM, Whitman JM: Interrater reliability of the history and physical examination in patients with mechanical neck pain. Arch Phys Med Rehabil. 2006, 87 (10): 1388-1395.
Jull G, Bogduk N, Marsland A: The accuracy of manual diagnosis for cervical zygapophysial joint pain syndromes. Med J Aust. 1988, 148 (5): 233-236.
Carragee EJ, Haldeman S, Hurwitz E: The pyrite standard: the Midas touch in the diagnosis of axial pain syndromes. Spine J. 2007, 7 (1): 27-31.
Treleaven J, Jull G, Atkinson L: Cervical musculoskeletal dysfunction in post-concussional headache. Cephalalgia. 1994, 14: 273-279.
Sandmark H, Nisell R: Validity of five manual neck pain provokation tests. Scand J Rehabil Med. 1995, 27 (3): 131-136.
Lord SM, Barnsley L, Wallis BJ, Bogduk N: Third occipital nerve headache: a prevalence study. J Neurol Neurosurg Psychiatr. 1994, 57: 1187-1190.
Zito G, Jull G, Story I: Clinical tests of musculoskeletal dysfunction in the diagnosis of cervicogenic headache. Manual therapy. 2006, 11: 118-129.
King W, Lau P, Lees R, Bogduk N: The validity of manual examination in assessing patients with neck pain. Spine J. 2007, 7 (1): 22-26.
Kleinrensink GJ, Stoeckart R, Mulder PG, Hoek G, Broek T, Vleeming A, Snijders CJ: Upper limb tension tests as tools in the diagnosis of nerve and plexus lesions. Anatomical and biomechanical aspects. Clin Biomech (Bristol, Avon). 2000, 15 (1): 9-14.
Wainner RS, Fritz JM, Irrgang JJ, Boninger ML, Delitto A, Allison S: Reliability and diagnostic accuracy of the clinical and patient self report measures for cervical radiculopathy. Spine. 2003, 28 (1): 52-62.
Shah KC, Rajshekhar V: Reliability of diagnosis of soft cervical disc prolapse using Spurling’s test. Br J Neurosurg. 2004, 18 (5): 480-483.
Gerwin RD, Shannon S, Hong CZ, Hubbard D, Gevirtz R: Interrater reliability in myofascial trigger point examination. Pain. 1997, 69 (1/2): 65-73.
Sciotti VM, Mittak VL, DiMarco L, Ford LM, Plezbert J, Santipadri E, Wigglesworth J, Ball K: Clinical precision of myofascial trigger point location in the trapezius muscle. Pain. 2001, 93: 259-266.
Lew PC, Lewis J, Story I: Inter-therapist reliability in locating latent myofascial trigger points using palpation. Manual therapy. 1997, 2 (2): 87-90.
Falla D: Unraveling the complexity of muscle impairment in chronic neck pain. Manual therapy. 2004, 9: 125-133.
Jull G, Barrett C, Magee R, Ho P: Further clinical clarification of the muscle dysfunction in cervical headache. Cephalalgia. 1999, 19: 179-185.
Chiu TTW, Law EYH, Chiu THF: Performance of the craniocervical flexion test in subjects with and without chronic neck pain. J Orthop Sports Phys Ther. 2005, 35 (9): 567-571.
Harris KD, Heer DM, Roy TC, Santos DM, Whitman JM, Wainner RS: Reliability of a measurement of neck flexor muscle endurance. Physical therapy. 2005, 85 (12): 1349-1355.
Olson LE, Millar AL, Dunker J, Hicks J, Glanz D: Reliability of a clinical test for deep cervical flexor endurance. J Manipulative Physiol Ther. 2006, 29 (2): 134-138.
Jull G, Kristjansson E, Dall' Alba P: Impairment in the cervical flexors a comparison of whiplash and insidious onset neck pain patients. Manual therapy. 2004, 9 (2): 89-94.
Falla D, Bilenkij G, Jull G: Patients with chronic neck pain demonstrate altered patterns of muscle activation during performance of a functional upper limb task. Spine. 2004, 29 (13): 1436-1440.
Jull GA: Deep cervical flexor muscle dysfunction in whiplash. J Musculoskel Pain. 2000, 8 (1/2): 143-154.
Fishbain DA, Cole B, Cutler RB, Lewis J, Rosomoff HL, Rosomoff RS: A structured evidence-based review on the meaning of nonorganic physical signs Waddell Signs. Pain Med. 2003, 4 (2): 141-181.
Sobel JB, Sollenberger P, Robinson R, Polatin PB, Gatchel RJ: Cervical nonorganic signs a new clinical tool to assess abnormal illness behavior in neck pain patients a pilot study. Arch Phys Med Rehabil. 2000, 81 (2): 170-175.
Casey KL: Concepts of pain mechanisms: the contribution of functional imaging of the human brain. Prog Brain Res. 2000, 129: 277-288.
Moseley GL: Widespread brain activity during an abdominal task markedly reduced after pain physiology education: fMRI evaluation of a single patient with chronic low back pain. Aust J Physiother. 2005, 51 (1): 49-52.
Hildingsson C, Wenngren B, Bring G, Toolanen G: Eye motility dysfunction after soft tissue injury of the cervical spine a controlled prospective study of 38 patients. Acta Orthop Scand. 1993, 64 (2): 129-132.
Rosenhall U, Tjell C, Carlsson J: The effect of neck torsion on smooth pursuit eye movements in tension-type headache patients. J Audiol Med. 1996, 5 (3): 130-140.
Gimse R, Tjell C, Bjorgen I, Saunte C: Disturbed eye movements after whiplash due to injuries to posture control system. J Clin Exp Neuropsychol. 1996, 18 (2): 178-186.
Tjell C, Tenenbaum A, Sandstrom S: Smooth pursuit neck torsion test-a specific test for whiplash associated disorders?. J Whiplash Rel Disord. 2002, 1 (2): 9-24.
Heikkila HV, Wenngren BI: Cervicocephalic kinesthetic sensibility, active range of cervical motion, oculomotor function in patients with whiplash injury. Arch Phys Med Rehabil. 1998, 79 (9): 1089-1094.
Revel M, Andre-Deshays C, Minguet M: Cervicocephalic kinesthetic sensibility in patients with cervical pain. Arch Phys Med Rehabil. 1991, 72 (5): 288-291.
Loudon JK, Ruhl M, Field E: Ability to reproduce head position after whiplash injury. Spine. 1997, 22 (8): 865-868.
Treleaven J, Jull G, LowChoy N: The relationship of cervical joint position error to balance and eye movement disturbances in persistent whiplash. Manual therapy. 2006, 11: 99-106.
Waddell G, Newton M, Henderson I, Somerville D, Main CJ: A fear-avoidance beliefs questionnaire (FABQ) and the role of fear-avoidance beliefs in chronic low back pain and disability. Pain. 1993, 52: 157-168.
Swinkels-Meewisse EJCM, Swinkels RAHM, Verbeek ALM, Vlaeyen JWS, Oostendorp RAB: Psychometric properties of the Tampa Scale for Kinesiophobia and the Fear-Avoidance Beliefs Questionnaire in acute low back pain. Manual therapy. 2003, 8 (1): 29-36.
Crowley D, Kendall NAS: Development and initial validation of a questionnaire for measuring fear-avoidance associated with pain the fear-avoidance of pain scale. J Musculoskel Pain. 1999, 7 (3): 3-20.
Nederhand MJ, Izerman MJ, Hermens HJ, Turk DC, Zilvold G: Predictive value of fear avoidance in developing chronic neck pain disability consequences for clinical decision making. Arch Phys Med Rehabil. 2004, 85 (3): 496-501.
Sterling M, Jull G, Vicenzine B, Kenardy J, Darnel R: Physical and psychological factors predict outcome following whiplash injury. Pain. 2005, 114 (1-2): 141-148.
Sterling M, Jull G, Kenardy J: Physical and psychological factors maintain long-term predictive capacity post-whiplash injury. Pain. 2006, 122 (1-2): 102-108.
Buitenhuis J, Jaspers JPC, Fidler V: Can kinesiophobia predict the duration of neck symptoms in acute whiplash?. The Clinical journal of pain. 2006, 22 (3): 272-277.
Mercado AC, Carroll LJ, Cassidy JD, Cote P: Coping with neck and low back pain in the general population. Health Psychol. 2000, 19 (4): 333-338.
Carroll LJ, Cassidy JD, Cote P: The role of pain coping strategies in prognosis after whiplash injury: passive coping predicts slowed recovery. Pain. 2006, 124 (1-2): 18-26.
Stansbury JP, Ried LD, Velozo CA: Unidimensionality and bandwidth in the Center for Epidemiologic Studies Depression (CES-D) Scale. J Pers Assess. 2006, 86 (1): 10-22.
Radloff L: The CES-D scale: a self-report depression scale for research in the general population. Appl Psychol Measurement. 1977, 1: 385-392.
Riddle DL, Rothstein JM: Intertester reliability of McKenzie’s classifications of the syndromes types present in patients with low back pain. Spine. 1993, 18 (10): 1333-1344.
Kilby J, Stigant M, Roberts A: The reliability of back pain assessment by physiotherapists using a "McKenzie algorithm". Physiother. 1990, 76 (9): 579-583.
Werneke M, Hart DL, Cook D: A descriptive study of the centralization phenomenon a prospective analysis. Spine. 1999, 24 (7): 676-683.
Fritz JM, Delitto A, Vignovic M, Busse RG: Interrater reliability of judgments of the centralization phenomenon and status change during movement testing in patients with low back pain. Arch Phys Med Rehabil. 2000, 81 (1): 57-61.
Razmjou H, Kramer J, Yamada R: Intertester reliability of the McKenzie evaluation in assessing patients with mechanical low-back pain. J Orthop Sports Phys Ther. 2000, 30 (7): 368-83; discussion 384–9.
Kilpikoski S, Airaksinen O, Kankaanpaa M, Leminen P, Videman T, Alen M: Interexaminer reliability of low back pain assessment using the McKenzie Method. Spine. 2002, 27 (8): E207-E214.
Donelson R, Aprill C, Medcalf R, Grant W: A prospective study of centralization of lumbar and referred pain: a predictor of symptomatic discs and anular competence. Spine. 1997, 22 (10): 1115-1122.
Young S, Aprill C, Laslett M: Correlation of clinical examination characteristics with three sources of chronic low back pain. Spine J. 2003, 3 (6): 460-465.
Laslett M, Oberg B, Aprill CN, McDonald B: Centralization as a predictor of provocation discography results in chronic low back pain, and the influence of disability and distress on diagnostic power. Spine J. 2005, 5 (4): 370-380.
Long A, Donelson R, Fung T: Does it matter which exercise? A randomized control trial of exercise for low back pain. Spine. 2004, 29 (23): 2593-2602.
Werneke M, Hart DL: Centralization phenomenon as a prognostic factor for chronic low back pain and disability. Spine. 2001, 26 (7): 758-764.
Werneke MW, Hart DL: Categorizing patients with occupational low back pain by use of the Quebec Task Force classification system versus pain pattern classification procedures: discriminant and predictive validity. Physical therapy. 2004, 84 (3): 243-254.
Spitzer WO, Skovron ML, Salmi LR, Cassidy JD, Duranceau J, Suissa S, Zeiss E: Scientific monograph of the Quebec Task Force on Whiplash-Associated Disorders. Spine. 1995, 20 (8S): 2S-73S.
Werneke M, Hart DL: Discriminant validity and relative precision for classifying patients with nonspecific neck and back pain by anatomic pain patterns. Spine. 2003, 28 (2): 161-166.
Keating JC, Bergmann TF, Jacobs GE, Finer BA, Larson K: Inter-examiner reliability of eight evaluative dimensions of lumbar segmental abnormality. J Manipulative Physiol Ther. 1990, 13: 463-470.
Maher C, Adams R: Reliability of pain and stiffness assessments in clinical manual lumbar spine examination. Phys Ther. 1994, 74 (9): 801-809.
Strender LE, Sjoblom A, Sundell K, Ludwig R, Taube A: Interexaminer reliability in physical examination of patients with low back pain. Spine. 1997, 22 (7): 814-820.
Lundberg G, Gerdle B: The relationships between spinal sagittal configuration, joint mobility, general low back mobility and segmental mobility in female homecare personnel. Scand J Rehab Med. 1999, 31 (4): 197-206.
Seffinger MA, Najm WI, Mishra SI, Adams A, Dickerson VM, Murphy LS, Reinsch S: Reliability of spinal palpation for diagnosis of back and neck pain. Spine. 2004, 29 (19): E413-E424.
Potter NA, Rothstein JM: Intertester reliability for selected clinical tests of the sacroiliac joint. Physical therapy. 1985, 65 (11): 1671-1675.
Carmichael JP: Inter- and intra-examiner reliability of palpation for sacroiliac joint dysfunction. J Manipulative Physiol Ther. 1987, 10 (4): 164-171.
Freburger JK, Riddle DL: Measurement of sacroiliac joint dysfunction: a multicenter intertester reliability study. Physical therapy. 1999, 79 (12): 1134-1141.
Robinson HS, Brox JI, Robinson R, Bjelland E, Solem S, Telje T: The reliability of selected motion- and pain provocation tests for the sacroiliac joint. Manual therapy. 2007, 12 (1): 72-79.
Vincent-Smith B, Gibbons P: Inter-examiner and intra-examiner reliability of the standing flexion test. Manual therapy. 1999, 4 (2): 87-93.
Tong HC, Heyman OG, Lado DA, Isser MM: Interexaminer reliability of three methods of combining test results to determine side of sacral restriction, sacral base position, and innominate bone position. The Journal of the American Osteopathic Association. 2006, 106 (8): 464-468.
Laslett M, Williams M: The reliability of selected pain provocation tests for sacroiliac joint pathology. Spine. 1994, 19 (11): 1243-1249.
Dreyfuss P, Michaelsen M, Pauza K, McLarty J, Bogduk N: The value of medical history and physical examination in diagnosing sacroiliac joint pain. Spine. 1996, 21 (22): 2594-2602.
Kokmeyer DJ, van der Wurff P, Aufdemkampe G, Fikenscher TCM: The reliability of multitest regimens with sacroiliac pain provocation tests. J Manipulative Physiol Ther. 2002, 25 (1): 42-48.
van der Wurff P, Meyne W, Hagmeijer RHM: Clinical tests of the sacroiliac joint. A systematic methodological review. Part 1: Reliability. Manual therapy. 2000, 5 (1): 30-36.
Revel M, Poiraudeau S, Auleley GR, Payan C, Denke A, Nguyen M, Chevrot A, Fermanian J: Capacity of the clinical picture to characterize low back pain relieved by facet joint anesthesia: Proposed criteria to identify patients with painful facet joints. Spine. 1998, 23 (18): 1972-1976.
Jackson RP, Jacobs RR, Montesano PX: Facet joint injection in low-back pain: A prospective statistical study. Spine. 1988, 13: 966-971.
Jackson RP: The facet syndrome: Myth or reality?. Clin Orthop Relat Res. 1992, 279: 110-121.
Laslett M, Oberg B, Aprill CN, McDonald B: Zygapophysial joint blocks in chronic low back pain: a test of Revel's model as a screening test. BMC Musculoskel Disord. 2004, 5: 43.
Laslett M, McDonald B, Aprill CN, Tropp H, Oberg B: Clinical predictors of screening lumbar zygapophyseal joint blocks: development of clinical prediction rules. Spine J. 2006, 6 (4): 370-379.
Broadhurst NA, Bond MJ: Pain provocation tests for the assessment of sacroiliac joint dysfunction. J Spinal Disord. 1998, 11 (4): 341-345.
Slipman CW, Sterenfeld EB, Chou LH, Herzog R, Vresilovic E: The predictive value of provocative sacroiliac joint stress maneuvers in the diagnosis of sacroiliac joint syndrome. Arch Phys Med Rehabil. 1998, 79 (3): 288-292.
van der Wurff P, Buijs EJ, Groen GJ: A multitest regimen of pain provocation tests as an aid to reduce unnecessary minimally invasive sacroiliac joint procedures. Arch Phys Med Rehabil. 2006, 87 (1): 10-14.
Laslett M, Aprill CN, McDonald B, Young SB: Diagnosis of sacroiliac joint pain: validity of individual provocation tests and composites of tests. Manual therapy. 2005, 10: 207-218.
Laslett M, Young SB, Aprill CN, McDonald B: Diagnosing painful sacroiliac joints: A validity study of a McKenzie evaluation and sacroiliac provocation tests. Aust J Physiother. 2003, 49 (2): 89-97.
Slipman CW, Sterenfeld EB, Chou LH, Herzog R, Vresilovic E: The value of radionuclide imaging in the diagnosis of sacroiliac joint syndrome. Spine. 1996, 21 (19): 2251-2254.
Butler DS: The Sensitive Nervous System. 2000, Adelaide, Australia, Noigroup Publications
Shacklock M: Clinical Neurodynamics. A New System of Musculoskeletal Treatment. 2005, Edinburgh, Elsevier
Hunt DG, Zuberbier OA, Kozlowski AJ, Robinson J, Berkowitz J, Schultz IZ, Milner RA, Crook JM, Turk DC: Reliability of the lumbar flexion, lumbar extension, and passive straight leg raise test in normal populations embedded within a complete physical examination. Spine. 2001, 26 (24): 2714-2718.
Vroomen PCAJ, de Krom CTFM, Knottnerus JA: Consistency of history taking and physical examination in patients with suspected lumbar nerve root involvement. Spine. 2000, 25 (1): 91-97.
McCombe PF, Fairbank JCT, Cockersole BC, Pynsent PB: Reproducibility of physical signs in low-back pain. Spine. 1989, 14 (9): 908-918.
Philip K, Lew P, Matyas TA: Inter-therapist reliability of the slump test. Aust J Physiother. 1989, 35 (2): 89-94.
Gabbe BJ, Bennell KL, Wajswelner H, Finch CF: Reliability of common lower extremity musculoskeletal screening tests. Phys Ther Sport. 2004, 5 (2): 90-97.
Vroomen PCAJ, de Krom MCTFM, Kester ADM, Knottnerus JA: Diagnostic value of history and physical examination in patients suspected of lumbosacral nerve root compression. J Neurol Neurosurg Psychiatry. 2002, 72: 630-633.
Jarvik JJ, Hollingworth W, Heagerty P, Haynor DR, Deyo RA: The longitudinal assessment of imaging and disability of the back (LAIDback) study: baseline data. Spine. 2001, 26 (10): 1158-1166.
Lurie JD: What diagnostic tests are useful for low back pain?. Best Pract Res Clin Rheumatol. 2005, 19 (4): 557-575.
Stankovic R, Johnell O, Maly P, Willner S: Use of lumbar extension, slump test, physical and neurological examination in the evaluation of patients with suspected herniated nucleus pulposus. A prospective clinical study. Manual therapy. 1999, 4 (1): 25-32.
Nice DA, Riddle DL, Lamb RL, Mayhew TP, Rucker K: Intertester reliability of judgements of the presence of trigger points in patients with low back pain. Archives of physical medicine and rehabilitation. 1992, 73: 893-898.
Njoo KH, Van der Does E: The occurrence and inter-rater reliability of myofascial trigger points in the quadratus lumborum and gluteus medius: a prospective study in non-specific low back pain patients and controls in general practice. Pain. 1994, 58 (3): 317-323.
Hsieh CY, Hong CZ, Adams A, Platt K, C D, F H, Tobis J: Interexaminer reliability of the palpation of trigger points in the trunk and lower limb muscles. Arch Phys Med Rehabil. 2000, 81 (3): 258-264.
Simons DG: Enigmatic trigger points often cause enigmatic musculoskeletal pain. Columbus, OH. 2003.
Hicks GE, Fritz JM, Delitto A, Mishock J: Interrater reliability of clinical examination measures for identification of lumbar segmental instability. Arch Phys Med Rehabil. 2003, 84 (12): 1858-1864.
Murphy DR, Byfield D, McCarthy P, Humphreys BK, Gregory AA, Rochon R: The hip extension test for suspected impaired motor control of the lumbar spine: a study of interexaminer reliability. J Manipulative Physiol Ther. 2006, 29 (5): 374-377.
Mens JMA, Vleeming A, Snijders CJ, Stam HJ: Active straight leg raising test: a clinical approach to the load transfer function of the pelvic girdle. Movement, Stability and Low Back Pain: The Essential Role of the Pelvis. Edited by: Vleeming A, Mooney V, Snijders CJ, Dorman TA, Stoeckart R. 1997, New York, Churchill Livingstone, 425-431.
Mens JMA, Vleeming A, Snijders CJ, Koes BW, Stam HJ: Reliability and validity of the active straight leg raise test in posterior pelvic pain since pregnancy. Spine. 2001, 26 (10): 1167-1171.
Abbott JH, McCane B, Herbison P, Moginie G, Chapple C, Hogarty T: Lumbar segmental instability: a criterion-related validity study of manual therapy assessment. BMC musculoskeletal disorders. 2005, 6: 56.
Mens JMA, Vleeming A, Snijders CJ, Koes BW, Stam HJ: Validity of the active straight leg raise test for measuring disease severity in patients with posterior pelvic pain after pregnancy. Spine. 2002, 27 (2): 196-200.
O’Sullivan PB, Beales DJ, Beetham JA, Cripps J, Graf F, Lin IB, Tucker B, Avery A: Altered motor control strategies in subjects with sacroiliac joint pain during the active straight-leg-raise test. Spine. 2002, 27 (1): E1-E8.
Severeijns R, Vlaeyen JW, van den Hout MA, Weber WE: Pain catastrophizing predicts pain intensity, disability, and psychological distress independent of the level of physical impairment. The Clinical journal of pain. 2001, 17 (2): 165-172.
Truchon M, Cote D: Predictive validity of the chronic pain coping inventory in subacute low back pain. Pain. 2005, 116 (3): 205-212.
Swinkels-Meewisse IEJ, Roelofs J, Oostendorp RAB, Verbeek ALM, Vlaeyen JWS: Acute low back pain: pain-related fear and pain catastrophizing influence physical performance and perceived disability. Pain. 2006, 120 (1-2): 36-43.
Swinkels-Meewisse IEJ, Roelofs J, Schouten EGW, Verbeek ALM, Oostendorp RAB, Vlaeyen JWS: Fear of movement/(re)injury predicting chronic disabling low back pain: a prospective inception cohort study. Spine. 2006, 31 (6): 658-664.
Woby SR, Watson PJ, Roach NK, Urmston M: Are changes in fear-avoidance beliefs, catastrophizing, and appraisals of control, predictive of changes in chronic low back pain and disability?. Eur J Pain. 2004, 8 (3): 201-210.
Koleck M, Mazaux JM, Rascle N, Bruchon-Schweitzer M: Psycho-social factors and coping strategies as predictors of chronic evolution and quality of life in patients with low back pain: A prospective study. Eur J Pain. 2006, 10: 1-11.
Wesley AL, Gatchel RJ, Garofalo JP, Polatin PB: Toward more accurate use of the Beck Depression Inventory with chronic back pain patients. The Clinical journal of pain. 1999, 15 (2): 117-121.
Walsh TL, Homa K, Hanscom B, Lurie J, Sepulveda MG, Abdu W: Screening for depressive symptoms in patients with chronic spinal pain using the SF-36 Health Survey. Spine J. 2006, 6 (3): 316-320.
Hurwitz EL, Morgenstern H, Yu F: Cross-sectional and longitudinal associations of low-back pain and related disability with psychological distress among patients enrolled in the UCLA Low-back pain study. J Clin Epidemiol. 2003, 56: 463-471.
Lovibond PF, Lovibond SH: The structure of negative emotional states: comparison of the Depression Anxiety Stress Scales (DASS) with the Beck Depression and Anxiety Inventories. Behav Res Ther. 1995, 33 (3): 335-343.
Haggman S, Maher CG, Refshauge KM: Screening for symptoms of depression by physical therapists managing low back pain. Physical therapy. 2004, 84 (12): 1157-1166.
Boersma K, Linton S: Psychological processes underlying the development of a chronic pain problem. A prospective study of the relationship between profiles of psychological variables in the fear-avoidance model and disability. The Clinical journal of pain. 2006, 22: 160-166.
Vlaeyen JWS, Kole-Snijders AMJ, Boeren RGB, van Eek H: Fear of movement/reinjury in chronic low back pain and its relation to behavioral performance. Pain. 1995, 62: 363-372.
The authors would like to thank Tovah Reis of the Brown University library and Mary Ott of the New York Chiropractic College library for help with information gathering.
The authors declare that they have no competing interests.
DRM conceived the idea of the diagnosis-based clinical decision rule, led the literature search and review process, and was the principal author of the manuscript. ELH helped with the design and presentation of the systematic review, assisted with the conceptualization of the presented research strategy and contributed to the writing of the manuscript. CFN performed literature searches and reviews and contributed to the writing of the manuscript. All authors read and approved the final manuscript.
Electronic supplementary material
Additional file 1: Table 1. Number of studies identified that address factors related to question number 2. (DOC 33 KB)
Additional file 2: Table 2. Number of studies identified that address factors related to question number 3. (DOC 28 KB)
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Murphy, D.R., Hurwitz, E.L. & Nelson, C.F. A diagnosis-based clinical decision rule for spinal pain part 2: review of the literature. Chiropr Man Therap 16, 7 (2008). https://doi.org/10.1186/1746-1340-16-7
- Neck Pain
- Positive Likelihood Ratio
- Spinal Pain
- Chronic Neck Pain
- Depression Anxiety Stress Scale