Analysis of the ability of chiropractors to predict their own patients’ outcome status suggests that practitioners’ predictions are at best poor and at worst fail. Given that prognosis in this condition does not involve a life-threatening outcome, one might argue that the AUC categorisation of clinicians’ discriminative performance could be more lenient, i.e. ‘poor’ becomes ‘fair’ and ‘fail’ becomes ‘poor’. Even so, the accuracy of the initial clinician predictions remains predominantly poor. The only other studies to investigate clinician prediction of recovery from low back pain reported that GPs’ risk estimation was comparable to other prognostic indicators measured at baseline, although the AUC was only 0.6, and that physiotherapists performed worse than a clinical prediction rule which itself scored as poor in terms of AUC [6, 12]. Most other studies investigating the prognostic accuracy of physicians have centred on cancer survival, with a large systematic review finding only weak evidence to support clinicians’ estimates alone as predictors of survival. Predicting other important health outcomes also appears difficult: a recent study of the prognostic accuracy of occupational therapists’ advice regarding return-to-work times revealed consistent and marked underestimation of recovery by these health workers. However, the literature is not unanimous in its lack of support for clinician-based prediction. For example, Reiso et al. found that GPs’ ability to predict the period of certified sickness absence was high, and that good prediction was most strongly associated with type of diagnosis. In contrast, the frequent lack of definitive diagnoses in the conditions dealt with in this study, most being categorised as nonspecific, makes prognosis considerably more problematic. Additionally, the duration of sick certification investigated in that study was potentially under the control of the GP, which could be considered a confounding factor.
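To make the AUC categorisation discussed above concrete, the following is a minimal Python sketch rather than the analysis code used in this study. It computes the AUC as the probability that a randomly chosen unimproved patient received a higher risk score than a randomly chosen improved patient (ties count as half), then maps the value to a label. The function names and example data are hypothetical, and the band boundaries (0.5 to 0.6 ‘fail’, 0.6 to 0.7 ‘poor’, 0.7 to 0.8 ‘fair’, 0.8 to 0.9 ‘good’, 0.9 and above ‘excellent’) are an assumed common convention, not values taken from this paper. The lenient option applies only the two shifts described in the text.

```python
def auc(scores, outcomes):
    """AUC via pairwise comparison: the probability that a positive case
    (outcome == 1) has a higher score than a negative case; ties count 0.5."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Assumed conventional bands; not taken from the paper.
LABELS = ["fail", "poor", "fair", "good", "excellent"]
CUTOFFS = [0.6, 0.7, 0.8, 0.9]


def categorise(auc_value, lenient=False):
    """Map an AUC to a label. With lenient=True, apply only the shift the
    text describes: 'fail' becomes 'poor' and 'poor' becomes 'fair'."""
    index = sum(auc_value >= c for c in CUTOFFS)
    if lenient and index <= 1:
        index += 1
    return LABELS[index]


# Hypothetical data: 1 = clinician predicted a poorer-than-average outcome,
# outcome 1 = patient categorised as not improved.
predictions = [1, 1, 0, 0, 1, 0]
not_improved = [1, 0, 0, 0, 1, 1]
value = auc(predictions, not_improved)
print(round(value, 3), categorise(value), categorise(value, lenient=True))
```

With a single yes/no prediction, as in this sketch, the AUC equals the mean of sensitivity and specificity, so a dichotomous clinician judgement can be placed on the same scale as a continuous prediction rule.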
Although a small number of factors associated with poor prognosis have emerged from the MSD literature, particularly low back pain studies, they fail to explain much of the variance reported in outcome. Even fewer robust indicators of prognosis have emerged amongst patients seeking manual therapy. It is therefore perhaps not surprising that the clinicians in this study struggled to judge outcomes accurately amongst their patients, given that extensive research into potential predictive factors of outcome has found so few in this particular MSD population.
Of those that have been reported, reviews of prospective studies reveal a variety of prognostic factors. For example, longer pain duration has emerged as a generic prognostic factor amongst MSD patients generally and in low back pain patients in particular [20–22]. However, in this study practitioners were no more accurate in predicting outcomes in chronic (> 1 month) than in acute (< 1 month) patients. In addition, many studies have indicated that psychological factors are important in the prognosis of MSD, and this may also be true of chiropractic patients, although this remains a matter of controversy. Other factors such as socioeconomic status, gender, age and activity have been less reliably related to prognosis in neck pain, with research being inconclusive in particular regarding age as a risk factor for poor prognosis.
Of interest beyond the primary question of this study are the differences and similarities between the methods of determining outcome and between the outcomes of the conditions studied. Generally, the dichotomised BQ and pain NRS categorisations of improvement produced similar proportions of patients at both follow-up points. Of note, after 12 weeks around 20 to 25% of back and neck pain patients remained unimproved. This concurs with previous research noting that, contrary to commonly held notions, a significant proportion of these patients do not recover entirely. On the other hand, those presenting with shoulder pain, which in this study was more chronic, seemed to recover remarkably well, with the proportion categorised as not improved continuing to fall significantly beyond the first month, unlike with back and neck pain.
Interestingly, the PGIC global measure consistently categorised a greater proportion of patients as not improved at both follow-up points across all conditions compared to the BQ and pain NRS categorisations. This may reflect the way patients approach this measure: it allows any number of factors to be brought into a patient’s judgement of their improvement, as opposed to a single measure such as pain or even a multidimensional measure such as the BQ. These differences may warrant further investigation.
There are clear limitations to this study. Firstly, the question we asked the practitioners was ‘Whether they thought patients were less likely than average to report a good outcome following a course of care’. In meetings with the practitioners prior to the study, this judgement was discussed in relation to patients’ responses on the BQ, as the practitioners were familiar with the routine day-to-day use of this questionnaire in their practice. However, the question did not explicitly specify a particular outcome measure. To increase the robustness of our conclusion, we therefore used 3 outcome measures dichotomised around published cutoff scores, with the BQ as the primary outcome. Given that similar if not identical findings were generated from all 3 outcomes, the conclusion that practitioners fail to predict patient outcome is supported, and the finding is less likely to be an idiosyncrasy of the outcome measure we used or a mismatch between the practitioners’ perception of the original question and the final outcome measure.
Secondly, we chose to analyse the association between practitioner prediction and patient self-reported outcome in a manner reminiscent of diagnostic test validity, despite the fact that this was a prospective study. Diagnostic test validity studies would ideally require minimal time between collection of the gold standard and new test data. In view of this, we also calculated the risk of improvement based on the chiropractor’s initial prognosis, a method typically appropriate to prospective studies. Although this provides a further measure of association, risk normally implies some causative impact of the risk factor on the outcome, whereas in our case there is no expectation that the practitioner’s prognosis would affect the actual outcome. That said, we did not know whether practitioners had explicitly stated their prognosis to patients, and it is possible that, if they had done so, this may have influenced outcome.
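The two analytic views described above can be illustrated from a single 2×2 table of prediction against outcome. The following Python sketch is illustrative only: the function name, the cell labels and the counts are hypothetical and are not data from this study. It reports sensitivity and specificity (the diagnostic-validity view, treating ‘not improved’ as the target condition) alongside the relative risk of improvement for patients given a favourable prognosis compared with those given an unfavourable one (the prospective view).

```python
def prediction_summary(a, b, c, d):
    """Summarise a 2x2 table of clinician prediction against outcome.

    a: predicted poor outcome, not improved
    b: predicted poor outcome, improved
    c: predicted good outcome, not improved
    d: predicted good outcome, improved
    """
    if min(a + c, b + d, a + b, c + d) == 0 or b == 0:
        raise ValueError("table margins (and cell b) must be non-zero")
    sensitivity = a / (a + c)          # unimproved patients flagged as poor
    specificity = d / (b + d)          # improved patients flagged as good
    risk_good = d / (c + d)            # P(improved | predicted good)
    risk_poor = b / (a + b)            # P(improved | predicted poor)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "relative_risk_improvement": risk_good / risk_poor,
    }


# Hypothetical counts for illustration only.
summary = prediction_summary(a=10, b=30, c=20, d=140)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A relative risk close to 1 indicates that the initial prognosis carries little information about eventual improvement, which is the same message as an AUC close to 0.5.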
Thirdly, chiropractors in this study were asked to predict patients’ reports of their outcome in general rather than at a specific time point. It is possible that, had they been asked specifically how patients would report themselves at 1 or 3 months, prognostic accuracy would have been higher.
Lastly, this study used outcome data collected as a normal part of practice activity and returned by patients by post or email. It is not possible to exclude the possibility that those who did not respond to the request to complete the PROMs would have answered differently. However, the proportion reporting a good outcome in this sample is similar to that in other studies from this group of practices, which achieved a higher response rate by including telephone follow-up of non-responders, making it less likely that the results reported here are subject to non-response bias.