
Table 3 Inter-rater reliability coefficients and percent agreement with probabilistic benchmarking to the Landis and Koch scale in classification of MRI-findings at disc level

From: Degenerative findings in lumbar spine MRI: an inter-rater reliability study involving three raters

N = 177 disc levels. Rater-pair columns show the reliability coefficient with its 95% C.I.

| Diagnostic finding | Coefficient | Rater 1 vs. 2 | Rater 1 vs. 3 | Rater 2 vs. 3 | All | Landis and Koch scale (probabilistic benchmark) |
|---|---|---|---|---|---|---|
| Spondylolisthesis | Conger’s K | 0.24 [−0.16:0.64] | 0.36 [−0.01:0.72] | 0.36 [−0.01:0.72] | 0.33 | Slight |
| | Gwet’s AC2 | 0.998 [0.997:1.000] | 0.998 [0.996:0.999] | 0.998 [0.996:0.999] | 0.99 | Almost perfect |
| | %-agreement | 0.998 [0.997:1.000] | 0.998 [0.996:0.999] | 0.998 [0.996:0.999] | 0.99 | Almost perfect |
| Disc degeneration | Conger’s K | 0.60 [0.51:0.70] | 0.67 [0.58:0.76] | 0.76 [0.69:0.82] | 0.68 | Moderate |
| | Gwet’s AC2 | 0.90 [0.87:0.94] | 0.89 [0.85:0.93] | 0.91 [0.88:0.95] | 0.90 | Substantial |
| | %-agreement | 0.95 [0.93:0.96] | 0.94 [0.93:0.96] | 0.96 [0.95:0.97] | 0.95 | Substantial |
| Nerve compromise | Conger’s K | 0.55 [0.38:0.71] | 0.56 [0.39:0.72] | 0.52 [0.34:0.70] | 0.54 | Fair |
| | Gwet’s AC2 | 0.96 [0.93:0.98] | 0.93 [0.90:0.96] | 0.92 [0.89:0.96] | 0.93 | Substantial |
| | %-agreement | 0.96 [0.95:0.98] | 0.95 [0.93:0.97] | 0.94 [0.92:0.97] | 0.95 | Substantial |
| Spinal stenosis | Conger’s K | 0.19 [0.08:0.29] | 0.33 [0.22:0.45] | 0.43 [0.34:0.53] | 0.33 | Fair |
| | Gwet’s AC2 | 0.98 [0.97:0.98] | 0.98 [0.98:0.99] | 0.98 [0.97:0.98] | 0.98 | Almost perfect |
| | %-agreement | 0.98 [0.98:0.99] | 0.99 [0.98:0.99] | 0.98 [0.98:0.99] | 0.98 | Almost perfect |
| Facet degeneration | Conger’s K | 0.27 [0.16:0.38] | 0.32 [0.21:0.42] | 0.35 [0.25:0.46] | 0.32 | Slight |
| | Gwet’s AC2 | 0.79 [0.74:0.84] | 0.79 [0.74:0.84] | 0.76 [0.71:0.82] | 0.78 | Moderate |
| | %-agreement | 0.88 [0.86:0.90] | 0.89 [0.86:0.91] | 0.87 [0.85:0.90] | 0.88 | Moderate |
| Scoliosis | Cohen’s K | 0.49 [0.06:0.92] | 0.59 [0.22:0.96] | 0.75 [0.40:1.00] | 0.61 | Fair |
| | Gwet’s AC1 | 0.98 [0.96:1.00] | 0.98 [0.96:1.00] | 0.99 [0.97:1.00] | 0.98 | Almost perfect |
| | %-agreement | 0.98 [0.96:1.00] | 0.98 [0.96:1.00] | 0.99 [0.97:1.00] | 0.98 | Almost perfect |
| Annular fissure | Cohen’s K | 0.50 [0.32:0.68] | 0.45 [0.26:0.65] | 0.61 [0.45:0.77] | 0.53 | Moderate |
| | Gwet’s AC1 | 0.87 [0.82:0.93] | 0.88 [0.82:0.93] | 0.88 [0.83:0.93] | 0.88 | Almost perfect |
| | %-agreement | 0.88 [0.83:0.93] | 0.88 [0.83:0.93] | 0.89 [0.84:0.93] | 0.88 | Almost perfect |
| Disc contour | Cohen’s K | 0.36 [0.25:0.48] | 0.27 [0.17:0.38] | 0.39 [0.29:0.49] | 0.34 | Fair |
| | Gwet’s AC1 | 0.73 [0.65:0.80] | 0.59 [0.50:0.68] | 0.62 [0.53:0.70] | 0.64 | Moderate |
| | %-agreement | 0.75 [0.69:0.82] | 0.64 [0.57:0.71] | 0.67 [0.60:0.74] | 0.69 | Substantial |
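
The benchmark labels in the last column can differ from what the point estimate alone would suggest, because probabilistic benchmarking also takes the precision of the estimate into account. The sketch below shows one common formulation of the idea and is not confirmed to be the procedure used for this table. The function name `benchmark`, the 0.95 cutoff, the normal approximation, the renormalisation to the range [−1, 1], and the recovery of the standard error from the reported 95% C.I. are all assumptions made for illustration.

```python
from statistics import NormalDist

# Landis and Koch intervals, ordered from the highest level downwards.
LANDIS_KOCH = [
    (0.8, 1.0, "Almost perfect"),
    (0.6, 0.8, "Substantial"),
    (0.4, 0.6, "Moderate"),
    (0.2, 0.4, "Fair"),
    (0.0, 0.2, "Slight"),
    (-1.0, 0.0, "Poor"),
]


def benchmark(coefficient, ci_lower, ci_upper, threshold=0.95):
    """Return the highest Landis-Koch level whose cumulative membership
    probability (accumulated from the top of the scale) exceeds `threshold`.

    Assumption: the coefficient is approximately normal, with a standard
    error recovered from the width of its 95% confidence interval.
    """
    z = NormalDist().inv_cdf(0.975)  # about 1.96
    se = (ci_upper - ci_lower) / (2 * z)
    if se <= 0:
        raise ValueError("confidence interval must have positive width")

    dist = NormalDist(mu=coefficient, sigma=se)
    # Assumption: renormalise so that only the valid range [-1, 1] carries mass.
    total = dist.cdf(1.0) - dist.cdf(-1.0)

    cumulative = 0.0
    for lower, upper, label in LANDIS_KOCH:
        cumulative += (dist.cdf(upper) - dist.cdf(lower)) / total
        if cumulative > threshold:
            return label
    return LANDIS_KOCH[-1][2]


# Illustration with one rater-pair entry from the table (disc degeneration,
# Conger's K, Rater 1 vs. 2). The table's own benchmark column refers to the
# "All" coefficient, so this is not a reproduction of that column.
print(benchmark(0.60, 0.51, 0.70))
```

Under these assumptions a coefficient of 0.60 with interval [0.51:0.70] has only about half of its probability mass above 0.6, so the label drops one level below the interval that the upper half of the distribution falls in.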

  1. Inter-rater reliability is presented using Gwet’s AC1 (binomial/nominal data), Gwet’s AC2 (ordinal data), and percent agreement
  2. For comparison, Cohen’s K (binomial/nominal data) and Conger’s K (ordinal data) are also presented
  3. Numbers in square brackets are 95% confidence intervals (95% CI)
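
To make the difference between the coefficients in the notes above concrete, the following minimal sketch computes percent agreement, Cohen's kappa, and Gwet's AC1 for two raters on nominal data using their standard textbook definitions. The function name `agreement_coefficients` and the ratings are invented for illustration and are not the study's data; the weighted variants (AC2, Conger's K for more than two raters) and the confidence intervals are not covered.

```python
from collections import Counter


def agreement_coefficients(ratings_a, ratings_b):
    """Percent agreement, Cohen's kappa and Gwet's AC1 for two raters."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)

    # Observed agreement: share of items both raters classified identically.
    p_obs = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    count_a, count_b = Counter(ratings_a), Counter(ratings_b)

    # Cohen: chance agreement is the product of the raters' marginal proportions.
    pe_kappa = sum((count_a[c] / n) * (count_b[c] / n) for c in categories)

    # Gwet: chance agreement uses the average marginal proportion per category.
    pi = {c: (count_a[c] + count_b[c]) / (2 * n) for c in categories}
    pe_ac1 = sum(p * (1 - p) for p in pi.values()) / (q - 1) if q > 1 else 0.0

    kappa = (p_obs - pe_kappa) / (1 - pe_kappa) if pe_kappa < 1 else float("nan")
    ac1 = (p_obs - pe_ac1) / (1 - pe_ac1)
    return {"percent_agreement": p_obs, "cohen_kappa": kappa, "gwet_ac1": ac1}


# Invented example of a rare finding: 100 levels, each rater marks 2 as
# positive, and they agree on only 1 of those.
rater_a = ["pos", "pos", "neg"] + ["neg"] * 97
rater_b = ["pos", "neg", "pos"] + ["neg"] * 97
result = agreement_coefficients(rater_a, rater_b)
print(result)  # percent agreement 0.98, kappa about 0.49, AC1 about 0.98
```

With a rare finding, almost all agreement is on the negative category. Cohen's kappa treats most of that as chance agreement and drops to about 0.49, while AC1 stays close to the percent agreement.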