Predictive Factors for Outcome in Patients Having Surgery
for Cervical Spondylotic Myelopathy
By
Alina Karpova, BSc
A thesis submitted in conformity with the requirements for the degree of Master of Science
Department of the Institute of Medical Sciences University of Toronto
©Copyright by Alina Karpova, 2011
Predictive factors for outcome in patients having surgery for cervical spondylotic myelopathy.
Master of Science, 2011
Alina Karpova Institute of Medical Sciences
University of Toronto
ABSTRACT
PURPOSE: The objective was to determine if particular magnetic resonance,
clinical and demographic findings were associated with functional status prior to surgery
and predictive of functional outcomes at follow-up.
RESULTS: The study included 65 consecutive CSM patients. The modified
Japanese Orthopaedic Association Scale (mJOA) was used as the primary outcome
measure. Higher baseline mJOA scores were associated with younger age, shorter
duration of symptoms, fewer compressed segments and less severe cord compression.
Better post-operative mJOA scores were associated with younger age, shorter duration of
symptoms and higher baseline scores. Using multivariate analysis, baseline mJOA scores, and
follow-up mJOA scores adjusted for baseline mJOA score, were best predicted by age.
CONCLUSION: Age and clinical severity scores at admission can both provide
valuable information. However, MR imaging features of the spinal cord before surgery
cannot accurately predict the functional prognosis for patients with CSM and hence
alternative imaging approaches may be required.
ACKNOWLEDGMENTS
I would like to acknowledge and thank my mentor, Dr. Michael Fehlings
(Supervisor), and my Program Advisory Committee members, Dr. Aileen Davis, Dr.
Abhaya Kulkarni, and Dr. David Mikulis. I am deeply grateful for the academic
enrichment they were able to provide, as well as their guidance and unwavering support
throughout this project.
I am thankful to have received funding from the Ontario Neurotrauma
Foundation.
I would also like to thank my friends and colleagues who have helped me shape
this project: Dr. David Cadotte, who screened the articles as a second reviewer for the
systematic review and gave me the opportunity to write a review paper on CSM in the elderly
population; Dr. Yuriy Petrenko for technical support; Amy Lem; Dr. E Massicotte,
Dr. SJ Lewis and Dr. YR Rampersaud, neurosurgeons at the Toronto Western Hospital,
Ontario, for allowing us to study their patients; Dr. Zvonimir Lubina for the imaging data
he provided for this study; and Branko Kopjar for his statistical advice.
I would like to dedicate the thesis to my ‘family’, Alexandra, Olga, Nataly, Ira
and Yakov. I am extremely grateful to Roman for his patience, love and unwavering
support over the years. It is because of them that I have had the strength to see the project
through to the end.
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGMENTS
CHAPTER 1: Introduction
  1.1. Problems of predicting outcomes in CSM patients after surgery
  1.2. Importance of investigating predictors of outcomes after surgery
  1.3. Magnetic Resonance Imaging (MRI) in CSM population
CHAPTER 2: Background and literature review
Functional outcomes after surgery and their important predictors: Current state of knowledge
  2.1. Cervical spondylotic myelopathy (CSM): Definition and clinical presentation
  2.2. Epidemiology of CSM
  2.3. CSM treatment
  2.4. Functional outcome assessments
  2.5. Predictors of functional outcomes following surgery
    2.5.1. Age
    2.5.2. Gender
    2.5.3. Duration of symptoms
    2.5.4. Baseline severity score
    2.5.5. MR imaging findings
  2.6. Theoretical framework and definition of the concept
CHAPTER 3: Systematic review
Currently available MR imaging based measurements for the explanation of variations among CSM patients: review and critical appraisal
  3.1. LITERATURE SEARCH
    3.1.1. Objective
    3.1.2. Inclusion criteria
    3.1.3. Identification of studies and assessment of methodological quality
    3.1.4. Data extraction
      3.1.4.1. Severity of myelopathy: functional score and recovery percentage
      3.1.4.2. MRI predictive factors
  3.2. RESULTS
    3.2.1. Compression of spinal canal and cord
    3.2.2. T2 signal changes on MRIs of the spinal cord
  3.3. OVERALL SUMMARY OF THE SYSTEMATIC LITERATURE REVIEW
  3.4. RATIONALE FOR STUDYING CLINICAL AND IMAGING PREDICTORS OF OUTCOME IN CSM
  3.5. HYPOTHESIS AND STUDY OBJECTIVES
CHAPTER 4: Material and methods
  4.1. STUDY OBJECTIVES
  4.2. STUDY DESIGN
  4.3. TARGET POPULATION
  4.4. DEFINITION OF PRIMARY OUTCOME
  4.5. PRIMARY EXPOSURE (INDEPENDENT VARIABLES)
    4.5.1. Strategies to improve accuracy and easy use of exposure variables
    4.5.2. Definition of primary exposure and clinimetric properties (validity) of the independent variables
      4.5.2.1. Age
      4.5.2.2. Gender
      4.5.2.3. Baseline mJOA
      4.5.2.4. Duration of symptoms
      4.5.2.5. Degree of compression (Anteroposterior diameter and Transverse Area)
      4.5.2.6. Signal intensity changes
      4.5.2.7. Number of affected stenotic levels
  4.6. CONFOUNDING VARIABLES
  4.7. SAMPLE SIZE
  4.8. DATA ANALYSIS
    4.8.1. Exploratory analysis
      4.8.1.1. Univariable (unadjusted) analysis
    4.8.2. Model development
    4.8.3. Data sources and management
    4.8.4. Ethics
CHAPTER 5: Results
  5.1. DESCRIPTIVE STATISTICS
  5.2. MODEL DEVELOPMENT
    5.2.1. Improving the validity of the predictive model
    5.2.2. Univariate (unadjusted) analysis
      5.2.2.1. mJOA scores at baseline
      5.2.2.2. mJOA scores at follow-up
    5.2.3. Multivariate (adjusted) analysis
      5.2.3.1. mJOA scores at baseline
      5.2.3.2. mJOA scores at follow-up
CHAPTER 6: Discussion
  6.1. Summary of findings
  6.2. Implications of findings
  6.3. Limitations
  6.4. Future directions
  6.5. Conclusion
CHAPTER 7: Reference List
CHAPTER 8: Appendices
LIST OF TABLES
CHAPTER 3
Table 3.1: Presents criteria in a modified version of quality assessment checklist
Table 3.2: Presents the summary of methodological limitations in CSM predictive studies
Table 3.3: Study design, sample size, type of outcome measures and quality rating
Table 3.4: Data extracted included MRI characteristics (signal intensity, spinal cord
compression and spinal canal compromise)
Table 3.5: Data extracted included those predictive factors for which the strength of
association with short and long term outcomes in patients with cervical myelopathy, were
reported
Table 3.6: Results – previous predictive models
Table 3.7: List of MR imaging features as potential predictors of recovery percentage and functional scores after surgery
CHAPTER 4
Table 4.1: The modified Japanese Orthopaedic Association (mJOA) scaling for
functional classification for CSM
Table 4.2: Definition of exposure variables
Table 4.3: Standard parameters for cervical spine T1- and T2-weighted Magnetic
Resonance Image (MRI) used in our study
CHAPTER 5
Table 5.1: Characteristics of Patients with Cervical Spondylotic Myelopathy
Table 5.2: Performance of the mJOA in CSM sample
Table 5.3: Correlation matrix and coefficients between functional outcomes and
independent variables
Table 5.4: Correlation matrix and coefficients between functional outcomes and spinal
cord compression as a potential predictor
Table 5.5: Unadjusted beta value estimates for independent variables (univariate analysis)
Table 5.6: Statistical details of full models (multivariate analysis)
LIST OF FIGURES
CHAPTER 3
Figure 3.1: Flow diagram of inclusion and exclusion criteria of systematic reviews.
CHAPTER 4
Figure 4.1: Flow diagram of the study population.
CHAPTER 5
Figure 5.1: MR imaging measures for spinal cord compression
Figure 5.2: T1-weighted image of the sagittal view revealing hypointensity in the spinal
cord and T2-weighted image of the sagittal view showing hyperintensity in the spinal
cord before surgery (arrow), obtained from the spine clinic at the Toronto Western Hospital.
Figure 5.3: Focal compression and multiple levels of compression
Figure 5.4: Distribution of baseline mJOA scores
Figure 5.5: Distribution of post-operative mJOA scores at 12 months
LIST OF APPENDICES
Appendix 1 Search strategy
Appendix 2 Research Ethics Board Approval at University Health Network
Appendix 3 Reliability project
Appendix 4 Grade of recommendation: Levels of Evidence Table (2002)
LIST OF ABBREVIATIONS
CSM Cervical Spondylotic Myelopathy
OPLL Ossification of the Posterior Longitudinal Ligament
HD Herniated Disc
TA Transverse Area
AP Anteroposterior Diameter
CR Compression Ratio
MSCC Maximum Spinal Cord Compression
MCC Maximum Canal Compromise
mJOA Modified Japanese Orthopaedic Association Scale
SCI Spinal Cord Injury
CI Confidence Interval
SE Standard Error
ICC Intraclass correlation coefficient
CHAPTER 1
INTRODUCTION
Predictors of outcome following surgery have been a significant part of cervical
spondylotic myelopathy (CSM) research over the past 20 years. With the development of
new therapies and interventions and varied natural history of CSM, there is a need for
reliable predictors to optimize the timing of surgical intervention in order to maximize
functional recovery.
1.1. Problems of predicting outcomes in CSM patients after surgery
There is no consensus as to the optimal ways to assess the clinical (e.g., advanced
age, prolonged duration of symptoms) and MR imaging features (e.g., spinal cord
compression, signal intensity changes, and number of compressed spinal cord segments)
used in research and clinical practice. Furthermore, very few studies have attempted to
build a predictive, multidimensional model of functional outcomes after surgery that
would combine age, gender, duration of symptoms, baseline scores, degrees of
compression, signal intensity changes and number of compressed segments together.
1.2. Importance of investigating predictors of outcomes after surgery
A prediction model would be useful in identifying individuals who are most likely
to experience improvement after surgery by estimating the expected magnitude of each
predictive factor's effect. This knowledge could allow individualized decisions regarding
the use of different surgical approaches for the treatment of elderly individuals with
CSM. Therefore, a predictive model could guide the application of such strategies in high
risk groups and would potentially optimize functional recovery of CSM patients.
1.3. Magnetic Resonance Imaging (MRI) in CSM population
The application of MR imaging to the spinal cord has become increasingly
attractive because MRI can show the amount of spinal cord compression,
reflect pathological changes within the cord, measure space within the spinal canal,
detect bony pathology, and show suspected lesions of the soft tissues in and around the
vertebral column in a multiplanar display (e.g., midsagittal, axial). The procedure is also
non-invasive and carries no risk of radiation exposure [1]. It is a standard of
care for both diagnosis and preoperative planning of patients with suspected CSM.
However, there is variability in the literature as to the value of MR imaging as a predictor
of functional outcomes after surgery.
The overall objective of this project, therefore, was to develop a predictive model
of functional outcomes incorporating key demographic, clinical and MR imaging
assessments in patients with CSM undergoing surgical treatment. The primary goal of
this model is to help treating physicians and spine surgeons to identify individuals who
are most likely to experience a better outcome after surgery. Furthermore, a predictive
model could help to improve the specificity of inclusion criteria for future clinical trials,
detecting the potential benefit of surgical interventions in selected homogeneous groups
of CSM patients. The study was organized into three stages, each being a portion of the
entire study and answering questions that contribute to the overall result: the CSM
predictive model of functional outcomes.
The thesis is organized into six chapters and an appendix. The objectives of
Chapter 2 and Chapter 3 were to establish components of domains for a predictive
model. More specifically, Chapter 2 summarizes current knowledge of available
variables other than MR imaging that are potentially predictive of functional outcomes
following surgery. Chapter 3 consists primarily of a literature review and the
subsequent critical appraisal and summary of current evidence to determine the
pre-testing power of the MR imaging domain. Chapter 4 details the methods
used to develop an objective predictive model. The results are detailed in Chapter 5.
Finally, Chapter 6 states the conclusions of the study together with a discussion of
the implications, limitations and future research directions. The thesis ends with the
appendices, which include additional exhibits of data that may assist in understanding certain
aspects. Appendix 3 is essential to the body of the thesis because it further validates the
methods used to measure spinal cord compression.
CHAPTER 2
BACKGROUND AND LITERATURE REVIEW
Functional outcomes after surgery and their important predictors: current
state of knowledge
This chapter outlines the background related to cervical spondylotic myelopathy
(CSM) as well as outcomes and their measurements. The epidemiology of CSM, surgical
treatments, outcomes after surgery, current measurement approaches related to CSM
severity along with predictors of those outcomes, are reviewed and summarized in this
section.
2.1. Cervical spondylotic myelopathy (CSM): Definition and clinical presentation
Cervical spondylotic myelopathy (CSM) can be broadly defined as symptomatic
dysfunction of the cervical spinal cord caused by degenerative changes of the bony and
ligamentous spine [2]. CSM can occur in all adults due to cord compression resulting
from one of several different factors: degenerative disc disease (or spondylosis); frank
disc protrusion; or OPLL. Symptoms of CSM include: neck stiffness; unilateral or
bilateral deep, aching neck, arm and shoulder pain; stiffness or clumsiness while walking;
hand dysfunction; motor weakness; numbness; and bowel/bladder dysfunction.
Symptoms may range in severity from mildly uncomfortable to completely disabling.
2.2. Epidemiology of CSM
Although the prevalence of CSM is still unknown, it is the most common form of
spinal cord dysfunction in patients and the most common underlying cause of traumatic
SCI in individuals older than 55 years of age [3]. It is a major cause of disability in the
adult population [4-6].
2.3. CSM treatment
For a patient with limited function and MR imaging evidence of cervical
spinal cord compression, decompressive surgery is a practical treatment option. In most
cases, patients are informed that surgery is unlikely to improve their functional outcomes
but rather is aimed at halting the progression of their disease. There is, however, emerging
evidence that most patients make robust functional improvements following surgery
[7, 8]. There is some evidence that demographic factors, clinical history of CSM and MR
imaging evidence can explain outcomes after surgery, but at present it is difficult to
predict an individual patient’s response to surgery.
2.4. Functional outcome assessments
Determining functional status and independence after surgery in CSM patients has
become a primary area of research because of the impact CSM has on health-related
quality of life as well as the financial burden of this condition on society and individuals.
Gain in function after surgery has been documented for individuals with CSM
[7, 8]. Function in the CSM population is often reported by means of postoperative
functional scores, absolute or relative changes in scores and rate of recovery. It is often
measured using functional measurement tools such as the original Japanese Orthopaedic
Association Scale (JOA), the modified version of JOA (mJOA), the Nurick score, the 30-meter
timed up walk test, and the Neurosurgical Cervical Spine (NCS) Score. The Japanese
Orthopaedic Association (JOA) scale is a qualitative tool to measure functional disability. The
scale ranges from 0-17 with higher scores indicating better function [9]. The inter- and
intraobserver reliability of the original JOA scale has been shown to be high (Yonenobu K
et al 2001). To establish the percent of recovery, the following formula was proposed by
Hirabayashi et al [9]: recovery rate (%) = [(postoperative JOA score – preoperative JOA
score) / (17 – preoperative JOA score)] × 100. The term recovery rate does not imply
the actual rate of recovery but rather the extent of recovery (percentage). For simplicity, the
term recovery rate will be used to describe the percent of recovery throughout the
manuscript. The modified JOA (mJOA) scale, which is the current so-called standard
functional measure in the CSM population [10], was revised to account for cultural
differences in western populations (Table 4.1). The domains include upper extremity
function (5 points), lower extremity function (7 points), sensory function (3 points) and
urinary bladder function (3 points). The scale ranges from 0-18 with higher scores
indicating better function [11]. Similarly, the recovery after surgery was evaluated using
the formula proposed by Hirabayashi et al [9]: recovery rate (%) = [(postoperative mJOA
score – preoperative mJOA score) / (18 – preoperative mJOA score)] × 100. The
Neurosurgical Cervical Spine Score (NCS) is also a functional measure to quantify gain
in recovery in the following manner: recovery rate (%) = [(postoperative score –
preoperative score) / (14 – preoperative score)] × 100 [12].
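The three recovery-rate formulas above differ only in the maximum score of the scale (17 for JOA, 18 for mJOA, 14 for NCS). The following is a minimal sketch of that calculation in Python, not code from the thesis. The function name, the constant names and the input checks are illustrative assumptions, and the example scores are hypothetical rather than patient data.

```python
# Maximum attainable scores, as stated in the text above.
JOA_MAX = 17   # original Japanese Orthopaedic Association scale
MJOA_MAX = 18  # modified JOA scale
NCS_MAX = 14   # Neurosurgical Cervical Spine score


def recovery_rate(preoperative, postoperative, max_score):
    """Return the Hirabayashi-style recovery rate as a percentage.

    recovery rate (%) = (post - pre) / (max_score - pre) * 100

    The range checks below are an assumption added for safety;
    the source only gives the formula itself.
    """
    if not (0 <= preoperative <= max_score and 0 <= postoperative <= max_score):
        raise ValueError("scores must lie between 0 and max_score")
    if preoperative == max_score:
        # The denominator would be zero: no room left for improvement.
        raise ValueError("recovery rate is undefined at the maximum preoperative score")
    return (postoperative - preoperative) / (max_score - preoperative) * 100.0


if __name__ == "__main__":
    # Hypothetical patient: mJOA 12 before surgery, 15 at follow-up.
    # 3 of the 6 available points are regained.
    print(recovery_rate(12, 15, MJOA_MAX))  # 50.0
```

A negative value indicates that the follow-up score is lower than the preoperative score. As the text notes, the result expresses the extent of recovery relative to the maximum possible gain, not a speed of recovery.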
2.5. Predictors of functional outcomes following surgery
2.5.1. Age
Conflicting results regarding the treatment of cervical myelopathy in geriatric
populations have been reported previously [13-17]. Some studies have found an
association between age and functional score obtained at long term follow-up (greater
than 6 months after surgery) [18, 19], while others did not [15]. Based on univariate
analysis, Nagata et al found an association between age and functional score
obtained from 12 months to 4.5 years (mean follow-up of 1.5 years); Yamazaki et al,
however, showed that age did not affect functional scores obtained from 12 to 90 months
after surgery (mean follow-up of 40 months). The inconsistencies could be due to
variable definitions of older and younger groups. After adjustment for other confounding
variables, Morio et al showed that age is a reliable predictor of functional score obtained
from 6 months to 10 years (mean, 3.4 years) after decompression of the spinal cord in
CSM patients.
2.5.2. Gender
Predictive studies rarely highlight gender as a potential predictor of
outcome, and most fail to find such an association. Those that do tend to show that women
have a better outcome than men [20].
2.5.3. Duration of symptoms
The literature contains conflicting results with regards to the duration of CSM
symptoms and post-operative functional scores. Based on univariate analysis, several
authors have found an association between duration of symptoms and functional score, as
assessed using the JOA scale and obtained at long term follow up (greater than 6 months)
[15, 19, 21]; Fukushima et al, however, showed that the duration of symptoms did not
affect functional score after surgery [22]. After adjustment for other important variables,
Morio et al found that the duration of symptoms is a significant predictor of functional
score obtained from 6 months to 10 years after surgery (mean follow up, 3.4 years). We
suspect that inconsistencies could have resulted from the manner in which the authors
quantified the duration of symptoms and how ‘long versus short’ was defined.
2.5.4. Baseline severity score
The literature contains conflicting results regarding whether patients with initial
poor functional scores gain less or greater benefit from surgery [19, 23]. Based on results
of univariate analysis, Singh et al reported that patients with lower starting points make
the most gains in function, as assessed by walking tests, 3 months after surgery.
However, Morio et al identified a positive correlation between baseline score and
functional score assessed by JOA scale. After adjustment for other confounding
variables, Morio et al found that severity score at admission is a reliable predictor of
functional score obtained from 6 months to 10 years (mean, 3.4 years) after
decompression of the spinal cord. These inconsistencies could be due to variability in
measures of functional scores and follow up times.
2.5.5. MR imaging findings
A number of authors [8, 10, 14, 18, 19, 21, 22, 24-28] have reported that varying
patterns of signal intensity changes on T1-/T2WI, degrees of spinal cord compression and
multiplicity of spinal cord segments being compressed are good predictors of functional
outcome after surgical decompression. Some authors [8, 14, 15, 19, 27-34], on the other
hand, have reported no clear correlation between surgical outcome and MR imaging findings. The
lack of statistically significant predictors of functional outcome may be attributed to a
broad spectrum of compressive pathologies and therefore a broad spectrum of spinal cord
recuperative potentials; consequently, the MR imaging variables studied on T1-/T2WI (signal
intensity changes, spinal cord compression, and number of compressed segments) may not be
good predictors of outcome following surgery in patients with CSM. Several authors [14,
24, 25] considered intramedullary high SI on T2-weighted MR images to be a predictor of
good recovery and low SI on T1-weighted MR images to be a predictor of poor recovery; other
authors [8, 14, 15, 28, 29, 31, 32] thought these signal changes did not affect outcomes after surgery. These
diverse conclusions reflect the unknown histopathological basis of
intramedullary low SI on T1-weighted and high SI on T2-weighted images. Based on
the literature, many authors have considered that intramedullary high SI might represent a
variety of histological changes, including edema, ischemia, demyelination, gliosis,
microcavities, and cavities [35-40]. It is widely believed that a greater degree of spinal
cord compression increases the chances that the tissue damage is irreversible despite
surgical decompression and therefore leads to poor recovery. However, the value of
morphological plasticity, obtained by quantifying the surface area of the spinal cord on T2W
imaging, in the evaluation of spinal cord function and its relationship with outcomes of
surgery has been questioned. More specifically, no consensus has been established on a
critical point beyond which functional loss becomes irreversible [22].
SUMMARY:
In summary, age, duration of symptoms and baseline severity score are
consistently associated with functional scores following surgery. Therefore, it would be
essential to adjust for these variables in the comparison of functional scores across
varying MR imaging features. The MR imaging features are variables of interest which
will be addressed in the subsequent chapter via systematic review. This stage consists of
a literature review for the development of a predictive model, and the subsequent critical
appraisal and summary of current evidence to determine the pre-testing content of the
MR imaging domain, followed by overall hypothesis and specific aims.
2.6. Theoretical framework and definition of the concept
We investigated the combination of these factors and their contribution to
predicting outcome using the all-variable model approach complemented with clinical
judgement and statistical importance (beta estimates were provided) in a well-controlled
prospective cohort. The functional outcomes after surgery were measured through the
following domains: 1) demographic factors, 2) baseline severity score, 3) duration of
symptoms, and 4) MR imaging features.
Demographic factors associated with lower recovery rate and/or lower
performance scores after surgical intervention include: advanced age in male population.
Barriers to optimum recovery are exacerbated due to greater neural tissue damage caused
by greater range of motion in narrower canal and higher C level involvement [41], as well
as diminished spontaneous plasticity with age [42]. The association with gender may be
explained by the differences in the mechanical loading/muscle compressive forces
promoting new bone growth [43]. These factors may lead to decreased likelihood of
maximising recovery and functional performance after surgery.
Longer duration of symptoms is associated with the functional score and recovery
rate at admission and follow-up [10, 13, 19, 27, 32, 44, 45] because long-standing
mechanical compression causes additional circulatory impairment of the spinal cord
[46]. This factor leads to an increased likelihood of poor recovery after surgery.
Higher baseline scores at admission are associated with greater improvement in
scores and recovery rate after surgical intervention [10, 19, 47]. The greater benefit from
surgery observed in groups of patients who are less functionally disabled could be due to
less severe neuropathologic alterations (e.g., edema, ischemia) in the spinal cord that could
reflect greater recuperative potential.
A greater degree of spinal cord compression increases the chances that the tissue
damage is irreversible despite surgical decompression and therefore leads to poor
recovery. Intrinsic signal changes, low on T1- and high on T2-weighted images, together predict a poor
functional recovery following surgery in comparison to the absence of these findings
[19]. These imaging findings likely represent long-standing and ongoing damage to the
neural elements of the spinal cord and the corresponding white matter tracts.
Furthermore, complex injuries that are contiguous over several spinal segments may
interfere with optimum recovery because such an injury is associated with profound
changes in the grey matter and significant changes in the posterior and anterior columns
which may result in more severe functional, electrophysiological and histological
deterioration [48].
CHAPTER 3
Currently available MR imaging based measurements to assess the
spinal cord in the setting of cervical spondylotic myelopathy:
review and critical appraisal
3.1. LITERATURE SEARCH
3.1.1 Objective
The goal of this systematic review was to establish which MR imaging features
can predict outcomes following surgery, including functional disability score and
recovery percentage. Moreover, the level of evidence and quality of methodology were
examined in each study. MR imaging based measurements included transverse area of
spinal cord, compression ratio of spinal cord, anteroposterior diameter of spinal cord and
scoring systems to quantify the degree of spinal canal and cord compression,
absence/presence of changes in T2SI, degrees of signal intensity changes, multisegmental
area of signal intensity changes, signal intensity ratio, and T1WI/T2WI signal intensity
change patterns.
3.1.2 Inclusion criteria
Articles were included if they satisfied the following criteria: a minimum sample
of 25 patients aged 18 years and older with symptomatic CSM who underwent surgical treatment and
were followed up post-surgically; a detailed description of MRI features; and functional scores
and recovery percentage as outcomes of interest. Study design was not limited to any
particular methodology. Studies that included subjects with spinal cord compression due
to trauma or mass lesions, or with acute spinal cord injuries, or that used kinematic MR imaging,
diffusion-weighted MR imaging, cine-phase contrast MR imaging, perfusion-weighted
MR imaging or phase-contrast MR imaging, were excluded. No review papers were
included in the study.
Figure 3.1: Flow diagram of inclusion and exclusion criteria of systematic reviews.
Potentially relevant publications identified and screened for retrieval (n=6890) → Papers evaluated against the inclusion and exclusion criteria on the basis of title, abstract and full-article review (n=112) → Studies included for prediction (n=22)
3.1.3 Identification of studies and assessment of methodological quality
Papers examining the predictive value of MR imaging features were identified
through searches of Medline, Embase, and PubMed covering January 1980 to November 2008. Of
112 publications identified initially, 22 articles fulfilled the inclusion and exclusion
criteria, and constituted the basis for this review. Search terms (and MeSH headings:
“magnetic resonance imaging”, “predict”, “prognosis”, “cervical spondylotic
myelopathy”, “spinal canal”, “spinal cord compression”, “cohort studies”) used to
identify the study population included cervical spondylotic myelopathy, spinal cord
compression, spinal canal compromise, cervical myelopathy and central cord syndrome.
For complete search strategy, please refer to Appendix 1. Articles were initially screened
on the basis of title, abstract and the reference lists from 22 articles; full text copies were
then examined to ensure that studies met all inclusion criteria. All relevant papers were
evaluated for validity of evidence using a checklist for assessment of methodological
quality, specifically designed for the predictive studies. The Cochrane guideline for
assessment of the non-randomized studies was revised, making it more specific to the
predictive nature of reviewed studies and the medical condition of interest (CSM) [34]. It
included items such as sample representation, blinding, baseline comparability, follow-
up, validity and reliability of primary outcome measure and predictors. Table 3.1
presents criteria description for included studies in a modified version of quality
assessment checklist designed by the Cochrane collaboration group et al (2007) [49].
Table 3.2 presents the summary of methodological limitations in CSM predictive studies.
In addition, both reviewers assessed the quality of methodology and determined the level
of evidence according to Sackett et al 2000 (Appendix 4). Differences in rating the
quality of articles were resolved by consensus between the two raters. Definite conclusions
were drawn only when at least two studies provided similar findings
using comparable lengths of follow-up and outcome measures. The representativeness of
sample, blinding and baseline comparability played a significant role in drawing these
final conclusions.
The following are the criteria description for including studies, using a modified version
of the quality assessment checklist designed by the Cochrane collaboration group et al
2007 (Table 3.1)*.
Representative sample
• Source description
• Referral pattern
• Patients’ characteristics
• Sample size
Blinding
• Blinded assessor
Baseline comparability
• Compared baseline performance of clinical status
• Compared baseline performance of other known predictive variables
Follow-up
• Complete
• Comparison of dropouts
• Reasons for dropouts
Psychometric properties of primary outcome measurement
• Validity
• Reliability
Accuracy of MR imaging measurements
• Definition
• Reliability
* Ryan R, Hill S, Broclain D, Horey D, Oliver S, Prictor M; Cochrane Consumers and
Communication Review Group. Study Quality Guide. March 2007.
www.latrobe.edu.au/cochrane/resources.html (July 2009).
3.1.4 Data extraction
3.1.4.1 Severity of myelopathy: functional score and recovery percentage
Table 3.3 includes study design, sample size, type of outcome measures and level
of evidence. These data were selected to provide a description of the cohort and show the
level of evidence. Table 3.4 includes categories of MRI features (signal intensity
changes, spinal cord compression and spinal canal compromise). These data were
selected to provide a description of a spectrum of MRI features used to evaluate severity
of myelopathy in cohorts. The percent recovery and/or post-operative score as outcomes
of interest after surgery were extracted according to the definition used in the individual
articles. This was generally evaluated using the recovery percentage formulae proposed
by Hirabayashi for JOA scoring system [9], its modified version (mJOA) [11], and for the
NCS scoring system [12]. The difference between the initial and final assessment scores
was reported for studies using the Nurick [50] and the walking test [24]. Table 3.5
includes data describing potential predictors of functional scores and recovery percentage
for which the strength of association with short term (at or less than 3 months/ 3-6
months) and long term follow-up (6-12 or greater than 12 months) in patients with
cervical myelopathy was reported. The data extracted in Table 3.6 were preoperative
variables that have been shown to be significantly associated with post-operative scores
and percent of recovery after surgery. The results of the 22 eligible studies were not
pooled because of heterogeneous study designs.
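The recovery percentage referred to above can be illustrated with a minimal sketch. It assumes the commonly cited form of the Hirabayashi formula, (post-operative score − preoperative score) / (maximum score − preoperative score) × 100, with a maximum of 17 for the JOA scale and 18 for the mJOA scale. The function name, the maximum scores and the example patient are assumptions made for illustration and are not taken from the individual articles.

```python
def recovery_percentage(pre_score, post_score, max_score=17):
    """Recovery percentage in the (assumed) Hirabayashi form:
    (post - pre) / (max - pre) * 100.

    max_score is an assumption: 17 for the JOA scale, 18 for the mJOA scale.
    """
    if pre_score >= max_score:
        # No room for improvement, so the ratio is undefined.
        raise ValueError("preoperative score must be below the maximum score")
    return (post_score - pre_score) / (max_score - pre_score) * 100.0


# Hypothetical patient: score of 9 before surgery and 13 at follow-up.
joa_recovery = recovery_percentage(9, 13, max_score=17)   # 4 / 8 * 100
mjoa_recovery = recovery_percentage(9, 13, max_score=18)  # 4 / 9 * 100
```

Because the denominator depends on the scale maximum, the same raw improvement yields a different recovery percentage on the JOA and mJOA scales, which is one reason results from the two scales are not directly interchangeable.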
3.2. Results
Multiple studies have identified preoperative MR imaging variables that are
associated with functional scores and percent of recovery after surgery in the CSM
population (N=17) (Table 3.5); however, only a few studies have tested these variables
while adjusting for age, duration of symptoms and baseline severity score (N=5). These studies
are summarized in Table 3.6.
I. Samples
The median sample size was 73. Few studies had a representative sample of the
study population: 5 (23%) studies reported a consecutive sequence of referrals, and 8
(36%) studies adequately described the sources of their selected patients. Few studies
attempted to create homogeneous cohort groups: 11 (48%) studies specified inclusion
and exclusion criteria, 12 (52%) described a well-defined point in the clinical course of
disease, and 11 (48%) included a wide spectrum of patients at various stages,
severities and subtypes of cervical myelopathy.
II. Measurement of MR imaging features and functional outcomes
Although 16 (73%) studies adequately specified MR imaging features and
outcome criteria (e.g. a detailed MRI protocol was provided, including the imaging plane and
slice thickness), the description of the measurement techniques was limited.
Few studies reported MR imaging features that were measured blind to the presence of
neurologic impairments (6; 27%) or examined the reliability of their measurement
instruments (4; 18%). The JOA and mJOA scales were used to assess functional outcomes
in 17 (77%) and 4 (18%) studies, respectively. Of the remaining 2 studies,
1 (2.5%) used the Nurick scale and 1 (2.5%) used the walking test.
III. Loss of participants to follow up
Three (13%) studies had dropouts comprising more than 20% of participants,
with no reasons listed for loss to follow-up and no demographic
or clinical characteristics available to compare the patients who were lost with those
in whom follow-up was complete. Therefore, it was impossible to investigate the effect
of lost patients on the validity of these studies. Nineteen of the 22 studies reported post-operative
functional scores or recovery percentage (%) from at least one follow-up time point; the
remaining 3 studies did not report the time of follow-up [27, 31, 33].
3.1.4.2 MRI predictive factors
For Table 3.5, potential MR imaging predictors associated with post-operative functional
scores collected at follow-up, as well as with percent of recovery, were extracted from all
cohort groups.
Table 3.7: List of MR imaging features as potential predictors of recovery percentage
and functional scores after surgery
Predictor / MR imaging feature | Outcome: functional score | Outcome: recovery percentage
COMPRESSION OF SPINAL CANAL AND CORD
Transverse area | Functional score | Recovery percentage
Compression ratio | ------ | Recovery percentage
Anteroposterior diameter | Functional score | Recovery percentage
Spinal canal and cord deformities on sagittal T1-/T2-weighted MRI | ------ | Recovery percentage
Cord compression on sagittal and axial T1-weighted MRI | ------ | Recovery percentage
Cord compression on axial T1-weighted MRI | ------ | Recovery percentage
Spinal canal and cord deformities on axial T1-/T2WI | Functional score | ------
ISCHEMIC CHANGES OF SPINAL CORD
Absence/presence of signal changes on T2WI | Functional score | Recovery percentage
Degree of intensity signal changes on T1/T2WI | ------ | Recovery percentage
Area of signal intensity changes on T1/T2WI | Functional score | Recovery percentage
Intensity ratio of signal changes on T1/T2WI | ------ | Recovery percentage
Patterns of signal intensity changes on T1/T2WI | ------ | Recovery percentage
3.2.1 Compression of spinal canal and cord
Transverse area
Summary
The transverse area at the site of maximum compression was measured to study the
morphological changes of spinal cord [19, 22, 27]. The results of our systematic review
suggest that transverse area of spinal cord is associated with recovery percentage and
functional score at long term follow up (greater than 6 months) before and after
adjustment for other important confounding variables.
Recovery percentage
Fukushima et al (Level I) reported significant differences in recovery percentage
from 6 to 48 months after surgery (mean follow up of 17 months) in groups with a spinal
cord area of less than 0.45 cm² (p<0.01), reflecting the irreversible pathology of the spinal
cord [22]. Similarly, Okada et al (Level IIc) showed an association within each individual
etiology group, OPLL (ossification of the posterior longitudinal ligament) (r=0.678,
p<0.01) and CSM (cervical spondylotic myelopathy) (r=0.586, p<0.01). After adjusting
for disease etiology, duration of symptoms, and signal intensity changes, the investigators
showed that the preoperative cross-sectional area of the spinal cord is an independent
predictor of recovery percentage in CSM patients (no follow up time was reported).
Morio et al (Level IIc) found an association in the same direction, although it was weak
and did not reach statistical significance, between recovery percentage at follow up
(6 months to 10 years, mean follow up of 3.4 years) and preoperative cross-sectional
area of the spinal cord (r=0.243, p = 0.0517). All three studies evaluated the percentage
of recovery using the formula proposed by Hirabayashi, based on the original JOA scale.
Functional score
Transverse area was consistently shown to be associated with postoperative JOA
scores in two prior studies before statistical adjustment for other important confounding
variables. Fukushima et al 1991 (Level I) reported significant association between
transverse area and post-operative JOA scores (r=0.298, p<0.05) [22]. Similarly, Morio
et al 2001 (Level IIc) reported significant association between post-JOA scores and
preoperative spinal cord surface (r=0.398, p = 0.0015) [19]. After adjustment for age,
duration of symptoms and preoperative scores, Morio et al showed that the preoperative
surface area of the spinal cord is not associated with the functional score of CSM
patients.
Compression ratio
Summary
The compression ratio was consistently defined as a ratio of sagittal and transverse
diameters on T1-weighted axial imaging [10, 27, 51]. Our systematic review identified
three original articles where compression ratio was studied as an MR imaging feature to
quantify the severity of spinal cord compression. The results of our systematic review
suggest that compression ratio is not associated with recovery percentage at long term
follow up (greater than 6 months).
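As a minimal sketch of the definition above, the compression ratio can be computed from two diameters measured on the same axial image. The function name, the choice of millimetres and the example measurements are hypothetical; the sketch also assumes the ratio is taken as sagittal over transverse diameter and reported as a plain fraction rather than a percentage.

```python
def compression_ratio(sagittal_diameter_mm, transverse_diameter_mm):
    """Ratio of sagittal to transverse cord diameter on an axial image.

    Assumption: sagittal / transverse, returned as a fraction. A flatter
    (more compressed) cord therefore gives a smaller value.
    """
    if sagittal_diameter_mm < 0:
        raise ValueError("sagittal diameter cannot be negative")
    if transverse_diameter_mm <= 0:
        raise ValueError("transverse diameter must be positive")
    return sagittal_diameter_mm / transverse_diameter_mm


# Hypothetical measurements at the level of maximum compression.
flattened_cord = compression_ratio(5.0, 12.5)
rounder_cord = compression_ratio(7.5, 12.5)
```

Under this convention a lower value indicates a more flattened cord, so the direction of any reported association depends on which way the ratio was defined in the individual study.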
Recovery percentage
In studies by Chen et al and Okada et al, the compression ratio measurements
were reported to have no association with recovery percentage evaluated using the
formula proposed by Hirabayashi based on the mJOA and JOA scores, respectively [10,
27]. Okada et al 1993 (Level IV) reported that the recovery percentage was not
significantly associated with compression ratio irrespective of etiology (OPLL, CSM or
CDH (cervical disc herniation)) (follow up period was not reported) [27]. Chen et al
(Level IV) reported similar findings, with a non-significant association between
preoperative compression ratio and recovery percentage at 6 months follow up after
surgery (r=0.026, p=0.836) [10]. In contrast, Chung et al (Level IIc) showed that
compression ratio is associated with recovery percentage calculated from the JOA scores
from 24 to 84 months after surgery (mean follow up of 42 months), where patients were
divided into two groups according to the recovery percentage: a ‘good’ group (n=19)
and a ‘fair’ group (n=18). Patients in the good group showed a
greater compression ratio (p<0.05) [28]. In all likelihood, the inconsistent findings
reported by Chung et al are due to variable outcomes.
Anteroposterior diameter (AP diameter)
Summary
Whether AP diameter on the preoperative axial T1 image can be used to predict recovery
percentage and functional disability score after surgery evaluated by JOA scale, remains
inconclusive due to limited number of studies available in the literature and poor
methodology used to support these findings.
Recovery percentage & Functional score
In our systematic review, only one previous study, by Yone et al (Level IV) (45
OPLL, 64 CSM, 31 healthy patients), compared morphological spinal cord changes,
functional score and recovery percentage (follow up period was not reported) [31]. No
relationship was found between AP diameter and post-operative functional scores. In
contrast, the authors found a significant association between recovery percentage, evaluated
by Hirabayashi’s formula based on the JOA scores, and preoperative minimum AP
diameter among OPLL but not CSM patients. Although the authors reported
differences in recovery percentage between the two pathologies, neither the statistical
analysis nor the mean JOA scores were documented.
Scoring systems to interpret spinal cord compression
Summary
The results of our systematic review are inconclusive as to whether the
severity of spinal cord and canal deformities (described using
scoring systems) is associated with recovery percentage calculated from JOA scores
in three prior studies. Recovery percentages across the cohorts were highly variable; in all
likelihood these inconsistencies are due to the number and variety of measures used to
interpret the severity of spinal cord and canal compression on MRI. Similarly, whether MR
imaging severity scoring systems can be used to predict functional score in CSM
patients after surgery also remains inconclusive due to variable follow-up periods and
outcome measures used to quantify functional scores.
Recovery percentage
A. Spinal canal and cord deformities on sagittal T1-/T2-weighted MRI
Kasai et al (Level IIc; 128 CSM patients) retrospectively studied a new method of
evaluating the cumulative severity of stenosis captured on preoperative T1- and T2-weighted
sagittal images and recovery percentage from 12 months to 9.7 years after
surgery (mean follow up of 4.8 years) [52]. The authors used a six-grade scale to classify
the severity of spinal cord compression, describing severity in terms of anterior/posterior
space and cord compressions (Table 3.4 (II)). The recovery percentage was calculated
from the JOA scores obtained at long term follow-up and correlated with the
preoperative MRI cumulative score. The authors found a significant negative
correlation between the defined MR imaging findings and recovery percentage (r=-0.436,
p<0.01).
B. Spinal cord compression on sagittal and axial T1-weighted MRI
Nagata et al (Level IV; 300 CSM patients) prospectively compared preoperative
MRI and recovery percentage calculated based on the JOA scores collected at an average
follow up of 19 months [53]. The morphological changes of the spinal cord were
stratified into four categories of preoperative cord compression on sagittal T1-weighted
MRIs: Class 0, no compression; Class 1, slight cord compression; Class 2, cord width
decreased by less than 1/3; Class 3, cord width decreased by at least 1/3. The authors
found that the degree of spinal cord compression on sagittal T1-weighted MRI was
not significantly correlated with the severity of myelopathy (no p or r values were
reported).
C. Spinal cord compression on axial T1-weighted MRI
Matsuyama et al (Level IV; 44 OPLL patients) compared recovery percentage,
calculated from JOA scores obtained at 1 month follow up, across categories of
cross-sectional spinal cord configuration. Three shapes were described,
boomerang, teardrop, and triangular, with mean recovery percentages of
61.8%, 72.1% and 23%, respectively (no p or r values were reported) [54].
Functional score
D. Spinal cord compression on axial T1-/T2WI
One study by Singh et al (Level I; 69 CSM patients) found no significant
association between spinal cord compression and post-operative walking scores obtained
at 3 months follow up (r=0.07, p=0.60) [24]. Based on the pattern of spinal cord
compression on T2-weighted MR images obtained at baseline, the authors classified all
patients into four categories: none (0), mild (1; flattening or concavity of the anterior
surface only), moderate (2; <50% reduction in maximal sagittal diameter) and severe (3;
>50% reduction in maximal sagittal diameter).
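The four-category grading described above can be sketched as a simple rule. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, "mild" is operationalised as an anterior surface deformity with no measured reduction in sagittal diameter, and a reduction of exactly 50%, which the categories as described leave unassigned, is treated here as severe.

```python
def grade_cord_compression(anterior_surface_deformed, sagittal_reduction_pct):
    """Return a 0-3 grade following the four categories described above.

    Assumptions (not stated in the source): "mild" means an anterior
    surface deformity with no measured diameter reduction, and a
    reduction of exactly 50% is graded as severe.
    """
    if not 0 <= sagittal_reduction_pct <= 100:
        raise ValueError("sagittal_reduction_pct must be between 0 and 100")
    if sagittal_reduction_pct >= 50:
        return 3  # severe
    if sagittal_reduction_pct > 0:
        return 2  # moderate
    if anterior_surface_deformed:
        return 1  # mild
    return 0      # none


# Hypothetical patients: (anterior deformity present, % reduction in diameter).
grades = [grade_cord_compression(flag, pct)
          for flag, pct in [(False, 0), (True, 0), (True, 30), (True, 60)]]
```

Making the cut-offs explicit in this way highlights where an ordinal grading scheme needs a documented convention before its reliability can be assessed.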
In addition to studying recovery percentage, Matsuyama et al examined the
relationship between morphological characteristics of spinal cord deformities on the
preoperative MR image and functional score assessed by JOA score obtained at 1 month
following surgery [54]. Although the mean functional disability scores were reported in
three groups (triangular cord configurations, A=31.8 mm², post-JOA = 11.6; teardrop
cord, A=39.0 mm², post-JOA = 15.2; boomerang cord, A=35.4 mm², post-JOA = 14.2),
no direct comparisons were reported.
One study by Nagata et al (Level IIc; 74 CSM, 52 CDH, 49 OPLL patients)
retrospectively studied morphological changes and functional scores obtained from 12 months to
4.5 years (mean follow up of 1.5 years) after surgery in elderly patients [18]. The
morphological changes of the spinal cord were stratified into four categories of preoperative
cord compression on sagittal T1-weighted MRIs: Class 0, no compression; Class 1, cord
compressed slightly; Class 2, cord width decreased by less than 1/3; Class 3, cord width
decreased by at least 1/3. The authors reported that patients with
lesser cord distortion at baseline performed better. The validity of these findings is difficult
to judge due to poor description of performance (no report of mean, SE, p and r values).
The inconsistencies in observations obtained at short term follow up could be due
to the variable measures used to assess post-operative functional scores. The findings
obtained at long term follow up remain inconclusive.
E. Spinal canal compromise and cord compression on axial T1-/T2WI
One study by Uchida et al (Level IIc; 135 CSM/OPLL patients) retrospectively
studied morphological changes and functional scores assessed by the JOA scale from 12
months to 12.8 years (mean follow up of 8.3 years) after surgery [47]. The percentages
of flattening and of canal narrowing were used to estimate the morphological changes
of the spinal canal and cord on sagittal T1-weighted MRI. The authors reported that
better functional scores in OPLL, but not CSM, patients were associated with
preoperative spinal canal narrowing by ossification of <40% and an extent of cervical
cord flattening of ≥50%. The validity of the findings is difficult to judge due to poor
description of performance (no report of mean, SE, p and r values).
3.2.2 T2 signal changes on MRIs of the spinal cord
The evaluation of signal intensity changes is intended to assess the secondary damage to
the spinal cord of patients with CSM.
Absence/presence of signal changes on T2WI
Summary
In our systematic review, signal changes on T2WI have the greatest number of
publications among all MR imaging features used for prediction of functional outcomes
following surgery in the setting of CSM. High T2 signal intensity changes on MR images
were not associated with recovery percentage after surgery at long term follow up
(greater than 6 months); the findings at short term follow up are less conclusive. In
contrast, high T2 signal intensity changes found on the preoperative mid-sagittal MR image
were associated with functional score after surgery at long term follow up (greater than 6
months); the findings at short term follow up are again less conclusive.
Recovery percentage
In a case series, Mizuno et al (Level IV; 82 CSM, 62 OPLL patients) found a
significant difference in recovery percentage, calculated from JOA scores from 3 to 6
months after surgery (mean follow up of 3.7 months), between the SEA group (snake-eye appearance,
characterized as nearly symmetrical round high signal intensity of the spinal parenchyma
resembling the face of a snake; 32.2 ±15.1%) and the NSEA (no snake-eye
appearance) group (47.1 ±12.1%) (p<0.001) [55]. In contrast, Wada et al (Level IIc) [29]
showed the absence of a relationship between signal changes and recovery percentage
obtained at 1.5 months. The inconsistency with the findings of Wada et al 1995 could be due
to the different imaging planes (axial versus sagittal) and approaches used to describe signal
intensity changes (‘yes/no’ versus ‘snake eye/non-snake eye’ appearances).
Inconsistencies in recovery percentages could also be due to selection bias, given that neither
study gave sufficient details on the source and methods of patient enrolment, or
a description of key patient characteristics such as CSM severity groups, presence of
other co-morbidities, inclusion/exclusion criteria, etc. It is important to note that the study
by Wada et al analyzed recovery percentages in groups with comparable JOA scores
at admission.
Whereas several authors [8, 15, 28, 31] found that high T2 signal is not associated with
recovery percentage, Yukawa et al (Level IV; 142 CSM/OPLL/CDH/calcification of the
yellow ligaments patients) found that the recovery percentage, calculated from JOA
scores obtained from 12 to 90 months after surgery (mean follow up of 40 months), was
significantly different between groups of patients with and without signal changes on
sagittal T2W MRIs (p=0.033, r-value was not reported) [56]. Given the important role of
age and duration of symptoms in affecting outcomes after surgery, the studies by Houten et
al 2003 and Yamazaki et al 2002 ensured similar baseline characteristics of patients in
comparison groups.
Functional score
Wada et al (Level IIc) [29] showed no association between signal changes and
functional score assessed by the JOA scale at 1.5 months follow-up. In contrast, a study by
Singh et al (Level I) showed that signal change was associated with functional score assessed by
the walking scale at 3 months after surgery [24]. The authors concluded that CSM patients
with higher severity scores at admission and T2 signal change showed more change in functional
score. Because clinical severity at baseline was not statistically adjusted between
comparison groups, it is difficult to conclude that high T2 signal changes alone are
independently associated with functional scores after surgery (p=0.0011). However,
Wada et al compared groups with comparable baseline severity. The findings remain
inconclusive due to the different instruments used to measure function in CSM patients.
Yukawa et al (Level IV; p=0.0012) [56], Papadopolous et al (Level IV;
p<0.001) [57] and Matsuda et al (Level IV; p<0.05) [14] showed a consistent association between
signal intensity changes on sagittal T2-weighted MR images and functional scores
assessed by the JOA scale at long term follow up (greater than 6 months after surgery);
however, Houten et al (Level II) [8] found no significant difference across
comparison groups at long term follow-up. Houten et al and Yukawa et al had
comparable patient characteristics at admission except for severity score, whereas Papadopolous
et al had comparable baseline severity but no control for other important predictors; these
differences could be contributing to the inconsistencies seen in the results.
Degree of signal intensity changes on T1-/T2-WI
Summary
In the CSM population, the impact of the degree of signal intensity change on
recovery percentage was documented in three previous studies that were captured in our
systematic review [10, 47, 56]. The findings suggest that the assessment of CSM severity
based on the degree of signal intensity changes on preoperative T2WI is useful as a
predictor of recovery percentage at long term follow-up after surgery, calculated from
JOA scores. The findings were consistent before and after adjustment for important
confounding variables.
Recovery percentage
Yukawa et al (Level IV; 142 CSM/OPLL/CDH/calcification of the yellow
ligaments patients) and Chen et al (Level IV; 64 CSM patients) showed an association
between a highly intense signal intensity area with a well-defined border and poor recovery
percentage obtained at long term follow up (p=0.018 and p<0.001, respectively).
Chen et al confirmed these associations after controlling for
other important confounding variables such as age, sex, preoperative JOA score, cervical
curvature, and cord compression ratio [10].
Although Uchida et al (Level IIc; 135 CSM/OPLL patients) reported no significant
association, the observed differences could be attributed to the different MR imaging
classifications used to assess the degree of signal intensity changes [47].
Multisegmental area of signal intensity changes on T1-/T2WI
Summary
The results of our systematic review suggest that a multisegmental area of signal intensity
changes is associated with recovery percentage and functional score irrespective of scale
(original or modified version of JOA) at long term but not at short term follow up after
surgery. Further research is needed to replicate these findings while
controlling for other important confounders including baseline severity score,
demographics and duration of symptoms.
Recovery percentage
Three studies reported consistent findings on the differences in recovery
percentage, measured by the original or modified version of the JOA scale at long term
follow up, between groups of patients with focal and multisegmental areas of high signal
change on preoperative MR imaging [32, 57, 58]. Wada et al (Level IIc;
85 CSM patients), Fernandez et al (Level I; 12 CDH, 55 OPLL patients) and
Papadopolous et al (Level IV; 42 CSM patients) reported a significant relationship
between multisegmental area of high signal intensity on preoperative MRIs and poor
recovery percentage at long term follow up (greater than 6 months) (p<0.05, p<0.001 and
p=0.001, respectively). Another study by Wada et al (Level IIc; 31 CSM patients)
reported no significant difference in recovery percentage obtained at 1.5 months across
comparison groups. It would seem likely that the inconsistencies in these findings
were due to the differences in follow up times [29].
Functional score
Based on the results of our systematic review, the relationship between area of T2
signal intensity change and functional score after surgery was examined in three studies.
While two studies found a significant association between area of T2 signal intensity change and
functional score assessed by the mJOA and JOA scales, respectively, at long term follow-up
(longer than 6 months) (Mastronardi et al (Level I) and Wada et al (Level
IIc)), Wada et al 1995 (Level IIc) found no statistical difference in post-operative JOA
scores obtained at 1.5 months between patients with multisegmental areas of high MRI
intensity (13.4±1.1) and those with focal areas (13.5±2.0). It would seem likely that the
inconsistencies seen in the reported findings by Wada et al were due to the differences in
the follow up times at which functional scores after surgery were measured.
Intensity ratio of signal changes on T1- and T2WI
Summary
It remains inconclusive whether signal intensity ratio is associated with recovery
percentage due to limited information reported on timing of follow up.
Recovery percentage
Our systematic review identified only one original article in which the ratio of signal
intensity changes on T1/T2WI was examined for its relationship with recovery
percentage after surgery [27]. Okada et al (Level IV) defined the signal-intensity ratio as the
sagittal T2-weighted MRI cord signal at maximally compressed levels divided by the
comparable readings at contiguous non-compressed sites. Okada et al (23 OPLL, 34
CSM, 17 CDH patients) showed a significant relationship between
recovery percentage and the mean preoperative intensity ratio, in particular in the
groups with myelopathy due to OPLL and CSM, r=0.537 (p<0.01) and r=0.426 (p<0.01),
respectively. Recovery percentages were reported for all three groups (OPLL,
54.7 ±17.7%; CSM, 52.21 ±5.9%; CDH, 78.3 ±19.1%); the CDH group had a significantly
higher recovery percentage (p<0.01) (no follow up time was reported).
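The signal-intensity ratio defined above can be sketched as follows. The function name and the example readings are hypothetical, and averaging the reference readings from the non-compressed sites is an assumption; the source states only that the signal at the compressed level is divided by comparable readings at contiguous non-compressed sites.

```python
from statistics import mean


def signal_intensity_ratio(compressed_signal, reference_signals):
    """Cord signal at the most compressed level divided by the reference signal.

    Assumption: the reference is the mean of the readings taken at
    contiguous non-compressed sites (e.g. one above and one below).
    """
    if not reference_signals:
        raise ValueError("at least one reference reading is required")
    reference = mean(reference_signals)
    if reference <= 0:
        raise ValueError("reference signal must be positive")
    return compressed_signal / reference


# Hypothetical T2-weighted readings in arbitrary scanner units.
ratio = signal_intensity_ratio(300, [200, 200])
```

Normalising against adjacent cord in the same image makes the measure less dependent on scanner-specific signal scaling than a raw intensity value would be.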
Patterns of signal intensity changes on T1-/T2WI
Summary
The assessment of CSM severity based on the combination of signal intensity changes on
both T1WI and T2WI shows promise as a potential predictor of functional scores
obtained at long term follow up after surgery.
Recovery percentage
In our systematic review, there is only one prior study that examined the role of
sagittal T1-/T2WI signal intensity change patterns as an independent predictor of
functional recovery percentage [19]. The study used the following patterns of spinal cord
signal intensity changes on T1-/T2WI to stratify patients into comparable groups: normal/normal
(N/N), normal/high-signal intensity changes (N/Hi), and low signal/high-signal
intensity changes (Lo/Hi). Morio et al (Level IIc; 42 CSM, 31 OPLL, 9 CDH patients)
retrospectively compared recovery percentage, assessed by JOA between 6 months and 10 years
(mean, 3.4 years) after surgery, across the different patterns of spinal cord
signal intensity changes. The authors showed a statistically significant difference between the N/Hi
(48.0 ± 24.9%) and Lo/Hi (19.1 ± 22.8%) groups (p=0.0259). Using stepwise
multiple regression, the best model for prediction of recovery percentage included
preoperative signal pattern combined with clinical features such as age and duration of
symptoms (adjusted r² = 0.297; p=0.0002). More research is needed to replicate this
finding in a prospective cohort study.
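For readers less familiar with the adjusted r² reported for regression models such as the one above, the following minimal sketch shows the standard adjustment for the number of predictors, 1 − (1 − r²)(n − 1)/(n − p − 1). The function name and the example values are hypothetical and are not taken from the cited study.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Standard adjusted r-squared: 1 - (1 - r2) * (n - 1) / (n - p - 1)."""
    if not 0 <= r_squared <= 1:
        raise ValueError("r_squared must be between 0 and 1")
    residual_df = n_observations - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_observations - 1) / residual_df


# Hypothetical model: r-squared of 0.5 from 21 patients and 4 predictors.
example = adjusted_r_squared(0.5, 21, 4)   # 1 - 0.5 * 20 / 16
```

The adjustment penalises each additional predictor, so a stepwise model that adds clinical variables improves the adjusted value only if the gain in explained variance outweighs the loss of degrees of freedom.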
3.3. OVERALL SUMMARY OF THE SYSTEMATIC LITERATURE REVIEW
Our systematic review identified 22 observational studies that examined
9 MR imaging measures as predictors of functional score and recovery
percentage. These included transverse area of spinal cord, compression ratio of spinal
cord, anteroposterior diameter of spinal cord, severity scoring systems to interpret the
degree of spinal cord compression and/or canal compromise, presence of high T2SI,
degrees of signal intensity changes, multisegmental area of signal changes, signal
intensity ratio, and T1WI/T2WI signal intensity change patterns, which were reported in
original articles of level-4, level-2b or level-1 evidence. The associations were studied
based on subgroups of measures of functional outcomes, follow up periods, and
adjustment for age, duration of symptoms and baseline severity score.
MR imaging predictors of functional scores & recovery percentage at short term
follow up
No MR imaging features were found to be associated with functional recovery
percentage at short term follow up. However, the multisegmental (linear) high intensity
areas on T2-weighted MR image were associated with recovery percentage at 1.5 months
of follow up. The relationship between anteroposterior diameter of spinal cord,
classifications of severity of spinal cord and canal deformities, and high T2 signal
intensity changes and recovery percentage at short term remains inconclusive.
MR imaging predictors of functional recovery percentage at long term follow ups
The degree of signal intensity changes and transverse area of the spinal cord were
found to be associated with recovery percentage at long term follow up (greater than 6
months). These data suggest that as the degree of spinal cord compression increases,
tissue damage is more likely to be irreversible despite surgical
decompression, leading to poor recovery. For both MR imaging features, the
findings were consistent before and after adjustment for age, duration of symptoms, and
baseline severity score. In contrast, high T2 signal intensity change and compression ratio
were consistently not associated with recovery percentage at long term follow up. The
relationship between anteroposterior diameter, severity scoring systems and signal intensity
ratio and recovery percentage at long term remains inconclusive.
MR imaging predictors of functional scores at long term follow-up
Using univariate analysis, transverse area of the spinal cord, high T2 signal
changes, multisegmental area of signal change and combined T1WI/T2WI signal
intensity changes were found to be associated with functional scores at long term follow
up (greater than 6 months). After adjustment for age, duration of symptoms and baseline
severity score, transverse area of spinal cord and combined T1WI/T2WI signal intensity
change patterns remained significantly associated with functional scores. Further
research is needed to evaluate the relationship between high T2 signal changes and
functional scores, adjusting for other important variables. The relationship between
anteroposterior diameter, severity scoring systems and signal intensity ratio and recovery
percentage at long term remains inconclusive.
Although transverse area of the spinal cord and combined T1WI/T2WI signal
intensity changes are consistently shown to be significantly associated with functional
scores at long term follow up, it must be noted that there are some methodological
limitations to the data that caution against definite interpretation. First, there are the
limitations associated with the insufficient information provided about sources and
methods of patient recruitment, about all key patients’ characteristics including degree of
CSM severity, co-morbidity, inclusion/exclusion criteria, age and sex. In this case, the
possibility of selection and measurement bias cannot be ruled out, which may have
distorted the true differences between comparison groups. Second, no reliability testing
has been undertaken regarding the method of using transverse area of the spinal cord to
ensure its consistency. Further studies are required to explore the role of MR imaging
variables in prediction of functional score on the basis of methodological standards
appropriate for good quality observational studies.
3.4. RATIONALE FOR STUDYING CLINICAL AND IMAGING PREDICTORS
OF OUTCOME IN CSM
Predicting the extent of functional gain is important for many reasons: it provides
information to patients about surgery related risks; it can be used among clinicians to
guide therapeutic decisions; it can provide better allocation of services; and it may be
useful in designing clinical trials to test the effect of certain interventions on outcomes.
Age, duration of symptoms and baseline severity score are consistently associated
with functional scores following surgery. Therefore, it would be essential to adjust for
these variables in the comparison of functional scores across varying MR imaging
features. In addition, while it is clear that age, duration of symptoms and baseline severity
score are associated with functional outcomes in patients with CSM, it is less clear
whether they are reliable predictors of functional outcomes following surgery.
It has been suggested previously that MR imaging can be predictive of function
after surgery. However, these assessments fall short of providing clinicians with key
information in CSM because they are either qualitative, or studied in a quantitative way
in less methodologically rigorous studies. The rationale for studying MR imaging
predictors of functional outcomes is supported by several factors. Few studies have
reported an extensive evaluation of predictors of outcomes after surgery, reporting the
beta estimates and therefore establishing the strength of the relationships of individual
variables and outcomes. The majority of studies simply report associations without
controlling for other important confounding variables such as age, duration of symptoms
and baseline score. The reported strength and statistical significance of these associations
are therefore of limited clinical utility. Furthermore, few report using the mJOA scale, the
current standard outcome measure of functional disability in the CSM population [10]. The
mJOA scale was modified from the JOA scale to allow for cultural differences in western
populations. The majority of MR imaging predictors described in these studies were also
associated with recovery rate and not the mean of post-operative functional scores after
surgery. The question still remains whether the above mentioned MR imaging parameters
are predictive of patients’ functional score after surgery. There is also a lack of
availability of information on inter-rater reliability of MR imaging measurements or
stability of measurements. The last factor is due to the lack of standardized MR imaging
protocols and clinical assessments collected concurrently in a prospective cohort sample.
3.5. HYPOTHESIS AND STUDY OBJECTIVES
OVERALL STUDY OBJECTIVE:
To develop a predictive model of functional outcome incorporating key demographic,
clinical and MR imaging assessments in patients with cervical spondylotic myelopathy
undergoing surgical treatment.
Hypothesis: Key demographic parameters, clinical factors and MR imaging features of
the site of cervical cord compression are independently associated with baseline scores
and predictive of functional outcomes scores at 12 months follow up in patients with
CSM undergoing surgical treatment.
Each specific aim contributes to the overall objective:
Specific Aim I: Reliability assessment of MR imaging to assess cord compression in
CSM (Appendix 3).
Objective: To investigate the inter-rater reliability of two published methods
(transverse area and anteroposterior diameter) of examining cord stenosis on axial
MR images.
Question: Are the ICC values of transverse area and anteroposterior
diameter of spinal cord methods free of systematic errors (bias)?
Specific Aim II: Development of a predictive model of outcome in patients with CSM
undergoing surgical treatment
Objective: To address the limitations of the current literature by prospectively
evaluating if demographic, clinical and radiological factors in patients with CSM
are predictive of functional outcomes pre- and post- surgery.
Questions: After controlling for age, gender and duration of symptoms,
MRI is independently associated and predictive of functional outcomes at
baseline and 12 months follow-up, respectively.
CHAPTER 4
MATERIAL AND METHODS
4.1. STUDY OBJECTIVES
Chapter 4 provides details of study methodologies designed to answer two research
questions related to Specific Aims I & II.
4.2. STUDY DESIGN
A total of 85 CSM patients who were consecutively referred to the spine clinic at
the Toronto Western Hospital (an academic tertiary care institution affiliated with the
University of Toronto) from February 2006 to November 2007 were prospectively
recruited for this study. This project is based on analysis of a single centre which is part
of a larger multicentre AOSpine North America CSM Trial; n=283 cases.
The proposed research is based on secondary analysis of an existing data set
housed in a research database. The primary aim of the present study was to compare the
clinical and radiological outcomes, functional status, disease specific, and general health
related quality of life between patients managed with anterior vs. posterior approaches
using the Nurick Score, mJOA score, MR and plain radiographs, Neck Disability Index,
30 meter walk test and the SF-36 at baseline, 6, 12 and 24 months following surgery.
Data entry was validated (e.g., logic checks including range checks, missing value
checks) both by visual inspection and built-in database programming during the data
entry process. The subject’s electronic study file was not considered complete until
mandatory data fields were completed. The central study database was monitored by an
external representative and queries were made on a regular basis to ensure the quality and
integrity of the data.
4.3. TARGET POPULATION
This study included all consecutive CSM patients who were referred to the single spine
centre of the Toronto Western Hospital from February 2006 to November 2007. A total
of 99 patients with CSM were surgically treated per standard of care. Surgeons used their
expertise and preferences to determine the method of surgical intervention. Individual
techniques or combinations of techniques were possible, including anterior cervical
decompression and fusion, laminoplasty, and laminoplasty and fusion. Of the 85 eligible
patients, 20 were excluded because they were unable to have MRI (e.g., pacemaker) and had
CT/myelography instead. After excluding 4 patients who were lost to follow up, 61 subjects
(follow-up percentage: 94%) were analyzed for prediction of functional outcomes (please see
summary characteristics of the study population in Table 5.1).
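The patient counts reported in this section can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative and not part of the study protocol.

```python
# Counts taken from the text; variable names are illustrative assumptions.
treated = 99            # CSM patients treated surgically, Feb 2006 to Nov 2007
eligible = 85           # patients meeting the eligibility criteria
no_mri = 20             # had CT/myelography instead of MRI
lost_to_follow_up = 4   # no 12-month follow-up available

with_mri = eligible - no_mri                # patients with MR images
analyzed = with_mri - lost_to_follow_up     # final analyzed sample
follow_up_pct = 100 * analyzed / with_mri   # follow-up percentage

print(with_mri, analyzed, round(follow_up_pct))  # 65 61 94
```

The follow-up percentage of 94% is therefore computed against the 65 patients with MR images, not against the 85 eligible patients.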
The patients had a clinical diagnosis of cervical myelopathy confirmed with
characteristic findings on MRI consistent with CSM. CSM was defined as a constellation
of symptoms and signs supported by appropriate radiological findings, including
symptoms (numb clumsy hands, impairment of gait, bilateral arm paresthesia,
Lhermitte's phenomenon) and signs (corticospinal distribution motor deficits, atrophy of
hand intrinsic muscles, hyperreflexia, positive Hoffmann sign, upgoing plantar responses,
lower limb spasticity, broad based unstable gait) [1]. Any associated conditions such as
cardiovascular disease, angina/coronary artery disease, congestive heart failure,
arrhythmia and hypertension and diabetes, were not considered to be exclusion criteria.
Eligible patients were identified by the treating spine neurosurgeons during the initial
examination in spine clinics at Toronto Western Hospital. The pathologic conditions were
cervical spondylotic myelopathy, cervical ossification of the posterior longitudinal
ligament, soft disc herniation, hypertrophic ligamentum flavum and subluxation.
A flow diagram of the study population is shown in Figure 4.1.
[Figure 4.1: flow diagram of the study population]
Total CSM patients treated surgically from February 2006 to November 2007: N=99
N=85 eligible subjects*
N=61 analyzed sample (N=4 were lost to follow up, N=20 had CT scans/myelography)
*Not eligible for one or more of the reasons listed below:
- Asymptomatic cervical cord compression
- Previous surgery for CSM
- Active infection
- Neoplastic disease
- Rheumatoid arthritis
- Ankylosing spondylitis
- Trauma
- Concomitant symptomatic lumbar spinal stenosis
- Not referred for surgical consultation
- Pregnant women or women planning to get pregnant during the study period
- History of substance abuse
- Incarceration
- Currently involved in a study with similar purpose
- Has a disease process that would preclude accurate evaluation (e.g. neuromuscular
disease, significant psychiatric disease)
- Patients seen by other services
- Age <18 years
- Unable and not willing to give consent to participate in study
- Not willing and not able to participate in the study follow up according to the protocol
- Does not understand and cannot read English at elementary level
4.4. DEFINITION OF THE PRIMARY OUTCOME
Although the most important outcome of decompression surgery for stenosis is
resolution of symptoms, it is the ability to regain normal function in activities of daily
living that has become of a great importance. The functional disability scale allows us to
better understand the expectations of surgical treatment for CSM. The modified
Japanese Orthopaedic Association (mJOA) functional disability scale was
used for classification of CSM severity through assessment of upper extremity function
(5 points), lower extremity function (7 points), sensory function (3 points) and urinary
bladder function (3 points). The scale ranges from 0-18 with higher scores indicating
better function (Table 4.1) [11]. The mJOA was used as the primary outcome measure to
quantify function pre-surgery and at 12 months follow-up. The 12-month time frame was
chosen because it represents a typical time period of optimum recovery for CSM. A
Cronbach alpha of 0.66 and 0.65 for preoperative and postoperative JOA scores
respectively, has been reported for internal consistency. The preoperative and
postoperative JOA scores (original scale) also correlate with other measures, namely the
Myelopathy Disability Index (MDI), European Myelopathy Score (EMS), Ranawat and
Nurick scales, with Pearson product-moment correlation coefficients ranging from 0.47 to 0.62
and 0.42 to 0.72, as expected [23].
We chose the mJOA scale as the outcome measure instead of the original version
because it is currently the standard outcome measure of functional disability in
the CSM population. It is disease specific and it was modified from the JOA scale to
allow for cultural differences in western populations.
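As an illustration of how the total mJOA score is assembled from its four domains, the following is a minimal sketch that assumes each domain is scored from 0 up to the maximum stated above. The class name, field names and the example patient are hypothetical; the sketch does not reproduce the item wording in Table 4.1.

```python
from dataclasses import dataclass

# Maximum points per mJOA domain, as listed in the text.
MJOA_MAX = {"upper_extremity": 5, "lower_extremity": 7, "sensory": 3, "bladder": 3}


@dataclass
class MJOAAssessment:
    """One mJOA assessment; field names are illustrative assumptions."""
    upper_extremity: int
    lower_extremity: int
    sensory: int
    bladder: int

    def total(self) -> int:
        """Sum the four domain scores after checking each is within range."""
        total = 0
        for domain, max_points in MJOA_MAX.items():
            value = getattr(self, domain)
            if not 0 <= value <= max_points:
                raise ValueError(f"{domain} must be between 0 and {max_points}")
            total += value
        return total


# Hypothetical patient, not study data.
baseline = MJOAAssessment(upper_extremity=3, lower_extremity=5, sensory=2, bladder=3)
print(baseline.total())  # 13
```

The domain maxima sum to 18, which matches the stated range of the scale.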
4.5. PRIMARY EXPOSURE (INDEPENDENT VARIABLES)
4.5.1 Strategies to improve accuracy and ease of use of exposure variables
Because our aim is to develop a predictive model that can be used in research and
clinical practices, several steps were necessary. First, the model includes demographic,
clinical and MR imaging characteristics (e.g., age, gender, duration of symptoms, baseline
mJOA score, signal intensity changes on T1WI and T2WI, degree of spinal cord
compression and number of compressed segments). These variables have been shown to
be promising in predicting the post-operative functional scores in other studies [18, 19,
22, 24]. Second, continuous variables were dichotomized for practical purposes; the cut-
off values were determined based on earlier investigations. Third, in all patients, MRI
was performed within 8 weeks prior to surgery using a 1.5 Tesla General Electric unit
and a standardized imaging protocol in the majority of cases (please see MRI protocol in
Table 4.3), to minimize measurement errors and increase observers’ reliability. Fourth, a
radiologist with 10 years of experience (Zvonimir Ivan Lubina, M.D., Clinic of
Traumatology in Zagreb) analyzed the MR images obtained from all 65 patients without
knowledge of the patient’s clinical and neurological status, and the clinical assessors
were also blinded to the imaging results, to avoid observer bias. Fifth, as a minimal
requirement for a valid tool, we quantitatively examined the degree of agreement across
different raters for the same patient (inter-rater reliability) for two published methods of
examining spinal cord compression, using a systematic approach with magnified
software-based tools, written instructions and consistent interpretations. Based on the
results, we identified a list of reliable measures and matched them to the ones available in
the CSM trial database. Transverse area (TA) was chosen over anteroposterior (AP) diameter
of the spinal cord as the measure of spinal cord compression because of its wide
applicability to both symmetrical and asymmetrical cases, despite its lower ICC (intraclass
correlation coefficient) value (please refer to Appendix 3 for further details). We did not
examine the reliability index for the pathological changes within the spinal cord (the
classification of combined patterns of T1-/T2-WI signal intensity changes), since an earlier
study reported this method to be moderately reliable, with a concordance correlation
coefficient between two observers on a single occasion of 0.62 (kappa=0.37; p=0.0063), and
predictive of functional outcomes (kappa=0.37; p=0.0063) [19].
4.5.2 Definition of primary exposure and psychometric properties (validity and
reliability) of the independent variables
Primary exposure was defined as variables that are known prior to the time of
surgery (preoperative) and may independently predict the primary outcome. These
variables constitute the theoretical framework (see Table 3.6 above). The
characteristics of CSM patients described below were collected at the time of diagnosis
and clinical examinations.
4.5.2.1. Age
Age was defined as the age of the patient at the time of diagnosis and baseline
clinical examinations. Originally, age was collected as a continuous variable. This
continuous variable was then dichotomized for practical purposes: 0 = age less than the
cut-off value of 65 years and 1 = age equal to or greater than the cut-off value of 65 years
[15, 18, 41, 59, 60]. Age is a variable that is expected to be valid and reliable.
4.5.2.2. Gender
Gender is a variable that is expected to be valid and reliable.
4.5.2.3. Baseline score
Baseline mJOA [11] was defined as the functional score of the
patient at the time of diagnosis and clinical examinations at admission, just prior to
surgery (on average, several months apart). The mJOA score is a continuous variable that
ranges from 0-18 with higher scores indicating better function.
4.5.2.4. Duration of symptoms
The duration of symptoms was measured up to the time of assessment.
The duration of symptoms at the first visit was divided into 2 categories: 0 = duration of
symptoms less than the cut-off value of 12 months and 1 = duration of symptoms greater
than or equal to the cut-off value of 12 months [15, 21, 22].
4.5.2.5. Degree of spinal cord compression (AP diameter and Transverse Area)
The level of maximum spinal cord compression was defined as a segment of the
spinal cord that was compressed and deformed with larger or smaller disappearance of
the surrounding subarachnoid space.
Anteroposterior diameter is one means of determining spinal stenosis, with
established intraclass correlation coefficients of 0.86, 0.72, 0.68, and 0.52 on four
occasions (please see Appendix 3). The application software used appeared to hold 1-digit
numbers. Potentially, the repeated reduction to 1 digit could cause a systematic build-up of
error in estimating the reliability index.
Transverse area (TA) is another measure commonly used by researchers to assess
the degree of spinal cord compression [27], and repeated measurements on four occasions
are reliable in CSM, with intraclass correlation coefficients of 0.68, 0.69, 0.73 and 0.76
(please see Appendix 3). It is a continuous variable measured in square millimetres
(mm²).
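The results chapter states that the intraclass correlation coefficients were calculated using the Shrout-Fleiss random effects model (Model 2). As a hedged illustration of what such a coefficient involves, the sketch below computes the single-rater, two-way random effects ICC, often written ICC(2,1), from a subjects-by-raters table. Whether the study used the single-rater or the averaged form is not stated here, so that choice is an assumption; the function name and the example ratings are illustrative and are not study data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per subject, one value per rater.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                 # between-subjects mean square
    jms = ss_cols / (k - 1)                 # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Illustrative ratings: 6 subjects (rows) scored by 4 raters (columns).
example_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example_ratings), 2))  # 0.29
```

Because the between-raters mean square enters the denominator, systematic differences between raters lower this coefficient, which is why it is sensitive to rater bias as well as to random error.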
4.5.2.6. Signal intensity changes
The appearance of spinal cord signal intensity changes on T1-weighted
and T2-weighted sequences is classified into three categories: Type 0, normal T1WI and
normal T2WI; Type 1, normal T1WI and high T2WI; Type 2, low T1WI and high T2WI.
Increased signal intensity on T2WI and decreased signal intensity on T1WI were defined as
a high or low intensity area, respectively, in relation to the signal of the normal medulla
at the unaffected level.
4.5.2.7. Number of affected stenotic levels
This categorical variable is coded from 1 to 3 (1 = 1 compressed segment, 2 = 2
compressed segments, and 3 = ≥ 3 compressed segments) (Figure 1). This cut-off point
has been used in previous studies [47, 61]. The level of maximum spinal cord
compression was defined as a segment of the spinal cord that was compressed and
deformed with larger or smaller disappearance of the surrounding subarachnoid space.
The number of stenotic levels was determined by a radiologist who was
blinded to patient neurologic status.
Table 4.2: Definition of exposure variables
Domain | Variable definition | Type | Unit
Demographics | Age at the time of admission assessment (0 = <65 years, 1 = ≥65 years) | Binary | 0/1
Demographics | Gender (0 = Male, 1 = Female) | Binary | 0/1
Clinical | Baseline mJOA score | Continuous | 0-18
Clinical | Duration of symptoms (0 = <12 months, 1 = ≥12 months) | Binary | 0/1
MR imaging | Transverse area | Continuous | mm²
MR imaging | Anteroposterior diameter | Continuous | mm
MR imaging | Signal intensity changes (Type 0 = Normal T1WI/Normal T2WI, Type 1 = Normal T1WI/Hi T2WI, Type 2 = Low T1WI/Hi T2WI) | Categorical | 0/1/2
MR imaging | Number of affected stenotic levels (0 = 1 compressed segment, 1 = 2 compressed segments, 2 = ≥3 compressed segments) | Categorical | 0/1/2
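The coding in Table 4.2 can be sketched as a small helper that turns raw patient values into the binary and categorical codes. This is an illustration only: the function and key names are assumptions, and the stenotic levels follow the 0/1/2 coding of the table.

```python
def encode_exposures(age_years, is_female, symptom_months, n_compressed_segments):
    """Return the codes described in Table 4.2 (names are illustrative)."""
    if n_compressed_segments < 1:
        raise ValueError("at least one compressed segment is expected")
    return {
        "age": int(age_years >= 65),            # 0 = <65 years, 1 = >=65 years
        "gender": int(is_female),               # 0 = male, 1 = female
        "duration": int(symptom_months >= 12),  # 0 = <12 months, 1 = >=12 months
        # 0 = 1 segment, 1 = 2 segments, 2 = 3 or more segments
        "stenotic_levels": min(n_compressed_segments, 3) - 1,
    }


# Hypothetical patient, not study data.
print(encode_exposures(70, False, 8, 4))
# {'age': 1, 'gender': 0, 'duration': 0, 'stenotic_levels': 2}
```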
4.6. CONFOUNDING VARIABLES
It has been shown that some baseline characteristics such as pre-existing or
concomitant medical conditions (hypertension, diabetes mellitus, coronary insufficiency,
cardiomyopathy, pulmonary problems, previous cerebral infarction and gastrointestinal
ulcers) may slow the functional recovery in patients with CSM [14]. Given the
established inhibitory effects of smoking on spine fusion [62, 63], smoking may slow the
functional recovery. In addition, functional deterioration in the postoperative period may
also result from aggravation of diabetes mellitus [14]. Since this type of information was
collected at baseline examination, it was statistically tested for its significance. The
surgical interventions information (anterior cervical decompression and fusion,
laminoplasty, and laminoplasty and laminectomy and fusion) was not included in the
predictive model due to the limited size of the sampled population at one single centre.
4.7. SAMPLE SIZE
General guidelines have been suggested for the minimum number of subjects per
variable required in multivariate analysis. It is generally suggested that a minimum of
ten subjects per variable analyzed (for a continuous outcome) is required to prevent
over-fitting [64]. Given the total number of 61 patients available for analysis, we included
no more than 6 of the 10 preoperative variables in the theoretical framework. This
number ensures adequate sample size for future predictive models.
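The rule of thumb above reduces to an integer division. A minimal sketch, with an illustrative function name:

```python
def max_predictors(n_subjects, subjects_per_variable=10):
    """Largest number of predictors allowed by the subjects-per-variable rule."""
    return n_subjects // subjects_per_variable


print(max_predictors(61))  # 6
```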
4.8. DATA ANALYSIS
4.8.1. Exploratory analysis
All data analyses were performed by using SAS, version 9.2 Software. Data
analysis followed standard procedures for a prediction study. Summary descriptive
statistics were computed on all variables. Categorical variables were summarized as
frequencies and percentages, and continuous variables as means and standard deviations.
Categorical variables were compared using the chi-square test for independent
proportions, and the Student t-test was used to compare continuous variables.
Exploratory correlation coefficient analyses were performed to identify
associations between the ten individual independent variables and final mJOA scores and
associations or multicollinearity between variables. More specifically, Spearman
correlation analysis was used when both variables were continuous, t tests were used
when one variable was continuous and the other dichotomous, and continuity-adjusted
chi squares were calculated when both variables were categorical. The Mann Whitney U
test was used for analysis of the association between dichotomous variables and final
mJOA scores, because these scores did not follow a Gaussian distribution. The criterion
of r > 0.90 was used to identify excessive correlation between variables. For the
chi-square tests, a p-value below 0.05 was considered statistically significant.
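The multicollinearity screen described above can be sketched in plain Python as a Spearman correlation (Pearson correlation of average ranks) followed by the r > 0.90 check. Applying the threshold to the absolute value of r is an assumption, as are the function names; the two example vectors are hypothetical and are not study measurements.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def excessively_correlated(x, y, threshold=0.90):
    """Flag a pair of variables whose |r| exceeds the threshold (assumed |r|)."""
    return abs(spearman(x, y)) > threshold


# Hypothetical transverse area (mm^2) and AP diameter (mm) values.
ta = [40, 55, 62, 70, 81]
ap = [4.1, 5.0, 5.6, 6.3, 7.2]
print(excessively_correlated(ta, ap))  # True
```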
Histograms were plotted to assess the normality of the distribution of the primary
outcome measure and the other variables. A logarithmic transformation for
normality was used when the distribution of follow up mJOA scores was negatively skewed
[65].
4.8.1.1. Univariable (unadjusted) analysis
Univariable data analyses that include unadjusted regression coefficients (beta
values estimates) and p-values were carried out for all variables under evaluation.
Initially, continuous variables (age, duration of symptoms, baseline mJOA scores,
transverse area and anteroposterior diameter of spinal cord) were analyzed individually
for a linear relationship with post-operative functional scores. Then, age and duration of
symptoms variables were dichotomized for convenience in clinical practice and ease of
interpretation of findings. In addition, three MR imaging variables (three patterns of
spinal cord signal intensity changes on T1- and T2-weighted sequences, transverse area
of the spinal cord and number of compressed segments), were analyzed. Table 5.5
summarizes the statistical details of the unadjusted analysis. All candidate variables were
examined using linear regression.
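For a single predictor, the unadjusted beta estimate is the ordinary least squares slope. The sketch below is a plain-Python illustration and not the SAS procedure used in the study; it omits the p-value, and the data are hypothetical. With a 0/1 predictor such as dichotomized age, the slope equals the difference between the two group means.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + beta * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    return intercept, beta


# Hypothetical data, not study data: age group (0 = <65, 1 = >=65) and mJOA score.
age_group = [0, 0, 0, 1, 1, 1]
mjoa = [17, 16, 15, 14, 13, 15]
intercept, beta = simple_linear_regression(age_group, mjoa)
print(intercept, beta)  # 16.0 -2.0
```

Here the intercept is the mean score of the reference group (coded 0) and the negative beta indicates a lower mean score in the group coded 1.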
4.8.2. Model development
As the outcome of interest is continuous (functional score calculated using the mJOA
scale from 0-18), multivariable linear regression modeling techniques were used to
determine the relationship between each independent variable and the functional
outcomes.
Unadjusted (univariable) data analyses were carried out initially to estimate the
effect of each potential predictive variable individually, followed by the adjusted
(multivariable) analysis.
Efforts were made to maximize predictive performance using all-variables
regression for model building (no selection methods, e.g., stepwise selection, were
applied). A variable remained in the final model if it met the following three
criteria: 1) a significance level of p of 0.1 or less; 2) the R² statistic for the model
increased by at least 10%; and 3) the beta coefficient did not change by more than 10%
with the addition of other variables into the model [66, 67]. Baseline scores were
included in the model to adjust for the effect of baseline differences on final scores [68].
This analysis was conducted using the PROC GLM procedure in SAS, version 9.2.
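The three retention criteria can be expressed as a simple check. The sketch below is an interpretation under stated assumptions: the R² increase is read as an absolute gain of at least 0.10, the beta change as a relative change of at most 10%, and all names and example values are hypothetical.

```python
def keep_variable(p_value, r2_without, r2_with, beta_before, beta_after):
    """Apply the three retention criteria to one candidate variable.

    Assumptions: criterion 2 is an absolute R-squared gain of 0.10 or more;
    criterion 3 is a relative change in beta of 10% or less.
    """
    significant = p_value <= 0.10
    r2_gain_ok = (r2_with - r2_without) >= 0.10
    if beta_before == 0:
        beta_stable = beta_after == 0
    else:
        beta_stable = abs(beta_after - beta_before) / abs(beta_before) <= 0.10
    return significant and r2_gain_ok and beta_stable


# Hypothetical values, not study results.
print(keep_variable(0.04, 0.25, 0.40, 2.0, 1.9))  # True
print(keep_variable(0.25, 0.25, 0.40, 2.0, 1.9))  # False
```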
4.8.3 Data sources and management
Source of clinical data: Source data included all information in original records,
observations, or other activities necessary for the reconstruction of missing data and
verification of outliers. More specifically, it included surgery, imaging and laboratory
reports, medical history information and demographics. The study database was a secured
electronic database system known as OPVerdi.
Several strategies were implemented for the reconstruction of missing data and
verification of outliers. For continuous data, we plotted each variable and investigated
any values beyond 3 standard deviations as potential outliers. The same approach was
applied to categorical data by plotting a boxplot. The identified outliers were checked
against data collection forms and were corrected. For the age, gender, and duration of
symptoms variables, the data were 100% complete. The frequencies of missing
values were as follows: 3 (5%) for transverse area of spinal cord measurements,
0 (0%) for anteroposterior diameter measurements, 2 (3%) for signal intensity changes,
and 4 (7%) for number of compressed segments. The mJOA scores at 12 months for 4 out
of 65 patients (6%) were missing due to loss to follow-up. These subjects were
removed and, as a result, 61 subjects were analyzed in statistical modelling.
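The 3 standard deviation screen for continuous variables can be sketched as follows. The function name and the example values are hypothetical; using the sample standard deviation about the sample mean is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev


def flag_outliers(values, n_sd=3.0):
    """Return indices of values lying more than n_sd sample SDs from the mean."""
    m = mean(values)
    s = stdev(values)
    return [i for i, v in enumerate(values) if abs(v - m) > n_sd * s]


# Hypothetical transverse area values (mm^2); the last one mimics an entry error.
areas = [52, 55, 58, 60, 61, 63, 64, 65, 66, 67,
         68, 69, 70, 71, 72, 74, 75, 77, 79, 150]
print(flag_outliers(areas))  # [19]
```

Flagged indices would then be checked against the data collection forms, as described above, rather than removed automatically.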
MR imaging data: Issa [proprietary name] was used as an integrated system for
archiving patient data and examination data including images.
4.8.4 Ethics
The research protocol was approved by the University Health Network Research
Ethics Board.
CHAPTER 5
RESULTS
Chapter 5 provides findings to two research questions related to Specific Aims I & II.
OVERALL STUDY OBJECTIVE:
To develop a predictive model of functional score incorporating key demographic,
clinical and MR imaging assessments in patients with cervical spondylotic myelopathy
undergoing surgical treatment.
Hypothesis: Key demographic parameters, clinical factors and MR imaging features of
the site of cervical cord compression are independently associated with baseline scores
and predictive of functional outcomes scores at 12 months follow up in patients with
CSM undergoing surgical treatment.
Each specific aim contributes to the overall objective:
Specific Aim I: Reliability assessment of MR imaging to assess cord compression in
CSM
Objective: To investigate the inter-rater reliability of two published methods
(transverse area and anteroposterior diameter) of examining cord stenosis on axial
MR images.
Question: Are the ICC values of TA and AP diameter of spinal cord methods free of
systematic errors (bias)?
Findings: The two-way analysis of variance indicated that the inter-rater agreement ICCs
for transverse area (TA) and anteroposterior diameter (AP) of the spinal cord were 0.68,
0.69, 0.73 and 0.76, and 0.86, 0.72, 0.68 and 0.52 on the 1st-4th sessions, respectively.
These coefficients were calculated using Shrout-Fleiss models for random effects (Model
2). Of note, the TA and AP methods showed wider variability in cases of severe cord
compression (presence of systematic error), and the variability of image interpretation
was dependent on raters' individual differences. The TA and AP measurement techniques
demonstrated moderate to good inter-rater reliability, with more consistent agreement
noted in the assessment of transverse area of the spinal cord. This is the first study to
examine the inter-observer reliability of quantifiable methods to assess spinal cord
stenosis in the setting of CSM. Based on our data, we recommend that the TA method be
used to assess the extent of compression on axial T2 images.
Specific Aim II: Development of a predictive model of outcome in patients with CSM
undergoing surgical treatment
Objective: To address the limitations of the current literature by prospectively
evaluating if demographic, clinical and radiological factors in patients with CSM
are predictive of functional outcomes pre- and post- surgery.
Questions: After controlling for age, gender and duration of symptoms,
MRI is independently associated and predictive of functional outcomes at
baseline and 12 months follow-up, respectively.
Findings: Higher baseline mJOA scores were associated with younger age (p=0.0002),
shorter duration of symptoms (p=0.03), fewer compressed segments (p=0.04) and less
severe cord compression (p=0.02). Moreover, better post-operative mJOA scores were
associated with younger age (p<0.0001), shorter duration of symptoms (p=0.09) and higher
baseline mJOA score (p<0.0001). Using multivariate analysis, baseline and follow-up
mJOA scores were best predicted by age. These data suggest that, first, it is important to
diagnose and treat CSM at an early stage and that age is a key predictor of functional
improvement on the mJOA scale; and second, ischemic changes, degree of spinal cord
deformity and multiplicity of stenosis could not predict post-operative functional status
as measured by the mJOA scale, after controlling for age and baseline mJOA score.
5.1. DESCRIPTIVE STATISTICS
The final dataset included information on 61 CSM patients who underwent spine
surgery at Toronto Western Hospital between February 2006 and November 2007.
Six percent of the sample population was lost to follow up and excluded from model
development. All 61 patients had complete data. The general characteristics of the
patients with cervical spondylotic myelopathy are presented in Table 5.1.
5.2. MODEL DEVELOPMENT
5.2.1. Improving the validity of the predictive model
Among the potential predictor variables, two variables, transverse area
[TA] and anteroposterior diameter [AP] of the spinal cord, provide similar information
about the degree of spinal cord compression. Efforts were therefore made to establish the
reliability (inter-rater and test-retest) of each variable (please refer to
Appendix 3).
Based on three-way ANOVA, the observed differences between AP
measurements consist of true score variance, random error (imprecision) and systematic
error (bias) caused by raters’ specialty training and their interpretations of MRI based on
the stage of CSM severity (Tables F.5 - F.8).
In addition to the sources of systematic error mentioned above, the TA method
had time (learning or fatigue) as a source of variability. The time effect was shown
to be statistically significant in the TA method of spinal stenosis assessment, based on
three-way ANOVA with Bonferroni post-hoc analysis [TA, p=0.01]; specifically, the
agreement among four raters consistently increased from Session 1 to Session 4 (Table
F.4). The time differences are illustrated as normal fluctuations by graphical
representation (i.e. random error) (Figure F.3).
The TA and AP measurement techniques demonstrated a moderate level of inter-rater
reliability (0.68, 0.69, 0.73, 0.76 and 0.86, 0.72, 0.68, 0.52, respectively), with more
consistent agreement noted in the assessment of transverse area of spinal cord. As a result,
transverse area was chosen over anteroposterior diameter of spinal cord method. The
variable was selected based on clinical, practical, statistical and reliability criteria
described in Table 4.7.
Transverse area and anteroposterior diameter of the spinal cord are statistically collinear,
and choosing TA for the predictive model avoids this collinearity. Collinearity is a
statistical phenomenon in which two predictor variables in a multiple regression model
are highly correlated. As a result, the coefficient estimates of individual predictor
variables may change erratically in response to small changes in the model or the data.
The AP diameter of the spinal cord has the disadvantage of being less applicable when the
compression site is off the midline of the spinal cord. TA accounts for asymmetrical
compression of the spinal cord; thus it is a less biased measure.
5.2.2. Univariable (unadjusted) analysis
5.2.2.1. mJOA Scores at baseline
Higher baseline mJOA scores were associated with younger age (p=0.0002, β(r) =
-2.83), shorter duration of symptoms (p=0.03, β(r) = -1.55), less compression of the
spinal cord (larger transverse area; p=0.02, β(r) = 0.06) and fewer compressed
segments (p=0.04, β(r) = 2.35 and β(r) = 1.06) (Table 5.4).
Analysis of all variables revealed that three patterns of spinal cord signal intensity
changes on T1- and T2-weighted sequences, and gender variables were not significantly
associated with the functional score at admission (p-value > 0.2). Therefore, these
insignificant demographic and MR imaging variables (gender and signal intensity
changes) were excluded (Table 5.4).
5.2.2.2. mJOA Scores at follow up
The mean mJOA score improved from 12.8 ± 2.7 points pre-operatively to 15.8 ±
2.3 points at 12 months post-operatively (p<0.0001), as determined by the Wilcoxon
signed-rank test. Higher post-operative mJOA scores were associated with younger age
(p<0.0001, β(r) = -1.07), shorter duration of symptoms (p=0.09, β(r) = -1.03) and higher
baseline mJOA score (p<0.0001, β(r) = 1.01) (Table 5.4).
Analysis of all variables revealed that the MR imaging features (three patterns of
spinal cord signal intensity changes on T1- and T2-weighted sequences and number of
compressed segments) and gender were not significantly associated with the
functional score at follow-up (p-value > 0.2). Therefore, these non-significant variables
(signal intensity changes, number of compressed segments and gender) were excluded
(Table 5.4).
5.2.3. Multivariate (adjusted) analysis
5.2.3.1. mJOA Scores at baseline
The final statistical model includes age (Table 5.5), which explains 20% of the
total variability of the baseline mJOA scores. The average baseline score of CSM patients
older than 65 years of age was 13.5. Baseline mJOA scores in younger patients
were on average 2.83 points higher.
5.2.3.2. mJOA Scores at follow-up
The final model includes the baseline mJOA score and age (Table 5.5), and
explains 36% of the total variability of the final mJOA scores. This model indicates that,
for example, if baseline scores were identical, a patient less than 65 years of age would
have, on average, a score 1.04 points higher than an older patient. Moreover, if age were
identical, a patient with moderate myelopathy may benefit from surgical treatment more
than a patient with severe myelopathy, whose score is approximately 1.01 points lower on
average.
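The arithmetic behind these adjusted comparisons can be sketched as follows, assuming a linear model in which the follow-up score is an intercept plus an age-group term plus a baseline-severity term. Only the two coefficients quoted above are used; the coding of the predictors (an indicator for age under 65 and a one-unit step in baseline severity) is an assumption made for illustration, and the intercept is not needed because it cancels when two patients are compared.

```python
# Coefficients quoted in the text; the predictor coding is assumed for illustration.
BETA_AGE_UNDER_65 = 1.04  # adjusted difference for a patient younger than 65
BETA_BASELINE = 1.01      # adjusted difference per one-unit step in baseline severity

def expected_difference(under_65_a, baseline_a, under_65_b, baseline_b):
    """Expected follow-up mJOA of patient A minus patient B under a linear model.

    The intercept is identical for both patients and cancels in the difference.
    """
    age_term = BETA_AGE_UNDER_65 * (int(under_65_a) - int(under_65_b))
    baseline_term = BETA_BASELINE * (baseline_a - baseline_b)
    return age_term + baseline_term

# Same baseline severity, different age group.
same_baseline = expected_difference(True, 2, False, 2)
# Same age group, one step apart in baseline severity.
same_age = expected_difference(True, 2, True, 1)
print(same_baseline, same_age)
```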
CHAPTER 6
DISCUSSION AND CONCLUSION
6.1. Summary of findings
The studies described herein have led to several major conclusions: 1) Age and
baseline severity score are good predictors of functional score after surgery. 2) Duration
of symptoms is not a good predictor of functional scores after surgery. 3) Measurements
of the transverse area and anteroposterior diameter of the spinal cord have shown good to
moderate inter-rater reliability. 4) No definite conclusions can yet be drawn on whether
transverse area of spinal cord, combined patterns of signal intensity changes on T1/T2WI,
and the number of compressed levels are predictors of functional score.
Age & Baseline severity score
Based on in-depth examination of the impact of predictors on outcome using beta
coefficient values and reliability assessments, our study confirms that age and baseline
severity score are two preoperative variables that can predict functional outcomes after
surgery (post-operative mean mJOA score). The most prominent patient information was
the age at the time of admission, which was shown to be associated with baseline
functional score and predictive of follow up functional score in the setting of CSM.
Based on the magnitude of the beta estimate, our data suggest that there might be
more opportunity for greater improvement when surgery is performed on a younger
population. However, more research is needed to confirm these findings. In contrast,
Yamazaki et al. showed no differences based on age in post-operative functional scores
after surgery in a retrospective study [15]. However, these results must be cautiously
interpreted because the study did not control for baseline severity score, the number of
patients in each subgroup was small, and patient characteristics were too poorly described
to understand the differences between the two samples. Finally, baseline CSM severity score
was a strong independent predictor of functional score following surgery. Patients with
less severe functional disability may benefit from surgical treatment more than those with
a more severe disability. The greater benefit from surgery in patients with less functional
disability could be due to milder neuropathologic alterations in the spinal cord that reflect
greater recuperative potential [19]. These findings suggest the possibility that patients
may experience poorer outcome if surgery is delayed until the patient is more severely
affected. In contrast, Singh et al. reported that patients with a lower starting point in
function make the most gains after surgery [24]. We suspect that higher functional scores in the
more severe CSM group in this study could be due to other differences in patient
characteristics (age and duration of symptoms), which were not comparable at admission.
Duration of symptoms
In our study, duration of symptoms was mildly associated with functional scores
at admission and at 12-month follow-up. However, after adjustment for age and baseline
severity score, duration of symptoms still appeared to be associated with functional score at
admission and follow-up, although the association was not statistically significant. Whether CSM
patients with indications for surgery should be offered operative interventions
irrespective of duration of symptoms is still unclear. Our findings are inconsistent with
some other studies in the literature that support the notion of long-standing mechanical
compression causing additional circulatory impairment of the spinal cord [15, 19, 21, 46].
We suspect that these differences may be due to the interpretation of the onset of CSM.
Heterogeneity of samples, non-consecutive methods of recruitment and insufficient
descriptions of patients associated with retrospective design in previous studies could
also have contributed to the observed differences in functional scores. Although
Mastronardi et al prospectively analyzed CSM patients, these results must be cautiously
interpreted because baseline severity score and age were not similar between groups and
the number of patients in each subgroup was small [21].
MR imaging features
Based on the findings of our systematic review, transverse area of the spinal cord,
combined patterns of signal intensity changes on T1/T2WI, and number of compressed
segments were found to be associated with functional scores at long-term follow-up
before and after adjustment for age, duration of symptoms, and baseline severity score.
The data obtained for this thesis did not support the findings of previous studies. We can
speculate that several factors may have contributed to these results. Firstly, the
inconsistencies in findings could be due to heterogeneity of the patients in this sample
population. The findings vary with etiology: ossification of the posterior longitudinal
ligament (OPLL) versus cervical spondylotic myelopathy (CSM) versus herniated disc (HD)
[27, 31]. Differences could also be due to inter-institutional variations in MRI protocols.
For example, previous studies used T1-weighted axial imaging to measure spinal cord
deformity. At our institution (Toronto Western Hospital), MRI protocols for the cervical
spine include axial T2 slices. Differences among clinicians are another source of
variation. Based on our observations from reliability testing, the measurement of
transverse area of the spinal cord is subjective; clinicians had different approaches to
interpreting the exact location and boundaries of the most compressed site of the spinal cord,
especially in multisegmental CSM. Based on the findings of the intra- and inter-rater
reliability project (Appendix 3), the interpretations of MR images varied depending on
the specialty and years of practice. In our study, we found that the percentage of
agreement was 68% to 76% and overall correlation was moderate to good. We
recommend that this measurement technique be applied in a larger sample.
In our study, we established that the variations in functional outcomes defined by
mJOA score after surgery cannot be further explained using MR modality, in addition to
age and baseline mJOA score. Our findings suggest that assessments of T1-/T2 signal
intensity changes, degree of spinal cord compression and number of levels involved in
compression have no statistically significant effect on post-operative functional status as
measured using the mJOA scale, and provide no additional clinically important
information in predicting function after surgery. We suspect that the study does not
support the use of MR imaging features as predictors because of the ceiling effect present
in mJOA measurements at follow-up, which leads to poor discrimination and thus
low responsiveness. All of the subjects were fully developed in their ability to
function; therefore, no subjects scored below 10 on the mJOA scoring system. The majority
of patients scored on the mild side of the spectrum at follow-up. Therefore, one of the
limitations of the study was the limited variability among CSM individuals at
follow-up. A future study will require a better outcome measure than the mJOA, one with the
capacity to differentiate subjects more precisely across all severity groups at follow-up.
Similar to these findings, Singh et al reported low levels of sensitivity to change in JOA
score (r=0.21) compared to SF-36 (r=0.32), Nurick score (r=0.42) and MDI (r=0.52),
indicating that the scale is possibly less sensitive when differentiating milder levels of
severity [23]. Predicting a perfect correlation between clinical scores with poor
sensitivity and the findings seen on MR images of the spinal cord remains a challenge. In
addition, MRI provides a quantitative measure, as opposed to the qualitative, subjective
reports of observers, to differentiate severity of CSM. More research is needed to
investigate in greater detail the psychometric properties (reliability and validity) of
the modified version of the JOA (mJOA) scale.
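One simple way to quantify the ceiling effect discussed above is the fraction of patients scoring at, or within a small margin of, the scale maximum. The sketch below assumes a maximum mJOA score of 18 and uses hypothetical follow-up scores; the function name and the margin parameter are illustrative assumptions rather than part of the study's analysis.

```python
MJOA_MAX = 18  # assumed maximum of the mJOA scale

def ceiling_fraction(scores, max_score=MJOA_MAX, margin=0):
    """Fraction of scores lying within `margin` points of the scale maximum."""
    if not scores:
        raise ValueError("scores must not be empty")
    at_ceiling = sum(1 for s in scores if s >= max_score - margin)
    return at_ceiling / len(scores)

# Hypothetical follow-up scores (illustrative only, not study data).
follow_up = [18, 17, 16, 18, 15, 18, 17, 13, 18, 16]

exact_ceiling = ceiling_fraction(follow_up)
near_ceiling = ceiling_fraction(follow_up, margin=1)
print(exact_ceiling, near_ceiling)
```

When a large share of patients sits at or near the maximum, further improvement cannot be registered, which limits how well any predictor can explain differences in follow-up scores.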
The availability of these predictors enables spine surgeons and referring
physicians to provide more information to patients in consultations prior to surgery
and to guide their therapeutic decision making. It also supports better allocation of services
and is useful in designing clinical trials to test the effect of surgical interventions
on outcomes.
6.2. Implications of the findings
The most significant finding of this study is that there are now reliable
measures (transverse area and anteroposterior diameter of spinal cord) to assess the
degree of spinal cord compression using digitized/magnified images and a standardized
written protocol. In the past, there was a lack of concordance in the literature on the
optimal techniques to quantitatively assess MRIs in patients with CSM. It has not been
possible to replicate previously published results due to lack of availability of information
on MRI protocol details and its measures.
The findings also lend insight into how MR imaging
should be approached and analyzed in this population. Perhaps future studies should
include assessments using T2-weighted images as opposed to T1-weighted images and take a
consistent approach to selecting the most compressed site, especially in CSM cases
with multilevel involvement due to degenerative changes of the spine.
In summary, the predictive model provides a detailed profile of patient
characteristics and their variability, enabling a clinician to counsel patients on an individual
basis. We also report details about variability, age and baseline severity score.
Furthermore, the data was collected in a prospective fashion, which fills a void currently
existing in the literature. This study provides a detailed exploratory analysis, providing
new insights on discriminative abilities of mJOA scale in the area of CSM research.
6.3. Limitations
The present study has several limitations. The first is the absence of confirmed
reliability, validity and responsiveness of the modified version of JOA scoring system
and some MR imaging based predictive variables (number of compressed segments) used
in the baseline examinations. In addition, the modified version of JOA scale, which has
limited usefulness in detecting the precise benefit of surgery for mild CSM patients due
to a ceiling effect, was used in our study. The majority of patients scored on the mild side
of spectrum at follow up. A future study will require a better outcome measure than the
mJOA scale that would have a capacity to differentiate subjects more precisely from all
severity groups at follow-up. Second, a study with a single recruitment centre might
systematically under- or overestimate measurement errors due to particular
characteristics of patients. Multicentre trial data from within and outside Toronto may help to
establish more representative estimates of CSM parameters. Finally, although the
findings were based on a secondary analysis of prospectively collected data, there was
a restriction in the types of MR imaging features collected.
6.4. Future directions
Based on the results from this study it appears that age and baseline severity score
at admission can both provide valuable information and can be part of a new
multidimensional scoring system for clinicians to counsel patients with CSM.
Because the cumulative effect of age, gender, duration of symptoms, baseline
severity score and MR imaging predictors on functional score assessed by the mJOA scale
following surgery was investigated here for the first time and in a single centre,
a similar analysis must be conducted on data collected from larger North
American and international CSM clinical trial databases to determine if our results can
be reproduced in other geographical regions with similar estimates for the magnitude of
all associations (beta values).
In light of the need to establish the predictive value of MR imaging features for
functional outcomes after adjusting for other important predictors, the mJOA scale
requires improvements to its existing measurements and potentially the addition of some
new ones. For example, some studies have shown that the JOA score underestimates the initial
handicap in the hands, often among the first of patients’ complaints [69]. Similarly,
recovery of manual dexterity is poorly judged by this score. Potentially, the domain
covering the functioning of the upper limb needs to be reconsidered to make it more
quantitative, rather than the qualitative estimate that it currently is. Gait dysfunction is the
most important issue in CSM patients regarding the surgical outcome and clinical
deficits. Measurements of ambulation have shown relative advantages over
previous clinical assessment scales in determining clinical severity and, particularly, in
the detection of change following surgery [70]. In a study by Singh et al. (2001),
walking-related parameters were shown to have good correlation with, and validity against, other
functional and impairment scales such as the myelopathy disability index (MDI), the
Nurick Scale and the short form health survey (SF-36) in the CSM setting [24]. Potentially
adding a new domain with a walking component may enable more accurate prediction of
patients’ functioning after treatment. Further work on the mJOA scale is necessary to
confirm its psychometric properties including reliability, construct validity and its
discriminative abilities (responsiveness).
Alternatively, MR imaging with T2 weighting has been reported to have a level of
sensitivity ranging from 15% to 65% [71], but low specificity for the visualization of
intramedullary pathology. The development of more advanced spinal imaging
techniques, such as diffusion tensor imaging (DTI) with fractional anisotropy,
diffusion-weighted imaging (DWI), functional magnetic resonance imaging (fMRI) and the
apparent diffusion coefficient (ADC), may enable more accurate correlations between
imaging and clinical presentation.
In addition, given that spin-echo MR imaging has limited pathophysiologic
usefulness in detecting myelopathy, diffusion-tensor imaging (DTI) and diffusion
weighted imaging (DWI) may be more useful in identifying additional shearing injuries
that are not visible on conventional MR images. In general, DTI analyzes the movement
of water in association with white matter fibers, providing three-dimensional
reconstruction of fiber tracts, and has the ability to help quantify the severity of injury to
individual white matter tracts [72]. Budzik et al. found diffusion-tensor MR imaging to
correlate better with clinical scores than T2WI in cervical spondylotic myelopathy [73].
Similarly, results of the Sagiuchi et al study showed that DWI has higher sensitivity for
detection of acute spinal cord imaging abnormality compared to standard MRI [74].
fMRI analysis of the spinal cord provides physiological readouts of neuronal
activity and neuronal plasticity, in a non-invasive manner. A number of studies have
demonstrated the utility of advanced MRI techniques in the setting of spinal cord injuries
with reliable results and good sensitivity to changes in neuronal activity.
ADC and fractional anisotropy may be beneficial in assessing the correlation
between imaging and clinical presentation. Demir et al. found that ADC values
were a more sensitive indicator of spinal cord injury than T2-weighted images. The study
demonstrated higher sensitivity when combined with electrophysiological examination,
with a sensitivity of 92% and a negative predictive value of 75%, compared with 53%
sensitivity and 50% negative predictive value for T2-weighted images [71]. Facon
et al. performed a similar study in six cervical spondylosis patients and determined that
the fractional anisotropy values had significantly higher sensitivity and specificity in the
detection of spinal cord abnormalities than T2-weighted images [72].
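For readers less familiar with the diagnostic measures quoted above, the following sketch shows how sensitivity, specificity and the predictive values are derived from a two-by-two table of test results against a reference standard. The counts are hypothetical and are not taken from the cited studies; the function name is an assumption for this example.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table of test result vs. reference."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients correctly detected
        "specificity": tn / (tn + fp),  # non-diseased patients correctly cleared
        "ppv": tp / (tp + fp),          # probability of disease given a positive test
        "npv": tn / (tn + fn),          # probability of no disease given a negative test
    }

# Hypothetical counts (illustrative only).
metrics = diagnostic_metrics(tp=40, fp=5, fn=10, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the abnormality is in the sample, so values reported from small series should be compared with caution.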
In conclusion, imaging indexes based on pathophysiologic models may enable
more accurate prediction of CSM and thereby facilitate better assessment of the prognosis
and better application of treatment strategies.
Conclusion
A predictive model was developed to predict the functional
outcome of patients undergoing surgery according to their age and baseline severity
score, though changes on MR imaging were not independently predictive of outcome. In
addition to validating reports in the existing literature, our study results suggest that MRI
is a reliable tool yielding reproducible, stable measurements. Some work on the
responsiveness of the current mJOA scale is needed to establish the ability of MRI to
predict the functional outcomes of CSM patients.
This study has shed some light on the need for a more responsive functional scale
than the mJOA that could detect more clinically important changes in functional
outcomes. More specifically, the main issue with the mJOA scale, explained above, is the
presence of a ceiling effect with a lack of discrimination of functional deficits in
patients with milder CSM. This is preliminary work which provides a first step in developing a
multidimensional scoring system for prediction of functional outcomes in CSM using
demographic, clinical and MR imaging domains.
Moreover, the proportion of variance in follow-up functional scores explained by
age and baseline score is low, suggesting that this field has a long way to go before
achieving equipoise in refusing someone surgery on the basis of unfavourable baseline
characteristics.
CHAPTER 7
REFERENCE LIST
1. Emery, S., Cervical spondylotic myelopathy: diagnosis and treatment. . Journal of the American Academy of Orthopaedic Surgeons, 2001. 9(6): p. 376-385.
2. Cadotte, D.W., Karpova, A.V., Fehlings,M.G. , Cervical spondylotic myelopathy: surgical outcomes in the elderly. Int. J. Clin. Rheumatol, 2010. 5(3): p. 327-337.
3. Montgomery, D.M. and R.S. Brower, Cervical spondylotic myelopathy. Clinical syndrome and natural history. [Review] [54 refs]. Orthopedic Clinics of North America. 23(3):487-93, 1992 Jul., 1992.
4. Adams, C.B., Logue, V., Some functional effects of operations for cervical spondylotic myelopathy. Brain, 1971. 94: p. 587-594.
5. Law, M.D., Jr., Bernhardt, M., White, A.A., Evaluation and management of cervical spondylotic myelopathy. Instr Course Lect 1995. 44: p. 99-110.
6. Young, W.F., Cervical spondylotic myelopathy: a common cause of spinal cord dysfunction in older persons. . Am Fam Physician 2000. 62: p. 1064-1070, 1073, 2000.
7. Matz, P.G., et al., The natural history of cervical spondylotic myelopathy. J Neurosurg Spine, 2009. 11(2): p. 104-11.
8. Houten, J.K. and P.R. Cooper, Laminectomy and posterior cervical plating for multilevel cervical spondylotic myelopathy and ossification of the posterior longitudinal ligament: effects on cervical alignment, spinal cord compression, and neurological outcome. Neurosurgery. 52(5):1081-7; discussion 1087-8, 2003 May., 2003.
9. Hirabayashi, K., Miyakawa, J., Satomi, K., Maruyama, T., Wakano, K., Operative results and postoperative progression of ossification among patients with ossification of cervical posterior longitudinal ligaments. Spine 1981. 6(4): p. 354-364.
10. Chen, C.J., et al., Intramedullary high signal intensity on T2-weighted MR images in cervical spondylotic myelopathy: prediction of prognosis with type of intensity. Radiology. 221(3):789-94, 2001 Dec., 2001.
11. Benzel, E.C., et al., Cervical laminectomy and dentate ligament section for cervical spondylotic myelopathy. Journal of Spinal Disorders. 4(3):286-95, 1991 Sep., 1991.
12. Park, Y.S., et al., Predictors of outcome of surgery for cervical compressive myelopathy: retrospective analysis and prospective study. Neurologia Medico-Chirurgica. 46(5):231-8; discussion 238-9, 2006 May., 2006.
13. Handa, Y., et al., Evaluation of prognostic factors and clinical outcome in elderly patients in whom expansive laminoplasty is performed for cervical myelopathy due to multisegmental spondylotic canal stenosis. A retrospective comparison with younger patients. Journal of Neurosurgery. 96(2 Suppl):173-9, 2002 Mar., 2002.
14. Matsuda, Y., et al., Outcomes of surgical treatment for cervical myelopathy in patients more than 75 years of age. Spine. 24(6):529-34, 1999 Mar 15., 1999.
15. Yamazaki, T., et al., Cervical spondylotic myelopathy: surgical results and factors affecting outcome with special reference to age differences. Neurosurgery. 52(1):122-6; discussion 126, 2003 Jan., 2003.
16. Hasegawa K, H.T., Chiba Y, Hirano T, Watanabe K, Yamazaki A. , Effects of surgical treatment for cervical spondylotic myelopathy in patients > or _ 70 years of age: a retrospective comparative study. J Spinal Disord Tech. , 2002. 15: p. 458-460.
17. Kohno K, K.Y., Oka Y, Matsui S, Ohue S, Sakaki S. , Evaluation of prognostic factors following expansive laminoplasty for cervical spinal stenotic myelopathy. . Surg Neurol. , 1997. 48: p. 237–245.
18. Nagata, K., et al., Cervical myelopathy in elderly patients: clinical results and MRI findings before and after decompression surgery. Spinal Cord. 34(4):220-6, 1996 Apr., 1996.
19. Morio, Y., et al., Correlation between operative outcomes of cervical compression myelopathy and mri of the spinal cord. Spine. 26(11):1238-45, 2001 Jun 1., 2001.
20. Yagi M, N.K., Kihara M, Horiuchi Y, Long-term surgical outcome and risk factors in patients with cervical myelopathy and a change in signal intensity of intramedullary spinal cord on magnetic resonance imaging. J Neurosurg Spine, 2010. 12: p. 59–65.
21. Mastronardi, L., et al., Prognostic relevance of the postoperative evolution of intramedullary spinal cord changes in signal intensity on magnetic resonance imaging after anterior decompression for cervical spondylotic myelopathy. Journal of Neurosurgery Spine. 7(6):615-22, 2007 Dec., 2007.
22. Fukushima, T., et al., Magnetic resonance imaging study on spinal cord plasticity in patients with cervical compression myelopathy. Spine. 16(10 Suppl):S534-8, 1991 Oct., 1991.
23. Singh A, C.H., Comparison of seven different scales used to quantify severity of cervical spondylotic myelopathy and post-operative improvement. Journal of Outcome Measures, 2001. 5(1): p. 798-818.
24. Singh, A., et al., Clinical and radiological correlates of severity and surgery-related outcome in cervical spondylosis. Journal of Neurosurgery. 94(2 Suppl):189-98, 2001 Apr., 2001.
25. Yukawa, Y., et al., MR T2 Image Classification in Cervical Compression Myelopathy. Spine, 2007. 32(15): p. 1675–1678.
26. Alafifi, T., Kern, R.,Fehlings, M. , Clinical and MRI Predictors of Outcome After Surgical Intervention for Cervical Spondylotic Myelopathy. Journal of Neuroimaging, 2006. 17(4): p. 315-322.
27. Okada, Y., et al., Magnetic resonance imaging study on the results of surgery for cervical compression myelopathy. Spine. 18(14):2024-9, 1993 Oct 15., 1993.
28. Chung, S., Chung, KH. , Factors affecting the surgical results of expansive laminoplasty for cervical spondylotic myelopathy. . Int Orthop, 2002. 26(6): p. 334-338.
29. Wada, E., M. Ohmura, and K. Yonenobu, Intramedullary changes of the spinal cord in cervical spondylotic myelopathy. Spine. 20(20):2226-32, 1995 Oct 15., 1995.
30. Uchida, K., Nakajima,H., Sato,R., Kokubo, Y., Yayama,T., Kobayashi,S., Baba, H., Multivariate analysis of the neurological outcome of surgery for cervical compressive myelopathy. Journal of Orthopaedic Science 2005. 10: p. 564–573.
31. Yone, K., et al., Preoperative and postoperative magnetic resonance image evaluations of the spinal cord in cervical myelopathy. Spine. 17(10 Suppl):S388-92, 1992 Oct., 1992.
32. Fernandez de Rota, J.J., et al., Cervical spondylotic myelopathy due to chronic compression: the role of signal intensity changes in magnetic resonance images. Journal of Neurosurgery Spine. 6(1):17-22, 2007 Jan., 2007.
33. Nagata, K., Kiyonaga, K., Ohashi, MS., Miyazaki, S., Inoue, A. , Clinical value of magnetic resonance imaging for cervical myelopathy. Spine 1990. 15(11): p. 1089-1096.
34. Matsuyama, Y., N. Kawakami, and K. Mimatsu, Spinal cord expansion after decompression in cervical myelopathy. Investigation by computed tomography myelography and ultrasonography. Spine. 20(15):1657-63, 1995 Aug 1., 1995.
35. Ramanauskas WL, W.H., Metes JJ, Lazo A, Kelly JK., MR imaging of compressive myelomalacia. J Comput Assist Tomogr. , 1989. 13(3): p. 300-404.
36. Takahashi M, S.Y., Miyawaki M, Bussaka H., Increased MR signal intensity secondary to chronic cervical cord compression. Neuroradiology., 1987. 29(6): p. 550-556.
37. Morio Y, Y.K., Kuranobu K, Murata M, Tuda K., Does increased signal intensity of the spinal cord on MR images due to cervical myelopathy predict prognosis? Arch Orthop Trauma Surg. , 1994. 113(5): p. 254-259.
38. Al-Mefty O, H.L., Middleton TH, Smith RR, Fox JL., Myelopathic cervical spondylotic lesions demonstrated by magnetic resonance imaging. J Neurosurg. , 1988. 68(2): p. 217-222.
39. Mehalic TF, P.R., Applebaum BI., Magnetic resonance imaging and cervical spondylotic myelopathy. Neurosurgery, 1990. 26(2): p. 226-227.
40. Serizawa Y, O.K., Tanaka K, Tamaki S, Matsuura K, Uchihara T., Spontaneous resolution of an acute spontaneous spinal epidural hematoma without neurological deficits. Intern Med. , 1995. 34(10): p. 992-994.
41. Mihara, H., et al., Cervical myelopathy caused by C3-C4 spondylosis in elderly patients: a radiographic analysis of pathogenesis. Spine. 25(7):796-800, 2000 Apr 1., 2000.
42. Tuszynski MH, S.J., Fawcett JW, Lammertse D, Kalichman M, Rask C, Curt A, Ditunno JF, Fehlings MG, Guest JD, Ellaway PH, Kleitman N, Bartlett PF, Blight AR, Dietz V, Dobkin BH, Grossman R, Privat A; , Guidelines for the conduct of clinical trials for spinal cord injury as developed by the ICCP Panel: clinical trial inclusion/exclusion criteria and ethics. Spinal Cord, 2007. 45(3): p. 222-231.
43. Peolsson A, H.R., Vavruch L, Prediction of fusion and importance of radiological variables for the outcome of anterior cervical decompression and fusion. Eur Spine J, 2004. 13: p. 229–234.
44. McCormack, B.M. and P.R. Weinstein, Cervical spondylosis. An update. [Review] [116 refs]. Western Journal of Medicine. 165(1-2):43-51, 1996 Jul-Aug., 1996.
45. Lee, T.T., G.R. Manzano, and B.A. Green, Modified open-door cervical expansive laminoplasty for spondylotic myelopathy: operative technique, outcome, and predictors for gait improvement. Journal of Neurosurgery. 86(1):64-8, 1997 Jan., 1997.
46. Hashizume, Y., Iijima, S., Kishimoto, H. Yanagi,T. , Pathology of Spinal Cord Lesions caused by Ossification of the Posterior Longitudinal Ligament Acta neuropathology, 1984. 63: p. 1230-130.
47. Uchida K, N.H., Sato R, Kokubo Y, Yayama T, Kobayashi S, Baba H., Multivariate analysis of the neurological outcome of surgery for cervical compressive myelopathy. J Orthop Sci., 2005. 10(6): p. 564-573.
48. Shinomiya K, M.N., Furuya K., Study of experimental cervical spondylotic myelopathy. Spine (Phila Pa 1976). 1992. 7(10 Suppl): p. S383-387.
49. Ryan R, H.S., Broclain D, Horey D, Oliver S, Prictor M, Cochrane consumers & communication review group: study quality guide. 2007: p. 1-50.
50. Nurick, S., The pathogenesis of the spinal cord disorder associated with cervical spondylosis. Brain. 95(1):87-100, 1972., 1972.
51. Chung SS, L.C., Chung KH., Factors affecting the surgical results of expansive laminoplasty for cervical spondylotic myelopathy. Int Orthop., 2002. 26(6): p. 334-338.
52. Kasai Y, U.A., New evaluation method using preoperative magnetic resonance imaging for cervical spondylotic myelopathy. Arch Orthop Trauma Surg., 2001. 121(9): p. 508-510.
53. Nagata K, K.K., Ohashi T, Sagara M, Miyazaki S, Inoue A., Clinical value of magnetic resonance imaging for cervical myelopathy. Spine (Phila Pa 1976). , 1990. 15(11): p. 1088-1096.
54. Matsuyama Y, K.N., Yanase M, Yoshihara H, Ishiguro N, Kameyama T, Hashizume Y., Cervical myelopathy due to OPLL: clinical evaluation by MRI and intraoperative spinal sonography. J Spinal Disord Tech. , 2004. 17(5): p. 401-404.
55. Mizuno J, N.H., Inoue T, Hashizume Y., Clinicopathological study of "snake-eye appearance" in compressive myelopathy of the cervical spinal cord. J Neurosurg., 2003. 99(2 Suppl)(162-168).
56. Yukawa Y, K.F., Yoshihara H, Yanase M, Ito K., MR T2 image classification in cervical compression myelopathy: predictor of surgical outcomes. Spine (Phila Pa 1976). , 2007. 32(15): p. 1675-1678.
57. Papadopoulos CA, K.P., Papagelopoulos PJ, Karampekios S, Hadjipavlou AG., Surgical decompression for cervical spondylotic myelopathy: correlation between operative outcomes and MRI of the spinal cord. Orthopedics., 2004. 27(10): p. 1087-1091.
58. Wada, E., et al., Can intramedullary signal change on magnetic resonance imaging predict surgical outcome in cervical spondylotic myelopathy? Spine. 24(5):455-61; discussion 462, 1999 Mar 1., 1999.
59. Tanaka J, S.N., Tokimura F, Doi K, Inoue S. , Operative results of canal-expansive laminoplasty for cervical spondylotic myelopathy in elderly patients. . Spine, 1999. 24: p. 2308-2312.
60. Tani T, Y.H., Kimura J. , Cervical spondylotic myelopathy in elderly people: a high incidence of conduction block at C3-4 or C4-5. . J Neurol Neurosurg Psychiatry, 1999. 66: p. 456–464.
61. Suri A, C.R., Mehta VS, Gaikwad S, Pandey RM., Effect of intramedullary signal changes on the surgical outcome of patients with cervical spondylotic myelopathy. Spine J., 2003. 3(1): p. 33-45.
62. Andersen T, C.F., Laursen M, Hoy K, Hansen ES, Bunger C, Smoking as a predictor of negative outcome in lumbar spinal fusion. . Spine, 2001. 26: p. 2623–2628.
63. Glassman SD, A.S., Parker A, Burke D, Johnson JR, Dimar JR The effect of cigarette smoking and smoking cessation on spinal fusion. Spine, 2000. 25: p. 2608–2615.
64. Concato J, F.A., Holford TR. , The risk of determining risk with multivariable models. Annals of Internal Medicine, 1993. 118: p. 201-210.
65. Geoffrey R. Norman, D.L.S., Biostatistics: The Bare Essentials. 2008, People's medical publishing house Shelton.
66. Feinstein, A., Multivariate analysis: an introduction. . 1996, London: Yale Univ Pr.
67. Rothman KJ, G.S., Modern epidemiology. 1998, Philadelphia: Lippincott-Raven.
68. Vickers AJ, A.D., Analysing controlled trials with baseline and follow up measurements. BMJ, 2001. 323: p. 1123-1126.
69. Pascal-Moussellard H, D.L.-R., Olindo S, Rouvillain J-L, Catonné Y, Neurological recovery after cervical cord decompression for canal stenosis myelopathy. Elsevier Masson SAS, 2006. 91: p. 607-614.
70. Singh A, C.H., Quantitative assessment of cervical spondylotic myelopathy by a simple walking test. Lancet 1999. 354: p. 370–373.
71. Demir A, R.M., Moonen CT, Vital JM, Dehais J, Arne P, Caillé JM, Dousset V., Diffusion-weighted MR imaging with apparent diffusion coefficient and apparent diffusion tensor maps in cervical spondylotic myelopathy. Radiology, 2003. 229(1): p. 37-43.
72. Facon D, O.A., Fillard P, Lepeintre JF, Tournoux-Facon C, Ducreux D., MR diffusion tensor imaging and fiber tracking in spinal cord compression. AJNR Am J Neuroradiol, 2005. 26(6): p. 1587-1594.
73. Budzik JF, B.V., Le Thuc V, Duhamel A, Assaker R, Cotten A., Diffusion tensor imaging and fibre tracking in cervical spondylotic myelopathy. Eur Radiol., 2010.
74. Sagiuchi T, T.S., Endo M, Hayakawa K., Diffusion-weighted MRI of the Cervical Cord in Acute Spinal Cord Injury With Type II Odontoid Fracture. J Comput Assist Tomogr. , 2002. 26(4): p. 654-656.
TABLES
CHAPTER 3: Systematic review
Table 3.1: Criteria in a modified version of the quality assessment checklist (each item is answered Yes or No, with comments)

Representative sample
  Source description: Was the source of participants adequately described?
  Referral pattern: Was the recruitment method adequately described? E.g. representative sample: participants were selected as consecutive or random cases.
  Patients characteristics: Was the population of interest adequately described for key characteristics: severity, co-morbidity, inclusion/exclusion criteria, age and sex? Yes, if all characteristics are reported. No, if the description is limited to age and sex characteristics, or none.
  Sample size: Was the sample size large enough? The rule of thumb: at least 10 cases per independent variable are required at a power of 80% and a 5% significance level (e.g. the author runs a comparison for age, sex, symptom duration, pre-/post-operative neurological scores, etc).

Blinding
  Blinded assessor: Were MRI assessors involved in the study blinded to clinical data? E.g. blinded outcome assessment: assessor was unaware of prognostic factors at the time of outcome assessment.

Baseline comparability
  Compared baseline performance of clinical status: Is baseline performance of clinical status measured? If yes, is the absolute difference between the groups less than 10%? If yes, score the quality criterion as YES. If no, did the analysis take into consideration the baseline imbalance (for example, analysis of co-variance or analysis by change scores between groups)? E.g. statistical adjustment: multivariate analyses conducted with adjustment for potentially confounding factors. If yes, score the quality criterion as YES. If no, score the quality criterion as NO. Otherwise, if no comparison is completed, then NA.
  Compared baseline performance of other predictive variables: Is baseline performance of age, sex and symptom duration measured? If yes, is the absolute difference between the groups less than 10%? If yes, score the quality criterion as YES. If no, did the analysis take into consideration the baseline imbalance (for example, analysis of co-variance or analysis by change scores between groups)? E.g. statistical adjustment: multivariate analyses conducted with adjustment for potentially confounding factors. If yes, score the quality criterion as YES. If no, score the quality criterion as NO. Otherwise, if no comparison is completed, then NA.

Follow-up
  Complete: Was follow up reported? If yes, was follow-up complete? Follow-up >80%: outcome data were available for at least 80% of participants at one follow-up point. If not, then score the quality criterion as NO.
  Comparison of drop outs with remained: Were those followed up comparable to those who dropped out?
  Reasons of drop outs: Were reasons for loss to follow-up provided?

Validation of outcome measurement
  Valid: Were outcome measures adequately valid? Yes, if the prognostic study tested the validity of measurements used or referred to other studies which had established validity. Otherwise, no.
  Reliable: Were outcome measures adequately reliable? Yes, if the prognostic study tested the reliability of measurements used or referred to other studies which had established reliability. Otherwise, no.

Validation of predictive factor measurement
  Defined: Were definitions or descriptions of MRI predictor adequately provided? Yes, if there is clear indication of measurement method such as detailed description of MRI protocol including planes (axial/sagittal and thickness of slices). Otherwise, no.
  Reliable: Were predictive factors measures adequately reliable? Yes, if inter/intra-observer reliability tests with/without coefficient value are reported (e.g. Cronbach alpha or Kappa coefficients). Otherwise, no.
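
Two of the checklist criteria in Table 3.1 are simple arithmetic rules: the sample-size rule of thumb (at least 10 cases per independent variable) and the follow-up completeness threshold (outcome data for at least 80% of participants). The following is a minimal Python sketch of how these two items could be scored; the function names and the example numbers are hypothetical and are not taken from any of the reviewed studies.

```python
def sample_size_adequate(n_cases: int, n_predictors: int, cases_per_predictor: int = 10) -> bool:
    """Rule of thumb from the checklist: at least 10 cases per independent variable."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return n_cases >= cases_per_predictor * n_predictors


def follow_up_complete(n_followed: int, n_enrolled: int, threshold_percent: int = 80) -> bool:
    """Checklist criterion: outcome data available for at least 80% of participants."""
    if n_enrolled <= 0:
        raise ValueError("n_enrolled must be positive")
    # Integer arithmetic avoids floating-point edge cases at exactly 80%.
    return n_followed * 100 >= threshold_percent * n_enrolled


# Hypothetical study: 65 cases, 6 candidate predictors, 52 patients followed up.
print(sample_size_adequate(65, 6))   # 65 >= 60
print(follow_up_complete(52, 65))    # 52/65 is exactly 80%
```

Each function returns True when the corresponding item would be scored YES (1) and False when it would be scored NO (0).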
Table 3.2: Summary of methodological limitations, in the format of the modified quality assessment checklist designed by the Cochrane collaboration group et al (2007) [No = 0, Yes = 1].

Column key, grouped by checklist domain:
  Representative sample: (1) Source description, (2) Referral pattern, (3) Patients characteristics, (4) Sample size
  Blinding: (5) Blinded assessor
  Baseline comparability: (6) Compared severity score, (7) Compared baseline performance
  Follow up: (8) Complete
  Validity of clinical scales: (9) Reliability, (10) Validity
  Validity of exposure variables: (11) Definition, (12) Reliability

Cohort                      1   2   3   4   5   6   7   8   9   10  11  12
Prospective cohort study
Nagata et al. 1990          0   0   0   0   0   0   0   0   1   0   1   0
Fukushima et al. 1991       0   0   0   1   0   0   0   1   1   0   1   0
Yukawa et al. 2007          0   0   1   1   1   1   0   0   1   0   1   1
Yone et al. 1992            0   0   0   0   0   0   0   0   1   0   0   0
Okada et al. 1993           0   0   0   1   0   0   0   1   1   0   1   0
Chen et al. 2001            1   1   0   1   1   1   0   1   1   0   1   1
Papadopolous et al. 2004    0   1   0   0   1   0   1   1   1   0   0   0
Singh et al. 2001           1   1   0   0   1   0   0   1   1   0   0   1
Mastronardi et al. 2007     0   1   0   0   1   0   0   1   0   0   1   0
Fernandez et al. 2007       1   0   0   1   0   1   0   1   0   0   1   0
Retrospective cohort study
Nagata et al. 1996          0   0   0   1   0   0   0   1   1   0   1   0
Uchida et al. 2005          1   0   0   0   0   NA  NA  1   1   0   0   0
Kasai et al. 2001           1   0   0   1   0   NA  NA  0   1   0   1   0
Chung et al. 2002           0   0   0   0   0   1   0   0   1   0   0   0
Wada et al. 1999            1   0   0   0   1   1   1   0   1   0   1   0
Morio et al. 2001           0   0   0   1   1   1   1   1   1   0   1   1
Yamazaki et al. 2002        1   0   0   0   0   1   0   1   1   0   0   0
Wada et al. 1995            0   0   0   0   0   1   0   1   1   0   1   0
Houten et al. 2003          0   1   0   0   0   1   0   1   0   0   1   0
Park et al. 2006            0   0   0   1   0   1   0   1   0   0   1   0
Case series
Mizuno et al. 2003          0   0   0   1   0   0   0   1   1   0   0   0
Matsuyama et al. 2004       0   0   0   0   0   1   0   1   1   0   1   0
Matsuda et al. 1991         1   0   0   1   0   0   0   1   1   0   1   0
Table 3.3: Study design, sample size, type of outcome measures and level of evidence

Citation                   Study design           Sample N=   Outcome measure scale   Level of Evidence*   Follow up   Inception point
Nagata et al. 1990         Prospective cohort     300         JOA                     IV                   NO          NO
Fukushima et al. 1991      Prospective cohort     55          JOA                     I                    YES         YES (onset)
Yukawa et al. 2007         Prospective cohort     142         JOA                     IV                   NO          NO
Yone et al. 1992           Prospective cohort     140         JOA                     IV                   NO          NO
Okada et al. 1993          Prospective cohort     74          JOA                     IV                   YES         NO (symptom duration?)
Papadopolous et al. 2004   Prospective cohort     42          JOA                     IV                   YES         NO (symptom duration?)
Singh et al. 2001          Prospective cohort     69          Walking Test            I                    YES         YES (surgery)
Chen et al. 2001           Prospective cohort     64          mJOA                    IV                   YES         NO
Mastronardi et al. 2007    Prospective cohort     42          mJOA                    I                    YES         YES (onset of symptoms)
Fernandez et al. 2007      Prospective cohort     67          mJOA                    I                    YES         YES (3 months before surgery)
Nagata et al. 1996         Retrospective cohort   173         JOA                     IIc
Uchida et al. 2005         Retrospective cohort   135         JOA                     IIc
Kasai et al. 2001          Retrospective cohort   128         JOA                     IIc
Chung et al. 2002          Retrospective cohort   113         JOA                     IIc
Wada et al. 1999           Retrospective cohort   85          JOA                     IIc
Morio et al. 2001          Retrospective cohort   73          JOA                     IIc
Yamazaki et al. 2002       Retrospective cohort   64          JOA                     IIc
Wada et al. 1995           Retrospective cohort   31          JOA                     IIc
Houten et al 2002          Retrospective cohort   38          mJOA                    IIc
Park et al 2006            Retrospective cohort   80          NCSS                    IIc
Mizuno et al. 2003         Case series study      134         JOA                     IV
Matsuyama et al., 2004     Case series study      44          JOA                     IV
Matsuda et al. 1991        Case series study      29          JOA                     IV

* http://www.eboncall.org/content/levels.html: NHS R&D Centre for Evidence-Based Medicine (Bob Phillips, Chris Ball, Dave Sackett, Brian Haynes, Sharon Straus and Finlay McAlister) (2002)
Table 3.4: Data extracted were groups of MRI features (signal intensity, spinal cord compression and spinal canal compromise)
Table 3.4 (I): Descriptions of increased signal intensity (ISI) of the spinal cord in T2-/T1-weighted MRI
Predictive variable Author Method assessments:
Matsuda et al. 1991 1.5-tesla superconductive magnet* and a surface coil was used. The slices were from 3 to 5 mm thick.
Papadopolous et al. 2004 No description
Absence/presence of T2 signal intensity changes on sagittal view
Yukawa et al. 2007 1.5-T A surface coil was used. The slice width was 4 mm Absence/presence of T2 signal intensity changes on axial views
Mizuno et al. 2003
Snake-eye appearance was defined as one left- and one right-sided small round or elliptical high signal intensity lesion in the central gray matter near the ventrolateral posterior column
Absence/presence of T2 signal intensity changes (type of plane is not mentioned)
Singh et al. 2001 No description
Yukawa et al. 2007
Grade 0 none Grade 1 light (obscure) Grade 2 intense (bright)
Degree of intensity on sagittal T2WI
Chen et al. 2001
Type 0 no SI on T2 Type 1 (>50%) faint and fuzzy border Type 3 (>50%) intense and well-defined border
Three patterns of axial T1/sagittal T2 –weighted sequences
Morio et al. 2001 Alafifi et al. 2007 Mastronardi et al.2007
(A) normal intensity on both T1- and T2-weighted images (B) normal intensity on T1- weighted and high signal intensity on T2-weighted images (C) low signal intensity on T1-weighted and high signal intensity on T2-weighted images
Signal-intensity ratio on sagittal T2-WI
Okada et al. 1993
The intensity of the intramedullary, sagittal T2-weighted MRI cord signal at maximal compressed levels divided by comparable readings at contiguous noncompressed sites
Table 3.4 (II): Descriptions of degree of spinal cord compression and/or canal compromise for cervical spondylotic myelopathy by magnetic resonance imaging (MRI) finding
Predictive variable Author Method assessments:
Yone et al. 1992
No description Slice thickness: 5 mm
Anteroposterior diameter on sagittal T1WI
Kasai et al. 2001 A 1.5-T MRI device. The slice width was set at 5 mm and the number of slices at 7. Sagittal view of T1-/T2-weighted images. MRI cumulative score: 6 degrees of spinal stenosis captured on T1/T2-weighted sagittal imaging: Grade 0: normal image; Grade 1: either the anterior or posterior subarachnoid space is not maintained; Grade 2: both the anterior and posterior subarachnoid spaces are not maintained; Grade 3: either anterior or posterior spinal cord deformity, but the posterior or anterior subarachnoid space is maintained; Grade 4: either anterior or posterior spinal cord deformity is observed, and the posterior or anterior subarachnoid space is not maintained; Grade 5: spinal cord deformity is observed both anteriorly and posteriorly
Degrees of spinal cord on sagittal T1WI
Nagata et al.1996 None (0) Mild (1; flattening or concavity of the anterior surface only) Moderate (2; <50% reduction in maximal sagittal diameter) Severe (3; >50% reduction in sagittal diameter)
Okada et al. 1993 The transverse area at the site of maximal cord compression was measured with a digitizer linked to a computer
Transverse area on axial T1WI
Fukushima et al. 1991 MRI axial views perpendicular to the spinal cord were obtained with a 0.5 tesla superconducting MRI system Critical value of transverse area is 0.45 cm2
Chung et al. 2002 Thickness of slices was not reported. Pre-operative T1-weighted axial imaging with a Signa 1.5-tesla. Compression ratio = a/b: a, smallest sagittal diameter of the spinal cord; b, broadest transverse diameter of the cord at the same level
Chen et al. 2001
Cord compression ratio = sagittal diameter/transverse diameter The imagers were superconducting 1.5-T MR systems Section thickness was 4 mm with 1-mm gap on both sagittal and transverse images.
Compression ratio on axial T1-weighted
Okada et al. 1993
(Sagittal diameter/transverse diameter)*100%. MRI examinations were performed with a 0.5 Tesla. Slice thickness = 10 mm
Degree of diameter on sagittal view
Houten et al.2002 Thickness is not reported Grade 0: 360 degree cushion of CSF around SC Grade 1: loss of CSF cushion without indentation of SC. May have slight anterior cord flattening Grade 2: mild cord compression Grade 3: Severe spinal cord compression
Table 3.4 (III): Area of high T2-signal change for cervical spondylotic myelopathy by magnetic resonance imaging (MRI) finding
Predictive variable Author Method assessments:
Wada et al. 1999 1.5-T with surface coil. Slice thickness =3-5 mm Mastronardi et al.2007 1.5-T with surface coil. Slice thickness =5 mm
Focal/ multisegmental high MRI intensity areas Fernandez et al. 2007
No thickness of slices was reported Type 0 no intramedullary high-signal intensity on T2-weighted images Type 1 high-signal intensity involved only one segment Type 2 high signal intensity extended over two segments
Table 3.5: Potential predictors reported in univariate analyses, with strength of association where available, for short-term (less than 6 months) and long-term (greater than 6 months) follow-up.
Table 3.5 (I): Signal intensity changes as potential predictors
Prognostic factors Author Outcome Length of follow-up
Statistical significance
Strength of association
Yukawa et al 2007 JOA Long term p=0.033 p=0.0012
NA * NA **
Yone et al 1992 JOA Unknown p>0.05 NA * Papadopolous et al 2004 JOA Long term p>0.05
p<0.001 NA * NA **
Absence/presence of T2 signal intensity changes on sagittal view
Matsuda et al 1991 JOA Short term p<0.05 p<0.05
NA * NA **
Wada et al 1995 JOA Short term p>0.05 p>0.05
NA * NA **
Yamazaki et al 2002 JOA Long term p>0.05 NA *
Absence/presence of T2 signal intensity changes on axial/sagittal views
Chung et al 2002 JOA Long term p>0.05 NA * Absence/presence of T2 signal intensity changes on axial views
Mizuno et al 2003
JOA Short term p<0.001 NA *
Yukawa et al 2007 JOA Long term p=0.020 NA * Chen et al 2001 JOA Long term p=-0.018 NA *
Degree of intensity on sagittal T2WI
Uchida et al 2005 JOA Long term p>0.05 NA * Three patterns of axial T1/sagittal T2 –weighted sequences
Morio et al 2001 JOA Long term p = 0.0259 NA *
Signal-intensity ratio on sagittal T2-WI Okada et al 1993 JOA Unknown p<0.001 r=0.537 OPLL * r=0.426 CSM *
Fernandez et al 2007 mJOA Long term p>0.05 NA ** Absence/presence of T2 signal intensity changes on sagittal view Houten et al 2003 mJOA Short term p>0.05 NA ** Three patterns of axial T1/sagittal T2 –weighted sequences
Mastronardi et al 2007 mJOA Long term p=0.001 NA **
Absence/presence of T2 signal intensity changes (type of plane is not mentioned)
Singh et al 2001
Nurick Walking
Short term
p=0.03 p=0.0011
r=0.26 ** NA **
Area of signal intensity changes Wada et al 1995 JOA Short term p>0.05 NA *
p>0.05 NA ** Wada et al 1999 JOA Long term p<0.05
p<0.05 NA * NA **
Mastronardi et al 2007 mJOA Long term p=0.001 p<0.05
NA * NA **
Fernandez et al 2007 mJOA Long term p=0.001 NA * Table 3.5 (II): Severity of spinal cord compression as potential prognostic indicator Prognostic factors Author Outcome Length of
follow-up Statistical significance
Strength of association
Yone et al 1992
JOA Unknown p>0.05 p>0.05
NA * NA **
Anteroposterior diameter on sagittal T1WI
Kasai et al 2001 JOA Long term p<0.01 r=-0.436 * Degrees of spinal cord on sagittal T1WI Nagata et al 1996 JOA Long term p<0.05 NA **
Okada et al 1993 JOA Unknown p<0.01 (CSM/OPLL)
r=0.678/0.586 *
Fukushima et al 1991 JOA Long term p<0.05 r=0.295**
Transverse area on axial T1WI
Morio et al 2001 JOA Long term p=0.0517 p=0.0015
r=0.243 * r=0.398 **
Okada et al 1993 JOA Unknown p>0.05 NA * Chen et al 2001 JOA Long term p=0.836 r=0.026 *
Compression ratio on axial T1-weighted
Chung et al 2002 JOA Long term p<0.05 NA * Uchida et al 2005 JOA Long term p<0.05 in OPLL
p>0.05 in CSM NA ** NA **
Rate of flattening of the cord
Nagata et al 1990 JOA Long term p>0.05 NA * Grade 0 360 degree cushion of CSF around SC on…..
Houten et al 2003
mJOA Short term NA NA **
Degree of diameter on sagittal view Singh et al 2001 Nurick Short term p=0.60 r=0.07 ** Cord deformity on axial T1-weighted MRI Matsuyama et al 2004 JOA Short term NA
NA NA * NA **
* recovery rate; ** post-operative functional score
Table 3.6: RESULTS - PREVIOUS PREDICTIVE MODELS
Study Name Year
Population Number
Fashion of selection
Range of years
Data collection
Statistics Outcome Measure
Recovery percentage
& Mean post-operative
score
Explained variation
(r2)
Variables in final model
Park 2006
80 Non-consecutive CSM cases 2000-2003 3 months after surgery
Patients charts
Stepwise, multivariate regression
NCSS Recovery (%) Maximum score 14
62.2% 25.2% Duration of symptoms Number of high intensity segments
Chen 2001
64 consecutive CSM cases, 1999-2000 6 months after surgery
Clinical database
ANCOVA mJOA Recovery (%) Maximum score 21
79.3%
47.9% Age Degree of intrinsic signal changes
Morio 2001
1998-1999 Non-consecutive CSM cases, Mean 3.4 years, range, 0.5–10 years after surgery
Clinical database
Stepwise, multivariate regression
JOA Recovery (%)
& Mean post score Maximum score 17
180% 14.5
29.7% 70.3%
Recovery percentage: Age Duration of symptoms Signal patterns Post-JOA: Age Duration Signal patterns
Baseline score Okada 1993
74 non-consecutive CSM cases No follow-up time was provided
Clinical database
Multiple regression analysis
JOA Recovery (%) Maximum score 17
(OPLL) 54.7% (CSM) 52.2% (CDH) 12.7%
71.8% 70.2%
Transverse area Signal Intensity ratio Duration of symptoms
Uchida 2005
1988-2001 Non-consecutive OPLLCSM cases Mean 8.3 years, range, 1.0–12.8 years
Medical records
Multiple regression analysis/ PCC(partial correlation coefficient)
JOA Recovery (%) Maximum score 17
Not reported Not reported
CSM group: Anterior Surgery Preoperative JOA score Crandall and Batzdorff’s type Radiographic abnormality Level of compression Rate of flattening of the cord Increased transverse area of the cord Grade of SCEP Laminoplasty Surgery Preoperative JOA score Crandall and Batzdorff’s type
Level of compression Rate of flattening of the cord Increased transverse area of the cord Grade of SCEP OPLL group Anterior Surgery & Laminoplasty Preoperative JOA score Crandall and Batzdorff’s type Spinal canal narrowing (preoperative CT) Type of OPLL Level of compression Rate of flattening of the cord Increased transverse area of the cord Grade of SCEP
CHAPTER 4: Material and Methods
Table 4.1: The mJOA scale for functional assessment for CSM*

Score   Definition
I. Motor dysfunction of the upper extremities
0       Inability to move hands
1       Inability to eat with a spoon but able to move hands
2       Inability to button shirt but able to eat with a spoon
3       Able to button shirt with great difficulty
4       Able to button shirt with slight difficulty
5       No dysfunction
II. Motor dysfunction of the lower extremities
0       Complete loss of motor and sensory function
1       Sensory preservation without ability to move legs
2       Able to move legs but unable to walk
3       Able to walk on flat floor with a walking aid (such as a cane or crutch)
4       Able to walk up and/or down stairs with hand rail
5       Moderate to significant lack of stability but able to walk up and/or down stairs without hand rail
6       Mild lack of stability but walks unaided with smooth reciprocation
7       No dysfunction
III. Sensation
0       Complete loss of hand sensation
1       Severe sensory loss of pain
2       Mild sensory loss
3       No sensory loss
IV. Sphincter dysfunction
0       Inability to micturate voluntarily
1       Marked difficulty with micturition
2       Mild to moderate difficulty with micturition
3       Normal micturition
*From Benzel and colleagues, 1991.
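
As an illustration of how the scale in Table 4.1 is used, the total mJOA score is the sum of the four domain sub-scores (maximum 5 + 7 + 3 + 3 = 18), and the total can be mapped to the severity bands used elsewhere in this thesis (mild, mJOA >= 15; moderate, mJOA 12-14; severe, mJOA < 12). The Python sketch below assumes integer sub-scores; the function and variable names are hypothetical.

```python
# Maximum sub-score for each mJOA domain, read off Table 4.1.
MJOA_MAX = {"upper_motor": 5, "lower_motor": 7, "sensation": 3, "sphincter": 3}


def mjoa_total(upper_motor: int, lower_motor: int, sensation: int, sphincter: int) -> int:
    """Sum the four domain sub-scores after checking each lies within its range."""
    scores = {
        "upper_motor": upper_motor,
        "lower_motor": lower_motor,
        "sensation": sensation,
        "sphincter": sphincter,
    }
    for name, value in scores.items():
        if not 0 <= value <= MJOA_MAX[name]:
            raise ValueError(f"{name} must be between 0 and {MJOA_MAX[name]}, got {value}")
    return sum(scores.values())


def csm_severity(total: int) -> str:
    """Map a total mJOA score to the severity bands used in this thesis."""
    if total >= 15:
        return "mild"
    if total >= 12:
        return "moderate"
    return "severe"


# Hypothetical patient: slight difficulty buttoning a shirt, needs a hand rail on
# stairs, mild sensory loss, normal micturition.
total = mjoa_total(upper_motor=4, lower_motor=4, sensation=2, sphincter=3)
print(total, csm_severity(total))
```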
Table 4.3: Standard parameters for cervical spine T1- and T2-weighted Magnetic Resonance Image (MRI) used in our study PROTOCOL: C-Spine 1.5T - start w/ 3-pl Loc & Asset Cal Series # 3 4 5 Scan Pl. / Mode Sag T2 Sag T1 Ax 3D T2 Pulse Sequence FrFSE FrFSE 3D FrFSE PSD File NPW, EDR NPW FC Name & Fast Imaging FR Options TR* / R-R#** 3200-6887 467-2616 2000-2500 TE1 / TE2* 110-119 10.1 97-106 ETL (Echo Train Length) 24-33 1--6 39 FOV (Field of View) 24-26 24-26 18-24 Slice Thickness 3 3 2.5 Spacing*** 3.3-3.5 3.3-3.5 2.5 # of Slices 13-18 13-18 24-80 Matrix 512X224 512X224 320X224 Phase FOV (Field of View) Frequency Direction A/P A/P R/L Number of excitation 2--4 1--2 1 Shim on Spatial Sat I,S,a,p I,S,a,p a Scan Time 0:26-17:22 0:55-23:43 4:35-5:44
*TE, echo time; TR, repetition time; ** R-R, rest & relaxation; ***Spacing, gap/space between slices.
CHAPTER 5: Results
Table 5.1: Characteristics of Patients with Cervical Spondylotic Myelopathy

Characteristics (n=61)                       % (No. of Patients)
Mean duration of symptoms ± SD (months)      21.1±18.2
Mean age ± SD (y)                            56.2±11.9
Mean age (years)**
  <=65 years old                             75% (46)
  >65 years old                              25% (15)
Gender
  Female                                     31% (19)
  Male                                       69% (42)
Severity of CSM***
  Mild (mJOA>=15)                            32% (19)
  Moderate (mJOA 12-14)                      34% (21)
  Severe (mJOA<12)                           34% (21)
Anatomical level of stenosis
  C3/C4                                      9% (6)
  C4/C5                                      13% (8)
  C5/C6                                      25% (15)
  C6/C7                                      49% (30)
  Unknown                                    3% (2)
Number of stenotic levels**
  One                                        45% (26)
  Two                                        23% (13)
  Three and more                             32% (18)
  Unknown                                    6% (4)
Signal intensity changes
  Normal T1/Norm T2                          34% (20)
  Normal T1/High T2                          47% (28)
  Low T1/High T2                             19% (11)
Surgical approach
  Anterior approach                          67% (42)
  Posterior approach                         30% (18)
  Anterior & posterior approach              3% (1)
Etiologies of myelopathy
  One etiology
    OPLL                                     6% (4)
    Spondylosis                              37% (24)
    Disk                                     17% (11)
    Hypertrophic ligamentum flavum           2% (1)
    Subluxation                              2% (1)
  Two etiologies                             29% (19)
  Three etiologies                           5% (3)
  Unknown                                    3% (2)

Table 5.2: Values of the mJOA in CSM sample

                        Baseline     12 months    Change Score   95% CI for change score
mJOA functional scale   12.9+/-2.7   15.8+/-2.3   2.93+/-2.4     2.32-3.55

NOTE. Values are mean +/- SD. Abbreviation: CI, confidence interval.
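
The 95% confidence interval for the change score in Table 5.2 can be checked from the reported mean and SD. The sketch below assumes that the change score was computed on all 61 patients listed in Table 5.1 and that the interval is the usual t-based interval, mean ± t × SD/√n; the critical value t ≈ 2.000 for 60 degrees of freedom is hard-coded as an assumption rather than taken from the source.

```python
import math


def mean_confidence_interval(mean: float, sd: float, n: int, t_crit: float) -> tuple:
    """Return (lower, upper) limits of mean +/- t_crit * sd / sqrt(n)."""
    standard_error = sd / math.sqrt(n)
    margin = t_crit * standard_error
    return mean - margin, mean + margin


# Two-sided 95% critical value of Student's t for n - 1 = 60 degrees of freedom
# (assumed here; the source does not state which interval method was used).
T_CRIT_DF60 = 2.000

# Change score from Table 5.2: mean 2.93, SD 2.4; n = 61 is assumed from Table 5.1.
low, high = mean_confidence_interval(2.93, 2.4, 61, T_CRIT_DF60)
print(f"{low:.2f} to {high:.2f}")
```

Under these assumptions the limits agree with the reported interval of 2.32-3.55 to within rounding of the published mean and SD.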
Table 5.3: Correlation matrix and coefficients between functional outcomes and independent variables

Column key: (1) Age, (2) Gender, (3) Duration of symptoms, (4) Baseline score, (5) Signal intensity changes, (6) Transverse area, (7) Anteroposterior diameter, (8) Number of compressed segments

                                 (1)    (2)     (3)    (4)    (5)    (6)    (7)    (8)
Age                              1.00
Gender                           0.03   1.00
Duration of symptoms             0.27   -0.12   1.00
Baseline score                   0.44   0.05    0.26   1.00
Signal intensity changes         0.24   0.12    0.13   0.13   1.00
Transverse area                  0.27   0.13    0.08   0.29   0.39   1.00
Anteroposterior diameter         0.21   0.12    0.03   0.19   0.41   0.62   1.00
Number of compressed segments    0.20   0.12    0.20   0.32   0.24   0.35   0.23   1.00
Table 5.4: Unadjusted beta value estimates for independent variables (univariable analysis)

Variable                                              Coefficient   95% CI            P Value   R2
Baseline mJOA
Age as dichotomized* (<=65 years old, >65 years old)  -2.83         -1.42, -4.24      0.0002    0.20
Age as continuous                                     -0.08         -0.13, -0.03      0.0051    0.12
Gender*                                               0.30          -1.15, 1.75       0.68      0.003
Duration of symptoms as dichotomized*
  (<=12 months, >12 months)                           -1.55         -2.97, -0.15      0.03      0.07
Duration of symptoms as continuous                    0.00          -0.04, 0.04       0.97      0.00
TA as dichotomized                                    0.96          -0.4, 2.32        0.17      0.03
TA as continuous*                                     0.06          0.02, 0.10        0.02      0.08
AP diameter                                           0.43          -0.13, 0.99       0.14      0.03
Intensity signal changes*                                                             0.61      0.02
  Low T1/high T2 vs. Normal T1/High T2                0.75          -0.16, 2.64
  Low T1/high T2 vs. Normal T1/Norm T2                0.99          -0.96, 2.98
Number of compressed segments*                                                        0.04      0.10
  ≥ 3 vs. 2 compressed segments                       2.35          0.57, 4.13
  ≥ 3 vs. 1 compressed segment                        1.06          -1.01, 3.13

Final mJOA
Baseline mJOA*                                        1.014                           <.0001    0.30
Age as continuous                                     -1.002        -3.005, 1.001     0.01      0.11
Age as dichotomized* (<=65 years old, >65 years old)  -1.072        -3.110, 0.966     <.0001    0.22
Gender*                                               -1.018        -3.057, 1.022     0.33      0.06
Duration of symptoms as continuous                    1.0           -1.003, 3.003     0.72      0.002
Duration of symptoms as dichotomized*
  (<=12 months, >12 months)                           -1.034        -3.075, 1.007     0.09      0.05
TA as continuous*                                     1.0           -1.005, 3.005     0.24      0.02
TA as dichotomized                                    1.01          -1.030, 3.050     0.56      0.01
AP diameter as continuous                             1.005         -1.012, 3.022     0.49      0.01
Intensity signal changes*                                                             0.33      0.04
  Low T1/High T2 vs. Normal T1/High T2                1.016         -1.037, 3.069
  Low T1/High T2 vs. Normal T1/Norm T2                1.038         -1.017, 3.093
Number of compressed segments*                                                        0.78      0.01
  ≥ 3 vs. 2 compressed segments                       -1.00         -3.055, 1.055
  ≥ 3 vs. 1 compressed segment                        1.03          -1.017, 3.077

* Chosen exposure variables for multivariable analysis
Table 5.5: Statistical details of full models (multivariable analysis)

Dependent Variable                                      Independent Variables   Coefficient   95% CI            MSE for the Model   P Value for the Model   Adjusted R2 for the Model
Baseline mJOA score                                     Age                     -2.83         -1.420, -4.240    2.44                p=0.0002                20%
Follow-up mJOA score adjusted for baseline mJOA score   Age                     -1.04         -3.081, 1.001     0.06                p<0.0001                36%
FIGURES
Figure D.1: Measurement of the antero-posterior diameter (AP) (A) and transverse area (TA) (B) of the spinal cord using T2-weighted MR images.
Figure D.2: T1-weighted image of the sagittal view revealing hypointensity in the spinal cord (C) and T2-weighted image of the sagittal view showing hyperintensity in the spinal cord (D) before surgery (arrow).
Figure D.3: (E) Focal compression; (F) multiple levels of compression.
Figure D.4: Distribution of baseline mJOA scores.
Figure D.5: Distribution of post-operative mJOA scores at 12 months.
CHAPTER 8
APPENDICES
Appendix 1 Search strategy (results: November 28, 2008)

Database: Ovid MEDLINE(R) (number of results: 6751)
Searches:
1. Magnetic Resonance Imaging/
2. (functional adj6 MRI).mp.
3. fMRI.mp.
4. (functional adj6 magnetic resonance imag:).mp.
5. magnetic resonanc: imag:.mp.
6. mr tomograph:.mp.
7. nmr imag:.mp.
8. nmr tomograph:.mp.
9. zeugmatograph:.mp.
10. functional mri:.mp.
11. chemical shift imag:.mp.
12. magnetization transfer contrast imag:.mp.
13. (mri adj2 scan:).mp.
14. proton spin tomograph:.mp.
15. or/1-14
16. exp cohort studies/
17. exp prognosis/
18. exp morbidity/
19. exp mortality/
20. exp survival analysis/
21. exp models, statistical/
22. prognos*.tw.
23. predict*.tw.
24. course*.tw.
25. diagnosed.tw.
26. cohort*.tw.
27. death.tw.
28. exp case-control studies/
29. disease-free survival.mp.
30. medical: futil:.mp.
31. treatment outcome:.mp.
32. treatment failure:.mp.
33. exp disease progression/
34. (disease adj1 progress:).mp.
35. fatal outcome:.mp.
36. hospital mortality:.mp.
37. exp survival analysis/
38. natural histor:.mp.
39. or/16-38
40. spinal cord diseases/ or spinal cord compression/
41. cervical spondylotic myelopath:.mp.
42. cervical spond: myelopath:.mp.
43. (cervical adj2 myelopath:).mp.
44. spinal canal.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
45. spinal cord.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
46. spin: cord compress:.mp.
47. exp Cerebrospinal Fluid/
48. cerebrospinal fluid:.tw.
49. central cord syndrome/
50. or/40-49
51. 50 and 39 and 15
52. exp animals/
53. exp human/
54. 52 not (52 and 53)
55. 51 not 54

Database: EMBASE
Searches:
1. exp Magnetic Resonance Imaging/
2. (functional adj6 MRI).mp.
3. fMRI.mp.
4. (functional adj6 magnetic resonance imag:).mp.
5. magnetic resonanc: imag:.mp.
6. mr tomograph:.mp.
7. nmr imag:.mp.
8. nmr tomograph:.mp.
9. zeugmatograph:.mp.
10. functional mri:.mp.
11. chemical shift imag:.mp.
12. magnetization transfer contrast imag:.mp.
13. (mri adj2 scan:).mp.
14. proton spin tomograph:.mp.
15. or/1-14
16. exp cohort studies/
17. exp prognosis/
18. exp morbidity/
19. exp mortality/
20. exp survival analysis/
21. exp models, statistical/
22. prognos*.tw.
23. predict*.tw.
24. course*.tw.
25. diagnosed.tw.
26. cohort*.tw.
27. death.tw.
28. exp case-control studies/
29. disease-free survival.mp.
30. medical: futil:.mp.
31. treatment outcome:.mp.
32. treatment failure:.mp.
33. exp disease progression/
34. (disease adj1 progress:).mp.
35. fatal outcome:.mp.
36. hospital mortality:.mp.
37. exp survival analysis/
38. natural histor:.mp.
39. or/16-38
40. exp Spinal Cord Compression/
41. cervical spondylotic myelopath:.mp.
42. cervical spond: myelopath:.mp.
43. (cervical adj2 myelopath:).mp.
44. spinal canal compromis:.mp.
45. spin: cord compress:.mp.
46. central cord syndrome/
47. medulla: compress:.mp.
48. (spinal cord: adj2 pinch:).mp.
49. conus medullaris syndrome.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
50. conus medullaris syndromes.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
51. or/40-50
52. 39 and 51 and 15
53. exp animals/
54. exp human/
55. 53 not (53 and 54)
56. 52 not 55
Appendix 3: RELIABILITY
A comparison of four quantitative methods to assess spine stenosis and
spinal cord compression on magnetic resonance imaging in patients with
cervical spine myelopathy
F.1. INTRODUCTION AND OVERVIEW
F.2. STUDY OBJECTIVE
F.3. HYPOTHESIS
F.4. STUDY DESIGN
F.5. TARGET POPULATION
F.6. DEFINITION OF MR IMAGING PARAMETERS
  F.6.1. Strategies to improve reliability of MR imaging parameters (variation due to)
    F.6.1.a. Clinicians
    F.6.1.b. Patients
    F.6.1.c. MR imaging protocol
    F.6.1.d. Measurement errors
F.7. SAMPLE SIZE
F.8. DATA ANALYSIS
INTRODUCTION AND OVERVIEW
Lack of standardized approaches to assess the severity of cord compression in the
setting of CSM may contribute to variability in interpretations of MRI-based features.
The process of developing a radiological measure to assess the severity of CSM requires
selection of the most suitable imaging modality, appraisal of reliability and determination
of validity. A valid measurement instrument needs to be reliable, or reproducible, even
though reliability is not a sufficient condition for validity. Reliability measures the degree
of consistency across repeated assessments of the same patients by the same rater
(intrarater reliability) or agreement across different raters for the same patient (interrater
reliability). The estimate of reliability is important because: 1) reliability represents the
minimal requirement for a valid clinical measure, and 2) the efficiency of clinical trials
relies on reliable measurements.
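
To make the notion of interrater reliability concrete, the sketch below computes one commonly used agreement statistic, the intraclass correlation coefficient ICC(2,1) (two-way random effects, absolute agreement, single rater), from a subjects-by-raters table. This passage does not state which reliability coefficient was used, so the choice of ICC(2,1) is an assumption made for illustration, and the ratings shown are illustrative values, not data from this study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per subject, each row holding one
    score per rater (every subject is scored by every rater).
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative ratings: 6 subjects (rows) scored by 4 raters (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))
```

Because ICC(2,1) penalizes systematic differences between raters as well as random disagreement, the example yields a low coefficient even though the raters rank the subjects similarly.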
STUDY OBJECTIVE
The objective is to investigate the intra- and inter-rater reliability of four published
methods of examining canal stenosis and cord compression on axial (transverse area
[TA], anteroposterior diameter [AP]) and sagittal (maximum spinal cord compression
[MSCC], maximum canal compromise [MCC]) MR imaging planes.
HYPOTHESIS
We hypothesize that, using a systematic approach to evaluate cervical canal
stenosis and spinal cord compression with magnified software-based tools, written
instructions and consistent interpretations, TA, AP, MSCC and MCC would be
reproducible imaging assessments of the severity of cord compression in CSM patients,
irrespective of the clinician's background/experience, learning and CSM severity.
STUDY DESIGN
Subjects:
The patients were randomly selected from a prospectively accrued database of
CSM patients who were referred for surgical treatment in our unit, which is a large
tertiary-care, university-based spine center.
Procedures:
Seventeen cervical spine digital MR images were evaluated by four spine
specialists (two from neurosurgery, two from orthopaedic surgery) from North America
(n=2), Europe (n=1) and Asia (n=1), in a blinded fashion on four separate occasions.
TARGET POPULATION
The patients had a clinical diagnosis of myelopathy confirmed by evidence of
cord compression on MRI. This project is based on analysis of a single centre (n=65),
which is part of the larger multicentre AOSpine North America CSM Trial (n=283 cases).
DEFINITION OF MR IMAGING PARAMETERS (EXPOSURE VARIA BLES)
Defined Radiological Parameters
Based on three dimensions from MRI, the maximum cord compression using T2-
weighted MRI and canal compromise using T1-weighted MRI were calculated using the
following formulas [9].
Maximum spinal cord compression (%):
where Di is the anteroposterior spinal cord diameter at the level of maximum spinal cord
compression, Da is the anteroposterior spinal cord diameter at the normal levels
immediately above, and Db is the anteroposterior spinal cord diameter at the normal
levels immediately below the level of injury (Figure F.1.).
Maximum canal compromise (%):
where Di is the anteroposterior spinal canal diameter at the level of maximum spinal cord
compression, Da is the anteroposterior spinal canal diameter at the normal levels
immediately above, and Db is the anteroposterior spinal canal diameter at the normal
levels immediately below the level of injury. Measurements of the normal canal
anteroposterior diameter should be taken at midvertebral body level.
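As a minimal sketch, assuming both percentages take the form published in the source cited above [9] (one minus the ratio of Di to the average of Da and Db, multiplied by 100), the calculation could be written as follows. The function name and the example diameters are hypothetical and are not study data.

```python
def percent_compromise(d_i, d_a, d_b):
    """Percent reduction of the diameter at the compressed level (d_i)
    relative to the mean of the normal levels above (d_a) and below (d_b).

    Assumed form: (1 - d_i / ((d_a + d_b) / 2)) * 100.
    Applied to cord diameters it gives MSCC (T2-weighted MRI);
    applied to canal diameters it gives MCC (T1-weighted MRI).
    """
    if d_a <= 0 or d_b <= 0:
        raise ValueError("reference diameters must be positive")
    reference = (d_a + d_b) / 2.0  # mean of the two normal levels
    return (1.0 - d_i / reference) * 100.0


# Hypothetical cord diameters in millimetres (illustration only).
mscc = percent_compromise(d_i=4.0, d_a=8.0, d_b=8.0)
print(f"MSCC = {mscc:.1f}%")
```

Because each patient's compressed level is normalised to that patient's own adjacent normal levels, the result is a relative rather than an absolute measure.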
Transverse area: the area of the spinal cord measured at the site of greatest
compression on the T2-weighted axial view [3] (Figure F.2.).
Anteroposterior diameter: the smallest sagittal diameter of the spinal cord
[Yone et al 1992] (Figure F.2.).
Strategies to improve reliability of MR imaging parameters (variation due to)
Clinicians
First, raters were blinded to clinical and neurologic data. Second, raters assessed
the same patients on four occasions (or rounds), three days apart, to guard against
memory recall. Third, the scans were read individually and in random order. Fourth, for
validity of the experiment, the raters were given the same images on all four occasions.
Fifth, the teaching session prior to the first round of measurements was conducted in a
single meeting to ensure consistent interpretation of the images.
Patients
To ensure a range of symptom severity for reliability testing, the modified
version of the Japanese Orthopaedic Association Scale (mJOA) (Table C.1) was used to
classify CSM into mild (mJOA score >=15), moderate (mJOA score 12-14) and severe
(mJOA score <12) degrees of functional disability. Of the seventeen subjects in this
study, six had mild CSM, five had moderate CSM and six had severe CSM. As described in
Table F.1, the cases had varying numbers of levels of cord compression due to a variety
of pathologies commonly seen in clinical practice, including spondylosis, disc
herniation, ossification of the posterior longitudinal ligament, hypertrophy of the
ligamentum flavum, degenerative subluxation and congenital stenosis.
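The severity grading described above can be sketched as a small function. The cut-offs are those stated in the text and the scores are the mJOA values listed in Table F.1; the function name is hypothetical, and the 0-18 range check assumes the usual range of the mJOA scale.

```python
from collections import Counter


def classify_csm_severity(mjoa_score):
    """Grade CSM severity from an mJOA score using the cut-offs in the text."""
    if not 0 <= mjoa_score <= 18:  # assumed valid range of the mJOA scale
        raise ValueError("mJOA score must lie between 0 and 18")
    if mjoa_score >= 15:
        return "mild"
    if mjoa_score >= 12:
        return "moderate"
    return "severe"


# mJOA scores of the seventeen subjects, in the order given in Table F.1.
scores = [18, 17, 15, 15, 16, 15, 13, 13, 14, 14, 12, 11, 8, 10, 10, 11, 10]
counts = Counter(classify_csm_severity(s) for s in scores)
print(counts)
```

Applied to the Table F.1 scores, the counts come to six mild, five moderate and six severe cases, matching the distribution reported in the text.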
MR Imaging protocol
The preoperative midsagittal T1-weighted and axial and midsagittal T2-weighted
MRI series of all patients were provided on a CD-ROM together with eFilm Lite (2003) and
Mango 2.0 (Multi-Image Analysis GUI) software. Figures F.1 and F.2 demonstrate
examples of the measurement techniques used in this study. The images were evaluated
by the spine specialists using operational guidelines, which detailed the methodologies
described in the original studies by Fehlings et al. 2007 [9] for maximum spinal cord
compression [MSCC] and maximum canal compromise [MCC], by Okada et al. 1993
[3] for transverse area [TA], and by Yone et al. 1992 [reference] for anteroposterior
diameter [AP]. Raters were asked to magnify the images to 200%, consistently across all
patients, using the eFilm and Mango programs before measuring the parameters, which
potentially reduces the procedural variability of the measurements of cervical canal
stenosis and spinal cord compression in CSM.
SAMPLE SIZE
Given that the primary objective of this study was to assess the reliability of four
instruments in the setting of myelopathy, we calculated a sample size of seventeen
patients, based on four raters each carrying out four separate ratings of each subject,
in order to obtain results with a Type I error of 5%, a minimal power of 80%, and a
desired intraclass correlation coefficient (ICC) of 0.75 (expected level of ICC of 0.9)
[11] [12].
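A sample size of this kind can be sketched with the approximation described by Walter et al. [12]. The sketch below is an illustration under stated assumptions, not a record of the calculation actually performed: it assumes a one-sided Type I error and n = 4 ratings per subject, neither of which is specified in the text, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def subjects_required(rho0, rho1, n_ratings, alpha=0.05, power=0.80):
    """Approximate number of subjects needed to show ICC > rho0 when the
    expected ICC is rho1, with n_ratings ratings per subject.

    Approximation (Walter, Eliasziw and Donner):
        k = 1 + 2 (z_alpha + z_beta)^2 n / ((ln C0)^2 (n - 1))
        C0 = (1 + n*theta0) / (1 + n*theta1),  theta = rho / (1 - rho)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided test (assumption)
    z_beta = NormalDist().inv_cdf(power)
    theta0 = rho0 / (1 - rho0)
    theta1 = rho1 / (1 - rho1)
    c0 = (1 + n_ratings * theta0) / (1 + n_ratings * theta1)
    k = 1 + (2 * (z_alpha + z_beta) ** 2 * n_ratings) / (
        math.log(c0) ** 2 * (n_ratings - 1)
    )
    return math.ceil(k)


print(subjects_required(rho0=0.75, rho1=0.90, n_ratings=4))
```

Under these assumptions the approximation rounds up to seventeen subjects, which agrees with the sample size stated above; a two-sided test or a different number of ratings per subject would give a different figure.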
DATA ANALYSIS
Data were entered into constructed data sets and all analyses were performed using
SAS version 9.2 and Microsoft Excel 2003. Interrater and intrarater reliability were
evaluated using ICCs derived from two-way analysis of variance (ANOVA) [13]. In
general, ICCs range from 0 to 1, where 0 indicates no agreement and 1 indicates perfect
agreement/consistency [14]. ICC values were interpreted according to the criteria
proposed by Burdock et al [15], with a minimum ICC value of 0.75 taken as the
reference for an excellent level of agreement/consistency. However, it is important to
acknowledge that such criteria are somewhat arbitrary.
Intrarater reliability and Interrater reliability.
The ICC is a relative index of variability: an ICC of 0.95 means that an estimated
95% of the observed score variance is due to true variance between subjects. The ICC
estimates were calculated according to the Shrout-Fleiss models for random effects
(Model 2), that is, 1) a two-way model, 2) a random effects model with absolute
agreement (the raters are assumed to be randomly selected from the population), 3) with
systematic error included, and 4) based on the mean score (the scores in the analysis
represent the average of all trials from each subject) (Fleiss et al. 1979). The
intra-rater and inter-rater ICCs therefore establish the reliability of ratings
including systematic differences between raters.
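The two-way random effects, absolute agreement coefficients described above can be sketched directly from the ANOVA mean squares. This is an illustrative pure-Python version of the Shrout-Fleiss Model 2 formulas, not the SAS procedure used in the study; the function name is hypothetical and the example matrix is the six-subject, four-judge illustration from Shrout and Fleiss [13], not study data.

```python
def shrout_fleiss_icc2(ratings):
    """Return (ICC(2,1), ICC(2,k)) for a subjects x raters table.

    ratings: list of rows, one row per subject, one score per rater.
    BMS = between-subjects MS, JMS = between-raters MS, EMS = residual MS.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters (or trials)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_raters

    bms = ss_subjects / (n - 1)
    jms = ss_raters / (k - 1)
    ems = ss_error / ((n - 1) * (k - 1))

    # Single-score and mean-score forms; both penalise systematic rater offsets.
    icc_single = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    icc_average = (bms - ems) / (bms + (jms - ems) / n)
    return icc_single, icc_average


# Example ratings from Shrout and Fleiss (6 subjects, 4 judges).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
single, average = shrout_fleiss_icc2(example)
print(f"ICC(2,1) = {single:.2f}, ICC(2,k) = {average:.2f}")
```

The mean-score form corresponds to option 4 above (reliability of the average of all trials) and is always at least as large as the single-score form when the single-score ICC is positive.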
Data are represented in terms of estimates of the true mean, standard deviations,
standard error of the mean (SEM) and confidence intervals [17].
RESULTS
As described in Table F.1, our study population was composed of four females
and thirteen males (age, 37–82 years; mean, 54.5 years) with varying severity of CSM.
Table F.1 Characteristics of the patients with Cervical Spondylotic Myelopathy (CSM)
Gender  Age (yrs)  Etiology of CSM     Number of stenotic segments  mJOA score
Severity of CSM by mJOA grade: Mild (mJOA score >=15)
Male    50         Spondylosis + CS    1                            18
Male    53         Spondylosis + CS    2                            17
Male    52         Spondylosis + CS    3                            15
Male    43         OPLL + HLF          2                            15
Male    65         Disc                4                            16
Male    60         Spondylosis         2                            15
Severity of CSM by mJOA grade: Moderate (mJOA score 12-14)
Male    38         Spondylosis + CS    1                            13
Male    68         Spondylosis + SL    8                            13
Male    61         Spondylosis + HLF   3                            14
Male    37         Disc herniation     1                            14
Female  54         SL                  2                            12
Severity of CSM by mJOA grade: Severe (mJOA score <12)
Male    82         OPLL                2                            11
Male    52         OPLL + CS           4                            8
Female  58         HLF                 3                            10
Male    59         SL + CS + HLF       4                            10
Female  55         Spondylosis         1                            11
Female  40         Disc herniation     1                            10
mJOA - modified version of the Japanese Orthopaedic Association Scale; CSM - cervical spondylotic myelopathy; SL - subluxation;
CS - congenital stenosis; HLF - hypertrophic ligamentum flavum; OPLL - ossification of the posterior longitudinal ligament.
Descriptive statistics
The differences among the four raters for all four radiological parameters (MCC,
MSCC, TA and AP) met statistical significance based on two-way ANOVA with
Bonferroni post-hoc analysis (Table F.2).
The transverse area of the spinal cord ranged from 32.8 to 122.0 mm², with mean
values of 74.8±15.67 mm², 80.0±23.17 mm², 59.6±19.89 mm² and 71.4±16.48 mm² for
Raters 1-4, respectively, and the largest deviation reported by Rater 2 (Table F.2).
Rater 2 and Rater 3 had consistently different ratings from Rater 1 and Rater 4 (p<0.05).
The mean anteroposterior diameter of the spinal cord ranged from 0.40 to 0.43 mm
across raters (individual values from 0.2 to 0.6 mm; Table F.2), with mean values of
0.41±0.09 mm, 0.43±0.07 mm, 0.40±0.08 mm and 0.40±0.08 mm for Raters 1-4,
respectively, and the largest deviation reported by Rater 1. Rater 2 had consistently
different ratings from Rater 3 and Rater 4 (p<0.05).
The maximum canal compromise ranged from 77.2 to 93.6, with mean values of
82.0±2.25, 82.4±3.71, 85.7±3.01 and 82.6±2.53 for Raters 1-4, respectively, and the
largest deviation reported by Rater 2 (Table F.2). Rater 3 had consistently different
ratings from Raters 1, 2 and 4 (p<0.05).
The maximum spinal cord compression ranged from 78.3 to 89.1, with mean
values of 82.8±2.71, 82.4±2.71, 84.1±2.43 and 82.1±2.37 for Raters 1-4, respectively,
and the largest deviation reported by Raters 1 and 2 (Table F.2). Rater 3 had
consistently different ratings from Raters 1, 2 and 4 (p<0.05).
Table F.2 presents the results in terms of means, standard deviations, minimum and
maximum values of 17 cases.
Measure (Mean±SD; Min, Max)              Rater 1                  Rater 2                   Rater 3                   Rater 4
Transverse Area (TA)                     74.8±15.67; 38.0, 98.0   80.0±23.17; 40.1, 122.0   59.6±19.89; 32.8, 103.7   71.4±16.48; 30.6, 92.2
Anteroposterior Diameter (AP)            0.41±0.09; 0.2, 0.6      0.43±0.07; 0.2, 0.6       0.40±0.08; 0.2, 0.6       0.40±0.08; 0.2, 0.5
Maximum Canal Compromise (MCC)           82.0±2.25; 78.8, 88.3    82.4±3.71; 77.2, 89.8     85.7±3.01; 82.1, 93.6     82.6±2.53; 80.0, 90.2
Maximum Spinal Cord Compression (MSCC)   82.8±2.71; 78.3, 88.5    82.4±2.71; 79.3, 88.8     84.1±2.43; 80.9, 89.1     82.1±2.37; 78.5, 87.5
Assessment of Intrarater Reliability
Using the Shrout-Fleiss model for random effects, the intrarater consistency ICCs
for Rater 1 to Rater 4, respectively, were 0.82, 0.99, 0.98 and 0.88 for the transverse
area of the spinal cord; 0.76, 0.91, 0.88 and 0.84 for the anteroposterior diameter of
the spinal cord; 0.76, 0.89, 0.85 and 0.76 for maximum spinal cord compression on
T2-weighted MRI; and 0.82, 0.97, 0.80 and 0.72 for maximum canal compromise on
T1-weighted MRI. Rater 2 consistently had ICCs above those of the other three raters.
According to the general guidelines of Burdock et al. [15], all four measurement
methods had acceptable consistency (ICC values of 0.75 or higher) in our study, with
the exception of Rater 4 for maximum canal compromise (ICC 0.72) (Table F.3).
Table F.3 outlines intra-observer ICC values using the Shrout-Fleiss model
for random effects regarding spinal cord and canal deformities evaluated by TA, AP,
MCC and MSCC, respectively.
Intra-rater: ICC, SEM* (95% CI**)
Measure   TA                       AP                       MCC                      MSCC
Rater 1   0.82, 13.3 (0.62-0.93)   0.76, 0.06 (0.73-0.79)   0.82, 1.96 (0.61-0.93)   0.76, 2.56 (0.53-0.91)
Rater 2   0.99, 3.9 (0.94-1.00)    0.91, 0.04 (0.87-0.95)   0.97, 1.34 (0.93-0.99)   0.89, 1.53 (0.77-0.96)
Rater 3   0.98, 6.3 (0.95-0.99)    0.88, 0.05 (0.83-0.93)   0.80, 2.69 (0.59-0.92)   0.85, 1.86 (0.69-0.94)
Rater 4   0.88, 11.4 (0.75-0.95)   0.84, 0.05 (0.79-0.90)   0.72, 2.63 (0.43-0.89)   0.76, 2.36 (0.49-0.90)
*SEM = square root of MSE.
**ICC 95% CI: ICC ± 1.96*SD*square root of [ICC*(1 - ICC)], where SD = square root of (SST/n-1) (Weir et al 2005).
Assessment of Interrater Reliability
Using the Shrout-Fleiss model for random effects, the interrater agreement ICCs
for the 1st to 4th sessions, respectively, were 0.68, 0.69, 0.73 and 0.76 for the
transverse area of the spinal cord; 0.86, 0.72, 0.68 and 0.52 for the anteroposterior
diameter of the spinal cord; 0.83, 0.65, 0.62 and 0.65 for maximum spinal cord
compression on T2-weighted MRI; and 0.46, 0.64, 0.46 and 0.52 for maximum canal
compromise on T1-weighted MRI. Although the ICCs for transverse area measurements
improved consistently from session 1 to session 4 (Table F.4), graphical
representation illustrated normal fluctuations (Figure F.3).
Table F.4. Reliability Assessment (Using the Shrout-Fleiss model for random effects)
Inter-rater: ICC, SEM* (95% CI**)
Measure   TA                       AP                       MCC                       MSCC
Time 1    0.68, 15.6 (0.36-0.87)   0.86, 0.05 (0.84-0.88)   0.46, 3.0 (-0.01-0.76)    0.83, 2.07 (0.65-0.93)
Time 2    0.69, 17.8 (0.37-0.87)   0.72, 0.06 (0.69-0.75)   0.64, 2.79 (0.28-0.85)    0.65, 2.56 (0.30-0.86)
Time 3    0.73, 14.3 (0.42-0.89)   0.68, 0.06 (0.65-0.71)   0.46, 3.44 (-0.05-0.77)   0.62, 2.40 (0.24-0.85)
Time 4    0.76, 13.9 (0.50-0.90)   0.52, 0.06 (0.49-0.55)   0.52, 2.82 (0.10-0.79)    0.65, 2.40 (0.30-0.86)
*SEM = square root of MSE.
**ICC 95% CI: ICC ± 1.96*SD*square root of [ICC*(1 - ICC)], where SD = square root of (SST/n-1) (Weir et al 2005).
To explore the sources of systematic error that contribute to the ICCs reported in
Tables F.3 and F.4, three-way ANOVA was used to investigate time and rater as facets
of interest.
The data in Tables F.5 to F.8 show that the effect for trials (time facet) was
statistically insignificant for three of the methods of spine and canal stenosis
assessment, the exception being the transverse area of the spinal cord, based on
three-way ANOVA with Bonferroni post-hoc analysis ([MSCC, p=0.28], [MCC, p=0.35],
[AP, p=0.12], [TA, p=0.01]). The time effect for transverse area is also supported by
the consistently increasing level of agreement among the four raters from Session 1 to
Session 4 (Table F.4). However, the time differences appear as normal fluctuations
(i.e. random error) (Figure F.3), indicating that there is no systematic error in the
data.
The data in Tables F.5 to F.8 also show that the effect for rater was statistically
significant for all four methods of spine and canal stenosis assessment, based on
three-way ANOVA with Bonferroni post-hoc analysis ([MSCC, p<0.0001], [MCC,
p<0.0001], [AP, p=0.0008], [TA, p<0.0001]).
Table F.5. Analysis of Variance summary table for maximum spinal cord compression
(MSCC) measurements data set
Source of variation   Df   MS               F       Sig
Between subjects      16   68.58 (BMS)      15.30   <0.0001
Within subjects
  Between raters      3    53.37 (RMS)      11.90   <0.0001
  Between times       3    5.76 (TMS)       1.28    0.2813
  Rater*time          9    3.41 (RTMS)      0.76    0.6520
  Rater*subject       48   9.05 (RSMS)      2.20    0.0004
  Error                    4.48 (EMS)
Table F.6. Analysis of Variance summary table for maximum canal compromise (MCC)
measurements data set
Source of variation   Df   MS               F       Sig
Between subjects      16   75.21 (BMS)      15.20   <0.0001
Within subjects
  Between raters      3    193.49 (RMS)     39.09   <0.0001
  Between times       3    5.46 (TMS)       1.10    0.3489
  Rater*time          9    5.75 (RTMS)      1.16    0.3212
  Rater*subject       48   20.76 (RSMS)     4.20    <0.0001
  Error                    4.95 (EMS)
Table F.7. Analysis of Variance summary table for transverse area of spinal cord (TA)
measurements data set
Source of variation   Df   MS               F       Sig
Between subjects      16   3696.59 (BMS)    41.17   <0.0001
Within subjects
  Between raters      3    5094.48 (RMS)    56.74   <0.0001
  Between times       3    343.369 (TMS)    3.82    0.0108
  Rater*time          9    157.22 (RTMS)    1.75    0.08
  Rater*subject       48   700.00 (RSMS)    7.80    <0.0001
  Error                    89.78 (EMS)
Table F.8. Analysis of Variance summary table for anteroposterior diameter (AP) of
spinal cord measurements data set
Source of variation   Df   MS               F       Sig
Between subjects      16   0.047 (BMS)      18.92   <0.0001
Within subjects
  Between raters      3    0.0148 (RMS)     5.85    0.0008
  Between times       3    0.005 (TMS)      1.99    0.1175
  Rater*time          9    0.0033 (RTMS)    1.33    0.2243
  Rater*subject       48   0.0068 (RSMS)    2.71    <0.0001
  Error                    0.0025 (EMS)
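The F statistics in the ANOVA summary tables appear to be each effect's mean square divided by the error mean square (EMS). The sketch below illustrates that relationship with the MSCC mean squares from Table F.5; it assumes the EMS is the denominator for every effect, which the text does not state explicitly, and the function name is hypothetical.

```python
def f_ratio(ms_effect, ms_error):
    """F statistic as the ratio of an effect mean square to the error mean square."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


# Mean squares for MSCC as listed in Table F.5.
mscc_mean_squares = {
    "Between subjects": 68.58,
    "Between raters": 53.37,
    "Between times": 5.76,
    "Rater*time": 3.41,
}
mscc_ems = 4.48

for source, ms in mscc_mean_squares.items():
    print(f"{source}: F = {f_ratio(ms, mscc_ems):.2f}")
```

For these four sources the ratios agree with the tabulated F values to within rounding of the published mean squares.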
DISCUSSION AND CONCLUSION
This project enhances the understanding of the challenges of MRI interpretation in
the CSM population. First, the advantage of T2-weighted (T2W) imaging is that the
bright CSF provides visual contrast with the spinal cord. In contrast, T1-weighted
(T1W) imaging shows indistinct anatomical boundaries between the bony canal and the
spinal cord, as typically seen in the CSM population. This is likely why MSCC on T2W
imaging provides more reliable measurements than MCC on T1W imaging (Tables F.3 and
F.4). However, both measurement methods are able to express the degree of compression
relative to the patient's own normal values. Second, the software applications used to
measure the transverse area and anteroposterior diameter of the spinal cord are not
sufficiently developed to provide more accurate estimates of spinal cord deformities.
For example, the software used to assess the anteroposterior diameter appeared to
report values to only one digit. We suspect that the repeated rounding to one digit
could cause a systematic build-up of error in the calculation of the ICC value. Further
research should utilize more rigorous mathematical procedures.
In contrast to the previously published studies (reference TA and AP), the two
published MR imaging techniques, the TA and AP diameter methods, were refined in this
study through improved written instructions. Furlan et al. 2007 supported the
hypothesis that the interrater and intrarater reliability of MR imaging assessment
techniques is enhanced by using magnified digitized images, which reduce the procedural
variability of the measurements. In our study, the MR scans were consistently magnified
across all cases. The lack of publications quantifying the intra- and inter-rater
reliability of the measurement methods listed above limits further comparisons.
Based on the findings of our study, variation in the severity of disease in the
population, clinicians' experience and individual approaches to reading MR images
appear to influence the procedural variability of measurements. Therefore, future
studies should include these details in the descriptions of study design and in their
discussions. First, all four methods appeared to vary significantly with the raters'
individual interpretations according to CSM severity. Second, specialty training seems
to influence the variability of measurements. After reviewing the circumstances of the
third rater's consistently higher ratings, it seems reasonable to speculate that the
differences between raters could have been influenced to some extent by specialty
training (Table F.3). While all raters were fellowship-trained spine surgeons, Rater 2
had an orthopaedic rather than a neurosurgery residency training background. Third,
some individual approaches employed by raters were not apparent at the protocol design
stage but are crucial for future studies: (a) clinicians may have an internal
subjective standard as to what they believe to be the anatomical midline of the spine
on MR imaging; and (b) this internal subjective standard may fluctuate in the selection
of the most compressed site, partly because of the tendency towards multilevel
involvement resulting from degenerative changes of the spine in CSM.
Limitations
One limitation associated with statistical analysis of reliability is averaging of
ratings. If more than one measurement were performed, the means of several trials are
usually used to estimate reliability. Averaging data can increase the reliability coefficient
by minimizing the magnitude of differences between measurements. In our study, the
reliability is reported for the mean of all trials. Yet, practitioners typically administer a
single trial when determining a measure.
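The effect of averaging can be illustrated with the Spearman-Brown relationship between a single-measurement coefficient and the coefficient for the mean of k measurements. This relationship is introduced here as an illustrative assumption, the input value is hypothetical rather than a study result, and the function names are not from the original analysis.

```python
def reliability_of_mean(icc_single, k):
    """Spearman-Brown: reliability of the mean of k measurements."""
    return k * icc_single / (1 + (k - 1) * icc_single)


def reliability_of_single(icc_mean, k):
    """Inverse Spearman-Brown: single-measurement reliability implied by
    the reliability of the mean of k measurements."""
    return icc_mean / (k - (k - 1) * icc_mean)


# Hypothetical single-trial reliability, averaged over four trials.
single = 0.50
averaged = reliability_of_mean(single, k=4)
print(f"single = {single:.2f}, mean of 4 = {averaged:.2f}")
```

In this hypothetical case a single-trial coefficient of 0.50 rises to 0.80 when four trials are averaged, which is why a coefficient reported for the mean of all trials may overstate the reliability of the single measurement typically taken in practice.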
There are some limitations of our study design that are potential sources of
increased inter-observer variation and, therefore, reduced reliability. First, a study
with a single recruitment centre might systematically under- or overestimate
measurement errors due to particular characteristics of the patients. Second, the
position of patients during MR imaging might affect the results. When the positioning
is changed slightly from flexion to extension, the dural sac cross-sectional area
diminishes. Despite careful selection of images, at least one instance of abnormal
positioning was recognized. Third, the lack of standardized features of the imaging
protocol, such as different slice thicknesses of the MRI scans, might affect the
results. Although not all MR images had the same slice thickness, which might have
introduced some bias, the majority of scans (11/17) had a slice thickness of 2.50 mm
and the rest had a greater thickness of 3 mm. Nevertheless, the methods used for the
scans in this study reflected the typical protocols available during the study period.
Fourth, the fact that the clinicians trained in their areas of expertise at different
institutions is another potential limitation. However, we anticipate that these
limitations are relatively minor and reflect real-world issues.
References:
1. Montgomery, D.M. and R.S. Brower, Cervical spondylotic myelopathy. Clinical syndrome and natural history. Orthopedic Clinics of North America, 1992. 23(3): p. 487-93.
2. Chen, C.J., et al., Intramedullary high signal intensity on T2-weighted MR images in cervical spondylotic myelopathy: prediction of prognosis with type of intensity. Radiology, 2001. 221(3): p. 789-94.
3. Okada, Y., et al., Magnetic resonance imaging study on the results of surgery for cervical compression myelopathy. Spine, 1993. 18(14): p. 2024-9.
4. Morio, Y., et al., Correlation between operative outcomes of cervical compression myelopathy and MRI of the spinal cord. Spine, 2001. 26(11): p. 1238-45.
5. Fukushima, T., et al., Magnetic resonance imaging study on spinal cord plasticity in patients with cervical compression myelopathy. Spine, 1991. 16(10 Suppl): p. S534-8.
6. Feinstein, A., Clinical biostatistics: XLI. Hard science, soft data, and the challenges of choosing clinical variables in research. Clinical Pharmacology & Therapeutics, 1977. 22(0): p. 485–498.
7. Henrica C.W. de Vet, C.B.T., Dirk L. Knol, Lex M. Bouter, When to use agreement versus reliability measures. Journal of Clinical Epidemiology, 2006. 59: p. 1033–1039.
8. Wright J.G., F.A.R., Improving the reliability of orthopaedic measurements. The Journal of Bone and Joint Surgery, 1992. 74B(2): p. 287-291.
9. Fehlings MG, F.J., Massicotte EM, et al., Interobserver and intraobserver reliability of maximum canal compromise and spinal cord compression for evaluation of acute traumatic cervical spinal cord injury. Spine, 2006. 31: p. 1719–1725.
10. Bednarik, J., Kadanka, Z., Dusek, L., Kerkovsky, M., Vohanka, S., Novotny, O., Urbanek, I., Kratochvilova, D., Presymptomatic spondylotic cervical myelopathy: an updated predictive model. European Spine Journal, 2008. 17: p. 421–431.
11. Kraemer HC, K.A., Statistical alternatives in assessing reliability, consistency and individual differences for quantitative measures: application to behavioral measures of neonates. Psychol Bull, 1976. 83: p. 914–921.
12. Walter S.D., E., M., Donner, A., Sample size and optimal designs for reliability studies. Statistics in Medicine, 1998. 17: p. 101-110.
13. Shrout, P.E., Fleiss, J.L., Intraclass Correlations: Uses in Assessing Rater Reliability. Psychological Bulletin, 1979. 86(2): p. 420-428.
14. Fleiss JL, C.J., The equivalence of weighted kappa and intraclass correlation coefficient as measures of reliability. Educ Psychol Meas, 1973. 2: p. 113–117.
15. Burdock EIF, H.A., A new view of interobserver agreement. Perspect Psychol, 1963. 16: p. 373–384.
16. Morris, R., ed. Assessing the reliability of clinical measurement. 1997, 1st ed. Oxford: Butterworth-Heinemann. 1-18.
17. Weir, J.P., Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. Journal of Strength and Conditioning Research, 2005. 19(1): p. 231–240.
18. Furlan, J.C., Fehlings, M.G., Massicotte, E.M., Aarabi, B., Vaccaro, A.R., Bono, C.R., Madrazo, I., Villanueva, C., Grauer, J.N., Mikulis, M., A quantitative and reproducible method to assess cord compression and canal stenosis after cervical spine trauma. Spine, 2007. 32: p. 2083–2091.
19. Singh, A., et al., Clinical and radiological correlates of severity and surgery-related outcome in cervical spondylosis. Journal of Neurosurgery, 2001. 94(2 Suppl): p. 189-98.
20. Boutin RD, S.L., Finnesey K., MR imaging of degenerative diseases in the cervical spine. Magn Reson Imaging Clin N Am, 2000. 8: p. 471-490.
21. Emery, S., Cervical spondylotic myelopathy: diagnosis and treatment. J Am Acad Orthop Surg, 2001. 9: p. 376-88.
Figure F.1: Measurements for the maximum spinal cord compression (MSCC) using T2-weighted MRI [Da,Dx,Db] and maximum canal compromise (MCC) using T1-weighted MRI [da,dx,db].
Figure F. 2: Measurements for the anteroposterior diameter (AP) and drawing of the transverse area (TA) of spinal cord using axial T2-weighted MRI.
Figure F. 3: These graphs illustrate that there was not a time dependency
(learning/fatigue) of the MCC, MSCC, AP and TA measurements for spine and canal
stenosis assessments.
Appendix 4 Grade of recommendation: Levels of Evidence Table (2002).
Column headings: Grade of recommendation; Level of Evidence; Therapy: Whether a treatment is efficacious/effective/harmful; Therapy: Whether a drug is superior to another drug in its same class; Prognosis; Diagnosis; Differential diagnosis/symptom prevalence study; Economic and decision analysis.

Grade of recommendation A
Level 1a
Therapy (treatment efficacious/effective/harmful): SR (with homogeneity*) of RCTs
Therapy (drug superior to another in its class): SR (with homogeneity**) of head-to-head RCTs
Prognosis: SR (with homogeneity*) of inception cohort studies; CDR† validated in different populations
Diagnosis: SR (with homogeneity*) of Level 1 diagnostic studies; CDR† with 1b studies from different clinical centres
Differential diagnosis/symptom prevalence study: SR (with homogeneity*) of prospective cohort studies
Economic and decision analysis: SR (with homogeneity*) of Level 1 economic studies
Level 1b
Therapy (treatment efficacious/effective/harmful): Individual RCT (with narrow Confidence Interval‡)
Therapy (drug superior to another in its class): Within a head-to-head RCT with clinically important outcomes
Prognosis: Individual inception cohort study with > 80% follow-up; CDR† validated in a single population
Diagnosis: Validating** cohort study with good††† reference standards; or CDR† tested within one clinical centre
Differential diagnosis/symptom prevalence study: Prospective cohort study with good follow-up****
Economic and decision analysis: Analysis based on clinically sensible costs or alternatives; systematic review(s) of the evidence; and including multi-way sensitivity analyses
Level 1c
Therapy (treatment efficacious/effective/harmful): All or none§
Prognosis: All or none case-series
Diagnosis: Absolute SpPins and SnNouts††
Differential diagnosis/symptom prevalence study: All or none case-series
Economic and decision analysis: Absolute better-value or worse-value analyses‡‡

Grade of recommendation B
Level 2a
Therapy (treatment efficacious/effective/harmful): SR (with homogeneity*) of cohort studies
Therapy (drug superior to another in its class): Within a head-to-head RCT with validated surrogate outcomes‡‡‡
Prognosis: SR (with homogeneity*) of either retrospective cohort studies or untreated control groups in RCTs
Diagnosis: SR (with homogeneity*) of Level >2 diagnostic studies
Differential diagnosis/symptom prevalence study: SR (with homogeneity*) of 2b and better studies
Economic and decision analysis: SR (with homogeneity*) of Level >2 economic studies
Level 2b
Therapy (treatment efficacious/effective/harmful): Individual cohort study (including low quality RCT; e.g., <80% follow-up)
Therapy (drug superior to another in its class): Across RCTs of different drugs v. placebo in similar or different patients with clinically important or validated surrogate outcomes
Prognosis: Retrospective cohort study or follow-up of untreated control patients in an RCT; Derivation of CDR† or validated on split-sample§§§ only
Diagnosis: Exploratory** cohort study with good††† reference standards; CDR† after derivation, or validated only on split-sample§§§ or databases
Differential diagnosis/symptom prevalence study: Retrospective cohort study, or poor follow-up
Economic and decision analysis: Analysis based on clinically sensible costs or alternatives; limited review(s) of the evidence, or single studies; and including multi-way sensitivity analyses
Level 2c
Therapy (treatment efficacious/effective/harmful): "Outcomes" Research; Ecological studies
Prognosis: "Outcomes" Research
Differential diagnosis/symptom prevalence study: Ecological studies
Economic and decision analysis: Audit or outcomes research
Level 3a
Therapy (treatment efficacious/effective/harmful): SR (with homogeneity*) of case-control studies
Therapy (drug superior to another in its class): Across subgroup analyses from RCTs of different drugs v. placebo in similar or different patients, with clinically important or validated surrogate outcome
Diagnosis: SR (with homogeneity*) of 3b and better studies
Differential diagnosis/symptom prevalence study: SR (with homogeneity*) of 3b and better studies
Economic and decision analysis: SR (with homogeneity*) of 3b and better studies
Level 3b
Therapy (treatment efficacious/effective/harmful): Individual Case-Control Study
Therapy (drug superior to another in its class): Across RCTs of different drugs v. placebo in similar or different patients but with unvalidated surrogate outcomes
Diagnosis: Non-consecutive study; or without consistently applied reference standards
Differential diagnosis/symptom prevalence study: Non-consecutive cohort study, or very limited population
Economic and decision analysis: Analysis based on limited alternatives or costs, poor quality estimates of data, but including sensitivity analyses incorporating clinically sensible variations.

Grade of recommendation C
Level 4
Therapy (treatment efficacious/effective/harmful): Case-series (and poor quality cohort and case-control studies§§)
Therapy (drug superior to another in its class): Between non-randomised studies (observational studies and administrative database research) with clinically important outcomes
Prognosis: Case-series (and poor quality prognostic studies***)
Diagnosis: Case-control study, poor or non-independent reference standard
Differential diagnosis/symptom prevalence study: Case-series or superseded reference standards
Economic and decision analysis: Analysis with no sensitivity analysis

Grade of recommendation D
Level 5
Therapy (treatment efficacious/effective/harmful): Expert opinion without explicit critical appraisal, or based on physiology, bench research or "first principles"
Therapy (drug superior to another in its class): Expert opinion without explicit critical appraisal, or based on physiology, bench research or "first principles"; or non-randomised studies with unvalidated surrogate outcomes
Prognosis: Expert opinion without explicit critical appraisal, or based on physiology, bench research or "first principles"
Diagnosis: Expert opinion without explicit critical appraisal, or based on physiology, bench research or "first principles"
Differential diagnosis/symptom prevalence study: Expert opinion without explicit critical appraisal, or based on physiology, bench research or "first principles"
Economic and decision analysis: Expert opinion without explicit critical appraisal, or based on economic theory or "first principles"
Source: Sackett DL, Straus SE, Richardson WS, Rosenberg WM, Haynes RB (2000) Evidence-based medicine: how to practice and teach EBM. Toronto: Churchill Livingstone.
1. These levels were generated in a series of iterations among members of the NHS R&D Centre for Evidence-Based Medicine (Bob Phillips, Chris Ball, Dave Sackett, Brian Haynes, Sharon Straus and Finlay McAlister).
2. Users can add a minus-sign "-" to denote a level that fails to provide a conclusive answer because of:
o EITHER a single result with a wide Confidence Interval (such that, for example, an ARR in an RCT is not statistically significant but whose confidence intervals fail to exclude clinically important benefit or harm)
o OR a Systematic Review with troublesome (and statistically significant) heterogeneity.
3. Grades of recommendation are shown as linked directly to a level of evidence. However levels speak only of the validity of a study not its clinical applicability. Other factors need to be taken into account (such as cost, ease of implementation, importance of the disease) before determining a grade. Grades that are currently in the guides link closely to the validity of the evidence - these will change over time to reflect better concerns that we highlight in the text of the guide or related CATs.
Notes
* By homogeneity we mean a systematic review that is free of worrisome variations (heterogeneity) in the directions and degrees of results between individual studies. Not all systematic reviews with statistically significant heterogeneity need be worrisome, and not all worrisome heterogeneity need be statistically significant. As noted above, studies displaying worrisome heterogeneity should be tagged with a "-" at the end of their designated level.
† Clinical Decision Rule. (These are algorithms or scoring systems which lead to a prognostic estimation or a diagnostic category)
‡ See comment #2 for advice on how to understand, rate and use trials or other studies with wide confidence intervals.
§ Met when all patients died before the Rx became available, but some now survive on it; or when some patients died before the Rx became available, but none now die on it.
§§ By poor quality cohort study we mean one that failed to clearly define comparison groups and/or failed to measure exposures and outcomes in the same (preferably blinded), objective way in both exposed and non-exposed individuals and/or failed to identify or appropriately control known confounders and/or failed to carry out a sufficiently long and complete follow-up of patients. By poor quality case-control study we mean one that failed to clearly define comparison groups and/or failed to measure exposures and outcomes in the same (preferably blinded), objective way in both cases and controls and/or failed to identify or appropriately control known confounders.
§§§ Split-sample validation is achieved by collecting all the information in a single tranche, then artificially dividing this into "derivation" and "validation" samples.
†† An "Absolute SpPin" is a diagnostic finding whose Specificity is so high that a Positive result rules-in the diagnosis. An "Absolute SnNout" is a diagnostic finding whose Sensitivity is so high that a Negative result rules-out the diagnosis.
‡‡ Better-value treatments are clearly as good but cheaper, or better at the same or reduced cost. Worse-value treatments are as good and more expensive, or worse and equally or more expensive.
††† Good reference standards are independent of the test, and applied blindly or objectively to all patients. Poor reference standards are haphazardly applied, but still independent of the test. Use of a non-independent reference standard (where the 'test' is included in the 'reference', or where the 'testing' affects the 'reference') implies a level 4 study.
** Validating studies test the quality of a specific diagnostic test, based on prior evidence. An exploratory study collects information and trawls the data (e.g. using a regression analysis) to find which factors are 'significant'.
*** By poor quality prognostic cohort study we mean one in which sampling was biased in favour of patients who already had the target outcome, or the measurement of outcomes was accomplished in <80% of study patients, or outcomes were determined in an unblinded, non-objective way, or there was no correction for confounding factors.
**** Good follow-up in a differential diagnosis study is >80%, with adequate time for alternative diagnoses to emerge (eg 1-6 months acute, 1 - 5 years chronic)
‡‡‡ Surrogate outcomes are considered validated only when the relationship between the surrogate outcome and the clinically important outcomes has been established in long-term RCTs.