UvA-DARE is a service provided by the library of the University of Amsterdam (http://dare.uva.nl)
UvA-DARE (Digital Academic Repository)
Morbidity after lymph node dissection in patients with cancer: Incidence, risk factors, and prevention
Stuiver, M.M.
Citation for published version (APA): Stuiver, M. M. (2014). Morbidity after lymph node dissection in patients with cancer: Incidence, risk factors, and prevention.
CHAPTER 7
Psychometric properties of three patient reported outcome measures for the assessment of shoulder disability after neck dissection
Martijn M. Stuiver MSc1,2, Marieke R. ten Tusscher MSc1, Anita van Opzeeland, PT3, Wim Brendeke, PT4,
Robert Lindeboom PhD2, Pieter U. Dijkstra PhD5, Neil K. Aaronson PhD 6.
1. Department of Physiotherapy, The Netherlands Cancer Institute, Amsterdam, The Netherlands
2. Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical
Centre, University of Amsterdam, Amsterdam, The Netherlands
3. Department of Physiotherapy, Medical Centre Leeuwarden, Leeuwarden, The Netherlands
4. Department of Physiotherapy, Rijnstate Hospital, Arnhem, The Netherlands
5. University of Groningen, University Medical Centre Groningen, Department of Rehabilitation and
Department of Oral and Maxillofacial Surgery, Groningen, the Netherlands
6. Division of Psychosocial Research and Epidemiology, The Netherlands Cancer Institute,
Amsterdam, The Netherlands
Submitted
ABSTRACT
Background
Patient-reported outcome measures evaluating shoulder disability after neck dissection (ND) have
not been sufficiently validated. We assessed the psychometric properties of the Shoulder Disability
Questionnaire (SDQ), Neck Dissection Impairment Index (NDII) and the Shoulder Pain and Disability
Index (SPADI) in patients after ND.
Methods
A total of 107 patients completed the SDQ, NDII and SPADI on 4 occasions over 6 months, and underwent
physical examination. We assessed internal consistency, test-retest reliability, clinical and construct
validity, and responsiveness to change. The possibility of combining the NDII and SPADI items into
a single scale was explored by Rasch-analysis.
Results
All questionnaires exhibited good reliability and validity. We were successful in fitting a Rasch model
to the data.
Conclusion
The results support the suitability of the SDQ, NDII and the SPADI for use in ND patients. Combining
the SPADI and NDII in a single Rasch-scale improves item difficulty distribution, but reduces variability
and discriminative ability.
Introduction
Shoulder complaints such as pain and restricted range of motion are well known sequelae of neck
dissection (ND) for head and neck cancer 1. Shoulder complaints can impact negatively on daily
activities, and can compromise the patient’s health-related quality of life 2-4. The prevalence of
shoulder disability after ND ranges from 20% after selective neck dissection (SND) to 77% after radical
neck dissection (RND), although there is considerable variability across studies 1,5,6.
Patient-reported outcome measures (PROMs) are used in research and clinical practice to quantify
the subjective shoulder complaints resulting from neck dissection. A number of PROMs are currently
available to assess shoulder complaints, but their psychometric properties for use in ND populations
have been insufficiently established. This complicates the interpretation of study findings and may
also account, in part, for the variability observed in reports of the prevalence of ND-related shoulder
complaints and disability 1,7.
The University of Washington Quality of Life questionnaire (UW-QOL) 8-11, the Shoulder Disability
Questionnaire (SDQ) 2,5,8,9, the Shoulder Pain and Disability Index (SPADI) 12,13, and the Neck Dissection
Impairment Index (NDII) 8,9,13-16 are the most commonly used PROMs for assessing ND-related shoulder
complaints 1. The UW-QOL is a head-and-neck cancer specific questionnaire that includes a single
item regarding shoulder function 17. Although this may suffice for screening purposes 8, the lack of
detail limits its usefulness in evaluating changes in shoulder complaints over time in the context of
prevention or treatment trials. The SDQ and SPADI were developed for use in patients with general
shoulder pathology 18-20. Although both questionnaires have exhibited good psychometric properties
when used in various clinical populations 21-23, neither questionnaire has been validated for use in a
ND population. The NDII was developed specifically to assess the disability and quality of life impact
of neck dissection. Preliminary data supported its validity in a single, small cross-sectional study 14.
Although the SDQ, SPADI and the NDII all assess shoulder complaints, they do so in different ways. The
SDQ reflects primarily the International Classification of Functioning, Disability and Health (ICF) domain
of physical function. The SPADI also includes items related to activity restriction, although these are
limited to non-complex activities such as reaching for and carrying objects. While the SDQ and SPADI
are more comprehensive in the assessment of shoulder pain, the NDII includes assessment of more
complex activities such as adverse changes in overall activity level. Additionally, it contains items relating
to the ICF-domain of social participation, such as the ability to work and to engage in social and
recreational activities.
The primary aim of our study was to conduct a comprehensive evaluation of the psychometric
properties of the SDQ, the SPADI and the NDII when used in patients who have undergone a neck
dissection (ND), including reliability, validity and responsiveness to change over time. Additionally,
we were interested in determining, with the use of item response theory analysis, the extent to
which it is empirically justifiable to combine the items of the SPADI and the NDII into a single, more
comprehensive measure of shoulder complaints, and to evaluate the psychometric properties of such
a combined measure.
Methods
Setting and patients
We recruited patients consecutively from three specialized head and neck cancer (HNC) centres in the Netherlands:
The Netherlands Cancer Institute, the Medical Centre Leeuwarden and the Rijnstate Hospital Arnhem.
Their treating physiotherapist recruited patients during regular outpatient control visits. All patients
provided written informed consent. The medical ethics committees of the participating hospitals
approved the study. Patients were eligible for the study if they had undergone a neck dissection
1 to 3 months earlier as part of their treatment for HNC and were aged 18 years or older. Exclusion
criteria included: lack of basic written and oral command of the Dutch language; serious psychiatric
or cognitive problems that would preclude completion of self-report questionnaires; prior serious
shoulder complaints unrelated to the neck dissection (e.g., due to orthopaedic or rheumatoid
disorders); or accessory nerve damage prior to the neck dissection.
Sociodemographic and medical characteristics
We collected age, gender, height and weight, primary tumour, type and extent of neck dissection,
(neo)adjuvant medical treatment (radiotherapy, chemoradiation, chemotherapy) of the neck from the
medical record. Highest level of education, current profession and leisure activities involving use of
the arm or neck on the operated side were collected through self-report.
PROMs
The SDQ is a 16 item scale with three response categories (yes/no/not applicable). Sum scores are
calculated as the percentage of applicable items that are endorsed 18. The SPADI is a 13 item scale
which uses a 0-10 numerical rating scale 20. For both the SDQ and SPADI, higher scores indicate
more complaints. The NDII contains 10 items, with 5 response options with verbal anchors ranging
from ‘not at all’ to ‘a lot’. Higher scores indicate fewer complaints 14. A Dutch translation of the
NDII was not available. Therefore, we performed a standard forward-backward translation procedure.
The provisional Dutch version of the NDII was then pilot tested in a small sample of 10 patients
to evaluate clarity. This led to minor rephrasing of three questions, after which the Dutch NDII was
considered fit for further psychometric evaluation.
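As an illustration of the SDQ scoring rule described above (percentage of applicable items that are endorsed), a minimal Python sketch is given here. The function name `sdq_score` and the response coding ('yes', 'no', 'na') are assumptions made for this example; they are not taken from the questionnaire documentation, and the handling of a questionnaire without any applicable item is likewise an assumption.

```python
def sdq_score(responses):
    """Return the SDQ score (0-100): percentage of applicable items answered 'yes'.

    `responses` is a list of 'yes', 'no' or 'na' (not applicable) strings
    (hypothetical coding). Returns None when no item is applicable, because
    the percentage is then undefined (assumption of this sketch).
    """
    applicable = [r for r in responses if r in ("yes", "no")]
    if not applicable:
        return None
    endorsed = sum(1 for r in applicable if r == "yes")
    return 100.0 * endorsed / len(applicable)


# Hypothetical patient: 16 items, 4 not applicable, 3 of the remaining 12 endorsed.
example = ["yes"] * 3 + ["no"] * 9 + ["na"] * 4
print(sdq_score(example))  # 25.0
```

Because the denominator is the number of applicable items, two patients with the same number of endorsed items can receive different scores.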
The SDQ, NDII and SPADI were administered at four time points: (T1) during a follow up visit 1-3
months after surgery; (T2) within 7 days after T1; (T3) and (T4) during regular medical control visits
approximately 3 and 6 months after T1. At T3 and T4, patients were asked to indicate whether or not
they had a need for rehabilitation treatment of their shoulder.
We assessed health-related quality of life at all time points except T2 with the Dutch language version
of the RAND 36-item Health Survey (RAND-36) 24,25, a generic 36 item questionnaire that has been
used in previous studies addressing shoulder morbidity in Dutch HNC centres 2,4. The RAND-36
includes 9 scales assessing physical functioning, social functioning, role limitations due to physical
problems, role limitations due to emotional problems, mental health, vitality, bodily pain, general
health perception and health change.
Figure 1 summarizes the measurements taken at all timepoints.
Physical examination
At all time points except for T2, we measured active range of motion (AROM) for abduction using
an inclinometer according to a standardized protocol and assessed the presence of pain (yes/no)
on passive external rotation of the shoulder. AROM for abduction is indicative of accessory nerve
dysfunction 26 and, like pain on external rotation, is a predictor for shoulder disability 2.
Statistical analysis
Statistical analyses were performed using R, version 2.15.2 (R Core Team, Vienna, Austria) 27, and OPLM
(CITO, Arnhem, the Netherlands) 28.
Figure 1 Study flow
Descriptive statistics
We generated descriptive statistics (frequency and percentage, mean and standard deviation or
median and range, as appropriate) for sociodemographic and medical variables. We calculated sum
scores for the SDQ, SPADI and NDII, and linearly transformed all scores to obtain a 0-100 score range,
maintaining the original scoring directions. For all follow up points, we calculated summary statistics
(mean, standard deviation, median, minimum and maximum) per scale, as well as floor and ceiling
effects (expressed as the proportion of patients with the best and worst possible score, respectively). We used
mean imputation for patients with fewer than 50% of items missing on a questionnaire, and excluded
patients with more than 50% of items missing.
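The scoring steps described above can be sketched as follows. This is a minimal illustration, not the study code: the function names are hypothetical, the imputation is assumed to use the mean of the patient's own observed items (the text only says "mean imputation"), and questionnaires with exactly 50% missing items are assumed to be excluded. Following the footnote to Table 3, the floor effect is taken as the percentage of patients with the best possible score.

```python
def transformed_score(items, item_min, item_max):
    """Sum score rescaled to 0-100, or None if too many items are missing.

    `items` holds item scores, with None marking a missing response.
    Assumption: missing items are replaced by the mean of the patient's own
    observed items. Assumption: exactly 50% missing leads to exclusion.
    """
    observed = [x for x in items if x is not None]
    n_missing = len(items) - len(observed)
    if n_missing * 2 >= len(items):
        return None
    person_mean = sum(observed) / len(observed)
    filled = [person_mean if x is None else x for x in items]
    lowest = item_min * len(items)
    highest = item_max * len(items)
    return 100.0 * (sum(filled) - lowest) / (highest - lowest)


def floor_ceiling(scores, best, worst):
    """Percentage of valid scores at the best (floor) and worst (ceiling) value."""
    valid = [s for s in scores if s is not None]
    floor = 100.0 * sum(1 for s in valid if s == best) / len(valid)
    ceiling = 100.0 * sum(1 for s in valid if s == worst) / len(valid)
    return floor, ceiling


# Hypothetical 13-item questionnaire scored 0-10 per item (as on the SPADI).
patient = [0, 2, None, 4, 1, 0, 0, 3, None, 5, 0, 0, 2]
print(round(transformed_score(patient, 0, 10), 1))  # 15.5
print(floor_ceiling([0.0, 0.0, 12.5, 100.0, None], best=0.0, worst=100.0))  # (50.0, 25.0)
```

For a scale such as the NDII, where higher scores indicate fewer complaints, `best` would be 100 and `worst` would be 0.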
Item Response Theory Scaling
One of the objectives of the study was to examine, by means of Item Response Theory (IRT) analysis,
the possibility of combining the NDII and SPADI items into a single scale.
Rasch and related IRT based models estimate ‘item difficulty’ of individual items, together with person
(dis)ability on a common logit scale. This enables visualization of item difficulty along the continuum
of a construct to detect gaps and redundancies in difficulty of items. Also, it provides a meaningful
ordering of the items, which enhances clinical interpretation.
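For reference, the dichotomous Rasch (one-parameter logistic) model referred to here is commonly written as below. The symbols are notation chosen for this illustration and do not appear in the original text: \theta_p denotes the (dis)ability of person p, b_i the difficulty of item i, and X_{pi} the dichotomous response of person p to item i.

```latex
P(X_{pi} = 1 \mid \theta_p, b_i) = \frac{\exp(\theta_p - b_i)}{1 + \exp(\theta_p - b_i)}
```

When \theta_p = b_i, the probability of endorsing the item is 0.5. Because persons and items are expressed on the same logit scale, both can be placed along a single continuum.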
Rasch models have been applied to improve questionnaires in many settings, including the assessment
of quality of life and mood in cancer survivors 29,30. Using a Rasch model has important benefits, as
it results in a scale with true metric properties. Also, the resulting scale is considered ‘person free’,
meaning that the observed measurement properties hold true for other populations as well 31. In
Rasch analysis, item fit tests can be used to evaluate the appropriateness of the item response scales,
i.e., whether item score categories should be collapsed before summation. Rasch analysis assumes, and
tests, the unidimensionality of a scale. After exploratory factor analysis and inspection of the scree
plot, we performed Rasch analysis on the combined NDII and SPADI items with the OPLM software
package. We estimated item difficulty locations, assessed the extent of item difficulty coverage along
the continuum of subjective shoulder disability, and tested the fit of the combined items to the
unidimensional Rasch model.
The SDQ was not included in this analysis, since it contains a ‘not applicable’ response option that
prohibits meaningful dichotomization of responses, which is a prerequisite for Rasch analysis. Data
from all time points were used in the analysis.
Item difficulties were estimated using conditional maximum likelihood estimation in a one parameter
logistic model (Rasch). Because this method makes no assumptions on the distribution of data in
the sample or about the way the sample is selected, it accommodates the use of dependent
observations 32. The Rasch analysis consisted of two parts. First, we examined the appropriateness
of the rating scale of each item of the SPADI and NDII in OPLM and collapsed disordered rating step
categories. Second, we fitted the data to the one parameter logistic model with the collapsed rating
categories. Fit of the items to the unidimensional model was tested using item-oriented fit
statistics, so-called M-tests, that evaluate deviations between observed and expected frequencies of item
scores for shoulder patients. M-test values follow a t-distribution, and values between -2 and +2
indicate fit for an item 28. Overall fit of the combined scales to the unidimensional Rasch model was
examined using the R1c statistic, whose P-value should exceed 0.05 to accept the model for the
data 28. We then calculated absolute agreement between expected and observed item scores, conditional
on the sum score. We plotted the item difficulty locations to identify gaps and redundancies
and to assess the extent to which measurement sensitivity and comprehensiveness could be improved
by combining both questionnaires.
Additionally, we evaluated the combined scale alongside the original PROMS using classical test
theory, as described below. For this purpose a sum score was calculated by summing the item scores
as used in the IRT analysis, and a linear transformation was employed to obtain a 0-100 score, with
higher scores indicating more complaints.
Reliability
We calculated intraclass correlation coefficients (ICC(2,1)) 31 for all scales, using the T1 and T2 measurements
to assess test-retest reliability, and Cronbach’s alpha coefficient 31 on the T1 data
to estimate internal consistency.
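The two reliability coefficients named above can be computed from first principles, as in the sketch below. It assumes complete data and uses the standard textbook formulas for Cronbach's alpha and for the two-way random effects, absolute agreement, single-measurement ICC. The function names and the small data sets are illustrative only and are not study data. The study analyses were run in R, so this Python version is a stand-in.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-patient item-score lists."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_2_1(rows):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    Each row holds one patient's scores on the k occasions (e.g. T1 and T2).
    """
    n, k = len(rows), len(rows[0])
    grand = mean([x for row in rows for x in row])
    row_means = [mean(row) for row in rows]
    col_means = [mean([row[j] for row in rows]) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative data: 6 subjects measured on 4 occasions.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(ratings), 2))  # 0.29

# Illustrative item scores: 3 patients, 2 perfectly consistent items.
print(cronbach_alpha([[1, 2], [2, 3], [3, 4]]))  # 1.0
```

Because ICC(2,1) measures absolute agreement, a systematic shift between T1 and T2 lowers the coefficient even when patients keep the same rank order.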
Clinical validity
To assess clinical validity, we calculated the area under the curve (AUC) of the Receiver Operating
Characteristic curve (ROC-curve) for the SDQ, SPADI, NDII, and the combined scale, with the patients’
self-reported need for rehabilitation as the criterion. The AUC reflects the probability that the
questionnaires correctly classify patients as having a self-reported need for shoulder rehabilitation.
Known groups validity is also an aspect of clinical validity 31. To establish this property of each of the
scales, between group comparisons of median scores were made for several subgroups of patients
based on T1 data: patients with RND or modified RND (level 1-5 dissection) versus SND; patients
with AROM for abduction ≥90° versus <90°; and patients who had shoulder pain on external rotation
versus patients who did not. Previous research has demonstrated that shoulder disability differs significantly
between these subgroups 2,33.
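One common way to read the AUC is as the probability that a randomly chosen patient with a self-reported need for rehabilitation has a worse score than a randomly chosen patient without such a need, with ties counted as one half. The sketch below computes the AUC directly from that definition. The function name and the scores are hypothetical, and the sketch assumes that higher scores indicate more complaints. NDII scores, where higher means fewer complaints, would have to be reversed first.

```python
def auc(scores_with_need, scores_without_need):
    """AUC as the proportion of (need, no need) pairs in which the patient
    with a need for treatment has the higher score; ties count as one half."""
    wins = 0.0
    for a in scores_with_need:
        for b in scores_without_need:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(scores_with_need) * len(scores_without_need))


# Hypothetical 0-100 scores where higher means more complaints.
need = [60, 45, 80, 30]
no_need = [10, 30, 0, 25, 50]
print(auc(need, no_need))  # 0.875
```

An AUC of 0.5 corresponds to no discrimination and an AUC of 1.0 to perfect separation of the two groups.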
Construct (convergent and divergent) validity
We assessed convergent and divergent correlations between each questionnaire and the RAND-36
domains as well as shoulder range of motion. Also, correlations between all questionnaires were
calculated. For this purpose, we constructed univariable linear multilevel models, with random
intercepts per patient to account for the repeated measurements. All variables (SDQ, NDII, SPADI,
combined scale, ROM for shoulder abduction and scores on each of the RAND-36-domains) were
centred and scaled by subtracting the mean and dividing by the standard deviation, to obtain
Beta coefficients equal to the correlation coefficient. We expected moderate to high correlations
(r>0.40) 34 of the PROMs with AROM for shoulder abduction and the RAND-36 domains physical
functioning, role functioning-physical and bodily pain, moderate correlations (0.3< r <0.5) 34 with
social functioning, and small correlations (r<0.3) 34 with role functioning-emotional, mental health,
energy, general health perception and health change. Additionally, we generated a plot representing
the mean scores of SDQ, NDII, SPADI and the combined scale over time in relation to ROM for
shoulder abduction to visually assess responsiveness to change of the scales compared to an external
reference measure.
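The reason for centring and scaling all variables can be illustrated with a simplified case. In an ordinary least squares regression with one predictor, the slope between two standardized variables equals their Pearson correlation. The sketch below demonstrates this on hypothetical data. It deliberately omits the random intercept per patient that the study's multilevel models included, so it shows the principle rather than the actual model.

```python
from statistics import mean, stdev


def standardize(values):
    """Centre on the mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def ols_slope(x, y):
    """Slope of the least squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: abduction AROM in degrees and a 0-100 complaint score.
abduction = [170, 150, 90, 60, 120, 100]
score = [5, 10, 40, 70, 20, 45]

beta = ols_slope(standardize(abduction), standardize(score))
print(abs(beta - pearson_r(abduction, score)) < 1e-9)  # True
```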
Results
We enrolled 107 patients in the study. Characteristics of the sample are shown in Table 1. Ninety-two
patients (86%) returned their T2 questionnaires. T2 questionnaires that were completed and returned
later than 8 days after T1 were excluded from the test-retest analyses (n = 32). Additionally, some
questionnaires contained too much missing data and were therefore excluded from the analysis,
leaving between 54 and 58 evaluable patients available for the T2 measurement (Table 3). Patients
who did not return their questionnaire (in time) were, on average, 6 years younger and 2 weeks closer
to the date of surgery than patients who did.
Eighty-eight patients (82%) completed the T3 questionnaires and 82 (77%) the T4 questionnaires.
Number of available questionnaires, mean time since surgery, and reasons for loss to follow-up are
depicted in Figure 1. Not all patients returned fully completed questionnaires. Missing data on the
questionnaires was < 10% at all time points, except for the SPADI at T1 (14% missing).
Table 1 Descriptive statistics of the study sample
Characteristic Frequency Percent
Total number of participants 107 100
Male 78 73
Median age (min-max) 62 (31-83)
Median BMI (min-max) 25.6 (15.2-42.6)
Localisation primary tumour
Larynx/pharynx 7 6
Oropharynx/ tongue 43 40
Salivary glands 11 10
Skin/ lip 34 32
Other 12 12
T classification
Carcinoma in situ 1 <1
1 20 19
2 25 23
3 12 11
4 6 6
x 17 16
unknown* 26 24
N classification
0 39 36
1 20 19
2 16 15
x 6 6
unknown* 26 24
Surgical procedure
Radical (modified) neck dissection 33 31
Accessory nerve sacrificed 10 10
Sternocleidomastoid muscle sacrificed 40 37
Internal jugular vein sacrificed 32 30
Radiotherapy/chemotherapy
Neoadjuvant radiotherapy 11 10
Neoadjuvant chemoradiation 2 2
Adjuvant chemotherapy 2 2
Currently on chemotherapy 1 <1
Adjuvant radiotherapy 46 43
Currently on radiotherapy treatment 18 17
Education†
Elementary school 11 10
Secondary school (high school) 15 14
Vocational education 43 40
Higher vocational education (B) 22 21
University 13 12
Employment‡
None (retired, unemployed) 46 43
Desk job 35 32
Light physical work 8 8
Moderate to heavy physical work 6 6
Homemaker 4 4
Leisure activities involving use of arm/neck§
No relevant activities reported 37 35
Sports/ exercise 39 36
Handicrafts (carpentry etc.) 17 16
Gardening 8 8
Community work 3 3
Musician 1 1
Other 2 2
* For patients who had previously been treated elsewhere, no TN classification is available
† Level of education is missing for 2 persons
‡ Employment is missing for 8 persons
§ If participants were active in more than one category, the most strenuous activity category is reported
Item Response Theory analysis
For the IRT analysis, only complete data from all time points were used and patients with zero-scores
on all items were excluded. Thus, a total of 292 observations were included in the analysis.
Visual inspection of the scree plot suggested unidimensionality of the combined items. Also, the first
factor explained 60% of the variance, and although the second added another 9% explained variance,
the correlation between the two factors was >0.90 at all time points. We considered this sufficient
evidence to proceed with the Rasch analysis.
Rating scale analysis showed disordered rating scale step categories for all items, which was resolved
by dichotomising the rating scales of the SPADI and the NDII. We recoded SPADI item scores <4 as
0 and scores ≥ 4 as 1. For the NDII, we recoded the categories ‘not at all’ and ‘a little’ as 0 and all
higher scores (‘moderate’ to ‘very much’) as 1. After dichotomisation, a Rasch-type model could be
fitted, with an R1c of 71.5 with 66 degrees of freedom (p=0.30), indicating good model fit. Absolute
agreement of expected and observed item scores, conditional on the sum score, ranged between
78% and 96% (median 88%). The overall ICC between expected and observed scores was 0.996.
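The recoding rule described above, together with the 0-100 sum score of the combined scale, can be sketched as follows. The function names are hypothetical. The numeric coding of the NDII response options (1 = 'not at all', 2 = 'a little', higher values indicating more complaints) is an assumption made for this example and may differ from the coding used in the study.

```python
def dichotomise_spadi(score):
    """SPADI item (0-10 numerical rating): below 4 becomes 0, 4 or higher becomes 1."""
    return 1 if score >= 4 else 0


def dichotomise_ndii(category):
    """NDII item: 'not at all' and 'a little' become 0, all higher options become 1.

    Assumption: options are coded 1-5 in the direction of increasing complaints,
    so that 1 = 'not at all' and 2 = 'a little'.
    """
    return 0 if category <= 2 else 1


def combined_score(spadi_items, ndii_items):
    """Sum of the dichotomised items rescaled to 0-100 (higher = more complaints)."""
    binary = [dichotomise_spadi(s) for s in spadi_items]
    binary += [dichotomise_ndii(c) for c in ndii_items]
    return 100.0 * sum(binary) / len(binary)


# Hypothetical patient: 13 SPADI items and 10 NDII items.
spadi = [5, 3, 4, 0, 0, 2, 6, 1, 0, 0, 7, 8, 0]
ndii = [3, 2, 1, 1, 4, 2, 1, 1, 2, 3]
print(round(combined_score(spadi, ndii), 1))  # 34.8
```

In this example 8 of the 23 items are scored 1 after recoding, which gives 100 × 8 / 23 ≈ 34.8.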
Figure 2 displays the item difficulty spread of separate and combined SPADI and NDII items. From this
plot it is apparent that the NDII and SPADI both have gaps in item difficulty coverage, which can be
resolved by combining the two scales. There was also some overlap, with 4 items having equal item
difficulty. Table 2 shows the item content ordered by item difficulty.
Figure 2 Item difficulty on a logit scale for the Shoulder Pain and Disability Index (SPADI), the Neck Dissection Impairment Index (NDII) and the combined scale. Dots represent scale items and are stacked in case of equal item difficulty.
Table 2 Item order (easy to difficult) and the problems addressed in the combined questionnaire using dichotomized responses. The italicized items have equal difficulty.
Item* Problem queried
N5 Limitations with lifting heavy objects.
S12 Difficulty with carrying heavy objects of 10 pounds (5 kg)
S11 Difficulty with placing an object on a high shelf
S1 Pain at its worst > 3 on Numeric Rating Scale
S3 Pain when reaching for something on a high shelf
N2 Bothered by stiffness in neck or shoulder
N1 Pain or discomfort of the neck or shoulder
N10 Limitations with work (including work at home)
N6 Limitations reaching up to kitchen top level
S7 Difficulty with washing the back
N9 Limitations in leisure time activities
N7 Diminished overall activity level
S2 Pain when lying on the involved side
S8 Difficulty with putting on an undershirt or jumper
S5 Pain while pushing with involved arm
S4 Pain when touching the back of the neck
S6 Difficulty with washing hair
S9 Difficulty with putting on a front buttoned shirt
N3 Difficulty with self care
S13 Difficulty with removing something from back pocket
N4 Limitations with lifting light objects
N8 Diminished participation in social activities
S10 Difficulty with putting on trousers
* Letters and numbers correspond to the original scale and item (N= Neck Dissection Impairment Index, S= Shoulder Pain and Disability Index).
Classical Test Theory analysis
Reliability, floor and ceiling effects
All questionnaires exhibited good to excellent internal consistency and test-retest reliability, with
Cronbach’s alpha ranging from 0.91 to 0.96 and ICC(2,1) from 0.84 to 0.93. The NDII exhibited the fewest
floor effects, followed by the SPADI, SDQ and the combined scale. Floor effects increased with follow
up time and ranged up to 56% in the combined scale at T4. Some ceiling effects were present for
the SDQ and the combined scale. The number of valid questionnaires, reliability statistics, scores
summary statistics and floor/ceiling effects at all time points are shown in Table 3.
Table 3 Number of valid questionnaires, reliability and descriptive statistics for the questionnaires at all time points.
| Instrument* | Time point | N valid | Alpha† | ICC(2,1) (95%CI)‡ | P | Mean | SD | Median | Min | Max | Floor effect§ | Ceiling effect§ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SDQ | T1 | 103 | 0.91 | | | 33 | 28.4 | 31 | 0 | 100 | 19 | 3 |
| SDQ | T2 | 58 | | 0.84 (0.74–0.90) | <0.001 | 27 | 28.0 | 25 | 0 | 100 | 23 | 1 |
| SDQ | T3 | 87 | | | | 27 | 26.2 | 20 | 0 | 100 | 25 | 1 |
| SDQ | T4 | 77 | | | | 14 | 18.9 | 0 | 0 | 81 | 40 | 0 |
| SPADI | T1 | 92 | 0.96 | | | 23 | 21.8 | 17 | 0 | 80 | 16 | 0 |
| SPADI | T2 | 56 | | 0.91 (0.85–0.95) | <0.001 | 23 | 22.9 | 16 | 0 | 82 | 20 | 0 |
| SPADI | T3 | 85 | | | | 18 | 18.9 | 12 | 0 | 77 | 19 | 0 |
| SPADI | T4 | 76 | | | | 12 | 17.7 | 3 | 0 | 86 | 33 | 0 |
| NDII | T1 | 101 | 0.94 | | | 73 | 21.0 | 78 | 10 | 100 | 5 | 0 |
| NDII | T2 | 54 | | 0.93 (0.87–0.96) | <0.001 | 75 | 21.2 | 78 | 10 | 100 | 11 | 0 |
| NDII | T3 | 85 | | | | 80 | 19.1 | 85 | 8 | 100 | 13 | 0 |
| NDII | T4 | 76 | | | | 87 | 12.7 | 90 | 43 | 100 | 20 | 0 |
| Combined scale | T1 | 104 | 0.94 | | | 28 | 28.8 | 17 | 0 | 100 | 26 | 2 |
| Combined scale | T2 | 56 | | 0.90 (0.84–0.94) | <0.001 | 25 | 29.3 | 9 | 0 | 91 | 39 | 0 |
| Combined scale | T3 | 88 | | | | 19 | 26.3 | 7 | 0 | 100 | 42 | 1 |
| Combined scale | T4 | 78 | | | | 11 | 19.6 | 0 | 0 | 87 | 56 | 0 |
* Original item scores were used for calculating sum scores of the individual instruments, and dichotomized item scores for the combined scale. Lower scores indicate less disability on the Shoulder Pain and Disability Index (SPADI), Shoulder Disability Questionnaire (SDQ) and combined score, and higher disability on the Neck Dissection Impairment Index (NDII).
† Cronbach’s alpha as calculated on T1 data
‡ Test-retest reliability between T1 and T2
§ Floor and ceiling effects are expressed as the percentage of respondents with, respectively, the best and worst possible score.
Clinical validity
There were no statistically significant differences between the questionnaires in the ability to discriminate
between patients with or without a self-reported need for treatment. At T3 and T4, 29 and
17 patients, respectively, expressed a need for treatment. At T3, the area under the receiver-operating
characteristic curve (AUC) was 0.85 (95%CI 0.78 – 0.94) for the SDQ, 0.85 (95%CI 0.77 – 0.94) for
the SPADI, 0.85 (95%CI 0.77 – 0.94) for the NDII and 0.79 (95%CI 0.69 – 0.90) for the combined scale. At T4,
the discriminatory ability was less for all scales, with an AUC of 0.77 (95%CI 0.63 – 0.91) for the SDQ,
0.71 (95%CI 0.57 – 0.86) for the SPADI, 0.74 (95%CI 0.58 – 0.90) for the NDII and 0.72 (95%CI 0.57 –
0.87) for the combined scale.
Known groups comparison
Median scores on the SDQ, SPADI, NDII and the combined scale differed in the expected direction
between all known groups (Table 4). All differences were statistically significant at the 0.05 level, with
the exception of the comparisons between R(M)ND and SND, where only NDII score differences were
significant.
Convergent correlations, divergent correlations and responsiveness to change
Convergent and divergent correlations with the RAND-36 domains and objectively measured shoulder
function were as expected (Table 5). Visual assessment of change in mean scores of the SDQ, NDII,
SPADI and the combined scale showed a strong association over time with change of shoulder AROM
for abduction, demonstrating their responsiveness to change (Figure 3).
Table 4 Known group comparisons at T1*
| Scale | R(M)ND | SND | Z | p | AROM abduction <90° | AROM abduction >90° | Z | p | Pain at passive external rotation: yes | Pain at passive external rotation: no | Z | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of patients | 33 | 74 | | | 68 | 39 | | | 22 | 85 | | |
| SDQ† | 38 | 28 | -0.9 | 0.17 | 40 | 6 | -5.5 | <0.01 | 52 | 19 | -3.3 | <0.01 |
| SPADI‡ | 22 | 13 | -1.4 | 0.08 | 22 | 4 | -4.2 | <0.01 | 32 | 13 | -3.5 | <0.01 |
| NDII§ | 70 | 80 | 1.8 | 0.03 | 68 | 86 | 4.7 | <0.01 | 68 | 80 | 2.5 | <0.01 |
| Combined scale | 26 | 13 | -1.3 | 0.10 | 30 | 4 | -4.7 | <0.01 | 52 | 13 | -3.6 | <0.01 |
* Listed are number of patients per subgroup, median scores, and Z-scores and corresponding p-values from a Mann-Whitney U test with continuity correction.
† Shoulder Disability Questionnaire; higher scores indicate more disability
‡ Shoulder Pain and Disability Index; higher scores indicate more disability
§ Neck Dissection Impairment Index; higher scores indicate less disability
Figure 3 Mean scores over time for Shoulder Disability Questionnaire (SDQ), Shoulder Pain and Disability Index (SPADI), Neck Dissection Impairment Index (NDII) and the combined scale. For ease of interpretation an inverse score is used for the NDII (lower score indicating fewer complaints), and the Y-axis for shoulder abduction is reversed (descending line for Active Range of Motion (AROM) reflects improved shoulder function). T2 data are omitted because AROM was not measured at T2.
Discussion
Our results provide support for the reliability and validity of the SDQ, SPADI and the NDII for
assessing shoulder complaints after neck dissection, with the SPADI and NDII exhibiting the highest
(and comparable) reliability. While the SPADI provides more detail on pain, the NDII would be the
obvious choice if aspects of activity and social participation are of interest. Also, the NDII was
the only scale that was sensitive to the type of neck dissection. In addition, the NDII exhibited the fewest
floor effects. Floor effects on all scales increased over time, as shoulder function improved in a large
number of patients. Floor effects were largest in the combined scale. Post-hoc analysis of the number
of patients with a self-expressed need for treatment among patients with a 0-score on the combined
scale, showed that this was the case for only one patient. This could indicate that the observed floor
effects appropriately reflect the absence of serious shoulder complaints.
We hypothesized that the scales could be complementary, and used Rasch analysis to explore this
possibility. In order to combine the NDII and SPADI in a Rasch model, item scores were dichotomised.
Our choice for the cut-off points of the NDII was to a certain extent arbitrary. Different cut-points
have been described for dichotomising 0-10 point numeric (pain) scales as used on the SPADI 35,36.
We also considered the often used 5-point cutoff, but that resulted in misfit of the Rasch model.
This suggests that the 4-point cutoff discriminates best between trait levels.
Dichotomising responses comes at the cost of losing variability in the data, but in return it improved
interpretability. The IRT-analysis showed that a number of items had disordered step-ratings which
was resolved by dichotomisation.
From Figure 2 it is apparent that patients with a disability score between -1 and 0 logit, and from 0.5 to
1.5 logit cannot be distinguished from one another with the SPADI. The same applies for patients with
NDII scores between -2 and -1 logit, and between 0 and 1 logit. Combining the SPADI and NDII into
a single scale resolved these gaps and resulted in a more even spread of item difficulties.
Some limitations of this study should be noted. Only 60 patients returned their T2 questionnaires within
an appropriate time window, which limited the number of observations available for the test-retest
analysis. Also, at all time points there was a substantial number of invalid questionnaires due to
missing or ambivalent responses (e.g., two response options chosen for a single item). Although this
was below 10% at most time points, it nevertheless suggests that a small percentage of patients found
it difficult to complete the questionnaires. The number of missing items could possibly be reduced
with computer-aided assessment, which can provide instant feedback on missed items. However,
considering that most patients in this population are over 60 years of age, one cannot assume that all
patients will have the requisite computer skills or have access to the internet. This should be less of
a problem with future generations of patients.
Although sufficient, the sample size in our study was relatively small, particularly for the IRT analysis.
Therefore, although our results are promising, they need to be confirmed in future studies.
Clinical validity was, in part, assessed using patients’ self-expressed need for shoulder rehabilitation
as a criterion. Although this is a clinically relevant anchor, it should be pointed out that perceived need for
treatment may be influenced by factors other than shoulder complaints alone 2.
Our initial aim was to include all questionnaires that had previously been used in studies on shoulder
complaints after neck dissection, but we chose not to include the UW-QoL 17 because of its very limited
coverage of shoulder problems (a single item). The Disabilities of the Arm, Shoulder and Hand (DASH)
questionnaire has also been used in a fairly recent study evaluating shoulder complaints after neck dissection 37,
but that study was published after enrolment for the current study had started. Hence, this scale
was not included in our study. A recent cross-sectional clinimetric study provided some evidence for
the reliability and validity of the DASH in patients following neck dissection 38.
Although our study provides information on the performance of the scales between 1 and 8 months
after neck dissection, future studies are needed to evaluate the scales when used earlier or later in
the cancer care trajectory of these patients.
Notable strengths of the study include its longitudinal design, the inclusion of shoulder range of
motion measures, and the comprehensive approach taken to psychometric evaluation, in particular,
the use of item response theory.
Conclusion
The results of this study support the suitability of the SDQ, NDII and the SPADI for assessing shoulder
complaints in individual patients after neck dissection, and for evaluating change in these complaints
over time. Combining the SPADI and NDII into a single scale and dichotomising the responses yields
a Rasch-scale which allows for true metric measurement and provides a meaningful item ordering as
well as better spread of item-difficulty along the continuum of shoulder disability, but at the expense
of lower variability and discriminative ability.
Table 5
Correlation coefficients between the shoulder questionnaires and the RAND-36* domains and shoulder abduction.

                              SDQ      SPADI    NDII     Combined scale
Shoulder questionnaires
  SDQ†                        1        0.78     -0.76    0.77
  SPADI‡                               1        -0.75    0.91
  NDII§                                         1        -0.87
RAND-36 domains
  Physical functioning        -0.45    -0.52    0.49     -0.54
  Social functioning          -0.41    -0.37    0.48     -0.44
  Role functioning physical   -0.49    -0.40    0.47     -0.43
  Role functioning emotional  -0.28    -0.23    0.31     -0.26
  Mental health               -0.28    -0.24    0.28     -0.28
  Vitality                    -0.41    -0.45    0.45     -0.47
  Pain                        -0.59    -0.55    0.65     -0.59
  Health perception           -0.19    -0.19    0.24     -0.25
  Health change               -0.33    -0.27    0.37     -0.34
Shoulder range of motion
  Abduction                   -0.56    -0.46    0.50     -0.49

* RAND 36-item Health Survey; higher scores indicate better health
† Shoulder Disability Questionnaire; higher scores indicate more disability
‡ Shoulder Pain and Disability Index; higher scores indicate more disability
§ Neck Dissection Impairment Index; higher scores indicate less disability
Acknowledgement
We would like to thank M.L. Vos, M.B. Pantlin and J.C. Chepeha for their assistance with the translation
of the NDII, and P. Venema and E.M. de Boer for patient recruitment.
REFERENCES
1. Goldstein DP, Ringash J, Bissada E, Jaquet Y, Irish J, Chepeha D, et al. Scoping review of the literature on shoulder impairments and disability after neck dissection. Head Neck 2014;36:299–308.
2. Stuiver MM, van Wilgen CP, de Boer EM, de Goede CJT, Koolstra M, van Opzeeland A, et al. Impact of shoulder complaints after neck dissection on shoulder disability and quality of life. Otolaryngol Head Neck Surg 2008;139:32–9.
3. Terrell JE, Welsh DE, Bradford CR, Chepeha DB, Esclamado RM, Hogikyan ND, et al. Pain, quality of life, and spinal accessory nerve status after neck dissection. Laryngoscope 2000;110:620–6.
4. van Wilgen CP, Dijkstra PU, van der Laan BFAM, Plukker JT, Roodenburg JLN. Shoulder and neck morbidity in quality of life after surgery for head and neck cancer. Head Neck 2004;26:839–44.
5. van Wilgen CP, Dijkstra PU, van der Laan BFAM, Plukker JTM, Roodenburg JLN. Shoulder complaints after nerve sparing neck dissections. Int J Oral Maxillofac Surg 2004;33:253–7.
6. Shone GR, Yardley MP. An audit into the incidence of handicap after unilateral radical neck dissection. J Laryngol Otol 1991;105:760–2.
7. Goldstein DP, Ringash J, Bissada E, Jaquet Y, Irish J, Chepeha D, et al. Evaluation of shoulder disability questionnaires used for the assessment of shoulder disability after neck dissection for head and neck cancer. Head Neck 2013. doi:10.1002/hed.23490
8. Rogers SN, Scott B, Lowe D. An evaluation of the shoulder domain of the University of Washington quality of life scale. Br J Oral Maxillofac Surg 2007;45:5–10.
9. Orhan KS, Demirel T, Baslo B, Orhan EK, Yücel EA, Güldiken Y, et al. Spinal accessory nerve function after neck dissections. J Laryngol Otol 2007;121:44–8.
10. Kuntz AL, Weymuller EA. Impact of neck dissection on quality of life. Laryngoscope 1999;109:1334–8.
11. Laverick S, Lowe D, Brown JS, Vaughan ED, Rogers SN. The Impact of Neck Dissection on Health-Related Quality of Life. Arch Otolaryngol Head Neck Surg 2004;130:149–54.
12. Selcuk A, Selcuk B, Bahar S, Dere H. Shoulder function in various types of neck dissection. Role of spinal accessory nerve and cervical plexus preservation. Tumori. 2008;94:36–9.
13. McNeely ML, Parliament MB, Seikaly H, Jha N, Magee DJ, Haykowsky MJ, et al. Effect of exercise on upper extremity pain and dysfunction in head and neck cancer survivors. Cancer 2008;113:214–22.
14. Taylor RJ, Chepeha JC, Teknos TN, Bradford CR, Sharma PK, Terrell JE, et al. Development and validation of the neck dissection impairment index: a quality of life measure. Arch Otolaryngol Head Neck Surg 2002;128:44–9.
15. Güldiken Y, Orhan KS, Demirel T, Ural HI, Yücel EA, Değer K. Assessment of shoulder impairment after functional neck dissection: long term results. Auris Nasus Larynx. 2005;32:387–91.
16. Murer K, Huber GF, Haile SR, Stoeckli SJ. Comparison of morbidity between sentinel node biopsy and elective neck dissection for treatment of the N0 neck in patients with oral squamous cell carcinoma. Head Neck 2011;33:1260–4.
17. Hassan SJ, Weymuller EA. Assessment of quality of life in head and neck cancer patients. Head Neck 1993;15:485–96.
18. van der Heijden GJ, Leffers P, Bouter LM. Shoulder disability questionnaire design and responsiveness of a functional status measure. J Clin Epidemiol 2000;53:29–38.
19. Elvers RI, Oostendorp RAB, N SI. The Dutch-language version of the Shoulder Pain and Disability Index (SPADI-Dutch Version) in patients after subacromial decompression according to Neer: internal consistency and construct validity. Dutch Journal of Physical Therapy 2003;113:126–31.
20. Roach KE, Budiman-Mak E, Songsiridej N, Lertratanakul Y. Development of a shoulder pain and disability index. Arthritis Care Res 1991;4:143–9.
21. Beaton D, Richards RR. Assessing the reliability and responsiveness of 5 shoulder questionnaires. J Shoulder Elbow Surg 1998;7:565–72.
22. Bot SDM, Terwee CB, van der Windt DA, Bouter LM, Dekker J, de Vet HC. Clinimetric evaluation of shoulder disability questionnaires: a systematic review of the literature. Ann Rheum Dis 2004;63:335–41.
23. van der Windt DA, van der Heijden GJ, de Winter AF, Kroes BW, Deville W, Bouter LM. The responsiveness of the shoulder disability questionnaire. Ann Rheum Dis 1998;57:82–7.
24. van der Zee KI, Sanderman R, Heyink JW, de Haes H. Psychometric qualities of the rand 36-item health survey 1.0: A multidimensional measure of general health status. Int J Behav Med 1996;3:104–22.
25. van der Zee KI, Sanderman R. Het meten van de algemene gezondheidstoestand met de RAND-36, een handleiding. 2nd ed. Groningen: Research Institute SHARE, UMCG, Groningen University; 2012.
26. Dijkstra PU, van Wilgen CP, Buijs RP, Brendeke W, de Goede CJ, Kerst A, et al. Incidence of shoulder pain after neck dissection: a clinical explorative study for risk factors. Head Neck. 2001;23:947–53.
27. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing. www.R-project.org. 2013.
28. Glas CAW, Verhelst NDG. One Parameter Logistic Model (OPLM). Arnhem, the Netherlands; 1995.
29. Smith AB, Wright P, Selby PJ, Velikova G. A Rasch and factor analysis of the Functional Assessment of Cancer Therapy-General (FACT-G). Health Qual Life Outcomes 2007;5:19.
30. Lambert S, Pallant JF, Girgis A. Rasch analysis of the Hospital Anxiety and Depression Scale among caregivers of cancer survivors: implications for its use in psycho-oncology. Psycho-Oncology 2011;20:919–25.
31. Streiner DL, Norman GR. Health Measurement Scales. New York: Oxford University Press; 2008.
32. Verhelst NDG, Glas CAW. The one parameter logistic model. In: Fischer GH, Molenaar IW, editors. Rasch Models: foundations, recent developments and applications. New York: Springer-Verlag; 1995.
33. Cheng PT, Hao SP, Lin YH, Yeh AR. Objective comparison of shoulder dysfunction after three neck dissection techniques. Ann Otol Rhinol Laryngol. 2000;109(8 Pt 1):761–6.
34. Cohen J. Statistical power analysis for the behavioral sciences. New York: Academic Press; 1977.
35. Fejer R, Jordan A, Hartvigsen J. Categorising the severity of neck pain: Establishment of cut-points for use in clinical and epidemiological research. Pain. 2005;119:176–82.
36. Serlin RC, Mendoza TR, Nakamura Y, Edwards KR, Cleeland CS. When is cancer pain mild, moderate or severe? Grading pain severity by its interference with function. Pain. 1995;61:277–84.
37. Carr SD, Bowyer D, Cox G. Upper limb dysfunction following selective neck dissection: A retrospective questionnaire study. Head Neck. 2009;31(6):789–92.
38. Goldstein DP, Ringash J, Irish JC, Gilbert R, Gullane P, Brown D, et al. Assessment of the Disabilities of the Arm, Shoulder and Hand (DASH) questionnaire for use in patients following neck dissection for head and neck cancer. Head Neck. 2013. doi: 10.1002/hed.23593