archivo das28 md consult


Upload: diana-carolina-moreno-santiago

Post on 07-Apr-2018


31 Assessment of Health Outcomes

Dorcas Eleanor Beaton • Maarten Boers • Peter Tugwell

KEY POINTS

Any single health outcome can give only a partial view of the impact of a disease on a patient.

Core sets are minimal, but not exclusive, domains of outcomes agreed on by professional groups as important to include in studies; they are available for several rheumatologic conditions.

Defining measurement need is key to the choice of the right instrument.

Choosing an instrument follows a step-by-step process: looking for evidence of practical aspects of using the instruments and of their statistical properties.

If an instrument lacks evidence of a certain property, a study can be conducted to create the evidence, rather than abandoning the instrument.

In an era of rising health care costs, increased choice, and greater provider accountability,1 health outcome measures have become essential tools for researchers, clinicians, and funding bodies. By definition, outcomes refer to “all possible effects of a disease or intervention.”2 Outcomes cover a spectrum of the burden of arthritis on a patient, from biomarkers of disease to subjective appraisals of overall well-being. Other chapters refer to some of the most common measures of health and disease encountered in rheumatology, including the Disease Activity Scale (DAS, DAS28),3 the Health Assessment Questionnaire (HAQ),4 and the SF-36 (short form, 36 items).5 Using any one of these health outcome assessments is like looking out a window in a house, with the burden of arthritis as the landscape outside. Each window in the house provides a view of the outside world, but it is a specific view defined by the size of the window and the side of the house it is on. Another window may offer a slightly better angle on what you would like to see. Different health outcome assessments can have a degree of overlap in their views, in which case an informed choice needs to be made between them, whereas others can hold quite distinct views. One assessment might be useful to compare the burden of arthritis against the general population, another might be useful to assess differences in prevalence among subgroups, and yet another might be useful to measure the specific benefits of an arthritis intervention.

This chapter focuses on describing the different windows clinicians have on the burden of arthritis and how they relate to each other. A framework is provided for ensuring that a selected measure is the right one for a given need. This chapter addresses four questions: (1) What health outcome assessment tools are available generally and specifically for use in rheumatology? (2) How do the different assessment tools relate to one another? (3) How does one characterize what one needs to measure? (4) How does one find a measure that can meet that need?

WHAT HEALTH OUTCOME ASSESSMENT TOOLS ARE AVAILABLE?

In reading the rheumatology literature, and in monitoring the clinical care of patients, certain highly relevant outcomes emerge. A group of these often emerge together and are termed core sets of outcomes.

DISEASE-SPECIFIC MEASURES—THE CORE SETS

Core sets are the minimal, but not exclusive, set of domains to be measured in a study of arthritis. Historically, they follow the Ds of outcome measurement in arthritis: disability, disease activity, damage, discomfort, dissatisfaction, and death.6,7 They are usually recommended by groups such as OMERACT, EULAR, ILAR, or ACR or groups formed around specific diseases, such as ASAS for ankylosing spondylitis or GRAPPA for psoriatic arthritis. All of these groups have a great interest in agreeing on a common set of relevant and psychometrically sound outcomes that would allow them to compare findings across studies and clinical care.

Table 31-1 presents the list of core sets for clinical trials in six types of arthritis7-16; it also shows what each group recommends as additional domains or as needing more research before they become core set members. The first column on the left of Table 31-1 is the core set for longitudinal observational studies in rheumatology by Wolfe and colleagues7 with some minor additions. Wolfe’s core set is broader than the core sets presented in other columns because observational studies are often looking for a broader range of outcomes that are relevant for studies outside of treatment trials. It also is designed to be used across different forms of arthritis. The table also uses this broader list of outcomes as an axis against which the other core sets can be described. The remaining columns show that across different types of arthritis, the core sets have many common elements; most contain or recommend pain, physical function, patient and clinician global assessments, and markers of inflammation. Many also include disease activity indices, which are often an aggregation across other clinical findings (e.g., joint count, acute-phase reactants, global ratings of severity) into a score reflecting the activity of the disease at that point in time. Some core sets contain domains reflecting the unique aspects of the disease (e.g., spinal mobility in ankylosing spondylitis)17 or the unique target of the study using the outcomes (fractures are core outcomes only in osteoporosis studies focused on fracture prevention).18


Table 31-1 Core Sets for Six Rheumatologic Conditions and for Longitudinal Observational Studies*

Columns: Longitudinal Study Core Set of Domains7 | Clinical Trial Core Sets of Domains by Disease Group: Rheumatoid Arthritis15,16 | Osteoporosis11 | Osteoarthritis13 | Systemic Lupus Erythematosus14,19 | Ankylosing Spondylitis8 | Psoriatic Arthritis10,12

Row domains: Health status/quality of life (quality of life, symptoms, physical function, psychosocial); Disease process (aggregate index, biomarkers, joint tenderness, enthesitis, joint swelling, joint stiffness, global [patient, physician], acute-phase reactants); Damage (radiography or imaging, deformity, surgery, organ damage); Disadvantages (toxicity effects); Death; Dollar costs; Work disability

[Table body: individual cell entries are not recoverable from the extracted text.]

*The left-hand column is the broader set of measures recommended by Wolfe and colleagues7 for longitudinal studies. It serves here as a broader range of outcomes, and an axis for the organization of core outcomes in the other conditions (columns).

✓, core domain; R, recommended for further research and possible inclusion in core set; O, optional outcome (osteoarthritis category only).

Osteoporosis: Core set depends on focus: BL, bone loss studies; †, studies aiming to reduce fracture rates.

Ankylosing spondylitis: Core set elements vary depending on focus of study: ‡, clinical records and symptom modifying; §, disease modifying only; others = all.

DAI, disease activity index; EULAR DAS, European League Against Rheumatism Disease Activity Scale (revised = DAS-28, 28-joint count).


Table 31-1 focuses on the core domains that should be measured; the next step is deciding on the instrument that is able to measure each domain in a reproducible, accurate manner. In some cases, an instrument choice has been suggested (e.g., the HAQ for disability in rheumatoid arthritis). Other times several options are provided. Strand and coworkers19 reviewed six disease activity indices in systemic lupus erythematosus and found they gave comparable results. In some cases, the domains are shared, but the measurement technique varies within or by disease; in rheumatoid arthritis, the DAS28 uses 28 joints,3 and in ankylosing spondylitis, 44 joints are counted.17 Some of the more commonly encountered instruments in arthritis are briefly reviewed.

Health Status/Quality of Life

General Health Status. Generic health outcomes provide information on an aspect of health across many conditions so that, in theory, the burden of low back pain can be compared with that of arthritis or diabetes. This comparison depends on the ability of that measure to capture the burden in a disease group well. Generic measures have the advantages of allowing comparisons across diseases and covering a broader range of health issues, including things that may have been overlooked in a core set (e.g., mental health issues). Often because of their breadth, however, the generic measures tend not to delve as well into the depth of experience in any one disease. Arthritis-related fatigue is not detected well in many generic measures because a generic measure asks about being tired or not sleeping well. Because of this, a generic measure is usually weaker in its ability to detect specific changes and in its sensitivity to different levels of disease activity and should usually be supplemented with disease-specific measures (described previously).20

Two commonly used generic measures are the Sickness Impact Profile (SIP)21 and the SF-36.22 The SIP is a 136-item list of illness behaviors that provides a weighted score for the impact of a disease across 12 categories, such as bodily pain, work and role functioning, and dressing,21 which lead to global scores (physical, psychosocial, and overall). The SIP has been shown to measure illness across a wide variety of health conditions.20 The SF-36 is a 36-item questionnaire of which 35 items are used to obtain eight domain scores (including physical functioning, mental health, role functioning, and pain) scored on a 0-to-100 scale (with 100 = better health)22 and two summary scores (mental and physical); scoring is norm-based, with a mean of 50 and a standard deviation of 10. The SF-36 and briefer SF-12 are well supported on the website (www.qualitymetric.com) and through manuals that supply age and disease group distributions of scores.23 Direct comparisons of generic measures have shown differences in scores and health states attributable to the choice of measure.24-26 Studies or clinical results may not be comparable to each other if they are using different health status scales.
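The idea of scoring against a norm of 50 with a standard deviation of 10 can be illustrated with a short Python sketch. It shows only the generic linear rescaling step; the function name and the reference mean and standard deviation are hypothetical, and the published SF-36 scoring manuals remain the source for the actual algorithm and population values.

```python
def norm_based_score(raw_score: float, ref_mean: float, ref_sd: float) -> float:
    """Rescale a score so that the reference population has mean 50 and SD 10."""
    if ref_sd <= 0:
        raise ValueError("ref_sd must be positive")
    # Distance from the reference mean, in reference standard deviations.
    z = (raw_score - ref_mean) / ref_sd
    return 50.0 + 10.0 * z


# Hypothetical reference values, for illustration only (not published norms).
print(norm_based_score(raw_score=60.0, ref_mean=80.0, ref_sd=20.0))  # 40.0
```

A patient scoring one reference standard deviation below the reference mean lands at 40, whatever the original scale, which is what makes such scores easy to read against a general population.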

Utilities: Valued Health States. Utility scales offer an overall score for the value of a health state, setting death at 0 and full health at 1. The emphasis is not on describing the state, but on assigning a value, worth, or preference to that state.27,28 Utilities are needed for economic appraisals and form the health assessment for cost per quality-adjusted life-year estimations. Utility states can be obtained by direct or indirect methods. Direct methods, such as standard gamble and time tradeoff, involve the respondent working through exercises to elicit the value for his or her own health state against things such as time or more or less favorable health situations.27 Indirect methods capture the state with standardized questions and apply predetermined weights.28 Examples include the EQ-5D, which is five items (three response categories) combined to describe a health state. Similarly, the Health Utilities Index gathers information on six or seven dimensions of health (depending on the version) on five-point to six-point response scales to define a health state.28 Both these scales then use weights determined in different populations to assign the value to these health states, hence the “indirect” weighting. The absolute value obtained across these different approaches varies.27
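As a rough illustration of how a utility feeds a cost per quality-adjusted life-year estimate, the Python sketch below multiplies each utility by the time spent in that state and divides an extra cost by the years gained. The function names and all numbers are hypothetical, and real economic appraisals add steps (such as discounting) that are not shown.

```python
def qalys(periods):
    """Sum utility-weighted time; periods is a list of (utility, years) pairs."""
    total = 0.0
    for utility, years in periods:
        if years < 0:
            raise ValueError("years must be non-negative")
        total += utility * years
    return total


def cost_per_qaly(extra_cost, qalys_new, qalys_old):
    """Extra cost divided by the quality-adjusted life-years gained."""
    gained = qalys_new - qalys_old
    if gained <= 0:
        raise ValueError("no quality-adjusted life-years gained")
    return extra_cost / gained


# Hypothetical two-year comparison: utility 0.5 on usual care, 0.75 on a new treatment.
usual = qalys([(0.5, 2.0)])       # 1.0
treated = qalys([(0.75, 2.0)])    # 1.5
print(cost_per_qaly(5000.0, treated, usual))  # 10000.0
```

Because the absolute utility differs between elicitation methods, the same calculation can give different answers depending on which utility scale supplied the inputs.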

Generic measures of health and utility scores are very broad. They often do not perform as well as the more specific measures described in the following sections because they are designed to allow comparisons across populations and need to include items that might not be relevant in arthritis. In some areas, there are measures of quality of life that are designed for that disease, such as rheumatoid arthritis or osteoporosis, which offer a measure of the broad concept of quality of life in a disease-specific manner. Such measures would not allow comparisons across diseases. At the time of this writing, these were not yet recommended as core instruments.

Symptoms. Pain is usually measured using a 10-cm visual analog scale or a 0-to-10 numeric rating scale of the intensity of the symptoms.29 This simple measure has been well tested and is easily understood by patients. Fatigue is another important symptom and quite distinct from being “tired.” The ankylosing spondylitis modified core set contains fatigue, and it is a recommended area of further research in rheumatoid arthritis and lupus. Measures are being tested or developed at present. Ankylosing spondylitis is currently using the 10-cm visual analog scale of fatigue from the Bath Ankylosing Spondylitis Disease Activity Index (BASDAI).30

Disability Scales. Physical disability in rheumatoid arthritis and osteoarthritis is often measured using the Health Assessment Questionnaire–Disability Index (HAQ-DI),31 which covers 20 items looking at different aspects of daily functioning. Patients score each item on a 0-to-3 scale, where 3 represents the greatest disability. Scores are obtained for each domain and combined into a total score expressed on the same 0-to-3 scale. Scores are adjusted to a 2 or 3 if an aid is used to complete a task. More details on the HAQ-DI are widely available in print and on the Internet.
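The scoring logic can be sketched in Python under stated assumptions. The text does not spell out how domain scores are formed or combined, so this sketch assumes the commonly described convention: each domain takes its highest item score, a domain in which an aid is used is raised to at least 2, and the index is the mean of the domain scores. The function name and the example domains are hypothetical; the official scoring instructions should be consulted for real use.

```python
def haq_di(domain_items, aided_domains=frozenset()):
    """Return a 0-to-3 disability index from item scores grouped by domain.

    domain_items: dict mapping a domain name to its list of item scores (0-3).
    aided_domains: names of domains in which an aid or device was used.
    Assumed convention: domain score = highest item; aid raises it to >= 2.
    """
    if not domain_items:
        raise ValueError("at least one domain is required")
    domain_scores = []
    for name, items in domain_items.items():
        if not items or any(score not in (0, 1, 2, 3) for score in items):
            raise ValueError(f"invalid item scores for domain {name!r}")
        score = max(items)
        if name in aided_domains:
            score = max(score, 2)  # an aid means the score is a 2 or 3
        domain_scores.append(score)
    # Averaging keeps the total on the same 0-to-3 scale as the items.
    return sum(domain_scores) / len(domain_scores)


# Hypothetical answers for four domains; a dressing aid is used.
answers = {"dressing": [1, 0], "walking": [0, 0], "grip": [3, 1], "reach": [1, 1]}
print(haq_di(answers, aided_domains={"dressing"}))  # 1.5
```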

There are other scales asking about physical function, such as the Arthritis Impact Measurement Scale (AIMS)32 and the AIMS2,33 and measures with even more specific foci, such as the Western Ontario McMaster (WOMAC) osteoarthritis index, which is commonly used in hip and knee osteoarthritis,34 and the AUSCAN osteoarthritis index for hand osteoarthritis.35

Disease Process (Activity, Severity)

Core sets often include indices of disease process. Disease process can be divided into activity (inflammatory activity) and severity (overall severity of disease). There are several


Sleep

The OMERACT patient group has identified sleep quality as an important domain for outcome assessment. The concept and measurement of it are expected to be addressed at the upcoming OMERACT 9 meeting in 2008.

HOW DO OUTCOME MEASURES RELATE TO ONE ANOTHER?

With an array of potential measures, or windows to view the burden of disease, how does one organize them to get an accurate picture of the whole? Conceptual frameworks help one to understand how domains, such as those described previously, theoretically relate to one another when applied in research or clinical practice. They also usually provide an operational definition of each of their domains, which becomes essential when choosing between instruments or deciding how to model the outcome of an intervention and its modifying factors.56,57 Conceptual frameworks facilitate accurate communication and the generation of hypotheses for understanding a disease process and its impact. In the past, the most common conceptual framework was the main (sometimes causal) pathway of a biomedical model: pathology leads to organ/system changes, which lead to pain/symptoms, which lead to functional loss, which leads to diminished quality of life. This is similar to the Wilson and Cleary model58 and might help in understanding that function and quality of life might relate more closely than sedimentation rate and quality of life, based on their proximity and distance in the model. Expanding on this type of model are conceptual frameworks such as Nagi’s59 or Verbrugge’s60 disablement process. In these models, there is a main pathway, with slightly different definitions, but there also are boxes of influence from outside this pathway: the patient’s personal factors and environmental factors. In 2001, the World Health Organization ratified the International Classification of Functioning (ICF),61 which offers an even more expanded framework based on a biopsychosocial understanding of disease. It is receiving wide endorsement in arthritis.56,62 In this model (Fig. 31-1), there are three main elements of burden. A disease can affect an individual by impairment (body part), activity limitations (individual’s ability to do a task), or participation restriction (restrictions in the execution of the individual’s roles in life situations). The boxes in Figure 31-1 are joined by bidirectional arrows because this model suggests that the right-side boxes also could influence the left with secondary problems (e.g., joint contractures or wounds secondary to prolonged bed rest). For the ICF, disablement is not a linear progression; it is a dynamic interaction influenced by several factors.57 The ICF strongly emphasizes the influence of personal factors and environmental factors on all three domains. Several initiatives are under way to examine the fit and usefulness of the ICF framework in various forms of arthritis.57,63,64 Readers are referred to Jette and Keysor65 for an excellent comparison of the ICF and Verbrugge models in the context of arthritis.

In addition to providing the conceptual framework, the ICF undertook the task of a classification system listing all possible categories of impact falling under each of the five main headings. Many groups, particularly in arthritis, have reviewed the categories and developed a list of the categories relevant to that form of arthritis. This list suggests which of these categories should be reflected in core set measures in arthritis, offering a form of standard for the content of scales.63

Frameworks define the realm of outcomes that should be considered, and the hypothetical relationships between them. They form the basis for understanding observations, testing hypotheses, or planning and executing an analysis. Shifting frameworks is challenging because different tools may vary in how they define certain aspects of health or disability. Shifts can prompt fruitful rethinking of concepts, however, and how they relate to each other.64

HOW DOES ONE CHARACTERIZE WHAT ONE NEEDS TO MEASURE?

Just as investigators define a research question before embarking on a clinical trial, so should users of health outcomes define a measurement question before choosing an instrument.

WHY MEASURE?

Clarity about a measure’s intended purpose helps ensure the right one is selected. Measures can be used in three ways: to describe people at one point in time, to predict a future state, and to measure change over time.66,67 This chapter focuses on the purposes used for health outcome assessment: describing an end point state in a trial, which is descriptive, and evaluating change over time.

WHAT TO MEASURE

An understanding of concepts and definitions is important. It is not good enough to decide to measure physical function, for example; a better outcome would be physical function at the level of disability according to Verbrugge and Jette.60 Similarly, when measuring pain, is intensity more important than frequency? What about the degree to which pain interferes with daily activities? Questions such as these should be addressed before any instruments or core sets are reviewed. The instrument should meet the need, and not the reverse.

Figure 31-1 The International Classification of Functioning conceptual framework showing the hypothesized relationships between domains of impairment, activity limitation, and participation restriction. [Boxes: Health condition (disorder/disease); Body function and structure; Activities; Participation; Environmental factors; Personal factors.]

WHO CONSTITUTES THE TARGET POPULATION

The target population is crucial, but often overlooked. A given instrument may work well in severe osteoarthritis of the hip, but not be sensitive to the early symptoms of the disease. Equally important is to consider whether one wants to measure for an individual patient or to describe a group of patients as a whole. The former demands much higher levels of measurement properties (e.g., reliability coefficients >0.90 as opposed to 0.75 to 0.80 being adequate for group descriptions).66
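The two reliability thresholds can be written down as a small check. This is a minimal Python sketch: the function name is hypothetical, and taking 0.75 as the lower bound for group descriptions is an assumption drawn from the range quoted above.

```python
def reliability_adequate(coefficient: float, use: str) -> bool:
    """Check a reliability coefficient against the level the intended use demands.

    use: "individual" (decisions about one patient) or "group" (describing a group).
    Thresholds follow the text; 0.75 as the group cut-off is an assumption.
    """
    if use == "individual":
        return coefficient > 0.90
    if use == "group":
        return coefficient >= 0.75
    raise ValueError("use must be 'individual' or 'group'")


# The same instrument can be adequate for one purpose and not the other.
print(reliability_adequate(0.85, "group"))       # True
print(reliability_adequate(0.85, "individual"))  # False
```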

SELECTING THE OUTCOME THAT CAN MEET THE MEASUREMENT NEED

The selection of an outcome measure depends entirely on a clear understanding of measurement need. Often a well-known instrument is used, rather than looking for one that matches the concept, population, and purpose. Many guidelines that offer more detail than can be described here are available, particularly for the acceptable levels of reliability and validity.66-72 This section describes a decision-making process for fit between a given instrument and the clinician’s need (Fig. 31-2). This process builds on the work of Law72 and the OMERACT filter69 and highlights key understandings in each area from the published guidelines.

This decision-making process emphasizes three things. First, it begins by stating the measurement need (why, what, and in whom), which reinforces that a candidate measure may meet one need, but not the next. Second, it emphasizes that a lot of the appraisal can be done without statistics. It is done by appraising the questionnaire or instrument itself and knowledge of its administration. Third, the inability to affirm each stage suggests that there is no need to continue. At the later data-based stages, the clinician may choose to run a small study to create the evidence (the “do-it” loops) in patients, rather than abandoning an instrument that seems like a good candidate. Given these three key features, the process from left to right is reviewed next.

 

Figure 31-2 Algorithm showing the decision-making process for the fit of a candidate measure with your measurement need. The left-hand side of the page is done by appraising the instrument and its instructions. The right-hand side requires numeric evidence of the relevant measurement properties. Many instruments are weeded out as a poor fit in steps 1, 2, and 3. The “do-it” loops denote stages at which you can pause to create evidence if it is missing and not have to abandon the instrument.

[Flowchart contents]
Need: concept, population, intended purpose (describe, evaluate change).
Candidate measure:
1. Matches target concept?
2. Feasible to use?
3. Does it measure what it says it does? (truth): content, face, construct.
4. Does it have purpose-specific properties in your population?
 a. One point in time (descriptive): reliability (internal consistency, inter-rater reliability); construct validity (differentiates high/low levels, acts as expected with other indicators).
 b. Change over time (evaluative): test-retest reliability, inter-rater reliability, responsive to change that is similar to target situation.
5. Is it interpretable? Benchmarking scores (states); change thresholds; combinations: change and state.
Outcome: Good fit for your needs! A “No” at any step leads to “Choose another tool.”
Legend: Boxes marked with “do it loop” are those where you can create the evidence and continue on. Blue boxes = pre-data evaluation. Yellow boxes = data-based evaluation.


STEP 1: IS THERE A MATCH BETWEEN THE INSTRUMENT’S CONCEPT AND THE MEASUREMENT NEED (CONCEPT, POPULATION, PURPOSE)?

An operational definition of the target concept, the applicable populations (patients or general population), and intended purpose should be articulated by the developer and match your current need.42,68,71,73 If this is not the case, or if it is not a good match, start with another candidate measure because this one would not work.68

STEP 2: IS IT FEASIBLE TO USE?

Feasibility covers the practical aspects of using this scale in the intended setting.42,69,72,73 Does it take too much time? Are the licensing costs too high? Does it require special equipment? Is it too burdensome for your patients (language, literacy, acceptability of questions)? Is it formatted well on the page, and do the responses make sense given the target and the question? Are the questions phrased in a clear and simple manner? Are the necessary scoring instructions available? An unfavorable answer to any of these questions could direct you to go to another, more feasible, instrument. Feasibility often makes or breaks a decision about a candidate measure.69

STEP 3: IS IT MEASURING WHAT IT SAYS IT IS MEASURING? (TRUTH)

Does the instrument measure what it says it will measure? We divide this into three areas: content, face, and construct validity. Content validity appraises the items and domains of a scale. Have the authors covered what McHorney and Tarlov66 call the breadth and the depth of the concept, that is, all the important areas, but also enough depth to capture the range of experience of the patients? Face validity is an appraisal of the general direction of the scale: will it hit the target? Are the response options organized in a logical direction for high and low levels of this attribute? Does the scoring make sense?

The stage of construct validity is the dividing point in the decision process between data-free and data-based appraisal. Up until now, the appraisal is done by looking at the instrument and its manuals. In construct validity, we begin to explore data to see if the numeric scores arising from the instrument make sense. Basic construct validity should be established regardless of the purpose. Sometimes it is de-emphasized in evaluative instruments; however, responsiveness without knowing if the instrument is measuring the target seems misplaced. We place it before the purpose-specific properties to emphasize its need as a basis. Construct validity is generally measured by comparisons with other similar scales or related constructs (i.e., high and low levels of pain and function) to see if the numeric scores are behaving in the way they should if this were a valid measure of the target concept. Theoretic situations are set up before analysis, the direction and magnitude of the expected relationship are declared, and then the relationship is tested.68,73 Comparisons also should be made between groups known to differ (high versus low severity) or with scales where no relationship is expected, again based on an a priori theory, to see if the candidate measure behaves according to the theory. These add to the evidence that the instrument is measuring what it is supposed to measure.68 If the evidence is unavailable or is not in the intended population, you have a choice of abandoning the measure or doing a study to create that evidence and then continuing to advance.
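The "declare first, then test" logic described above can be sketched in a few lines. This is only an illustration: the scores are invented, and the function names and the expected range of 0.70 to 1.00 are assumptions made for the example, not values taken from the cited guidelines.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def hypothesis_supported(r, expected_low, expected_high):
    """True when the observed correlation falls in the range declared before analysis."""
    return expected_low <= r <= expected_high

# Hypothetical data: a candidate pain scale and an established comparator.
candidate = [2, 4, 5, 7, 8, 9]
comparator = [1, 3, 6, 6, 9, 8]

# Declared a priori (an assumption for this sketch): strong positive relationship.
r = pearson_r(candidate, comparator)
supported = hypothesis_supported(r, 0.70, 1.00)
print(f"r = {r:.2f}, a priori hypothesis supported: {supported}")
```

The same check works for a hypothesis of no relationship by declaring a range around zero before looking at the data.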

STEP 4: PURPOSE-SPECIFIC EVIDENCE IN YOUR POPULATION

After getting a general sense of the construct validity of the instrument, the next step is to address the specific attributes needed for the instrument to function to meet your measurement need (discrimination component of the OMERACT filter).69 More attention is now paid to the intended purpose and the intended population and the properties of consistency in measurement (reliability: obtaining the same score in different settings) and additional validity (cross-sectional again or responsiveness or sensitivity to change).

Descriptive Purpose

Descriptive outcomes often are used to classify individuals as to severity of condition or to identify them by a prognostic group. In health outcome assessment, descriptive instruments are needed to classify individuals as responders or as being in a low disease activity state or remission. An instrument needs to be precise, that is, the observed score is very close to the true score with low error. This is estimated by the internal consistency of a multi-item scale or questionnaire, using Cronbach alpha coefficients, or Kuder-Richardson 20 if the scale is dichotomous (yes/no). Internal consistency is a feature of a scale with many items measuring the same thing: are the responses similar across items within the instrument? It is not a feature of a scale containing weighted sums of different attributes, such as disease activity measures.37
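As a rough illustration of internal consistency, the sketch below computes Cronbach alpha from a small respondents-by-items table using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The response data and the function name are invented for the example.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach alpha for a list of respondents, each a list of item scores."""
    k = len(item_scores[0])                      # number of items
    items = list(zip(*item_scores))              # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical responses: 5 respondents answering 4 items on a 1-5 scale.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 5, 4, 4],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach alpha = {alpha:.2f}")
```

A high alpha here reflects items that rise and fall together across respondents; it says nothing about a composite index built from deliberately different attributes.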

If more than one person will be gathering the data, inter-rater/observer reliability should be measured and quantified with an intraclass correlation coefficient (ICC) for continuous measures or a weighted kappa for ordered categories.74 There are different types of ICC depending on the model used for the variance estimates; the type of ICC should be named.74 The ICC and weighted kappa measure the comparability of actual numeric scores and are preferred over correlation coefficients that look only for trends and not a direct match in number values. Cutoffs are always challenging, but in general, reliability (including test-retest) should be at minimum 0.75 (refs. 66, 73) for group level analyses, and for describing an individual patient, it should be 0.90 to 0.95.66,73 The internal consistency reliability can be converted back into the scale score by calculating the precision limits: using 95% limits, the true score lies somewhere within $\pm 1.96 \times s\sqrt{1 - r}$ of the observed score, where $r$ = internal consistency and $s$ = standard deviation. This calculation tells us the range within which the true score for an individual can be found. If it is too wide (reliability too low), it is impractical to use that instrument.
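The precision-limit calculation above is easy to try with numbers. The sketch below assumes a hypothetical scale with a standard deviation of 10 points and compares the limits around an observed score of 50 at two reliability levels; the function name and the example values are illustrative only.

```python
from math import sqrt

def precision_limits(observed, sd, reliability, z=1.96):
    """95% limits for the true score: observed +/- z * sd * sqrt(1 - reliability)."""
    half_width = z * sd * sqrt(1 - reliability)
    return observed - half_width, observed + half_width

# Hypothetical scale: standard deviation of 10 points, observed score of 50.
group_level = precision_limits(50, 10, 0.75)       # reliability adequate for groups
individual_level = precision_limits(50, 10, 0.91)  # reliability in the individual range

print("r = 0.75:", tuple(round(v, 1) for v in group_level))
print("r = 0.91:", tuple(round(v, 1) for v in individual_level))
```

With these assumed values the band is roughly ±9.8 points at r = 0.75 and roughly ±5.9 points at r = 0.91, which shows why individual-level decisions call for the higher reliability.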

Construct validity is revisited for the descriptive instrument, but with more attention to looking for evidence close to the intended application. If the goal is to measure high versus low health, the sample should be divided into known groups with high and low health according to another accepted opinion, and then this scale is tested against it. The image of a window can help here in selecting comparators to use for testing. What else gives a bit (or more) of overlap with the target view? How much correspondence is expected between scores? For good construct validity, this a priori hypothesized relationship should be recreated with data, whether that be a strong correlation or no correlation at all. An instrument or measure is never universally valid and requires ongoing testing to improve understanding of the scores in different situations.

Evaluative Purpose

In evaluative measures, the intent of the study is to focus on the amount of change over time. Many clinical trials are doing this and comparing results between treatment and control groups. Interobserver reliability is important if more than one measurer is to be involved. The hallmark of a good evaluative measure relates, however, to time: First, do the scores remain the same when the target concept has not changed over time (test-retest reliability)? Second, when the concept changes, does the score on the instrument/measure change as well?

Test-retest reliability requires two administrations of the measure over a time when no change has occurred. This may be easier said than done sometimes, but the authors should justify their design and how they ensured no change had occurred. Similar to interobserver reliability, the ICC is the preferred statistic for continuous scores, and weighted kappa, its equivalent, is preferred for categorical scores. The cutoffs are the same, and a coefficient can be converted into a “minimal detectable change”75 as $1.96 \times s\sqrt{2(1 - r)}$, where $s$ = standard deviation and $r$ = test-retest reliability (ICC).66,75 Ninety-five percent of subjects who are stable have change scores less than this value; a change greater than this is not likely to occur in a stable patient, only in a changing one. It becomes a lower boundary of meaningful change: anything below that could be day-to-day fluctuations in scores.
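The minimal detectable change formula above can be worked through directly. In this sketch the standard deviation (10 points) and the test-retest ICC (0.92) are assumed values chosen for illustration, and the helper names are hypothetical.

```python
from math import sqrt

def minimal_detectable_change(sd, test_retest_r, z=1.96):
    """MDC at the 95% level: z * sd * sqrt(2 * (1 - r))."""
    return z * sd * sqrt(2 * (1 - test_retest_r))

def exceeds_mdc(baseline, follow_up, mdc):
    """True when an observed change is larger than measurement error would explain."""
    return abs(follow_up - baseline) > mdc

# Hypothetical instrument: standard deviation 10 points, test-retest ICC 0.92.
mdc = minimal_detectable_change(10, 0.92)
print(f"MDC95 = {mdc:.2f} points")
print("Change of 5 points exceeds MDC:", exceeds_mdc(50, 45, mdc))
print("Change of 12 points exceeds MDC:", exceeds_mdc(50, 38, mdc))
```

Under these assumptions a 5-point change sits inside the band expected from day-to-day fluctuation, whereas a 12-point change does not.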

Responsiveness, the accurate detection of change when it has occurred, is sometimes best thought of as longitudinal construct validity. Similar to construct validity, responsiveness depends on an a priori theoretic relationship, one in which the attribute is changing over time. Often the focus is on the amount of change picked up, rather than the type or amount of change that had occurred. A large change is not useful if we were expecting a small one; rather it suggests noise. The construct embedded in a study of responsiveness should be described carefully and should be a clear match with the intended application (measurement need). If the goal is to detect change in a clinical trial, it is important to assess the instrument’s ability to detect the difference in change between treatment and control groups. If the goal is to detect change in a cohort, it might be more useful to examine change in a single group perhaps in a treatment of known efficacy (hip replacement) or in subjects who rated themselves as improved on an external anchor (global index of change).

Responsiveness is summarized with statistics of signal (change) over noise (error), such as the standardized response mean (mean change/standard deviation of change), t statistic (mean change/standard error), and effect size (mean change over standard deviation of baseline)74; each can be adapted to quantify the relative change between treatment and control groups.53,76 Deyo and Centor77 also described the correlational approach (correlate change and another indicator of change) and the receiver operating characteristic curve approach (various change scores against an external “gold standard” that the person has changed) where the area under the curve is a summary statistic.77 The numeric summaries of responsiveness, such as effect sizes or areas under the curve, should correspond to the type of change expected (a priori theory). A large effect size or area under the curve does not by itself mean an instrument is “responsive.” It should correspond with the change anticipated in the study, small or large. Comparisons of the effect sizes are helpful if different instruments are being compared in the same study, as done by Buchbinder and colleagues53 or by Verhoeven and coworkers,76 who focused on responsiveness in early rheumatoid arthritis. Responsiveness is a highly contextualized property, and the same instrument may not be responsive in another situation (e.g., early versus late disease, osteoarthritis versus rheumatoid arthritis).73
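The three signal-over-noise summaries named above can be computed side by side from paired scores. The sketch below uses invented baseline and follow-up scores for a single group in which higher scores mean better health; the function name and dictionary keys are assumptions for the example.

```python
from math import sqrt
from statistics import mean, stdev

def responsiveness_summary(baseline, follow_up):
    """Standardized response mean, paired t statistic, and effect size for paired scores."""
    change = [f - b for b, f in zip(baseline, follow_up)]
    mean_change = mean(change)
    sd_change = stdev(change)
    return {
        "srm": mean_change / sd_change,                       # mean change / SD of change
        "t": mean_change / (sd_change / sqrt(len(change))),   # mean change / standard error
        "effect_size": mean_change / stdev(baseline),         # mean change / SD of baseline
    }

# Hypothetical paired scores for five patients (higher = better).
baseline = [40, 50, 60, 70, 80]
follow_up = [48, 55, 70, 75, 92]

summary = responsiveness_summary(baseline, follow_up)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Note how the same data give a large standardized response mean but a moderate effect size, because the denominators differ; whichever statistic is reported should be judged against the change that was expected a priori.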

STEP 5: INTERPRETABILITY OF SCORES

The final step, often deemed the most elusive,78 is the interpretability of the scores.

Benchmarking States

What is the meaning of a score of 2/10 on a pain score? Is it a good outcome? The meaning of different scores on an outcome assessment is used for classifying subjects at the beginning of a trial and at the end point. To do this, comparisons are made to other known health states: severity indices, ability to work, self-rating as mild.79 Gradually, enough trends might be seen across different scenarios to gain confidence in the meaning of “good” or “mild.”80,81 In rheumatology, we see the emergence of low disease activity states82,83 or patient acceptable symptom states84 or remission criteria with the DAS28 (ref. 38) as thresholds below which subjects are considered to be in an acceptable state (either tolerable symptoms or disease activity where it does not require medication changes). At this point, these thresholds are being established, and similar to change thresholds, we may find variability in the values38 that needs to be sorted out with methodologic work and application in clinical practice.

Changes in State

The second type of interpretability concerns change scores.

American College of Rheumatology Response Criteria. The American College of Rheumatology took the core set measures and determined that if one observed an X% change in tender joint count and in swollen joint count and in at least three of five other areas (erythrocyte sedimentation rate or C-reactive protein, physician global, patient global, pain, or physical disability), one had a clinical response, and the individual would be classified as a responder. The percent is usually 20%, but 50% and 70% have been considered. The ACR20 is widely used, catches responses across a wide variety of domains, and discriminates well in clinical trials76; however, it is currently being revalidated owing to the changing nature of rheumatoid arthritis and its care.85
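The responder rule described above (improvement in both joint counts plus at least three of the five other measures) can be sketched as follows. The dictionary keys, function names, and patient values are hypothetical, and the sketch assumes every measure is scored so that lower is better.

```python
JOINT_COUNTS = ("tender_joints", "swollen_joints")
OTHER_MEASURES = ("acute_phase", "physician_global", "patient_global", "pain", "disability")

def percent_improvement(baseline, follow_up):
    """Percent reduction from baseline; lower scores are assumed to be better."""
    if baseline == 0:
        return 0.0
    return 100.0 * (baseline - follow_up) / baseline

def acr_responder(baseline, follow_up, threshold=20.0):
    """True when both joint counts and at least 3 of the 5 other measures improve by the threshold."""
    joints_ok = all(
        percent_improvement(baseline[k], follow_up[k]) >= threshold for k in JOINT_COUNTS
    )
    others_met = sum(
        percent_improvement(baseline[k], follow_up[k]) >= threshold for k in OTHER_MEASURES
    )
    return joints_ok and others_met >= 3

# Hypothetical patient.
before = {"tender_joints": 20, "swollen_joints": 10, "acute_phase": 40,
          "physician_global": 60, "patient_global": 70, "pain": 50, "disability": 1.5}
after = {"tender_joints": 14, "swollen_joints": 7, "acute_phase": 38,
         "physician_global": 40, "patient_global": 50, "pain": 45, "disability": 1.0}

print("ACR20 responder:", acr_responder(before, after, threshold=20.0))
print("ACR50 responder:", acr_responder(before, after, threshold=50.0))
```

Passing the threshold as a parameter lets the same function express the 20%, 50%, and 70% variants mentioned in the text.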

Minimal Clinically Important Difference and Improvement. Defining the threshold of change above which an individual has had an “important” shift in outcome is what Kirwan78 has described as the “elusive crock of gold at the end of the rainbow.” Nevertheless, important advances have been made. There are many sources of variation in score, including the method used, the baseline severity, and the type of change to be sought.86,87 In 2000, Wells and coworkers88 described nine different methods for deriving minimal clinically important differences from the literature. Some use distributional cutoffs (½ standard deviation, or effect size of 0.2 or 0.5),89 which have been criticized as lacking any meaningful anchor. Other methods depend on some external anchor that important improvement has occurred, but are sometimes challenged by the dependence on that anchor and the perspective of the individual who determines it (patient, physician, third-party payer). Minimal clinically important differences repeatedly have been shown to vary with baseline state90,91 and with improvement versus deterioration.92 Tubach and associates93 changed the term to minimal clinically important improvement and looked only at improvement. Minimal clinically important differences vary depending on the context of measurement. You need to plan on working with a range of values,42,86,87 to make sure the measurement situation is similar to your own (severity, timing, type of intervention), and to build confidence with congruence in minimal clinically important differences from across methods if you can achieve that.

Combined Approaches: Change and State

An attractive, although often overlooked, option is combining change and state. In 1996, EULAR defined clinical response as a change in DAS28 score of more than 1.2 (change) plus a final DAS28 score of less than 2.4 (final state).36 Jacobson and colleagues94 did the same in defining response to psychotherapy; change greater than error was used (minimal detectable change mentioned earlier) plus a final “normal” state. Studies from the patient’s perspective have often reflected the same thing.80,81,95 Treatment needs to induce a change, but perhaps it also needs to land patients in a healthy state to make them feel better.
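A combined change-and-state rule is simple to express in code. In the sketch below the default thresholds (improvement of more than 1.2 and a final score below 2.4) are taken as stated in the passage above and exposed as parameters; the function name and the patient scores are hypothetical.

```python
def combined_response(baseline_score, final_score, min_change=1.2, max_final=2.4):
    """True when the score improved by more than min_change AND ended below max_final."""
    improvement = baseline_score - final_score
    return improvement > min_change and final_score < max_final

# Hypothetical patients: (baseline, final) disease activity scores.
patients = {
    "large change, good final state": (4.0, 2.2),
    "large change, poor final state": (5.6, 4.0),
    "small change, good final state": (2.8, 2.0),
}
for label, (baseline, final) in patients.items():
    print(f"{label}: {combined_response(baseline, final)}")
```

Only the first hypothetical patient satisfies both conditions, which is the point of combining the two criteria: neither a large change alone nor a good final state alone counts as a response.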

The approaches described previously focus on interpretation at the level of the individual, perhaps for use in clinical practice or in a response-type analysis of a clinical trial or economic appraisal (% responder). Verhoeven and coworkers76 showed that the same instrument may not perform equally well in a responder type of analysis and a group level change.

At each stage of this appraisal, there is an element of judgment. It is likely there will never be perfect evidence across all stages. The user needs to assess the potential risk of accepting less than ideal evidence or abandon the scale. Users also may create the evidence, however, by doing it themselves. An instrument that makes it through this appraisal is likely a good fit with the measurement need. Anything short of that could lead to error and be prone to misinterpretation. By working from left to right, scales that are not targeting the right concept or are impractical to use in the intended setting can be eliminated quickly before reviewing the literature extensively for the measurement properties.

SECTION 6: AREAS OF GROWTH IN HEALTH OUTCOME ASSESSMENT

Item Response Theory 

Users of outcomes in arthritis come across item response theory (IRT) and computer adaptive testing (CAT). These terms relate to newer methods of ordering and calibrating items on a scale so that there is equal meaning in score increments across the scale. Most of our outcomes were developed in “classic test theory” (internal consistency, summed scores without weights). There are two schools within IRT: Rasch, which fixes the parameters and assesses if items in a scale fit or do not fit that model, and IRT itself, which fits a model to the data, rather than the reverse. IRT and Rasch are often presented as conflicting schools, but they are both working toward an item calibration that allows more accuracy and precision. In the future, direct comparisons may reveal their similarities and differences in practical ways. The weights are cumbersome to apply for the clinician, but can be easily integrated into a computer-based scoring system for easy data entry and CAT. CAT chooses items based on the previous set of responses and uses the fewest number of items to reach a precise score, skipping easier items if confident the subject can do the harder ones. This streamlined scoring is quite attractive, but there are some limitations. It depends on technology that may not at this time be available in every setting. It also may be influenced by differential item functioning, which means an item might change weight, or order, in certain subgroups and necessitates more complex weights. An example of differential item functioning would be putting on a pullover sweater. It is a hard task if you have shoulder pain, but pretty easy if you have a hand problem and would require a different weight.

The National Institutes of Health is currently funding PROMIS (Patient Reported Outcomes Measurement Information System; available at http://www.nihpromis.org/ ) to develop a CAT system (currently based on a two-parameter graded response model IRT) for common chronic diseases, including arthritis.96 Several measures have been pooled into a large database and are being refined and rescaled at the time of this writing. Well-known measures, such as the HAQ, also will be used to allow for cross-calibration with the newer items. All findings will be reported on the PROMIS website, as will access to the scoring algorithms.
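To make the adaptive idea concrete, here is a toy sketch of computer adaptive testing. It is not the PROMIS algorithm: it assumes a simple Rasch model with yes/no items rather than the graded response model mentioned above, an invented five-item bank with made-up difficulties, and a standard normal prior on ability. All function names are hypothetical.

```python
from math import exp

GRID = [x / 10.0 for x in range(-40, 41)]  # candidate ability values from -4 to 4

def p_endorse(theta, difficulty):
    """Rasch model: probability of a 'can do' response."""
    return 1.0 / (1.0 + exp(-(theta - difficulty)))

def estimate_theta(responses):
    """Posterior-mean ability given (difficulty, answer) pairs and a standard normal prior."""
    weights = []
    for theta in GRID:
        w = exp(-0.5 * theta * theta)  # unnormalized prior density
        for difficulty, answer in responses:
            p = p_endorse(theta, difficulty)
            w *= p if answer else (1.0 - p)
        weights.append(w)
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)

def next_item(theta, remaining):
    """Under the Rasch model the most informative item has difficulty closest to theta."""
    return min(remaining, key=lambda d: abs(d - theta))

def run_cat(item_bank, answer_fn, max_items=3):
    """Administer up to max_items items, re-estimating ability after each answer."""
    remaining = list(item_bank)
    responses = []
    theta = 0.0
    for _ in range(min(max_items, len(remaining))):
        item = next_item(theta, remaining)
        remaining.remove(item)
        responses.append((item, answer_fn(item)))
        theta = estimate_theta(responses)
    return theta, responses

# Hypothetical item bank (difficulties) and a simulated respondent who can do
# every task up to difficulty 1.0.
bank = [-2.0, -1.0, 0.0, 1.0, 2.0]
theta, responses = run_cat(bank, lambda d: 1 if d <= 1.0 else 0)
print("items administered:", [d for d, _ in responses])
print("ability estimate:", round(theta, 2))
```

In this run the easier items are never asked, which mirrors the streamlining described in the text; handling differential item functioning would require subgroup-specific item parameters, which this sketch does not attempt.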

Use of Technology in Health Outcomes Assessment

In addition to enabling efforts such as PROMIS to develop CAT systems, information technology has changed many aspects of health outcome assessments. Streamlined, customized assessments can be set up on the Internet or on a stand-alone computer with interfaces such as touch screen, light pens, or “point and click.” Patients can complete the questionnaires at home, at the clinic, on their PDA, or on a tablet. Language and literacy issues can be overcome with talking screens. Scoring becomes instantaneous, and reports can be printed immediately summarizing the scored results in time for the clinical visit.97,98 Comparisons between touch screens and traditional paper and pencils are promising, and the acceptability by patients with arthritis is good.97-99 New technology means that health outcome assessment can become part of the patient-clinician experience and facilitate the ability of the clinician to monitor the patient’s health.97

 Adaptation to an Ongoing Disease

This chapter has focused on the measurement of health states and their interpretation over time. Individuals with chronic diseases adapt to ongoing disease with behavioral strategies or cognitive reframing of their situation.100 In some circles, this is adjustment95; in others, it is response shift.101 The challenge in health outcomes assessment is to tell when a state is changing only because of adaptation and not the intervention. In many situations, we try to induce adaptation, or cognitive reframing, and it can be constructive. It does create a bias in measurement,101 however, and a challenge to the health outcome assessor. Numerous groups are researching how to incorporate adaptation into health outcome assessments.

SUMMARY

There is considerable room for improvement in health outcome assessment in rheumatology, despite the work done to date. A battery of instruments has been developed, many of which exhibit the measurement properties described in this chapter and meet the challenge of a changing arthritis target (less severe, earlier disease), and several more measures are being considered for membership in core sets to capture a comprehensive view of the burden of arthritis. We are on the brink of deciding on the role to be played by IRT and CAT in widespread care settings. Despite progress in assigning a numeric value to a complex health state, however, we are now struggling with the back-translation: what does the numeric score mean in the real patient world? It is not always a simple translation from questionnaire score to clinical meaning.

Health outcome assessment is well advanced in arthritis care, and we should recognize the years of work and commitment of many professional and patient/consumer groups. Advances will continue in the use of technology, the breadth and depth of outcomes, and the quality of measurement to keep pace with the needs of patients, clinicians, and researchers.

Acknowledgments

Dorcas Beaton is supported by a New Investigators Award through the Canadian Institutes of Health Research. Peter Tugwell holds a Canada Research Chair.

The authors would like to thank Ms. Taucha Inrig, Dr. Claire Bombardier, Dr. Fred and Mrs. Janet Krieger, Mr. William Francis, and the OMERACT executive for their help with this manuscript, and Dr. M. Ward, whose chapter in the seventh edition of Kelley’s Textbook of Rheumatology was a helpful guide.

REFERENCES

1. Relman AS: Assessment and accountability: The third revolution in medical care. N Engl J Med 319:1220-1222, 1988.

2. Last JM: A Dictionary of Epidemiology. New York, Oxford University Press, 1988.

3. Prevoo MLL, Van’t Hof MA, et al: Modified disease activity scores that include twenty-eight-joint counts: Development and validation in a prospective longitudinal study of patients with rheumatoid arthritis. Arthritis Rheum 38:44-48, 1995.

4. Fries JF, Spitz PW, Young DY: The dimensions of health outcomes: The Health Assessment Questionnaire, Disability and Pain Scales. J Rheumatol 9:19203, 1982.

5. Ware JE Jr, Sherbourne CD: The MOS 36-Item Short-Form Health Survey (SF-36), I: Conceptual framework and item selection. Med Care 30:473-483, 1992.

6. Fries JF: The hierarchy of outcome assessment. J Rheumatol 20:546-547, 1993.

7. Wolfe F, Lassere M, van der Heijde D, et al: Preliminary core set of domains and reporting requirements for longitudinal observational studies in rheumatology. J Rheumatol 26:484-489, 1999.

8. van der Heijde D, van der Linden S, Bellamy N, et al: Which domains should be included in a core set for endpoints in ankylosing spondylitis? Introduction to the ankylosing spondylitis module of OMERACT IV. J Rheumatol 26:945-947, 1999.

9. Gladman DD, Mease PJ, Healy P, et al: Outcome measures in psoriatic arthritis (PsA). J Rheumatol 34:1159-1166, 2007.

10. Gladman DD, Mease PJ, Strand V, et al: Consensus on a core set of domains for psoriatic arthritis. OMERACT 8 PsA Module Report. J Rheumatol 34:1167-1170, 2007.

11. Sambrook PN, Cummings SR, Eisman JA, et al: Guidelines of osteoporosis trials (workshop report). J Rheumatol 24:1234-1236, 1997.

12. Gladman DD, Strand V, Mease PJ, et al: OMERACT 7 psoriatic arthritis workshop: Synopsis. Ann Rheum Dis 64:ii-115-ii-116, 2005.

13. Bellamy N, Kirwan J, Boers M, et al: Recommendations for a core set of outcome measures for future phase III clinical trials in knee, hip, and hand osteoarthritis: Consensus development at OMERACT III. J Rheumatol 24:799-802, 1997.

14. Smolen JS, Strand V, Cardiel M, et al: Randomized clinical trials and longitudinal observational studies in systemic lupus erythematosus: Consensus on a preliminary core set of outcome domains. J Rheumatol 26:504-507, 1999.

15. Boers M, Tugwell P, Felson DT, et al: World Health Organization and International League of Associations for Rheumatology core endpoints for symptom modifying antirheumatic drugs in rheumatoid arthritis clinical trials. J Rheumatol 21:86-89, 1994.

16. Felson DT, Anderson JJ, Boers M, et al: The American College of Rheumatology preliminary core set of disease activity measures for rheumatoid arthritis clinical trials. The Committee on Outcome Measures in Rheumatoid Arthritis Clinical Trials. Arthritis Rheum 36:729-740, 1993.

17. van der Heijde D, Landewe R: Selection of a method for scoring radiographs for ankylosing spondylitis clinical trials, by the Assessment in Ankylosing Spondylitis working groups (ASAS) and OMERACT. J Rheumatol 32:2048-2049, 2005.

18. Guidelines of osteoporosis trials (workshop report). J Rheumatol 24:1234-1236, 1997.

19. Strand V, Gladman DD, Isenberg D, et al: Outcome measures to be used in clinical trials in systemic lupus erythematosus. J Rheumatol 26:490-497, 1999.

20. Patrick DL, Deyo RA: Generic and disease-specific measures in assessing health status and quality of life. Med Care 27(Suppl):S217-S232, 1989.

21. Bergner M, Bobbitt RA, Pollard WE, et al: The sickness impact profile: Validation of a health status measure. Med Care 14:57-67, 1976.

22. Ware JE Jr: SF-36 health survey update. Spine 25:3130-3139, 2000.

23. Ware JE Jr, Snow KK, Kosinski M, et al: SF-36 Health Survey Manual and Interpretation Guide. Boston, The Health Institute, 1993.

24. Beaton DE, Bombardier C, Hogg-Johnson SA: Measuring health in injured workers: A cross-sectional comparison of five generic health status instruments in workers with musculoskeletal injuries. Am J Ind Med 29:618-631, 1996.

25. Beaton DE, Hogg-Johnson S, Bombardier C: Evaluating changes in health status: Reliability and responsiveness of five generic health status measures in workers with musculoskeletal disorders. J Clin Epidemiol 50:79-93, 1997.

26. Visser MC, Fletcher AE, Parr G, et al: A comparison of three quality of life instruments in subjects with angina pectoris: The Sickness Impact Profile, the Nottingham Health Profile, and the Quality of Well Being Scale. J Clin Epidemiol 47:157-163, 1994.

27. Revicki DA, Kaplan RM: Relationship between psychometric and utility-based approaches to the measurement of health-related quality of life. Qual Life Res 2:477-487, 1993.

28. Feeny D: Preference-based measures: utility and quality-adjusted life years. In Fayers P, Hays R (eds): Assessing Quality of Life in Clinical Trials, 2nd ed. New York, Oxford University Press, 2005, pp 405-429.


  29.  F Jt, Pey rK, Bei Ja, e : Defi he iiyimp dieee i pi me mee. Pi 88:287-294,2000.

30. Garrett S, Jenkinson T, Kennedy LG, et al: A new approach to den-ing disease status in ankylosing spondylitis: The BATH AnkylosingSpondylitis Disease Activity Index. J Rheumatol 21:2286-2291, 1994.

31. Fries JF: The hierarchy o quality-o-lie assessment, the Health Assess-ment Questionnaire (HAQ), and issues mandating development o atoxicity index. Controlled Clinical Trials 12:106S-117S, 1991.

32. Meenan RF, Gertman PM, Mason JH: Measuring health status in

arthritis: The Arthritis Impact Measurement Scales. Arthritis Rheum23:146-152, 1980.

33. Meenan RF, Mason JH, Anderson JJ, et al: Aims2: The content andproperties o a revised and expanded arthritis impact measurementscales health status questionnaire. Arthritis Rheum 35:1-10, 1992.

34. Bellamy N, Buchanan WW, Goldsmith CH, et al: Validation studyo WOMAC: A health status instrument or measuring clinically-important patient-relevant outcomes ollowing total hip or kneearthroplasty in osteoarthritis. J Orthop Rheum 1:95-108, 1988.

35. Bellamy N, Campbell J, Haraoui B, et al: Clinimetric properties o theAUSCAN osteoarthritis hand index: An evaluation o reliability, valid-ity and responsiveness. Osteoarthritis Cartilage 10:863-869, 2002.

36. Van Gestel AM, Prevoo MLL, Van’t Ho MA, et al: Developmentand validation o the European League Against Rheumatism responsecriteria or rheumatoid a rthritis. Arthritis Rheum 39:34-40, 1996.

37. Vrijhoe HJM, Diederiks JPM, Spreeuwenberg C, et al: Applying lowdisease activity criteria using the DAS28 to assess stability in patients

with rheumatoid arthritis. Ann Rheum Disease 62:419-422, 2003.38. Aletaha D, Ward MM, Machold KP, et al: Remission and active

disease in rheumatoid arthritis: Dening criteria or disease activitystates. Arthritis Rheum 52:2625-2636, 2005.

39. Lassere M, van der Heijde D, Johnson K, et al: Robustness and gen-eralizability o smallest detectable dierence in radiological progres-sion. J Rheumatol 28:911-913, 2001.

40. Ravaud P, Giraudeau B, Auleley GR, et al: Assessing smallest detect-able change over time in continuous structural outcome measures:Application to radiological change in knee osteoarthritis. J Clin Epi-demiol 52:1225-1230, 1999.

41. Lassere M, Johnson K, Van Santen S, et al: Generic patient sel-report and investigator report instruments o therapeutic saety andtolerability. J Rheumatol 32:2033-2036, 2005.

42. U.S. Department o Health and Human Services Food and DrugAdministration Center or Drug Evaluation and Research (CDER):Guidance or industry: Patient-reported outcome measures: Use

in medical product development to support labeling claims: Dratguidance. Available at: http://www.da.gov/cder/gdlns/prolbl.htm . Accessed November 3, 2006.

43. Woodworth T, Furst DE, Alten R, et al: Standardizing assessment and reporting of adverse effects in rheumatology clinical trials, II: Rheumatology Common Toxicity Criteria v2.0. J Rheumatol 34:1401-1414, 2007.

44. Kristjansson E, Tugwell PS, Wilson AJ, et al: Development of the effective musculoskeletal consumer scale. J Rheumatol 34:1392-1400, 2007.

45. Escorpizo R, Bombardier C, Boonen A, et al: Worker productivity outcome measures in arthritis. J Rheumatol 34:1372-1380, 2007.

46. Lerner D, Amick BC III, Rogers WH, et al: The work limitations questionnaire. Med Care 39:72-85, 2001.

47. Gignac MAM, Badley EM, Lacaille D, et al: Managing arthritis and employment: Making arthritis-related work changes as a means of adaptation. Arthritis Care Res 51:909-916, 2004.

48. Gilworth G, Chamberlain AM, Harvey A, et al: Development of a work instability scale for rheumatoid arthritis. Arthritis Care Res 49:349-354, 2003.

49. Backman C, Kennedy SM, Chalmers A, et al: Participation in paid and unpaid work by adults with rheumatoid arthritis. J Rheumatol 31:47-57, 2004.

50. Tugwell P, Bombardier C, Buchanan WW, et al: The MACTAR Patient Preference Disability Questionnaire: An individualized functional priority approach for assessing improvement in physical disability in clinical trials in rheumatoid arthritis. J Rheumatol 14:446-451, 1987.

51. Buchbinder R, Bombardier C, Yeung M, et al: Which outcome measures should be used in rheumatoid arthritis clinical trials? Clinical and quality-of-life measures’ responsiveness to treatment in a randomized controlled trial. Arthritis Rheum 38:1568-1580, 1995.

52. O’Boyle CA, Hofer S, Ring L: Individualized quality of life. In Fayers P, Hays R (eds): Assessing Quality of Life in Clinical Trials: Methods and Practice, 2nd ed. New York, Oxford University Press, 2005, pp 225-242.

53. Buchbinder R, Bombardier C, Yeung M, et al: Which outcome measures should be used in rheumatoid arthritis clinical trials? Arthritis Rheum 38:1568-1580, 1995.

54. Solomon DH, Bates DW, Horsky J, et al: Development and validation of a patient satisfaction scale for musculoskeletal care. Arthritis Care Res 12:96-100, 1999.

55. Hudak PL, McKeever PD, Wright JG: Understanding the meaning of satisfaction with treatment outcome. Med Care 42:718-725, 2004.

56. Jette AM, Haley SM: Contemporary measurement technique for rehabilitation outcome assessment. J Rehabil 37:339-345, 2005.

57. Jette AM, Keysor JJ: Disability models: Implications for arthritis exercise and physical activity interventions. Arthritis Care Res 49:114-120, 2003.

58. Wilson IB, Cleary PD: Linking clinical variables with health-related quality of life: A conceptual model of patient outcomes. JAMA 273:59-65, 1995.

59. Nagi SZ: A study in the evaluation of disability and rehabilitation potential. Am J Public Health 54:1568-1579, 1964.

60. Verbrugge LM, Jette AM: The disablement process. Soc Sci Med 38:1-14, 1994.

61. World Health Organization: International Classification of Functioning, Disability and Health. Geneva, World Health Organization, 2001.

62. Arts DGT, Keizer NF, Scheffer G: Defining and improving data quality in medical registries: A literature review, case study and generic framework. J Am Med Inform Assoc 9:600-611, 2002.

63. Stucki G, Boonen A, Tugwell P, et al: The World Health Organisation International Classification of Functioning, Disability and Health (ICF): A conceptual model and interface for the OMERACT process. J Rheumatol 34(3):600-606, 2007.

64. Jette AM: Toward a common language for function, disability, and health. Phys Ther 86:726-734, 2006.

65. Jette AM, Keysor JJ: Uses of evidence in disability outcomes and effectiveness research. Milbank Q 80:325-345, 2002.

66. McHorney CA, Tarlov AR: Individual-patient monitoring in clinical practice: Are available health status surveys adequate? Qual Life Res 4:293, 1995.

67. Lohr KN, Aaronson NK, Alonso J, et al: Evaluating quality-of-life and health status instruments: Development of scientific review criteria. Clin Therap 18:979-992, 1996.

68. McDowell I, Jenkinson C: Development standards for health measures. J Health Serv Res Policy 1:238-246, 1996.

69. Boers M, Brooks P, Strand V, et al: The OMERACT Filter for outcome measures in rheumatology. J Rheumatol 25:198-199, 1998.

70. Kane RA, Kane RL: Assessing the Elderly: A Practical Guide to Measurement. Toronto, Lexington Books, 1981, pp 13-17.

71. Bergner M: Health status measures: An overview and guide for selection. Ann Rev Public Health 8:191-210, 1987.

72. Law M: Measurement in occupational therapy: Scientific criteria for evaluation. Can J Occup Ther 54:133-138, 1987.

73. Scientific Advisory Committee of the Medical Outcomes Trust: Assessing health status and quality of life instruments: Attributes and review criteria. Qual Life Res 11:193-205, 2002.

74. Hays RD, Revicki D: Reliability and validity (including responsiveness). In Fayers P, Hays R (eds): Assessing Quality of Life in Clinical Trials: Methods and Practice, 2nd ed. New York, Oxford University Press, 2005, pp 25-39.

75. Stratford PW, Binkley JM: Applying the results of self-report measures to individual patients: An example using the Roland-Morris Questionnaire. J Orthop Sports Phys Ther 29:232-239, 1999.

76. Verhoeven A, Boers M, van der Linden S: Responsiveness of the core set, response criteria, and utilities in early rheumatoid arthritis. Ann Rheum Dis 59:966-974, 2000.

77. Deyo RA, Centor RM: Assessing the responsiveness of functional scales to clinical change: An analogy to diagnostic test performance. J Chronic Dis 39:897-906, 1986.

78. Kirwan J: Minimum clinically important difference: The crock of gold at the end of the rainbow? J Rheumatol 28:439-444, 2001.

79. Deyo RA, Carter WB: Strategies for improving and expanding the application of health status measures in clinical settings: A researcher-developer viewpoint. Med Care 30(5 Suppl):MS176-MS186, 1992.

80. Tubach F, Dougados M, Falissard B, et al: Feeling good rather than feeling better matters more to patients. Arthritis Care Res 55:526-530, 2006.

81. Beaton DE, Tarasuk V, Katz JN, et al: Are you better? A qualitative study of the meaning of being better. Arthritis Care Res 7:313-320, 2001.

82. Boers M, Anderson JJ, Felson D: Deriving an operational definition of low disease activity state in rheumatoid arthritis. J Rheumatol 30:1112-1114, 2003.

83. Tubach F, Wells GA, Ravaud P, et al: Minimal clinically important difference, low disease activity state and patient acceptable symptom state: Methodological issues. J Rheumatol 32:2025-2029, 2005.

84. Tubach F, Ravaud P, Baron G, et al: Evaluation of clinically relevant states in patient reported outcomes in knee and hip osteoarthritis: The patient acceptable symptom state. Ann Rheum Dis 64:34-37, 2005.

85. Felson DT, Furst DE, Boers M: Rationale and strategies for reevaluating the ACR20. J Rheumatol 34:1184-1187, 2007.

86. Beaton DE, Boers M, Wells GA: Many faces of the minimal clinically important difference (MCID): A literature review and directions for future research. Curr Opin Rheumatol 14:109-114, 2002.

87. Hays RD, Woolley JM: The concept of clinically meaningful difference in health-related quality of life research. PharmacoEconomics 18:419-423, 2000.

88. Wells GA, Beaton DE, Shea B, et al: Minimal clinically important differences: Review of methods. J Rheumatol 28:406-412, 2001.

89. Norman GR, Sloan JA, Wyrwich KW: Interpretation of changes in health-related quality of life: The remarkable universality of half a standard deviation. Med Care 41:582-592, 2003.

90. Salaffi F, Stancati A, Silvestri CA, et al: Minimal clinically important changes in chronic musculoskeletal pain intensity measures on a numerical rating scale. Eur J Pain 8:283-291, 2004.

91. Stucki G, Daltroy L, Katz JN, et al: Interpretation of change scores in ordinal clinical scales and health status measures: The whole may not be equal to the sum of the parts. J Clin Epidemiol 49:711-717, 1996.

92. Angst F, Aeschlimann A, Stucki G: Smallest detectable and minimal clinically important differences of rehabilitation intervention with their implications for required sample sizes using WOMAC and SF-36 quality of life measurement instruments in patients with osteoarthritis of the lower extremities. Arthritis Care Res 45:384-391, 2001.

93. Tubach F, Ravaud P, Baron G, et al: Evaluation of clinically relevant changes in patient reported outcomes in knee and hip osteoarthritis: The minimal clinically important improvement. Ann Rheum Dis 64:29-33, 2005.

94. Jacobson NS, Roberts LJ, Berns SB, et al: Methods for defining and determining the clinical significance of treatment effects: Description, application, alternatives. J Consult Clin Psychol 67:300-307, 1999.

95. Norman G: Hi! How are you? Response shift, implicit theories and differing epistemologies. Qual Life Res 12:249, 2003.

96. National Institutes of Health: Patient Reported Outcome Measurement Information System (PROMIS) network. 2006. Available at: http://www.nihpromis.org/.

97. Athale N, Sturley A, Koczen Z, et al: A web-compatible instrument for measuring self-reported disease activity in arthritis. J Rheumatol 31:223-228, 2004.

98. Fransen J, Stucki G, Twisk J, et al: Effectiveness of a measurement feedback system on outcome in rheumatoid arthritis: A controlled clinical trial. Ann Rheum Dis 62:624-629, 2003.

99. Bischoff-Ferrari HF, Vandechend M, Bellamy N, et al: Validation and patient acceptance of a computer touch screen version of the WOMAC 3.1 osteoarthritis index. Ann Rheum Dis 64:80-84, 2004.

100. Shaul MP: From early twinges to mastery: The process of adjustment in living with rheumatoid arthritis. Arthritis Care Res 8:290-297, 1995.

101. Schwartz C, Sprangers M, Fayers P: Response shift: You know it’s there but how do you capture it? Challenges for the next phase of research. In Fayers P, Hays R (eds): Assessing Quality of Life in Clinical Trials: Methods and Practice, 2nd ed. New York, Oxford University Press, 2005, pp 275-290.