
604 Journal of Pain and Symptom Management Vol. 42 No. 4 October 2011

Review Article

A Psychometric Evaluation of Measures of Spirituality Validated in Culturally Diverse Palliative Care Populations

Lucy Selman, BA, MPhil, PG Cert Pall Care, Richard Siegert, BSc, MSocSci, DipPsych (Clin), PhD, Richard Harding, BSc, MSc, PhD, DipSW, Marjolein Gysels, BA, MA, PhD, Peter Speck, BSc, MA, and Irene J. Higginson, BMedSci, BMBS, PhD, FFPHM, FRCP

Department of Palliative Care, Policy and Rehabilitation (L.S., R.S., R.H., P.S., I.J.H.), Cicely Saunders Institute, King’s College London, London, United Kingdom; and Barcelona Centre for International Health Research (CRESIB) (M.G.), University of Barcelona, Barcelona, Spain

Abstract

Context. Despite the need to accurately measure spiritual outcomes in diverse palliative care populations, little attention has been paid to the properties of the tools currently in use.

Objectives. This systematic review aimed to appraise the psychometric properties, multifaith appropriateness, and completion time of spiritual outcome measures validated in multicultural advanced cancer, HIV, or palliative care populations.

Methods. Eight databases were searched to identify relevant validation and research studies. A comprehensive search strategy included search terms in three categories: palliative care, spirituality, and outcome measurement. Inclusion criteria were: validated in advanced cancer, HIV, or palliative care populations and in an ethnically diverse context. Included tools were evaluated with respect to psychometric properties (validity, reproducibility, responsiveness, and interpretability), multifaith appropriateness, and time to complete.

Results. A total of 191 articles were identified, yielding 85 tools. Twenty-six tools (representing four families of measures and five individual tools) met the inclusion criteria. Twenty-four tools demonstrated good content validity and 12 demonstrated adequate internal consistency. Only eight tools demonstrated adequate construct validity, usually because specific hypotheses were not stated and tested. Seven tools demonstrated adequate test-retest reliability; two tools showed adequate responsiveness, and two met the interpretability criterion. Data on the religious faith of the population of validation were available for 11 tools; of these, eight were tested in multifaith populations.

Conclusion. Results suggest that, at present, the McGill Quality of Life Questionnaire, the Measuring the Quality of Life of Seriously Ill Patients Questionnaire, and the Palliative Outcome Scale are the most appropriate multidimensional measures containing spiritual items for use in multicultural palliative care populations. However, none of these measures score perfectly on all psychometric criteria, and their multifaith appropriateness requires further testing. J Pain Symptom Manage 2011;42:604–622. © 2011 U.S. Cancer Pain Relief Committee. Published by Elsevier Inc. All rights reserved.

Address correspondence to: Lucy Selman, BA, MPhil, PG Cert Pall Care, Department of Palliative Care, Policy and Rehabilitation, King’s College London, Cicely Saunders Institute, Bessemer Road, Denmark Hill, London SE5 9PJ, United Kingdom. E-mail: [email protected]

Accepted for publication: January 26, 2011.

0885-3924/$ - see front matter

doi:10.1016/j.jpainsymman.2011.01.015

Key Words

Systematic review, spirituality, outcome measurement, psychometrics, culture

Introduction

Spirituality, understood to include existential questions relating to meaning and purpose, as well as religious belief and practice, often underpins the experience of advanced illness.1–8 Spirituality has been identified as an important concern for patients with incurable progressive disease, and studies suggest that many people wish to discuss their beliefs with their physicians.9–11 Within palliative care, the need to take into account the role of spirituality is reflected in global policy guidance, which stipulates spiritual care provision and assessment as integral components of the multidimensional care of persons affected by progressive life-limiting disease.12–17

The measurement of spiritual outcomes is essential in screening for spiritual distress, identifying spiritual health and providing appropriate spiritual support,18–21 service evaluation and quality improvement,22 and for research purposes, for example, testing spiritual interventions and investigating the relationship between spiritual variables and other health outcomes. With the growth in research in spirituality within health care in recent years,23 the number of outcome measurement tools has proliferated.24 However, existing measures have been criticized for cultural and religious bias25–27 and psychometric limitations.26,28

In psychometric terms, biased tests are those in which persons from different groups with equal amounts of a trait have different probabilities of scoring high on that trait.29 The cultural and religious bias of spiritual outcome measures is related to inadequate sample representativeness in the validation studies of the measures. Most tools have been developed and tested in ethnically and religiously homogeneous samples in the United States, primarily Caucasian24–27 and Protestant.30–32 Bias results from a lack of fit between the “worldview” embedded in the measure and that of the respondent population,28 often because of the uncritical transfer of concepts between cultures or belief systems,33 or the emphasis of irrelevant issues or de-emphasis of issues of importance.27,32 There is evidence from the United States of differences in how spirituality is conceptualized in Caucasian, Latino, and African-American populations34–36 and that this has implications for the validity and appropriateness of outcome measures in diverse populations.25,37–39 For example, differences in the factor structure of the Spiritual Well-Being Scale (SWBS)40,41 in Caucasian and African-American populations suggest cultural differences in the concept of spirituality and the interpretation of scale results37,38 and draw into question the construct validity of the tool in African-American populations.39 The language of the SWBS also assumes that religious well-being consists of a close relationship with God and that spiritual well-being can be measured as a composite score of religious and existential well-being. Arguably, the SWBS defines spirituality too narrowly and does not reflect the spirituality of people of a non-Christian background, including atheists, agnostics, and those who define themselves as “spiritual but not religious.”42 Similar concerns have been noted with respect to terms in the Brief Religious Coping (RCOPE) inventory (sins, devil, and church).43 Higher scores on the SWBS among evangelical Christian groups than among mainline denominations44 and Catholics45 also suggest denominational bias.

In addition to cultural and religious biases, psychometric limitations have been identified in existing spiritual outcome measures.26 For example, ceiling effects, which exist where there is considerable negative skew and/or most individuals score within one or two standard deviations of the maximum score, have been reported for the SWBS among evangelical populations.46 Concerns also have been raised regarding the construct validity of a number of tools, including the SWBS,44,46 the Quest scale (a measure of religious orientation),47 and the Purpose in Life Test.48

From a palliative care perspective, an additional problem is that only a limited number of existing tools have been developed and tested in palliative care populations. Given the specific spiritual needs and experiences of patients with progressive incurable disease,3 it is essential that the measures used in palliative care practice and research have been validated in relevant populations. However, evidence suggests this is not the case at present. We recently published results from a comprehensive and systematic review of tools used to measure spirituality in palliative care, advanced cancer, and HIV populations.49 We found that of 50 different tools used to measure spiritual outcomes in those populations, only 30 had been psychometrically validated in those groups.

In our previous publication, we sought to guide tool selection by palliative care clinicians and researchers by focusing on the clinical and cultural characteristics of the populations in which tools had been tested. We identified and categorized those spiritual outcome measurement tools that had been validated in advanced cancer, HIV infection, or palliative care populations, and went on to identify those tools that had been validated cross-culturally. However, it was beyond the remit of that article to investigate the psychometric properties of the identified tools, their multifaith appropriateness, and the burden of completion time they place on patients (i.e., additional factors that are crucial in the selection of measures in palliative care populations). Although previous reviews have identified some of the spiritual measures used in palliative care research,50,51 none have evaluated the psychometric properties of the tools in a systematic way.

In this article, we present a psychometric evaluation of the cross-culturally validated measures identified in our previous publication, also assessing time to complete and the religious diversity of the populations in which the tools were validated. In doing so, we aim to guide the choice of outcome measurement tools in future research into spirituality in multicultural, multifaith palliative care populations.

Methods

Definitions

Spirituality. An inclusive approach to defining spirituality was adopted, as the review aimed to include measures of different types using the range of concepts relating to spirituality, including, for example, indicators of spiritual well-being such as hope. An inclusive conception of spirituality includes religious faith as well as existentialist/humanist positions and is considered applicable to all human beings.52–55 According to this view, “spirituality” refers to those beliefs, values, and practices that relate to the search for existential meaning, purpose, or transcendence, which may or may not include belief in a higher power.

Cross-Cultural Applicability. Although the concept of culture exceeds ethnic, national, and linguistic boundaries, ethnicity is commonly used as a proxy for culture in research.56 This study also adopts this approach, defining measures that are cross-culturally applicable as those which have been validated either in more than one country or in at least one ethnically diverse population in a single country. A study population is defined as “ethnically diverse” if no one ethnic group makes up >60% of the sample.
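The two-part definition above can be sketched as a small check. This is an illustrative sketch only: the function names and the `(country, ethnicities)` data layout are assumptions made for the example and do not come from the review, which applied the rule by hand to published sample descriptions.

```python
from collections import Counter


def is_ethnically_diverse(ethnicities):
    """True if no one ethnic group makes up more than 60% of the sample."""
    counts = Counter(ethnicities)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot assess an empty sample")
    # Integer comparison (largest / total <= 60 / 100) avoids float rounding.
    return max(counts.values()) * 100 <= 60 * total


def is_cross_culturally_applicable(validation_studies):
    """validation_studies: list of (country, ethnicities) pairs (hypothetical layout).

    A tool qualifies if it was validated in more than one country, or in at
    least one ethnically diverse sample within a single country.
    """
    countries = {country for country, _ in validation_studies}
    if len(countries) > 1:
        return True
    return any(is_ethnically_diverse(sample) for _, sample in validation_studies)
```

For instance, a single-country sample split 60/40 between two groups counts as diverse under this reading, whereas a 61/39 split does not.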

Design

The systematic review was conducted in three stages, described below.

Stage 1: Identification of Measures Used in the Literature.

Data Sources. Eight electronic databases were searched (all to June 10, 2010): MEDLINE (from 1950), EMBASE (from 1980), PsycINFO (from 1806), CINAHL (from 1981), British Nursing Index and Archive (from 1985), AMED (Allied and Complementary Medicine; from 1985), Health and Psychosocial Instruments (from 1985), all EBM reviews (Cochrane DSR, ACP Journal Club, DARE, CCTR, CMR, HTA, and NHSEED). Other sources were: hand searching of relevant journals; reference lists of identified studies and relevant review articles; Google Internet search engine (to locate additional validation articles relating to identified tools); and the gray literature.

Search Strategy. The search strategy followed a standardized format designed for Medline and adapted for the other databases used. Search terms were a combination of controlled vocabulary (MeSH) and free text terms. The search strategy for all databases used three groups of terms combined with AND: “palliative care,” “spirituality,” and “outcome measure” (see Ref. 49 for the key words and Medline subject heading terms used).
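The structure of such a search (alternative terms within a concept joined with OR, the three concepts joined with AND) can be sketched as follows. The helper names are hypothetical, and the terms other than the three group labels are placeholders chosen for illustration; the actual key words and MeSH headings are those listed in Ref. 49, and real database interfaces each have their own query syntax.

```python
def or_group(terms):
    """Join the terms of one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(concept_groups):
    """Combine the concept groups with AND, one parenthesised OR-group each."""
    return " AND ".join(or_group(group) for group in concept_groups)


# Placeholder terms only (assumptions); see Ref. 49 for the terms actually used.
concept_groups = [
    ["palliative care", "hospice"],
    ["spirituality", "religion"],
    ["outcome measure", "questionnaire"],
]
query = build_query(concept_groups)
```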

Inclusion/Exclusion Criteria for Studies. Original research studies that measured aspects of spirituality in patients with advanced cancer or HIV disease and/or receiving palliative care (“research study publications”) or that validated quantitative instruments that measure aspects of spirituality in patients with advanced cancer or HIV disease and/or receiving palliative care (“validation study publications”) were included, provided they were reported in English. Qualitative studies and case studies were excluded.

Stage 2: Application of Inclusion/Exclusion Criteria.

Data Extraction: Research Study Publications. Data were extracted from identified research study publications by L. S. into a common table designed by L. S. and confirmed by R. H., M. G., and I. J. H. The name(s) and key features of the tools used were recorded in the table along with characteristics of the studies.

Data Extraction: Validation Study Publications. Data were extracted from identified validation study publications by L. S. into three common tables designed by L. S. and confirmed by R. H., M. G., and I. J. H. The following extracted data were entered into the tables, organized by tool name: purpose, description, psychometric and clinical properties (e.g., content and construct validity, internal consistency, test-retest reliability, responsiveness, and time to complete), and demographic and clinical characteristics of population(s) of validation.

Inclusion/Exclusion Criteria for Tools. The inclusion/exclusion criteria included the following: 1) validated in at least one of the following populations: patients with advanced cancer (stated to be at an advanced stage, Stage III or IV, or no longer responding to curative treatment); patients with HIV or AIDS; and patients attending palliative care services (including hospices, end-of-life/terminal/supportive care services), regardless of diagnosis (for excluded populations, see Ref. 49); 2) validated in more than one country or in a study population in which no one ethnicity accounted for >60% of the sample. A grading system was developed by L. S., R. H., and I. J. H. following a format similar to existing criteria for the evaluation of outcome measures57 and applied by L. S.49

Stage 3: Evaluation of Psychometric Properties and Appropriateness. Descriptive information was extracted for each of the selected instruments, including the number of items in the measure, the number of spiritual items, the time period assessed, and the scoring method. In Stage 3, outcome measures meeting Criteria 1 and 2, that is, which had been validated in an ethnically diverse palliative care population, were evaluated according to predetermined review criteria: 1) psychometric properties: content validity, internal consistency, construct validity, floor and ceiling effects, test-retest reliability, agreement, and responsiveness (sensitivity to change) and 2) appropriateness for use in multifaith palliative care contexts: time to complete and multifaith applicability.

The quality criteria (Table 1) were adapted by L. S. and R. S. from the criteria of Terwee et al.57 for the evaluation of health status questionnaires, which have been used in other evaluative systematic reviews of measures.58–60

Table 2 shows the specific adaptations made to the psychometric criteria of Terwee et al., and the rationales for these changes. Time to complete is a recognized factor in assessing the appropriateness of tools for use in palliative care populations and was, therefore, included.61 If no information was given on time to complete, this was recorded as “not stated.” As the tools identified in this review measure aspects of spirituality, which is closely linked to religious faith as well as culture, we also included assessment of the multifaith appropriateness of tools by identifying the faith of the population of validation.

The quality criteria adapted from Terwee et al. were applied independently by L. S. and R. S. to evaluate the quality of each instrument’s properties, summarizing each variable as adequate (+), doubtful (?), poor quality (−), or unknown (0) if insufficient information was available. L. S. and R. S. then compared results. Any discrepancies in application of the criteria were resolved through discussion and, if necessary, consultation with a third reviewer (R. H.).

Table 1. Quality Criteria for Measurement Properties

Columns: Property; Definition; Quality Criteria (a, b)

1. Content validity
Definition: The extent to which the domain of interest is comprehensively sampled by the items in the questionnaire.
(+) A clear description is provided of the measurement aim, the target population, the concepts that are being measured, and the item selection AND target population and (investigators OR experts) were involved in item selection (e.g., through focus groups, surveys, etc.) AND (if translated) rigorous methods of translation and adaptation were used and described.
(?) A clear description of above-mentioned aspects is lacking OR only target population involved OR doubtful design or method (e.g., no adaptation if translated).
(−) No target population involvement.
(0) No information found on target population involvement.

2. Internal consistency
Definition: The extent to which items in a (sub)scale are intercorrelated, thus measuring the same construct.
(+) Factor structure tested through factor analyses performed on adequate sample size (7 × number of items and ≥100) AND Cronbach’s alpha(s) calculated per dimension AND Cronbach’s alpha(s) between 0.70 and 0.95 for the total score and ≥50% of the dimensions reported.
(?) No factor analysis OR doubtful design or method.
(−) Cronbach’s alpha(s) <0.70 or >0.95, despite adequate design and method.
(0) No information found on internal consistency.

3. Criterion validity
Definition: The extent to which scores on a particular questionnaire relate to a gold standard.
(+) Convincing argument that gold standard existed for comparison purposes (i.e., “gold” for the outcome measured) AND correlation with gold standard ≥0.70.
(?) No convincing argument that gold standard is “gold” OR doubtful design or method.
(−) Correlation with gold standard <0.70 despite adequate design and method.
(0) No information found on criterion validity.

4. Construct validity
Definition: The extent to which scores on a particular questionnaire relate to other measures in a manner that is consistent with theoretically derived hypotheses concerning the concepts that are being measured.
(+) Specific hypotheses were formulated AND at least 75% of the results are in accordance with these hypotheses.
(?) Doubtful design or method (e.g., no hypotheses).
(−) Less than 75% of hypotheses were confirmed, despite adequate design and methods.
(0) No information found on construct validity.

5. Reproducibility

5.1. Agreement (interrater reliability)
Definition: The extent to which the scores on repeated measures are close to each other (absolute measurement error).
(+) MIC < SDC OR MIC outside the LOA OR convincing arguments that agreement is acceptable.
(?) Doubtful design or method OR (MIC not defined AND no convincing arguments that agreement is acceptable).
(−) MIC ≥ SDC OR MIC equals or inside LOA, despite adequate design and method.
(0) No information found on agreement.

5.2. Test-retest reliability
Definition: The extent to which patients can be distinguished from each other, despite measurement errors (relative measurement error).
(+) ICC or weighted Kappa ≥0.70 for ≥50% of ICCs/weighted Kappa values reported.
(?) Doubtful design or method (e.g., time interval not mentioned).
(−) ICC or weighted Kappa <0.70 for >50% of ICCs/weighted Kappa values reported, despite adequate design and method.
(0) No information found on reliability.

6. Responsiveness
Definition: The ability of a questionnaire to detect clinically important changes over time.
(+) In the context of an appropriate study design, SDC or SDC < MIC OR MIC outside the LOA OR responsiveness ratio of Guyatt (RR) >1.96 OR AUC ≥0.70.
(?) Doubtful design or method.
(−) SDC or SDC ≥ MIC OR MIC equals or inside LOA OR RR ≤1.96 OR AUC <0.70, despite adequate design and methods.
(0) No information found on responsiveness.

7. Floor and ceiling effects
Definition: The number of respondents who achieved the lowest or highest possible score.
(+) ≤15% of the respondents achieved the highest or lowest possible scores.
(?) Doubtful design or method.
(−) >15% of the respondents achieved the highest or lowest possible scores, despite adequate design and methods.
(0) No information found on floor and ceiling effects.

8. Interpretability
Definition: The degree to which one can assign qualitative meaning to quantitative scores.
(+) Mean and SD scores presented of at least four relevant subgroups of patients.
(?) Doubtful design or method OR less than four subgroups.
(0) No information found on interpretation.

9. Multifaith appropriateness
Definition: Appropriateness of the tool in multifaith populations.
(+) Validated in at least one population in which no one faith >60% OR validated in a population >60% of one faith and in at least one other population in which >60% were of a different faith.
(−) All populations >60% of one faith.
(0) No information found on faith of population of validation.

Table adapted from Ref. 57. SDC = smallest detectable change; LOA = limits of agreement; ICC = intraclass correlation; SD = standard deviation; AUC = area under the receiver operating characteristics curve.
a: + = positive rating; ? = indeterminate rating; − = negative rating; 0 = no information available.
b: Doubtful design or method = lacking a clear description of the design or methods of the study, sample size smaller than 50 subjects (should be at least 50 in every (subgroup) analysis), or important methodological weakness in the design or execution of the study.
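Several of the Table 1 criteria are threshold rules, and a short sketch may make their logic easier to follow. The code below is an illustration under stated assumptions, not the reviewers' procedure: ratings in the review were assigned by two reviewers reading the validation articles. The function names, argument layout, the `design_adequate` flag, and the handling of scales with no reported subscales are all assumptions introduced for the example.

```python
def share_meeting(values, predicate):
    """Fraction of the reported values that satisfy the predicate."""
    return sum(1 for v in values if predicate(v)) / len(values)


def rate_internal_consistency(factor_analysis, n_subjects, n_items,
                              total_alpha, dimension_alphas):
    """Return '+', '?', '-' or '0' for internal consistency."""
    if total_alpha is None:
        return "0"  # no information found
    adequate_sample = n_subjects >= 100 and n_subjects >= 7 * n_items
    if not (factor_analysis and adequate_sample):
        return "?"  # no factor analysis or doubtful design/method

    def in_range(alpha):
        return 0.70 <= alpha <= 0.95

    # Assumption: a scale reported without subscales is judged on its total alpha.
    dims_ok = (not dimension_alphas) or share_meeting(dimension_alphas, in_range) >= 0.5
    return "+" if in_range(total_alpha) and dims_ok else "-"


def rate_test_retest(coefficients, design_adequate=True):
    """coefficients: the reported ICC or weighted kappa values."""
    if not coefficients:
        return "0"
    if not design_adequate:
        return "?"
    return "+" if share_meeting(coefficients, lambda c: c >= 0.70) >= 0.5 else "-"


def rate_multifaith(populations):
    """populations: (largest_faith, share_of_sample) per validation population."""
    if not populations:
        return "0"
    if any(share <= 0.60 for _, share in populations):
        return "+"  # at least one population with no faith above 60%
    dominant_faiths = {faith for faith, _ in populations}
    return "+" if len(dominant_faiths) > 1 else "-"
```

Under these assumptions, a 16-item tool validated with factor analysis in 150 patients, with a total alpha of 0.85 and subscale alphas of 0.80, 0.65, and 0.90, would be rated "+", because two of the three subscale values fall in the 0.70 to 0.95 range.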

Results

Stage 1: Identification of Measures Used in the Literature

A flowchart of the stages of the review process, according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses recommendations,62 is given in Fig. 1. The database searches yielded 3068 articles; 241 were judged relevant on the basis of title and abstract. On further examination, 96 publications were excluded according to inclusion criteria. Hand searching journals, reviews, gray literature, and references and Internet searching yielded an additional 46 articles, giving a total of 191 articles for data extraction. One hundred eighteen of these were research study publications, utilizing a total of 50 different tools; 73 validation articles related to 55 tools. Eliminating duplicates and excluding one tool where the validation article was not written in English gave a total of 85 different tools for appraisal.

Table 2. Modifications to Quality Criteria of Terwee et al. for the Measurement Properties of Health Status Questionnaires

Content validity: To reflect the fact that many tools were adapted versions of measures originally developed and tested in different cultural contexts, the requirement, for translated tools, of being adapted and translated according to rigorous and described methods was added to the content validity criterion. If no cultural adaptation of the measure had been carried out, this counted as doubtful design or method and the tool was given the rating “?” for content validity.

Internal consistency: The criteria of Terwee et al. state that Cronbach’s alpha values should be between 0.70 and 0.95; however, they do not state what proportion of reported Cronbach’s alphas should attain this value. In tools with several subscales, Cronbach’s alphas may be reported for the whole scale as well as each individual subscale. Therefore, we modified the criterion to state that Cronbach’s alpha should be between 0.70 and 0.95 for the total scale and for ≥50% of the Cronbach’s alpha values reported in the validation article(s).

Test-retest reliability: The criteria of Terwee et al. state that the ICC or weighted Kappa value should be ≥0.70; however, they do not specify what proportion of reported ICC/weighted Kappa values should attain this level. Therefore, we modified this criterion to state that the ICC/weighted Kappa values should be ≥0.70 for ≥50% of the ICCs/weighted Kappa values reported.

Interpretability: As “minimal important change” has not been defined for spiritual well-being or other spiritual outcomes, and given the complexity of doing so, the requirement of Terwee et al. that the MIC be defined in order for a tool to score favorably on interpretability was omitted.

ICC = intraclass correlation.
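The Stage 1 counts can be re-derived with simple arithmetic, which is a useful consistency check when reading the flow of articles. The variable names below are introduced only for this sketch; the figures are those reported in the text. The split of the removed tools between duplicates and the one non-English exclusion is not given separately in the text, so only their total is computed.

```python
# Article flow, using the figures reported for Stage 1.
relevant_on_title_abstract = 241
excluded_on_further_examination = 96
from_other_sources = 46

retained_from_databases = relevant_on_title_abstract - excluded_on_further_examination
articles_for_extraction = retained_from_databases + from_other_sources

# The two publication types should account for every extracted article.
research_study_publications = 118
validation_articles = 73
assert research_study_publications + validation_articles == articles_for_extraction

# Tool counts: 50 tools from research studies and 55 from validation articles
# reduce to 85 distinct tools after removing duplicates and one non-English tool.
tools_in_research_studies = 50
tools_in_validation_articles = 55
tools_for_appraisal = 85
tools_removed = (tools_in_research_studies + tools_in_validation_articles
                 - tools_for_appraisal)
```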

Stage 2: Application of Inclusion/Exclusion Criteria

Twenty-six tools met both inclusion criteria (see Ref. 49 for details of excluded tools). Of these, 21 individual tools comprised four “families,” that is, were different versions or adaptations of four tools: the McGill Quality of Life Questionnaire (MQOL),63–77 the Missoula-VITAS Quality of Life Index (MVQoLI),78–80 the Palliative care Outcome Scale (also called the Palliative Outcome Scale [POS]),81–88 and the World Health Organization’s Quality of Life Instrument-HIV (WHOQOL-HIV).89–95 In addition, the following five individual tools were included: the Beck Hopelessness Scale (BHS),96–98 the Existential Loneliness Questionnaire (ELQ),99 the Existential Meaning Scale (EMS),100 the Ironson-Woods Spirituality/Religiousness Index (I-W SR Index Short Form),101 and the Measuring the Quality of Life of Seriously Ill Patients Questionnaire (QUAL-E).102–104 Table 3 presents details of the included measures.

Stage 3: Evaluation of Psychometric Properties

The psychometric properties of the selected measures are evaluated in Table 4. In some instances, there were discrepancies in the scores awarded by L. S. and R. S. during their evaluation of the tools. In all but one of these cases, disagreements were resolved by L. S. and R. S. through consulting the validation articles and referring to the evaluation criteria together. In the remaining instance, which related to the construct validity of the Canadian MQOL, the arbitrator (R. H.) was consulted and the issue resolved.

Validity. Twenty-four of the 26 tools evaluated demonstrated adequate content validity. The ELQ did not meet the criterion as there was no target population involvement in the selection of tool items. Testing of the Israeli version of the MQOL used an inadequate design or method as no cultural adaptation process was described. Adequate internal consistency as assessed by the criterion was demonstrated in 12 of the 26 tools. Five tools demonstrated poor internal consistency as assessed by the criterion: the Persian version of the MQOL and Malay MQOL-Cardiff Short Form (MMQOL-CSF), the revised MVQOLI (MVQOLI-R), and the Argentinean and African versions of the POS. Testing of the internal consistency of seven tools had used a doubtful design or method that did not meet the criterion; most commonly this was a lack of factor analysis. For two tools (the Spanish version of the MQOL and the German version of the POS), internal consistency was not assessed.

Only the BHS was rated positively for criterion validity; however, this was achieved in a psychiatric rather than palliative care population. Ten tools were assessed as having poor criterion validity. The Canadian, Persian, Hong Kong Chinese, and Israeli versions of the MQOL and the MQOL-CSF and MMQOL-CSF demonstrated inadequate criterion validity utilizing a single item scale measuring overall quality of life as the gold standard. The Ugandan version of the MVQoLI (MVQoLI-M) demonstrated inadequate criterion validity against the tool’s global quality of life item. The original, Italian, and Brazilian versions of the WHOQOL-HIV demonstrated inadequate criterion validity against the tool’s general quality of life facet. Of the remaining tools, 14 did not assess criterion validity and testing of the ELQ used a doubtful design or method that did not meet the evaluation criteria.

Fig. 1. Flow diagram for measure selection and evaluation (following PRISMA52).

Eight of the 26 tools demonstrated adequateconstruct validity. Testing of 10 of the toolsused an inadequate design or method; thisincluded the I-W SR Index Short Form;Canadian, Persian, Hong Kong Chinese,Korean MQOL and the MMQOL-CSF; MVQO-LI and MVQOLI-M; U.K. version of the POSand the shorter version of the WHOQOL-HIVinMalay (WHOQOL-HIV BREFMalay version).Themost frequent reason for rating amethodol-ogy as inadequate was the lack of specific hy-potheses before testing. Only the MVQOLI-Rdemonstrated poor construct validity. For seventools, no information was available to assess con-struct validity.

Reproducibility. Information on agreement was given for only five of the tools. The Hong Kong Chinese version of the MQOL and the U.K. POS met the criterion. Some information on agreement was provided for the Canadian MQOL and the German and Argentinean versions of the POS; however, this was not sufficient to ascertain agreement according to the criterion.

Seven tools demonstrated adequate test-retest reliability: the I-W SR Index Short Form, Canadian and Hong Kong Chinese versions of the MQOL, United Kingdom and African versions of the POS, QUAL-E, and WHOQOL-HIV BREF Malay version. Two tools demonstrated poor test-retest reliability (MVQOLI-R and MVQOLI-M). Some information on test-retest reliability was provided for an additional three tools (the Persian version of the MQOL, MQOL-CSF, and the Argentinean version of the POS); however, the design or method was inadequate to meet the criterion. There was no available information on test-retest reliability for 14 of the 26 tools.
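Test-retest reliability of the kind summarized above is often quantified with an intraclass correlation coefficient (ICC). The text here does not name the statistic or the cut-off required by the criterion, so the following is only a minimal sketch under stated assumptions: a one-way random-effects ICC, a hypothetical threshold of 0.70, and invented scores for three patients who each completed a tool twice.

```python
from statistics import mean

# ASSUMPTION: 0.70 is a commonly used cut-off; it is not quoted in this section.
ASSUMED_THRESHOLD = 0.70


def icc_oneway(scores):
    """One-way random-effects ICC(1,1).

    `scores` is a list of per-subject lists, e.g. [[test, retest], ...],
    with the same number of measurements (k >= 2) for every subject.
    """
    n = len(scores)
    k = len(scores[0])
    subject_means = [mean(row) for row in scores]
    grand_mean = mean(x for row in scores for x in row)

    # Mean square between subjects
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Mean square within subjects (test vs retest disagreement)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, subject_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical test and retest totals for three patients
pairs = [[1, 2], [3, 4], [5, 6]]
icc = icc_oneway(pairs)
print(round(icc, 3), icc >= ASSUMED_THRESHOLD)
```

A different ICC form (for example, a two-way model) may be more appropriate for a given study design; the one-way form is used here only because it is the simplest to state.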

Responsiveness. For 20 tools, no information on responsiveness was available. For the remaining six tools, two showed adequate responsiveness according to the criterion (the Canadian version of the MQOL and the MVQOLI-R); validation articles for the four other tools gave some information on responsiveness but did not use the methods required by the criterion (MQOL-J and the United Kingdom, German, and Argentinean versions of the POS).

Interpretability. Two tools met the interpretability criterion (the Canadian MQOL and the WHOQOL-HIV). For 18 tools, some relevant information was available; however, this was insufficient to meet the criterion, usually because details for fewer than four subgroups were provided. For the remaining six tools, no information on interpretability was available.

Appropriateness. No information on the religious faith of the population of validation was given for 15 of the 26 tools. In particular, information in this area was not available for any of the versions of the POS or WHOQOL-HIV; QUAL-E; ELQ; EMS; Canadian, Korean, and Spanish versions of the MQOL; and MQOL-CSF. Three tools, the Persian and Israeli versions of the MQOL and the MVQOLI-R, were validated in populations that did not count as multifaith. The remaining eight tools had been tested in multifaith populations: the Hong Kong Chinese, Taiwanese, and Japanese versions of the MQOL, the MMQOL-CSF, Beck Hopelessness Scale, I-W SR Index Short Form, MVQOLI, and MVQOLI-M.

No information on time to complete was given for 16 of the 26 tools evaluated. Time to complete was given for five of the 10 versions of the MQOL; the longest time to complete was for the Hong Kong Chinese [69] and Taiwanese [73] versions (both 30 minutes) and the shortest was for the Cardiff Short Form (3.26 minutes) [76]. Time to complete was available for all four versions of the POS, ranging from four to six minutes for staff completing the German version of the POS [81] to ≤12 minutes for patients completing the Argentinean version [85]. For the MVQOLI, time to complete was not given in the United States validation studies, whereas the Ugandan study reports that the average completion time was 15–20 minutes in patients with a Karnofsky score above 50% and 30–35 minutes in others. Time to complete was available for the WHOQOL-HIV (45–60 minutes) and the Italian version of the tool (28 minutes).

Table 3. Overview of Cross-Cultural Measures of Spirituality (n = 9)

Tool | Language of Tool | Population of Validation | Patient Diagnoses | Spiritual Constructs Measured (as stated in validation article) | Number of Spiritual Items in Tool (total number of items in tool, if additional) | Time Period of Reference | Scoring

BHS [96–98] | English (United States) | Ethnically diverse U.S. population | AIDS, hospice inpatients with cancer | Hopelessness | 20 items | Current perception | True/false (9 keyed false, 11 true). Each response scored 0 or 1. Total hopelessness score = sum of scores on individual items (possible range 0–20, 20 = most hopeless).

ELQ [99] | English (United States) | Ethnically diverse U.S. population | HIV | The experience of existential loneliness | 22 items | Current perception | 6-point scale (1 = not at all true of me to 6 = very much true of me).

EMS [100] | English (United States) | Ethnically diverse U.S. population | General population, cancer, HIV | Existential meaning: a single conceptual entity that is not confounded by contextual variables such as physical health, vocation, or other external sources of meaning. Based on Frankl [108] | 10 items | Current perception | 5-point scale (1 = strongly disagree to 5 = strongly agree). Higher scores indicate more positive perception of existential meaning.

I-W SR Index Short Form [101] | English (United States) | Ethnically diverse U.S. population | HIV/AIDS | Aims to determine what people mean when they say they are spiritual or religious (i.e., to dissect the dimensions of S/R), to be relevant to private as well as public S/R and to capture spirituality as well as religiosity. | 22 items | Current perception | 5-point scale (1 = strongly disagree to 5 = strongly agree).

MQOL(a) [63–75] | English (Canadian); Spanish (Puerto Rican, Dominican, Mexican, Salvadoran, Ecuadorian and Colombian)(b); Hebrew; Korean; Hong Kong Chinese; Persian; Taiwanese; Japanese | Ethnically diverse Canadian population; Israel, Korea, Hong Kong, Iran, Taiwan | Canada: Palliative care, cancer, HIV. Israel: Advanced cancer. Korea: Terminal cancer. Hong Kong: Palliative care. Iran: Incurable cancer. Taiwan: Terminal cancer. Japan: Palliative care | Spiritual aspects of quality of life: meaning and purpose of life, life worth, feeling good about oneself, value of life | Four items (17 in total) (Hong Kong Chinese version: suggest adding three additional items, giving a total of 20 items) | Previous 2 days (inpatients); previous 7 days (hospice & outpatients) | 11-point scale from 0 to 10, various anchors, for example, completely worthless to very worthwhile.

MQOL-CSF(a) [76,77] | English (U.K.); Malay | Wales, Malaysia | Wales: Palliative care (outpatients). Malaysia: Advanced cancer | Spiritual aspects of quality of life: control over life, life as burden/gift | Two items (eight in total) | Previous 2 days (inpatients); previous 7 days (hospice & outpatients) | 11-point scale from 0 to 10, same anchors as full MQOL.

MVQoLI(a) [78–80] | English (United States); Luganda | United States, Uganda | United States: Hospice patients, end-stage renal disease (ESRD), long-term care. Uganda: Advanced AIDS | Spiritual aspects of quality of life. United States: feeling at peace/being at peace with oneself, Feeling prepared to leave life, Feeling satisfied with oneself, Sense of connection to all things, (Sense of) meaning in life, Being comfortable/uneasy with thought of death, Value of life. Uganda: As above, except “Sense of connection to all things” replaced with “Sense of connection to the supernatural being I believe in” | Eight items(c) (26 in total) | Current perception | 5-point scale with various anchors and scores, for example, agree strongly to disagree strongly.

POS(a) [81–88] | English (United Kingdom); German; Spanish (Argentine); Luganda; Runyoro; Runyankole; Sesotho; Setswana; isiXhosa; and isiZulu (Gauteng and KwaZulu Natal dialects) | United Kingdom, Germany, Austria, Argentina, eight Southern and Eastern African countries (South Africa, Uganda, Botswana, Kenya, Malawi, Tanzania, Zambia, and Zimbabwe) | United Kingdom: Palliative care patients (community, home, hospice, and day care). Germany/Austria: Advanced cancer. Argentina: Mixed cancer population (>50% Stage III/IV). Africa: Palliative care patients | Spiritual outcomes of palliative care. United Kingdom/Argentina/Germany: Life worth, feeling good about oneself as a person. Africa: Life worth, feeling at peace | Two items (United Kingdom, German/Austrian and Argentinean versions: 11 in total; African version: 10 in total) | Previous 3 days | United Kingdom, German/Austrian and Argentinean versions: 0–4 scale, various anchors, for example, no, not at all to yes, all the time. African version: 0–5 scale, various anchors, for example, not at all to yes, all the time.

QUAL-E [102–104] | English (United States) | United States | Stage IV cancer, CHF with ejection fraction ≤20%, dialysis dependent ESRD, COPD with FEV1 ≤1 L | Spiritual aspects of quality of life: feeling at peace, sense of meaning in life, fear of thoughts of death | Three items (31 in total) | Physical symptoms/problems: past month (to elicit), past week (to rate). Other items: current perception/in general | 5-point scale, level of agreement (majority scored: not at all to completely).

WHOQOL-HIV(a) [89–94] | English (Australian); Tamil; Kannada; Hindi; Portuguese (Brazilian); Thai; Shona; Italian; Ukrainian | Australia, India, Brazil, Thailand, Zimbabwe, Italy, and Ukraine | HIV | Spiritual aspects of quality of life: personal beliefs, strength and understanding from personal beliefs, giving meaning to life/extent of meaningfulness of life, guilt, forgiveness | 16 items (124 in total) | Current perception | 1–5 scale (higher scores = better quality of life).

WHOQOL-HIV BREF(a) [95] | Malay | Malaysia | HIV | Spiritual aspects of quality of life: extent of meaningfulness of life, death and dying, concerns about the future, and forgiveness and blame | Four items (31 items total) | Current perception | 1–5 Likert scale (higher scores = better quality of life).

(a) Families of tool (more than one version of the tool exists).
(b) The Spanish version of the MQOL has been translated for conceptual equivalence, but not yet validated [74].
(c) All items from the Transcendence subscale and four items from the Well-Being subscale included (the items relating to affairs being in order and worrying about things “getting out of control” from the Well-Being subscale were excluded as considered psychological).
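The BHS scoring rule given in Table 3 (true/false items, each scored 0 or 1 and summed to a total between 0 and 20) can be sketched as follows. Table 3 states only that 9 items are keyed false and 11 true; which items those are is not given, so the key below is a hypothetical placeholder and not the published scoring key.

```python
# HYPOTHETICAL key: the first 11 items are arbitrarily keyed "true" and the
# last 9 "false". The real item-level key is not reported in Table 3.
HYPOTHETICAL_KEY = [True] * 11 + [False] * 9


def score_true_false_scale(responses, key=HYPOTHETICAL_KEY):
    """Return the total score: one point per response that matches the key."""
    if len(responses) != len(key):
        raise ValueError(f"expected {len(key)} responses, got {len(responses)}")
    return sum(1 for answer, keyed in zip(responses, key) if answer == keyed)


# A respondent who answers "true" to every item matches the 11 true-keyed items.
print(score_true_false_scale([True] * 20))
```

Under this rule the maximum of 20 corresponds to "most hopeless", as stated in the Scoring column.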


Table 4. Psychometric Evaluation of Selected Measures

Measure | Content Validity | Internal Consistency | Criterion Validity | Construct Validity | Agreement | Test-Retest Reliability | Responsiveness | Floor/Ceiling Effects | Interpretability | Multifaith Appropriateness | Time to Complete (Minutes)
BHS [96–98] | +(a) | + | +(a) | +(a) | 0 | 0 | 0 | 0 | 0 | + | Not stated
ELQ [99] | - | ? | ? | ? | 0 | 0 | 0 | 0 | ? | 0 | Not stated
EMS [100] | + | + | 0 | +(a) | 0 | 0 | 0 | + | ? | 0 | Not stated
I-W SR Index Short Form [101] | + | + | 0 | ? | 0 | + | 0 | 0 | ? | + | Not stated
MQOL*
Canadian version [63–67,75] | + | + | - | ? | ? | + | + | + | + | 0 | 15–20
Persian version [70] | + | - | - | ? | 0 | ? | 0 | 0 | 0 | - | Not stated
Hong Kong Chinese version [69] | + | + | - | ? | + | + | 0 | 0 | ? | + | 30
Korean version [71] | + | + | 0 | ? | 0 | 0 | 0 | 0 | ? | 0 | Not stated
Israeli version [68] | ? | + | - | 0 | 0 | 0 | 0 | 0 | ? | - | Not stated
Taiwanese version [73] | + | ? | 0 | + | 0 | 0 | 0 | 0 | ? | + | 30 (range 15–45)
Spanish version [74] | + | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Not stated
MQOL-J [72] | + | ? | 0 | 0 | 0 | 0 | ? | 0 | ? | + | Not stated
MQOL-CSF [76] | + | - | - | + | 0 | ? | 0 | 0 | 0 | 0 | 3.26
MMQOL-CSF [77] | + | + | - | ? | 0 | 0 | 0 | 0 | ? | + | 5.35
MVQoLI
U.S.: MVQoLI [80] | + | ? | 0 | ? | 0 | 0 | 0 | + | ? | + | Not stated
U.S.: MVQoLI-R [78] | + | - | 0 | - | 0 | - | + | 0 | ? | - | Not stated
Uganda: MVQoLI-M [79] | + | ? | - | ? | 0 | - | 0 | 0 | ? | + | 15–20 for well patients, 30–35 for less well
POS
U.K. version [82,86–88] | + | + | 0 | ? | + | + | ? | 0 | ? | 0 | 6.9 (patients); 5.7 (staff)
German version [81] | + | 0 | 0 | 0 | ? | 0 | 0 | 0 | ? | 0 | 9–11 (patients); 4–6 (staff)
Argentinean version [85] | + | - | 0 | + | ? | ? | ? | 0 | 0 | 0 | ≤12 (patients); ≤6 (staff)
African version [83,84] | + | - | 0 | + | 0 | + | ? | 0 | 0 | 0 | Median 5–7; mean 7.8–9.3
QUAL-E [102–104] | + | + | 0 | + | 0 | +(b) | 0 | + | ? | 0 | Not stated
WHOQOL-HIV
WHOQOL-HIV [89–92,94] | + | + | - | 0 | 0 | 0 | 0 | + | + | 0 | 45–60
WHOQOL-HIV Italian version [92] | + | ? | - | 0 | 0 | 0 | 0 | 0 | ? | 0 | 28
WHOQOL-HIV Brazilian version [93] | + | ? | - | 0 | 0 | 0 | 0 | ? | ? | 0 | Not stated
WHOQOL-HIV BREF Malay version [95] | + | + | 0 | ? | 0 | + | 0 | + | ? | 0 | Not stated

(a) Positive rating for a non-palliative care population (BHS: psychiatric; EMS: general population); further testing required in palliative care populations.
(b) Reports test-retest reliability of full sample and subgroups; only data from the full sample were taken into account when evaluating test-retest reliability.
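The per-tool ratings in Table 4 can be tallied programmatically to cross-check the counts quoted in the Results. The sketch below transcribes only the first two rating columns (content validity and internal consistency). The reading of the symbols is an assumption inferred from the Results text, since the table legend is not reproduced here: "+" positive, "-" poor, "?" doubtful design or method, "0" no information. Footnote markers are dropped, so "+(a)" is recorded as "+".

```python
from collections import Counter

# (content validity, internal consistency) per tool, transcribed from Table 4.
# "-" stands for the table's negative rating.
RATINGS = {
    "BHS": ("+", "+"),
    "ELQ": ("-", "?"),
    "EMS": ("+", "+"),
    "I-W SR Index Short Form": ("+", "+"),
    "MQOL Canadian": ("+", "+"),
    "MQOL Persian": ("+", "-"),
    "MQOL Hong Kong Chinese": ("+", "+"),
    "MQOL Korean": ("+", "+"),
    "MQOL Israeli": ("?", "+"),
    "MQOL Taiwanese": ("+", "?"),
    "MQOL Spanish": ("+", "0"),
    "MQOL-J": ("+", "?"),
    "MQOL-CSF": ("+", "-"),
    "MMQOL-CSF": ("+", "+"),
    "MVQoLI": ("+", "?"),
    "MVQoLI-R": ("+", "-"),
    "MVQoLI-M": ("+", "?"),
    "POS U.K.": ("+", "+"),
    "POS German": ("+", "0"),
    "POS Argentinean": ("+", "-"),
    "POS African": ("+", "-"),
    "QUAL-E": ("+", "+"),
    "WHOQOL-HIV": ("+", "+"),
    "WHOQOL-HIV Italian": ("+", "?"),
    "WHOQOL-HIV Brazilian": ("+", "?"),
    "WHOQOL-HIV BREF Malay": ("+", "+"),
}

content_validity = Counter(cv for cv, _ in RATINGS.values())
internal_consistency = Counter(ic for _, ic in RATINGS.values())

print(len(RATINGS))            # 26 tools
print(content_validity["+"])   # 24 tools rated positively for content validity
print(dict(internal_consistency))
```

The same pattern extends to the remaining columns by widening each tuple.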


Discussion

This is the first systematic review and psychometric evaluation of spiritual measures validated in multicultural advanced cancer, HIV, or palliative care populations. Given the extent to which the tool has been tested and used, it is unsurprising that the Canadian MQOL was found to be one of the tools with the strongest psychometric properties, with good content validity, internal consistency, test-retest reliability, responsiveness, and interpretability, and no floor or ceiling effects in evidence. It had a moderate mean completion time of 15–20 minutes, suggesting that it may be burdensome for very sick patients, and further testing is required to determine whether it is appropriate in multifaith populations. For several versions of the MQOL, at least one subscale had a Cronbach’s alpha <0.7 (for the Persian version, this applied to the existential subscale [70]); hence, further investigation of the construct validity of the tool in different cultural contexts may be beneficial. The Spanish version of the MQOL [74] has been translated and culturally adapted but has not yet been subjected to psychometric testing.
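As an illustration of the Cronbach’s alpha statistic referred to above, the following minimal sketch computes alpha for a small subscale from item-level responses and compares it with the 0.7 value mentioned in the text. The response data are invented, and the standard formula alpha = k/(k − 1) × (1 − Σ item variances / variance of total scores) is assumed; the validation studies may have computed it with different software or conventions.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item (column) across respondents
    item_variances = [variance(column) for column in zip(*rows)]
    # Sample variance of each respondent's total score
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical two-item subscale answered by three respondents
responses = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(responses)
print(round(alpha, 3), alpha < 0.7)
```

With real data each row would hold one patient’s answers to all items of a single subscale; a subscale whose items do not rise and fall together yields a low alpha.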

The QUAL-E, developed and validated relatively recently, also showed good psychometric properties in a number of areas. However, its responsiveness, time to complete, and appropriateness in a multifaith population require further testing. The U.K. POS demonstrated good content validity, internal consistency, agreement, and test-retest reliability; however, further testing of its construct validity, responsiveness, and interpretability would be beneficial. Whereas the validation article for the U.K. POS gave a good argument that agreement was sufficient, the German version did not provide a convincing argument that this was the case. Moreover, the statistics reported are not those stated in the criteria of Terwee et al.; therefore, further testing of the German POS is required. The POS demonstrated the short time to complete expected of a tool developed specifically for palliative care clinical practice; however, further testing in multifaith populations is required. The WHOQOL-HIV and WHOQOL-HIV BREF Malay version were found to have promising psychometric properties; however, further evidence of reliability and responsiveness is needed. The full measure’s long completion time (45–60 minutes) suggested it would be overly burdensome for sick patients, and neither tool has been tested in multifaith populations.

Of those tools evaluated that measure solely spiritual constructs, the BHS demonstrated the strongest psychometric properties, although further testing of reliability, and in specifically palliative care populations, is required. The ELQ scored poorly, reflecting the small sample size (n = 47) of the validation study.

Our evaluation highlights a number of ways in which psychometric studies of tools could be made more robust. For example, when testing construct validity, many studies did not report hypotheses for testing, and floor and ceiling effects were sometimes reported without presenting the data to back up the claim, for example, in the validation study of the WHOQOL-HIV Brazilian version [93]. The responsiveness of tools is particularly important in palliative care given the short time in which patients are cared for; however, it was seldom reported and, where it was, very rarely at the level required by the criterion.

In evaluating individual tools using specific criteria, care needs to be taken to ensure that all the domains evaluated are relevant for that particular tool. Internal consistency may not be relevant in tools such as the POS that are not unidimensional, that is, do not aim to measure one underlying construct. As a measure of the extent to which items are intercorrelated, Cronbach’s alpha may not be appropriate as a test of the validity of a multidimensional tool. Furthermore, if measures are designed for self-reporting, then agreement (the extent to which scores on repeated measures by different raters are close to each other) also may not be relevant, and test-retest will be a more appropriate test of reliability. The criterion relating to floor and ceiling effects is also arguably relevant only in scales with total scores and/or subscale scores, and is less relevant if the tool produces no summative scores. Testing for criterion validity is not always possible or meaningful, given the difficulty of determining what counts as a gold standard for a specific outcome. Finally, defining minimal important change (MIC) is complex in the domain of spiritual well-being; hence, consideration of MIC in appraising agreement and responsiveness may not always be appropriate. For substantive measures of spirituality or religiousness such as the I-W SR Index Short Form [101], it is unclear whether the concept of MIC is meaningful.

We also found the criteria used in this study, adapted from those developed by Terwee et al. and used by others, to be particularly stringent in certain respects. For example, the U.K. POS has evidence of responsiveness but did not meet the criteria of Terwee et al. because the study authors did not report the smallest detectable change, MIC, Guyatt’s responsiveness ratio, or the area under the receiver operating characteristics curve. Regarding interpretability, the criteria require that mean values and standard deviations are given for four or more subgroups for the purposes of comparison. Given that this evaluation refers to spiritual tools validated in palliative care populations (rather than general medical populations), this criterion may be rather too strict; we suggest that three subgroups are sufficient evidence of interpretability for some tools, if the mean scores are interpreted in a clinically meaningful way.
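Two of the statistics named above, the smallest detectable change (SDC) and the MIC, are commonly compared with one another when responsiveness is appraised. The sketch below assumes the conventional formulas SEM = SD × √(1 − ICC) and SDC = 1.96 × √2 × SEM, which are not stated in this review; the SD, ICC, and MIC values are hypothetical, and the decision rule (change is detectable when the SDC is smaller than the MIC) is likewise an assumption.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the score standard deviation and a reliability coefficient."""
    return sd * math.sqrt(1 - icc)


def smallest_detectable_change(sem, z=1.96):
    """SDC for an individual patient at the 95% level (z = 1.96)."""
    return z * math.sqrt(2) * sem


# Hypothetical inputs for a 0-100 subscale
sd, icc, mic = 10.0, 0.75, 15.0

sem = standard_error_of_measurement(sd, icc)
sdc = smallest_detectable_change(sem)
print(round(sem, 2), round(sdc, 2), sdc < mic)
```

Because the MIC itself is hard to define for spiritual well-being, any such comparison is only as meaningful as the MIC value supplied.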

There are a number of limitations to this evaluation. The first two relate to assumptions made when applying the criteria. First, we considered content validity to be “transferred” when a tool is adapted to another cultural context (as long as adaptation takes place) or when the tool is used in another clinical population (e.g., BHS). Both of these assumptions are debatable, and we recommend further testing of the BHS in palliative care populations. Second, a degree of bias in applying the evaluation criteria could result from how liberal or conservative the reviewers were in their application of the quality criteria of Terwee et al. For example, when ascertaining floor and ceiling effects for some of the tools (e.g., WHOQOL-HIV BREF Malay version) we looked at stated mean and standard deviation scores and scored the tool positively if it could be determined from the data given that fewer than 15% of respondents obtained the lowest or highest possible scores. Other reviewers may decide to be stricter and require an explicit statement that not more than 15% scored thusly. Third, application of the criteria does not take into account the sample size used in the validation study (or studies), other than the requirement of a sample size greater than 50 subjects in order for the design to count as adequate. The fact that the Canadian MQOL, POS, and WHOQOL-HIV have been tested in numerous populations whereas the Japanese [72] and Taiwanese [73] versions of the MQOL have been tested in only one relatively small sample (n = 83 and n = 64, respectively) should, therefore, be taken into account when interpreting our findings.
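The 15% rule for floor and ceiling effects described above can be checked directly when respondent-level total scores are available. The sketch below is illustrative only: the scores are invented, and the function names are hypothetical rather than taken from any validation study.

```python
def floor_ceiling_proportions(total_scores, min_score, max_score):
    """Return the proportions of respondents at the lowest and highest possible score."""
    n = len(total_scores)
    floor = sum(1 for s in total_scores if s == min_score) / n
    ceiling = sum(1 for s in total_scores if s == max_score) / n
    return floor, ceiling


def has_floor_or_ceiling_effect(total_scores, min_score, max_score, threshold=0.15):
    """Flag an effect when more than `threshold` of respondents sit at either extreme."""
    floor, ceiling = floor_ceiling_proportions(total_scores, min_score, max_score)
    return floor > threshold or ceiling > threshold


# Hypothetical totals on a 0-10 scale: 2 of 10 respondents at the floor
scores = [0, 0, 3, 4, 5, 6, 7, 8, 9, 10]
print(floor_ceiling_proportions(scores, 0, 10))
print(has_floor_or_ceiling_effect(scores, 0, 10))
```

When only means and standard deviations are published, as in the cases described above, the proportions cannot be computed exactly and a judgment of this kind has to be inferred.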

Implications

The findings from this review have implications for research in palliative care, particularly in multicultural populations. When selecting suitable tools for the measurement of spiritual variables, the psychometric properties and appropriateness of measures (multifaith applicability and completion time) should be taken into account. This review provides these data for the multicultural tools evaluated. In addition, application of the criteria demonstrates how psychometric studies can improve the testing of spiritual tools and ensure that additional factors of relevance, such as time to complete and multifaith appropriateness, are investigated and reported.

Future Research

This review suggests a number of areas for future research. First, further research should address the omissions and psychometric weaknesses identified in the review, and subject measures to targeted testing to improve their robustness. Second, tools which have been translated and adapted from an existing measure should be evaluated according to existing guidelines [105,106], which focus on translation, synthesis of translations, back translation, expert committee review, and pretesting before psychometric testing [105]. Third, to inform the provision of spiritual care and the evaluation of spiritual interventions [7], existing measures that focus solely on spiritual well-being (e.g., the Functional Assessment of Chronic Illness Therapy-Spiritual Well-Being Scale [107] and the SWBS [40,41]) need to be psychometrically tested in palliative care populations and in diverse cultural and religious contexts.

Conclusion

When selecting an appropriate measure, researchers of spirituality in palliative care populations should consider the clinical and cultural features of the population in which outcome measures have been validated, the psychometric properties of the measures, their multifaith appropriateness, and the potential burden of completion to patients. This systematic review suggests that, taking into account these factors, at present the MQOL, QUAL-E, and POS are the most appropriate multidimensional measures containing spiritual items for use in multicultural palliative care populations. However, none of these measures currently scores perfectly on all relevant psychometric criteria, and their multifaith appropriateness also requires further testing.

Disclosures and Acknowledgments

Many thanks to the Sir Halley Stewart Trust, Cicely Saunders International, the Dunhill Medical Trust, and the Luff Foundation for their financial support of this study. The authors declare no conflicts of interest.

References

1. Williams AL. Perspectives on spirituality at the end of life: a meta-summary. Palliat Support Care 2006;4:407–417.

2. Saunders C. Spiritual pain. J Palliat Care 1988;4:29–32.

3. Kearney M, Mount BM. Spiritual care of the dying patient. In: Chochinov HM, Breitbart W, eds. Handbook of psychiatry in palliative medicine. New York: Oxford University Press, 2000:357–373.

4. Roberts JA, Brown D, Elkins T, Larson DB. Factors influencing views of patients with gynecologic cancer about end-of-life decisions. Am J Obstet Gynecol 1997;176(1 Pt 1):166–172.

5. Brady MJ, Peterman AH, Fitchett G, Mo M, Cella D. A case for including spirituality in quality of life measurement in oncology. Psychooncology 1999;8:417–428.

6. Cotton S, Tsevat J, Szaflarski M, et al. Changes in religiousness and spirituality attributed to HIV/AIDS. J Gen Intern Med 2006;21:S14–S20.

7. Speck P, Higginson IJ, Addington-Hall J. Spiritual needs in health care. BMJ 2004;329:123–124.

8. Reed PG. Spirituality and well-being in terminally ill hospitalized adults. Res Nurs Health 1987;10:335–344.

9. The George H. Gallup International Institute. Spiritual beliefs and the dying process: a report on a national survey. Report No. 2. New York: Nathan Cummings Foundation, 1997.

10. Ehman JW, Ott BB, Short TH, Ciampa RC, Hansen-Flaschen J. Do patients want their physicians to inquire about their spiritual or religious beliefs if they become gravely ill? Arch Intern Med 1999;159:1803–1806.

11. King DE, Bushwick B. Beliefs and attitudes of hospital inpatients about faith healing and prayer. J Fam Pract 1994;39:349–352.

12. National Institute for Health and Clinical Excellence. Improving supportive and palliative care for adults with cancer. 2004. Available from http://www.nice.org.uk/csgsp. Accessed March 24, 2011.

13. National Quality Forum. A national framework and preferred practices for palliative and hospice care quality. Report No.: NQFCR-16-06. Washington, DC: National Quality Forum, 2006.

14. Ferris FD, Balfour HM, Bowen K, et al. A model to guide hospice palliative care. Ottawa, ON: Canadian Hospice Palliative Care Association, 2002.

15. National Consensus Project for Quality Palliative Care. Clinical practice guidelines for quality palliative care, 2nd ed. Pittsburgh, PA: National Consensus Project for Quality Palliative Care, 2009.

16. Palliative Care Australia. Standards for providing quality palliative care for all Australians. Canberra, Australia: PCA, 2005.

17. Puchalski C, Ferrell B, Virani R, et al. Improving the quality of spiritual care as a dimension of palliative care: the report of the consensus conference. J Palliat Med 2009;12:885–904.

18. Cobb M. The dying soul: spiritual care at the end of life. Buckingham, UK: Open University Press, 2001.

19. Association of Hospice and Palliative Care Chaplains. Guidelines for hospice and palliative care chaplaincy, 2nd ed. London, UK: Association of Hospice and Palliative Care Chaplains, 2006.

20. Byrne M. Spirituality in palliative care: what language do we need? Int J Palliat Nurs 2002;8:67–74.

21. Lo B, Chou V. Directions in research on spiritual and religious issues for improving palliative care. Palliat Support Care 2003;1:3–5.

22. Hunt J, Cobb M, Keeley VL, Ahmedzai SH. The quality of spiritual care – developing a standard. Int J Palliat Nurs 2003;9:208–215.

23. Stefanek M, McDonald PG, Hess SA. Religion, spirituality and cancer: current status and methodological challenges. Psychooncology 2005;14:450–463.

24. Hall DE, Meador KG, Koenig HG. Measuring religiousness in health research: review and critique. J Relig Health 2008;47:134–163.


25. Lewis LM. Spiritual assessment in African-Americans: a review of measures of spirituality used in health research. J Relig Health 2008;47:458–475.

26. Slater W, Hall TW, Edwards KJ. Measuring religion and spirituality: where are we and where are we going? J Psychol Theol 2001;29:4–21.

27. Hill PC, Pargament KI. Advances in the conceptualization and measurement of religion and spirituality. Implications for physical and mental health research. Am Psychol 2003;58:64–74.

28. Gray J. Measuring spirituality: conceptual and methodological considerations. J Theory Constr Test 2006;10:58–64.

29. Anastasi A. Psychological testing, 6th ed. New York: Macmillan, 1988.

30. Hill PC, Hood RW. Measures of religiosity. Birmingham, AL: Religious Education Press, 1999.

31. Gorsuch R. Measurement in psychology of religion revisited. JPC 1990;9:82–92.

32. Hill PC. Measurement assessment and issues in the psychology of religion and spirituality. In: Paloutzian R, Park CL, eds. Handbook of the psychology of religion. New York: Guilford Press, 2005:43–61.

33. Rogler LH. Methodological sources of cultural insensitivity in mental health research. Am Psychol 1999;54:424–433.

34. Campesino M, Schwartz GE. Spirituality among Latinas/os: implications of culture in conceptualization and measurement. ANS Adv Nurs Sci 2006;29:69–81.

35. Newlin K, Knafl K, Melkus GD. African-American spirituality: a concept analysis. ANS Adv Nurs Sci 2002;25:57–70.

36. Conner NE, Eller LS. Spiritual perspectives, needs and nursing interventions of Christian African-Americans. J Adv Nurs 2004;46:624–632.

37. Miller G, Gridley B, Fleming W. Spiritual Well-Being Scale ethnic differences between Caucasians and African-Americans: follow up analyses. Paper presented at the 109th Annual Meeting of the American Psychological Association, August 24, 2001, San Francisco, CA. Available from http://www.eric.ed.gov/PDFS/ED467831.pdf. Accessed May 17, 2011.

38. Miller G, Fleming W, Brown-Anderson F. Spiritual Well-Being Scale ethnic differences between Caucasians and African-Americans. J Psychol Theol 1998;26:358–364.

39. Utsey SO, Lee A, Bolden MA, Lanier Y. A confirmatory test of the factor validity of scores on the Spiritual Well-Being Scale in a community sample of African Americans. J Psychol Theol 2005;3:251.

40. Bufford RK, Paloutzian RF, Ellison CW. Norms for the Spiritual Well-Being Scale. J Psychol Theol 1991;19:56–70.

41. Paloutzian R, Ellison C. Loneliness, spiritual well-being and quality of life. In: Peplau A, Perlman D, eds. Loneliness: A sourcebook of current theory, research and therapy. New York: Wiley Interscience, 1982.

42. Gioiella ME, Berkman B, Robinson M. Spirituality and quality of life in gynecologic oncology patients. Cancer Pract 1998;6:333–338.

43. Mytko JJ, Knight SJ. Body, mind and spirit: towards the integration of religiosity and spirituality in cancer quality of life research. Psychooncology 1999;8:439–450.

44. Scott E, Agresti A, Fitchett G. Factor analysis of the spiritual well-being scale and its clinical utility with psychiatric inpatients. J Sci Study Relig 1998;37:314–321.

45. Bassett RL, Camplin W, Humphrey D, et al. Measuring Christian maturity: a comparison of several scales. J Psychol Theol 1991;19:84–93.

46. Ledbetter M, Smith L, Vosler-Hunter W, Fischer J. An evaluation of the research and clinical usefulness of the spiritual well-being scale. J Psychol Theol 1991;19:49–55.

47. Watson P, Morris R, Hood RW, Waddell M. Religion and the experiential system: relationships of constructive thinking with a religious orientation. Int J Psychol Relig 1999;9:195–207.

48. Dufton B, Perlman D. The association between religiosity and the purpose-in-life test: does it reflect purpose or satisfaction. J Psychol Theol 1986;14:42–48.

49. Selman L, Harding R, Gysels M, Speck P, Higginson IJ. The measurement of spirituality and the content of tools validated cross-culturally: a systematic review in palliative care. J Pain Symptom Manage 2011;41:728–753.

50. Vivat B. Measures of spiritual issues for palliative care patients: a literature review. Palliat Med 2008;22:859–868.

51. Albers G, Echteld MA, De Vet H, et al. Content and spiritual items of quality-of-life instruments appropriate for use in palliative care: a review. J Pain Symptom Manage 2010;40:290–300.

52. Burkhardt MA. Spirituality: an analysis of the concept. Holist Nurs Pract 1989;3:69–77.

53. Heyse-Moore LH. On spiritual pain in the dying. Mortality 1996;1:297–315.

54. Dyson J, Cobb M, Forman D. The meaning of spirituality: a literature review. J Adv Nurs 1997;26:1183–1188.

55. Sulmasy DP. A biopsychosocial-spiritual model for the care of patients at the end of life. Gerontologist 2002;42(Spec No. 3):24–33.

56. Moscou S. The conceptualization and operationalization of race and ethnicity by health services researchers. Nurs Inq 2008;15:94–105.


57. Terwee CB, Bot SDM, de Boer MR, et al. Qual-ity criteria were proposed for measurement proper-ties of health status questionnaires. J Clin Epidemiol2007;60:34e42.

58. Bot SDM, Terwee CB, van der Windt DAWM,et al. Clinimetric evaluation of shoulder disabilityquestionnaires: a systematic review of the literature.Ann Rheum Dis 2004;63:335e341.

59. De Boer MR, Moll AC, De Vet HC, et al. Psy-chometric properties of vision-related quality oflife questionnaires: a systematic review. OphthalmicPhysiol Opt 2004;24:257e273.

60. Ashford S, Slade M, Malaprade F, Turner-Stokes L. Evaluation of functional outcome mea-sures for the hemiparetic upper limb: a systematicreview. J Rehabil Med 2008;40:787e795.

61. Higginson IJ. Quality criteria valuable withslight modification. [Letter]. J Clin Epidemiol2007;60:1315.

62. Moher D, Liberati A, Tetzlaff J, Altman DG.Preferred reporting items for systematic reviewsand meta-analyses: the PRISMA statement. Ann In-tern Med 2009;151:264e269.

63. Cohen SR, Mount BM, Strobel MG, Bui F. TheMcGill Quality of Life Questionnaire: a measure ofquality of life appropriate for people with advanceddisease. A preliminary study of validity and accept-ability. Palliat Med 1995;9:207e219.

64. Cohen SRM. Existential well-being is an important determinant of quality of life: evidence from the McGill Quality of Life Questionnaire. Cancer 1996;77:576–586.

65. Cohen SR, Hassan SA, Lapointe BJ, Mount BM. Quality of life in HIV disease as measured by the McGill Quality of Life Questionnaire. AIDS 1996;10:1421–1427.

66. Cohen SR, Mount BM, Bruera E, et al. Validity of the McGill Quality of Life Questionnaire in the palliative care setting: a multi-centre Canadian study demonstrating the importance of the existential domain. Palliat Med 1997;11:3–20.

67. Cohen SR, Mount BM. Living with cancer: “good” days and “bad” days--what produces them? Can the McGill Quality of Life Questionnaire distinguish between them? Cancer 2000;89:1854–1865.

68. Bentur N, Resnizky S. Validation of the McGill Quality of Life Questionnaire in home hospice settings in Israel. Palliat Med 2005;19:538–544.

69. Lo RS, Woo J, Zhoc KC, et al. Cross-cultural validation of the McGill Quality of Life Questionnaire in Hong Kong Chinese. Palliat Med 2001;15:387–397.

70. Shahidi J, Khodabakhshi R, Gohari MR, Yahyazadeh H, Shahidi N. McGill Quality of Life Questionnaire: reliability and validity of the Persian version in Iranian patients with advanced cancer. J Palliat Med 2008;11:621–626.

71. Kim SH, Gu SK, Yun YH, et al. Validation study of the Korean version of the McGill Quality of Life Questionnaire. Palliat Med 2007;21:441–447.

72. Tsujikawa M, Yokoyama K, Urakawa K, Onishi K. Reliability and validity of Japanese version of the McGill Quality of Life Questionnaire assessed by application in palliative care wards. Palliat Med 2009;23:659–664.

73. Hu WY, Dai YT, Berry D, Chiu TY. Psychometric testing of the translated McGill Quality of Life Questionnaire-Taiwan version in patients with terminal cancer. J Formos Med Assoc 2003;102:97–104.

74. Tolentino VR, Sulmasy DP. A Spanish version of the McGill Quality of Life Questionnaire. J Palliat Care 2002;18:92–96.

75. Henry M, Huang LN, Ferland MK, Mitchell J, Cohen SR. Continued study of the psychometric properties of the McGill Quality of Life Questionnaire. Palliat Med 2008;22:718–723.

76. Lua PL, Salek S, Finlay I, Lloyd-Richards C. The feasibility, reliability and validity of the McGill Quality of Life Questionnaire-Cardiff Short Form (MQOL-CSF) in palliative care population. Qual Life Res 2005;14:1669–1681.

77. Lua PL, Salek MS, Finlay IG, Boay AG, Rahimah MS. The feasibility, reliability and validity of the Malay McGill Quality of Life Questionnaire-Cardiff Short Form (MMQOL-CSF) in Malaysian advanced cancer population. Med J Malaysia 2005;60:28–40.

78. Schwartz CE, Merriman MP, Reed G, Byock I. Evaluation of the Missoula-VITAS Quality of Life Index-revised: research tool or clinical tool? J Palliat Med 2005;8:121–135.

79. Namisango E, Katabira E, Karamagi C, Baguma P. Validation of the Missoula-VITAS Quality-of-Life Index among patients with advanced AIDS in urban Kampala, Uganda. J Pain Symptom Manage 2007;33:189–202.

80. Byock IR, Merriman MP. Measuring quality of life for patients with terminal illness: the Missoula-VITAS Quality of Life Index. Palliat Med 1998;12:231–244.

81. Bausewein C, Fegg M, Radbruch L, et al. Validation and clinical application of the German version of the palliative care outcome scale. J Pain Symptom Manage 2005;30:51–62.

82. Stevens AM, Gwilliam B, A’hern R, Broadley K, Hardy J. Experience in the use of the palliative care outcome scale. Support Care Cancer 2005;13:1027–1034.

83. Powell RA, Downing J, Harding R, Mwangi-Powell F, Connor S. Development of the APCA African Palliative Outcome Scale. J Pain Symptom Manage 2007;33:229–232.

84. Harding R, Selman L, Agupio G, et al. Validation of a core outcome measure for palliative care in Africa: the APCA African Palliative Outcome Scale. Health Qual Life Outcomes 2010;8:10.

85. Eisenchlas JH, Harding R, Daud ML, et al. Use of the Palliative Outcome Scale in Argentina: a cross-cultural adaptation and validation study. J Pain Symptom Manage 2008;35:188–202.

86. Hearn J, Higginson IJ. Development and validation of a core outcome measure for palliative care: the Palliative care Outcome Scale. Qual Health Care 1999;8:219–227.

87. Siegert RJ, Gao W, Walkey FH, Higginson IJ. Psychological well-being and quality of care: a factor-analytic examination of the Palliative care Outcome Scale (POS). J Pain Symptom Manage 2010;40:67–74.

88. Krug R, Karus D, Selwyn PA, Raveis VH. Late-stage HIV/AIDS patients’ and their familial caregivers’ agreement on the Palliative care Outcome Scale. J Pain Symptom Manage 2010;39:23–32.

89. O’Connell K, Skevington S, Saxena S, WHOQOL HIV Group. Preliminary development of the World Health Organisation’s Quality of Life HIV instrument (WHOQOL-HIV): analysis of the pilot version. Soc Sci Med 2003;57:1259–1275.

90. O’Connell KA, Saxena S, Skevington SM. WHOQOL-HIV for quality of life assessment among people living with HIV and AIDS: results from the field test. AIDS Care 2004;16:882–889.

91. The WHOQOL HIV Group. Initial steps to developing the World Health Organization’s Quality of Life Instrument (WHOQOL) module for international assessment in HIV/AIDS. AIDS Care 2003;15:347–357.

92. Starace F, Cafaro L, Abrescia N, et al. Quality of life assessment in HIV-positive persons: application and validation of the WHOQOL-HIV, Italian version. AIDS Care 2002;14:405–415. Available from http://www.ncbi.nlm.nih.gov/pubmed?term=%22Chirianni%20A%22%5BAuthor%5D. Accessed September 7, 2011.

93. Zimpel RR, Fleck MP. Quality of life in HIV-positive Brazilians: application and validation of the WHOQOL-HIV, Brazilian version. AIDS Care 2007;19:923–930.

94. Pedroso B, Pilatti LA, de Francisco AC, dos Santos CB. Quality of life assessment in people with HIV: analysis of the WHOQOL-HIV syntax. AIDS Care 2010;22:361–372.

95. Saddki N, Noor MM, Norbanee TH, et al. Validity and reliability of the Malay version of WHOQOL-HIV BREF in patients with HIV infection. AIDS Care 2009;21:1271–1278.

96. Rosenfeld B, Gibson C, Kramer M, Breitbart W. Hopelessness and terminal illness: the construct of hopelessness in patients with advanced AIDS. Palliat Support Care 2004;2:43–53.

97. Nissim R, Flora DB, Cribbie RA, et al. Factor structure of the Beck Hopelessness Scale in individuals with advanced cancer. Psychooncology 2010;19:255–263.

98. Abbey JG, Rosenfeld B, Pessin H, Breitbart W. Hopelessness at the end of life: the utility of the hopelessness scale with terminally ill cancer patients. Br J Health Psychol 2006 May;11(Pt 2):173–183.

99. Mayers AM, Khoo ST, Svartberg M. The Existential Loneliness Questionnaire: background, development, and preliminary findings. J Clin Psychol 2002;58:1183–1193.

100. Lyon DE, Younger J. Development and preliminary evaluation of the existential meaning scale. J Holist Nurs 2005;23:54–65.

101. Ironson G, Solomon GF, Balbin EG, et al. The Ironson-Woods Spirituality/Religiousness Index is associated with long survival, health behaviors, less distress, and low cortisol in people with HIV/AIDS. Ann Behav Med 2002;24:34–48.

102. Steinhauser KE, Clipp EC, Bosworth HB, et al. Measuring quality of life at the end of life: validation of the QUAL-E. Palliat Support Care 2004;2:3–14.

103. Steinhauser KE, Bosworth HB, Clipp EC, et al. Initial assessment of a new instrument to measure quality of life at the end of life. J Palliat Med 2002;5:829–841.

104. Steinhauser KE, Voils CI, Clipp EC, et al. “Are you at peace?”: one item to probe spiritual concerns at the end of life. Arch Intern Med 2006;166:101–105.

105. Beaton DE, Bombardier C, Guillemin F, Ferraz MB. Guidelines for the process of cross-cultural adaptation of self-report measures. Spine (Phila Pa 1976) 2000;25:3186–3191.

106. Bullinger M, Alonso J, Apolone G, et al. Translating health status questionnaires and evaluating their quality: The IQOLA Project Approach. J Clin Epidemiol 1998;51:913–923.

107. Peterman AH, Fitchett G, Brady MJ, Hernandez L, Cella D. Measuring spiritual well-being in people with cancer: the Functional Assessment of Chronic Illness Therapy-Spiritual Well-Being Scale (FACIT-Sp). Ann Behav Med 2002;24:49–58.

108. Frankl VE. Man’s search for meaning: An introduction to logotherapy. Boston: Beacon Press, 1963.