
Testing the Incremental Utility of the Negative Impression–Positive Impression Differential in Detecting Simulated Personality Assessment Inventory Profiles

Christopher J. Hopwood
Texas A&M University, Massachusetts General Hospital, and Harvard Medical School

Christy A. Talbert and Leslie C. Morey
Texas A&M University

Richard Rogers
University of North Texas

The usefulness of multiscale inventories depends on their ability to evaluate response styles effectively, such as fake-bad (feigning) and fake-good (defensiveness) profiles. The current investigation combined validity data across clinical, nonclinical, and simulating samples to evaluate the usefulness of the Personality Assessment Inventory (PAI) negative impression (NIM)–positive impression (PIM) difference score to detect simulated profiles. In general, its effect sizes were not appreciably different from those afforded by NIM and PIM alone. Likewise, its incremental contributions in logistic regression were minimal. These results do not support the routine use of a NIM–PIM difference score in detecting response styles with the PAI. © 2008 Wiley Periodicals, Inc. J Clin Psychol 64: 338–343, 2008.

Keywords: Personality Assessment Inventory; faking; dissimulation; validity scales

The authors would like to thank Ruth Baer for providing data that were used in the current report. Correspondence concerning this article should be addressed to: Christopher J. Hopwood, Department of Psychology, Massachusetts General Hospital and Harvard Medical School, 15 Parkman Street WACC 815, Boston, MA 02114; e-mail: [email protected]

Psychologists often seek to maximize the interpretation of psychological measures by looking beyond single scores to examine scale configurations. In the realm of response styles, investigations have combined validity scales in an effort to increase their clinical accuracy. One well-known example is the F-K Index (Gough, 1950) of the Minnesota Multiphasic Personality Inventory (MMPI; Hathaway & McKinley, 1951). In feigning studies, the F-K Index has been shown to produce large effect sizes and moderate levels of accurate classification. However, a meta-analysis by Rogers, Sewell, Martin, and Vitacco (2003) found that it produced effect sizes comparable to the F and Fp scales alone. Moreover, its increased complexity posed a formidable challenge to establishing consistent cut scores. Nonetheless, the F-K Index is often included in faking studies with the MMPI, and guidelines for its interpretation are included in major texts on the instrument (e.g., Graham, 2000; Greene, 2000).

As with the F and K scales, an inverse relationship is consistently found between the Personality Assessment Inventory (PAI; Morey, 1991) negative impression (NIM) and positive impression (PIM) scales. For example, in the normative samples (see Morey, 1991), the correlations between these variables were -.34 (community sample) and -.45 (clinical sample). Greene's (1997) bipolarity hypothesis suggests that the differences between scales such as NIM and PIM should be investigated to evaluate their incremental validity over each scale in isolation for detecting fake-bad (feigning) and fake-good (defensiveness) response styles. Anecdotal evidence also suggests that a difference score between NIM and PIM may be effective for the detection of dissimulated profiles. The purpose of the current report is to test the validity of such recommendations.

Method

Data were used from relevant subsets of two large standardization samples and nine simulation samples. The relevant subsets were (sample 1) community (n = 1,000) and (sample 2) clinical (n = 1,246) standardization samples (Morey, 1991), which were completed under normal instructions. During standardization, two samples completed PAI profiles with instructions to (sample 3) fake bad (n = 42) or (sample 4) fake good (n = 43). A subsequent study (Morey & Lanier, 1998) again asked respondents to (sample 5) fake bad (n = 44) or (sample 6) fake good (n = 46). Rogers, Sewell, Morey, and Ustad (1996) conducted a study in which they asked respondents to try to produce profiles suggesting specific diagnoses: 65 feigned schizophrenia (sample 7), 60 feigned depression (sample 8), and 57 feigned generalized anxiety disorder (GAD; sample 9). Baer and Wetter (1997) asked respondents to fake good under (sample 10) uncoached (n = 24) and (sample 11) coached (n = 24) conditions. Complete participant data for each of these samples are described in the original studies.

In the detection of feigning, the critical task is the differentiation of fake-bad from genuinely impaired PAI protocols. Therefore, fake-bad groups were only compared to clinical samples. Conversely, fake-good profiles must be distinguished from relatively adjusted community samples; therefore, they were compared only to the standardization sample. For samples characterized by generalized faking conditions (i.e., 3–6, 10–11), random samples of equal size were drawn from the standardization data. Because samples differed from comparison groups in the original studies, effect sizes reported here for individual scales may vary from effect sizes reported in those studies. For the Rogers et al. (1996) samples of individuals feigning specific disorders, individuals from the clinical normative sample assigned the same diagnosis were the comparison group, as was done in the original study.
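For illustration only, the comparison-group construction described above might be sketched as follows. This is a hypothetical Python reconstruction, not the procedure or software of the original analyses; the DataFrame and column names are placeholders.

```python
import pandas as pd

def draw_comparison_group(standardization: pd.DataFrame, n_simulators: int,
                          seed: int = 0) -> pd.DataFrame:
    """For the generalized faking conditions (samples 3-6 and 10-11): draw a
    random comparison sample of the same size as the simulation sample."""
    return standardization.sample(n=n_simulators, random_state=seed)

def diagnosis_matched_group(clinical: pd.DataFrame, diagnosis: str) -> pd.DataFrame:
    """For the feigned-specific-disorder samples: compare against clinical
    standardization cases assigned the same diagnosis ("diagnosis" is a
    hypothetical column name)."""
    return clinical[clinical["diagnosis"] == diagnosis]
```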


Descriptive statistics and effect sizes were computed for NIM or PIM and the NIM–PIM differential for all simulation samples. This score was computed using T-scores based on community standardization sample data. For interpretive clarity, this difference score was reversed (i.e., PIM–NIM) in positive simulation conditions. Two sets of analyses were conducted, the first to test the ability of this difference score to detect simulated profiles, and the second to test the ability of the difference score to increment NIM (in negative simulation conditions) or PIM (in positive simulation conditions) classifications. Effect sizes (Cohen's d) and areas under the curve (AUC) from receiver operating curve (ROC) analyses were computed to test the main effect of the difference score. To test the incremental contributions of these scores over the relevant validity scale in isolation, hierarchical logistic regressions were conducted, and change in model significance, regression coefficients, and hit rates were computed for the validity scale in isolation and in combination with the difference score.
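To make the scoring and detection indices concrete, a minimal sketch follows (Python, using synthetic placeholder T-scores rather than data from any of the samples above; the original analyses were presumably run in a standard statistical package, so this is an illustration under those assumptions, not the authors' code).

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def cohens_d(x: np.ndarray, y: np.ndarray) -> float:
    """Standardized mean difference using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)

# Synthetic placeholder T-scores (community referenced); NOT study data.
rng = np.random.default_rng(0)
nim_sim, pim_sim = rng.normal(85, 20, 44), rng.normal(42, 10, 44)  # fake-bad simulators
nim_cmp, pim_cmp = rng.normal(62, 14, 44), rng.normal(47, 10, 44)  # clinical comparison

# NIM-PIM differential; it would be reversed (PIM-NIM) for fake-good comparisons.
diff_sim = nim_sim - pim_sim
diff_cmp = nim_cmp - pim_cmp

d = cohens_d(diff_sim, diff_cmp)

# Univariate ROC analysis: 1 = simulated profile, 0 = genuine comparison profile.
y_true = np.r_[np.ones(len(diff_sim)), np.zeros(len(diff_cmp))]
auc = roc_auc_score(y_true, np.r_[diff_sim, diff_cmp])

print(f"Cohen's d = {d:.2f}, AUC = {auc:.2f}")
```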

Results and Discussion

The major findings of the study are summarized in Table 1.

Table 1
Testing the Comparative Effectiveness of NIM–PIM for Fake-Bad and PIM–NIM for Fake-Good Studies

Sample           Indicator    M       SD    Cohen's d   AUC      Δχ²        b1       b2      HR

Fake-bad studies
Morey & Lanier   NIM        110.32   23.18    2.73     .97***   79.82***   .11***   .24*    87.8
                 NIM–PIM     72.18   28.81    2.13     .94***    4.24*              -.10    92.2
Morey            NIM        116.47   23.74    2.97     .97***   80.18***   .11***   .23*    89.5
                 NIM–PIM     77.56   31.71    2.83     .94***    4.23*              -.09    91.9
Rogers SCZ       NIM         86.97   21.18    0.91     .74***   20.93***   .04***   .09***  71.3
                 NIM–PIM     40.64   26.70    0.96     .68**     4.05*              -.04    72.2
Rogers DEP       NIM         79.88   21.45    0.87     .72***   27.20***   .05***   .08***  74.1
                 NIM–PIM     40.92   27.59    0.96     .68***    3.02               -.03    75.1
Rogers GAD       NIM         68.77   21.32    0.85     .74***   15.71***   .07**    .15**   64.7
                 NIM–PIM     27.30   26.75    0.43     .68**     3.04               -.05    74.1

Fake-good studies
Morey & Lanier   PIM         66.24    6.90    2.09     .93***   64.29***   .25***   .09     81.5
                 PIM–NIM     21.52    7.80    1.78     .94***    3.91*               .14    85.9
Morey            PIM         64.70    6.76    1.41     .86***   36.52***   .17***   .12*    73.6
                 PIM–NIM     18.05   10.59    1.34     .85***    2.08                .05    77.0
Baer uncoached   PIM         64.58    5.93    1.52     .86***   24.17***   .20***   .04     78.4
                 PIM–NIM     19.92    6.89    1.71     .89***    5.05*               .16    78.4
Baer coached     PIM         50.29   10.44    0.34     .61       1.85      .04     -.04     59.6
                 PIM–NIM      4.23   14.05    0.28     .67*      3.90*               .07    61.5

Note. All ROC analyses were univariate. Δχ², b1, b2, and HR are from the hierarchical logistic regressions; Δχ² = change from the baseline logistic model at each step, and hit rate is additive. b = beta coefficient; b1 refers to the first step and b2 to the second step. Significance of betas was tested with the Wald test. NIM = negative impression; PIM = positive impression; ROC = receiver operating curve; AUC = area under the curve; HR = hit rate; SCZ = schizophrenic; DEP = depressed; GAD = generalized anxiety disorder.
*p < .05. **p < .01. ***p < .001.


With reference to the two subsets, the mean NIM–PIM score in the PAI clinical standardization sample was 16.49 (SD = 23.49). Based on test construction, the mean PIM–NIM score in the PAI community standardization sample was 0 (SD = 16.36). Although it produced large effect sizes in the fake-bad condition, NIM–PIM (M Cohen's d = 1.46) underperformed NIM alone (M Cohen's d = 1.67). For fake-good comparisons, PIM–NIM (M Cohen's d = 1.28) was virtually identical to PIM alone (M Cohen's d = 1.34). Under both simulation conditions, then, the respective difference scores did not appreciably increment effect sizes.

Receiver operating curve analyses examine the potential effectiveness of any cut score in differentiating simulation conditions from their relevant comparison groups. These findings (see Table 1) are generally consistent with the reported effect sizes. An interesting finding occurred with the Rogers, Ornduff, and Sewell (1993) schizophrenic and depressed groups: although NIM–PIM had slightly higher effect sizes, it accounted for slightly less (4 to 6%) area under the curve. These data suggest that the setting of cut scores might make at least a marginal difference in the respective classification rates of NIM–PIM and NIM alone. With one exception, ROC analyses of fake-good comparisons produced nearly identical AUC estimates. The sole exception was the fake-good coached condition, for which PIM–NIM produced a modest improvement of 6%.

Hierarchical logistic models were constructed to test the ability of the difference scores to significantly increment the validity scales. For fake-bad conditions, χ² tests and b coefficients suggested that the difference score did not increment the model that included NIM in isolation. In addition, most improvements in hit rates were modest. The one exception is the feigning of GAD: the combination of NIM and NIM–PIM improved the classification rate by nearly 10%. Particularly for feigned GAD, knowledgeable simulators often do not produce NIM elevations; for these feigners, the combined cut scores (NIM and NIM–PIM) may have some potential to improve classification rates. In the second step, the regression coefficients for NIM–PIM were negative for all analyses, and the coefficients for NIM increased relative to the first step, suggesting suppression of the NIM-faking relation by the NIM–PIM difference score. Although none of these coefficients was statistically significant, these data may suggest that, to the extent that NIM–PIM increments the information provided by NIM alone, it does not do so in the way that might be anticipated. Specifically, whereas individuals in faking conditions may manifest elevations on NIM but have scores on PIM that are not suppressed, individuals with genuine psychopathology may have elevations on NIM and suppressed PIM scores. Therefore, after controlling for NIM, the NIM–PIM differential may run in the direction opposite to that anticipated or implied by the common use of the MMPI F-K Index. Given the modesty of these effects, such an interpretation should be subjected to future research.
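To illustrate the incremental step just described, the following sketch shows one way a two-step (hierarchical) logistic regression of this kind could be run. Python and statsmodels are assumptions for illustration only, and the arrays are synthetic placeholders, not data from the study.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

# Synthetic placeholder T-scores; NOT study data.
rng = np.random.default_rng(1)
nim_sim, pim_sim = rng.normal(85, 20, 44), rng.normal(42, 10, 44)  # simulators
nim_cmp, pim_cmp = rng.normal(62, 14, 44), rng.normal(47, 10, 44)  # clinical comparison

y = np.r_[np.ones(44), np.zeros(44)]                # 1 = simulated, 0 = genuine
nim = np.r_[nim_sim, nim_cmp]
diff = np.r_[nim_sim - pim_sim, nim_cmp - pim_cmp]  # NIM-PIM differential

# Step 1: the validity scale (NIM) in isolation.
step1 = sm.Logit(y, sm.add_constant(nim)).fit(disp=False)

# Step 2: NIM plus the NIM-PIM differential.
X2 = sm.add_constant(np.column_stack([nim, diff]))
step2 = sm.Logit(y, X2).fit(disp=False)

# Change in model chi-square: likelihood-ratio test on 1 df for the added score.
delta_chi2 = 2 * (step2.llf - step1.llf)
p_increment = stats.chi2.sf(delta_chi2, df=1)

# Hit rate: proportion classified correctly at a .50 probability cutoff.
hit_rate = ((step2.predict(X2) >= 0.5) == y).mean()

# Wald tests of the individual coefficients are reported in the model summary.
print(step2.summary())
print(f"Delta chi-square = {delta_chi2:.2f}, p = {p_increment:.3f}, hit rate = {hit_rate:.2%}")
```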

The hierarchical logistic regression analyses in the fake-good studies indicated that the PIM–NIM difference score improved the hit rate slightly, and the χ² tests suggested a significantly improved model in three of four studies, but the b coefficient for the difference score in the second step was not statistically significant for any of these models. These results suggest that the PIM–NIM difference may increment PIM, although modestly at best, in the prediction of positive dissimulation. Positive impression in isolation was least effective in detecting coached simulators from the Baer and Wetter (1997) study, and results suggested some potential incremental validity for PIM–NIM in the context of coached positive dissimulation. It should also be noted, however, that other PAI indicators may be more useful than PIM–NIM in augmenting PIM.¹ Unlike with negative dissimulation, there was no suppression, indicating perhaps that individuals who fake good on the PAI produce elevated PIM scores and somewhat suppressed NIM scores relative to individuals in the community.

Although the NIM–PIM configuration appears to provide minimal information above and beyond each scale in isolation regarding dissimulation, other PAI indicator configurations may be helpful in understanding the extent and nature of dissimulation. For example, Hopwood, Morey, Rogers, and Sewell (2007) showed that discrepancies between scores on clinical scales and scores on those scales predicted by NIM can indicate the type of disorder being feigned. Future research should focus on the potential of other PAI scale configurations to assist clinicians in detecting dissimulated profiles.

In summary, difference scores for NIM and PIM are able to detect both fake-good (defensiveness) and fake-bad (feigned) PAI profiles. However, as with their analog, the MMPI F-K Index, the data suggest that these difference scores do not offer incremental differentiation between the relevant criterion groups. With the possible minor exceptions of feigned GAD and positive (particularly coached) dissimulation, NIM and PIM difference scores do not appear to enhance the accurate classification of individuals with different response styles. At present, clinicians should focus on the NIM and PIM scales as well as other validity indicators shown to detect dissimulation. Future research should continue to elaborate the conditions under which NIM–PIM and other indicators provide useful information above and beyond the PAI validity scales.

¹Baer and Wetter did not investigate the Cashel discriminant function (CDF; Cashel, Rogers, Sewell, & Martin-Cannici, 1995), which was developed contemporaneously with their study. We calculated means on the CDF for the Baer and Wetter data and found significant differences between community norms and coached simulators (simulators = 144.40, SD = 8.63; norms = 130.38, SD = 14.24; t = 4.29, p < .001), as well as uncoached simulators (simulators = 149.41, SD = 9.09; norms = 137.64, SD = 12.39; t = 3.86, p < .001). ROC analyses suggested the effectiveness of the CDF with coached (AUC = .78, p < .001) and uncoached (AUC = .79, p < .001) simulators, and it significantly incremented PIM with coached (Δχ² = 14.96, p < .001), but not uncoached (Δχ² = 2.16, p > .10), simulators.

References

Baer, R.A., & Wetter, M.W. (1997). Effects of information about validity scales on underreporting of symptoms on the Personality Assessment Inventory. Journal of Personality Assessment, 68, 402–413.

Cashel, M.L., Rogers, R., Sewell, K., & Martin-Cannici, C. (1995). The Personality Assessment Inventory (PAI) and the detection of defensiveness. Assessment, 2, 333–342.

Gough, H.G. (1950). The F minus K dissimulation index on the MMPI. Journal of Consulting Psychology, 14, 408–413.

Graham, J.R. (2000). MMPI-2: Assessing personality and psychopathology (3rd ed.). New York: Oxford University Press.

Greene, R.L. (1997). Assessment of malingering and defensiveness by multiscale inventories. In R. Rogers (Ed.), Clinical assessment of malingering and deception (2nd ed., pp. 169–207). New York: Guilford Press.

Greene, R.L. (2000). The MMPI-2: An interpretive manual (2nd ed.). Needham Heights, MA: Allyn & Bacon.

Hathaway, S.R., & McKinley, J.C. (1951). MMPI manual. New York: Psychological Corporation.



Hopwood, C.J., Morey, L.C., Rogers, R., & Sewell, K. (2007). Malingering on the Personality Assessment Inventory: Identification of specific feigned disorders. Journal of Personality Assessment, 88, 43–48.

Morey, L.C. (1991). Personality Assessment Inventory professional manual. Odessa, FL: Psychological Assessment Resources.

Morey, L.C., & Lanier, V.W. (1998). Operating characteristics of six response distortion indicators for the Personality Assessment Inventory. Assessment, 5, 203–214.

Rogers, R., Ornduff, S.R., & Sewell, K. (1993). Feigning specific disorders: A study of the Personality Assessment Inventory (PAI). Journal of Personality Assessment, 60, 399–405.

Rogers, R., Sewell, K.W., Martin, M.A., & Vitacco, M.J. (2003). Detection of feigned mental disorders: A meta-analysis of the MMPI-2 and malingering. Assessment, 10, 160–177.

Rogers, R., Sewell, K.W., Morey, L.C., & Ustad, K.L. (1996). Detection of feigned mental disorders on the Personality Assessment Inventory: A discriminant analysis. Journal of Personality Assessment, 67, 629–640.
