
Page 1: Validating Assessment Centers

Validating Assessment Centers

Kevin R. Murphy
Department of Psychology

Pennsylvania State University, USA

Page 2: Validating Assessment Centers

A Prototypic AC

• Groups of candidates participate in multiple exercises

• Each exercise designed to measure some set of behavioral dimensions or competencies

• Performance/behavior in exercises is evaluated by sets of assessors

• Information from multiple assessors is integrated to yield a range of scores

Page 3: Validating Assessment Centers

Common But Deficient Validation Strategies

• Criterion-related validity studies

– Correlate the overall assessment rating (OAR) with criterion measures

• e.g., OAR correlates .40 with performance measures, but written ability tests do considerably better (.50’s)

– There may be practical constraints to using tests, but psychometric purists are not concerned with such practical constraints
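As a concrete illustration, the basic criterion-related analysis is simply a correlation between the OAR and a criterion measure. The sketch below uses Python with hypothetical values; the variable names and numbers are assumptions for illustration, not results from any study.

```python
import numpy as np

# Hypothetical data: one OAR and one supervisory performance rating per candidate.
# These values are illustrative only; they do not come from a real validation study.
oar = np.array([3.2, 4.1, 2.8, 3.9, 4.4, 3.0, 3.6, 2.5, 4.0, 3.3])
job_performance = np.array([3.0, 4.3, 2.6, 3.5, 4.6, 3.2, 3.4, 2.9, 4.1, 3.1])

# Observed criterion-related validity: Pearson correlation of OAR with the criterion.
r = np.corrcoef(oar, job_performance)[0, 1]
print(f"Observed OAR-criterion correlation: r = {r:.2f}")

# In practice the observed r would be evaluated against range restriction and
# criterion unreliability before comparing it with benchmarks such as the
# ~.40 (AC) and ~.50 (written ability test) figures cited above.
```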

Page 4: Validating Assessment Centers

Common But Deficient Validation Strategies

• Construct Validity Studies

– Convergent and Discriminant Validity assessments

• AC scores often show relatively strong exercise effects and relatively weak dimension/competency effects

• This is probably not the right model for assessing construct validity, but it is the one that has dominated much of the literature

Page 5: Validating Assessment Centers

Common But Deficient Validation Strategies

• Content validation
– Map competencies/behavioral descriptions onto the job
– If the competencies measured by the AC show reasonable similarity to job competencies, content validity is established

• Track record for ACs is nearly perfect because job information is used to select competencies, but evidence that competencies are actually measured is often scant

Page 6: Validating Assessment Centers

Ask the Wrong Question, Get the Wrong Answer

• Too many studies ask “Are Assessment Centers Valid?”

• The question should be “Valid for what?”
– That is, validity is not determined by the measurement procedure or even by the data that arise from that procedure. Validity is determined by what you attempt to do with the data

Page 7: Validating Assessment Centers

Sources of Validity Information

• Validity for what?
– Determine the ways you will use the data coming out of an AC. ACs are not valid or invalid in general; they are valid for specific purposes

• Cast a wide net!
– Virtually everything you do that gives you insight into what the data coming out of an AC mean can be thought of as part of the validation process

Page 8: Validating Assessment Centers

Sources of Validity Information

• Raters
– Rater training, expertise, agreement

• Exercises
– What behaviors are elicited, what situational factors affect behaviors

• Dimensions
– Is there evidence to map from AC behavior to dimensions to the job?

Page 9: Validating Assessment Centers

Sources of Validity Information

• Scores
– A wide range of assessments of the relationships among the different scores obtained in the AC process provides validity information

• Processes
– Evidence that the processes used in an AC tend to produce reliable and relevant data is part of the assessment of validity

Page 10: Validating Assessment Centers

Let’s Validate An Assessment Center!

• Design the AC
• Identify the data that come out of the AC
• Determine how you want to use that data
• Collect and evaluate information relevant to those uses
– Data from pilot tests
– Analysis of AC outcome data
– Evaluations of AC components and process
– Lit reviews, theory and experience

Page 11: Validating Assessment Centers

Design

• Job - Entry-level Human Resource Manager

• Competencies
– Active Listening
– Speaking
– Management of Personnel Resources
– Social Perceptiveness
• Being aware of others' reactions and understanding why they react as they do.
– Coordination
• Adjusting actions in relation to others' actions.
– Critical Thinking
– Reading Comprehension
– Judgment and Decision Making
– Negotiation
– Complex Problem Solving

Page 12: Validating Assessment Centers

Design

               Exercise #1   Exercise #2   Exercise #3
Competency 1
Competency 2
Competency 3
Competency 4
Competency 5
Competency 6

Populate the Matrix – which competencies and what exercises?
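One lightweight way to make this design explicit is to encode the competency-by-exercise matrix directly, so coverage can be checked automatically. The Python sketch below is purely illustrative; the particular assignments of competencies to exercises are placeholders, not recommendations from the presentation.

```python
import pandas as pd

# Hypothetical competency x exercise design matrix (True = measured in that exercise).
# The specific assignments below are illustrative placeholders.
design = pd.DataFrame(
    {
        "Exercise 1": [True, True, False, True, False, False],
        "Exercise 2": [False, True, True, False, True, False],
        "Exercise 3": [True, False, True, False, False, True],
    },
    index=[f"Competency {i}" for i in range(1, 7)],
)
print(design)

# Simple design check: every competency should be covered by at least one exercise,
# and ideally by more than one, so competency scores do not hinge on a single task.
coverage = design.sum(axis=1)
print("\nExercises per competency:\n", coverage)
assert (coverage >= 1).all(), "Some competency is not measured by any exercise"
```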

Page 13: Validating Assessment Centers

Assessors

• How many, what type, which exercises?

             Exercise 1   Exercise 2   Exercise 3
Assessor 1
Assessor 2
Assessor 3
Assessor 4
Assessor 5

Page 14: Validating Assessment Centers

Assessment Data

• Individual behavior ratings?
– How will we set these up so that we can assess their accuracy or consistency?

• Individual competency ratings?
– How will we set these up so that we can assess their accuracy or consistency?

• Pooled ratings
– What level of analysis?

• OAR

Page 15: Validating Assessment Centers

Uses of Assessment Data

• Competency
– Is it important to differentiate strengths and weaknesses?

• Exercise
– Is the AC working as expected? (Exercise effects might or might not be confounds)

• OAR
– Do you care how people did overall?

• Other
– Process tracing for integration. Is it important how ratings change in this process?

Page 16: Validating Assessment Centers

Validation

• The key question in all validation efforts is whether the inferences you want to draw from the data can be supported or justified

– A question that often underlies this assessment involves determining whether the data are sufficiently credible to support any particular use

Page 17: Validating Assessment Centers

Approaches to Validation

• Assessment of the Design
– Competency mapping
• Do exercises engage the right competencies?
• Are competency demonstrations in the AC likely to generalize?
• Are these the right competencies?
– Can assessors discriminate competencies?
• Are the assessors any good?
– Do we know how good they are?

Page 18: Validating Assessment Centers

Approaches to Validation

• Assessment of the Data
– Inter-rater agreement
– Distributional assessments
– Reliability and generalizability analysis
– Internal structure
– External correlates
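As one concrete example of these data-based checks, inter-rater agreement on an AC rating can be summarized with an intraclass correlation. The Python sketch below computes ICC(2,1) (two-way random effects, absolute agreement, single rater; Shrout & Fleiss, 1979) from a candidates-by-assessors matrix; the ratings shown are hypothetical.

```python
import numpy as np

def icc_2_1(ratings: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is an (n_candidates x k_assessors) matrix with no missing cells.
    """
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)   # candidate means
    col_means = ratings.mean(axis=0)   # assessor means

    ss_rows = k * np.sum((row_means - grand) ** 2)
    ss_cols = n * np.sum((col_means - grand) ** 2)
    ss_total = np.sum((ratings - grand) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Shrout & Fleiss (1979) formula for ICC(2,1)
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical OARs: 6 candidates rated by 3 assessors (illustrative values only).
ratings = np.array([
    [3.5, 3.0, 3.5],
    [4.0, 4.5, 4.0],
    [2.5, 2.0, 3.0],
    [3.0, 3.5, 3.0],
    [4.5, 4.0, 4.5],
    [2.0, 2.5, 2.0],
])
print(f"ICC(2,1) = {icc_2_1(ratings):.2f}")
```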

Page 19: Validating Assessment Centers

Approaches to Validation

• Assessment of the Process
– Did assessors have opportunities to observe relevant behaviors?
– What is the quality of the behavioral information that was collected?
– How were behaviors translated into evaluations?
– How were observations and evaluations integrated?

Page 20: Validating Assessment Centers

Approaches to Validation

• Assessment of the Track Record
– Relevant theory and literature
– Relevant experience with similar ACs
• Outcomes with dissimilar ACs

Page 21: Validating Assessment Centers

Assessment of the Design: Competencies

• Competency Mapping (content validation)
– Do exercises elicit behaviors that illustrate the competency?
• Are we measuring the right competencies?
• Evidence that exercises reliably elicit the competencies
– Generalizability from the AC to the world of work

Page 22: Validating Assessment Centers

Assessment of the Design: Assessor Characteristics

• Training and expertise
• What do we know about their performance as assessors?
– One piece of evidence for validity might be information that will allow us to evaluate the performance or the likely performance of our assessors

Page 23: Validating Assessment Centers

Assessment of the Data

• Distributional assessments
– Does the distribution of scores make sense?
– Is the calibration of assessors reasonable given the population being assessed?
– Is there sufficient variability in scores?
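These distributional checks are easy to script. The sketch below uses hypothetical long-format ratings (the column names are assumptions) and looks at the overall score distribution and at per-assessor means and standard deviations as rough calibration and variability checks.

```python
import pandas as pd

# Hypothetical long-format AC ratings; column names are illustrative assumptions.
ratings = pd.DataFrame({
    "assessor":  ["A1", "A1", "A1", "A2", "A2", "A2", "A3", "A3", "A3"],
    "candidate": ["C1", "C2", "C3", "C1", "C2", "C3", "C1", "C2", "C3"],
    "score":     [3.5, 4.0, 2.5, 3.0, 4.5, 2.0, 3.5, 4.0, 3.0],
})

# Does the overall distribution of scores make sense for this candidate population?
print(ratings["score"].describe())

# Are assessors roughly calibrated (similar means), and are they actually using
# the scale (non-trivial standard deviations)?
print(ratings.groupby("assessor")["score"].agg(["mean", "std"]))
```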

Page 24: Validating Assessment Centers

Assessment of the Data

• Reliability and generalizability analyses
– Distinction between reliability and validity is not as fundamental as most people believe
– Assessments of reliability are an important part of validation
– The natural structure of AC data fits nicely with generalizability theory

Page 25: Validating Assessment Centers

Assessment of the Data

• Generalizability
– AC data can be classified according to a number of factors – rater, ratee, competency, exercise
– ANOVA is the starting point for generalizability analysis – i.e., identifying the major sources of variability
• The complexity of the ANOVA design depends largely on whether the same assessors evaluate all competencies and exercises or only some of them

Page 26: Validating Assessment Centers

Assessment of the Data

• Generalizability – an example

– Use ANOVA to examine the variability of scores as a function of
• Candidates
• Dimensions (competencies)
• Exercises (potential source of irrelevant variance)
• Assessors

Page 27: Validating Assessment Centers

Assessment of the Data

Candidates – Overall differences in candidate performance
Dimensions – Does the pool of candidates show more strength in some competency areas than others?
Assessors – Are assessors calibrated?
C x D – Do candidates show different strengths and weaknesses?
C x A – Do assessors agree about candidates?
A x D – Do assessors agree about dimensions (competencies)?
C x D x A – Do assessors agree in their evaluations of the patterns of strength and weakness of different candidates?
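A minimal version of this candidate x dimension x assessor ANOVA can be run with statsmodels, assuming a fully crossed design with one rating per cell (so the three-way interaction is absorbed into the residual). The data below are simulated and the column names are assumptions for illustration; the resulting sums of squares and degrees of freedom are what a generalizability analysis would convert into variance components.

```python
import itertools
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Hypothetical fully crossed design: every assessor rates every candidate on every
# dimension, one rating per cell. Scores are simulated, not real data.
rng = np.random.default_rng(0)
candidates = [f"C{i}" for i in range(1, 9)]
dimensions = [f"D{j}" for j in range(1, 5)]
assessors = [f"A{k}" for k in range(1, 4)]

rows = [
    {"candidate": c, "dimension": d, "assessor": a,
     "score": 3 + rng.normal(scale=0.7)}
    for c, d, a in itertools.product(candidates, dimensions, assessors)
]
data = pd.DataFrame(rows)

# Main effects and two-way interactions are modeled explicitly; with one rating per
# cell, the candidate x dimension x assessor term becomes the residual.
model = ols(
    "score ~ C(candidate) + C(dimension) + C(assessor)"
    " + C(candidate):C(dimension) + C(candidate):C(assessor)"
    " + C(dimension):C(assessor)",
    data=data,
).fit()

# The sums of squares and df in this table give the mean squares that feed the
# usual G-study expected-mean-square equations for variance components.
print(anova_lm(model))
```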

Page 28: Validating Assessment Centers

Assessment of the Data

• Internal Structure
– Early in the design phase, articulate your expectations regarding the relationships among competencies and dimensions
– This articulation becomes the foundation for subsequent assessments
• It is impossible to tell if the correlations among ratings of competencies are too high or too low unless you have some idea of the target you are shooting for

Page 29: Validating Assessment Centers

Assessment of the Data

• Internal Structure
– Confirmatory factor analysis is much better than exploratory for making sense of the internal structure
– Exercise effects are not necessarily a bad thing. No matter how good assessors are, they cannot ignore overall performance levels
• Halo is not necessarily an error; it is part of the judgment process all assessors use

Page 30: Validating Assessment Centers

Assessment of the Data

• Confirmatory Factor Models
– Exercise only
• Does this model provide a reasonable fit?
– Competency
• Does this model provide a reasonable fit?
– Competency + exercise
• How much better is the fit when you include both sets of factors?
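The three competing models can be written in lavaan-style syntax and compared on fit. The sketch below assumes nine observed scores (three competencies rated in each of three exercises, named c{i}_e{j}) held in a pandas DataFrame `ac_scores`, and uses the semopy package as one possible SEM tool; the exact interface shown (Model, fit, calc_stats) is an assumption, and any SEM package with similar syntax would serve.

```python
import semopy  # one possible SEM package; its interface is assumed here

# Exercise-only model: one factor per exercise.
exercise_only = """
E1 =~ c1_e1 + c2_e1 + c3_e1
E2 =~ c1_e2 + c2_e2 + c3_e2
E3 =~ c1_e3 + c2_e3 + c3_e3
"""

# Competency-only model: one factor per competency.
competency_only = """
Comp1 =~ c1_e1 + c1_e2 + c1_e3
Comp2 =~ c2_e1 + c2_e2 + c2_e3
Comp3 =~ c3_e1 + c3_e2 + c3_e3
"""

# Competency + exercise (MTMM-style) model. In practice this model usually needs
# extra identification constraints (e.g., competency and exercise factors forced
# to be uncorrelated); those are omitted from this sketch.
competency_plus_exercise = exercise_only + competency_only

def fit_and_report(name, description, data):
    model = semopy.Model(description)
    model.fit(data)
    print(name)
    print(semopy.calc_stats(model))  # fit indices such as chi-square, CFI, RMSEA

# `ac_scores` would be a DataFrame of ratings with the column names used above.
# for name, desc in [("Exercise only", exercise_only),
#                    ("Competency only", competency_only),
#                    ("Competency + exercise", competency_plus_exercise)]:
#     fit_and_report(name, desc, ac_scores)
```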

Page 31: Validating Assessment Centers

Assessment of the Data

• External Correlates
– The determination of external correlates depends strongly on
• The constructs/competencies you are trying to measure
• The intended uses of the data

Page 32: Validating Assessment Centers

Assessment of the Data

• External Correlates
– Alternate measures of competencies
– Measures of the likely outcomes and correlates of these competencies

Page 33: Validating Assessment Centers

Competencies
– Active Listening
– Speaking
– Management of Personnel Resources
– Social Perceptiveness
– Coordination
– Critical Thinking
– Reading Comprehension
– Judgment and Decision Making
– Negotiation
– Complex Problem Solving

Page 34: Validating Assessment Centers

Alternative Measures

Critical Thinking, Reading Comprehension – Standardized tests

Judgment and Decision Making – Supervisory ratings, Situational Judgment Tests

Page 35: Validating Assessment Centers

Possible Correlates

• Active Listening
– Success in coaching assignments
– Sought as mentor

• Speaking
– Asked to serve as spokesman, public speaker

• Negotiation
– Success in bargaining for scarce resources

Page 36: Validating Assessment Centers

Assessments of the Process

• Opportunities to observe
– Frequency with which target behaviors are recorded

• Quality of the information that is recorded
– Detail and consistency
• Influenced by format – e.g., narrative vs. checklist

Page 37: Validating Assessment Centers

Assessments of the Process

• Observations to evaluations
– How is this done?
– Consistent across assessors?

• Integration
– Clinical vs. statistical
• Statistical integration should always be present but should not necessarily trump consensus
• Process by which consensus moves away from statistical summary should be transparent and documented
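One simple way to keep that integration step transparent is to compute the mechanical (statistical) summary alongside the consensus OAR and record how far the consensus moved from it. The Python sketch below uses hypothetical data; the column names and values are assumptions.

```python
import pandas as pd

# Hypothetical assessor ratings and final consensus OARs for three candidates.
ratings = pd.DataFrame({
    "candidate": ["C1", "C1", "C1", "C2", "C2", "C2", "C3", "C3", "C3"],
    "assessor":  ["A1", "A2", "A3", "A1", "A2", "A3", "A1", "A2", "A3"],
    "rating":    [3.5, 3.0, 3.5, 4.5, 4.0, 4.0, 2.0, 2.5, 3.5],
})
consensus_oar = pd.Series({"C1": 3.5, "C2": 4.5, "C3": 3.5}, name="consensus")

# Statistical integration: a simple mechanical average of assessor ratings.
statistical_oar = ratings.groupby("candidate")["rating"].mean().rename("statistical")

# Document how much the consensus discussion moved each candidate away from the
# mechanical summary; large shifts should be explained and recorded.
audit = pd.concat([statistical_oar, consensus_oar], axis=1)
audit["shift"] = audit["consensus"] - audit["statistical"]
print(audit)
```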

Page 38: Validating Assessment Centers

Assessment of the Track Record

• The history of similar ACs forms part of the relevant research record

• The history of dissimilar ACs is also relevant

Page 39: Validating Assessment Centers

The Purpose-Driven AC

• What are you trying to accomplish with this AC?

• Is there evidence this AC or ones like it have accomplished or will accomplish this thing?

– Suppose the AC is intended principally to serve as part of leadership development. Identifying this principal purpose helps to identify relevant criteria

Page 40: Validating Assessment Centers

Criteria

• Advancement
• Leader success
• Follower satisfaction
• Org success in dealing with turbulent environments
– The process of identifying criteria is largely one of thinking through what the people and the organization would be like if your AC worked

Page 41: Validating Assessment Centers

An AC Validation Report

• Think of validating an AC the same way a pilot does his or her pre-flight checklist

• The more you know about each of the items on the checklist, the more compelling the evidence that the AC is valid for its intended purpose

Page 42: Validating Assessment Centers

AC Validity Checklist

• Do you know (and how do you know) whether:

– The exercises elicit behaviors that are relevant to the competencies you are trying to measure

– These AC demonstrations of competency are likely to generalize

Page 43: Validating Assessment Centers

AC Validity Checklist

• Do you know (and how do you know) whether:

– Raters have the skill, training, expertise needed?

– Raters agree in their observations and evaluations

– Their resolutions of disagreements make sense

Page 44: Validating Assessment Centers

AC Validity Checklist

• Do you know (and how do you know) whether:

– Score distributions make sense
• Are there differences in scores received by candidates?
• Can you distinguish strengths from weaknesses?

Page 45: Validating Assessment Centers

AC Validity Checklist

• Do you know (and how do you know) whether:
– Analyses of Candidates x Dimensions x Assessors yield sensible outcomes
• Assessor – are assessors calibrated?
• C x D – do candidates show patterns of strength and weakness?
• A x D – do assessors agree about dimensions?
• C x D x A – do assessors agree about evaluations of patterns of strengths and weaknesses?

Page 46: Validating Assessment Centers

AC Validity Checklist

• Do you know (and how do you know) whether:
– Factor structure makes sense, given what you are trying to measure
• Do you know anything about the relationships among competencies?
• Is this reflected in the sorts of factor models that fit?

Page 47: Validating Assessment Centers

AC Validity Checklist

• Do you know (and how do you know) whether:
– Competency scores are related to
• Alternate measures of these competencies
• Likely outcomes and correlates of these competencies

Page 48: Validating Assessment Centers

AC Validity Checklist

• Do you know (and how do you know) whether:
– There are Competency x Treatment interactions
• Identifying individual strengths and weaknesses is most useful when different patterns will lead to different treatments (training programs, development opportunities) and when making the right treatment decision for each individual leads to better outcomes than treating everyone the same

Page 49: Validating Assessment Centers

AC Validity Checklist

• Do you know (and how do you know) whether:
– The process supports good measurement
• Do assessors have opportunities to observe relevant behaviors?
• Do they record the right sort of information?
• Is there a sensible process for getting from behavior observation to competency judgment?

Page 50: Validating Assessment Centers

AC Validity Checklist

• Do you know (and how do you know) whether:
– The integration process helps or hurts
• How is integration done?
• Is it the right method given the purpose of the AC?
• How much does the integration process change the outcomes?

Page 51: Validating Assessment Centers

AC Validity Checklist

• Do you know (and how do you know) whether:
– Other similar ACs have worked well
– Other dissimilar ACs have worked better, worse, etc.

Page 52: Validating Assessment Centers

AC Validity Checklist

• Don’t overdo the checklist metaphor
– A pilot will not take off unless everything on the list checks out
– Validation is not an all-or-none thing
• More evidence is better
• Broadly based evidence is better than lots of one kind
• The validation checklist can help you improve the AC
– Your goal is not to make an AC perfect, but to accumulate evidence