
Page 1: PSY 6430 Unit 5

PSY 6430 UNIT 5

Validity

Determining whether the selection instruments are job-related

Today and Wednesday: Lecture
Exam: Monday, 3/18

Page 2: PSY 6430 Unit 5

SO1: NFE, Validity, a little review

Predictor = test/selection instrument
Use the score from the test to predict who will perform well on the job
Possible confusion (again):
You determine the validity of the test based on your current employees
Then you administer it to applicants and select employees based on their scores

(a few students had a problem distinguishing between validity and reliability on E4, example next)

Page 3: PSY 6430 Unit 5

SO1: NFE, Validity, example

Administer a test to current employees
Obtain measures of how well they perform on the job
Correlate the test scores with the performance measures
Assume: The correlation is statistically significant
Assume: Current employees who score 50-75 also are performing very well on the job
Now you administer the exam to applicants, predicting that those who score 50-75 will also perform well on the job

(main point next slide)
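A minimal Python sketch of this procedure, assuming made-up score arrays (nothing here is data from the course): validate the test on incumbents first, then apply the validated score range to applicants.

```python
# A minimal sketch, assuming hypothetical data: validate the test on
# current employees, then apply the validated score range to applicants.
import numpy as np

# Hypothetical incumbent data (not from the course).
test_scores = np.array([52, 61, 70, 45, 74, 38, 66, 58, 49, 72])
performance = np.array([78, 85, 92, 70, 95, 60, 88, 80, 73, 93])

# Correlate the predictor (test) with the criterion (performance).
r = np.corrcoef(test_scores, performance)[0, 1]
print(f"validity coefficient r = {r:.2f}")

# Assuming r is statistically significant: applicants who score in the
# validated 50-75 range are predicted to perform well on the job.
applicant_scores = np.array([55, 41, 68, 73, 30])
likely_good = applicant_scores[(applicant_scores >= 50) & (applicant_scores <= 75)]
print("applicants predicted to perform well:", likely_good)
```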

Page 4: PSY 6430 Unit 5

SO1: NFE, Validity main point

You determine the validity of a selection test or instrument based on your current employees
Then, after establishing the validity (job-relatedness) of the test, give the test to applicants and select them on the basis of their test scores

Page 5: PSY 6430 Unit 5

SO2: Reliability vs. Validity

Reliability: Is the score on the measure stable and dependable?

Validity: Are you actually measuring what you want to be measuring? Is the measure related to performance on the job?

Page 6: PSY 6430 Unit 5

SO3: Relationship between reliability and validity

A measure can be reliable, but not valid
However, a measure cannot be valid unless it is reliable
*Reliability is a necessary but not sufficient condition for validity
Text gives a perfect example: you can reliably measure eye color; however, it may not be related to job performance at all

*key point

Page 7: PSY 6430 Unit 5

Types of validation procedures

Content: expert judgment
Criterion-related: statistical analyses (concurrent & predictive)
Construct (but not practical; not covering this)
Validity generalization (transportable; no local validity study; jobs are similar)
Job component validity (not covering this in this unit, but will return to it briefly in the next unit; transportable; job elements/components are similar but jobs are not)
Small businesses: Synthetic validity (not covering it; not very relevant now given content validity)

(main types are the two kinds of criterion-related validity and content validity; construct is really a holdover from test construction and not very relevant - I have only seen it used by a few organizations that create their own tests; we will touch on validity generalization, but right now, while validity generalization has excellent professional support, it may not be legal - professional guidelines depart from the law; in one case, the 6th Circuit Court ruled it illegal as a matter of law based on Griggs v. Duke Power and Albemarle - 1987)

Page 8: PSY 6430 Unit 5

SO5 NFE, but 7B is: Difference between content and criterion-related validity

Criterion-related validity is also called “empirical” validity
Concurrent validity
Predictive validity
This type of validity relies on statistical analyses (correlation of test scores with measures of job performance)
Measures of job performance = criterion scores
(content next slide)

Page 9: PSY 6430 Unit 5

SO5 NFE but related to 7B which is: Difference between content and criterion-related validity

Content validity, in contrast, relies on expert judgment and a match between the “content” of the job and the “content” of the test
Expert judgment refers to the determination of the tasks and KSAs required to perform the job via a very detailed type of job analysis, linking the KSAs to selection procedures that measure them

Page 10: PSY 6430 Unit 5

NFE: Intro to content validity

You do NOT use statistical correlation to validate your tests
Validation is based “only” on your job analysis procedures and the matrix between KSAs and selection measures
It is much more widely used than criterion-related validity
Particularly since the Supreme Court ruled it was OK to use for adverse impact cases (1995)

Page 11: PSY 6430 Unit 5

SO6: Two reasons why content validity is often used

It can be used with a relatively small number of employees
Large sample sizes are required for criterion-related validity due to the correlation procedures
The text later, when talking about criterion-related validity, indicates you may need over several hundred
Dickinson: usually 50-100 is adequate
How many companies have that many current employees in one position? (small number of incumbents and applicants)

Page 12: PSY 6430 Unit 5

SO6: Two reasons why content validity is often used


Many organizations do not have good job performance measures

You need good performance criterion measures to do a criterion-related validity study because you correlate the test scores with job performance measures

Page 13: PSY 6430 Unit 5

SO7A: Content vs. criterion-related validity and the type of selection procedure

If you use content validity, you should write the test, not select an off-the-shelf test
If you use criterion-related validity, you can do either
It is much easier and less time consuming to use an off-the-shelf test than to write one!

(VERY IMPORTANT! The book waffles on this a bit, indicating that emphasis should be placed on constructing a test, but only in rare situations would I recommend selecting an off-the-shelf test with content validity - legally too risky; why, next slide)

Page 14: PSY 6430 Unit 5

SO7A: Why should you write the test if you use content validity? (this slide, NFE)

Content validity relies solely on the job analysis
The KSAs must be represented proportionately on the selection test as indicated in the job analysis in terms of:
Their relative importance to the job
The percentage of time they are used by the employees
It is highly unlikely that an off-the-shelf test will proportionately represent the KSAs as determined by your job analysis

In some discrimination court cases, the judge has gone through the test item by item to determine whether the items were truly proportional to the KSAs as determined by the job analysis

There are both professional measurement and legal reasons to write the test rather than use an off-the-shelf test

Page 15: PSY 6430 Unit 5

SO7B: Content vs. criterion-related validity: Differences in the basic method used to determine validity (review)

Content validity: relies solely on expert judgment - no statistical verification of job-relatedness
Criterion-related validity: relies on statistical prediction to determine job-relatedness

(I am not going to talk about SO8, face validity; very straightforward)

Page 16: PSY 6430 Unit 5

SO9: What is the “heart” of any validation study and why?

Job analysis
The job analysis determines the content domain of the job - the tasks and KSAs that are required to perform the job successfully

Page 17: PSY 6430 Unit 5

SO10: Major steps of content validity - very, very specific requirements for the job analysis

Describe tasks for the job
*Determine the criticality and/or importance of each of the tasks
Specify the KSAs required for EACH task
KSAs must be linked to each task (NFE)

(cont. next slide)

*Now because of ADA, is it an essential function?

Page 18: PSY 6430 Unit 5

SO10: Major steps of content validity, cont.

Determine the criticality and/or importance of each KSA*
Operationally define each KSA
Describe the relationship between each KSA and each task statement
You can have KSAs that are required for only one or two tasks, or KSAs that are required to perform several tasks
The more tasks that require a KSA, the more important/critical it is

Describe the complexity or difficulty of obtaining each KSA (formal degree, experience)

Specify whether the employee must possess each KSA upon entry or whether it can be acquired on the job (cannot test for a KSA if it can be learned within 6 months)

Indicate whether each KSA is necessary for successful performance of the job

*Only the first major point will be required for the exam, but I want to stress how detailed your job analysis must be for content validity

(cont on next slide)

Page 19: PSY 6430 Unit 5

SO10: Major steps of content validity, cont.

Link important job tasks to important KSAs* (FE)
Reverse analysis: you have linked the KSAs to the tasks; now you must link the tasks to the KSAs (NFE)
KSA #1 may be relevant to Tasks 1, 6, 7, 10, 12, & 22
KSA #2 may be relevant to Tasks 2, 4, & 5
Etc.
(NFE) Develop a test matrix for the KSAs
If you want to see how you go from the task analysis to the actual test, turn ahead to Figures 7.12, 7.13, 7.14, 7.15, and 7.16 on pages 283-286 and Figure 7.17 on page 290

Page 20: PSY 6430 Unit 5

SO11: When you can’t use content validity according to the Uniform Guidelines

When assessing mental processes, psychological constructs, or personality traits that cannot be directly observed, but are only inferred
You cannot use content validity to justify a test for judgment, integrity, dependability, extroversion, flexibility, motivation, conscientiousness, or adaptability
The reason is that you are basing your job analysis on expert judgment - and judgment is only going to be reliable if you are dealing with concrete KSAs such as mechanical ability, arithmetic ability, or reading blueprints
The more abstract the KSA, the less reliable judgment becomes
If you can’t see it, if you can’t observe it, then the leap from the task statements to the KSAs can result in a lot of error

(text mentions three; I am having you learn the first one and one I added in the SOs -- these are the two that are most violated in practice; the second one is relevant to BOTH content and criterion-related validity, so it shouldn’t be listed under when you can’t use content validity: cannot test for KSAs that can be learned on the job)

Page 21: PSY 6430 Unit 5

SO11: When you can’t use content validity according to the Uniform Guidelines, cont.

When selection is done by ranking test scores or banding them (from U1)
If you rank order candidates based on their test scores and select on that basis, you cannot use content validity - you must use criterion-related validity

If you band scores together, so those who get a score in a specified range of scores are all considered equally qualified, you cannot use content validity - you must use criterion-related validity

Why? If you use ranking or banding, you must be able to prove that individuals who score higher on the test will perform better on the job - the only way to do that is through the use of statistics

The only appropriate (and legally acceptable) cut-off score procedure to use is a pass/fail system where everyone above the cut-off score is considered equally qualified

(only relevant if adverse impact)

Page 22: PSY 6430 Unit 5

Criterion-related validity studies: Concurrent vs. predictive

SO13A: Concurrent validity
Administer the predictor to current employees and correlate scores with measures of job performance
Concurrent in the sense that you have collected both measures at the same time for current employees

SO18A: Predictive validity
Administer the predictor to applicants, hire the applicants, and then correlate scores with measures of job performance collected 6-12 months later
Predictive in the sense that you do not have measures of job performance when you administer the test - you collect them later

(comparison of the two, SO13A, describe concurrent validity; SO18A, describe predictive validity)

Page 23: PSY 6430 Unit 5

Predictive Validity: Three basic ways to do it

Pure predictive validity: by far the best
Administer the test to applicants and randomly hire

Current system: next best, more practical
Administer the test to applicants, use the current selection system to hire (NOT the test)

Use test to hire: bad, bad, bad both professionally and legally
Administer the test, and use the test scores to hire applicants

(going to come back to these and explain the evaluations; text lists the third as an approach! Click: NO!!)

Page 24: PSY 6430 Unit 5

SO13B: Steps for conducting a concurrent validity study

Job analysis: Absolutely a legal requirement
Discrepancy between law and profession (learn for exam)
The law requires a job analysis (if adverse impact & challenged)
The profession does not, as long as the test scores correlate significantly with measures of job performance

Determine KSAs and other relevant requirements from the job analysis, including essential functions for purposes of ADA

Select or write test based on KSAs (learn for exam)
May select an off-the-shelf test or write/construct one

Page 25: PSY 6430 Unit 5

SO13B: Steps for conducting a concurrent validity study

Select or develop measures for job performance
Sometimes a BIG impediment because organizations often do not have good measures of performance
Administer the test to current employees and collect job performance measures for them
Correlate the test scores with the job performance measures
(SO14: add this step) Determine whether the correlation is statistically significant at the .05 level
(not necessary for exam) Administer the test to job applicants and select on the basis of the test scores
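The correlation-and-significance steps above can be sketched in a few lines of Python; the simulated effect size, noise level, and N = 80 below are assumptions for illustration, with scipy's pearsonr supplying the two-tailed test at the .05 level.

```python
# Hedged sketch of the statistical steps, with simulated stand-in data
# (the effect size, noise, and N = 80 are assumptions for illustration).
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
test_scores = rng.normal(60, 10, size=80)                    # incumbents' test scores
performance = 0.5 * test_scores + rng.normal(0, 8, size=80)  # their performance

r, p = pearsonr(test_scores, performance)  # r and its two-tailed p-value
if p < .05:
    print(f"r = {r:.2f}, p = {p:.4f}: job-related; administer to applicants")
else:
    print(f"r = {r:.2f}, p = {p:.4f}: validity not established")
```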

Page 26: PSY 6430 Unit 5

SO15: The basic reason that accounts for all of the weaknesses with concurrent validity

All of the weaknesses have to do with differences between your current employees and applicants for the job

You are conducting your study with one sample of the population (your employees) and assuming conceptually that your applicants are from the same population

However, your applicants may not be from the same population - they may differ in important ways from your current employees

Ways that would cause them (as a group) to score differently on the test or perform differently on the job, affecting the correlation (job relatedness) of the test

(text lists several weaknesses and all of them really relate to one issue; dealing with inferential statistics here)

Page 27: PSY 6430 Unit 5

SO16: Restriction in range

With criterion-related validity studies, the ultimate proof that your selection test is job related is that the correlation between the test scores and job performance measures is statistically significant
A high positive correlation tells you:
People who score well on the test also perform well
People who score middling on the test are also middling performers
People who score poorly on the test also perform poorly on the job
In order to obtain a strong correlation you need:
People who score high, medium, and low on the test
People who score high, medium, and low on the performance measure

(before really understanding the weaknesses related to concurrent validity and why pure predictive validity is the most sound type of validation procedure, you need to understand what “restriction in range” is and how it affects the correlation coefficient; related to some of the material from the last unit on reliability - so if you understood it in that context, this is the same conceptual issue)

Page 28: PSY 6430 Unit 5

SO16: Restriction in range, cont.

That is, you need a range of scores on BOTH the test and the criterion measure in order to get a strong correlation
If you only have individuals who score about the same on the exam, regardless of whether some perform well, middling, or poorly, you will get a zero correlation
Similarly, if you have individuals who score high, medium, and low on the test, but they all perform reasonably the same, you will get a zero correlation
Any procedure/factor that decreases the range of scores on either the test or the performance measure:
Reduces the correlation between the two and, hence,
Underestimates the true relationship between the test and job performance
That is, you may conclude that your test is NOT valid when, in fact, it may be

Page 29: PSY 6430 Unit 5

SO16: Restriction in range, cont.


Restriction in range is the technical term for the decrease in the range of scores on either or both the test and criterion

Concurrent validity tends to restrict the range of scores on BOTH the test and criterion, hence underestimating the true validity of a test

(cont on next slide)

Page 30: PSY 6430 Unit 5

SO16: Restriction in range, cont.

Why? You are using current employees in your sample
Your current employees have not been fired because of poor performance
Your current employees have not voluntarily left the company because of poor performance
Your current employees have been doing the job for a while and thus are more experienced
All of the above would be expected to:
Result in higher test scores than for the population of applicants
Result in higher performance scores than for the population
Thus restricting the range of scores on both the test and the performance criterion measure
(diagrams on next slide)

Page 31: PSY 6430 Unit 5

SO16: Restriction in range, cont.

Top diagram: No restriction in range; strong correlation
Bottom diagram: Restriction in range on both test scores and performance scores; zero correlation

[Two scatterplots: Performance (low to high) plotted against Test Scores (low to high)]

(extreme example, but demonstrates point - concurrent validity is likely to restrict range on both, underestimating true validity)
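A small simulation makes the point concrete. The true relationship and the truncation rule below are assumptions chosen for illustration: the same underlying test-performance link is measured once in the full applicant pool and once in a "surviving incumbents" sample truncated on both variables, as concurrent validity tends to produce.

```python
# Sketch of restriction in range, under assumed parameters: a true
# relationship of roughly r = .60, observed in the full pool versus in
# a sample restricted on BOTH the test and the performance measure.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
test = rng.normal(0, 1, n)
perf = 0.6 * test + rng.normal(0, 0.8, n)   # underlying relationship

r_full = np.corrcoef(test, perf)[0, 1]

# Keep only "incumbents": above-average test scores AND above-average
# performance (poor scorers were never hired; poor performers left).
keep = (test > 0) & (perf > 0)
r_restricted = np.corrcoef(test[keep], perf[keep])[0, 1]

print(f"full range:       r = {r_full:.2f}")
print(f"restricted range: r = {r_restricted:.2f}")  # noticeably smaller
```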

Page 32: PSY 6430 Unit 5

SO17A&B: Factors that affect concurrent validity

A. Why the length of employment of current employees may affect the results of a concurrent validity study
An aging, experienced work force has been performing the job for a long time, thus:
You would expect them to score better on an ability test than inexperienced job applicants AND
You would expect them all to perform reasonably well on the job
Thus, you have restricted the range on both your test and performance scores, which would result in a lower correlation coefficient than would occur with applicants
Underestimates the job-relatedness of the test

(17a&b are really questions about restriction in range)

Page 33: PSY 6430 Unit 5

SO17A&B: Factors that affect concurrent validity

B. Why rejected applicants, turnover, and promotions would affect the results of a concurrent validity study
Rejected applicants and those who leave are likely to be poorer performers; your most skilled workers are promoted; what is left are employees who perform similarly on the test & performance measure
You would expect the remaining, current employees to score more similarly on an ability test than job applicants AND
You would expect them to perform similarly on the job
Thus, you have restricted the range on both your test and performance scores, which would result in a lower correlation coefficient than would occur with applicants
Underestimates the job-relatedness of the test
(B, same logic as A; both have to do with restriction in range)

Page 34: PSY 6430 Unit 5

SO18: Predictive validity

SO18A: Predictive validity (review)

Administer the predictor to applicants, hire the applicants, and then correlate scores with measures of job performance collected 6-12 months later

Predictive in the sense that you do not have measures of job performance when you administer the test - you collect them later, hence, you can determine how well your test actually predicts future performance

Page 35: PSY 6430 Unit 5

SO18B: Steps for a predictive validity study

Job analysis: Absolutely a legal requirement
Determine KSAs and other relevant requirements from the job analysis, including the essential functions for purposes of ADA
Select or write test based on KSAs*
You may select an off-the-shelf test or write/construct one
Select or develop measures for job performance
*Learn this point for the exam

(first four steps are exactly the same as for a concurrent validity study)

Page 36: PSY 6430 Unit 5

SO18B: Steps for a predictive validity study

Administer the test to job applicants and file the results away
Do NOT use the test scores to hire applicants (I’ll come back to this later)
After a suitable time period, 6-12 months, collect job performance measures (or training measures)
Correlate the test scores with the performance measures
(SO18B: add this step) Determine whether the correlation is statistically significant
(NFE) If so, administer the test to new job applicants and select on the basis of the scores

Page 37: PSY 6430 Unit 5

SO19: Two practical (not professional) weaknesses of predictive validity

Time it takes to validate the test
Need an appropriate time interval after applicants are hired before collecting job performance measures
If the organization only hires a few applicants per month, it may take months or even a year to obtain a large enough sample to conduct a predictive validity study (N = 50-100)

Page 38: PSY 6430 Unit 5

SO19: Two practical (not professional) weaknesses of predictive validity

Very, very difficult to get managers to ignore the test data (politically very difficult)
Next to impossible to get an organization to randomly hire - some poor employees ARE going to be hired
Also difficult to convince them to hire without using the test score (but much easier than getting them to randomly hire)

(I don’t blame them; admissions process for I/O program)

Page 39: PSY 6430 Unit 5

SO20A: Best predictive validity design

Figure 5.5 lists 5 types of predictive validity designs

Follow-up: Random selection (pure predictive validity)
Best design
No problems whatsoever from a measurement perspective; completely uncontaminated from a professional perspective

Follow-up: Use present system to select
OK and more practical, but it will underestimate validity if your current selection system is valid; and the more valid it is, the more it will underestimate the validity of your test
Why? (answer not on slide)

Page 40: PSY 6430 Unit 5

SO20C: Predictive validity, selection by scores

Select by test score: Do NOT do this!!!
Professional reason:
If your selection procedure is job related, it will greatly underestimate your validity - and the more job related the selection procedure is, the greater it will underestimate validity
In fact, you are likely to conclude that your test is not valid when in fact it is
Why? You are severely restricting the range on both your test and your job performance measures!

(professional and legal reasons not to do this)

Page 41: PSY 6430 Unit 5

SO20C: Predictive validity, selection by scores

Legal reason:
If adverse impact occurs, you open yourself up to an unfair discrimination lawsuit
You have adverse impact, but you do not know whether the test is job related

Page 42: PSY 6430 Unit 5

SO20: NFE, Further explanation of types of predictive validity studies

Hire, then test, and later correlate test scores and job performance measures
If you randomly hire, this is no different from pure predictive validity: #1 previously, Follow-up: Random selection
If you hire based on the current selection system, this is no different from #2 previously, Follow-up: Select based on current system

(one more slide on this)

Page 43: PSY 6430 Unit 5

SO20: NFE, Further explanation of types of predictive validity studies

Personnel file research: applicants are hired and their personnel records contain test scores or other information that could be used as a predictor. At a later date, job performance scores are obtained.
This is no different from Follow-up: Select based on current system

Page 44: PSY 6430 Unit 5

For exam: Rank order of criterion-related validity studies in terms of professional measurement standards

1. Predictive validity (pure) - randomly hire
2.5 Predictive validity - current selection system
2.5 Concurrent validity
4. Predictive validity - test scores to hire

(the 2.5s indicate a tie for second and third)

Page 45: PSY 6430 Unit 5

Which is better: Predictive vs. concurrent, research results (NFE)

Data that exist suggest that:
Concurrent validity is just as good as predictive validity for ability tests (most data)
May not be true for other types of tests such as personality and integrity tests
Studies have shown differences between the two for these types of tests - so proceed with caution!

Page 46: PSY 6430 Unit 5

SO21: Sample size needed for a criterion-related validity study (review)

Large samples are necessary
The text indicates that over several hundred employees are often necessary
Dickinson maintains that a sample of 50-100 is usually adequate - learn Dickinson’s number

What do companies do if they do not have that many employees?
They use content validity
They could possibly also use validity generalization or job component validation, but I want to hold off on that for a moment - these are legally risky
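As a back-of-the-envelope check on these sample-size claims, the standard Fisher-z power approximation (a textbook statistics formula, not something from this text) gives the N needed to detect a true validity coefficient at the .05 level with 80% power:

```python
# Rough sketch using the standard Fisher-z approximation for the N
# required to detect a correlation r (two-tailed test of rho = 0).
import numpy as np
from scipy.stats import norm

def n_needed(r, alpha=0.05, power=0.80):
    """Approximate sample size to detect a true correlation of r."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return int(np.ceil(((z_alpha + z_beta) / np.arctanh(r)) ** 2 + 3))

for r in (0.30, 0.40, 0.50):
    print(f"true r = {r:.2f}: need about N = {n_needed(r)}")
# A modest r = .30 needs roughly N = 85; stronger validities need fewer,
# which is broadly consistent with the 50-100 guideline above.
```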

Page 47: PSY 6430 Unit 5

SO23: NFE, Construct validity

Every selection textbook covers construct validity
I am not covering it for reasons indicated in the SOs, but will talk about it at the end of class if I have time
The basic reason for not covering it is that while construct validity is highly relevant for test construction, very, very few organizations use this approach - it’s too time consuming and expensive
First, the organization develops a test and determines whether it is really measuring what it is supposed to be measuring
Then, they determine whether the test is job related

Page 48: PSY 6430 Unit 5

SO27: Validity generalization, what it is

Validity generalization is considered to be a form of criterion-related validity, but you don’t have to conduct the validity study in your organization for your employees
Rather, you take validity data from other organizations for the same or very similar positions and use those data to justify the use of the selection test(s)
Common jobs: computer programmers and systems analysts, set-up mechanics, clerk typists, sales representatives, etc.

(I am skipping to SO27 for the moment; SOs 24-26 relate to statistical concepts about correlation; the organization of this chapter is just awkward. I want to present all of the validity procedures together, and then compare them with respect to when you should/can use one or the other. Then, I’ll return to SOs 24-26: cont on next slide)

Page 49: PSY 6430 Unit 5

SO27: Validity generalization, what it is


Assumption is that those data will generalize to your position and organization

Thus, you can use this approach if you have a very small number of employees and/or applicants*

*Note this point well

Page 50: PSY 6430 Unit 5

SO28: Validity generalization, cont.

Testing experts completely accept the legitimacy of validity generalization
Primarily based on the stellar work of Schmidt and Hunter (who was a professor at MSU until he retired)
Gatewood, Field, & Barrick believe this has a bright future
Frank Landy (also a legend in traditional I/O) is much more pessimistic about it
Wording of the CRA of 1991 may have made this illegal
There has not been a test case
No one wants to be the test case (you should not be the test case)

(this slide, NFE, cont. on nxt slide)

Page 51: PSY 6430 Unit 5

SO28: Validity generalization, cont.

We have actually come full circle with respect to validity generalization and its acceptance by testing specialists
In the early days of testing, validity generalization was accepted
If a test was valid for a particular job in one organization, it would be valid for the same or a similar position in another organization
It then fell into disfavor, with testing specialists reversing their position and adhering to situational specificity
Now, based on Schmidt and Hunter’s work, it is again embraced by testing specialists

(this slide, also NFE)

Page 52: PSY 6430 Unit 5

SO29 FE: Two reasons why CRA 1991 may make validity generalization illegal


Both reasons relate to the wording in the CRA that the only acceptable criterion measure (job performance measure) is actual job performance

1. Criterion-related validity studies have often included the use of personnel data such as absenteeism, turnover, accident rates, training data, etc. as the criterion or in multiple regression/correlation studies as one or more of the criteria – this may not be considered job performance under CRA 1991

2. If courts interpret “actual” in actual job performance literally, then the courts could maintain that only the performance of the workers who participate in the study would be an acceptable criterion measure

Could ban the use of data from other organizations and require local validity studies (local meaning in your own organization)

Page 53: PSY 6430 Unit 5

SO30: Correction!!

The material in this study objective relates to synthetic validity (pages 199-201) in the section “Validation Options for Small Businesses”, not job component validity

I am going to talk about job component validity in the next unit – because it is tied to a particular type of job analysis procedure – the Position Analysis Questionnaire


Page 54: PSY 6430 Unit 5

SO30 NFE: Synthetic validity (briefly)

This is a way to conduct a criterion-related validity study with small samples as long as you have related jobs in the organization
Jobs that require some of the same KSAs

I believe it has become obsolete since the Supreme Court ruled in 1995 that content validity is an acceptable defense for adverse impact

Criterion-related studies are simply more costly than content validity

Selection experts, however, will always prefer criterion-related studies

Page 55: PSY 6430 Unit 5

SO31: Interesting fact (and for the exam)

In a 1993 random survey of 1,000 organizations listed in Dun’s Business Rankings with 200 or more employees, the percentage of firms indicating that they had conducted validation studies of their selection measures was: 24%

In today’s legal environment, the other organizations could find themselves in a whole world of hurt!

(click, click!)

Page 56: PSY 6430 Unit 5

Factors that affect the type of validity study: When to use which validity strategy

Four main factors influence the type of validity study you can do:
Sample size
Cut-off score procedures
Type of attribute measured: observable or not
Type of test: write or off-the-shelf

(on the exam, I am likely to give you situations and ask you, given the situation, what type of validity strategy could you use and why: that is, what options do you have? That’s exactly the type of decision you are going to have to make in organizations. So, to make it easier, and summarize things: include validity generalization in your answers)

Page 57: PSY 6430 Unit 5

Factors that affect the type of validity study: When to use which validity strategy

Sample size
Large # of employees: Concurrent, Predictive, Content, Validity generalization (all forms OK)
Small # of employees: Content, Validity generalization

(it’s OK to use content and validity generalization with large sample sizes; many orgs do use content!)

Page 58: PSY 6430 Unit 5

Factors that affect the type of validity study: When to use which validity strategy

Cut-off score procedures
Minimum (pass/fail): Concurrent, Predictive, Content, Validity generalization (all forms OK)
Ranking or banding: Concurrent, Predictive, Validity generalization (only criterion-related - all but content)

(validity generalization is based on correlation, even if you don’t do the study yourself, so remember it is considered a type of criterion-related study)

Page 59: PSY 6430 Unit 5

Factors that affect the type of validity study: When to use which validity strategy

Attribute being measured
Observable: Concurrent, Predictive, Content, Validity generalization (all forms OK)
Not observable: Concurrent, Predictive, Validity generalization (only criterion-related - all but content)

(personality, extraversion, social sensitivity, flexibility, integrity, etc.)

Page 60: PSY 6430 Unit 5

Factors that affect the type of validity study: When to use which validity strategy

Type of test
Write/construct: Concurrent, Predictive, Content, Validity generalization (all forms OK)
Off-the-shelf: Concurrent, Predictive, Validity generalization (only criterion-related - all but content)

(next slide, back to SO 24; interpretation of validity correlation)

Page 61: PSY 6430 Unit 5

SO24: Statistical interpretation of a validity coefficient

Recall, r = correlation coefficient; r² = coefficient of determination
Coefficient of determination: the percentage of variance on the criterion that can be explained by the variance associated with the test
r = .50; to statistically interpret it: r² = .25
25% of the variance on job performance can be explained by the variance on the test
Less technical, but OK: 25% of the differences between individuals on the job performance measure can be accounted for by differences in their test scores

Page 62: PSY 6430 Unit 5

SO25: Validity vs. reliability correlations

You interpret a validity correlation coefficient very differently than a reliability correlation coefficient
You square a validity correlation coefficient
You do NOT square a reliability correlation coefficient
Why? With a reliability correlation coefficient you are basically correlating a measure with itself:
Test-retest reliability
Parallel or alternate form reliability
Internal consistency reliability (split half)
(I am not going to go into the math to prove that to you)

Page 63: PSY 6430 Unit 5

SO25B: Validity vs. reliability correlations, examples for test

You correlate the test scores from a mechanical ability test with a measure of job performance
The resulting correlation coefficient is .40
How would you statistically interpret that?
16% of the differences in the job performance of individuals can be accounted for by the differences in their test scores

Page 64: PSY 6430 Unit 5

SO25B: Validity vs. reliability correlations, examples for test

You administer a computer programming test to a group of individuals, wait 3 months, and administer the same test to the same group of individuals
The resulting correlation coefficient is .90
How do you statistically interpret that correlation coefficient?
90% of the differences in the test scores between individuals are due to true differences in computer programming and 10% of the differences are due to error
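A tiny worked snippet contrasting the two interpretations, using the slides' own values (.40 validity, .90 reliability):

```python
# Worked contrast of SO25: square a validity coefficient, not a
# reliability coefficient. Values are the two examples from the slides.
r_validity = 0.40      # mechanical ability test vs. job performance
print(f"r^2 = {r_validity ** 2:.2f}")  # 0.16: 16% of performance
# differences accounted for by differences in test scores

r_reliability = 0.90   # test-retest, same test 3 months apart
# Interpreted directly, NOT squared: 90% of score differences reflect
# true differences; the remaining 10% reflect measurement error.
print(f"true-score variance ~ {r_reliability:.0%}, error ~ {1 - r_reliability:.0%}")
```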

Page 65: PSY 6430 Unit 5

Different types of correlation coefficients: or why it is a good idea to take Huitema’s correlation and regression

The most common type of correlation to use is the Pearson product-moment correlation
However, you can only use this type of correlation if:
You have two continuous variables, e.g., a range of scores on both x and y
The relationship between the two variables is linear
Some have shown a curvilinear relationship between intelligence test scores and performance of sales representatives

(NFE, I think)

Page 66: PSY 6430 Unit 5

Different types of correlation coefficients: or why it is a good idea to take Huitema’s correlation and regression

Point biserial coefficient is used when one variable is continuous and the other is dichotomous
High school diploma vs. no high school diploma (X)
Number of minutes it takes a set-up mechanic to set up a manufacturing line (Y)
X is dichotomous, Y is continuous

Phi coefficient is used when both variables are dichotomous
High school diploma or no high school diploma (X)
Pass or fail performance measure (Y)
Both X and Y are dichotomous

(NFE, I think, one more slide on this)

Page 67: PSY 6430 Unit 5

Different types of correlation coefficients: or why it is a good idea to take Huitema’s correlation and regression

Rho coefficient - Spearman’s rank order correlation - when you rank order both x and y, and then correlate the ranks
Rank order of test scores
Rank order of the number of minutes it takes set-up mechanics to set up a manufacturing line
Use rank order when either your x or y scores are not normally distributed - that is, when there are a few outliers, either very high or very low scores on either variable

(NFE, I think, last slide)
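For reference, all four coefficients named on these three slides can be computed with scipy/numpy; the small data arrays below (diploma status, set-up minutes, pass/fail) are hypothetical, not examples from the text.

```python
import numpy as np
from scipy.stats import pearsonr, pointbiserialr, spearmanr

# Pearson product-moment: two continuous variables.
x = np.array([12.0, 15.5, 9.0, 20.1, 17.3])
y = np.array([30.0, 35.2, 25.1, 41.0, 38.4])
print("Pearson r:", pearsonr(x, y)[0])

# Point biserial: dichotomous X (diploma yes/no), continuous Y (set-up minutes).
diploma = np.array([1, 0, 1, 1, 0, 0, 1, 0])
minutes = np.array([22.0, 35.5, 25.0, 20.5, 35.0, 31.0, 24.0, 33.5])
print("point biserial:", pointbiserialr(diploma, minutes)[0])

# Phi: both variables dichotomous, computed from the 2x2 contingency table.
passed = np.array([1, 0, 1, 1, 0, 1, 1, 0])
a = np.sum((diploma == 1) & (passed == 1))  # diploma & passed
b = np.sum((diploma == 1) & (passed == 0))  # diploma & failed
c = np.sum((diploma == 0) & (passed == 1))  # no diploma & passed
d = np.sum((diploma == 0) & (passed == 0))  # no diploma & failed
phi = (a * d - b * c) / np.sqrt((a + b) * (c + d) * (a + c) * (b + d))
print("phi:", phi)

# Spearman rho: correlates the ranks of x and y (robust to outliers).
print("Spearman rho:", spearmanr(x, y)[0])
```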

Page 68: PSY 6430 Unit 5

END OF UNIT 5
Questions? Comments?

Page 69: PSY 6430 Unit 5

NFE: Back to construct validity

Construct validity: Does the test actually measure the “construct” you think it is measuring?

This is a hold-over from the more traditional cognitive psychology and psychometrics field that philosophically believes in mind-body dualism (mentalism)

That is, there really is something called “general intelligence” that is more than just the sum of what you ask on an exam and it is different than a behavioral repertoire

One of the reasons I like this text so much is that it is clear that the authors are not from this old school
This will become more obvious when you read the material related to ability testing

Page 70: PSY 6430 Unit 5

NFE: Back to construct validity

But, back to the question you are asking with construct validity: Does the test actually measure the “construct” you think it is measuring?

Is your measure of extroversion really measuring extroversion? Is your measure of creativity really measuring creativity? Is your measure of ability to work with others (agreeableness) really measuring the ability to work with others?

Page 71: PSY 6430 Unit 5

NFE: Construct validity, cont.

You construct a test
You correlate your test with other tests that supposedly measure the same thing (or a very similar construct) and other measures that might get at that construct

Correlations are not going to be perfect because your measure is not measuring exactly the same thing as those other measures, but should be reasonably correlated with those measures

Continue to do that until you have pretty good evidence that your test is indeed measuring what it is supposed to be measuring
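A rough sketch of that convergent-evidence loop; the trait model, noise levels, and measure names below are all assumptions for illustration, not a real construct validation.

```python
# Sketch of gathering convergent evidence: correlate a new test with
# established measures of (supposedly) the same construct. Simulated
# data; the "true trait" and noise levels are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
true_trait = rng.normal(0, 1, 200)           # latent construct (simulated)
new_test      = true_trait + rng.normal(0, 0.6, 200)
established_a = true_trait + rng.normal(0, 0.7, 200)
established_b = true_trait + rng.normal(0, 0.8, 200)

for name, other in [("established measure A", established_a),
                    ("established measure B", established_b)]:
    r = np.corrcoef(new_test, other)[0, 1]
    print(f"new test vs. {name}: r = {r:.2f}")
# Moderately high (but not perfect) correlations are the expected pattern.
```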

Page 72: PSY 6430 Unit 5

NFE: Construct validity, cont.

But notice, for validation purposes, you are NOT done yet

You have evidence that the test is supposedly measuring what you say it is, but

You still need to conduct a criterion-related validity study to determine whether the test is related to the job

Thus, you end up doing a lot of time-consuming work

The ONLY reason you would do this would be if you could not locate a test that measured what you want and had to create your own (not likely, by the way)