Measuring Human Performance

Introduction

Kirkpatrick (1994) provides a very usable model for measurement across four levels: Reaction, Learning, Behavior, and Results. These categories are discrete and can be measured. The goal of this presentation is to bring to light the topics, concerns, and issues that must be understood before testing, measuring, or evaluating the success of training in today's workforce.

What is a test? What is testing?

Test: the instrument used to collect data.
Testing: a process of collecting quantifiable information about the degree to which a competence or ability is present in the test taker. (Anderson, BC)

Reasons for Testing

Prerequisite test
Entry test
Diagnostic test
Post test
Equivalency test

Norm Referenced vs. Criterion Referenced

Norm Referenced Testing

Test items separate test-takers one from another

Normal distribution curve

Criterion Referenced Testing

Test items based on specific objectives
Mastery curve, skewed from the normal distribution
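The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the names, the scores, and the 80 percent mastery cutoff are assumptions, not values from the presentation. A norm-referenced reading places each test taker relative to the group (here as a z-score), while a criterion-referenced reading compares each score with a fixed cutoff tied to the objectives.

```python
from statistics import mean, pstdev

def norm_referenced(scores):
    """Place each test taker relative to the group as a z-score."""
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    return {name: (s - mu) / sigma for name, s in scores.items()}

def criterion_referenced(scores, cutoff):
    """Judge each test taker against a fixed mastery cutoff."""
    return {name: s >= cutoff for name, s in scores.items()}

# Hypothetical percent-correct scores, for illustration only.
scores = {"Avery": 95, "Blake": 90, "Casey": 85, "Devon": 70}

z = norm_referenced(scores)
mastery = criterion_referenced(scores, cutoff=80)  # assumed cutoff

for name in scores:
    print(f"{name}: z = {z[name]:+.2f}, mastered = {mastery[name]}")
```

In this made-up group, Casey sits exactly at the group average under the norm-referenced view, yet counts as having mastered the material under the criterion-referenced view.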

SKA

Skill
Knowledge
Attitude

Domains of Learning

Cognitive
Affective
Psychomotor

Bloom’s Taxonomy for Cognitive Levels

Knowledge
Comprehension
Application
Analysis
Synthesis
Evaluation

Krathwohl’s Taxonomy for Affective Levels

Receiving
Responding
Valuing
Organization
Characterization by a value or value complex

Simpson’s Taxonomy for Psychomotor Levels

Perception
Set
Guided Response
Mechanism
Complex Overt Response
Adaptation
Origination

Test Items Related to Bloom’s Taxonomy

Multiple Choice

Most flexible across the Taxonomy spectrum, especially the first three levels
Advantages:
– Guessing probability low
– Diagnostic capabilities
– Easy to grade
– Statistical analysis

Multiple Choice cont…

Disadvantages
– Difficult to write
– Provides keys for recall
– Does not do well for high-level cognition evaluation

True and False

Could be used at all levels, but…
Advantages
– Easy to write
– Easy to score
– Can do item analysis
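Item analysis on right/wrong items usually starts with two numbers per item: difficulty (the proportion of test takers who answered correctly) and a discrimination index (how much better the top scorers did on the item than the bottom scorers). The sketch below is one common way to compute them, not a method taken from the presentation. The function names and data are hypothetical, and the 27 percent upper and lower groups are a widely used convention rather than a requirement.

```python
def item_difficulty(item_responses):
    """Proportion of test takers who answered the item correctly (0 to 1)."""
    return sum(item_responses) / len(item_responses)

def discrimination_index(item_responses, total_scores, fraction=0.27):
    """Proportion correct in the upper group minus the lower group."""
    n = len(total_scores)
    k = max(1, round(n * fraction))  # size of each comparison group
    order = sorted(range(n), key=lambda i: total_scores[i])  # low to high
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_responses[i] for i in upper) / k
    p_lower = sum(item_responses[i] for i in lower) / k
    return p_upper - p_lower

# Hypothetical data: 1 = correct, 0 = incorrect on one item,
# alongside each test taker's total score on the whole test.
item = [1, 1, 1, 1, 1, 1, 0, 1, 0, 0]
totals = [19, 18, 17, 16, 15, 13, 11, 9, 7, 5]

difficulty = item_difficulty(item)
discrimination = discrimination_index(item, totals)
print(f"difficulty = {difficulty:.2f}, discrimination = {discrimination:.2f}")
```

A discrimination index near zero or below zero suggests the item does not separate strong from weak test takers and may need rewriting.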

T/F cont…

Disadvantages
– 50/50 guess factor
– Often used when M/C seems too hard to write

Reliability is so poor… very little evaluation value.

So why do teachers often include T/F?
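The guess factor can be made concrete with the binomial distribution. The sketch below assumes a hypothetical 10-item test with a pass mark of 7 and a test taker who guesses blindly on every item. It compares True/False (2 options) with four-option multiple choice; the test length, pass mark, and option count are illustrative assumptions.

```python
from math import comb

def p_pass_by_guessing(n_items, n_options, pass_mark):
    """Probability of getting at least pass_mark items right by blind guessing."""
    p = 1 / n_options  # chance of guessing one item correctly
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(pass_mark, n_items + 1)
    )

true_false = p_pass_by_guessing(n_items=10, n_options=2, pass_mark=7)
multiple_choice = p_pass_by_guessing(n_items=10, n_options=4, pass_mark=7)

print(f"True/False:      {true_false:.1%}")
print(f"Multiple choice: {multiple_choice:.2%}")
```

Under these assumptions, a pure guesser passes the True/False test about 17 percent of the time, against well under 1 percent for the multiple choice version.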

Matching

Best suited for the Application level… not recommended for any level by me.
Advantages
– Easy to write
– Easy to grade
– Statistical analysis

Matching cont…

Disadvantages
– Requires the two lower learning levels
– Process of elimination diminishes probability
– Low reliability

Why would a teacher use Matching?

Fill in the Blank

Best suited for the lower levels
Advantage
– Recall is essential, few clues
Disadvantages
– Single word or phrase
– Grading beyond a single word or phrase is in trouble
– Enters the realm of subjective grading… poor reliability

Short Answer

Can get to higher-order thinking
Advantages
– Easy to write
– Produces original responses
Disadvantages
– Basically the same as fill in the blank… reliability

Essay

The best for higher-order thinking
Advantages
– High order
– Creative ability
– Writing ability

Essay cont…

Disadvantages
– Tough to grade
– Forget stats

You’ll see this often in Master’s and Ph.D. classes

Validity

Does the test measure what it is supposed to measure?
How close to the bull’s eye did it hit?

Reliability

How consistent is the test?
Is there a tight pattern of hits?

Types of Validity

Concurrent Validity
Content Validity
Criterion Related Validity
Predictive Validity
Construct Validity

Types of Reliability

Test-Retest Reliability
Inter-Rater Reliability
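Both kinds of reliability are usually reported as a single coefficient. The sketch below shows one common choice for each: the Pearson correlation between two administrations of the same test, and Cohen's kappa for agreement between two raters. The presentation does not name specific statistics, so these choices, the function names, and the data are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two sets of paired scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cohens_kappa(rater_1, rater_2):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_1)
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    c1, c2 = Counter(rater_1), Counter(rater_2)
    expected = sum(c1[c] * c2[c] for c in set(c1) | set(c2)) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical test-retest scores for the same five people.
first_sitting = [70, 80, 90, 60, 85]
second_sitting = [72, 78, 91, 63, 84]
retest_r = pearson_r(first_sitting, second_sitting)

# Hypothetical pass/fail judgments from two raters on eight performances.
rater_a = ["pass", "pass", "fail", "pass", "fail", "pass", "fail", "pass"]
rater_b = ["pass", "pass", "fail", "fail", "fail", "pass", "fail", "pass"]
kappa = cohens_kappa(rater_a, rater_b)

print(f"test-retest r = {retest_r:.2f}, inter-rater kappa = {kappa:.2f}")
```

Values close to 1 correspond to the "tight pattern of hits" described above; values near 0 mean the scores or ratings are barely more consistent than chance.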

What is the real score of a test?

An error factor must be considered:
test score + error factor
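One standard way to size the error factor is the standard error of measurement from classical test theory, SEM = SD × sqrt(1 − reliability), which turns a single test score into a band of plausible scores. The presentation does not state this formula, so treat the sketch below as an assumption about how the error factor might be estimated. The standard deviation, reliability coefficient, and score are made up.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)

def score_band(observed, sd, reliability, z=1.96):
    """Band of plausible scores around an observed score (about 95% for z = 1.96)."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem

# Hypothetical test: standard deviation of 10 points, reliability of 0.91.
sem = standard_error_of_measurement(sd=10, reliability=0.91)
low, high = score_band(observed=80, sd=10, reliability=0.91)

print(f"SEM = {sem:.2f}; a score of 80 is best read as {low:.1f} to {high:.1f}")
```

The lower the reliability, the wider the band, which is why a pass or fail decision near the cutoff deserves caution on an unreliable test.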

Ten Evaluation Instruments for Technical Training

Interviews
Questionnaires
Group Discussion
Critical Incident
Work Diaries

Instruments cont…

Performance Records
Simulation / Role-Play
Observation
Written Test
Performance Test

Designing Tests

Questions you must ask yourself
– Who is the test designed for?
– What do you want to know?
– How many questions will be required?
– How will it be administered?
– How will it be scored?

3 Methods of Test Construction

Topic Based
Statistical Based
Objective Based

Topic Based Test

Selection done by chapter
Selection done by topic
Selection done by the importance of the topic

Limitations of Topic System

Procedure lacks precision
Doesn’t identify test takers
Not designed at the learner’s level
Doesn’t specify competencies

Statistical Selection

Items statistically selected
Standardized
Norm Referenced

Limitations of Statistical

What is measured is not specific
Lacks the precision of CRT (criterion referenced testing)
Difficult to select items

Objectives Based Test

Based on defined competencies
Applies to criterion referenced tests and scores

Testing and Kirkpatrick’s Four Levels

The further down you go, from the performance of the company to the performance of individuals, the more difficult the information is to obtain.

The further down you go… the more usable the information.

Four Levels

REACTION
LEARNING
BEHAVIOR
RESULTS

Reaction

Checking individuals’ reactions often means measuring “Customer Satisfaction”

Happy rating sheets
Observations
Other
How can you quantify the responses?
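One simple answer to the quantifying question is to put the rating sheet on a numeric scale and summarize it. The sketch below assumes a 5-point scale (1 = very dissatisfied, 5 = very satisfied) and treats ratings of 4 or 5 as favorable. The scale, the threshold, and the ratings are illustrative assumptions, not part of the presentation.

```python
from statistics import mean

def summarize_reactions(ratings, favorable_from=4):
    """Summarize rating-sheet responses given on a numeric scale."""
    favorable = sum(r >= favorable_from for r in ratings)
    return {
        "n": len(ratings),
        "mean": mean(ratings),
        "percent_favorable": 100 * favorable / len(ratings),
    }

# Hypothetical responses from ten participants to "Overall, I was
# satisfied with this course" on an assumed 1-to-5 scale.
ratings = [5, 4, 4, 3, 5, 2, 4, 5, 3, 5]
summary = summarize_reactions(ratings)

print(f"n = {summary['n']}, mean = {summary['mean']:.1f}, "
      f"favorable = {summary['percent_favorable']:.0f}%")
```

Reporting the percent favorable alongside the mean helps, because the same mean can hide very different spreads of opinion.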

Learning

Measurable behavior changes in the three “SKA” Dimensions

Behavior

Behavior change due to training program.
Surveys
Interviews
Other

Results

Measurable by looking at changes in:
Production
Quality
Safety
Sales
Other
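A change in a results measure is typically reported as a percent change from a pre-training baseline. The sketch below uses made-up measures and figures; the metric names are hypothetical. Note that a change on its own does not show that training caused it.

```python
def percent_change(before, after):
    """Percent change from a baseline value to a follow-up value."""
    return 100 * (after - before) / before

# Hypothetical measures taken before training and some months after.
baseline = {"units_per_shift": 400, "defects_per_1000": 25, "recordable_incidents": 8}
followup = {"units_per_shift": 440, "defects_per_1000": 20, "recordable_incidents": 6}

changes = {name: percent_change(baseline[name], followup[name]) for name in baseline}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

Whether a positive or negative change is good depends on the measure: more units per shift is an improvement, and so are fewer defects and incidents.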
