E-assessment Innovations for the 21st Century
IPST Thailand, 26 November 2015
Jan Wiegers, Director, Cito
Content of my presentation:
• About Cito
• Computer Based Testing (CBT)
• Formative Assessment in The Netherlands
  – Pupil Monitoring System (PMS)
  – Math Garden
• 21st Century Skills
• Project Kazakhstan (formative assessment)
• Project Nigeria (summative assessment)
About Cito
• Based in Arnhem, The Netherlands
• More than 600 employees
• More than 2000 freelancers
• International consultancy and training in many countries (Nigeria, Ghana, Singapore, Vietnam, Brazil, Kazakhstan, Azerbaijan, UK, Germany)
Main Products of Cito
• Final test for primary school
• Student monitoring system (SMS)
• Pyramid method (preschool / early childhood)
• Questify (software)
• Final examination secondary education
• Language proficiency tests
Cito’s Psychometric Research Centre
Responsible for the psychometric quality of tests, exams and other assessment procedures developed by Cito.
Skills @ PRC
Psychometric and methodological knowledge:
• Classical test theory
• Item Response Theory
• Factor analysis
• Sampling
• Equating
• … and much more
Staff @ PRC
30 core psychometric staff:
• 23 with a PhD, including 2 professors (UvA; UT)
• 7 without a PhD, of whom 3 are PhD students
11 additional staff:
• 2 visiting professors
• 6 students
• 3 supporting personnel
How to assess
What is the purpose?
• Summative: high stakes (certificate)
• Formative: low stakes (diagnostic, classroom assessment)
(When the cook tastes the soup, that’s formative. When the guests taste the soup, that’s summative.)
The use of the computer in summative assessment
(national examinations in secondary schools in The Netherlands)
Why CBT?
• controlled delivery (higher security)
• scoring by computer / independent marking
• immediate feedback
• logistic benefits
• adaptive testing
• more resources (databases, applications, etc.)
• nonverbal item types (e.g. hotspot, drag and drop)
• more attractive (multimedia, interactivity, etc.)
• saving money?
Final examinations in figures
• 600 different examinations
• Cito composes the examinations together with 500 teachers
• more than 1200 schools
• 200,000 examinees
• 1,400,000 exams are taken
• 49 million A4 pages
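To get a feel for the scale of the paper-based final examinations, the quoted figures can be turned into simple averages. The short sketch below only divides the numbers from the slide; the resulting ratios are derived here and do not appear in the presentation itself.

```python
# Figures quoted on the slide (national final examinations, The Netherlands).
examinees = 200_000
exams_taken = 1_400_000
a4_pages = 49_000_000

# Derived averages; these ratios are implied by the slide, not stated on it.
exams_per_examinee = exams_taken / examinees
pages_per_exam = a4_pages / exams_taken

print(f"{exams_per_examinee:.0f} exams per examinee on average")
print(f"{pages_per_exam:.0f} A4 pages per exam taken on average")
```

These averages help explain why the logistic benefits of computer-based testing are listed among the reasons for CBT.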
[Diagram: platform components. Planner, Builder, Player, Rater and Reporter; web front end and web back end; stand-alone front end (CD) and stand-alone back end (mail); link to administrative programs (e.g. Pupil Monitoring System)]
What
• Modular Platform (core: Builder and Player)
• Delivery via Smart Client
• Manage complete test development process
• Generic where applicable, custom where appropriate
• High stakes, high volume (scalable)
• Online & offline
• Template oriented
• Based on QTI standards: complementary to other software
Tools
• Calculator
• Sound
• Voice (for students with reading problems – dyslexia)
• Magnifying glass
• Symbols (Math)
Future developments
More emphasis on formative assessment
Day-to-day assessment
Types and formats of classroom assessment:
• interacting during lessons (asking questions)
• exercises and assignments
• observations
• marking written work
• teacher made tests
• portfolio
• etc.
Pitfalls in judging
Two types of mistakes in judging:
• Instability:
  – different judgements from one case to the next
• Lack of intersubjective conformity:
  – one teacher assesses differently from another teacher
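Lack of intersubjective conformity can be quantified by comparing two teachers' judgements of the same work. The sketch below uses Cohen's kappa, a standard chance-corrected agreement statistic; the slides do not name a particular statistic, so this choice, the function name and the pass/fail data are all assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters judging the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    # Proportion of cases on which both raters gave the same judgement.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical pass/fail judgements by two teachers on the same ten scripts.
teacher_1 = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail", "pass", "fail"]
teacher_2 = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass", "pass", "fail"]
kappa = cohens_kappa(teacher_1, teacher_2)
print(round(kappa, 3))
```

The same function can be applied to instability: pass in one teacher's first and second judgement of the same scripts instead of two different teachers.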
External test
Characteristics needed / desired:
• Not person related (construction & marking)
• Proven quality (reliability, validity)
• Consistent over time
• Clear objective standards
• Continuity over the years
• Showing progress in time (monitoring)
Pupil Monitoring System (PMS)
General characteristics:
• Tests for important core skills
• Monitors the progress of pupils throughout their school careers
• Textbook independent
• Built on the national curriculum
• Based on Item Response Theory
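Item Response Theory places pupils and items on one common scale, which is what makes progress comparable across different tests. The slides do not say which IRT model the PMS uses, so the sketch below assumes the simplest one, the Rasch model, and estimates a pupil's ability by maximum likelihood. The item difficulties and responses are hypothetical.

```python
import math


def rasch_probability(theta, b):
    """P(correct) under the Rasch model for ability theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_ability(responses, difficulties, iterations=25):
    """Maximum-likelihood ability estimate via Newton-Raphson.

    responses: 0/1 item scores; difficulties: matching item difficulties.
    """
    total = sum(responses)
    if total == 0 or total == len(responses):
        # The likelihood has no finite maximum for these patterns.
        raise ValueError("ML estimate is undefined for all-correct or all-wrong patterns")
    theta = 0.0
    for _ in range(iterations):
        probs = [rasch_probability(theta, b) for b in difficulties]
        gradient = total - sum(probs)                  # first derivative of log-likelihood
        information = sum(p * (1 - p) for p in probs)  # minus the second derivative
        theta += gradient / information
    return theta


# Hypothetical five-item test: three easier items right, two harder ones wrong.
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]
responses = [1, 1, 1, 0, 0]
theta_hat = estimate_ability(responses, difficulties)
print(round(theta_hat, 2))
```

At the estimate, the expected number of correct answers equals the observed number (here 3), which is the defining property of the maximum-likelihood solution in this model.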
Components PMS
Testing components:
• administration
• marking
• evaluation
Diagnostic component:
• collecting additional data
• identifying specific problems
Remedial component:
• drawing up remedial plan
• carrying out remedial plan
• evaluating remedial plan
What information do the tests give?
Reports:
• Level Report Pupil on paper
• Through the internet (www.cito.nl):
  – Progress Report
  – Group Report
  – Ability scores
  – Percentile scores
  – Grades Report
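A percentile score expresses where a pupil stands relative to a reference group. The slides do not describe how the reports compute it, so the sketch below uses one common convention (ties count as half, the mid-rank definition); the function name and the reference scores are hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, reference_scores):
    """Percentage of the reference group scoring below `score`,
    counting pupils with exactly the same score as half."""
    ordered = sorted(reference_scores)
    below = bisect_left(ordered, score)
    ties = bisect_right(ordered, score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical ability scores for a reference group of ten pupils.
reference = [88, 92, 95, 97, 100, 100, 103, 106, 110, 118]
pr = percentile_rank(100, reference)
print(pr)
```

Other conventions exist (for example counting only pupils strictly below), so a report should state which definition it uses.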
Pupil Report
Schools compared (1)
Self-organizing adaptive practice and monitoring tools
Oefenweb & University of Amsterdam
Math Garden
Background ideas
1. The cognitive system in development is a complex system
  – Daily measurements
2. Arithmetic and language learning are instances of cognitive expertise
  – Lots of (adaptive) practice, direct feedback (deliberate practice)
3. Individual differences are huge
  – 20% of grade 2 performs above the mean level of grade 3, 7% above the mean of grade 4
4. ICT allows new developments
  – iPads, laptops, fast internet
Web-based adaptive practice and monitoring systems
• Idea: digital notebooks for daily work in the classroom
• Choice of practice items automatically adapted to the child (differentiation)
• Combining (playful) practicing and pupil monitoring
  – Fewer tests in classrooms
  – High-frequency monitoring
• No checking, automatic progress reports
• Web-based (cloud)
• New type of adaptive testing (psychometrics)
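One way to combine practice and monitoring is to update a child's ability rating and an item's difficulty rating after every single response, and to pick the next item from the current ratings. The slides only mention a "new type of adaptive testing" without giving the algorithm, so the Elo-style scheme below is an assumed illustration: the step size `k`, the target success rate and the item difficulties are hypothetical.

```python
import math


def expected_score(ability, difficulty):
    """Probability of a correct answer given the current ratings (logistic curve)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def elo_update(ability, difficulty, correct, k=0.3):
    """Return updated (ability, difficulty) after one response.

    correct is 1 or 0; k is a hypothetical step size.
    """
    surprise = correct - expected_score(ability, difficulty)
    # A better-than-expected answer raises the child and lowers the item.
    return ability + k * surprise, difficulty - k * surprise


def pick_item(ability, item_difficulties, target=0.75):
    """Choose the item whose expected success rate is closest to `target`."""
    return min(
        item_difficulties,
        key=lambda name: abs(expected_score(ability, item_difficulties[name]) - target),
    )


items = {"3+4": -1.5, "7x8": 0.0, "144/12": 1.2}  # hypothetical difficulties
child = 0.0
chosen = pick_item(child, items)
child, items[chosen] = elo_update(child, items[chosen], correct=1)
print(chosen, round(child, 3), round(items[chosen], 3))
```

Because every response updates both ratings, the practice data doubles as monitoring data, which is how such a system can reduce the number of separate tests.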
User statistics
• Rekentuin.nl: 95,000 active users
• Over 230 million responses in 4 years (now > 750,000 per day)
• Game, Train, Track & Teach
• English version: www.mathsgarden.com
21st Century Skills
• Which skills
• How to teach
• Outcome / how to assess
Skills
• Creativity
• Innovation
• Critical thinking
• Problem solving
• Communication
• Collaboration
• Digital literacy
• Citizenship
Collaborative problem solving
Definition (from the ATC21S project): “Working together to solve a common challenge, which involves the contribution and exchange of ideas, knowledge or resources to achieve the goal.”
Improving quality of rating
• Explicit criteria for rating
• More than one assessment
• More than one assessor
• Training of assessors
• Use of camera (high stakes decisions: soccer)
NETWORK OF NIS - 20 SCHOOLS
[Map of school locations: Uralsk, Atyrau, Aktau, Kyzylorda, Shymkent, Taraz, Almaty, Ust’-Kamenogorsk, Semey, Karaganda, Astana, Kokshetau, Aktobe, Kostanay, Petropavlovsk, Pavlodar, Taldykorgan]
Individual support of each student from admission to the NIS up to entrance to the university
[Diagram: students selection system; assessment system (formative assessment, internal summative assessment, external summative assessment); monitoring of progress during the education; competitive graduate]
NIS students monitoring system
Purpose:
– support the learning process
Task:
– to get diagnostic information to support the learning process in accordance with students’ needs
MAIN FEATURES OF THE STUDENT MONITORING SYSTEM MATHEMATICS
• Close relation to the Integrated Program of Development
• Five content domains
• Taxonomy of Bloom
• 35 items per domain per test administration
• Field testing of items
• Trained test constructors
• Setting standards
• Use of Item Response Theory
• Monitoring two or three times per school year
Stages of developing the monitoring system
1. Defined a project group:
  – NIS teachers as item developers
  – Project and subject coordinators
2. Conducted several trainings on test development
3. Development of test items
[Diagram: Curricula (Math); Topics (Subdomains); Formulation of assessment criteria; Expected outcomes; Development of test matrix (-> Bloom)]
Domains: Numbers, Algebra, Statistics, Math modeling, Geometry
Stages of developing the monitoring system
Reporting possibilities: idea of level descriptors
Stages of developing the monitoring system
4. Item construction (NIS teachers), screening (Cito subject experts), reviewing (NIS teachers plus Cito experts) and piloting
• Test administration on paper or computer; the 2015 pilot was fully digital with Questify
• Item properties: difficulty level, assessment criteria, Bloom level, etc.
• Classical test theory (CTT) and Item Response Theory (IRT) were used for statistical analysis
• Standard setting after piloting
• NIS teachers and Cito subject experts identified 4 achievement levels in accordance with the expected outcomes of the curriculum
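The classical test theory part of a pilot analysis typically reports, per item, the proportion of students answering correctly (the p-value) and how well the item correlates with the rest of the test. The sketch below computes both for a small score matrix. It illustrates the general technique and is not the project's actual analysis; the function name and pilot data are hypothetical.

```python
from statistics import mean, pstdev


def item_statistics(score_matrix):
    """Classical test theory statistics per item.

    score_matrix: one row per student, one 0/1 entry per item.
    Returns a list of (p_value, item_rest_correlation) tuples.
    """
    n_items = len(score_matrix[0])
    results = []
    for j in range(n_items):
        item = [row[j] for row in score_matrix]
        # Rest score: total score without item j, to avoid self-correlation.
        rest = [sum(row) - row[j] for row in score_matrix]
        p_value = mean(item)
        sd_item, sd_rest = pstdev(item), pstdev(rest)
        if sd_item == 0 or sd_rest == 0:
            correlation = float("nan")  # undefined when there is no variance
        else:
            products = [i * r for i, r in zip(item, rest)]
            covariance = mean(products) - p_value * mean(rest)
            correlation = covariance / (sd_item * sd_rest)
        results.append((p_value, correlation))
    return results


# Hypothetical pilot data: 6 students x 4 items.
pilot = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
]
stats = item_statistics(pilot)
for number, (p, r) in enumerate(stats, start=1):
    print(f"item {number}: p-value {p:.2f}, item-rest correlation {r:.2f}")
```

Items with very high or very low p-values, or with a low or negative item-rest correlation, are the usual candidates for review before standard setting.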
Developing reporting Categories
Achievement levels
Require extra help from teacher:
• “Beginner”: a lot of attention and support is necessary; an individual study plan in school and at home must be developed; intensive involvement of the parents is needed.
• “Elementary”: attention is necessary; an individual study plan in school must be developed.
Does not require extra help from teacher:
• “Good level”: mastered the curriculum at a sufficient level, consistent with the expected outcomes of the curriculum.
• “High level”: perfectly mastered the curriculum; requires an extra challenge during the lessons; could be considered as a candidate for math competitions.
DEVELOPMENT OF REPORTING
Reporting to all stakeholders:
• Students and parents:
  1. detailed records on performance on each item per domain
  2. individual progress report on ability scale per domain
• Teachers and subject sections in school: performance of students in combined groups (grade level, school level, etc.)
• School administration and policy makers: an analytical report on monitoring results
Reports
Reporting: grade level
Individual student report (Astana, 2013; translated from Kazakh)
Issued by: Centre for Pedagogical Measurements; Cito Institute for Educational Measurement, the Netherlands
Student: Абдрахманов Даурен
Identifier: 001213550159
Monitoring period: 1 (September 2013)
School: ФМБ, Astana
Class 7C; test takers: 23 in class, 184 in grade cohort, 1061 across the Intellectual Schools (assignment of these four values inferred from the extraction order)

Achievement level per domain (legend codes as printed): Numbers Ж, Algebra Ж, Geometry Ж, Statistics Ж, Math modelling Э

Domain | Total number of items | Items not attempted | Position in grade cohort*
Numbers | 30 | 0 | 144
Algebra | 30 | 0 | 47
Geometry | 30 | 0 | 102
Statistics | 30 | 0 | 120
Math modelling | 30 | 0 | 142
Total | 150 | 0 |

Intervals by domain**
Level | Numbers | Algebra | Statistics | Geometry | Math modelling
Beginner | 50.0-94.8 | 50.0-94.1 | 50.0-91.3 | 50.0-95.3 | 50.0-94.0
Elementary | 94.9-100.4 | 94.2-98.6 | 91.4-101.9 | 95.4-99.6 | 94.1-103.7
Good | 100.5-107.4 | 98.7-106.1 | 102-119.1 | 99.7-107.7 | 103.8-110.6
High | 107.5-150 | 106.2-150 | 119.2-150 | 107.7-150 | 110.6-150

* All students with the same score are placed in the same position.
** Intervals are given on the ability scale.
See the appendices for details.

Values whose row assignment is not recoverable from the extraction: scores 107.1, 101.3, 102.3, 115.2, 102.6; pairs of position among the Intellectual Schools and position in class 555/10, 569/15, 698/15, 154/8, 383/11.
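The interval table in the report maps an ability score to an achievement level. As a minimal sketch, the lookup below uses the intervals printed for the Numbers (Сандар) domain: Beginner 50.0-94.8, Elementary 94.9-100.4, Good 100.5-107.4, High 107.5-150. The function name is hypothetical, and rounding scores to one decimal before the lookup is an assumption based on the precision of the printed intervals.

```python
from bisect import bisect_right

# Lower bounds of the Numbers-domain levels as printed in the report.
LEVELS = ["Beginner", "Elementary", "Good", "High"]
LOWER_BOUNDS = [50.0, 94.9, 100.5, 107.5]
SCALE_MIN, SCALE_MAX = 50.0, 150.0


def achievement_level(score):
    """Map an ability score to its level name for the Numbers domain."""
    score = round(score, 1)  # assumed: scores are reported to one decimal
    if not SCALE_MIN <= score <= SCALE_MAX:
        raise ValueError("score outside the 50-150 reporting scale")
    # Index of the last lower bound that does not exceed the score.
    return LEVELS[bisect_right(LOWER_BOUNDS, score) - 1]


for example in (94.8, 94.9, 104.0, 107.5):
    print(example, achievement_level(example))
```

Each domain has its own cut scores, so a full implementation would keep one list of lower bounds per domain.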
Reporting: student progress (under development)
[Chart: ability scale (50 to 150) against monitoring moments M1 to M6, with level bands Beginner, Base, Advanced and High; data labels 96 and 104]
Plans for further development
1. Developing the monitoring system for other grades
2. Further development of reporting
3. Computerized testing
4. Validity research of the system
5. Introduction of an item banking system
A PAPER PRESENTED AT THE EMBASSY OF THE KINGDOM OF THE NETHERLANDS ON THE CITO/JAMB MEMORANDUM OF UNDERSTANDING.
University entrance exams
The UTME is an achievement test comprising 4 subjects, with Use of English compulsory.
• Duration of the examination: 3.5 hours
• 1.6 million candidates
• 90,000 staff members involved
• 3,000 test centers
• In 2015 only computer based exams (MC): 375 test centers, 30,000 staff members
WHY TRANSIT FROM PAPER AND PENCIL MODE OF TEST ADMINISTRATION TO COMPUTER-BASED TESTING?
• The paper and pencil mode of test administration has been in use since the inception of the Board in 1978.
• An upsurge in the number of prospective candidates overwhelmed the limited carrying capacity of the available institutions in the country.
• This resulted in monumental sharp practices that eroded confidence not only in the selection instrument but in the entire admission system.
The spate of examination security breaches was compounded by various logistic and test administration challenges. These relate mainly to:
• Huge costs and extensive preparations involved in printing and distributing examination syllabuses, brochures, etc.
• Logistics of examination administration in terms of personnel requirements, transport, security, etc.
• Delay in releasing examination results. Prior to 2007, it took a minimum of 3 months to release candidates’ results after the entrance examination.
• Together, these created avenues for serious examination security breaches, thus diminishing validity and reliability.
The training was on:
• Item Response Theory (IRT)
• Item Banking
• Computer-Based Testing
ADDED VALUES OF CBT TO OPERATIONAL EXAMINATION OF JAMB-UTME
(PPT: paper and pencil test administration; CBT: computer-based testing)

S/No. | Logistics | PPT | CBT | Remarks
1 | Printing & distribution of syllabus and brochure | 1.6 million copies (paper) | On compact disc (CD) | Saves cost
2 | Printing of examination materials | 1.6 million papers; duration: 6 weeks | Nil | Saves cost
3 | Use of custodians for examination materials | Third party (banks) in every examination town | Nil | Saves cost and enhances security
4 | Distribution of examination materials | 1 week (7 days) | 3 minutes | More efficient and enhances security
5 | Retrieval of examination materials | 3 days | 3 minutes | More efficient and enhances security
6 | Release of results | 5 days | Instantaneous | More efficient and more secure
7 | Storage and security of test items | In item pool (paper) | In item bank (virtual) | Enhances validity, reliability and security of examination
8 | Manpower involvement | High: 18 invigilators, 5 attendants, 1 supervisor and 1 centre coordinator per centre (= 25) | Low: 2 technical personnel, 1 supervisor, 4 invigilators / BVM invigilators (= 7) | Saves cost and more efficient
9 | Examination centre | Sometimes chaotic/rowdy | Conducive | More efficient
10 | Transportation of examination materials | 98 vehicles | Nil | Flexibility and effectiveness
11 | Test administration | Fixed and sacrosanct | Flexible | Flexibility and effectiveness
12 | Unscannable, missing scripts and results | Rampant | Nil | More effective and secure
13 | Examination malpractice | Rampant | Negligible | Reduced cases of examination malpractice
Thank you