DOCUMENT RESUME
ED 248 190 SP 024 016
TITLE Florida Teacher Certification Examination. Technical Report 1981-1982.
INSTITUTION Florida State Dept. of Education.
PUB DATE 84
NOTE 57p.; For related documents, see SP 024 017-020.
PUB TYPE Reports - Descriptive (141) -- Statistical Data (110)
EDRS PRICE MF01/PC03 Plus Postage.
DESCRIPTORS Higher Education; Knowledge Level; *Measurement Techniques; *Minimum Competency Testing; Preservice Teacher Education; Quantitative Tests; Reading Tests; Standardized Tests; State Standards; *Teacher Certification; Teaching Skills; *Test Construction; Test Interpretation; Test Reliability; *Test Results; *Test Theory; Test Validity; Writing Skills
IDENTIFIERS *Florida Teacher Certification Examination
ABSTRACT
The Florida Teacher Certification Examination (FTCE) is based upon selected competencies that have been identified as minimal entry-level skills for prospective teachers. A description is given of the four subtests which make up the FTCE: (1) writing--essay on general topics; (2) reading--multiple choice "cloze" procedure on general education passages derived from textbooks, journals, and state publications; (3) mathematics--multiple choice questions on basic mathematics, simple computation, and "real world" problems; and (4) professional education--multiple choice questions on general education (personal, social, academic development, administrative skills, and exceptional student education). This technical report provides information on the test's creation and assembly, administration procedures, and scoring and reporting. A table presents statistics on the number and percent of students (1981-82) passing all subtests, by education major or program. The psychometric characteristics of validity, reliability, item discrimination, and contrasting group performance of the FTCE are discussed. Appendices provide information on: (1) essential competencies and subskills tested; (2) mathematical illustrations of formulas; (3) security and quality control procedures; and (4) scoring the writing examination. (JD)
***********************************************************************
* Reproductions supplied by EDRS are the best that can be made        *
* from the original document.                                         *
***********************************************************************
State of Florida
Department of Education
Teacher Certification Section
Tallahassee, Florida 32301
Ralph D. Turlington, Commissioner
Affirmative action/equal opportunity employer
Questions relative to this report or the Florida Teacher Certification Examination may be directed to the staff by calling 904/488-8198 or by writing:

Student Assessment Section
Department of Education
Knott Building
Tallahassee, Florida 32301
FLORIDA: A STATE OF EDUCATIONAL DISTINCTION. "On a statewide average, educational achievement in the State of Florida will equal that of the upper quartile of states within five years, as indicated by commonly accepted criteria of attainment."
349-020784-MCP-200-1
FLORIDA TEACHER
CERTIFICATION EXAMINATION
TECHNICAL REPORT
1981-1982

Student Assessment Section
Bureau of Program Support Services
Division of Public Schools
Department of Education
Tallahassee, Florida 32301

Copyright
State of Florida
Department of State
1984
TABLE OF CONTENTS

INTRODUCTION
  Background
  Description of the Examination 2
  Rasch Calibration of Items 4
TEST COMPOSITION, ADMINISTRATION, AND SCORING 5
  Test Creation and Assembly 5
  Administration Procedures 6
  Scoring and Reporting 7
TEST RESULTS FOR 1981-82 9
PSYCHOMETRIC CHARACTERISTICS 15
  Validity 15
  Reliability 16
  Discrimination 19
  Contrasting Group Performance 23
SUMMARY 27
APPENDICES
  A. Essential Competencies and Subskills 29
  B. Mathematical Illustrations of Formulas 33
  C. Security and Quality Control Procedures 37
  D. Scoring the Writing Examination 41
INTRODUCTION
Background
Each applicant for an initial Florida teacher's certificate must pass the
Florida Teacher Certification Examination (FTCE). The FTCE was established by
Section 231.17, Florida Statutes, and is administered by the Florida Department
of Education.
The competencies that form the basis for the Florida Teacher Certification
Examination were identified through a study conducted by the Council on Teacher
Education (COTE).1 As a result of the study, twenty-three Essential Generic
Competencies were established upon which to base the Examination and to form a
part of the curricular requirements at Florida colleges and universities with
approved teacher education programs. Later legislative action combined two of
the competencies, numbers six and nineteen, and created an additional competency
dealing with education for exceptional students.

An ad hoc task force convened by the Department of Education developed
subskills for the identified competencies. The subskills were reviewed and
critiqued by various individuals and organizations including a random sample of
certified education personnel, statewide professional teacher organizations, and
all colleges and universities with approved teacher education programs. The
twenty-three Essential Generic Competencies and the subskills are listed in
Appendix A.
Test item specifications were written for each subskill. Specifications
are rules and parameters for writing test items to measure a particular
subskill. They provide information such as the length of the stimuli, the mode
of the stimuli (graph, problem situation, mathematical algorithms), the
characteristics of the stem (question, statement completion), the
characteristics of the correct answer, and the characteristics of the foils.
The specifications also include detailed information about the content upon
which the tests are based. The complete specifications are contained in the
Florida Teacher Certification Examination Bulletin II: The General Education
Subtests -- Reading, Writing, Mathematics and in the Florida Teacher
Certification Examination Bulletin III: The Professional Education Subtest.
Copies are available from the Department for a nominal fee.
Passing scores for each subtest were recommended by a panel of judges, all of
whom were either current or past members of COTE and who had been involved in
the development of the Examination. The panel was made up of classroom

____________________
1COTE was a statutory advisory council appointed by the State Board of
Education to advise the Commissioner of Education on all matters dealing with
teacher education and certification. COTE was replaced by the Florida Education
Standards Commission in 1980.
teachers, school administrators, teacher educators, and community representatives. Passing score recommendations were made to the Commissioner of Education for each subtest. These recommendations were adopted as a rule by the State Board of Education on July 30, 1980.
The operational tasks of preparing test forms, administering the tests, and scoring the answer sheets are completed through an external contract. The contract for these tasks was awarded to the University of Florida Office of Instructional Resources for the three administrations of the 1981-82 school year.
Periodically, contracts are issued for the development of additional test items. New items are needed to maintain a large pool of high quality and secure test items. A large item pool makes it possible to develop alternate forms of the test so that an examinee who retakes a subtest will receive a new set of questions.
All item development is subject to the restrictions of the item specifications. Test development contractors must provide intensive item reviews and conduct pilot tests of the items. Following this, the Department invites a panel of college and university educators to review the new items. This review consists of a critical reading of each item for possible bias, adequate subject content, and adequate technical quality. After the new items have been thoroughly reviewed and revised they are field-tested by embedding them in a regular test form and administering them to a sample of examinees. The item difficulties are calibrated with latent trait techniques and equated to existing items. Later forms of the FTCE contain the new items.
Description of the Examination
The FTCE is administered three times a year at sites throughout Florida. The test takes an entire Saturday to complete. Examinees usually receive their results within one month. Examinees who fail any part of the FTCE may retake that portion at a subsequent administration. The FTCE is a written test composed of four subtests. The characteristics of the four subtests are summarized in Table 1.
More detailed information on FTCE administrations is contained in the
Florida Teacher Certification Examination Registration Bulletin.
TABLE 1
A Description of the Four Subtests of the
Florida Teacher Certification Examination

Subtest       Competency    Type of              Content                 Scoring
              Tested        Questions
-------------------------------------------------------------------------------
Writing       2             Essay writing        General topics          Holistic
                            production                                   scoring by
                                                                         trained
                                                                         experts
Reading       4             Multiple choice      General education       Objective
                            "cloze" procedure    passages derived from
                                                 textbooks, journals,
                                                 state publications
Mathematics   5             Multiple choice      Basic mathematics:      Objective
                                                 simple computation
                                                 and "real world"
                                                 problems
Professional  6, 7, 9-14,   Multiple choice      General education       Objective
Education     20-24         (problem solving     (personal, social,
                            application level)   academic development,
                                                 administrative skills,
                                                 exceptional student
                                                 education)
The Writing Subtest is scored holistically (general impression marking) by
three trained judges. The scoring criteria include an assessment of the
following:

1. Using language appropriate to the topic and reader
2. Applying basic mechanics of writing
3. Applying appropriate sentence structure
4. Applying basic techniques of organization
5. Applying standard English usage
6. Focusing on the topic
7. Developing ideas and covering the topic
The Registration Bulletin is available free from Florida school district offices and from the Department of Education.
Four bulletins have been developed to provide information about the development of the FTCE. The subtest and item specifications have been published in Bulletin I: Overview, Bulletin II: The General Education Subtests -- Reading, Writing, Mathematics, and Bulletin III: The Professional Education Subtest. Bulletin IV: The Technical Manual describes the technical adequacy of the first examination. The first three bulletins were distributed to all Florida teacher education institutions and school system personnel offices in the fall of 1979. Bulletin IV, designed primarily for measurement professionals, was published in 1981. An overview of the coverage of the FTCE is provided in Appendix C of Bulletin IV. An annual Technical Report is produced to describe the psychometric characteristics of the three tests administered during each academic year. This report covers the 1981-1982 Examinations.
Rasch Calibration of Items
Calibration of items is conducted using Rasch methodology and the BICAL
computer program. The Rasch model bases the probability of a particular score
on two parameters, the person's ability and the item's difficulty. The model is
expressed as:

P{Xvi | Bv, Di} = exp[Xvi(Bv - Di)] / [1 + exp(Bv - Di)]

in which Xvi = a score
         Bv  = person ability
         Di  = item difficulty

Estimates of person ability and item difficulty are obtained using maximum
likelihood estimation as described in Wright, Mead, and Bell (BICAL:
Calibrating Items with the Rasch Model, 1980).
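For illustration only, the probability above can be computed directly. The sketch below assumes dichotomous (0/1) item scoring, with abilities and difficulties expressed in logits; the function and its names are ours, not part of BICAL:

```python
import math

def rasch_probability(ability, difficulty, score=1):
    """Rasch model probability that a person with the given ability (in
    logits) obtains `score` (1 = correct, 0 = incorrect) on an item with
    the given difficulty (in logits)."""
    return math.exp(score * (ability - difficulty)) / (1.0 + math.exp(ability - difficulty))
```

When ability equals difficulty the probability of a correct response is .50, and it rises toward 1.0 as ability increasingly exceeds difficulty.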
The process of obtaining item difficulties for new items involves field testing experimental items within regularly administered test forms. Multiple forms for each administration are comprised of sets of scored items in each form and different sets of experimental items. A subset of the scored items forms a common link between forms. The new items are calibrated to the same scale as the regular items. All items are then linked to the base scale of November 1980 by a linking constant. This linking constant is the difference between the average calibration values for the common items in November 1980 and their mean difficulty in the current administration. A description of this process can be found in Ryan (Item Banking, 1980).
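The linking step can be sketched as follows (a minimal illustration; the function names are ours):

```python
def linking_constant(base_common, current_common):
    """Difference between the mean November 1980 (base scale) calibrations
    of the common items and their mean difficulty in the current
    administration."""
    mean = lambda values: sum(values) / len(values)
    return mean(base_common) - mean(current_common)

def link_to_base(difficulties, constant):
    """Shift current-administration item difficulties onto the base scale."""
    return [d + constant for d in difficulties]
```

For example, if the common items calibrate 0.2 logits lower on average in the current administration than on the base scale, every new item difficulty is shifted up by 0.2 logits.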
Following each administration, the data are randomly divided into three sets of 700 candidates each. Candidates are assigned in sequential order to the appropriate data set. Calibrations are conducted on the data of the candidates in each set, and the mean difficulty values across the data sets are calculated for each item.
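The report does not spell out the assignment rule beyond "sequential order"; the sketch below assumes a round-robin assignment into three sets of 700 and then averages each item's per-set difficulty estimates:

```python
def split_into_sets(candidates, n_sets=3, set_size=700):
    """Assign candidates in sequential (round-robin) order to the
    calibration sets, using at most n_sets * set_size candidates.
    The round-robin rule is our assumption."""
    sets = [[] for _ in range(n_sets)]
    for index, candidate in enumerate(candidates[: n_sets * set_size]):
        sets[index % n_sets].append(candidate)
    return sets

def mean_difficulty(per_set_estimates):
    """Average an item's difficulty estimates across the calibration sets."""
    return sum(per_set_estimates) / len(per_set_estimates)
```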
TEST COMPOSITION, ADMINISTRATION, AND SCORING
Test Creation and Assembly
The items contained in the Department of Education item bank are calibrated
and equated to the base scale established during the April 1980 field test. The
items are given identification codes and detailed information on the item usage
is maintained, including the identification of the form on which each item was
used, the difficulty value, item point-biserial correlation, and Rasch fit
statistics for each item.
Each test form is designed to ensure that the items (a) fit the item
specifications for the skill that they were designed to measure, (b) conform to
the test specifications in number and type, and (c) represent a range of
difficulty with a mean difficulty approximating zero logits.
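Criterion (c) can be checked mechanically. In this sketch the 0.1-logit tolerance is our assumption, since the report says only that the mean should approximate zero:

```python
def mean_difficulty_near_zero(difficulties, tolerance=0.1):
    """True if the form's mean item difficulty (in logits) lies within
    `tolerance` of zero; the tolerance value is illustrative."""
    return abs(sum(difficulties) / len(difficulties)) <= tolerance
```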
A test blueprint is prepared for each form. Items are selected and
subjected to content, style, and statistical reviews by the Office of
Instructional Resources at the University of Florida and by the Florida
Department of Education. Test items are screened for content overlap.
Placement of the items on the test is primarily a function of appearance
and content. The order of the items is not related to their difficulty. Items
are grouped together if they are similar in editorial style, directions, and
question stems.
Experimental items are field-tested within each subtest but are not counted
in a candidate's score. When multiple test forms are used, the core of regular
(scored) items in each form remains the same for any administration. Test forms
are spiralled so that each test center receives approximately the same number of
each form. In this way, all experimental items are field-tested by at least 400
candidates who represent a cross-section of the people who take the Examination.
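Spiralling amounts to handing out the available forms in strict rotation, which keeps the counts of each form nearly equal within a center; a minimal sketch:

```python
from itertools import cycle, islice

def spiral_forms(num_candidates, forms):
    """Assign test forms to candidates in rotation (A, B, C, A, B, C, ...)
    so each form is used approximately equally often."""
    return list(islice(cycle(forms), num_candidates))
```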
Once the form has been approved, the scoring key is verified. Staff
members from the Department of Education, the Office of Instructional Resources,
and three teachers from the public schools take the Examination. These persons
are also asked to identify any ambiguous items or confusing directions.
Camera-ready copy is prepared by a test specialist and a graphic artist.
Attention is paid to the proper placement of items to provide workspace where
necessary. The camera-ready copy is again critiqued by the staff in the
Department of Education and the Office of Instructional Resources. Corrections
are made, the copy is sent to the printer, and a final check of the proof is
made before the tests are printed.
Administration Procedures
Examination Dates, Times, and Locations
The FTCE is administered in the fall, winter, and summer of each year. Administration dates for 1981-82 were October 31, 1981; February 27, 1982; and July 10, 1982. Candidates were permitted to take all four subtests or any subtest previously not passed. Thirteen locations in the state were designated as testing areas. Specific sites within each area were selected as test centers. These centers were selected from the pool of established centers for the administration of standardized examinations. Designated test locations for the 1981-82 administrations were:

1. Pensacola         8. Miami
2. Tallahassee       9. Fort Myers
3. Gainesville      10. Orlando
4. Jacksonville     11. Boca Raton
5. St. Petersburg   12. DeLand
6. Tampa            13. Lakeland
7. Sarasota
All test centers were inspected to ensure that the rooms met the required specifications for lighting, seating capacity, storage facilities, air conditioning, and protection from outside disturbances. All facilities were able to accommodate handicapped candidates.
The test schedule is divided into morning and afternoon sessions. Testing time is fixed but allows adequate time for candidates to complete all sections of the Examination. Candidates may continue to the Reading Subtest after they finish the Mathematics section. The schedule for each subtest is listed below:

Writing                  45 minutes    9:00 a.m. - 9:45 a.m.
Mathematics              70 minutes   10:00 a.m. - 11:10 a.m.
Reading                  50 minutes   11:10 a.m. - 12:00 noon
Break                    60 minutes   12:00 noon - 1:00 p.m.
Professional Education  150 minutes    1:30 p.m. - 4:00 p.m.
A security plan has been developed and implemented for the program. Refer to Appendix C for further information about security and quality control.
Special arrangements are made as necessary for handicapped candidates. A Braille version of the Examination is available. Typewriters or a flexible time schedule are permitted for handicapped candidates.
Test Manuals
Uniform testing procedures were established for use at all centers
throughout the state. Documentation of the procedures is available in the Test
Administration Manual for the program. The administration manual includes the
following topics:

1. Duties of the Test Center Personnel
2. Receipt and Security of Test Materials
3. Admission, Identification and Seating of Candidates
4. On-site Test Administration Practices and Policies
5. Timing of Subtest Sections
6. Instructions for Completing Answer Documents
7. Special Arrangements for Handicapped Students
Additional information to candidates is found in several other sources.
Candidates are notified about Examination requirements, locations, and
procedures in the registration bulletin. Specific directions to candidates
about the assigned test center and necessary supplies are printed on the
Admission Ticket.
Scoring and Reporting
Scoring
The scoring process begins with a hand edit of the answer sheets, followed
by the scanning of an initial set of sheets to verify the accuracy of the
scanner, the key, and the scoring programs. The remaining sheets are scanned,
and the data are divided into sets of 700 candidates. Three data sets are drawn
using a systematic random sampling method for the calibration of items using the
Rasch methodology. The items are adjusted to the base scale established by the
April 1980 field test. A score table of equivalent raw scores to ability logits
is calculated and used to determine the ability logits for the remaining
candidates. Each person's score in ability logits is then transformed to a
scale score with 200 as the minimum passing score. For a discussion of the
procedures used to establish the cutting score see the technical discussion in
Bulletin IV.
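The exact logit-to-scale-score transformation is not published here (see Bulletin IV); the sketch below assumes a simple linear transform anchored so that the cut score maps to the minimum passing score of 200, with a purely illustrative slope of 10 scale points per logit:

```python
def to_scale_score(ability_logit, cut_logit, points_per_logit=10.0):
    """Map an ability estimate in logits to a reported scale score,
    anchoring the passing standard at 200. The slope is illustrative,
    not the program's actual transformation."""
    return round(200 + points_per_logit * (ability_logit - cut_logit))
```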
The essay is rated by three readers who use a four-point scale defined in
State Board of Education rules. The resulting scores range from three to twelve
points. The passing standard is set at six points. Details of the criteria for
the rating of essays are available in Bulletin II.
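The pass rule just described can be illustrated directly (the function is our sketch; the scale and standard are those of the State Board rule):

```python
def essay_result(ratings):
    """Sum three readers' ratings (each 1-4, so totals run 3-12) and
    apply the six-point passing standard."""
    if len(ratings) != 3 or any(r < 1 or r > 4 for r in ratings):
        raise ValueError("expected three ratings, each from 1 to 4")
    total = sum(ratings)
    return total, total >= 6
```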
Reporting
The reports generated for each administration include a candidate report
and score interpretation guide, reports for institutions, and state-level
reports.
Candidate reports indicate whether or not a test is passed; scaled scores are reported only for tests failed. Scores above the passing standard are not reported. However, candidates who fail one or more tests are provided their scale score for each subtest failed. A detailed analysis of performance is provided to individuals who fail the Professional Education Subtest.
The reports generated for the institutions and the state are listed below:

1. Number and Percent Passing for:
   a. Each subtest
   b. All four subtests
   c. Three, two, one or no subtests

2. Number, Percent Passing and Mean Scores for Each Subtest and the Total Examination by All Candidates and:
   a. First-time candidates
   b. Re-take candidates
   c. Vocational candidates
   d. Non-vocational candidates
   e. Florida candidates
   f. Non-Florida candidates
   g. Florida candidates from approved degree programs
   h. Florida candidates from non-approved programs
   i. Sex and ethnic categories

3. Number and Percent of Candidates by Florida Institutions and by Programs, Passing All Subtests and Each Subtest

4. Number and Percent of Candidates Passing All Subtests and Each Subtest by Program Statewide

5. Frequency Distribution for All Candidates for Each Subtest by Sex and Ethnic Category

6. Frequency Distribution for Each Subtest for Florida Institutions
Statistical analyses of data are reported in the sections on the psychometric characteristics of the Examination.
TEST RESULTS FOR 1981-82
Results for the three test administrations2 in this report are summarized
in this chapter. The overall passing rates are shown in Table 2. As can be
seen from the data, there are no differences between first-time and all
candidates for the first two administrations, and two percentage points higher
performance for first-time takers in February 1982. The February 1982
administration was the first one that showed a large effect from retakers.
This effect is not unexpected.
Table 2
Percent of Candidates Passing All Subtests of the
Florida Teacher Certification Examination
August 1981 - February 1982

                 First-Time        All
                 Candidates     Candidates
August 1981          80             80
October 1981         84             84
February 1982        86             84
Tables 3 through 5 on the following pages show the number and percent
passing each subtest and all subtests for: (a) approved program candidates;
(b) non-approved program candidates; and (c) vocational technical candidates.
2The August 1981 administration was a part of the data for the 1980-81
Technical Report. It is also in this report because of a decision to make
the Technical Report coincide with the same test administrations that are
used in calculating the eighty percent report for approved programs. The
eighty percent performance report is calculated from the summer, fall,
and winter administrations.
Table 3
Florida Teacher Certification Examination Number and Percent Passing
All Subtests and Each Subtest by Program Statewide for
Approved Program Candidates
August 1981; October 1981; and February 1982 Administrations
[Table 3 lists, for each approved program (e.g., Art Education, Business Education, Early Childhood, Elementary Education, English, Mathematics, Physical Education, and the exceptional student education programs), the number of candidates, the number passing, and the percent passing for the total test and for each subtest. The figures are not legible in this reproduction.]
Table 4
Florida Teacher Certification Examination Number and Percent Passing
All Subtests and Each Subtest by Program Statewide for
Non-approved Program Candidates
August 1981; October 1981; and February 1982 Administrations

[Table 4 lists, for each non-approved program, the number of candidates, the number passing, and the percent passing for the total test and for each subtest. The figures are not legible in this reproduction.]
'volt
Table 4 cont'd.

[Number and percent passing by program, continued. Legible program names include Law, Letters, Library Science, Psychology, Public Affairs & Service, Social Sciences, Interdisciplinary Studies, Vocational, Secondary Science, and Industrial Education; the figures themselves are not legible in the source scan.]
Table 5

Florida Teacher Certification Examination
Number and Percent Passing All Subtests and Each Subtest by Program Statewide
for Vocational Technical Candidates
August 1981, October 1981, and February 1982 Administrations

                   TOTAL #    # PASSED    % PASSED
TOTAL TEST           422        179          42%
READING              369        263          71%
MATH                 379        273          72%
PROF. EDUCATION      354        265          75%
WRITING              361        247          68%
PSYCHOMETRIC CHARACTERISTICS
The psychometric characteristics of validity, reliability, item
discrimination, and contrasting group performance of the Florida Teacher
Certification Examination (FTCE) will be addressed in this section. Knowledge
of the psychometric characteristics of assessment tests is necessary for
evaluating the tests.
Validity
Validity refers to the relevance of inferences that are made from test
scores or other forms of assessment. The validity of a test can be defined as
the degree to which a test measures what it was intended to measure. Validity
is not an all-or-none characteristic, but a matter of degree. Validity is
needed to ensure the accuracy of information that is inferred from a test score.
Specific types of validation techniques traditionally used to summarize
educational and psychological test use -- criterion-related validity
(predictive and concurrent), content validity, and construct validity -- are
described in Standards for Educational and Psychological Measurement (APA, 1974,
pp. 26-31). For the FTCE, the primary validity issue that must be addressed is
the question of content validity. Content validity demonstrates that test
behaviors constitute a representative sample of behaviors in a desired
performance domain. The intended domain of the FTCE is that of entry-level
skills as identified in the statute requiring the Examination as a basis for
certification. This statute (231.17) provides that
     Beginning July 1, 1980 ... each applicant for initial
     certification shall demonstrate, on a comprehensive written
     examination and through such other procedures as may be
     specified by the state board, mastery of those minimum
     essential generic and specialization competencies and other
     criteria as shall be adopted into rules by the state board.
The statute addresses only the status at certification and does not require
that inferences be made from test scores to future success as a classroom
teacher. No claims have been made with regard to measurement of specific
aptitudes or traits, and no attempt has been made to establish relationships
between the FTCE and independent concurrent or future criteria. It is only
claimed that the test adequately measures the skills for which it was developed.
The construct and criterion-related validation approaches are not appropriate to
the validity issues related to development and use of the FTCE.
The content validity of the FTCE rests upon the procedures used to describe
and develop test items and content areas. The intended coverage of the test was
determined by a process involving professional consensus to (1) identify
competencies which should be demonstrated as a condition for certification, and
(2) identify subskills associated with each competency. The procedures by which
the intended coverage was identified included surveys of the profession, reviews
by the Council on Teacher Education (COTE), reviews by the ad hoc COTE task
force, and reviews by teachers and other professional personnel.
The general procedures used in test development were as follows:
1. The intended test coverage was identified and explicated. Competencies
   and subskills associated with each competency were identified and
   validated.

2. Test item specifications were developed and validated.

3. Draft items were written according to test item specifications and
   pilot-tested on a small sample of senior students preparing to be
   teachers.

4. The final item review consisted of (a) a review by a special panel
   comprised of classroom teachers, teacher educators, and administrators,
   and (b) item field-testing with seniors who were in teacher education
   programs. This was followed by another review by Department of Education
   staff. Items were subsequently placed in the item bank for future use.

5. Field-test data were reviewed by Department of Education staff. Items
   that did not perform well were deleted from the item bank or revised
   and field-tested again.
For the final item review process outlined in the fourth step, the items were divided by test area and reviewers were divided by area of expertise. The process included a review of item content, group differences in performance, and technical quality. Bulletin IV (pp. 13-17) contains further information about the development and review of test items.
In summary, the validity of the Examination has been well established as a result of (1) the extensive involvement of education professionals in the identification and explication of the necessary competencies and their associated subskills, (2) the precise item specifications which guided the item writers, and (3) the reviews of the items and the competencies/skills that they were designed to measure.
Reliability of Test Scores
Reliability refers to the consistency between two measures of the same performance domain. Although reliability does not ensure validity, it limits the extent to which a test is valid for a particular purpose. The main reliability consideration for the FTCE multiple-choice tests (Reading, Mathematics, and Professional Education) is the reliability of an individual's score. For the Writing test, a production writing sample, the reliability consideration is the reliability of the judges' ratings. The data in this section refer to the three FTCE administrations between October 1981 and July 1982. For information about field test reliability data, refer to Bulletin IV (1981).
Reliability of the Multiple-Choice Subtests
A test score is comprised of a "true" score ("domain" score) and an "error"
score. If an individual took several forms of a test, all constructed by
sampling from the defined item domain, scores on the various test forms would
not vary except as a result of random errors associated with item sampling
and changes within an individual from one test to another, such as
attention, fatigue, or interest.
Reliability evidence is generally of two types: (a) internal consistency,
which is essential if items are viewed as a sample from a relatively homogeneous
universe; and (b) consistency over time, which is important for tests that are
used for repeated measurement. For the FTCE, the primary reliability issue is
that of internal consistency. Since one form of the test is administered to
examinees at each administration, the reliability concern is that of consistency
of items within that particular test (homogeneity of items). A test can be
regarded as composed of as many parallel tests as the test has items, and every
item is treated as parallel to the other items. In such a case, the appropriate
reliability index is the Kuder-Richardson Formula 20 (KR-20) index. The KR-20
formula is shown in Appendix B.
The KR-20 index estimates the internal-consistency reliability of a test
from statistical data on individual items. Separate KR-20 coefficients are
calculated for the Reading, Mathematics, and Professional Education subtests for
each FTCE administration. A high coefficient indicates that a test accurately
measures some characteristic of persons taking it and means that the individual
test items are highly correlated. The subtest KR-20 coefficients for the three
1981-1982 test administrations were above .78, indicating that the individual
test items were highly consistent measures of the three subject areas assessed.
Refer to Table 6 for the KR-20 coefficients.
Table 6

Kuder-Richardson Coefficients

                    Math    Reading    Professional Education
October 1981         .88      .87              .83
February 1982        .84      .85              .79
July 1982            .87      .87              .81
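The KR-20 computation described above (and given in Appendix B) can be sketched as follows. The 0/1 response matrix and the function name are illustrative assumptions, not FTCE data.

```python
def kr20(responses):
    """Kuder-Richardson Formula 20 internal-consistency estimate.

    `responses` is a matrix of 0/1 item scores: rows are examinees,
    columns are items.
    """
    n_items = len(responses[0])
    n_examinees = len(responses)
    # Sum of p*q over items, where p is the proportion answering the item
    # right and q = 1 - p.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_examinees
        pq_sum += p * (1 - p)
    # Variance of the examinees' total scores.
    totals = [sum(row) for row in responses]
    mean = sum(totals) / n_examinees
    var_total = sum((t - mean) ** 2 for t in totals) / n_examinees
    return (n_items / (n_items - 1)) * (1 - pq_sum / var_total)
```

For a tiny four-examinee, three-item matrix such as [[1,1,1], [1,1,0], [1,0,0], [0,0,0]], the sketch returns .75; operational coefficients like those in Table 6 would come from the full response matrices.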
Reliability of the passing standards for the objective tests is estimated with the Brennan-Kane (B-K) Index of Dependability. This index is an estimate of the consistency of test scores in classifying examinees as masters or nonmasters of the minimal performance standards. The high B-K coefficients of the tests (refer to Table 7) indicate that the candidates' scores are consistent with their classification as masters or nonmasters. Refer to Appendix B for the statistical formula for the B-K Index.
TABLE 7

Brennan-Kane Indices

                    Math    Reading    Professional Education
October 1981         .94      .94              .96
February 1982        .96      .94              .96
July 1982            .95      .94              .95
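A hedged sketch of the B-K index, following the M(C) formula given in Appendix B. The proportion-correct scores, cut score, and test length below are illustrative, not FTCE values.

```python
def brennan_kane(scores, cutoff, n_items):
    """Brennan-Kane index M(C), per the Appendix B formula.

    `scores` are examinees' proportion-correct scores, `cutoff` is the
    mastery cut score as a proportion, and `n_items` is the test length.
    """
    n_persons = len(scores)
    grand_mean = sum(scores) / n_persons
    # Variance of persons' mean (proportion-correct) scores over items.
    var_persons = sum((x - grand_mean) ** 2 for x in scores) / n_persons
    numerator = (grand_mean * (1 - grand_mean) - var_persons) / (n_items - 1)
    denominator = (grand_mean - cutoff) ** 2 + var_persons
    return 1 - numerator / denominator
```

The index approaches 1 as examinees' scores spread well clear of the cut score, which is why the easy, well-targeted subtests in Table 7 show values in the mid-.90s.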
Reliability of Scoring of the Writing Subtest
The major reliability consideration for the Writing test is the inter-judge reliability of ratings. The Writing test is a production writing sample that addresses one of two specific topics. The essays are rated independently by three judges with a referee to reconcile discrepant scores. Original reliability data were obtained from a study in which essays were written by 360 teacher education students at two universities. Raters were trained by the same procedures which are being used in the actual test administrations. The reliability of the scoring process is monitored at the University of Florida for each test administration. (Refer to Appendix D for additional information about the scoring of the Writing test.)
Two approaches are used to estimate the reliability. First, four indices of inter-rater agreement are computed. These four indices are: (a) percent complete agreement; (b) average percent of two of the three raters agreeing; (c) average percent agreement by pairs as to pass/fail; and (d) percent complete agreement about pass/fail. The second approach for reliability estimation is the calculation of coefficient alpha for the raters and the rating team. This coefficient indicates the expected correlation between the ratings of the team on this task and those of a hypothetical team of similarly comprised and similarly trained raters doing the same task. Field test inter-rater reliability data and coefficient alpha for the inter-rater reliabilities are reported in Tables 3.4 and 3.5 of Bulletin IV (pp. 22-23). Refer to Table 8 for rater reliability data for the 1981-1982 FTCE administrations.
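The four agreement indices can be sketched as follows for three raters per essay. The rating tuples and the passing score used in the example are illustrative assumptions, not FTCE values.

```python
def agreement_indices(ratings, passing):
    """Compute the four inter-rater agreement indices as percentages.

    `ratings` is a list of (r1, r2, r3) scores, one tuple per essay;
    `passing` is the score at or above which an essay passes.
    """
    n = len(ratings)
    # (a) all three raters assign the identical score
    complete = sum(1 for a, b, c in ratings if a == b == c) / n * 100
    # (b) at least two of the three raters assign the identical score
    two_of_three = sum(
        1 for a, b, c in ratings if a == b or b == c or a == c) / n * 100
    # (c) average, over essays, of the fraction of rater pairs that agree
    #     on the pass/fail decision
    pairs = [(0, 1), (1, 2), (0, 2)]
    pair_pf = sum(
        sum((r[i] >= passing) == (r[j] >= passing) for i, j in pairs) / 3
        for r in ratings) / n * 100
    # (d) all three raters agree on the pass/fail decision
    complete_pf = sum(
        1 for a, b, c in ratings
        if (a >= passing) == (b >= passing) == (c >= passing)) / n * 100
    return complete, two_of_three, pair_pf, complete_pf
```

As Table 8 illustrates, indices (c) and (d) run much higher than exact-score agreement, since raters need only agree on which side of the cut an essay falls.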
TABLE 8

Percentage of Rater Agreement for FTCE Writing Test

                                            October   February     July
                                               1981       1982     1982
Index 1 - % Complete Agreement                46.44      38.55    38.60
Index 2 - Average % Two of the Three
          Raters Agreeing                     99.87     100.00    91.81
Index 3 - Average % Agreement by
          Pairs as to Pass/Fail               98.10      96.75    96.72
Index 4 - % Complete Agreement
          About Pass/Fail                     97.15      95.12    95.08

Coefficient Alpha
Topic 1                                         .84        .87      .82
Topic 2                                         .85        .85      .86
Examination of the reliability data for the Writing test indicates that the
level of reliability achieved by the rating teams met acceptable standards for
such ratings.
Discrimination
Item analysis for the FTCE includes examination of the items' capacity to
differentiate between ability groups and the evaluation of response patterns to
the individual items. The item analysis indices used are item difficulty level,
item discrimination index, and point-biserial correlation coefficients.
Item difficulty level -- the percentage of examinees who answer each item
correctly -- is calculated for each item. These percentages provide important
information because items in the moderate range of difficulty differentiate
relatively more examinees from each other than do extremely easy or extremely
difficult items.
Related to the item difficulty level is the item discrimination index (see
page 32), which is the extent to which each item contributes to the total test
in terms of discriminating between the high and low achievers with regard
to the total test score. Any item that is below .20 on this index is evaluated
for content and ambiguity of wording. Items that appear to be flawed are revised or eliminated. The ranges for item difficulty level and corresponding item discrimination indices are reported in Table 9.
The number and percent of examinees who select each alternative response (foil) are reported for each item in the multiple-choice tests. This foil analysis permits further evaluation of response patterns to the individual items and provides useful information about variations in response performance by different groups. These data are provided to the Department of Education staff and appropriate subcontractors and are not reported in this document.
Point-biserial correlation coefficients indicate the extent to which examinees with high test scores tend to answer an item correctly and those with low test scores tend to miss an item. While the item discrimination index is based on the performance of high and low achievers, the point-biserial coefficient includes the entire range of scores in the correlation, thereby indicating the item-total correlation, or the extent to which an item score correlates with all other items measuring a particular subject area. Statistical formulas for these indices are listed in Appendix B.
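The three item-analysis indices discussed above can be sketched from a 0/1 response matrix. The 27% grouping and the formulas follow Appendix B; the matrix and function name are illustrative assumptions, not FTCE data.

```python
def item_statistics(responses, item):
    """Difficulty, discrimination index, and point-biserial for one item.

    `responses` is a 0/1 matrix (rows = examinees, columns = items);
    `item` is the column index to analyze.
    """
    n = len(responses)
    scores = [sum(row) for row in responses]
    right = [row[item] for row in responses]
    n_right = sum(right)

    # Item difficulty: proportion answering the item correctly.
    p = n_right / n

    # Discrimination index: correct counts in the upper and lower 27%
    # score groups, divided by half the combined group size.
    order = sorted(range(n), key=lambda i: scores[i])
    k = max(1, round(0.27 * n))
    r_low = sum(responses[i][item] for i in order[:k])
    r_high = sum(responses[i][item] for i in order[-k:])
    discrimination = (r_high - r_low) / k

    # Point-biserial (Appendix B form): mean totals of the "right" and
    # "wrong" groups, SD of the total scores, and sqrt(pq).
    m_s = sum(scores[i] for i in range(n) if right[i]) / n_right
    m_u = sum(scores[i] for i in range(n) if not right[i]) / (n - n_right)
    mean_all = sum(scores) / n
    sd = (sum((s - mean_all) ** 2 for s in scores) / n) ** 0.5
    r_pb = (m_s - m_u) / sd * (p * (1 - p)) ** 0.5
    return p, discrimination, r_pb
```

An item that high scorers answer correctly and low scorers miss yields a discrimination index near 1 and a high point-biserial; an index below .20 would flag the item for the content review described above.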
TABLE 9

Frequency of Items within Specific Item Difficulty and Item Discrimination Ranges

[Cross-tabulation of item counts by item difficulty range (.0-.20 through .81-1.00) and item discrimination range (.10 and below through .91-1.00), reported separately for the Math, Reading, and Professional Education subtests. The individual cell frequencies are not legible in the source scan; the subtest item pools comprised 130 Math items and 240 items each for Reading and Professional Education (see Table 11).]
Means and ranges for the point-biserial correlation coefficients for the three 1981-1982 administrations are reported in Tables 10 and 11.
TABLE 10

Mean Point-Biserial Correlation Coefficients

                    Math    Reading    Professional Education
October 1981         .38      .32              .27
February 1982        .37      .31              .25
July 1982            .40      .34              .25
TABLE 11

Point-Biserial Correlation Coefficients Between
Correct Item Response and Subtest Score

Range of Point-Biserial
Coefficients                 Math    Reading    Professional Education
.90-.99                        0        0               0
.80-.89                        0        0               0
.70-.79                        0        0               0
.60-.69                        1        0               0
.50-.59                       18        1               0
.40-.49                       38       55              14
.30-.39                       42       93              64
.20-.29                       23       78             100
.10-.19                        8       12              50
Below .10                      0        1              12
TOTAL ITEMS FOR 1981-82      130      240             240
The appropriateness of mean point-biserial correlation coefficients must be
evaluated in the context of a particular testing program. According to A
Reader's Guide to Test Analysis Reports (ETS, 1981), the mean biserial
correlation will be higher when the examinee group represents a wide range of
ability or knowledge or when the test items are very similar in content. Since
the FTCE Reading and Professional Education tests are relatively easy, the
scores were not greatly different. Thus, variability was reduced, and the
point-biserial correlation coefficients were attenuated.
Contrasting Group Performance
To the extent that scores on a test reflect group membership rather than
the knowledge or skill that the test is designed to measure, the test is
invalid. Although not all groups necessarily exhibit the same performance
in different areas of achievement, the procedure for analyzing contrasting group
performance is to screen for any specific areas or items. Extensive review
procedures were used during FTCE development to ensure that the Examination
content was an accurate representation of candidate performance in terms of the
competencies being evaluated. The procedure included (a) a series of reviews
during the item development stage to screen for possibly offensive materials and
for items that might invalidate examinee performance and (b) statistical
analysis of field groups, ethnic groups, and program groups. These procedures
are described in Bulletin IV (pp. 33-38).
After each FTCE administration, test content is examined for contrasting
group performance. Score distributions and summary statistics (including the
mean, median, and standard deviation of the distribution and an index of
skewness) are reported for each test. The content review for contrasting group
performance includes (a) examination of scatterplots of performance on
individual items and overall content by sex and ethnic category (male-female,
black-white, white-Hispanic, and Hispanic-black), (b) analysis of performance
by groups based on their test scores, and (c) individual item analysis by sex
and ethnic category to screen for items that may discriminate negatively for a
specific group.
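A minimal sketch of the summary statistics named above. The report does not specify which skewness index it used, so the common moment-based index shown here is an assumption, and the scores are illustrative.

```python
def summarize(scores):
    """Return mean, median, standard deviation, and a skewness index."""
    n = len(scores)
    mean = sum(scores) / n
    ordered = sorted(scores)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    sd = (sum((x - mean) ** 2 for x in scores) / n) ** 0.5
    # Moment-based skewness: third central moment over sd cubed.
    skew = sum((x - mean) ** 3 for x in scores) / n / sd ** 3
    return mean, median, sd, skew
```

A positive skewness index flags a distribution with a long upper tail; for mastery tests like the FTCE, score distributions typically skew negatively, with most candidates clustered near the top.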
Scatterplots
Scatter diagrams are graphic representations of the extent to which
performance by two separate groups is related. Twelve scatterplots are produced
for each FTCE administration, comparing performance by sex and ethnic category
for each subtest. Entries that depart from the general pattern indicate that
one group is performing differently from another group on specific items. In
such cases, entries that depart substantially from the general pattern of other
entries are reviewed for content that could account for differences in
performance level. Items that are determined to be flawed during this review
are revised or deleted from the item pool. An example of a scatterplot is
illustrated by Figure 1.
Figure 1. Scatterplot of Percent of Examinees Who Answered Specific Items Correctly.(a)

[Scatterplot not reproducible from the source scan.]

(a) Performance for males is plotted on the vertical axis while performance for females is plotted on the horizontal axis.
Subtest Performance by Groups
The number and percentage of candidates who pass all subtests and individual subtests are reported by sex and ethnic designation after each FTCE administration. Table 12 displays these data.
Table 12

Number and Percent Passing All Subtests and Each Subtest
by Sex and by Ethnic Designation

[The table is printed inverted in the source scan and its figures are not legible.]
Item Analysis by Sex and Ethnic Category
Separate item analyses -- including item difficulty levels, item discrimination indices, point-biserial correlations, foil analyses (alternative response choices), and KR-20 estimates of reliability -- are reported for each sex and ethnic category. The item analysis process includes the screening of the individual test items that may discriminate negatively for a specific group. When an outlying entry is identified on a scatter diagram, the item content is carefully reviewed to determine the necessity of deleting or revising the item. Foil analyses may also provide useful information with regard to contrasting group performance. Variations in response patterns by groups to different foils (alternative responses) may indicate the need for item revision.
The procedures described in this section -- including scatter diagrams, the analysis of subtest performance by groups, and item analysis by sex and ethnic category -- are used to ensure that scores obtained on the FTCE are accurate representations of the candidates' performance levels in terms of the competencies that are addressed and are not a reflection of membership in a specific sex or ethnic category.
SUMMARY

The Florida Teacher Certification Examination (FTCE) is an examination based upon selected competencies that have been identified by Florida educators as minimal entry-level skills for prospective teachers. In order to develop the Examination, the following tasks had to be accomplished: (a) planning; (b) writing and validation of test items; (c) field-testing the examination items; (d) setting passing scores; and (e) preparing for test assembly, administration, and scoring. The competencies (described in Appendix A) have been adopted by the Board of Education as curricular requirements for teacher education programs in the colleges and universities in Florida.
The FTCE consists of three objective tests (Reading, Mathematics, and Professional Education) and an essay test (Writing) that is scored by trained readers. The general test content is as follows:
Test                      Content

Writing                   One of two general topics

Reading                   General education passages derived from
                          textbooks, journals, state publications

Mathematics               Basic mathematics: simple computation and
                          "real world" problems

Professional Education    General education including personal, social,
                          academic development, administrative skills,
                          exceptional student education
Developmental items are included in the Examination along with regular test items. These developmental items are not counted in computing an individual's score.
The psychometric characteristics of validity, reliability, item discrimination, and contrasting group performance of the FTCE are described in this report. The validity of the examination has been well established as a result of (1) the extensive involvement of education professionals in the identification and explication of the necessary competencies and their associated subskills, (2) the precise item specifications which guided the item writers, and (3) reviews of the items and the competencies/skills that they were designed to measure. The reliability data indicate that the test items are consistent measures of the three subject areas and that the examinees' scores are consistent with their classification as masters or nonmasters of the minimal performance standards. The reliability data for the Writing test demonstrate that the scoring by the writing teams meets acceptable standards of consistency. Item analyses for the FTCE examine the power of the items to differentiate
between ability groups and evaluate response patterns to individual items. The indices that are used to monitor the differences between ability groups are item percent correct, item discrimination index, and point-biserial correlation coefficients. Additional item analysis procedures -- including scatter diagrams, the analysis of subtest performance by groups, and item analysis by sex and ethnic category -- are used. These procedures ensure that scores obtained on the FTCE are accurate representations of the candidates' performance levels in terms of the competencies that are addressed and are not a reflection of membership in a specific sex or ethnic category.
The FTCE is administered three times a year in selected locations throughout the state. Data from this report indicate that the percentages of candidates who passed the entire FTCE for the October 1981, February 1982, and July 1982 administrations were 84 percent, 84 percent, and 85 percent, respectively. Examinees who do not pass all of the tests at one administration may retake the tests not passed at later scheduled testing dates.
APPENDIX A

[The list of essential competencies and subskills tested is not legible in the source scan.]

APPENDIX B
The following formulas were used in the calculation of statistics for the
FTCE:

(a) Point-Biserial Correlation

    r_pb = [(m_s - m_u) / sigma] * sqrt(pq)

    where r_pb  = point-biserial correlation coefficient
          m_s   = mean total score of examinees answering the item right
          m_u   = mean total score of examinees answering the item wrong
          sigma = standard deviation of total score for the entire group
          p     = proportion of examinees getting the item right
          q     = 1 - p

(b) Kuder-Richardson Formula 20 Reliability Coefficient

    KR-20 = [k / (k - 1)] * [1 - (sum of pq) / sigma_x^2]

    where k         = number of items/questions (any omitted questions
                      not included)
          p         = proportion of examinees getting the item right
          q         = 1 - p
          sigma_x^2 = variance of the total score

(c) Standard Error of Measurement

    sigma_e = sigma_x * sqrt(1 - r_xx)

    where sigma_e = standard error of measurement
          sigma_x = standard deviation of total scores
          r_xx    = reliability coefficient

(d) Item Discrimination Index

    D = (R_u - R_l) / (T / 2)

    where R_u = number of students in the high score range (i.e., the
                upper 27%) who answered the item correctly
          R_l = number of students in the low score range (i.e., the
                lower 27%) who answered the item correctly
          T   = total number in the upper and lower groups

(e) Coefficient Alpha

    Coefficient alpha is used as an estimate of the inter-rater reliability
    of Writing test scores. This coefficient indicates the expected
    correlation between the ratings of the team on this task and those of a
    hypothetical team of similarly comprised and trained raters doing the
    same task.

    r_kk = [k / (k - 1)] * [1 - (sum of sigma_i^2) / sigma_x^2]

    where r_kk             = coefficient of reliability (alpha)
          k                = number of test items
          sum of sigma_i^2 = sum of the variances of each item
          sigma_x^2        = variance of the examinees' total test score

(f) Brennan-Kane Reliability

    M(C) = 1 - {[X_pI(1 - X_pI) - S^2(X_p)] / (n - 1)}
               / [(X_pI - C)^2 + S^2(X_p)]

    where n        = number of items
          X_pI     = grand mean over persons and items (mean proportion
                     correct)
          C        = cutting score expressed as a proportion
          S^2(X_p) = sample variance of persons' mean scores over items
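Formulas (c) and (e) above can be worked through in a short sketch. The numbers and function names are illustrative assumptions, not FTCE statistics.

```python
def standard_error_of_measurement(sd_total, reliability):
    """Formula (c): SEM = SD of total scores * sqrt(1 - reliability)."""
    return sd_total * (1 - reliability) ** 0.5


def coefficient_alpha(item_scores):
    """Formula (e): alpha = (k/(k-1)) * (1 - sum of item variances /
    variance of total scores).

    `item_scores` is a matrix: rows = examinees (or essays), columns =
    items (or raters).
    """
    k = len(item_scores[0])

    def variance(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / len(values)

    item_var_sum = sum(
        variance([row[j] for row in item_scores]) for j in range(k))
    totals = [sum(row) for row in item_scores]
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))
```

For 0/1 item scores, coefficient alpha reduces to the KR-20 of formula (b), which is why the same estimate serves both the objective subtests and, with rater columns in place of item columns, the Writing rating teams.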
SECURITY
A security and quality control plan has been developed and implemented for
the program. Components of the security plan include:

1. Controlled, limited access to all examination materials;
2. Shredding of developmental materials and used booklets;
3. Strict accounting of all materials and of persons working with
   test items at the testing agency and test centers.
A signed security document is obtained from every individual who has access to the examination materials. The security document contains an agreement that the individual will not reveal -- in any manner to any other individuals -- the examination items, paraphrases of the examination items, or close approximations to the examination items. Only persons who have a "need to see" the items because of their work on the project are allowed to view any parts of the examination.
Test Security During the Administration
During the production phases of this project, all typing and reproduction are done by persons who have security clearances. All materials are signed out when they are removed from locked storage and checked in when they are returned. One person is assigned responsibility for the secure files while all work is in process; this person is able to account for all materials at all times. Material that needs to be revised and unusable materials are not placed in wastebaskets but are kept in a locked file for special destruction.
The following plan has been implemented to ensure rigorous security of all materials during actual examination administration. Materials remain in secure storage at the test centers until the morning of the test date. If multiple rooms are used at a center, each room is assigned blocks of materials that must be signed for by a room supervisor, the only person who has access to the room supply. Test books and materials are never left unguarded. Candidates are assigned seats by center personnel. The seating arrangements minimize the possibility of a candidate seeing the papers of other candidates. Books are distributed by the room supervisor and proctors. Each booklet is handed to the examinees individually, and the examinees sign a receipt for the booklets by serial number. Immediately after distribution, an inventory is taken to ensure that the sum of the distributed and unused books equals the number of books assigned to the testing room. Any discrepancy is reported to the center supervisor, and immediate steps are taken to reconcile the discrepancy and locate the missing material. Every such incident is reported to the Project Manager, and appropriate action is instituted to prevent further occurrences and to recover any missing materials.
Candidates cannot leave the room during a test session except for an emergency. If a candidate must leave the room, materials are delivered to the room supervisor or proctor and held until the candidate's return. No materials
49
38
may be removed from the test room at any time. Only one candidate may leave
room at a time. Provision of a break between eubtests reduces the need for
candidates to leave during a test session. At the end of a session all test
books are collected and accounted for.before collection of the answer documents.
After the answer documents are accounted for, all candidates in the room are
dismissed. Upon return to their original seats, candidates are reidentif ied by
test administration personnel before the distribution of materials for the next
subtest. During breaks and the lunch period, all materials are either locked in
secure storage or are placed under direct supervision of test administration
personnel. All used and unused materials are returned to locked storageimmediately after test administration.
Quality Control
To ensure quality control during the scoring and reporting process, the
following procedures are used:
1. Each answer document is checked for proper coding and marking in
response areas;
2. Computer edit programs are used to check for valid program codes
on the registration forms and for matching names and social security
numbers on the registration and scoring files;
3. Test data are used to verify the accuracy of all scoring and reporting
programs;
4. Sample data are drawn prior to scoring from each administration to screen for key, printing, or procedural errors;
5. Random answer documents are hand-scored during the scanning process to verify proper operation of the scanner;
6. A complete review of all procedures -- which includes hand-checking a
sample of test data -- is completed by members of the University of
Florida, Office of Instructional Resources and the Department of
Education before printing the candidate score reports;
7. Analyses of the holistic scoring process are conducted. This review
addresses the overall reliability of the ratings, the distribution of
scores, and the number of refereed scores for each reader. Specific
procedures for quality control during the holistic scoring process are documented in the Procedural Manual for Holistic Scoring;
8. The accuracy of the calculations for the institutional and state
reports is hand-verified.
HOLISTIC SCORING OF THE WRITING SUBTEST
OF THE FLORIDA TEACHER CERTIFICATION EXAMINATION
The Writing Subtest
The Writing Subtest was designed to assess a candidate's ability to write in a logical, easily understood style with appropriate grammar and sentence structure. The subskills to be measured are:
a. Uses language at the level appropriate to the topic and reader;
b. Comprehends and applies basic mechanics of writing: spelling, capitalization, and punctuation;
c. Comprehends and applies appropriate sentence structure;
d. Comprehends and applies basic techniques for the organization of written material;
e. Comprehends and applies standard English usage in written communication.
The candidate is given a choice between two topics on which to write an
essay during the 45-minute examination period. This essay should demonstrate
the competency and subskills specified above. The essay or writing sample is
scored holistically by at least three trained and experienced judges.
The Process of Holistic Scoring
Holistic Scoring Defined
Holistic scoring or evaluation is a process for judging the quality of
writing samples. It has been used for many years by professional testing agencies for credit-by-examination, state assessment, and teacher certification
programs.
Essays are scored holistically, that is, for the total, overall impression they make on the reader, rather than for an analysis of specific features of a piece of writing. Holistic scoring assumes that the skills which make up the ability to write are closely interrelated and that one skill cannot be separated from the others. Thus, the writing is viewed as a total work in which the whole is something more than the sum of the parts. A reader reads a writing sample quickly, once. He or she obtains an impression of its overall quality and then assigns a numerical rating to the paper based on judgments of how well it meets a particular set of established standards.
The Reader
The key to the effectiveness of the holistic scoring process is the readers, who must make valid and reliable judgments. Readers must bring to the process experience in teaching and grading English compositions. In addition, they must
be willing to undergo training in holistic scoring, which demands that they set aside personal standards for judging the quality of a writing sample and adhere to standards which have been set for the examination. The goal for the reading of the Writing Subtest of the Florida Teacher Certification Examination is to rate a large number of essays according to their overall competence in a consistent or reliable manner according to previously established standards based on a set of defined criteria. By undergoing a set of training procedures, a group of experienced teachers of composition can develop a high level of consistency in making judgments about the quality of a group of essays.
The Criteria
The criteria established to score the essays for the Florida Teacher Certification Examination are listed below. They were developed to accommodate specific conditions imposed by the Writing Subtest:
(1) They reflect those characteristics widely accepted as indicative of
good writing;
(2) They can be translated into operational descriptions of levels of competence;
(3) They reflect the general competency statement and subskills identified by the Council on Teacher Education.
Specific Criteria for Evaluation of Essays
1. Rhetorical quality
1.1 Unity: An ordering and interdependence of parts producing a single effect; completeness.
1.2 Focus: Concentration on a topic; the presence of a "center of
gravity."
1.3 Clarity: Lucidity of expression; lack of ambiguity and distortion.
1.4 Sufficiency: Appropriate depth and breadth of expression to meet the writer's purposes and the demands of the particular topic.
2. Structural and Mechanical Quality
2.1 Organization: Consistent and coherent integration and connection of
parts.
2.2 Development: Appropriate and sufficient exposition of ideas; use of detail, examples, illustration, comparisons, etc.
2.3 Paragraph and Sentence Structure: Appropriate form, variety, logic,
relatedness of and among structural units.
2.4 Syntax: Appropriate ordering of words to convey intended meaning.
3. Observance of Conventions in Writing
3.1 Usage: Appropriate use of language features: inflections, tense, agreement, pronouns, modifiers, vocabulary, level of discourse, etc.
3.2 Spelling, Capitalization, Punctuation: Consistent practice of accepted forms.
The relationship between the subskills and the scoring criteria is illustrated in the figure below.
ESSENTIAL COMPETENCIES:
Demonstrate the ability to write in a logical, easily understood style with appropriate grammar and sentence structure.
a. Use language appropriate to the topic and reader.
b. Apply basic mechanics of writing.
c. Apply appropriate sentence structure.
d. Apply basic techniques for organization.
e. Apply standard English usage.
[Figure: matrix mapping subskills a-e to the scoring criteria (1.1-3.2), grouped under RHETORICAL, STRUCTURAL, and CONVENTIONAL quality; X marks indicate which criteria assess each subskill.]

Operational Descriptions
The operational descriptions based on the scoring criteria reflect the four levels of competency which the readers are to assign to each of the essays they read. Each reader will independently score or rate a paper on a scale of 1 to 4, with 4 being the highest rating. The descriptions which follow are an attempt to express clearly and precisely the general, overall impressions a reader has, in terms of the criteria, when he or she reads essays of varying quality. The number of levels of competence could be expanded or reduced; however, for the task of scoring the Writing Subtest, four levels provide enough degrees of distinction to be meaningful yet manageable for large-scale testing.
4. The essay is unified, sharply focused, and distinctively effective. It treats the topic clearly, completely, and in suitable depth and breadth. It is clearly and fully organized, and it develops ideas with consistent appropriateness and thoroughness. The essay reveals an unquestionably firm command of paragraph and sentence structure. Syntactically, it is smooth and often elegant. Usage is uniformly sensible, accurate, and sure. There are very few, if any, errors in spelling, capitalization, and punctuation.

3. The essay is focused and unified, and it is clearly if not distinctively written. It gives the topic an adequate though not always thorough treatment. The essay is well organized, and much of the time it develops ideas appropriately and sufficiently. It shows a good grasp of paragraph and sentence structure, and its usage is generally accurate and sensible. Syntactically, it is clear and reliable. There may be a few errors in spelling, capitalization, and punctuation, but they are not serious.

2. The essay has some degree of unity and focus, but each could be improved. It is reasonably clear, though not invariably so, and it treats the topic with a marginal degree of sufficiency. The essay reflects some concern for organization and for some development of ideas, but neither is necessarily consistent nor fully realized. The essay reveals some sense, if not full command, of paragraph and sentence structure. It is syntactically bland and, at times, awkward. Usage is generally accurate, if not consistently so. There are some errors in spelling, capitalization, and punctuation that detract from the essay's effect if not from its sense.

1. The essay lacks unity and focus. It is distorted and/or ambiguous, and it fails to treat the topic in sufficient depth and breadth. There is little or no discernible organization and only scant development of ideas, if any at all. The essay betrays only sporadically a sense of paragraph and sentence structure, and it is syntactically slipshod. Usage is irregular and often questionable or wrong. There are serious errors in spelling, capitalization, and punctuation.
Training of Readers
The training of readers for the Writing Subtest of the Florida Teacher
Certification Examination consists of three steps:
Step 1: Acquiring information about the examination and holistic scoring
process.
Step 2: Reading and scoring essays which have been selected as good examples of the various levels of competence in writing. The practice essays have been scored by experienced readers and annotated in accordance with the operational descriptions. By reading, scoring, and discussing the essays, the readers practice until they consistently give the same ratings to essays as the experienced readers.
Step 3: Reading and scoring a sample of the actual Writing Subtests which have been selected and scored prior to the training session. These samples will serve as the standards for the scoring of the examination and will include essays which represent each of the competency levels. As in Step 2, the emphasis will be on each reader assigning scores which agree with those established earlier by the experienced readers. This step occurs immediately before the actual scoring session and often is repeated during the session to ensure continued consistency or reliability of assigned scores or ratings.
Setting the Standards
Prior to Step 3 in the training, standards for the Writing Subtest are established. The Chief Reader, who is responsible for conducting the holistic scoring, and his assistants, the Assistant Chief Reader and the Table Leaders, select at random a sample of papers from the total group of essays written on a particular topic. These papers are read and scored independently by each person. Results are compared and consensus is reached for the identification of four papers. Each becomes a standard for one of the four competency levels. Additional papers are chosen to be used in Step 3 of the training procedures. This process is repeated for the second topic of the Writing Subtest.
The Scoring Session
The scoring session begins immediately after Step 3. Readers are assigned to tables in groups of four or five. The number of readers and the number of tables are determined by the number of essays to be scored. Each table of readers is also assigned a Table Leader. The Table Leader's primary task is to continually monitor the scoring process and consult with readers as questions or "problem" papers arise. The Table Leader is an experienced reader who has helped set the standards.
Each reader is given a set of papers to read, rate, and mark with a score. The identity of the writer is not known to the reader. The papers range, on the average, from 200 to 400 words in length, and each can be read and scored holistically in approximately two minutes. As the scoring of a set of papers is completed by a reader, a clerk collects and returns the papers to the operations table. The scores given by the reader are covered, and the papers are
redistributed to another set of folders and delivered by the clerk to a second table of readers. This procedure continues until each paper has been read by three different readers. Each reader reads, judges, and scores at his own pace. Scoring sessions are approximately three hours long, with ten-minute breaks each hour. Usually there are two scoring sessions for each day of holistic reading.
After a paper has been scored by three different readers, the scores are examined at the operations table. If one of the scores varies from any other by two levels or more (e.g., 3-3-1), the paper is sent to the Chief Reader or Assistant Chief Reader, who serves as referee. This person assigns a rating which replaces the discrepant score. Papers whose original ratings are 1-2-3 or 2-3-4 are refereed and scored as follows:
Rating of 1-2-3
(a) A referee rating of 1 will replace the 3, resulting in a score of 4
(b) A referee rating of 2 will replace the 1, resulting in a score of 7
(c) A referee rating of 3 will replace the 1, resulting in a score of 8
Rating of 2-3-4
(a) A referee rating of 2 will replace the 4, resulting in a score of 7
(b) A referee rating of 4 will replace the 2, resulting in a score of 11
(c) A referee rating of 3 will replace the 2, resulting in a score of 10
All initial scores of S will be refereed. If any paper is refereed and a discrepancy still occurs, the essay is submitted to a new team of readers until consistency is obtained.
The three scores are then added together for a total score. Thus the
lowest score possible is a 3, the highest, 12.
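The referee rules above can be expressed compactly. In this sketch, a paper is flagged when its extreme ratings differ by two or more levels, and the referee's rating replaces the rating farthest from it; the tie-breaking when the referee's rating is equidistant from both extremes (e.g., ratings 1-2-3 with a referee rating of 2) is inferred from the worked cases and is an assumption, not a rule stated in the text:

```python
def needs_referee(scores):
    """A paper is refereed when any two of its three ratings
    differ by two levels or more (e.g., 3-3-1)."""
    return max(scores) - min(scores) >= 2

def refereed_total(scores, referee):
    """Replace the rating most discrepant from the referee's rating with
    the referee's rating, then total the three ratings (range 3 to 12).

    max() keeps the first maximal element, so with ratings listed in
    ascending order an equidistant tie replaces the lower rating,
    matching the worked cases in the text.
    """
    discrepant = max(scores, key=lambda s: abs(s - referee))
    resolved = list(scores)
    resolved.remove(discrepant)
    resolved.append(referee)
    return sum(resolved)

# Worked cases from the text: ratings 1-2-3 with referee ratings 1, 2, 3
# give totals 4, 7, 8; ratings 2-3-4 with referee ratings 2, 4, 3 give 7, 11, 10.
```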
Final Steps
After the reading sessions are completed, Table Leaders evaluate the performance of Readers. The Chief Reader evaluates the Table Leaders. Readers are asked for comments and suggestions for improving training and scoring procedures.
Two approaches for reliability estimation are the percentage of rater agreement and the calculation of coefficient alpha for the raters and the rating team, which indicates the expected correlation between the ratings of the team and those of a hypothetical team of similarly comprised and similarly trained raters doing the same task. The four indices that represent rater agreement are: (a) percent complete agreement; (b) average percent of two of the three raters agreeing; (c) average percent agreement by pairs as to pass/fail; and (d) percent complete agreement about pass/fail. These data are reported in Table 8 of this report.
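The four agreement indices can be computed directly from the rating triples. This is a minimal sketch: the passing level used below is a hypothetical assumption (the report's pass/fail cut is not stated in this section), index (b) is interpreted as "at least two of three raters agree," and the coefficient alpha calculation is left aside:

```python
from itertools import combinations

def agreement_indices(triples, passing):
    """Compute the four rater-agreement indices for a list of (r1, r2, r3)
    rating triples: (a) percent complete agreement; (b) average percent with
    at least two of three raters agreeing; (c) average percent of rater pairs
    agreeing on pass/fail; (d) percent complete agreement on pass/fail."""
    n = len(triples)
    complete = sum(len(set(t)) == 1 for t in triples) / n
    two_of_three = sum(len(set(t)) <= 2 for t in triples) / n
    pair_pass_fail = sum(
        sum((a >= passing) == (b >= passing) for a, b in combinations(t, 2)) / 3
        for t in triples
    ) / n
    complete_pass_fail = sum(len({r >= passing for r in t}) == 1 for t in triples) / n
    return complete, two_of_three, pair_pass_fail, complete_pass_fail
```

For example, with the hypothetical triples (3,3,3), (2,3,3), (1,2,4), (4,4,2) and a passing level of 3, one quarter of the papers show complete agreement and three quarters show at least two raters agreeing.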