Hammill Institute on Disabilities
Observing Reading Instruction for Students with Learning Disabilities: A Synthesis
Author(s): Elizabeth A. Swanson
Source: Learning Disability Quarterly, Vol. 31, No. 3 (Summer, 2008), pp. 115-133
Published by: Sage Publications, Inc.
Stable URL: http://www.jstor.org/stable/25474643
Sage Publications, Inc. and Hammill Institute on Disabilities are collaborating with JSTOR to digitize,preserve and extend access to Learning Disability Quarterly.
http://www.jstor.org
This content downloaded from 195.78.109.162 on Mon, 16 Jun 2014 01:17:53 AM. All use subject to JSTOR Terms and Conditions.
OBSERVING READING INSTRUCTION FOR STUDENTS WITH LEARNING DISABILITIES:
A SYNTHESIS
Elizabeth A. Swanson
Abstract. This article synthesizes previous research studies examining reading instruction for students with learning disabilities (LD) through classroom observation methods. An extensive search of the research literature between 1980 and 2005 yielded 21 observation studies. Findings revealed that reading instruction for students with LD is generally of low quality, with little to no explicit instruction in phonics or comprehension strategy. Findings were consistent, whether studies were conducted more than 10 years ago or within the last few years. Estimates of time students with LD spend reading orally or silently are low. The most frequently observed grouping structure was whole-class instruction, regardless of the setting.
ELIZABETH A. SWANSON, Ph.D., The University of Texas at Austin.
The provision of public education for students with disabilities began as a reform movement spearheaded by parents who demanded equal access for their children to America's public schools (Bergeron, 2003). The right to special education was established through the passage of the Education for All Handicapped Children Act of 1975 and its subsequent revisions, resulting in the current Individuals with Disabilities Education Act of 2004.
Since 1975, other educational reform movements have been initiated with varying impact on the quality of education for students with disabilities. For example, the Regular Education Initiative (REI) of 1986 was prompted, in part, by concern over the segregation of children with disabilities into special education classrooms, based on the belief that segregation led to disjointed educational opportunities (Wang, Reynolds, & Walberg, 1986; Will, 1986) and increased stigma (Will, 1986). Because a major assumption of the REI was that instruction provided in the general education setting was at least equal, if not superior, to that provided in the special education setting, children with learning disabilities (LD) in particular were integrated into general education classrooms with varying levels of academic support. The good intentions of REI were to increase integration of students with LD into the general education classroom setting as well as to improve academic outcomes.
In support of the REI, educators were alerted to the often inappropriate social and instructional outcomes from segregated special education. Researchers such as Bentum and Aaron (2003), for example, reported no growth in word recognition and reading comprehension as well as a decline in verbal IQ scores after three to six years in the resource room. Due to the single-group longitudinal research design of their study, it is impossible to know if students with LD would have experienced similar outcomes in the general education setting. However, other researchers (McKinney & Feagans, 1984) who included a non-LD comparison
group within their longitudinal design reported that direct instruction was rare in the resource room, with student scores on word recognition and reading comprehension declining with time spent in the resource room. Additional evidence indicates improved outcomes when students with LD are reintegrated into general education, showing progress in oral reading at a rate equal to that of their low-reading, nondisabled peers (Shinn, Powell-Smith, Good, & Baker, 1997).
Alternatively, other studies (Waldron & McLeskey, 1998; Zigmond & Jenkins, 1995) showed that while 48%-54% of students with LD who received reading instruction in the general education classroom made gains in excess of one standard deviation, 46%-52% did not make reading gains considered to reflect significant growth, establishing at least some argument that not all students with LD benefit adequately from typical reading instruction in the general education classroom.
As a result of concerns about students' reading progress and the quality of reading instruction, a number of syntheses and consensus reports (e.g., Bentum & Aaron, 2003; McKinney & Feagans, 1984) have been issued that describe the most effective components of reading instruction in elementary and, most recently, in secondary settings. One synthesis examined effective instruction in reading (Swanson & Hoskyn, 1998), as well as other instructional areas (Swanson, Hoskyn, & Lee, 1999), for students with LD. Another synthesis (National Reading Panel, 2000) and a consensus report (Snow, Burns, & Griffin, 1998) addressed research on effective instruction in reading for students who experience difficulty learning to read. Most recently, a meta-analysis of reading interventions for adolescent struggling readers (Scammacca et al., 2007) yielded important implications for practice.
All of these syntheses reached similar conclusions about reading instruction for students with reading difficulties/disabilities: (a) students benefit from explicit and systematic instruction; (b) foundational skills such as phonemic awareness and phonics are essential elements of instruction; (c) higher processing skills such as fluency, vocabulary, and comprehension are essential from the beginning of reading instruction and are continually beneficial to adolescent struggling readers; and (d) students who have difficulties benefit from smaller group instruction that provides support from the teacher.
Particular to students with LD, Swanson and Hoskyn
(1998) conducted a synthesis of 180 reading intervention studies. Findings indicated that students receiving instruction through a model that combined direct instruction and strategy instruction performed better than students who received direct instruction alone, strategy instruction alone, or no components of direct or strategy instruction. In particular, three instructional strategies predicted large effect sizes: control of task difficulty (teacher controls difficulty level and provides necessary assistance), small-group instruction, and directed response/questioning techniques (teachers and students engage in dialogue and questioning). While best practices for reading instruction have been identified through the intervention research, it is important to assess whether these findings are reflected during actual classroom reading instruction for students with LD.
A source for such evidence lies within observation
research (e.g., Allington & McGill-Franzen, 1989; Moody, Vaughn, Hughes, & Fischer, 2000; O'Connor & Jenkins, 1996). One synthesis of observation studies conducted during reading instruction for students with LD or emotional/behavior disorders (EBD) was published in 2002 by Vaughn, Levy, Coleman, and Bos. They concluded that while a large amount of time was allocated for reading instruction, this time varied depending upon the location of service provision (special education, general education, or a combination of both). Within special education settings, students were provided with a greater amount of individual and small-group instruction. However, the instructional quality of reading instruction was reported as low overall, with a large amount of time during reading instruction spent engaged in independent seatwork and worksheets. Indeed, such practices are contradictory to those established as effective in remediating reading difficulty among students with LD (e.g., directed response/questioning techniques; Swanson & Hoskyn, 1998).
Vaughn et al. (2002) included studies spanning from 1975 to 2000, and focused on reading instruction for students with LD or EBD conducted in special education or in general education settings. The present effort expands upon their synthesis in several ways. First, while the current synthesis also includes studies that focus on all instructional settings where students with LD are taught (e.g., general education setting and special education settings), it is extended to studies published through December 2005. Second, increased research focus has emerged in remediating reading difficulties among adolescent struggling readers (Scammacca et al., 2007). As of yet, there is no synthesis of observation research indicating the extent to which effective reading instruction is implemented in settings designed to serve adolescents with LD. Therefore, to begin understanding the nature of reading instruction currently provided to students with LD in middle and high school, this synthesis will include not only studies conducted in the elementary grades, but also those conducted in grades 6 through 12. Third, this synthesis will
address issues of training and reliability procedures among observers.
Briefly, the purpose of the current synthesis was to examine a wider breadth of observation studies to better understand the nature of reading instruction for students with LD. Specifically, it addressed the following research questions: (a) What components of effective instruction have been documented during reading instruction for students with LD in elementary, middle, and high school? (b) What trends in student academic achievement have been reported? (c) What training and inter-rater reliability procedures are employed by researchers to ensure valid observation data?
METHOD
A comprehensive search of the literature was performed through a three-step process, as suggested by Cooper (1998). First, an electronic search was conducted to locate studies published between 1980 and 2005. Descriptors or root forms of those descriptors (observation studies, observations, reading teachers, reading, remedial reading, reading difficult*, disability, dyslexia, learning problems, minimal brain dysfunction, resource room programs, resource teachers, special needs students, special education teachers) were used in various combinations to capture the greatest possible number of articles. The initial search resulted in the identification of 874 abstracts.
Second, the six most frequently cited journals in the field of LD were determined by referring to Journal Citation Reports: Social Sciences Edition (ISI Web of Knowledge, 2005). A hand search of these journals (Exceptional Children, Journal of Learning Disabilities, Journal of Special Education, Learning Disability Quarterly, Scientific Studies of Reading, and Annals of Dyslexia) from 2000 through 2005 was conducted. Third, the citation search phase involved searching reference lists of identified observation studies that fit the criteria for inclusion in this synthesis.
In addition to the three ways Cooper (1998) suggests for locating articles, citation searches were conducted through the use of the Web of Science. Titles of articles that met the inclusion criteria located through the electronic search, hand search, and citation search were used as the search term to trace journal articles within which they were cited. These abstracts were studied for potential inclusion.
Studies were selected if they met the following criteria:
1. A formal observation tool was used to observe reading instruction.
2. Observation took place during reading instruction in either general education or special education resource room settings.
3. At least one student in the classroom was identified with a learning disability.
4. The study was conducted in elementary (K-5), middle (6-8), or high school (9-12).
5. When observation data included more than reading instruction, data pertaining to reading instruction were reported separately.
6. Observations to determine the effectiveness of interventions proposed by researchers were excluded.
A total of 21 studies were located through this procedure for inclusion in this synthesis.
Coding Procedures
Relying upon code sheets developed for past intervention syntheses (Kim, Vaughn, Wanzek, & Wei, 2004; Kim et al., 2003), extensive coding procedures were used to organize pertinent information from each study. Revisions were made to ensure that the code sheet addressed elements specified in the What Works Clearinghouse Design and Implementation Assessment Device (Institute of Education Sciences, 2003), a document used to evaluate the quality of studies. In addition, items were added to include information unique to observation studies.
The code sheet was used to record information on variables, including participant information, design information, reliability procedure, and reported findings. Teacher and student participant information was coded using a series of forced-choice items (student items: socioeconomic status, identified disability, and gender; teacher items: role of person implementing instruction and gender) and two open-ended items (age as described in text and narrative description of student disability as described in text). Similarly, design information was gathered using a combination of forced-choice (e.g., student selection) and open-ended items (e.g., names of academic measures). Reliability procedures were recorded using a series of open-ended items (e.g., number of observers, procedure for reliability) and one forced-choice item indicating whether observers became reliable using video, in-person observations, or other means. Another open-ended section required a description of reported findings.
Interrater Reliability
The author and one graduate student participated in training on the use and interpretation of items from the code sheet. Interrater reliability was established by having each of the two coders independently code a single article. Responses were used to calculate the percent agreement (i.e., agreements divided by agreements plus disagreements). An interrater reliability of .97 was achieved. After interrater reliability was established, and all articles were coded by the author, a meeting was conducted with the graduate student to resolve outstanding questions. Once the coding was completed, the studies were summarized in table format.

Table 1
Basic Study Information

| Study | Number of Student Participants | Age/Grade | Number of Teacher Participants | Number of Observations | Length of Observations | Person Observed | Obser. Setting | Obser. Inst. |
|---|---|---|---|---|---|---|---|---|
| Allington & McGill-Franzen, 1989ᵃ | 64 total; 32 mild dis. | 2nd, 4th, and 8th | Not reported | 1 per teacher | 1 entire school day | T | All academic settings | SOI |
| Bresnahan, 2001ᵇ | 3 total; 3 LD | 7th | 2 sp. ed.; 1 gen. ed. | 3 per teacher | 90 minutes | T and S | 2 resource; 1 gen. ed. | Rshr. Dev. |
| Gelzheiser & Meyers, 1991 | Not reported | 2nd-5th | 31 gen. ed.; 7 sp. ed.; 10 remedial reading | 4 to 33 per teacher | Not reported | T | Gen. ed., resource, and remedial reading | SOBR |
| Haynes & Jenkins, 1986ᵃ | 178 total; 114 LD | 4th-6th | 7 gen. ed.; 23 sp. ed. | 5-8 times per student | Gen. ed.: full school day; Resource room: reading period | T and S | Gen. ed.: all academic settings; Resource: reading | SOBR |
| Kethley, 2005ᵇ | Not reported | 6th-8th | 4 sp. ed. | 7 times per teacher | 50-90 min. | T | Resource | Rshr. Dev. |
| Leinhardt, Zigmond, & Cooley, 1981 | 105 total; 105 LD | 6-12 years | Not reported | 30 times per student and teacher | 1 hour | T and S | Resource | SOBR |
| Meents, 1990ᵇ | Not reported | High school | 12 sp. ed. | Minimum 2 per teacher | Not reported | T | Resource | Rshr. Dev. |
| Moody, Vaughn, Hughes, & Fischer, 2000 | 63 total; 59 LD | Elem. age | 6 sp. ed. | 4 per teacher | 60-120 min. | T | Resource | CCS |
| O'Connor & Jenkins, 1996 | Year 1: 12 total; 11 LD. Year 2: 10 total; 10 LD | 3rd-6th | Year 1: 8 teachers; Year 2: 5 teachers | Not reported | 2-6 hours | T and S | Gen. ed.: cooperative learning groups | Rshr. Dev. |
| O'Sullivan, Ysseldyke, Christenson, & Thurlow, 1990 | 77 total; 21 LD | 2nd-4th | Not reported | 1 per student | Entire school day | S | All academic settings | CISSAR |
| Rieth, Bryant, Kinzer, Colburn, Hur, Hartman, & Choi, 2003 | 62 total; 14 LD | 9th | 1 gen. ed. | 1 per teacher | 50 min. | T | Gen. ed. | Rshr. Dev. |
| Schumm, Moody, & Vaughn, 2000 | Not reported | 3rd | 29 gen. ed. | 3 per teacher | Approx. 90 minutes | T | Gen. ed. | CCS |
| Thurlow, Graden, Greener, & Ysseldyke, 1983ᵃ | 34 total; 34 LD | 3rd-5th | 12 gen. ed. | 2 per student | Entire school day | S | All academic settings | CISSAR |
| Thurlow, Ysseldyke, Graden, & Algozzine, 1983ᵃ | 8 total; 8 LD | 3rd-4th | 8 gen. ed. | 2 per student | Entire school day | S | All academic settings | CISSAR |
| Vaughn, Moody, & Schumm, 1998 | 82 total; 77 LD | 3rd | 14 sp. ed. | 3 per teacher | 60-90 min. | T | Resource | CCS |
| Ysseldyke, Christenson, Thurlow, & Bakewell, 1989ᵃ | 122 total; 30 LD | 2nd-4th | Not reported | 1 per student | Entire school day | S | All academic settings | CISSAR |
| Ysseldyke, O'Sullivan, Thurlow, & Christenson, 1989 | 122 total; 30 LD | 2nd-4th | 51 gen. ed.; 24 sp. ed. | 2 per student | Reading and math instruction | T and S | NLD: gen. ed.; LD: gen. ed. and resource | TIES |
| Ysseldyke, Thurlow, Christenson, & Weiss, 1987ᵃ | 122 total; 30 LD | 2nd-4th | Not reported | 1 per student | Entire school day | S | All academic settings | CISSAR |
| Ysseldyke, Thurlow, Mecklenburg, & Graden, 1984ᵃ | 34 total; 17 LD | 3rd-4th | 17 gen. ed. | 2 per student | Entire school day | S | All academic settings | CISSAR |
| Ysseldyke, Thurlow, O'Sullivan, & Christenson, 1989ᵃ | 77 total; 21 LD | 2nd-4th | Not reported | Not reported | Entire school day | S | Gen. ed. and resource | CISSAR |
| Zigmond & Baker, 1994 | Year 1: 6 LD. Year 2: 12 total; 6 LD | 4th | Not reported | 4 per student each year | Not reported | T and S | Gen. ed. | Rshr. Dev. |

Note. CISSAR = Code for Instructional Structure and Student Academic Response; TIES = The Instructional Environment Scale; CCS = Classroom Climate Scale; SOI = Student Observation Instrument; SOBR = Student-Level Observation of Beginning Reading; Rshr. Dev. = Researcher developed; T = Teacher; S = Student.
ᵃStudy reports data from entire school day. Only data from reading and/or language arts class time is reported here. ᵇDoctoral dissertation.
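The percent-agreement index described under Interrater Reliability (agreements divided by agreements plus disagreements) can be illustrated with a minimal sketch. The function name `percent_agreement` and the example item responses below are hypothetical and are not taken from the synthesis; the sketch assumes both coders responded to the same ordered set of code-sheet items.

```python
from typing import Sequence


def percent_agreement(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Return agreements / (agreements + disagreements) for two coders."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must rate the same set of items.")
    if not coder_a:
        raise ValueError("At least one coded item is required.")
    # An agreement is an item on which both coders gave the same response.
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    disagreements = len(coder_a) - agreements
    return agreements / (agreements + disagreements)


# Hypothetical responses to five code-sheet items (not data from the synthesis).
author_codes = ["resource", "T", "CCS", "yes", "3rd"]
student_codes = ["resource", "T", "CCS", "no", "3rd"]

agreement = percent_agreement(author_codes, student_codes)
print(f"{agreement:.2f}")  # four of five items match
```

Percent agreement is simple to compute, but it does not correct for agreement expected by chance.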
Data Analysis
According to Cooper (1998), research syntheses focus on empirical studies in order to draw conclusions from disparate sources that address related research questions. The goal of such an effort is to highlight important findings and detail areas left unresolved. While there are many ways to conduct data analysis for the purpose of syntheses, it must be directed by the nature of included studies. In this case, observational studies were included that described the location, type, amount, and quality of reading instruction that students with LD receive in elementary through high school. No studies that implemented an intervention and used observation methods to detect ways instruction changed over time are included. Because no intervention studies were included, it is impossible to conduct meta-analytic statistical calculations of the effect of intervention on student achievement.
Data reported in studies included in the synthesis will be categorized in a variety of ways to shed light on different learning environments (e.g., resource rooms, general education), common findings among observations, and changes over time in the nature of instruction provided to students with LD. Unlike the previous synthesis of observation studies (Vaughn et al., 2002), this synthesis will also analyze authors' reports of interrater reliability procedures. Interrater reliability is considered an important tool for controlling the quality of data collection methods and is considered essential in establishing the validity of reported outcomes (Gwet, 2001). Gwet warns of variability in results due to poor interrater reliability among data gathering sources; therefore, observation studies that do not employ strong interrater reliability standards risk reporting findings based on unreliable data. Here, studies will be scrutinized for their inclusion of interrater reliability procedures in their study design as well as other design features that contribute to valid and reliable observation data.
RESULTS
Twenty-one studies met the criteria for inclusion in this synthesis. Three studies used observation data from the entire school day, 12 observed reading instruction in the general education setting, and 13 included resource room observations. Six studies included the use of standardized measures of student achievement and one utilized curriculum-based measures to assess student growth over time in reading skills. Basic study information is reported in Table 1. Study procedures and reported findings are summarized in Table 2.
Components of Effective Instruction
Amount of reading instruction. Among the 10 studies that reported the total amount of time students spent engaged in reading instruction, a simple majority reported few differences between students with and without LD (Gelzheiser & Meyers, 1991; O'Sullivan, Ysseldyke, Christenson, & Thurlow, 1990; Thurlow, Graden, Greener, & Ysseldyke, 1983; Ysseldyke, Thurlow, Christenson, & Weiss, 1987; Ysseldyke, Thurlow, Mecklenburg, & Graden, 1984; Ysseldyke, Thurlow, O'Sullivan, & Christenson, 1989). This, paired with a report published within the time period of the examined studies suggesting that greater proportions of time are allocated to academic activities in special education settings (Ysseldyke et al., 1987), may lead some to believe that reading instruction is at least equitable for students with LD receiving services in resource and general education settings.
However, there is considerable evidence of the variable nature of reading instruction dependent upon where reading instruction occurs (general education or special education settings). Not only did time spent in the special education resource room for reading instruction vary from 11 to 180 minutes per day (Haynes & Jenkins, 1986), but the nature of that instruction varied as well, with evidence showing that resource teachers spent only 44% of the time focused on reading activities (Haynes & Jenkins, 1986) and twice as much time on non-reading activities during allocated reading instruction time (Gelzheiser & Meyers, 1991). Similarly, Leinhardt, Zigmond, and Cooley (1981) noted disturbing trends in the allocation of time spent in the special education resource room, with an average of 20% of the time spent outside of the resource classroom during intended instructional time and another 26% of the time spent engaged in off-task behavior, waiting, or classroom management. Following is a more detailed description of components of effective reading instruction that were observed in resource rooms.
Word study. Of the few studies that reported teachers' word study or phonics instruction (Gelzheiser & Meyers, 1991; Kethley, 2005; Meents, 1990; Moody et al., 2000; Schumm, Moody, & Vaughn, 2000; Vaughn, Moody, & Schumm, 1998), most reported little explicit instruction in phonics. While a greater percentage of time in the special education resource room than the general education classroom was dedicated to phonics instruction, the amount was still minimal (Gelzheiser & Meyers, 1991). Kethley (2005) reported similar findings at the middle school level. While all teachers provided some instruction in decoding, explicit instruction and practice ranged from 10% to 25% of the class time. Others reported little to no phonics instruction
[;7-
Table
2 |'
| Procedure and Results/Findings
Measures of Student Study Procedure Achievement Results/Findings
Allington & Observations in all
academic
None Reading time: Chapter 1 > LD
McGill-Franzen, 1989a settings Active teaching: gen. ed. > Ch. 1 > res.
Teacher interviews
Undifferentiated
seatwork: gen. ed. < Ch. 1 < res.
Bresnahan, 2001b Observed for 45 minutes on
each
None Whole-group instruction most common. i
of 3 days 2/3 teachers provided time for silent reading.
s. Teacher interviews Res. room: no reading comprehension strategy | Administrator interviews
instruction,
oral reading, silent reading, phonological u> Reviewed student records awareness activities, or vocabulary instruction.
^ Gen. ed. classroom: no decoding, word analysis, | comprehension strategy instruction, or fluency
| building activities.
t\> I
? Gelzheiser & Meyers, Four observations per student in None Purpose setting, modeling, telling information, and
1991 three settings: resource room, teacher explanation: resource = gen. ed.
^ gen. ed. classroom, and remedial Work
with
inds.: remedial > sp. ed; remedial > gen. ed.
^ classroom Work in groups: sp. ed. > remedial < gen. ed.
2 observers present. One used Comprehension: sp. ed. < remedial < gen. ed. | SOBR. Other took running record Indirect reading: sp. ed. < remedial < gen. ed. |
of classroom activities Oral
reading:
sp. ed. > remedial = gen. ed. I
Haynes & Jenkins, Observations in resource room CAT; SORT; One-to-one, small group, or individual instruction:
1986a 18 studs, with mild
disabilities
WRAT resource > gen. ed.
observed in gen. ed. Whole-class instruction: resource < gen. ed.
Reviewed student records
Kethley, 2005 b Observations conducted in
None
Use of instructional time: decoding: 10%-25%;
October and November Fluency: l%-36% ; Vocabulary: incidental Teacher interviews Comprehension: 0% to maximum not reported
Whole-class instruction
for decoding and fluency Independent seatwork for vocabulary and
comprehension.
continued on next page
This content downloaded from 195.78.109.162 on Mon, 16 Jun 2014 01:17:53 AMAll use subject to JSTOR Terms and Conditions
Table 2
continued
l|f|
Procedure and Results/Findings
|??|
Measures of rnSm Student mim Study Procedure Achievement Results/Findings Leinhardt, Zigmond, Observations between December Spache
Diagnostic
Large variation in different instructional activities. & Cooley, 1981 and May Reading Scales; Oral reading = 10% of day mim
WRAT; CTBS Off-task reading activities = 8% of day ffl
Management tasks or waiting = 26% ||i?
Outside of classroom = 20% |||||
? Meents, 1990b Observations None No decoding strategy instruction observed. mS]
| Teacher interviews Focus on
gen. ed. classroom assignments. pHI
<?" Administrator interviews No
remediation
of basic skills observed. If||
t3 Document review mim
K" II Stllli a [ I iilil
li Moody, Vaughn, 4 observations of entire
TORF (ORF)
Little phonics instruction observed. mill] ^ Hughes, 8r. Fischer, reading/language arts period WJ-R (reading Compared to results of 1998 study, teachers used less m?m
? 2000 Teacher interviews comprehension) whole-group instruction, implemented more 1111
ij Teacher self-reports
individualized
instruction, and grouped students with |||?
^5* similar reading levels for instruction. ?g|||
Gain in ORF scores. LD made one year's growth in
^ reading comprehension. IS
Ni plill t>o sssii
O'Connor ck Jenkins, Observations conducted
during
None Received help from teammates: LD > NLD |?J|
1996 cooperative learning Most LD participated
in "unsuccessful" grouping mlm
Semi-structured teacher interviews experiences.
Assistance from special educators and shortened
assignments disrupted group functioning and hindered ||?|
interdependence. mm\
O'Sullivan, Ysseldyke, One observation of each student BASIS Academic response and engagement: sp. ed. > gen. ed. |||I
Christenson, & Thurlow, BASIS administered in spring Passive attention: sp. ed. < gen. ed. mfSi]
1990 In gen. ed., academic engaged time LD < NLD; miSi]
academic responding time LD < NLD. p|
Rieth, Bryant, Kinzer, Observations at beginning and end
None Teachers asked twice as many short, low-level mgi
Colburn, Hur, Hartman, of 2-month period questions as long, higher-level questions. Teacher jf?|
& Choi, 2003 Teacher interviews responses were three times more likely to be short; 1?||
8 students per class randomly 18 times more likely to be at a low level. wB\
selected for interviews Students asked almost 10 times as many short, low- mm
level questions as long, higher-level questions. pii
Student responses were seven times more likely to be |?||?
short; twice as
likely
to
be at a low level. |||j
This content downloaded from 195.78.109.162 on Mon, 16 Jun 2014 01:17:53 AMAll use subject to JSTOR Terms and Conditions
Schumm, Moody, & Vaughn, 2000 | Teacher interviews; 3 observations conducted; Teachers' self-report checklists | None | Dominant grouping patterns: 1. whole-class; 2. independent activities; 3. group activities; 4. student pairs. Teachers at least average at monitoring student performance, communicating expectations to students, providing positive feedback, and redirecting off-task behavior. Undifferentiated instruction most common.
Thurlow, Graden, Greener, & Ysseldyke, 1983a | Students observed all day; PIAT at end of school year | PIAT | Writing or reading silently: LD < NLD. Reading aloud, spelling, or language: LD > NLD. LD made raw score gains on all subtests of PIAT.
Thurlow, Ysseldyke, Graden, & Algozzine, 1983a | Each student observed 2 full days | None | Time spent writing, reading silently, reading aloud, or using readers: resource > gen. ed.
Vaughn, Moody, & Schumm, 1998 | 4 observations in each resource room; Teacher interviews; Teacher self-reports | None | Teachers reported using whole language. Whole-class and independent activities > small group, pairs, or individual work. 3/14 teachers taught word recognition or decoding. 1/41 observations note a comprehension strategy being taught.
Ysseldyke, Christenson, Thurlow, & Bakewell, 1989a,c | Students observed entire school day | None | Working with readers: LD > NLD. Listen to lecture, teacher-student discussion, transition, teacher directed tasks: LD < NLD. LD paper tasks: sp. ed. > gen. ed.
Ysseldyke, O'Sullivan, Thurlow, & Christenson, 1989c | Each student's classroom instruction was observed twice, each on consecutive days | None | Gen. ed. setting: Teacher feedback: LD > NLD; instructional presentation, checking for understanding, or practice: LD < NLD. Special education setting: Instructional presentation, checking for understanding, or practice and feedback: LD > EBD or MR.
Ysseldyke, Thurlow, Christenson, & Weiss, 1987a | Each student observed for one school day | None | Reading time: LD = NLD. Spelling or handwriting: LD < NLD. Language activities: LD > NLD. Academic time and transition time for LD: gen. ed. > resource.
Ysseldyke, Thurlow, Mecklenburg, & Graden, 1984a | Target student observed for 2 full consecutive days | None | Time with readers: LD = NLD. Small group instruction: LD < NLD. Individual work and active engagement when using readers: LD > NLD. LD academic responding: whole group < small group or individual.
Table 2 continued
Procedure and Results/Findings
Study | Procedure | Measures of Student Achievement | Results/Findings
Ysseldyke, Thurlow, O'Sullivan, & Christenson, 1989a | Observations conducted during the entire academic day | None | Minutes of reading instruction: LD = NLD. Dominant grouping for LD in gen. ed.: 1. whole group, 2. small group, 3. individual. Dominant grouping for LD in resource: 1. individual, 2. small group, 3. whole group.
Zigmond & Baker, 1994 | Years 1 and 2: 4 observations per year; Teacher interviews; Review of student records; Year 1: Resource room; Year 2: General education | Curriculum-based measure of oral reading fluency; BASS | Standard scores in year 1 = year 2. ORF growth: year 1 < year 2. ORF time: resource = gen. ed. Off task: resource > gen. ed. Teacher monitored work: resource < gen. ed.
Note. CAT = California Achievement Test; SORT = Slosson Oral Reading Test; WRAT = Wide Range Achievement Test; CTBS = Comprehensive Test of Basic Skills; BASIS = Basic Achievement Skills Individual Screener; PIAT = Peabody Individual Achievement Test; BASS = Basic Academic Skills Samples.
aStudy reports data from entire school day. Only data from reading and/or language arts class time is reported here. bDoctoral dissertation. cWhile no statistically significant differences in the amount of time for each coded event were reported between students with LD and without a disability, practical differences were noted.
provided to students with LD (Meents, 1990; Moody et al., 2000; Schumm et al., 2000; Vaughn et al., 1998).
Comprehension. Of the few studies that reported evidence of comprehension instruction, all reported small amounts of time spent on low-quality comprehension instruction. Gelzheiser and Meyers (1991) reported only 8% of instructional time in the special education resource room was dedicated to comprehension instruction; however, the proportion is not much higher in the general education classroom, with only 13% of allocated time. Even when comprehension instruction was reported, it was judged as being of low quality, with instruction consisting of teachers reading a story aloud or having students take turns reading a story followed by teacher questioning (Vaughn et al., 1998). These questions were largely factual and literal (Kethley, 2005; Vaughn et al., 1998), with teachers asking twice as many short, low-level questions as long, higher-level questions (Rieth et al., 2003). Strikingly, across the four studies that reported comprehension instruction, representing a total of 263 observations, comprehension strategy instruction was noted in only three observations (2 in Kethley, 2005; 1 in Vaughn et al., 1998).
Vocabulary and reading fluency. Reports of vocabulary and reading fluency instruction were overwhelmingly missing from this corpus of studies. Among the three studies (Gelzheiser & Meyers, 1991; Kethley, 2005; Meents, 1990) that included reports of vocabulary instruction, one lumped vocabulary instruction with other activities that were coded as "indirect reading" (Gelzheiser & Meyers, 1991), resulting in an inability to isolate a report on the quantity and quality of vocabulary instruction. Meents (1990) reported no direct instruction in vocabulary. Finally, two out of four teachers interviewed by Kethley (2005) described explicit vocabulary instruction. However, little explicit instruction was later observed, with students mostly engaged in completing vocabulary worksheets.
Evidence of reading fluency instruction was even less attainable. No studies reported evidence of explicit instruction in reading fluency. Some studies noted the amount of time spent reading orally and silently (e.g., Haynes & Jenkins, 1986; Leinhardt et al., 1981), but did not specify time spent reading for the purpose of developing reading fluency. From the data provided by these studies, it is impossible to determine the amount of reading fluency instruction that took place in these classrooms.
Time spent reading connected text. Five studies reported the time students spent engaged in reading text either orally or silently (Haynes & Jenkins, 1986; Leinhardt et al., 1981; O'Sullivan et al., 1990; Thurlow, Graden et al., 1983; Thurlow, Ysseldyke, Graden, & Algozzine, 1983). Overall estimates were low, with the time students with LD spent engaged in oral or silent reading ranging from zero to 17.4 minutes per day. While Haynes and Jenkins (1986) reported that students with LD spent more minutes reading in the general education classroom than in the special education resource room, others reported proportions of time that conflict with this finding. For example, O'Sullivan et al. (1990) and Thurlow, Ysseldyke et al. (1983) compared reading time in general education and the resource room for students with LD. According to their reports, students with LD spent between 12.8% and 34.8% of their time in the resource room engaged in silent or oral reading, whereas in the general education classroom, this percentage dropped to between 2.5% and 17.7%. When students with and without LD are compared across all instructional settings, the difference is less evident: students with LD engaged in oral or silent reading 12.1% of the time, compared to 13.5% of the time for students without LD.
Instructional Grouping
The most frequently reported grouping structure was whole-class instruction, whether in the general education classroom (Thurlow, Graden et al., 1983; Ysseldyke, Thurlow, O'Sullivan et al., 1989; Zigmond & Baker, 1994) or the resource room (Moody et al., 2000; Schumm et al., 2000; Vaughn et al., 1998). However, when the reading experiences of students with and without LD are compared over the span of an entire school day, the picture changes: students with LD spent an average of 25% of their time in whole-group settings, 50% of their time in small groups, and 25% of their time engaged in individual work. Peers without LD spent 32% of their time in whole-group settings, 66% of their time in small groups, and only 2% of their time engaged in individual work (Ysseldyke, Thurlow, Mecklenburg et al., 1984).
Another group of studies reported that students with LD spent more than half of their instructional time in the resource room engaged in undifferentiated seatwork (Allington & McGill-Franzen, 1989; Haynes & Jenkins, 1986; Ysseldyke, Thurlow, O'Sullivan et al., 1989; Zigmond & Baker, 1994). In fact, Zigmond and Baker (1994) described a resource classroom where most of students' time was spent engaged in individual seatwork while the teacher worked with students grouped by grade level, resulting in students focusing on either paper-and-pencil tasks or "nothing."
Small groups were reported by some authors as the second most often observed grouping strategy in resource rooms (Haynes & Jenkins, 1986; Moody et al., 2000; Thurlow, Graden et al., 1983; Ysseldyke, Thurlow, O'Sullivan et al., 1989), with only one author reporting small-group instruction as the primary grouping structure across the entire academic day (Ysseldyke, Thurlow, Mecklenburg et al., 1984).
Student Academic Achievement
Six studies included measures of academic achievement in reading for students with LD (Haynes & Jenkins, 1986; Leinhardt et al., 1981; Moody et al., 2000; O'Sullivan et al., 1990; Thurlow et al., 1983; Zigmond & Baker, 1994). Of the studies that reported standard scores from the beginning and the end of the school year (Haynes & Jenkins, 1986; Moody et al., 2000), students made significant growth in oral reading (Haynes & Jenkins, 1986; Moody et al., 2000) and/or wide range reading ability (Haynes & Jenkins, 1986). However, Moody and colleagues (2000) reported no significant pre-/posttest differences in reading comprehension.
Leinhardt and colleagues (1981) created a composite score ranging from 0 to 30 from two assessments to indicate relative levels of reading skill. On the scoring scale, a mean pretest score of 12.0 indicated that students exhibited fairly well-developed phonics skills and could read isolated words, but they read at a rate of less than 50 words per minute and had underdeveloped reading comprehension skills. Students with developed reading skills scored at least 24. At the end of the school year, students with LD in this study scored an average of 17.49 points on the composite scale.
Two studies (O'Sullivan et al., 1990; Thurlow et al., 1983) compared standard scores of students with and without LD. While one study (Thurlow et al., 1983) reported that students with and without LD scored comparably on tests of general information, both studies noted that students without LD outscored students with LD on measures of reading ability.
One study (Zigmond & Baker, 1994) reported findings from one student's progress on a curriculum-based measure of oral reading fluency over two school years. At the end of the participant's fourth-grade year, he read 38.5 words per minute. By the end of fifth grade, this score rose to 56.0 words per minute. While in the resource room for reading instruction, his oral reading score declined at a rate of 0.12 words per week; however, when he transitioned into reading instruction in the general education classroom, this student gained 0.38 words per week.
Quality of Studies
As scholars continue to conduct observational research, it is important to identify which methods contribute to high-quality observation studies and assess their use in current reports of findings. Therefore, this portion of the research synthesis will focus on describing research methods used in the corpus of studies included in this synthesis.
As opposed to unobtrusive observation, where observations are conducted without participants' knowledge, all 21 studies employed continuous monitoring observation procedures, whereby participants knew that they were being observed (Bernard, 1994). Certain limitations are inherent in using continuous monitoring; however, they can be reduced through high-quality study design.
First, while it is impossible to eliminate observer bias, it can be reduced through extensive training (Gwet, 2001; Kent, Kanowitz, O'Leary, & Cheiken, 1977). Second, even when a fixed coding observation tool is used, an observer must decide among alternatives when coding behavior (Bernard, 1994). This introduces the need for establishing interrater reliability prior to data collection. Third, study designs should include measures to reduce the Hawthorne Effect, whereby those being observed behave differently simply because an observer is present. Because contrived behavior is difficult to maintain over time, one way to reduce the Hawthorne Effect is to conduct multiple observations (Hartmann & Wood, 1990). A summary of interrater reliability information is reported in Table 3.
Training
Hartmann and Wood (1982) suggested a model for training observers that includes learning the observation manual, practice sessions, retraining (to prevent observer drift), and post-investigation debriefing. Ten studies reported training, practice sessions, and retraining (Gelzheiser & Meyers, 1991; Haynes & Jenkins, 1986; Leinhardt et al., 1981; O'Sullivan et al., 1990; Thurlow, Graden et al., 1983; Thurlow, Ysseldyke et al., 1983; Ysseldyke, Christenson, Thurlow, & Bakewell, 1989; Ysseldyke, O'Sullivan, Thurlow, & Christenson, 1989; Ysseldyke et al., 1987; Ysseldyke, Thurlow et al., 1989). Four studies employed initial training and practice sessions (Moody et al., 2000; Rieth et al., 2003; Schumm et al., 2000; Vaughn et al., 1998), but did not require retraining or checks during the data collection phase. Of the seven that did not report training procedures, one investigation reported informal retraining during the study period (Ysseldyke et al., 1984). Two studies employed only one observer (Bresnahan, 2001; Kethley, 2005), and four did not report the number of observers collecting data (Allington & McGill-Franzen, 1989; Meents, 1990; O'Connor & Jenkins, 1996; Zigmond & Baker, 1994). No studies reported post-investigation debriefing. In sum, while seven studies did not report observer training procedures, the majority of studies (n = 10) completed three of the suggested training steps, neglecting only a post-investigation debriefing phase.
Table 3
Reliability
Study | Number of Observers | Initial and/or Follow-Up Reliability | Criterion for Initial Reliability | Method for Reliability | Training Procedure
Allington & McGill-Franzen, 1989a | Not reported | Not reported | Not reported | Not reported | Not reported
Bresnahan, 2001b | 1 | Not reported | Not reported | Not reported | Not reported
Gelzheiser & Meyers, 1991 | 12 | Initial and follow-up | 90% | Gold standard | 8 group meetings; met weekly with authors to resolve ambiguities; 4-6 monthly reliability checks
Haynes & Jenkins, 1986a | 6 | Initial and follow-up | 75% | Gold standard | 3 weeks of daily, 2-hour sessions; reliability checks every other week
Kethley, 2005b | 1 | Not reported | Not reported | Not reported | Not reported
Leinhardt, Zigmond, & Cooley, 1981 | 10 | Initial and follow-up | 80% | Gold standard | Self-training manual; reliability checks every other week
Meents, 1990b | Not reported | Not reported | Not reported | Not reported | Not reported
Moody, Vaughn, Hughes, & Fischer, 2000 | Not reported | Initial | 90% | Consistency | 11 hours training
O'Connor & Jenkins, 1996 | Not reported | Not reported | Not reported | Not reported | Not reported
O'Sullivan, Ysseldyke, Christenson, & Thurlow, 1990 | 4 | Initial and follow-up | 90% | Consistency | Training of unspecified content or length; 12 checks during study
Rieth, Bryant, Kinzer, Colburn, Hur, Hartman, & Choi, 2003 | 4 | Initial | 90% | Gold standard | Training of unspecified length
Schumm, Moody, & Vaughn, 2000 | 7 | Initial | 80% | Gold standard | 10-hour training
Thurlow, Graden, Greener, & Ysseldyke, 1983a | 12 | Initial | Not reported | Consistency | Training of unspecified content or length; informal checks during study at bi-weekly meetings
Thurlow, Ysseldyke, Graden, & Algozzine, 1983a | 9 | Initial and follow-up | 85% to 95% | Consistency | 8 units of training; mastery required on exercises and quizzes; 41 reliability checks during data collection
Vaughn, Moody, & Schumm, 1998 | Not reported | Initial | 85% | Consistency | 11 hours of training
Ysseldyke, Christenson, Thurlow, & Bakewell, 1989 | Not reported | Initial and follow-up | 90% | Consistency | Training of unspecified content or length; mastery required on training tests; 2 days classroom practice; 12 reliability checks during data collection
Ysseldyke, O'Sullivan, Thurlow, & Christenson, 1989 | 6 | Initial and follow-up | 60% and 84% | Consistency | 2 weeks of half-day training; 2-3 days of practice in classrooms; 18 reliability checks during data collection
Ysseldyke, Thurlow, Christenson, & Weiss, 1987a | Not reported | Initial and follow-up | 90% | Consistency | Training of unspecified content or length; mastery required on training tests; 2 days classroom practice; reliability checks once every 20 observations
Ysseldyke, Thurlow, Mecklenburg, & Graden, 1984a | 12 | Initial | 85% | Consistency | Training not reported; informal checks during study at bi-weekly meetings
Ysseldyke, Thurlow, O'Sullivan, & Christenson, 1989a | 4 | Initial and follow-up | Not reported | Consistency | 2 weeks of formal training sessions; 2-3 days of classroom practice; 12 reliability checks during data collection
Zigmond & Baker, 1994 | Not reported | Initial | Not reported | Consistency | Training not reported; during 20% of observations, 2 observers coded
Note. aStudy reports data from entire school day. Only data from reading and/or language arts class time is reported here. bDoctoral dissertation.
Interrater Reliability
Methods for calculation. Interrater reliability, the consistency of scores obtained from a measure, can be established by calculating the percentage of agreement between coders. Percentage of agreement can be calculated in two ways: (a) the "gold standard" approach, and (b) consistency between two or more raters. The "gold standard" approach requires an expert coder to establish a set of correct observation codes (Gwet, 2001). Each subsequent observer's trial observation is compared with these codes, and a percent agreement is calculated. Four studies reported using the "gold standard" method of establishing interrater reliability (Gelzheiser & Meyers, 1991; Haynes & Jenkins, 1986; Leinhardt et al., 1981; Rieth et al., 2003), with overall agreement criterion ranging from 75% to 90%.
The second method for establishing interrater percent agreement is to report the extent to which scores recorded by two or more raters are consistent. Here, raters are considered consistent based on a relative number of times they agree on the code that should be assigned to a behavior (Goodwin, 2001). Twelve studies used this form of percent agreement (Moody et al., 2000; O'Sullivan et al., 1990; Schumm et al., 2000; Thurlow, Graden et al., 1983; Thurlow, Ysseldyke et al., 1983; Vaughn et al., 1998; Ysseldyke, Christenson et al., 1989; Ysseldyke, O'Sullivan et al., 1989; Ysseldyke et al., 1987; Ysseldyke et al., 1984; Zigmond & Baker, 1994), with overall agreement ranging from 80% to 90%.
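The two approaches to percent agreement described above can be illustrated with a minimal Python sketch. The function names, the grouping codes, and the sample data are hypothetical and are not drawn from any of the reviewed studies. Averaging over all rater pairs is only one plausible way to summarize consistency among more than two raters; the reviewed studies did not necessarily compute it this way.

```python
from itertools import combinations


def percent_agreement(codes_a, codes_b):
    """Percentage of coded intervals on which two code sequences match exactly."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("code sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


def gold_standard_agreement(expert_codes, observer_codes):
    """Gold standard approach: compare one observer against an expert's codes."""
    return percent_agreement(expert_codes, observer_codes)


def consistency_agreement(raters):
    """Consistency approach (assumed here to be mean pairwise agreement)."""
    pairs = list(combinations(raters, 2))
    if not pairs:
        raise ValueError("at least two raters are required")
    return sum(percent_agreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical grouping codes for ten observation intervals.
expert = ["whole", "whole", "small", "indiv", "small",
          "whole", "indiv", "small", "whole", "small"]
observer_1 = ["whole", "whole", "small", "indiv", "whole",
              "whole", "indiv", "small", "whole", "small"]
observer_2 = ["small", "whole", "small", "indiv", "small",
              "whole", "indiv", "small", "whole", "small"]

print(gold_standard_agreement(expert, observer_1))  # 90.0
print(f"{consistency_agreement([expert, observer_1, observer_2]):.1f}")  # 86.7
```

In this sketch, an observer who reaches 90% against the expert's codes would meet the criterion used by Gelzheiser and Meyers (1991), while the same data summarized as consistency across three raters yields a somewhat lower figure.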
Agreement can be calculated by item, section, or the entire measure. When agreement is calculated by item, it is possible to provide further training on specific items of disagreement. Four studies reported requiring matches by item (Moody et al., 2000; Vaughn et al., 1998; Ysseldyke, Christenson et al., 1989; Ysseldyke, O'Sullivan et al., 1989). Interrater percent agreement ranged from 85% to 90%. Two studies reported calculating interrater reliability by section (Gelzheiser & Meyers, 1991; Thurlow, Ysseldyke et al., 1983), with interrater percent agreement ranging from 85% to 94%. No studies reported calculating interrater reliability based upon the entire measure. Finally, in 10 studies, it was not possible to determine the level upon which interrater percent agreement was calculated (Haynes & Jenkins, 1986; Leinhardt et al., 1981; O'Sullivan et al., 1990; Rieth et al., 2003; Schumm et al., 2000; Thurlow, Graden et al., 1983; Ysseldyke et al., 1987; Ysseldyke, Thurlow et al., 1989; Zigmond & Baker, 1994; Ysseldyke et al., 1984).
Determining agreements. Another variation in method of calculating interrater agreement is whether or not to require exact matches in coding. The more conservative criterion for agreement is to require an exact match; however, some authors use the more lenient criterion that the difference between two raters is not more than one point in either direction. Percentages of agreement tend to be lower when agreement is defined in the more conservative, exact-match manner.
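The difference between the exact-match and the within-one-point criteria can be sketched as follows. This is an illustration only: it assumes the observation instrument yields numeric ratings (for example, a 1 to 5 scale), and the function name, the `tolerance` parameter, and the ratings are hypothetical.

```python
def agreement(ratings_a, ratings_b, tolerance=0):
    """Percent of items on which two raters agree.

    tolerance=0 requires an exact match (the conservative criterion);
    tolerance=1 counts ratings within one point in either direction
    as agreement (the lenient criterion).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    hits = sum(abs(a - b) <= tolerance for a, b in zip(ratings_a, ratings_b))
    return 100.0 * hits / len(ratings_a)


# Hypothetical ratings from two observers on eight items.
rater_a = [3, 2, 4, 1, 5, 3, 2, 4]
rater_b = [3, 3, 4, 1, 4, 3, 2, 2]

print(agreement(rater_a, rater_b))               # 62.5 (exact match)
print(agreement(rater_a, rater_b, tolerance=1))  # 87.5 (within one point)
```

With the same pair of observers, the exact-match figure falls in the fair-to-good range while the lenient figure exceeds 75%, which is why a report of percent agreement is hard to interpret unless the matching criterion is stated.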
Two studies required exact matches on all sections of the observation instrument (Thurlow, Graden et al., 1983; Thurlow, Ysseldyke et al., 1983), with interrater agreements of 94% and 85-90%, respectively. One study (Ysseldyke, O'Sullivan et al., 1989) required exact matches on one section (60% agreement achieved) but not on another (84% agreement achieved). Other studies did not report whether they required exact matches.
Acceptable level of agreement. Interrater reliability values above 75% are said to indicate excellent agreement beyond chance, and those that fall between 40% and 75% are considered to indicate fair to good agreement (Landis & Koch, 1977). Of the 12 studies that reported a minimum interrater reliability level, all required at least 75%, with seven of these setting the criterion at greater than or equal to 90% interrater reliability (Gelzheiser & Meyers, 1991; Moody et al., 2000; O'Sullivan et al., 1990; Rieth et al., 2003; Thurlow, Ysseldyke et al., 1983; Ysseldyke, Christenson et al., 1989; Ysseldyke et al., 1987). The one partial exception was the study conducted by Ysseldyke, O'Sullivan, and colleagues (1989), in which they required exact matches on one section, achieving a 60% level of agreement, but did not require exact matches on another section, achieving an 84% level of agreement.
Seven studies did not report criteria or a method for establishing initial reliability prior to beginning observations (Allington & McGill-Franzen, 1989; Bresnahan, 2001; Kethley, 2005; Meents, 1990; O'Connor & Jenkins, 1996; Ysseldyke, Thurlow et al., 1989; Zigmond & Baker, 1994). Two of these studies used only one observer (Bresnahan, 2001; Kethley, 2005), and four failed to report the number of observers or the method for obtaining interrater reliability (Allington & McGill-Franzen, 1989; Meents, 1990; O'Connor & Jenkins, 1996; Zigmond & Baker, 1994). Finally, two described an interrater reliability procedure that was employed during data collection; however, interrater reliability was not established prior to observations (Ysseldyke, O'Sullivan et al., 1989; Ysseldyke, Thurlow et al., 1989).
Multiple Observations
When Harvard Business School professors researched productivity in the Hawthorne Plant of the Western Electric Company, they found that workers altered their behavior simply because they were being observed (Roethlisberger & Dickson, 1939). The Hawthorne Effect is realized in experimental research as the unwanted effect of experimental methodology itself (Parsons, 1974). Thus, in observational research, the Hawthorne Effect can be detected among both teachers and students being observed. In other words, everyone is on their best behavior when a researcher is collecting observation data in their classroom. While the Hawthorne Effect may not be eliminated, researchers can reduce the effect by conducting multiple observations in a classroom (Hartmann & Wood, 1990).
The number of observations reported in studies of reading instruction for students with LD varied widely. Five studies conducted only one observation of either students (O'Sullivan et al., 1990; Ysseldyke, Christenson et al., 1989; Ysseldyke et al., 1987) or teachers (Allington & McGill-Franzen, 1989; Rieth et al., 2003). However, in three of the studies, students were observed over the entire school day (O'Sullivan et al., 1990; Ysseldyke, Christenson et al., 1989; Ysseldyke et al., 1987). Five studies conducted 2 observations of students (Thurlow, Graden et al., 1983; Thurlow, Ysseldyke et al., 1983; Ysseldyke, O'Sullivan et al., 1989; Ysseldyke et al., 1984) or teachers (Meents, 1990). Three studies conducted 3 observations of students (Bresnahan, 2001; Schumm et al., 2000; Vaughn et al., 1998). Four studies conducted more than 3 observations of students (Kethley, 2005; Leinhardt et al., 1981; Moody et al., 2000; Zigmond & Baker, 1994), and two studies reported a range of 4 to 33 observations (Gelzheiser & Meyers, 1991; Haynes & Jenkins, 1986). Two studies did not report the number of observations conducted (O'Connor & Jenkins, 1996; Ysseldyke, Thurlow et al., 1989).
DISCUSSION
The results from the synthesis reported here indicate that there is a disconnect between what occurs during reading instruction for students with LD and research-supported components of effective reading instruction. Several findings are of specific interest.
First, the results indicated that students with LD spent little time engaged in phonemic awareness, phonics, reading fluency, comprehension, and vocabulary instruction. It has been demonstrated that interventions that focus on these components are more effective than those that do not (Juel & Minden-Cupp, 2000; Torgesen, 2002). In particular, evidence suggests that methods that directly teach phonemic awareness and phonics skills are especially effective for struggling readers (Hatcher, Hulme, & Ellis, 1994; Torgesen et al., 1999). Phonemic awareness is the ability to focus on and manipulate phonemes in spoken words (Liberman, Shankweiler, Fischer, & Carter, 1974). However, it is important to link phonological awareness to printed letters (Ehri, 2004). In this way, a connection is made to phonics, which is the understanding of letter-sound correspondences and how to use these correspondences to read and spell words (National Reading Panel, 2000).
According to the results of the current synthesis, little phonics instruction was observed. In fact, four studies (Meents, 1990; Moody et al., 2000; Schumm et al., 2000; Vaughn et al., 1998) that included at least 145 students with LD (the Meents and Schumm studies did not report the number of student participants) reported no phonics instruction delivered during observed class times. In a recent meta-analysis of phonics instruction (Ehri, 2004), synthetic phonics approaches, whereby correspondences between letters and sounds are taught in a clearly defined sequence, outperformed other approaches that taught larger phonic units or provided more miscellaneous instructional methods.
Second, inappropriate grouping structures were often used during reading instruction for students with LD. Multiple research syntheses (National Reading Panel, 2000; Swanson et al., 1999) have established the importance of using small groups to provide high-quality reading instruction to struggling readers. For example, the National Reading Panel (2000) reported that phonemic awareness instruction is most effective when provided in a small-group setting. Likewise, teaching students with LD in small groups of six or fewer students was one of three key instructional components reported to produce the strongest impact on reading outcomes (Swanson et al., 1999). Students with LD in the current synthesis most often experienced undifferentiated seatwork or whole-class instruction. Thus, evidence from this synthesis suggests that the most beneficial grouping structure, small groups, is rarely used during reading instruction for students with LD.
Third, students were engaged in very little comprehension instruction. Learning from text and understanding what you read are the purposes of reading instruction. Yet, these studies reported that comprehension instruction rarely occurred, and when it did, it included, for the most part, literal comprehension questions.
Finally, students with LD spent little time engaged in the actual task of reading, when they are perhaps the population of students who require the greatest amount of practice reading connected text. One way to increase reading fluency (the ability to read quickly, accurately, and with expression) is repeated reading (Chard, Vaughn, & Tyler, 2002). Teaching struggling readers to read fluently is difficult. In a recent synthesis of fluency instruction studies (Kuhn & Stahl, 2003), it was reported that even with high-quality fluency instruction, struggling readers (including those with LD) rarely made one month's progress in oral reading fluency within one month's time. However, they made more progress in reading fluency than they did with other types of instruction. The results from the current synthesis give cause for concern regarding the amount of fluency instruction taking place during reading instruction for students with LD, providing at least some evidence that students with LD are not engaged in text reading long enough to make a difference in oral reading fluency ability.
Limitations and Directions for Future Research
Limitations in the current synthesis. While this synthesis reports findings from observation studies reporting reading instruction for students with LD published in peer-reviewed journals available through ERIC, PsycINFO, and Education Full Text, as well as dissertation studies listed in Dissertation Abstracts International within the last 25 years, one limitation cannot escape mention. To establish interrater reliability, one article was coded by the author and a graduate student and percent agreement (.97) was established. While this percent agreement is high, additional articles could have been coded to more robustly establish reliability.
Defining key variables. Researchers often failed to define key variables measured during observations. For example, of the seven studies that reported grouping structure during reading instruction, none detailed their definition of the terms "whole group," "small group," or "individual" work. This becomes problematic when authors of two or more studies report amounts of small-group work, and different observation tools define the term "small group" each in their own way. A similar problem occurred when analyzing the amount of comprehension strategy instruction reported by authors. While one author provided a description of instruction from which the definition could be assumed (Kethley, 2005), the other did not (Vaughn et al., 1998). Neither author defined what they meant by "comprehension strategy," resulting in tenuous cross-study findings. For these reasons, it is important for authors of observation studies to define key variables of interest.
Interrater reliability. Two particular aspects of interrater reliability are of concern among the studies included in this synthesis. First, only two studies described the details regarding the requirement of exact matches in the coding described, leaving 19 studies that did not report such information. Because the exact match method is a more conservative method for establishing interrater reliability, one could argue that these findings are more robust. However, one cannot assume that because it was not reported, it did not occur. Therefore, it would be helpful if authors reported whether they required exact matches for the establishment of interrater reliability.
Second, eight studies did not report criteria or a method for establishing initial reliability prior to beginning observation. Because interrater reliability established prior to observations being conducted is a way to systematically reduce observer bias, it is important to report these procedures in order to lend credibility to findings. Without such information, findings are subjected to increased scrutiny and questions about their reliability.
School descriptions. Researchers rarely provided a rich description of the schools where observations were conducted. This left many contextual questions unanswered. For example, (a) Were the school's students successful on state accountability measures? (b) What was the school population's socioeconomic status? Without this information, it is unclear whether this body of research is based on overall failing or successful schools.
During the past five years, fewer observation studies have been conducted during reading instruction for students with LD. However, such studies continue to be conducted for students at risk for reading difficulty (e.g., Chard & Kame'enui, 2000) in early grades. This shift in observation research may be the result of a move away from labeling students' disabilities toward investigating the types of academic problems that students experience. As the field continues to identify essential components of reading instruction for students with LD, it will be important to document implementation in classrooms where students with LD are served. In other words, is what has been described in research making its way into classrooms?
Finally, observation studies of reading instruction for middle and high school students with LD need to be conducted. Of the 21 studies reported here, only five observed in classrooms that served students with LD in grades 7-12. Of these, three described instructional content. One of the studies was conducted in 1990, while the other three were conducted more recently (2001 and 2005), and report findings from a total of 61 observations of 21 reported students (two studies did not report the number of students) and 20 reported teachers (one study did not report the number of teachers). Certainly, this does not constitute enough research to generalize findings regarding the nature of reading instruction currently provided to middle and high school students with LD.
Implications for Practice
To provide research-based reading instruction to students with LD, teachers must be knowledgeable about such practices and must develop the skills necessary to successfully implement research-based reading instruction. The absence of such practices may indicate a lack of acceptance, knowledge, or skill among teachers of students with LD. Therefore, it is critical for teacher education programs to focus on developing a deep knowledge of research-based reading instruction. Paired with this knowledge-building component should be meaningful, guided opportunities for teachers to practice the skills necessary to deliver such instruction.
This synthesis has reported findings from observation studies of reading instruction for students with LD in grades 1 through 12 over the past 25 years. Despite limitations within the corpus of studies, it was possible to identify trends in instruction for students with LD. While additional research is necessary to better understand the extent to which students with LD are provided effective reading instruction, it is apparent from this synthesis that the field has work to do in order to bridge the gap between what is known to be effective through research and what is observed in the classroom.
REFERENCES
* = Studies included in the synthesis.
*Allington, R. L., & McGill-Franzen, A. (1989). School response to reading failure: Instruction for Chapter 1 and special education students in grades two, four, and eight. The Elementary School Journal, 89, 529-542.
Bentum, K. E., & Aaron, P. G. (2003). Does reading instruction in learning disability resource rooms really work?: A longitudinal study. Reading Psychology, 24, 361-382.
Bergeron, T. (2003). A brief history of the parent advocacy movement. Retrieved November 11, 2007, from www.parentinformationcenter.org/aboutus/history/pam.html
Bernard, H. R. (1994). Research methods in anthropology: Qualitative and quantitative approaches. Thousand Oaks, CA: AltaMira Press.
*Bresnahan, M.V.P. (2001). How students in middle school with reading disabilities experience reading instruction: Three case studies. Unpublished doctoral dissertation, Northern Illinois University.
Chard, D. J., & Kame'enui, E. J. (2000). Struggling first grade readers: The frequency and progress of their reading. The Journal of Special Education, 34, 28-38.
Chard, D. J., Vaughn, S., & Tyler, B.-J. (2002). A synthesis of research on effective interventions for building reading fluency with elementary students with learning disabilities. Journal of Learning Disabilities, 35, 386-406.
Cooper, H. (1998). Synthesizing research: A guide for literature reviews. Thousand Oaks, CA: Sage Publications.
Ehri, L. C. (2004). Teaching phonemic awareness and phonics. In P. McCardle & V. Chhabra (Eds.), The voice of evidence in reading research (pp. 153-186). Baltimore: Brookes.
*Gelzheiser, L. M., & Meyers, J. (1991). Reading instruction by classroom, remedial, and resource room teachers. The Journal of Special Education, 24, 512-526.
Goodwin, L. D. (2001). Interrater agreement and reliability. Measurement in Physical Education and Exercise Science, 5(1), 13-34.
Gwet, K. (2004). Handbook of interrater reliability: How to estimate the level of agreement between two or multiple raters. Gaithersburg, MD: STATAXIS Publishing.
Hartmann, D. P., & Wood, D. D. (1982). Observational methods. In A. Bellack, M. Hersen, & A. E. Kazdin (Eds.), International handbook of behavior modification and therapy (pp. 109-131). New York: Plenum.
Hatcher, P., Hulme, C., & Ellis, A. W. (1994). Ameliorating early reading failure by integrating the teaching of reading and phonological skills: The phonological linkage hypothesis. Child Development, 65, 41-57.
*Haynes, M. C., & Jenkins, J. R. (1986). Reading instruction in special education resource rooms. American Educational Research Journal, 23, 161-190.
Institute of Education Sciences. (2003). What Works Clearinghouse study review standards. Retrieved January 10, 2005, from http://www.whatworks.ed.gov/reviewprocess/study_standards_final.pdf
ISI Web of Knowledge. (2005). Journal citation reports. Retrieved January 10, 2005, from http://portal.isiknowledge.com.ezproxy.lib.utexas.edu/portal.cgi/jcr/?Init=Yes&SID=4F8Nbln4@PDD32Agnlo
Juel, C., & Minden-Cupp, C. (2000). Learning to read words: Linguistic units and instructional strategies. Reading Research Quarterly, 35, 458-492.
Kent, R. N., Kanowitz, J., O'Leary, K. D., & Cheiken, M. (1977). Observer reliability as a function of circumstances of assessment. Journal of Applied Behavior Analysis, 10, 317-324.
*Kethley, C. I. (2005). Case studies of resource room reading instruction for middle school students with high-incidence disabilities. Unpublished doctoral dissertation, The University of Texas, Austin.
Kim, A.-H., Vaughn, S., Elbaum, B., Hughes, M. T., Sloan, C.V.M., & Sridhar, D. (2003). Effects of toys or group composition for children with disabilities: A synthesis. Journal of Early Intervention, 25, 189-205.
Kim, A.-H., Vaughn, S., Wanzek, J., & Wei, S. (2004). Graphic organizers and their effects on the reading comprehension of students with LD: A synthesis of research. Journal of Learning Disabilities, 37, 105-118.
Kuhn, M. R., & Stahl, S. A. (2003). Fluency: A review of developmental and remedial practices. Journal of Educational Psychology, 95, 3-21.
Landis, J. R., & Koch, G. G. (1977). An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers. Biometrics, 33, 363-374.
*Leinhardt, G., Zigmond, N., & Cooley, W. W. (1981). Reading instruction and its effects. American Educational Research Journal, 343-361.
Liberman, I., Shankweiler, D., Fischer, F., & Carter, B. (1974). Explicit syllable and phoneme segmentation in the young child. Journal of Experimental Child Psychology, 18, 201-212.
McKinney, J. D., & Feagans, L. (1984). Academic and behavioral characteristics of learning disabled children and average achievers: Longitudinal studies. Learning Disability Quarterly, 7, 251-265.
*Meents, C. K. (1990). Literacy instruction in high school resource rooms. Unpublished doctoral dissertation, State University of New York, Albany.
*Moody, S. W., Vaughn, S., Hughes, M. T., & Fischer, M. (2000). Reading instruction in the resource room: Set up for failure. Exceptional Children, 66, 305-316.
National Reading Panel. (2000). Teaching children to read: An evidence-based assessment of the scientific research literature on reading and its implications for reading instruction. Bethesda, MD: National Reading Panel, National Institute of Child Health and Human Development.
O'Connor, R. E., & Jenkins, J. R. (1996). Cooperative learning as an inclusion strategy: A closer look. Exceptionality, 6, 29-51.
O'Sullivan, P. J., Ysseldyke, J. E., Christenson, S. L., & Thurlow, M. L. (1990). Mildly handicapped elementary students' opportunity to learn during reading instruction in mainstream and special education settings. Reading Research Quarterly, 25, 131-146.
Parsons, H. M. (1974). What happened at Hawthorne? Science, 183, 922-932.
*Rieth, H. J., Bryant, D. P., Kinzer, C. K., Colburn, L. K., Hur, S., Hartman, P., et al. (2003). An analysis of the impact of anchored instruction on teaching and learning activities in two ninth-grade language arts classes. Remedial and Special Education, 24, 173-184.
Roethlisberger, F. J., & Dickson, W. J. (1939). Management and the worker. Cambridge, MA: Harvard University Press.
Scammacca, N., Roberts, G., Vaughn, S., Edmonds, M., Wexler, J., Reutebuch, C. K., et al. (2007). Interventions for adolescent struggling readers: A meta-analysis with implications for practice. Portsmouth, NH: RMC Research Corporation, Center on Instruction.
*Schumm, J. S., Moody, S. W., & Vaughn, S. (2000). Grouping for reading instruction: Does one size fit all? Journal of Learning Disabilities, 33, 477-488.
Shinn, M. R., Powell-Smith, K. A., Good, R. H., & Baker, S. (1997). The effects of reintegration into general education reading instruction for students with mild disabilities. Exceptional Children, 64, 59-79.
Snow, C. E., Burns, M. S., & Griffin, P. (1998). Preventing reading difficulties in young children. Washington, DC: National Academy Press.
Swanson, H. L., & Hoskyn, M. (1998). Experimental intervention research on students with learning disabilities: A meta-analysis of treatment outcomes. Review of Educational Research, 68, 277-321.
Swanson, H. L., Hoskyn, M., & Lee, C. (1999). Interventions for students with learning disabilities: A meta-analysis of treatment outcomes. New York: Guilford.
*Thurlow, M., Graden, J., Greener, J., & Ysseldyke, J. E. (1983). LD and non-LD students' opportunities to learn. Learning Disability Quarterly, 6, 172-183.
*Thurlow, M., Ysseldyke, J. E., Graden, J. L., & Algozzine, B. (1983). What's special about the special education resource room for learning disabled students? Learning Disability Quarterly, 6, 283-288.
Torgesen, J. (2002). The prevention of reading difficulties. Journal of School Psychology, 40, 7-26.
Torgesen, J., Wagner, R. K., Rashotte, C. A., Rose, E., Lindamood, P., Conway, T., et al. (1999). Preventing reading failure in young children with phonological processing disabilities: Group and individual responses to instruction. Journal of Educational Psychology, 91, 579-593.
Vaughn, S., Levy, S., Coleman, M., & Bos, C. S. (2002). Reading instruction for students with LD and EBD: A synthesis of observation studies. The Journal of Special Education, 36, 2-13.
*Vaughn, S., Moody, S. W., & Schumm, J. S. (1998). Broken promises: Reading instruction in the resource room. Exceptional Children, 64, 211-225.
Waldron, N. L., & McLeskey, J. (1998). The effects of an inclusive school program on students with mild and severe learning disabilities. Exceptional Children, 64, 395-406.
Wang, M. C., Reynolds, M. C., & Walberg, H. J. (1986). Rethinking special education. Educational Leadership, 44, 26-31.
Will, M. C. (1986). Educating children with learning problems: A shared responsibility. Exceptional Children, 52, 411-416.
Ysseldyke, J. E., Christenson, S. L., Thurlow, M. L., & Bakewell, D. (1989). Are different kinds of instructional tasks used by different categories of students in different settings? School Psychology Review, 18, 98-111.
Ysseldyke, J. E., O'Sullivan, P. J., Thurlow, M. L., & Christenson, S. L. (1989). Qualitative differences in reading and math instruction received by handicapped students. Remedial and Special Education, 10, 14-20.
Ysseldyke, J. E., Thurlow, M. L., Christenson, S. L., & Weiss, J. (1987). Time allocated to instruction of mentally retarded, learning disabled, emotionally disturbed, and nonhandicapped elementary students. The Journal of Special Education, 21, 43-55.
Ysseldyke, J. E., Thurlow, M. L., Mecklenburg, C., & Graden, J. (1984). Opportunity to learn for regular and special education students during reading instruction. Remedial and Special Education, 5, 29-37.
Zigmond, N., & Baker, J. M. (1994). Is the mainstream a more appropriate educational setting for Randy? A case study of one student with learning disabilities. Learning Disabilities Research and Practice, 9, 108-117.
Zigmond, N., & Jenkins, J. (1995). Special education in restructured schools: Findings from three multi-year studies. Phi Delta Kappan, 76, 531-541.
AUTHOR NOTE
This research was supported by a grant from the Meadows Foundation.
Please address correspondence to: Elizabeth A. Swanson, The University of Texas at Austin, 1 University Station D4900, Austin, TX 78712; e-mail: [email protected]