DRAFT WY Educator Evaluation Framework. Advisory Committee October 17-18, 2012 1
THE WYOMING FRAMEWORK FOR EDUCATOR EVALUATION
DRAFT: October 17, 2012
Introduction
The Advisory Committee to the Wyoming Select Committee on Educational Accountability was
charged with carrying out the recommendations put forth in the Wyoming Accountability in
Education Act of 2012 (WEA 65). The specific charge for the Advisory Committee was to
design a framework for educator evaluation in Wyoming. The Select Committee was quite clear
that they wanted a balance between state and local control and, in keeping with Wyoming’s
educational philosophy, the Select Committee placed considerable authority for making specific
design and implementation decisions with local educational leaders and teachers. However, in
order to best support the work of districts, the Advisory Committee produced this document: A
Framework of Educator Evaluation in Wyoming. This framework outlines methods and design
decisions necessary for implementing an educator evaluation system and indicates where the
Advisory Committee recommends that requirements be “tight,” or more
standardized across districts, and where flexibility is expected and even encouraged. The
Advisory Committee intends that districts be able to use the Framework described below
as the basis for their local systems if they choose. The Framework will not be “plug and
play” in that local districts will still have many decisions to make to operationalize their local
system, but the Framework is designed to make districts’ jobs considerably easier.
A critical aspect of the framework, as reflected below in the key principles, is the intention to
build both an internally coherent system and an educator evaluation system that is coherent with
other educational accountability systems in Wyoming. A coherent system would use
information from the school accountability system (and perhaps the district accreditation system)
to supplement the information generated from the educator accountability systems. For example,
if a school has demonstrated high achievement and the students are growing at admirable rates,
there is good evidence of high quality education in the school, which therefore suggests we can
trust that the educators in the building are performing well. Relying on the larger sample sizes
associated with a school, rather than with any individual teacher, makes the determinations that
much more reliable. This intent to build on the information from the school accountability system
does not relieve school districts of implementing educator evaluation systems, but it could
mean that the state would have to provide far less oversight of educator evaluation systems in
high performing schools.
Key Principles
The following principles guided the development of the Wyoming Framework for Educator
Evaluation. The Advisory Committee kept these principles at the center of its
deliberations in developing the various components of the system, and they are at the heart of
the recommendations discussed throughout this document. As noted below, the primary purpose
of the system is to maximize student learning and improvements in student learning. The system
must maintain the focus on student learning and all of the following principles support this
primary purpose.
1. The primary purpose of Wyoming’s educator evaluation and the reason for engaging
in this work is to support and promote increases in student learning in Wyoming
schools.
2. The system must be designed coherently to support a system of continuous school
improvement. A coherent system will work seamlessly with the school and leader
accountability systems and foster collaboration among educators, administrators, and
other stakeholders.
3. The Framework and locally-aligned versions of the system shall be designed to
promote opportunities for meaningful professional growth of educators. As such, the
system must be designed to provide specific and timely feedback on multiple aspects
of professional practice and student learning. A feedback-oriented system must be a
continuous improvement process and not a one-time event.
4. The system must be designed and implemented with integrity. Doing so will offer a
positive and far-reaching vision for education as a profession, one built on respect,
caring, and fairness. A system designed with integrity will be transparent such that
all relevant participants clearly understand the expectations.
5. The Framework must allow for flexibility to best fit local contexts and needs. The
local evaluation systems should be designed collaboratively by administrators and
educators, with input gathered from parents and community members.
6. The system will provide credible information to support hiring, placement, and career
ladder decisions in a technically and morally defensible manner.
Domains of the Wyoming Educator Evaluation Framework
A key aspect of the Framework is that it will contain five major components: four domains of
professional practice and one domain of student performance data. The four domains of
professional practice noted below represent the overarching categories of the Interstate Teacher
Assessment and Support Consortium Model Core Teaching Standards (InTASC Standards)1.
Districts will use a variety of tools to measure professional practice (e.g., Danielson’s Framework for
Effective Teaching; Marzano’s Art and Science of Teaching). The Advisory Committee does not
want to limit the options to specific tools, but recommends that all local systems measure the
four domains of effective teaching described in the InTASC Standards.
Learner and Learning
Content Knowledge
Instructional Practice
Professional Responsibility
The Advisory Committee intends for each domain, including student performance results, to be
equally valued in the overall evaluation. Further, the Framework is designed to promote
coherence and integration among the five domains. Therefore, the Advisory Committee
recommends weighting each component, especially student learning, as equally as possible in the
overall evaluation of each educator. Further, there is an important difference between nominal
(intended) and effective (actual) weights, and the Advisory Committee recommends that as each
district pilots its system, it analyze the data to determine the actual weight of the various
domains. This actual weighting will depend on the variability in the responses to the specific
instruments used in each district. In the following sections, the major components of the
Framework are discussed in more detail.
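The difference between nominal and effective weights can be illustrated with a small calculation. The following is a minimal sketch, not part of the Framework: it assumes each educator has a score on each component, forms a weighted composite, and estimates each component's effective weight as its share of the composite's variance (the nominal weight times the covariance of the component with the composite, divided by the composite variance). The function names, component names, and sample scores are hypothetical.

```python
from statistics import mean


def covariance(xs, ys):
    """Population covariance of two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def effective_weights(scores, nominal):
    """Return each component's share of composite-score variance.

    scores:  dict mapping component name -> list of educator scores
    nominal: dict mapping component name -> intended (nominal) weight
    """
    n = len(next(iter(scores.values())))
    # Weighted composite score for each educator.
    composite = [sum(nominal[c] * scores[c][i] for c in scores) for i in range(n)]
    var_composite = covariance(composite, composite)
    return {
        c: nominal[c] * covariance(scores[c], composite) / var_composite
        for c in scores
    }


# Hypothetical pilot data: two components with equal nominal weights,
# but observation ratings barely vary while student growth varies a lot.
scores = {
    "observation": [3.0, 3.0, 3.1, 3.0, 2.9],
    "student_growth": [1.0, 4.0, 2.0, 3.5, 1.5],
}
nominal = {"observation": 0.5, "student_growth": 0.5}
weights = effective_weights(scores, nominal)
```

With these made-up scores, the student growth component carries nearly all of the effective weight even though the nominal weights are equal, because it varies far more across educators than the observation ratings do.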
Standards of Professional Practice
The Framework uses InTASC Standards as the measurement framework for evaluating teachers
relative to the four domains of effective teaching. This recommendation is based on the research
base supporting this framework and the extensive materials available to support its use and
professional development. Local districts may adopt tools or approaches to add more specificity
to the InTASC Standards, but the Advisory Committee recommends requiring that any
framework used document the research supporting its use and provide the specifications
necessary to support reliable and valid measurement of teacher practices. The specific InTASC
Standards, grouped by domain, are presented below. For a more complete explanation of the
standards, please refer to the InTASC document referenced in the footnote.
Learner and Learning
Standard #1: Learner Development. The teacher understands how learners grow and develop,
recognizing that patterns of learning and development vary individually within
and across the cognitive, linguistic, social, emotional, and physical areas, and
designs and implements developmentally appropriate and challenging learning
experiences.
1 Council of Chief State School Officers. (2011, April). Interstate Teacher Assessment and Support Consortium
(InTASC) Model Core Teaching Standards: A Resource for State Dialogue. Washington, DC: Author.
http://www.ccsso.org/Resources/Programs/Interstate_Teacher_Assessment_Consortium_(InTASC).html
Standard #2: Learning Differences. The teacher uses understanding of individual differences
and diverse cultures and communities to ensure inclusive learning environments
that enable each learner to meet high standards.
Standard #3: Learning Environments. The teacher works with others to create environments
that support individual and collaborative learning, and that encourage positive
social interaction, active engagement in learning, and self motivation.
Content Knowledge
Standard #4: Content Knowledge. The teacher understands the central concepts, tools of
inquiry, and structures of the discipline(s) he or she teaches and creates learning
experiences that make the discipline accessible and meaningful for learners to
assure mastery of the content.
Standard #5: Application of Content. The teacher understands how to connect concepts and
use differing perspectives to engage learners in critical thinking, creativity, and
collaborative problem solving related to authentic local and global issues.
Instructional Practice
Standard #6: Assessment. The teacher understands and uses multiple methods of assessment to
engage learners in their own growth, to monitor learner progress, and to guide the
teacher’s and learner’s decision making.
Standard #7: Planning for Instruction. The teacher plans instruction that supports every
student in meeting rigorous learning goals by drawing upon knowledge of content
areas, curriculum, cross-disciplinary skills, and pedagogy, as well as knowledge
of learners and the community context.
Standard #8: Instructional Strategies. The teacher understands and uses a variety of
instructional strategies to encourage learners to develop deep understanding of
content areas and their connections, and to build skills to apply knowledge in
meaningful ways.
Professional Responsibility
Standard #9: Professional Learning and Ethical Practice. The teacher engages in ongoing
professional learning and uses evidence to continually evaluate his/her practice,
particularly the effects of his/her choices and actions on others (learners, families,
other professionals, and the community), and adapts practice to meet the needs of
each learner.
Standard #10: Leadership and Collaboration. The teacher seeks appropriate leadership roles
and opportunities to take responsibility for student learning, to collaborate with
learners, families, colleagues, other school professionals, and community
members to ensure learner growth, and to advance the profession.
Performance Standards
All Wyoming schools, as determined by their districts, will classify all licensed personnel, as
illustrated by the Framework, as highly effective, effective, needs improvement,
or ineffective based on data from measures of the standards for professional practice and
measures of student performance. The evaluation system will produce an overall rating for each
teacher. To arrive at an overall rating, the types of knowledge, skills, dispositions, and
behaviors that characterize an “effective” teacher (as well as teachers at the other
levels) must be described. Further, if there is any hope of comparable ratings across the state,
common performance level descriptors must be used. Performance standards describe “how
good is good enough” and the “performance level descriptor” (PLD) is the narrative component
of the performance standard that describes the key qualities that differentiate educators at each of
the various levels.
The InTASC Standards provide performance descriptors for each of the ten standards, but they
do not provide an overall description for various levels of teacher effectiveness. One might ask,
why not require educators to meet the requirements on each of the ten standards in order to be
classified as effective? Such a conjunctive system where candidates must meet every threshold
in order to be classified as “effective” is both unrealistic and unreliable. No Child Left Behind’s
(NCLB) Adequate Yearly Progress (AYP) system is the most recent, well known example of a
conjunctive system that leads to many unreliable and invalid decisions. Therefore, a more
compensatory approach where stronger performance in one area may offset weaker performance
in other areas is more reliable and often much more realistic. Further, hybrid systems can clearly
value important aspects of the domain while allowing some compensatory decisions elsewhere in
the system. Therefore, an educator evaluation system that results in an overall classification for
each educator must also include an omnibus description of educator effectiveness. This
definition is also critical to help guide the data collection and validity evaluation of the system.
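The contrast among conjunctive, compensatory, and hybrid decision rules can be sketched in a few lines. This is an illustration under stated assumptions, not a rule proposed by the Advisory Committee: the 1-4 rating scale, the cut score of 3.0, the use of a simple average, and the function names are all hypothetical.

```python
def conjunctive(scores, cut):
    """Effective only if every standard meets the cut score."""
    return all(s >= cut for s in scores)


def compensatory(scores, cut):
    """Effective if the average meets the cut, so strengths offset weaknesses."""
    return sum(scores) / len(scores) >= cut


def hybrid(scores, cut, required, floor):
    """Compensatory overall, but standards at the `required` indexes must each meet `floor`."""
    return compensatory(scores, cut) and all(scores[i] >= floor for i in required)


# Hypothetical ratings on the ten standards (index 0 is Standard #1).
ratings = [4, 4, 3, 4, 3, 2.5, 4, 3, 4, 3]

meets_conjunctive = conjunctive(ratings, cut=3.0)
meets_compensatory = compensatory(ratings, cut=3.0)
# Hybrid rule that treats Standard #4 (index 3) as non-negotiable.
meets_hybrid = hybrid(ratings, cut=3.0, required=[3], floor=3.0)
```

Here a single rating of 2.5 on one standard fails the educator under the conjunctive rule, while the compensatory and hybrid rules classify the same educator as effective because the average is 3.45 and the required standard is met.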
The Framework provides PLDs for each of the four overall levels of the system. These
descriptors connect the standards for professional practice with the various data produced by the
measurement instruments used in the system. This overall description is necessary, because an
effective teacher is not necessarily a simple sum of the scores on the various
components/indicators in the system. Further, defining an effective teacher as one who is
effective on each component will establish a “conjunctive” system (e.g., NCLB-AYP) with the
potential negative consequence of having very few teachers classified as effective or highly
effective. A DRAFT PLD for an effective teacher in WY is as follows:
Effective teachers in Wyoming have the knowledge, skills, and commitments that
ensure meaningful learning opportunities for all students and high rates of
growth for most students. Effective teachers facilitate mastery of content and skill
development, and identify and employ appropriate strategies for students who are
not achieving success. Effective teachers communicate high expectations to
students and their families and find ways to engage them in mutually-supportive
teaching and learning environments. They also develop in students the skills,
interests and abilities necessary to be lifelong learners. Because effective
teachers understand that the work of ensuring meaningful learning opportunities
for all students cannot happen in isolation, they engage in collaboration,
continuous reflection, on-going learning and leadership within the profession.
The Advisory Committee (or subcommittee) should craft PLDs for each of the remaining
performance levels in the WY Framework. The Advisory Committee strongly endorses
employing a set of common performance descriptors for WY in order to promote comparable
expectations for educators across districts.
General Evaluation Framework
The general measurement framework describes the overall approach for how local districts
following the Framework would approach the data collection involved in evaluating educators.
The measurement framework follows from the key principles outlined at the beginning of this
document. There are four domains of educator practice along with evaluations based on student
achievement. The general measurement framework is tied to this overall depiction, but provides
more structure for the Framework and perhaps local instantiations of the Framework. All
evaluations, conducted using the Framework, shall include:
Professional practice measures
Multiple approaches and measures will be used to collect data on educator practices to best tailor
the data collection approaches to the complex nature of teaching practice. Each educator shall conduct
a self-assessment each year that will be used as the foundation of a goal setting meeting with the
principal and/or peer coach (mentor). The self assessment and collaboratively established goals
will be used to focus the professional practice data collection for the year in which the educator
is being formally evaluated. For the years in which the educator is not undergoing a formal
evaluation, the self assessment and goals shall be used to guide professional development and
formative evaluation. Data related to professional practices shall be collected using:
A focused professional portfolio used to document specific goals and artifacts related to
these goals, and
Observations of practice by educational leaders and potentially peers.
Measures of student performance
Student Learning Objectives
Student Growth Percentiles (if applicable)
The SLO and/or SGP results may be “shared” among multiple educators depending upon
local theories of action around school improvement.
As part of the general measurement approach, the Framework includes the use of multiple
measures of each domain when possible and when the use of the multiple measures improves the
validity of the evaluation decision. In addition to multiple measures, the Advisory Committee
recognizes the challenge of having enough expertise and time in any single individual to conduct
all required evaluations. Therefore, the Framework calls for the use of peer teams, in addition to
building-level administrators, to participate and advise in the evaluation process.
The Advisory Committee further recommends that at least part of the SLO and/or SGP results be
shared among multiple educators depending upon local theories of action around school
improvement.
The Professional Portfolio
The professional portfolio is a critical component of WY’s Framework and contributes data to
multiple domains of teacher practice. All educators are required to establish yearly professional
goals in consultation with their supervisor or designee and document the process and products
associated with these goals through a professional portfolio that is reviewed each year. The
Wyoming Department of Education (WDE) or other designees will produce guidance outlining
the requirements of a professional portfolio to be used as a starting point for local requirements.
The Advisory Committee recommends that each educator maintain a professional portfolio that
includes the following components:
Documentation of self assessment
Documentation of collaboratively established specific goals
A plan, including identified professional development, for achieving the goals
Analyses of, among other things, key artifacts such as student work from
specific assignments, planning documents, and assessments related to the
established goals
Self reflection at the end of the year to self evaluate the extent to which the
specific goals have been achieved
Implementation and Differentiation [Note we need to address and avoid any potential contract
issues in this section]
The Advisory Committee has been sensitive to balancing the needs of creating a valid system
with an understanding that the system or one like it must be implemented by all school districts
without creating an unmanageable burden. While many states have required the full evaluation
of every teacher every year, the Advisory Committee recognized that this would place an
impossible and inefficient burden on WY schools. Therefore, the Advisory Committee
recommends that evaluations should be differentiated according to the experience and status of
the schools’ educators. Ultimately, each district shall enact a policy and set of procedures to
differentiate evaluation systems for its different classes of educators (e.g., novice, veteran, and/or
high performing, low performing) and to tailor them to the specific evaluation questions to be investigated.
Within the first three years of implementation, each educator shall undergo a full evaluation. To
the extent possible, yearly evaluations shall include multiple years of student performance
results.
Novice educators, defined as those within the first two years of the teaching profession, must be
evaluated every year until they are rated “effective” for three consecutive years. In order to be
granted professional (continuing contract) status, educators must be rated effective for three
consecutive years. These two events can happen concurrently. Districts may decide to focus
specific aspects of the evaluation for novice educators by reducing the demands of the
professional portfolio, for example.
Teachers with professional status (continuing contract) shall be evaluated every year until they
receive “effective” ratings or better for two consecutive ratings. Once these teachers receive two
consecutive effective ratings, they shall receive summative evaluations every three years. A
yearly evaluation schedule shall not be required as long as the educator continues receiving
effective or better ratings.
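The scheduling rules in the two preceding paragraphs can be expressed as a small decision function. This is a rough sketch under stated assumptions, not adopted policy: it assumes that a “highly effective” rating counts toward a novice's three consecutive “effective” ratings, and that a novice who reaches that mark moves to the same three-year cycle as teachers with professional status. The names below are hypothetical.

```python
EFFECTIVE_OR_BETTER = {"effective", "highly effective"}


def trailing_effective(ratings):
    """Count consecutive effective-or-better ratings at the end of the history."""
    count = 0
    for rating in reversed(ratings):
        if rating not in EFFECTIVE_OR_BETTER:
            break
        count += 1
    return count


def years_until_next_evaluation(ratings, professional_status):
    """Return 1 for a yearly schedule or 3 for the three-year cycle.

    ratings: chronological list of summative ratings, oldest first
    professional_status: True for continuing-contract teachers
    """
    # Assumption: two consecutive ratings for professional status, three for novices.
    needed = 2 if professional_status else 3
    return 3 if trailing_effective(ratings) >= needed else 1
```

For example, a continuing-contract teacher rated “needs improvement,” then “effective” twice would move to the three-year cycle, while a novice with only two “effective” ratings would still be evaluated the following year.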
Specific Measurement Framework
The specific measurement framework adds the details to the general measurement framework to
guide the data collection methods in order to successfully conduct educator evaluations. Such a
detailed measurement framework would describe the type and frequency of data collection
approaches for each of the major domains. The following paragraphs briefly highlight aspects of
the specific measurement framework, organized by major domain. Subsequent work will be
required to fully describe the specific measurement procedures and policies to be enacted for the
various educators in the system.
Domain 1: Planning and Preparation
A professional portfolio shall be required as evidence of educator performance related to
Domain 1 for each educator. Given the scope of Domain 1, each educator, along with her/his
evaluator (principal) shall identify the sub-components of the Domain that will be the focus of
the evaluation for that particular year. The focal sub-domains for the given year will determine
the specific data to be included in the portfolio. For example, if one of the foci was on planning
instruction, the teacher and evaluator might agree that a series of lesson and unit plans with
structured reflections would serve as useful entries in the professional portfolio. District
evaluation teams and building administrators will need to track the focus of each educator’s
portfolio each year to ensure that the planning and preparation domain is fully represented for
each educator over time.
Domain 2 (Classroom Environment) and Domain 3 (Instruction)
These domains generally require direct observation to collect evidence of the educator’s
successful mastery of these domains. The Advisory Committee recognizes that any manageable
schedule of observations when the system becomes operational will be necessarily “thin.”
Therefore, districts must think carefully about the nature and frequency of the observations. For
example, the Advisory Committee recommends that Novice and Ineffective teachers be formally
observed at least three times each year, while Effective educators may be observed at least three
times only in the year of their evaluation and less frequently during their “off years.”
In the years that the teacher is evaluated, teachers shall be observed formally on at least three
different occasions. The general time frame/unit of instruction for the observations shall occur in
consultation with the educator, but the specific lessons observed may be unannounced. At least
one of the observations, but preferably most of them, should be tied to aspects of the curriculum
that are the focus of the SLOs. Further, the observations shall include an analysis and discussion
of relevant documents associated with the unit of study being observed. These documents may
include lesson plans, assessments, assignments, student work, and other relevant documents
associated with the teaching, learning, and assessment of the unit. To improve coherence, at
least some of these artifacts or other documents should be included in the professional portfolio.
Domain 4: Professional Responsibility
Similar to Domain 1, professional responsibility cannot be evaluated with direct observation.
The Advisory Committee separated Domain 4 from Domain 1 in this discussion because the
Framework recognizes that the nature of professional responsibility will be quite different for
novice compared with experienced teachers. The professional responsibility for a novice
educator would tend to focus more on Standard 9 (Professional Learning and Ethical Practice),
while experienced educators should be expected to provide evidence for both Standards 9 and 10
(Leadership and Collaboration). For experienced educators, defining the specific aspects of
their professional responsibilities to be evaluated is a critical aspect of their goal setting. The
specific focus of the professional responsibility will guide the required data collection and
reflection.
Domain 5: Student Performance
As stated in the first guiding principle of this Framework, the primary purpose of Wyoming’s
educator evaluation and the reason for engaging in this work is to support and promote increases
in student learning in Wyoming schools. Therefore, it is critical that the results of student
achievement be incorporated in the evaluations of all educators. While this sounds so intuitively
simple, it is one of the most complex aspects of new forms of educator evaluation. The
Wyoming Framework uses a three part approach for incorporating student achievement and
growth into evaluations in order to attempt to maximize the benefits of doing so, while striving
to minimize potential unintended negative consequences.
Student Learning Objectives (SLO) form the foundation of Wyoming’s approach for
documenting changes in student performance associated with a teacher or group of educators
and, as such, all educators will have the results of SLOs incorporated into their evaluations. For
educators in “tested” subjects and grades (those grades and subjects for which there is a state
standardized test as well as a state test in the same subject in the previous year), student
performance will be evaluated using Student Growth Percentiles (SGP), and the results of SGP
analyses, along with SLO results, will be used in the evaluations of educators in tested subjects
and grades. Both SGP and SLO approaches are described in more detail below.
Both SGP and SLO approaches can be used to attribute the academic achievement and growth of
students to individual educators or to appropriate aggregations of educators such as grade or
content-level teams or even the whole school. Distributing student performance results to
multiple educators is referred to as “shared attribution.” The tradeoffs associated with shared
attribution are also discussed below.
Student Learning Objectives (SLO)
All teachers, whether in “tested grades and subjects” or not, shall be required to document student
academic performance each year using SLOs in accordance with Wyoming’s SLO guidance.
Both SGP and SLO analyses shall produce results in three classifications of performance, to the
extent possible, such as: high, typical/average, and low. The results of the SLO determinations
shall be incorporated into the evaluation of all educators according to the rules described below
in the section on combining multiple measures. [Note: A draft of the SLO Guidance is found in
Appendix A].
Calculating Student Performance Results in “Tested” Subjects and Grades
The growing interest in reforming long-standing approaches for evaluating and compensating
teachers has been characterized by, among other things, incorporating student performance results
in teacher evaluations. Advances in growth and value-added models in education have
contributed to the interest in using changes in student test scores over time as part of educator
accountability systems. Many districts, states, and non-governmental organizations have
embraced these test-based accountability initiatives, but the initial focus has been on the content
areas and grade levels for which there are state standardized tests, generally administered at the
end of each school year, or “tested” grades/subjects. Student performance, for the purposes of
educator evaluation, is generally evaluated using complex statistical models such as value-added
or student growth percentile models.
There are several possible approaches that Wyoming could use for evaluating student
performance in tested grades, but in order to adhere to the coherence principle, the Advisory
Committee recommends using for educator evaluation the same Student Growth Percentile model
currently used for the school accountability system. However, moving from school to teacher
accountability is not necessarily as simple as it sounds. Appendix B
outlines multiple considerations for using SGPs in educator evaluation.
WDE shall produce Student Growth Percentile (SGP) results documenting the individual
student and aggregate growth for students. These results will be aggregated according to
“teacher of record” rules as well as for the whole school. Further, results will be disaggregated
according to identifiable student groups in the school. All educators in “tested” grades and
subjects shall receive a report each year from WDE. These results, based on PAWS and
eventually Smarter Balanced Assessment Consortium (SBAC) test scores or another assessment,
using the SGP model, shall be incorporated into teachers’ evaluations either using a shared or
individual attribution framework.
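As an illustration of aggregating SGP results by teacher of record and for the whole school, the sketch below summarizes student-level percentiles with a median. The median is an assumption made here for illustration; this document does not specify the summary statistic, and the record layout and function name are hypothetical.

```python
from collections import defaultdict
from statistics import median


def aggregate_sgp(records):
    """Summarize student growth percentiles by teacher of record and school.

    records: list of (student_id, teacher_id, sgp) tuples, with sgp from 1 to 99
    Returns (median SGP per teacher, median SGP for the whole school).
    """
    by_teacher = defaultdict(list)
    for _student, teacher, sgp in records:
        by_teacher[teacher].append(sgp)
    per_teacher = {t: median(values) for t, values in by_teacher.items()}
    school = median([sgp for _student, _teacher, sgp in records])
    return per_teacher, school


# Hypothetical records for five students and two teachers of record.
records = [
    ("s1", "A", 35), ("s2", "A", 60), ("s3", "A", 72),
    ("s4", "B", 20), ("s5", "B", 48),
]
per_teacher, school = aggregate_sgp(records)
```

A shared attribution design would report the school-level (or team-level) figure to each educator in the group, while an individual attribution design would use the per-teacher figures.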
Shared Attribution
The Advisory Committee recognizes the challenges of properly attributing the results of student
performance to individual teachers. It is easy to think of many examples where it does not make
much sense to attribute the performance of students to any individual teacher, such as
when grade-level teams of teachers place students into differentiated instructional groups and
instruction is provided by educators other than the child’s regular teachers. Therefore,
the Wyoming Framework relies on a mix of shared attribution and individual attribution of
student performance results. The SGP results, based on state tests in grades 3-8 should,
depending on the specific theory of improvement for the particular school, be shared among
educators at the same grade and/or teaching the same subject areas. SLO results, assuming
groups of educators are working on the same SLO, may also be shared among educators at the
same grade and/or content area. However, SLOs allow for more control than state test results
and the Framework requires that at least some portion of the SLOs used to document student
performance be attributed to the individual educator of record. Like anything else in
accountability system design, there are both advantages and disadvantages to using shared
attribution.
One of the major concerns with attributing the results of student performance to individual
teachers is that many fear that this could erode collaborative cultures at many schools, especially
if the results are used in some sort of “zero sum game” accountability design. Shared attribution
approaches, if implemented sensibly, can help promote both collaboration and internal (to the
group of teachers) accountability orientations, both of which are associated with high
performing schools and organizations. Another concern for policy makers and accountability
system designers is the potential for unintended negative consequences of having the mathematics and
reading teachers in grades 4-8 evaluated in potentially very different ways than the other
70-75% of educators in the district. This could lead to higher rates of attrition from these subjects
and grades or perhaps a feeling of professional isolation. The requirement for all educators to
participate in the SLO process is one hedge against this potential problem. However, sharing the
results of all of the student performance indicators among multiple educators, as appropriate, is
one way to recognize the contributions of other educators to student performance, especially in
reading and math. Finally, one of the major concerns with tying student performance results to
individual teachers involves the reliability concerns when dealing with such small groups of
students. Aggregating the student performance results for multiple educators is one way to
ameliorate, but far from eliminate, the reliability challenges.
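The reliability point can be illustrated with a simplified sketch. It assumes, purely for illustration, that the result of interest is a simple mean of independent student scores with a common standard deviation; SGP summaries are computed differently, and the function name and numeric values below are hypothetical.

```python
import math


def standard_error_of_mean(score_sd: float, n_students: int) -> float:
    """Standard error of a mean of n independent scores sharing one SD.

    Simplifying assumption for illustration; not the SGP methodology.
    """
    if n_students <= 0:
        raise ValueError("n_students must be positive")
    return score_sd / math.sqrt(n_students)


# Hypothetical values: one classroom of 20 students versus a four-teacher
# grade-level team pooling 80 students. score_sd=20.0 is illustrative only.
single = standard_error_of_mean(20.0, 20)
pooled = standard_error_of_mean(20.0, 80)
print(round(single, 2), round(pooled, 2))
```

Under these assumptions, pooling four times as many students halves the standard error. The uncertainty shrinks but does not disappear, which is consistent with the caution above.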
This discussion could lead one to ask why, if shared attribution has so many advantages, a
system would include any other approach. Of course, there are potential disadvantages to
shared attribution too. One important disadvantage, which could be reduced with careful
design, is that educators may be held accountable for results over which they have little to no
control. This was a considerable criticism of Tennessee’s approach for including student
performance results in the evaluations of teachers from non-tested subjects and grades. This
threat is likely greatest when student performance on the state math and/or reading tests is
attributed to all educators in the school as opposed to a finer-grained aggregation. Another
potential disadvantage to shared attribution is that it may mask true variability in educator
quality. If we believe that educator quality is truly variable along a continuum of being able to
influence student performance, then pooling results among multiple educators could mask such
differences. Of course, being able to separate the “signal” (true variability) from the “noise”
(unreliability in the system) is not easy.
Therefore, the Advisory Committee recommends that sharing student performance results among
multiple educators should be based on more than just reliability concerns, but that such decisions
must be tied to local theories of improvement. For example, if the focus of improvement
activities is the grade level team, attribution should be shared among educators at that grade and
not at the whole school level. Therefore, the first step in implementing any sort of shared
attribution approach involves a careful articulation of the school’s locus of improvement actions.
This theory of improvement (action) should also make clear which subjects are shared and with
whom. For example, does the 5th grade team share both math and ELA results or just one
subject? Finally, while the Advisory Committee favors shared attribution approaches in many
cases and for at least some of the weight in the accountability determinations, it also
recommends that at least some of the changes in student performance be attributed to individual
teachers. This might best be accomplished with SLOs rather than SGPs because of the closer
ties to the specific course, but the Advisory Committee suggests leaving this specific decision to
local school districts.
Combining Multiple Measures
There are many approaches for combining multiple indicators to yield a single outcome:
compensatory, conjunctive, disjunctive, and profile methods. Compensatory means that higher
performance in one measure may offset or compensate for lower performance on another
measure. Conjunctive means that acceptable performance must be achieved for every measure
(e.g., AYP). Disjunctive means that performance must be acceptable on at least one measure. A
profile refers to a defined pattern of performance that is judged against specific performance
level descriptions. A profile approach is often operationalized using a matrix to combine
indicators for making judgments. Given the challenges involved in characterizing the
complexities of teaching, the Framework must employ a thoughtful approach for combining the
multiple sources of data in order to arrive at the most valid inferences about overall teacher
quality possible.
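As an illustration only, the first three combining rules can be sketched in Python. The function names, the 1-4 rating scale, the equal weights, and the cut score of 3 are assumptions made for this sketch and are not specified by the Framework.

```python
from typing import Sequence


def compensatory(scores: Sequence[float], weights: Sequence[float], cut: float) -> bool:
    """Weighted average must reach the cut, so a high score can offset a low one."""
    weighted_mean = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    return weighted_mean >= cut


def conjunctive(scores: Sequence[float], cut: float) -> bool:
    """Every measure must reach the cut."""
    return all(s >= cut for s in scores)


def disjunctive(scores: Sequence[float], cut: float) -> bool:
    """At least one measure must reach the cut."""
    return any(s >= cut for s in scores)


# Hypothetical ratings on three measures, each on an assumed 1-4 scale.
ratings = [4, 2, 3]
print(compensatory(ratings, [1, 1, 1], cut=3))  # mean of 3.0 reaches the cut
print(conjunctive(ratings, cut=3))              # the rating of 2 falls short
print(disjunctive(ratings, cut=3))              # the rating of 4 suffices
```

The same set of ratings passes under the compensatory and disjunctive rules but fails under the conjunctive rule, which shows how strongly the choice of combining rule can drive the outcome.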
A compensatory approach recognizes that some degree of variability in performance across
indicators may be expected. Such an approach has a higher degree of reliability because the
overall decision is based on multiple indicators evaluated more holistically. Conjunctive
decisions are less reliable because errors accumulate across multiple judgments, meaning a
teacher might fail to be classified as effective due to poor performance on the least reliable
measure. A conjunctive approach does not appear to make much sense for an educator
evaluation system. A disjunctive method is used when any one component is viewed as adequate
assurance the teacher met expectations. Again, this does not appear to make much sense in a
teacher evaluation system. Finally, profiles are useful especially when there are certain patterns
that can be described and that reflect valued performance but are not easily captured, usually
because the combinations of criteria are judged not to be equivalent.
These approaches should not be regarded as mutually exclusive. It is possible, for example, to
combine aspects of compensatory and profile ‘rules’ to arrive at a final result. For example, a
compensatory approach may be used to aggregate the data from the multiple measures within
any single domain, while a profile approach could be used to combine information across
domains. A major advantage of a profile or decision matrix approach is that once established,
the teacher can never receive an unexpected overall rating, whereas simple averages
characteristic of a compensatory approach can produce some surprising outcomes.
The Advisory Committee recommends using, as part of the Framework, an approach for
combining the various sources of information that avoids mechanistic approaches such as simple
averaging, but that takes into account the nature of the different sources of information. A
“panel” or “decision matrix” approach for combining the multiple measures allows the goals of
the system to be reflected explicitly and not buried in some numerical composite. An example of
such a panel approach is found below.
EXAMPLE Panel Approach for Combining Multiple Measures (based on an approximate
25/75 weighting between student performance and teacher practices)
“Professional Practice”     “Student Performance” Rating
Rating                      1                   2                   3
4                           Automatic Review    Highly Effective    Highly Effective
3                           Needs Improvement   Effective           Effective
2                           Needs Improvement   Needs Improvement   Needs Improvement
1                           Ineffective         Ineffective         Automatic Review
Again, this is just an example. If we want to include such detail in the Framework, we will need
to provide the details for combining across the various domains.
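A panel of this kind can be read as a simple lookup table. The sketch below encodes the example panel above; the names `PANEL` and `overall_rating` are hypothetical, and the cell values are taken from the example only, not from any adopted rule.

```python
# Rows: "Professional Practice" rating (1-4).
# Columns: "Student Performance" rating (1-3), in order.
PANEL = {
    4: ("Automatic Review", "Highly Effective", "Highly Effective"),
    3: ("Needs Improvement", "Effective", "Effective"),
    2: ("Needs Improvement", "Needs Improvement", "Needs Improvement"),
    1: ("Ineffective", "Ineffective", "Automatic Review"),
}


def overall_rating(practice: int, student_performance: int) -> str:
    """Look up the overall rating for one educator in the example panel."""
    if practice not in PANEL:
        raise ValueError("practice rating must be 1, 2, 3, or 4")
    if student_performance not in (1, 2, 3):
        raise ValueError("student performance rating must be 1, 2, or 3")
    return PANEL[practice][student_performance - 1]


print(overall_rating(3, 2))
print(overall_rating(4, 1))
```

Because every cell is written out in advance, an educator cannot receive an overall rating that the designers did not explicitly intend, and strongly discrepant combinations can be routed to review rather than averaged away.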
Supports and Consequences
Assumptions
As stated in the guiding principles, Wyoming’s Framework is being designed such that it can
support improvements in teaching and learning. As part of this design, the Advisory Committee
emphasizes the importance of reporting detailed and actionable information so that educators and
their leaders have the information they need to guide efforts to improve their practice. This
means that educators need to receive information on each of the indicators in the system, while
recognizing that the information at the indicator level is considerably less reliable than the total
evaluation. This will require that each local system is well documented, in terms of the
components and indicators outlined in this document, so that each local educator understands the
nature of the information on which they will be evaluated.
The WY Framework and all local systems must produce an overall effectiveness rating that
guides support, career development, and employment decisions. The overall rating can only be
an overall flag to guide support since the detailed information is really required to allow for
focused support and development.
Supports
A critical support is ensuring that each educator understands the rules by which they will be
evaluated. Therefore, each district shall develop and implement a process for training all
licensed personnel on the educator evaluation system including the consequences associated with
the ratings. Further, the district shall require all personnel conducting classroom observations to
undergo a defined training and qualification process.
In order to fulfill one of the major guiding principles that the system is being designed to
improve educators’ performance, the Framework requires that each Wyoming school district
must include a well-specified and formalized process of mentoring and support designed to
improve the performance of all educators in the district. The support and mentoring systems
should be designed in collaboration with teachers, administrators, and other key stakeholders
(e.g., parents, Board members) and based on research and documented best practices.
Additionally, all evaluators (administrators) must receive research-based training on how best to
share results of the evaluation system with those evaluated in order to support understanding of
the information and to improve practice.
Educators rated ineffective or needs improvement in one year must be placed on a directed
professional growth (improvement) plan that includes receiving targeted mentoring and support.
These support systems must be research-based to the maximum extent possible. Further, the
evaluations of the educators involved in a directed professional growth plan shall include
additional data sources such as video records of classroom teaching experiences. The video
recording of classroom teaching is designed to serve two purposes. First, it can be a very effective
feedback tool for all educators, but particularly for struggling educators if viewed with an expert
mentor. Second, the video evidence will allow for review by an appeals panel within the school
district to ensure the accuracy of the principal ratings for classroom performance.
Consequences
Ultimately, the system will lead to certain consequences for educators falling well below or well
above expectations. While the system is designed for improvement and a significant support
system is required to help struggling educators, there will likely come a point where educators
may need to be counseled out of the profession. The Framework includes the following
expectations for such eventualities:
1. An experienced educator with two consecutive years of ineffective ratings shall lose
her/his current (continuing contract) status and may be dismissed without additional
cause.
2. An educator with two consecutive years of needs improvement ratings shall be moved to
ineffective status.
3. An educator rated highly effective for two consecutive ratings should receive recognition
or reward, as determined by the local district, and may assume a “teacher leader role” as
part of the mentoring and support system.
4. Only educators with consistent ratings of highly effective may participate in the
evaluations of other educators in their district or building.
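The first three expectations can be sketched as a rule check on two consecutive ratings. This is an illustration under stated assumptions: the function name and the rating strings are hypothetical, the fourth expectation is omitted because “consistent” is not defined here, and the sketch does not address how a move to ineffective status under the second expectation interacts with the first.

```python
from typing import Optional

INEFFECTIVE = "ineffective"
NEEDS_IMPROVEMENT = "needs improvement"
HIGHLY_EFFECTIVE = "highly effective"


def two_year_consequence(previous: str, current: str, experienced: bool) -> Optional[str]:
    """Return the consequence triggered by two consecutive identical ratings.

    Returns None when the listed expectations do not apply.
    """
    if previous != current:
        return None
    if current == INEFFECTIVE and experienced:
        return "loses continuing contract status; may be dismissed without additional cause"
    if current == NEEDS_IMPROVEMENT:
        return "moved to ineffective status"
    if current == HIGHLY_EFFECTIVE:
        return "recognition or reward; may assume a teacher leader role"
    return None


print(two_year_consequence(NEEDS_IMPROVEMENT, NEEDS_IMPROVEMENT, experienced=True))
print(two_year_consequence("effective", HIGHLY_EFFECTIVE, experienced=True))
```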
Implementation Recommendations
The Advisory Committee, as can be seen from the preceding discussion, has been very
thoughtful about designing a framework for educator evaluation in Wyoming. We have
attempted to outline a clear approach to addressing the complexities for designing and
implementing educator evaluation systems in Wyoming. However, the Advisory Committee
wants to stress that there are enormous challenges to implementing such systems in any locale.
One positive aspect of having Wyoming follow other states and districts in this work is that we
have the opportunity to learn from the experiences of others. One of the most striking things
being learned is that significant time and thoughtfulness are needed to implement these systems
well. Further, the odds of getting things wrong are much greater than the odds of getting things
right when these systems are rushed into operational practice too soon.
This would be true under conditions where the state standards and assessment systems were
stable. As we know, both the standards and assessments have been or are in the process of being
revised. Further, the Common Core State Standards call for deeper levels of understanding on
the part of students than ever before. Shifting instructional practices and curriculum will require
considerable effort on the part of local school districts. Adding requirements for a new school
accountability system will further stress systems. Therefore, the Advisory Committee strongly
recommends the proposed educator evaluation system be implemented thoughtfully with an
extended pilot period to both gradually implement the system and to allow for formative
feedback to make adjustments to the system before it is implemented for high stakes.
APPENDIX A:
Student Learning Objectives: Guidance
Introduction
The Wyoming Accountability Advisory Committee recommends the use of Student Learning
Objectives (SLOs) to document educators’ contributions to student performance in both “tested”
and “non-tested” subjects and grades. SLOs are content- and grade/course-specific measurable
learning objectives that can be used to document student learning over a defined period of time.
In essence, educators establish learning goals for individual students or groups of students, monitor
students’ progress toward these goals, and then evaluate the degree to which educators help
students achieve these goals. This is a key advantage of the SLO approach. It is designed to
reflect and incentivize good teaching practices such as setting clear learning targets,
differentiating instruction for students, monitoring students’ progress toward these targets, and
evaluating the extent to which students have met the targets.
There are several important considerations for employing SLOs in educator evaluations. First,
the quality of the objectives and the validity of the inferences that can be made from the SLO
process must be assured. Second, the process by which the objectives are established must be
considered if the objectives are to be seen as fair for all educators. Third, the measurement approaches
and tools must enable educators and their evaluators to judge the extent to which educators have
met their objectives. Finally, the oversight and support, especially the professional development
necessary to help educators and administrators learn how to set and evaluate meaningful
objectives, and the cross school/district monitoring will be critical to assure fairness and rigor
within and across schools and districts.
While many have an interest in developing “growth-based” SLOs (i.e., measuring the change in
student achievement over two or more points in time), most will be “status-based,” usually
roughly conditioned on estimated initial understanding, then evaluating the degree to which
students reach specific targets on the measurement at the end of the instructional period. This
distinction between growth and status SLOs is discussed in more detail in Marion, et al. (2012).²
This section of the report will help guide educators and administrators in designing and
implementing a local SLO process. It is divided into four sections: 1) The Objectives; 2)
The Objective Setting Process; 3) Assessment/Measures; and, 4) Oversight and Support. Each
section provides both recommendations and a rationale for the recommendations. To the extent
applicable, reference is made to the distinction between the early implementation years
and a more complete operational system.
² Marion, S., DePascale, C., Domaleski, C., Gong, B., and Diaz-Bilello, E. (2012, May). Considerations for analyzing educators’
contributions to student learning in non-tested subjects and grades with a focus on Student Learning Objectives.
http://www.nciea.org/publication_PDFs/Measurement%20Considerations%20for%20NTSG_052212.pdf
The Objectives
The number and specificity of the objectives are important considerations in terms of
maximizing the validity of the evidence regarding the claims one is trying to make as a result of
the SLO process. At a minimum, evaluators are at least implicitly claiming that the results of the
SLO determinations for a given time period are a fair and valid depiction of the learning results
of an individual or group of students associated with a particular educator or educators. The
intention is to clearly use the results of the SLO process as evidence of the quality of a particular
educational experience in a particular setting.
SLOs will work best if they are situated within the theory of action or theory of improvement for
the particular school. In order to help ensure the validity of the claims about educators from the
SLO process, it is important to use objectives that are sufficient in number and representativeness to
ensure that the domain of the course is appropriately sampled, but not so many objectives that
certain objectives become trivialized. As such, educational leaders should consider requiring
that at least a portion of the SLOs in the building will be shared among a group of educators
(e.g., grade level team). Further, while most SLOs will be tailored to the specific learning targets
in the particular class or course, district and school leaders should work to have SLOs related to
overall school improvement goals to the extent practical. The following recommendations are
designed to maximize the validity of the inferences from the SLOs related to educator quality
while trying to manage the implementation challenges of a new SLO process.
1. All non-administrator educator evaluations shall include a minimum of two individually based
SLOs for each individual educator in a building during the first pilot year. By the
first operational year up to four SLOs per teacher should be the requirement to ensure
that the subjects and grades are more appropriately represented in the complete set of
SLOs. Reliability concerns can be mitigated by:
a. Using multiple measures for each SLO, and,
b. Increasing the number of SLOs, each with its own measure.
2. The objectives shall be established as “close to the individual students as possible.” This
may involve establishing subgroup, overall class or school goals, and then allowing
variation from these goals based on the current achievement levels of individual or
groups of students.
3. Objectives for each educator should be as representative of the set of courses/subjects
they teach as possible. For example, a middle or high school teacher should have
objectives from multiple sections or courses. This does not mean that every
course/section is represented, but there should be an effort to ensure such representation
over time. Similarly, objectives for elementary school teachers should be as
representative as possible for the subjects that these teachers teach.
4. The objectives shall be linked to the appropriate specific content and skills from the
Wyoming Content Standards and/or course standards. The SLOs should be targeted to
“enduring understandings” or high priority standards. In other words, given the limited
number of student learning objectives for each teacher, they should be tied to the most
critical learning outcomes. It will be important for educators to focus on the most
important outcomes and be cautious not to narrow the curriculum.
5. Each educator shall participate in at least one shared or aggregate objective. This may be
in alignment with a school wide goal or could be a grade level or content area goal
(typically for middle or high school). This should be based on a theory of
action/improvement that leaves the school and district able to decide on the appropriate
aggregation (e.g., grade level teams) based on school/district philosophy. For example,
most schools have “literacy across the curriculum” initiatives in place and it will make
sense to maintain focus on such initiatives through the SLO process.
6. Objectives for each individual educator, and especially the shared/aggregate objectives,
should reflect consideration of the overall school improvement plan.
7. Growth-based objectives should be encouraged and employed only where possible to do
so in technically defensible ways (Marion, et al., 2012).
8. The objectives should be ambitious, but realistic. Further, the objectives should be rich
enough such that educators are not simply classified as having met or not met the specific
objectives. The student learning objectives should be tied to a rubric of performance that
includes at least three or four levels. The objectives should be able to produce nuanced
results such as “clearly not met,” “partially met,” “met objective,” and “exceeded
objective,” as categories of performance. Such an approach will encourage objectives
rich enough to support such a scoring scheme and will hopefully maximize the chances of
capturing the true variance in educators.
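The four-level scoring scheme in recommendation 8 can be sketched as follows. The Framework does not set cut points, so the thresholds of 50%, 75%, and 90% of students reaching their targets are hypothetical placeholders that a district would need to replace with its own rubric; the function name is likewise an assumption.

```python
from typing import Sequence

LABELS = ("clearly not met", "partially met", "met objective", "exceeded objective")


def slo_category(share_meeting_target: float,
                 cuts: Sequence[float] = (0.50, 0.75, 0.90)) -> str:
    """Map the share of students reaching their targets to a rubric level.

    The default cuts are hypothetical and are not taken from the Framework.
    """
    if not 0.0 <= share_meeting_target <= 1.0:
        raise ValueError("share_meeting_target must be between 0 and 1")
    if len(cuts) != len(LABELS) - 1 or list(cuts) != sorted(cuts):
        raise ValueError("cuts must be three ascending thresholds")
    # Count how many thresholds the share reaches; that count indexes the label.
    level = sum(share_meeting_target >= cut for cut in cuts)
    return LABELS[level]


print(slo_category(0.80))
```

A rubric with more than two levels, as here, avoids reducing each objective to a simple met or not met classification.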
The Objective Setting Process
The process of setting the student learning objectives is critical to the fairness, educator buy-in,
and manageability of the SLOs. A process should be established so that educators are held to
similar levels of rigor at least within a school building. The focus should be on trying to
implement as comparable a process within each school as possible. Hopefully in the long run,
this comparability will expand across the district. If SLOs are to lead to the improvements in
student learning that many hope to see, educators should fully participate in the process and not
“have SLOs done to them.” The following recommendations are designed to address these
concerns.
1. Each district shall establish a framework for ensuring that objectives across the district
are as comparable as possible. Participating on statewide peer teams to set objectives for
content areas may be an option for districts to consider. Further, the principal or her/his
designees shall consider comparability when approving all objectives in the building.
2. Generally, the school principal is legally responsible for the evaluation of all personnel in
the building and therefore should approve all objectives. However, the principal,
especially at the secondary level, should consider employing a team approach to take
advantage of distributed leadership and expertise. Having a single point person (or team)
can help ensure the comparability of SLOs across the school building.
3. In addition to school administrators, teams of educators shall be involved in establishing
both shared and individual teacher objectives. Team members may include: members
of the same academic department, grade level colleagues, district content area experts,
and other qualified individuals. This recommendation is designed to address three major
concerns: content knowledge, comparability, and buy-in.
4. Each educator shall have considerable say in establishing her/his objectives. Shared
district objectives can influence educator SLOs, but with administrator approval,
significant input is appropriate to better fit the needs of the educators’ particular classes.
5. Relevant performance data on students for whom objectives will be set as well as data
from the same course in prior years shall be used to assist in establishing meaningful
objectives. Student information and longitudinal information as well as information from
the same course in previous years shall be used if available.
6. The objectives for each course should be established within six weeks of the start of the
course.
Assessments/Measures
Even with rigorous and appropriate learning goals, SLOs may be meaningless without high
quality measures to evaluate the degree to which students achieved these learning goals. In fact,
the quality of the measures may be the Achilles Heel in the entire SLO process, because outside
of a few core content areas, the quality of the available measures is quite variable at best.
However, rather than using concerns about potential measures as a reason to abandon the SLO
process, we should use the SLO approach as motivation to upgrade the quality of measures and
assessments available for teachers to be able to document student learning.
Educators should rely on the best measures available to evaluate the specific SLOs. The use of
the measures should be driven by the fit between the particular learning targets and the
assessments used to evaluate the SLOs. The highest quality assessments should be used to
evaluate the SLOs, but these assessments should be the ones that best match the specific learning
targets. It will be a challenge in the early years to find high quality assessments to evaluate the
SLOs, but this should be seen as an opportunity to improve the quality of local assessments.
This is one of the main reasons why it makes sense to focus first on status-based SLOs. It will
be hard enough to develop or select at least one high quality assessment to evaluate SLOs
without the challenge of needing to find both a high quality pretest and posttest (again, see
Marion, et al., 2012). The following recommendations are intended to help guide the assessment
component of the SLO process.
1. State standards-based assessments shall be used to evaluate the teachers’ contributions to
student performance in the subjects and grades where such assessments are available.
This recommendation allows for local districts to decide to use an SLO process to
contextualize the student assessment results or the district can choose to use a more
conventional test-based approach. Supplementing the use of student growth percentiles
(SGPs) in tested subjects and grades with a small set of SLOs can provide another set of
measures to broaden the assessment information for each educator.
2. When state assessments are not available, which is the case for all non-tested subjects and
grades, schools and districts will have to choose another method for measuring the SLOs.
Common benchmark tests created by the district or other entities shall be used to evaluate
SLOs to the extent that the assessment provides a valid measure of the learning
objectives. Determining what constitutes a valid measure of the learning objectives is not
an easy task and there will be other resources available, such as quality criteria for
assessments, to help districts evaluate the technical quality of various assessments.
3. WDE and a consortium of districts shall be encouraged to facilitate the development of
resources/tools (e.g., common rubrics, common assessments) as examples to aid in the
assessment of SLOs in non-tested subjects and grades. It makes little sense for every
district to tackle this challenge on its own, so this recommendation is intended to
encourage cross-district collaboration to build higher quality assessments for SLOs than
would be possible if each district were working on its own. Because we are concerned
about the cost, both in terms of time and money, of creating new common assessments
for courses and grades where there are currently no state-supported assessments, criteria
for quality student assessments will be established, frameworks and examples will be
provided, and local districts and schools will be provided professional development on
creating quality assessments. This is an important aspect of building professional human
capacity.
5. Educator performance on the SLOs should generally be scored using at least three or four
categories of performance (e.g., exceeded SLO, met SLO, partially met, and did not meet
SLO).
Oversight and Support
Designing and implementing an SLO process assumes that teachers and leaders have the
knowledge and skills to establish appropriate objectives, locate or develop assessments suitable
for measuring student learning relative to these targets, and evaluate educator performance
according to how well the students performed. Educators will need professional development to
gain the knowledge and skills necessary to sustain wide-scale implementation of the SLO
process. Further, some level of monitoring and oversight at the state level is necessary to
promote comparability in SLO processes and outcomes. Comparability of SLOs and SLO
outcomes is a major concern of the Wyoming Advisory Committee. As such, the
recommendations discussed below are intended to help ensure comparability of goals and
objectives starting from the classroom (i.e., multiple SLOs within the same classroom and across
classrooms should be comparable) to the school, district, and state. The recommendations that
follow are intended to address the support necessary to successfully implement an SLO approach
for documenting educator contributions to student learning as well as to provide guidance around
the type of monitoring and support the Advisory Committee recommends for the state and
districts.
1. WDE, based on recommendations from the Advisory Committee, shall create clear
guidance for creating a local SLO process that includes the items described in this
document. This guidance shall describe criteria for developing and evaluating high
quality SLOs and should provide examples of both high quality and weaker (for contrast)
SLOs.
2. A State SLO Advisory Review Committee shall be established to review and support the
SLO process including evaluating the quality and rigor of objectives, assessment
measures, and performance expectations (what counts as “good enough”). This SLO
Advisory Review Committee will be designed to ameliorate differences in SLOs across
districts due, in part, to differences in district capacity. At a minimum, districts shall
conduct such processes across schools within their districts.
3. WDE along with contributing schools and districts shall develop a resource bank of
exemplar SLOs and potential assessment instruments.
4. Each district, with WDE support, shall design a structure and process for providing
professional development on the development of an SLO process for its educators and
administrators. This shall include training for educational leaders on how to work with
their teachers in establishing meaningful and rigorous learning objectives, how to
establish and support peer teams, and how to determine what types of assessments are
suitable for evaluating SLOs. The support for educators shall include training on how to
use data to establish learning objectives, determine the appropriateness and
meaningfulness of targets, monitor student progress toward the targets, and use
assessments to evaluate the degree to which students met the targets.
5. As part of the pilot of the educator evaluation system, special attention should be devoted
to the ways that student growth measures work within the systems. The results of the
pilot process shall be reported and used to inform subsequent modifications to the SLO
process and the weighting of student growth in the Wyoming evaluation system.
APPENDIX B:
Considerations When Calculating Student Performance Results in “Tested” Subjects and
Grades
Incorporating the results of student achievement tests requires the Advisory Committee to
consider and make recommendations about several important issues. The following pages lay
out many of these considerations to provide background information for decisions to be made by
the Advisory Committee.
Tests Included
It is assumed that the grade/subject tests included in the Wyoming Framework will be the same
as those included in the school accountability system. Creating as much overlap as possible
among the set of included tests is a desirable feature of coherence. The proposed school
accountability system to meet the requirements of WEA 65 includes academic growth based on
state assessment results (PAWS currently) in grades 4-8 in reading and mathematics. Therefore,
these grades and content areas should serve as the basis for inclusion of SGP in the Wyoming
Framework as well.
Obviously, it is not desirable to exclude high schools from SGP calculations. The State Board
and the legislature are currently considering a plan for implementing end of course tests (EOC),
which may open up new options for calculating SGPs at the high school. However, calculating
growth at the high school level is extremely complex, particularly if, as expected, there is
variability in course sequence. Therefore, until we know much more about the developing high
school assessment system, the focus should be on grades 4-8 in reading and mathematics.
Teacher/Leader of Record
Another important consideration in operationalizing growth in Wyoming’s Educator Evaluation
Framework is determining which teacher/leader should be held accountable for a student’s
performance (leaving aside for the moment the discussion of shared attribution). A suitable
definition - and an accompanying data system that permits operationalization of this definition -
should establish the conditions and circumstances governing the connection of educators with
classes and account for the variety of learning environments in Wyoming’s schools. For
example, the Data Quality Campaign (DQC) (2010) advises states seeking to use assessment data
to inform educator evaluation to:
Account for contributions of multiple educators in a single course
Enable teachers to review rosters for accuracy
Account for schedule changes and variable class environments such as virtual classes or
labs
Link attendance records with teachers to track actual days of instruction
Based on the framework for defining teacher of record offered by DQC (2010b), the following
questions are important to address in order to arrive at an operational definition of
teacher/leader of record. Sample responses, intended only as ‘placeholders’ at this time, are
provided. It is recommended that the Advisory Committee carefully consider each.
What educators and leaders will be included?
o The primary educator who provides instruction contributing to and culminating in
the statewide PAWS test in reading or mathematics
o Elementary and middle school principals
o Other building level leaders/administrators whose role is primarily associated
with instruction
How much instructional time is required to establish a link?
o Teacher has primary responsibility for instruction in the class of record
o Minimum of 90 days of instruction (approximately half of the full academic year)
for the class of record
What prior measures will be required?
o At least one prior year summative state test score in the same content area
Will any courses/schools be specifically excluded and why?
Will any teachers/leaders be specifically excluded and why?
What is the minimum n size?
o Class and school growth estimates reported for groups of 20 or more students, but
multiple years of data can be aggregated to reach 20 students.
What is the inclusion rule?
o Class scores are not reported if contributing students represent fewer than 25% of
class size.
o School scores are not reported if contributing students represent fewer than 25%
of school size.
What students will be included?
o Students in grades 4-8 continuously enrolled for the full academic year in the
current year participating in the state PAWS in reading or math.
o All prior test scores in PAWS reading or math regardless of term of enrollment.
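The placeholder responses above can be read as a small set of business rules. The following Python sketch is one possible reading, offered only for illustration. The record layout (`StudentLink`), the function names, and the choice to apply the 90-day minimum to each student-teacher link are assumptions, and the thresholds simply repeat the placeholder values rather than adopted policy.

```python
from dataclasses import dataclass
from typing import List

# Thresholds copied from the placeholder responses above (not adopted policy).
MIN_INSTRUCTION_DAYS = 90    # roughly half of the academic year
MIN_N = 20                   # minimum group size for reporting
MIN_INCLUSION_RATE = 0.25    # contributing students as a share of class size


@dataclass
class StudentLink:
    """One student-teacher link (hypothetical record layout)."""
    student_id: str
    days_of_instruction: int
    full_academic_year: bool   # continuously enrolled for the full year
    has_current_score: bool    # current-year PAWS score is present
    prior_scores: int          # prior-year scores in the same content area


def contributes(link: StudentLink) -> bool:
    """True if this student's growth result can count toward the class."""
    return (
        link.days_of_instruction >= MIN_INSTRUCTION_DAYS
        and link.full_academic_year
        and link.has_current_score
        and link.prior_scores >= 1
    )


def class_is_reportable(links: List[StudentLink],
                        pooled_prior_years_n: int = 0) -> bool:
    """Apply the minimum-n and inclusion-rate placeholders to one class.

    pooled_prior_years_n is the number of contributing students carried in
    from earlier years when multiple years are aggregated to reach MIN_N.
    """
    if not links:
        return False
    n_contributing = sum(contributes(link) for link in links)
    inclusion_rate = n_contributing / len(links)
    enough_students = n_contributing + pooled_prior_years_n >= MIN_N
    return enough_students and inclusion_rate >= MIN_INCLUSION_RATE
```

Keeping the thresholds as named constants makes it straightforward to revisit each placeholder once the Advisory Committee has considered it.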
Missing/Incomplete Data
Another ‘data issue’ to address is missing and/or incomplete data. This situation exists when any
of the following occur:
One or more prior (pre) test scores are missing
The current year (post) test score is missing
The student is not continuously enrolled in a single building/class throughout the term of
instruction
The student record is missing or incomplete (e.g. test scores but no identifier)
Missing data can impact the precision and stability of the model and introduce systematic bias in
the resulting estimates (Braun et al., 2010). Moreover, it is generally acknowledged that data are
not Missing At Random (MAR), meaning that it is likely that the performance of students with
missing or incomplete data differs systematically from that of students with complete records. Consider,
for example, that mobility rates are typically higher for economically disadvantaged students
compared to other students.
There is no single best approach to dealing with missing data. It is recommended that
Wyoming take the following near-term steps:
Identify business rules to clearly define what data are usable and which are not.
Investigate the extent to which data are missing for districts, schools, and classes. Seek to
understand patterns of missing data for various levels of performance and by subgroup.
Such analyses will help determine the extent to which data are MAR or differ in a
systematic manner.
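The second step above, looking for patterns of missing data by subgroup, could begin with a simple tabulation. The Python sketch below is a minimal illustration under stated assumptions: the record fields (`frl`, `prior_score`, `current_score`) and the function name are hypothetical, and a missing value is assumed to be stored as `None`.

```python
from collections import defaultdict


def missing_rate_by_group(records, group_key, required_fields):
    """Return, for each group, the share of records missing any required field.

    records: iterable of dicts (hypothetical layout, one dict per student).
    group_key: field used to form subgroups, for example an economic
        disadvantage flag.
    required_fields: fields that must be present for a growth estimate.
    """
    totals = defaultdict(int)
    missing = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if any(record.get(field) is None for field in required_fields):
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


# Made-up records, for illustration only.
example_records = [
    {"frl": True, "prior_score": None, "current_score": 640},
    {"frl": True, "prior_score": 610, "current_score": 655},
    {"frl": False, "prior_score": 600, "current_score": 630},
    {"frl": False, "prior_score": 620, "current_score": 650},
]
rates = missing_rate_by_group(
    example_records, "frl", ["prior_score", "current_score"]
)
```

Large differences in these rates across subgroups would suggest that the data differ in a systematic manner and that the business rules for usable data deserve closer review.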
Multiple Educators and Shared Attribution
Another issue to consider is how to handle circumstances where students receive instruction
from multiple educators. This may be regarded as a special case of the teacher/leader of record
issue, but merits specific attention.
There are three general cases that lead to this occurrence. First, the student may receive planned,
ongoing instruction from multiple teachers, as with a team teaching approach or scheduled
support sessions. Second, changes can occur throughout the year, such as a leave of absence for
the primary instructor or a student’s transfer to another class. Finally, additional instruction
can occur in a variety of contexts, such as when a student receives tutoring outside of class.
Whatever the case, multiple sources of instruction will likely have an impact on student
achievement.
Some researchers have hypothesized that a ‘dosage’ model may be appropriate in such
circumstances. That is, if Ms. Smith provides 70% of instruction and Mr. Jones provides 30% of
instruction, then the outcomes are assigned to the educators consistent with the proportion of
instruction provided. While it may be useful to research the feasibility of this approach, the
following caveats should be considered:
It is unlikely that proportional contribution to instruction can be captured with precision,
particularly when it is unscheduled. Also, it will be necessary to create potentially
complex connections in the state data system to account for this.
The proportional contribution to instruction may not be governed by time alone. For
example, an hour spent introducing new concepts to a class may not represent the same
‘instructional contribution’ as an hour spent overseeing time allotted for student directed
study.
The research on attributing a student’s academic performance to teachers and leaders is
emerging – even for the least ambiguous circumstances when the teacher of record is well
defined. Much less is known about the credibility of results based on proportional
attribution of scores.
Therefore, we strongly recommend using the shared attribution framework discussed in the main
part of this document and basing decisions about which results are shared by which teachers on an
explicit theory of action or improvement for the school.
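To make the ‘dosage’ idea described above concrete, the following Python sketch weights each student’s result by the share of instruction an educator provided. It is not a recommended method, only an illustration of what proportional attribution would involve. The function name and input layout are hypothetical, and a weighted mean is used purely because it is the simplest weighted summary; it is not the median-based outcome discussed elsewhere in this appendix.

```python
def dosage_weighted_mean(results):
    """Weighted mean of student results for one educator.

    results: list of (student_result, share_of_instruction) pairs, where the
        share is the proportion of the student's instruction provided by
        this educator (for example 0.7 for Ms. Smith, 0.3 for Mr. Jones).
    """
    total_share = sum(share for _, share in results)
    if total_share == 0:
        raise ValueError("educator has no recorded share of instruction")
    weighted_sum = sum(result * share for result, share in results)
    return weighted_sum / total_share


# Made-up growth percentiles: one shared student and one full-time student.
smith = dosage_weighted_mean([(60, 0.7), (40, 1.0)])
jones = dosage_weighted_mean([(60, 0.3)])
```

Even in this simple form, the calculation depends entirely on the accuracy of the recorded shares, which is the first caveat raised above.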
Performance Standards
Coherence
In order to maximize the coherence between the school and educator evaluation systems, it is
desirable for performance expectations for growth at the class level to be similar by design to
growth targets at the school level. By so doing, the likelihood that outcomes will be favorable
for schools but not educators at that school (or vice versa) will be minimized. Additionally, it is
critical to ensure that the system does not create incentives that are in conflict.
More specifically, it is expected that the growth outcome for classes will be the median student
growth percentile (MGP) and that standards for meeting and exceeding targets will be coherent
with those established for the school. At the time of writing this document, the growth targets
for schools have not been finalized, but draft plans call for three categories of performance—
high, typical, and low—at the school level in grades 4-8 based on PAWS.
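As a rough illustration of the class-level outcome described above, the Python sketch below computes a median growth percentile and places it in one of three categories. The cut points (35 and 65) are placeholders chosen only for the example, since the school growth targets had not been finalized, and the function names are hypothetical.

```python
from statistics import median

# Hypothetical cut points; the actual targets were not finalized.
LOW_CUT = 35
HIGH_CUT = 65


def median_growth_percentile(student_sgps):
    """Median of the student growth percentiles for a class or school."""
    if not student_sgps:
        raise ValueError("no student growth percentiles supplied")
    return median(student_sgps)


def growth_category(mgp, low_cut=LOW_CUT, high_cut=HIGH_CUT):
    """Classify an MGP as low, typical, or high using the given cut points."""
    if mgp < low_cut:
        return "low"
    if mgp > high_cut:
        return "high"
    return "typical"


# Made-up class of five students.
class_mgp = median_growth_percentile([20, 45, 50, 62, 90])
class_category = growth_category(class_mgp)
```

Using the same function and cut points for classes and schools is one simple way to keep the two sets of performance expectations similar by design.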
Before moving forward with growth standards for the educator accountability system, there are
at least three critical considerations that should be addressed by the Advisory Committee. The
first is to determine the number and type of growth levels that need to be produced to support the
intended purposes and uses of the system. The second consideration is to explore the extent to
which the proposed growth rates are both attainable and meaningful at the class level. Based on
the documentation provided to date, it appears that the school targets were selected normatively.
That is, performance cutscores were selected based on the percentages of schools that would end
up in each of the three categories. However, it is less clear whether the growth rates in the proposed
meets and exceeds ranges are sufficient to establish meaningful growth, meaning growth that keeps
students on track to achieve or maintain proficiency or readiness.
Finally, it is important to deal with the inherent unreliability of class level outcomes. Given that
class level results will be much more variable and subject to sampling error than school level
results, mechanisms must be put in place to deal with the lack of stability of outcomes in order to
have a greater degree of confidence in the results. The following sections address these
issues.
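The greater variability of class-level results can be illustrated with a small simulation. The Python sketch below is a simplification under stated assumptions: student growth percentiles are drawn independently and uniformly from 1 to 99, the group sizes of 20 and 400 are hypothetical stand-ins for a class and a school, and the function name is invented for the example.

```python
import random
from statistics import median, pstdev


def spread_of_medians(group_size, replications=500, seed=0):
    """Standard deviation of the median growth percentile across simulated groups.

    Each replication draws group_size growth percentiles at random, so the
    result reflects sampling error alone, with no true differences in
    effectiveness between groups.
    """
    rng = random.Random(seed)
    medians = [
        median([rng.randint(1, 99) for _ in range(group_size)])
        for _ in range(replications)
    ]
    return pstdev(medians)


class_level_spread = spread_of_medians(20)     # a single class
school_level_spread = spread_of_medians(400)   # a school-sized group
```

Under these assumptions the medians for class-sized groups vary far more from one draw to the next than the medians for school-sized groups, even though no group is truly more effective than another.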
Reporting Outcomes
It is essential to determine the number and type of growth outcomes necessary to support the
purposes and uses of the educator evaluation system. In general, there is a tension between
reporting high-level results that are more reliable and the desire to report more nuanced but less
precise outcomes for multiple indicators. For example, there will be a much higher level of
confidence in classifications of class effects as low, typical, or high compared to class effects
described on a ten-point scale from 1 (ineffective) to 10 (highly effective). In the latter case,
stakeholders may regard this information as useful for understanding finer-grained degrees of
difference, but such a scale may carry only the appearance of precision that is not supported by
evidence, particularly for adjacent ratings. The same issue is generally true for reporting units.
That is, results for individual content areas or classes will be much less defensible (and results
based on strands or subscores will be almost certainly indefensible) than aggregate results for
multiple classes. The goal, of course, is to find the balance between the necessary specificity of
outcomes and an acceptable level of precision. As a matter of best practice, it is advisable to
privilege technical defensibility, in order to provide the best case for results to be meaningfully
interpreted and utilized.
Norm and Criterion Referenced Growth
Broadly, approaches to identifying growth standards can be characterized as either norm-
referenced or criterion-referenced. A norm-referenced approach compares student achievement
to an expectation often based on a distribution of observed performance. Alternatively,
criterion-referenced growth standards establish a specific target outcome. For example,
requiring students who are not proficient to grow at a rate such that they achieve proficiency in a
set amount of time is a criterion-referenced approach.
Each approach has advantages and limitations. Setting a norm-referenced expectation is useful
for identifying comparably high or low growth. Indeed, it seems intuitively reasonable to
describe valued growth as that which is significantly higher than that of other students. However, a
limitation is that some students who grow at very high rates relative to their peers may not
achieve proficiency in a reasonable amount of time. A criterion-referenced standard resolves this
potential ‘growth to nowhere’ problem, but raises a new issue: some students may be so far
below standard that even at exceptionally high rates of growth the student will not achieve
proficiency in a reasonable time frame. Particularly when growth is used for accountability
purposes, this can create a condition where some classes are uniformly disadvantaged.
Conversely, very high performing classes could exhibit little or no growth and meet standard.
While the Advisory Committee recommended blending both normative and criterion approaches
for evaluating growth for school accountability purposes, standards for growth in educator
evaluation systems should only be normative. This is due to the fact that students, rightfully so
in many cases, are not randomly assigned to teachers. Requiring teachers to equally advance
students toward meaningful outcomes (e.g., proficient) does not take into account that this is
much more challenging for students far below proficient than for students closer to the proficient
cut. However, expecting all teachers to have their students grow at meaningful rates compared
to each student’s academic peers in a normative sense is fairer to all educators in the system.
Reliability
Reliability refers to the consistency or stability of a measure. In this case, we are interested in
the reliability of the measures of teacher/leader effectiveness based on a system influenced by
growth estimates. Reliability is challenging in this context due to the error in achievement
measures and growth measures and the likely variation in the performance of teachers, about
which little is known. We know little, except anecdotally, about the extent to which
performance differs across content areas for the same teacher. For example, would we expect a
teacher to be effective in ELA but not math? If so, to what extent would the levels of
effectiveness differ? Further, how stable is teaching effectiveness across years? Could a teacher
be effective one year but not the next and if so, to what would we attribute this variability?
Ultimately, it is challenging to disentangle measurement error from true variation in
performance. In the end, an educator evaluation system is built on the assumption that
performance is “stable-enough” to reliably detect some differences in true effectiveness.
One way to mitigate issues of unreliability is to base overall outcomes on aggregations of results
within content areas for the current year and across multiple years. For example, if a teacher
teaches three sections of the same mathematics class, the median growth informing the
performance category is based on all students across sections. Additionally, if that teacher has
results for the prior year, the teacher’s outcome for the current year could be based on the median
of the two years combined. The idea behind this approach is to minimize uncertainty. The
reliability of overall outcomes will also be improved by the manner in which additional elements
aside from academic growth are incorporated into the system (e.g. professional practices), but
that will be addressed separately.
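The aggregation described above can be sketched as follows. This Python example pools every student across a teacher’s sections and years before taking the median, which is one reading of “the median of the two years combined.” The nested input layout, the function name, and the sample values are hypothetical.

```python
from statistics import median


def pooled_median_growth(sgps_by_year_and_section):
    """Median growth percentile over all students, pooled across sections and years.

    sgps_by_year_and_section: dict mapping a year to a dict that maps a
        section label to that section's list of student growth percentiles.
    """
    pooled = [
        sgp
        for sections in sgps_by_year_and_section.values()
        for section_sgps in sections.values()
        for sgp in section_sgps
    ]
    if not pooled:
        raise ValueError("no student growth percentiles to aggregate")
    return median(pooled)


# Made-up results for one teacher with multiple sections in two years.
teacher_results = {
    2011: {"Section A": [30, 40], "Section B": [50]},
    2012: {"Section A": [60, 70], "Section B": [80], "Section C": [55]},
}
two_year_mgp = pooled_median_growth(teacher_results)
```

Pooling students rather than averaging section medians lets larger sections carry proportionally more weight, and it increases the number of students behind each reported outcome.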
Shared and Individual Attribution of Student Performance Results
The Advisory Committee recognizes the challenges of properly attributing the results of student
performance to individual teachers. Therefore, the Framework relies on a mix of shared
attribution and individual attribution of student performance results. The SGP results, based on
PAWS tests in grades 3-8, should, depending on the specific theory of improvement for the
particular school, be shared among educators at the same grade and/or teaching the same subject
areas. SLO results, assuming groups of educators are working on the same SLO, may also be
shared among educators at the same grade and/or content area. However, SLOs allow for more
control than state test results and the Framework requires that at least some portion of the SLOs
used to document student performance be attributed to the individual educator of record.
References
Data Quality Campaign. (2010a). Strengthening the teacher-student data link to inform teacher
quality efforts. Retrieved from: www.DataQualityCampaign.org/resources/947.
Data Quality Campaign. (2010b). Developing a definition of teacher of record. Retrieved from:
http://dataqualitycampaign.org/files/Teacher%20of%20Record.pdf.
National Research Council. (2010). Getting value out of value-added. H. Braun, N. Chudowsky,
and J. Koenig (Eds.). Washington, DC: National Academies Press.