
Marygrace Ahern 2014-2015 Pg. 1

Kimberly Core Unit Plan

Unit Title: Statistics – Math 6th grade

Unit Author: Marygrace Ahern

Grade Level: 6

Content Area(s): Math

Time Frame: 4 – 5 weeks

Statistics – Math 6th grade

UNIT NAME: Statistics

CREATED BY: Marygrace Ahern

SUBJECT: Math

GRADE: 6th grade

Unit Rationale: Students will learn the skills to ask statistical questions; gather, organize, and display data; and compare data sets. Students will also be able to determine the shape of a data distribution and draw a conclusion about the meaning of that shape. Students will also understand the difference between a measure of center and a measure of variability.

Essential Question: How can research build my understanding of the world? What do I want to know? What makes a good research question? How do I know when I’ve answered it?

Naming Knowledge When identifying these skills, think about…

- Do these skills mirror what experts do in their discipline?

Mathematical Practices Domain and Grade Level Standards

CCSS.MATH.PRACTICE.MP1 Make sense of problems and persevere in solving them. CCSS.MATH.PRACTICE.MP2 Reason abstractly and quantitatively. CCSS.MATH.PRACTICE.MP3 Construct viable arguments and critique the reasoning of others. CCSS.MATH.PRACTICE.MP4 Model with mathematics.

Statistics and Probability: Develop understanding of statistical variability. CCSS.MATH.CONTENT.6.SP.A.1 Recognize a statistical question as one that anticipates variability in the data related to the question and accounts for it in the answers.

CCSS.MATH.CONTENT.6.SP.A.2 Understand that a set of data collected to answer a statistical question has a distribution which can be described by its center, spread, and overall shape.

CCSS.MATH.CONTENT.6.SP.A.3 Recognize that a measure of center for a numerical data set summarizes all of its values with a single number, while a measure of variation describes how its values vary with a single number.

Summarize and describe distributions.


CCSS.MATH.PRACTICE.MP5 Use appropriate tools strategically. CCSS.MATH.PRACTICE.MP6 Attend to precision. CCSS.MATH.PRACTICE.MP7 Look for and make use of structure. CCSS.MATH.PRACTICE.MP8 Look for and express regularity in repeated reasoning.

CCSS.MATH.CONTENT.6.SP.B.4 Display numerical data in plots on a number line, including dot plots, histograms, and box plots.

CCSS.MATH.CONTENT.6.SP.B.5 Summarize numerical data sets in relation to their context, such as by:

CCSS.MATH.CONTENT.6.SP.B.5.A Reporting the number of observations.

CCSS.MATH.CONTENT.6.SP.B.5.B Describing the nature of the attribute under investigation, including how it was measured and its units of measurement.

CCSS.MATH.CONTENT.6.SP.B.5.C Giving quantitative measures of center (median and/or mean) and variability (interquartile range and/or mean absolute deviation), as well as describing any overall pattern and any striking deviations from the overall pattern with reference to the context in which the data were gathered.

CCSS.MATH.CONTENT.6.SP.B.5.D Relating the choice of measures of center and variability to the shape of the data distribution and the context in which the data were gathered.

Academic Vocabulary: data, research

Content Vocabulary: measure of center, mode, median, balance point, mean, stem-and-leaf plot, histogram, dot plot, measures of variation or variability, distribution, deviation, absolute deviation, mean absolute deviation, range, quartiles, interquartile range, five number summary, box-and-whisker plot, outlier, frequency table

Culminating Activity: Students will work with a partner to create a research question, gather data, display data, and analyze their own data, as well as compare their own findings with the findings of other students. (Student description and rubric are included at the end of the unit plan.)

Project Description: Students will create a research question (for example, “What is the average number of siblings that 6th graders have?”), gather data through surveys or interviews, and organize their data using tables and graphs. They must use a dot plot, histogram, and a box-and-whisker plot to display the mean of their data, as well as the mean absolute deviation. Students will present their findings to the class, including a conclusion as to which measure of center best describes their data, and what the measure of variability means within the context of the research question.

Measurable Objectives:

- Students will be able to find the mean and median of data they have collected.
- Students will be able to find the mean absolute deviation and the quartiles of data they have collected.
- Students will create a dot plot, histogram, and box-and-whisker plot for their data.
- Students will determine the most meaningful measure of center and measure of variability for their data.

Frontloading KWL Chart Question – “How do you find an average?” and “What is an average?”

Sequencing Scaffolding Activities Students will go through a series of lessons that will prepare them with the skills they will need to participate successfully in the culminating project. Also, as a means of preparing for the culminating activity, students will participate in a 6th grade research project together. The research project will ask the question: “What is the average number of hours a 6th grade student spends on electronic devices?” Students will learn the process of data collection, as well as how to create a dot plot, a histogram, and a box-and-whisker plot. Students will then display their data as a class, and compare their findings to those of other classes. An entire 6th grade display will also be available, so students can see how their class compared with the sixth grade as a whole. This will be a grade level project, but students will be held responsible for individual graphs and summaries of results. In this way, their individual readiness for the culminating project can be determined.

Lesson 1: Statistical Questions – Duration 1 hour long class

Lesson Goal: Distinguish between statistical questions and those that are not statistical. Formulate a statistical question and explain what data could be collected to answer the question. Distinguish between categorical data and numerical data.


Intro Activity: “Exploring Statistics” Book One pgs. 4 – 6

Have students take out their journals to begin a new unit called Statistics (moved from lesson 2). Have them create two sections: the first will be “Statistics Vocabulary”; then have them skip 6 – 8 pages and create a second section called “The Picture of Data: Organizing and Displaying Data.”

Lesson: Example 1: What is a Statistical Question?

Show the students some baseball cards and pose this example. Jerome, a 6th grader at Roosevelt Middle School, is a huge baseball fan. He loves to collect baseball cards. He has cards of current players and of players from past baseball seasons. With his teacher’s permission, Jerome brought his baseball card collection to school. Each card has a picture of a current or past major league baseball player, along with information about the player. When he placed his cards out for the other students to see, they asked Jerome all sorts of questions about his cards. Some asked:

How many cards does Jerome have altogether?

What is the typical cost of a card in Jerome’s collection?

Where did Jerome get the cards?

Then, consider the questions that follow the description and ask the students:

Which of these questions do you think might be statistical questions?

What do you think I mean when I say “a statistical question”?

Students do not have a definition or understanding of what a statistical question is at this point. Allow them to discuss and make conjectures about what that might mean before guiding them to the following: A statistical question is one that can be answered with data and for which it is anticipated that the data (information) collected to answer the question will vary. Add this to the vocab section of the journal, followed by the two types of data collected from Example 2.

The second and third questions are statistical questions because the answer for each card in the collection could vary. The first question, “How many cards does Jerome have altogether?” is not a statistical question because we do not anticipate any variability in the data collected to answer this question. There is only one data value and no variability.

Convey the main idea that a question is statistical if it can be answered with data that vary. Point out to the students that variability in the data means that not all data values are the same. The question, “How old am I?” is not a statistical question because it is not answered by collecting data that vary. The question, “How old are the students in my school?” is a statistical question because when you collect data on the ages of students at the school, the ages will vary – not all students are the same age.

Ask students if the following questions would be answered by collecting data that vary:

How tall is your 6th grade math teacher?

What is your hand span (measured from tip of the thumb to the tip of the small finger)?

Ask students which of these data sets would have the most variability.


Number of minutes students in your class spend getting ready for school.

Number of pockets on the clothes of students in your class.

After arriving at this understanding as a class, post the informal definition of statistical question on the board for students to refer to for the remainder of the class. Pose Exercises 1 – 5 to students. These question sets are designed to reinforce the definition of a statistical question. The main focus is on whether there is variability in the data that would be used to answer the question. You may want to have students share their answers to Exercise 3 with a partner and have the partner decide whether or not the question is a statistical question.

Exercises 1–5

1. For each of the following, determine whether or not the question is a statistical question. Give a reason for your answer.
   a. Who is my favorite movie star? No, not answered by collecting data that vary.
   b. What are the favorite colors of 6th graders in my school? Yes, colors will vary.
   c. How many years have students in my school’s band or orchestra played an instrument? Yes, number of years will vary.
   d. What is the favorite subject of 6th graders at my school? Yes, subjects will vary.
   e. How many brothers and sisters does my best friend have? No, not answered by collecting data that vary.

2. Explain why each of the following questions is not a statistical question.
   a. How old am I? Not answered by data that vary.
   b. What’s my favorite color? Not answered by data that vary – I just have one favorite color.
   c. How old is the principal at our school? The principal has just one age at the time I ask the principal’s age, so the question is answered by data that do not vary.

3. Ronnie, a 6th grader, wanted to find out if he lived the farthest from school. Write a statistical question that would help Ronnie find the answer. What is a typical distance from home to school (in miles) for students at my school?

4. Write a statistical question that can be answered by collecting data from students in your class. What is the typical number of pets owned by students in my class? How many hours each day does a typical student in my class play video games?

5. Change the following question to make it a statistical question: “How old is my math teacher?” What is the typical age of teachers in my school?


To answer statistical questions, we collect data. In the context of baseball cards, we might record the cost of a card for each of 25 baseball cards. This would result in a data set with 25 values. We might also record the age of a card or the team of the player featured on the card.

Example 2: Types of Data

We use two types of data to answer statistical questions: numerical data and categorical data. If we recorded the age of 25 baseball cards, we would have numerical data. Each value in a numerical data set is a number. If we recorded the team of the featured player for 25 baseball cards, we would have categorical data. Although we would still have 25 data values, the data values are not numbers. They would be team names, which you can think of as categories.

What are other examples of categorical data? Eye color, the month in which you were born, and the number that may be used to identify your classroom are examples of categorical data.

What are other examples of numerical data? Height, number of pets, and minutes to get to school are all examples of numerical data. To help students distinguish between the two data types, encourage them to think of possible data values. If the possible data values include words or categories, then the variable is categorical. Suppose that you collected data on the following. What are some of the possible values that you might get?

Eye color

Favorite TV show

Amount of rain that fell during storms

High temperatures for each of 12 days

Have students complete the exercises to reinforce their understanding of the two types of data.

Exercises 6–7

6. Identify each of the following data sets as categorical (C) or numerical (N).
   a. Heights of 20 6th graders N
   b. Favorite flavor of ice cream for each of 10 6th graders C
   c. Hours of sleep on a school night for 30 6th graders N
   d. Type of beverage drunk at lunch for each of 15 6th graders C
   e. Eye color for each of 30 6th graders C
   f. Number of pencils in each desk of 15 6th graders N

7. For each of the following statistical questions that students asked Jerome, identify whether the data are numerical or categorical. Explain your answer, and list four possible data values.
   a. How old are the cards in the collection? Numerical data, as I anticipate the data will be numbers. Possible data values: 2 years, 2 1/2 years, 4 years, 20 years
   b. How much did the cards in the collection cost? Numerical data, as I anticipate the data will be numbers. Possible data values: $0.20, $1.50, $10.00, $35.00
   c. Where did you get the cards? Categorical, as I anticipate the data will be names of places (e.g., a store, a garage sale, from my brother, from a friend).

Homework/Activity: Additional problem sets can be found at the Engage NY webpage linked in Resources (pg. 17).

Resources: https://www.engageny.org/file/15836/download/math-g6-m6-teacher-materials.pdf?token=1MaOY3fBSNcufrNiv8u24x01jt77pU5ORNCO35a5ua8 (NYS Common Core Mathematics Curriculum, Lesson 3, 6•6. Date: 10/23/13. © 2013 Common Core, Inc. Some rights reserved. commoncore.org. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.)

Scaffolding Options/UDL: For ELL students, use Google Translate to translate notes, homework, and tests into Spanish. Provide guided notes to students in SPED and on 504s.

Idaho Core Standards Connection: CCSS.MATH.CONTENT.6.SP.A.1 Recognize a statistical question as one that anticipates variability in the data related to the question and accounts for it in the answers.

Formative/Summative Assessment: Additional problem sets can be found at the Engage NY webpage linked in Resources (pg. 17), which can be used as either formative or summative assessment.

Lesson 2: Measures of Center – Duration 1 – 2 hour long classes

Lesson Goals: Review the meaning of mean, median, and mode. Calculate and interpret the mean, the median, and the mode for a set of data.

Lesson: Begin with a KWL Chart: “How do you find an average?” and “What is an average?” Jot down everything students say and keep it for reference throughout the unit. Revisit it at the end of the unit to see what students have learned.

Give them vocabulary to add to their journal: measure of center (which tells you how data values are clustered or where the “center” of a graph of the data is located – there are 3 measures of center: mean, median, and mode), mode (data value or values that occur most often), and median (the middle number in a data set when the values are placed in order from least to greatest). These 2 measures of center are a review from 5th grade.

Have them gather their own data by doing the “It’s a Snap” activity in Exploring Statistics book 2 pg. 24. Collect the data and create a table of values for the class in the Pictures of Data section of the journal. As a review, have them find the median and mode of the data they gather. You can give them more data to work with if they need more practice. You can use a story problem with the data to give it context, such as the need for a coach to determine which player should have more game time in a championship game by using the scores for the past three games for 3 of his players. As a review, determine the median and mode for each player and determine which measure of center is most useful given the context of the story. As per our example, the three players and their scores are:

Josephine: 4; 6; 12; 12; 12; 26 (Median = 12; Mode = 12)
Shelly: 2; 3; 8; 10; 17; 20 (Median = 9; Mode = no mode)
Chanice: 8; 10; 12; 13; 14; 15 (Median = 12.5; Mode = no mode)

Remind them that there are 3 measures of center, the third one being the mean. Hand out centimeter cubes and have students work in table teams for the next part of the lesson. Explain that finding the mean is finding a balance point in a set of data. Give them the data set 2; 3; 5; 6. Have them stack cubes so there are four stacks matching that set of data. Now ask them to balance out the data. Tell them they must keep the same number of stacks, but they need to shift the individual cubes until all of the stacks are the same size. When they are done, they should have 4 stacks with 4 cubes each. Tell them the mean of that set of data is 4, because that is the balance point of the data. Repeat this process with multiple sets of data (for example, 2; 3; 5; 10).

As they work through this, have them think about how they shifted the cubes so the stacks were even. How else could this be done? You are going to lead them to discover that if you put all the cubes in one pile and divide them evenly among the stacks, the number of cubes in each stack is the mean. Use more data to get the point across. This leads to the algorithm; it may take a while to get here, so you may suggest this strategy along the way. When they’ve found the strategy, ask them if they can figure out how to do it without cubes, again leading toward the algorithm.
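
For teachers who want to check the cube activity before class, here is a minimal Python sketch of the two strategies described above. The function names `level_stacks` and `mean_by_pooling` are made up for this illustration, and the leveling rule (always move one cube from the tallest stack to the shortest) is just one possible way students might shift cubes. It assumes the cubes can be shared evenly, as in the two data sets from the lesson.

```python
def level_stacks(stacks):
    """Shift one cube at a time from the tallest stack to the shortest."""
    stacks = list(stacks)  # work on a copy so the original data is unchanged
    moves = 0
    # Stop once no stack is more than one cube taller than another.
    while max(stacks) - min(stacks) > 1:
        tallest = stacks.index(max(stacks))
        shortest = stacks.index(min(stacks))
        stacks[tallest] -= 1
        stacks[shortest] += 1
        moves += 1
    return stacks, moves


def mean_by_pooling(values):
    """Put all cubes in one pile, then share them equally among the stacks."""
    return sum(values) / len(values)


for data in ([2, 3, 5, 6], [2, 3, 5, 10]):
    leveled, moves = level_stacks(data)
    print(data, "levels to", leveled, "and the pooled mean is", mean_by_pooling(data))
```

Both strategies land on the same number, which is the point students are being led toward: leveling the stacks and "add everything, then divide by the number of stacks" are the same idea.
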
Finally, solidify the algorithm by formally writing it down in their journals and practicing it with the original data from the girls and the championship game. Have them use the mean for each player to determine the best choice for the coach.

As a closure for the lesson, have students watch this video, which summarizes the lesson rather well (it does introduce the term outlier, which will be used in the next lesson): https://www.youtube.com/watch?v=tPmJzrzIEHw&list=PLnIkFmW0ticMG9urjT2exiZNpUsROrZws&index=1

You can also watch this video, which discusses how to describe the spread of the data using the range. Range is a term taught in 5th grade, so this should be a review: https://www.youtube.com/watch?v=2iNWBpjw3ec&list=PLnIkFmW0ticMG9urjT2exiZNpUsROrZws&index=2

End the lesson with this problem: In math class, there were 5 quizzes given that were 10 pts. each. Who did the best overall?

Julian: 3; 9; 9; 9; 10
Mona: 6; 7; 7; 10; 10
Tim: 6; 7; 8; 9; 10

Have them find the mode, median, and mean for each student, and determine the more useful measure of center for the data. Have students create a foldable to review and reinforce mean, median, mode, and range. Have them glue it into their journals. Here is a link to the foldable and what it should look like: http://www.fortheloveofteachingmath.com/2013/02/11/mean-median-mode-and-range-graphic-organizer-freebie/
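
As an optional answer check for the two problems in this lesson, the three measures of center can be computed with Python's standard `statistics` module. This is only a sketch: the helper names `modes` and `summarize` are invented here, and treating a data set in which every value appears once as having "no mode" (an empty list) follows the convention used in the player example above.

```python
from collections import Counter
from statistics import mean, median


def modes(values):
    """Return the most frequent value(s), or an empty list for 'no mode'."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs exactly once
    return sorted(v for v, c in counts.items() if c == top)


def summarize(values):
    """Collect the three measures of center for one data set."""
    return {"mean": mean(values), "median": median(values), "mode": modes(values)}


players = {
    "Josephine": [4, 6, 12, 12, 12, 26],
    "Shelly": [2, 3, 8, 10, 17, 20],
    "Chanice": [8, 10, 12, 13, 14, 15],
}
quizzes = {
    "Julian": [3, 9, 9, 9, 10],
    "Mona": [6, 7, 7, 10, 10],
    "Tim": [6, 7, 8, 9, 10],
}

for group in (players, quizzes):
    for name, scores in group.items():
        print(name, summarize(scores))
```

In the quiz problem all three students have the same mean, so the median and mode are what separate them, which makes it a useful case for discussing which measure of center is more informative.
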

Homework/Activity: Go to http://www.math-aids.com/Mean_Mode_Median/Mean_Mode_Median_Range.html to generate additional practice for mean, median, mode, and range. (The website was the better alternative for the first assignment.) Use Homework/Activity A for additional practice with mean, median, and mode. (Do this as the next night’s assignment.)

Resources: Carnegie Learning: Math Series – Course 1; Teachers-Pay-Teachers

Scaffolding Options/UDL: Review of mode and median may take longer, depending upon whether students remember or have learned those measures of center before. Additional review problems may be needed. For ELL students, use Google Translate to translate notes, homework, and tests into Spanish. Provide guided notes to students in SPED and on 504s.

Idaho Core Standards Connection: 6.SP.3 Recognize that a measure of center for a numerical data set summarizes all of its values with a single number, while a measure of variation describes how its values vary with a single number.

Formative/Summative Assessment: Use white boards for students to display their answers to the final problem, going through mode, median, and mean individually. Use the free summative assessment from Teachers-Pay-Teachers, found at the following link: https://www.teacherspayteachers.com/FreeDownload/Mean-Median-Mode-Range-FREE-Quiz-and-Answer-Key

Lesson 3: Graphing Data – Duration 4 – 6 hour long classes + 2 class sessions for testing

Lesson Goals: Calculate the mean, median, and mode from a graphical display of data. Determine when to use the mean, median, or mode to describe a data set. Lesson: Line Plot - (took 2 classes)


Introduce a line plot as a graphical representation of data along a number line, adding this description and the following examples to the journal. Give students a simple set of data (2; 3; 5; 6), and show them how to graph it on a line plot. Give them a second set of data (2; 3; 5; 10) and have them create a line plot on their own. Have them calculate the median for both sets; then have them calculate the mean. Have them notice that the two sets of data are identical, except that the six in the first set has been changed to a ten. Ask them what happened to the mean when that data value increased (the mean increased). Ask them what happened to the median when the data value was replaced (the median stayed the same).

Have students create new line plots for the following data:

Set 1: 50; 50; 55; 60; 60
Set 2: 25; 50; 55; 60; 60

Remind them about scale, and ask what would be a good scale for the number line (going up by 5’s, starting at 10 or 20). Determine the mean, median, and mode for each set of data. Ask what they notice about the medians for both sets. What do they notice about the means? Ask them what they notice about the shape of the data. Tell them there are 3 ways of describing the distribution, that is, the way the data is distributed in the graph: symmetric, skewed left, or skewed right.

Skewed left: the peak of the data is to the right side of the graph.

Skewed right: the peak of the data is to the left side of the graph.

Symmetric: the left and right halves are mirror images of each other.

Add this information into the vocab section of the journal with examples as seen in the graphs below. The line plot for set 1 is symmetric and the line plot for set 2 is skewed left. Ask them how the mean and the median are affected by the distributions of the two line plots.
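
The comparisons in this part of the lesson can be previewed with a short Python sketch. The `dot_plot` helper is hypothetical: it draws a rough text line plot with one "x" per data value and assumes every value falls on the chosen scale. The data sets are the ones given above.

```python
from collections import Counter
from statistics import mean, median


def dot_plot(values, step=1):
    """Draw a text line plot: one row per scale value, one 'x' per data point."""
    counts = Counter(values)
    rows = []
    for x in range(min(values), max(values) + 1, step):
        rows.append(f"{x:>3} | {'x ' * counts[x]}".rstrip())
    return "\n".join(rows)


pairs = {
    "2; 3; 5; 6": [2, 3, 5, 6],
    "2; 3; 5; 10": [2, 3, 5, 10],
    "Set 1": [50, 50, 55, 60, 60],
    "Set 2": [25, 50, 55, 60, 60],
}

for label, data in pairs.items():
    print(label, "mean =", mean(data), "median =", median(data))

# Scale going up by 5's, as suggested in the lesson.
print(dot_plot(pairs["Set 2"], step=5))
```

In both pairs, changing a single value moves the mean while the median stays where it was, which is the observation students are asked to make.
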

Have students construct a line plot on a whiteboard with a data set about the number of minutes students exercise. Data is in minutes: 0; 40; 60; 30; 60; 10; 45; 30; 300; 90; 30; 120; 60; 0; 20 Talk about scale (going up by 10 or 20 from 0 – 300). Before they start calculating, have students write down a conjecture about the mean, if they think the mean will be the same as the median, greater or lesser; explaining why they reached the conclusion. Then, have students determine the mean, median, and mode. Allow them to use a calculator on this problem, as the numbers may be problematic. Ask them to share their conjectures about the mean with a partner, and then share what they found the mean to be. Have some students share their ideas with the class. Then ask them to discuss in a think-pair-share if the mean or the median would be the better measure to describe the center of the exercise data.
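
To prepare for the discussion of the exercise data, a teacher could check the numbers with a few lines of Python. This is a sketch for answer-checking only; the variable names are arbitrary, and setting aside the largest value (300) is an assumption made here to show how much a single far-away data point pulls the mean.

```python
from statistics import mean, median

# Minutes of exercise, as listed in the lesson.
minutes = [0, 40, 60, 30, 60, 10, 45, 30, 300, 90, 30, 120, 60, 0, 20]

# The same data with the single largest value set aside.
without_largest = [m for m in minutes if m != 300]

print("All data:    mean =", round(mean(minutes), 1), "median =", median(minutes))
print("Without 300: mean =", mean(without_largest), "median =", median(without_largest))
```

The mean drops by about 17 minutes when the 300 is set aside, while the median moves by only 5, which supports choosing the median to describe the center of this data set.
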


Discuss the data point 300, and ask students how it affects the data. Tell students that this is an outlier, a data point that is far from the others and that skews the data. How does the 300 skew this data? (It skews the data to the right, toward the larger values.) Another graphic to show the distribution:

This highlights how the outlier skews the data toward smaller or larger values. Point out to the students that a skewed distribution has values that are not typical of the rest of the data. They either could be data much greater than the rest of the data or much lower than the rest of the data. The graph will have a tail that is longer on one side than the other. Have them do the next problem on their own. This will be a formative assessment, to see if students understand how to read a line plot. A good video about discussing distribution, including discussion of clusters, peaks, and gaps. Good reinforcement for the lesson. https://www.youtube.com/watch?v=lkaXWFvutQM&list=PLnIkFmW0ticMG9urjT2exiZNpUsROrZws&index=4 Formative Assessment for Line Plot:

A 6th grade class collected data on the number of letters in the first names of all the students in class. Here is the dot plot of the data they collected:

1. How many students are in the class?

2. What is the shortest name length?

3. What is the longest name length?

4. What is the most common name length?

5. What name length describes the center of the data?

Sample Solutions

A 6th grade class collected data on the number of letters in the first names of all the students in class. Here is the dot plot of the data they collected:

1. How many students are in the class? 25

2. What is the shortest name length? 3 letters

3. What is the longest name length? 9 letters

4. What is the most common name length? 6 letters

5. What name length describes the center of the data? 6 letters

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Stem-and-Leaf Plot - (took 2 classes)

Next, show students a stem-and-leaf plot. Let them know this is another graphical representation of data (add to the journal), which is good for longer lists of data. The stem is all the digits in a number except the right-most digit, which is the leaf. Let them know that if they rotate the stem-and-leaf plot, the stems look like the values on a number line. Don’t forget that all stem-and-leaf plots must have a key. You can give them a list of the ages of 38 of the Presidents of the United States when they were first inaugurated and the ages at which they died. First, show students how to create a stem-and-leaf plot with the ages when the presidents were first inaugurated.


Discuss how you would determine the starting and ending points for the leaves. Here is an example:

Make sure when you add the leaves, the values of the data point are in ascending order, and that you have to include every data point, even if it’s a duplicate. You don’t want to leave out anyone’s age! Also, make sure students realize that zero is a value. There should be 38 leaves if there are 38 data points. Be sure to complete a key for the graph.
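
The construction rules above (stem is every digit except the right-most, leaves in ascending order, duplicates kept, zero counted as a leaf, and a key) can be sketched in Python as follows. The `stem_and_leaf` function is hypothetical, and the ages in `sample_ages` are made-up illustration values, not the presidents' data, which is not reproduced in this plan.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Build a text stem-and-leaf plot for whole numbers, with a key."""
    rows = defaultdict(list)
    for v in sorted(values):          # ascending order keeps leaves in order
        stem, leaf = divmod(v, 10)    # right-most digit is the leaf
        rows[stem].append(leaf)       # duplicates are kept as separate leaves
    lines = []
    for stem in range(min(rows), max(rows) + 1):  # show every stem, even if empty
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    first = min(values)
    lines.append(f"Key: {first // 10} | {first % 10} means {first}")
    return "\n".join(lines)


# Made-up ages for illustration only.
sample_ages = [42, 46, 51, 51, 54, 55, 57, 60, 61, 64, 68, 69]
print(stem_and_leaf(sample_ages))
```

Counting the leaves in the output is a quick way to confirm that no data point was left out: there should be as many leaves as data values.
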


Next, have students create a stem-and-leaf plot of the ages when the presidents died. Walk around and assess student understanding. When they are done, and you have gone over the graph, talk about how these two graphs used the same stems (4 – 9) and how it would be easier to compare these two sets of data in one graph. Then teach them about a side-by-side stem-and-leaf plot, which allows comparison of two sets of data in two columns. Once they are set up in one graph, have students describe the distribution of both sets of data, and make observations about what they see. Formative Assessment for Stem-and-Leaf Plot Routinely, track team runners have coaches record their practice and competitive times in races. For the John Glenn Middle School track team, the following times were recorded for one practice run and one competitive run of the 100 meter dash.

Practice Runs: 13.1, 14.2, 13.3, 13.3, 13.0, 14.6, 12.9, 13.9, 12.8, 15.1, 15.0, 15.2, 14.3, 14.3, 16.1, 16.4 Competitive Runs: 12.4, 14.0, 12.6, 13.8, 14.2, 12.9, 14.2, 12.9, 14.3, 13.6, 12.9, 13.7, 13.4, 12.6, 13.3, 12.7

1. Let’s create a side-by-side stem-and-leaf plot to organize the practice and competition runs.
   a. What stems will you use in your side-by-side stem-and-leaf plots?
   b. What key will you use? What will the key mean?
   c. Create the side-by-side stem-and-leaf plot.
   d. What does this side-by-side stem-and-leaf plot tell you about the practice and competitive runs? Use what you know about the distribution and patterns of graphical displays for your side-by-side stem-and-leaf plot.

2. Why do you think the times were scattered in the practice run times, but more clustered in the competitive run times?

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

The last graphical representation in this lesson will be the histogram. A histogram displays numerical or quantitative data using vertical bars. The width of the bar represents an interval of data, and the height of the bar indicates the frequency of the data values included in each bar. To create a histogram, data is usually organized into a frequency table, which is a way to organize data values according to how many times they occur.

Frequency Table

Have them do the following problem to build a frequency table:

Building and Interpreting a Frequency Table

A group of 6th graders investigated the statistical question: “How many hours per week do 6th graders spend playing a sport or outdoor game?” Here are the data the students collected from a sample of 26 6th graders showing the number of hours per week spent playing a sport or a game outdoors:

𝟑 𝟐 𝟎 𝟔 𝟑 𝟑 𝟑 𝟏 𝟏 𝟐 𝟐 𝟖 𝟏𝟐 𝟒 𝟒 𝟒 𝟑 𝟑 𝟏 𝟏 𝟎 𝟎 𝟔 𝟐 𝟑 𝟐


(Can use surveys in class for # of hours students spent playing outside each week…this creates buy in). To help organize the data, the students placed the number of hours into a frequency table. A frequency table lists items and how often each item occurs. To build a frequency table, first draw three columns. Label one column “Number of Hours Playing a Sport/Game,” label the second column “Tally,” and the third column “Frequency.” Since the least number of hours was 𝟎, and the most was 𝟏𝟐, list the numbers from 𝟎 to 𝟏𝟐 under the “Number of Hours” column.
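The tally-and-count procedure described above can also be checked quickly by computer. The sketch below uses the 26 survey values listed earlier; the function name `frequency_table` is just an illustrative choice. It lists every number of hours from the least to the most, including hours nobody reported, so there are no gaps in the first column.

```python
from collections import Counter

hours = [3, 2, 0, 6, 3, 3, 3, 1, 1, 2, 2, 8, 12,
         4, 4, 4, 3, 3, 1, 1, 0, 0, 6, 2, 3, 2]

def frequency_table(data):
    """Return (value, frequency) pairs for every value from min to max."""
    counts = Counter(data)
    # Counter returns 0 for values that never occur, so gaps are kept.
    return [(value, counts[value]) for value in range(min(data), max(data) + 1)]

table = frequency_table(hours)

print("Hours | Tally   | Frequency")
for value, freq in table:
    print(f"{value:>5} | {'|' * freq:<7} | {freq}")
```

The frequencies should add up to the number of students surveyed, which is a quick check that no tally mark was missed.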

As you read each number of hours from the survey, place a tally mark opposite that number. The table shows a tally mark for the first number 𝟑. 1. Complete the tally mark column. 2. For each number of hours, find the total number of tally marks and place this in the frequency column. 3. Make a dot plot of the number of hours playing a sport or playing outdoors. 4. What number of hours describes the center of the data? Around 𝟑. 5. How many 6th graders reported that they spend eight or more hours a week playing a sport or playing outdoors? Only 𝟐 students. 6. The 6th graders wanted to answer the question, “How many hours do 6th graders spend per week playing a sport or playing an outdoor game?” Using the frequency table and the dot plot, how would you answer the 6th graders’ question? Most 6th graders spend about 𝟐 to 𝟒 hours per week playing a sport or playing outdoors. The data shown come from a random sample of 6th graders collected from the Census at School website (http://www.amstat.org/censusatschool/). The format for the frequency table is presented, and students are directed on how to complete the table. It is important to point out to students when listing values under the number column that the numbers must be listed sequentially with no missing numbers or gaps in the numbers. Students should be able to draw a dot plot from the frequency table and build a frequency table from the dot plot. After students have completed the frequency table and the dot plot of the data, discuss with them what each representation tells about


the data. Histograms (took 2 classes) Complete the following exercise with students to discuss histograms. Example 1: Show this frequency table to begin, without the frequency column filled in:

Exercises 1–4 1. If someone has a head circumference of 𝟓𝟕𝟎, what size hat would they need? Large 2. Complete the frequency columns in the table to determine the number of each size hat the students need to order for the adults who wanted to order a hat. 3. What hat size does the data center around? Medium 4. Describe any patterns that you observe in the frequency column? The numbers start small but increase to 𝟏𝟓 and then go back down. Example 2: Histogram One student looked at the tally column and said that it looked somewhat like a bar graph turned on its side. A histogram is a graph that is like a bar graph, except that the horizontal axis is a number line that is marked off in equal intervals. To make a histogram:

Draw a horizontal line and mark the intervals.

Draw a vertical line and label it “frequency.”

Mark the frequency axis with a scale that starts at 𝟎 and goes up to something that is greater than the largest frequency in the frequency table.


For each interval, draw a bar over that interval that has a height equal to the frequency for that interval. The intervals that the bars cover are called bins. Draw the following histogram during your discussion, ending with the first two bars of the histogram drawn below.

The students are introduced to a histogram in this example. They use the data that was organized in a frequency table with intervals in Example 1. You may want to begin this lesson by showing the students an example of a bar graph, such as a bar graph of favorite pizza toppings. Point out that the horizontal axis is not a number line, but contains categories. The vertical axis is the frequency (or count) of how many people chose each pizza topping. As you present the histogram to the students, point out that the main difference is that the horizontal axis is a number line, and the intervals are listed in order from smallest to largest. Some students may struggle with the notation for the intervals. Point out to the students that the interval labeled 510 – < 530 represents any head circumference from 510 mm up to, but not including, 530 mm. A head circumference of 530 mm is counted in the bar for 530 – < 550 and not in the bar for 510 – < 530.

Pose the following questions as students work through the following exercise:
1. Why should the bars touch each other in the histogram?
2. How are histograms and bar graphs similar? How are they different?

In the first problem, students are asked to complete the histogram. Emphasize that the bars should touch each other and be the same width. Also point out the jagged line (or “scissor cut”), and explain that it is used to indicate a cutting of the horizontal axis. (A “scissor cut” could also be used on a vertical axis.) The cut is used to tighten the graph by “pulling in” unused space. Have them do the following practice problem as a formative assessment to measure student understanding.
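The interval convention described above (a value on a boundary belongs to the bar that starts there) can be sketched in a few lines. The starting value of 510 and the width of 20 come from the interval labels in the text; the sample head circumferences and the function names are hypothetical and are not the class data.

```python
def bin_label(value, start=510, width=20):
    """Label of the interval that contains value: low is included, high is not."""
    low = start + ((value - start) // width) * width
    return f"{low}-<{low + width}"

def histogram_counts(values, start=510, width=20):
    """Count how many values fall in each interval."""
    counts = {}
    for v in values:
        label = bin_label(v, start, width)
        counts[label] = counts.get(label, 0) + 1
    return counts

# Hypothetical head circumferences in mm, chosen to sit on or near boundaries.
sample = [512, 529, 530, 531, 549, 550, 570]

for label, count in histogram_counts(sample).items():
    print(label, count)
```

With this sample, 529 is counted in 510-<530 while 530 is counted in 530-<550, which is exactly the point students tend to find confusing.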


Formative Assessment/Practice Problem: The frequency table below shows the length of selected movies shown in a local theater over the past six months.

1. Construct a histogram for the length of movies data. (example provided to some, if needed)

2. Describe the shape of the histogram.
3. What does the shape tell you about the length of movies?

Show this graphic, which shows how changing the intervals on a histogram can create a misleading graph: http://www.shodor.org/interactivate/discussions/ClassInterval/

Finally, talk about what happens when the data is skewed left (the mean is pulled down by small data values, so the median is the best measure of center), skewed right (the mean is pulled up by large data values, so the median is the best measure of center), and symmetrical (the mean is not pulled in either direction, so it is the best measure of center).
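A small numerical illustration can make the point about skew concrete. The three data sets below are invented for this purpose only (they are not the movie data); each has seven values so the median is easy to find by hand.

```python
from statistics import mean, median

# Hypothetical data sets, made up to show the effect of skew.
symmetric = [2, 3, 4, 5, 6, 7, 8]
skewed_right = [2, 3, 4, 5, 6, 7, 43]     # one very large value
skewed_left = [3, 39, 40, 41, 42, 43, 44]  # one very small value

for name, data in [("symmetric", symmetric),
                   ("skewed right", skewed_right),
                   ("skewed left", skewed_left)]:
    print(f"{name}: mean = {mean(data)}, median = {median(data)}")
```

In the symmetric set the mean and median agree. In the skewed-right set the single large value pulls the mean above the median, and in the skewed-left set the single small value pulls the mean below it, while the median stays near the bulk of the data in both cases.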


Homework/Activity: Resources Scaffolding Options/UDL Idaho Core Standards Connection

Formative/ Summative Assessment

Use Homework/Activity B for additional practice with Line Plots. Use Homework/Activity C for additional practice with stem-and-leaf plots. Use Homework/Activity D for additional practice with Frequency Tables. Use Homework/Activity E for additional practice with Histograms. There are additional practice problems at the EngageNY site.

Carnegie Learning: Math Series – Course 1
https://www.engageny.org/file/15836/download/math-g6-m6-teacher-materials.pdf?token=1MaOY3fBSNcufrNiv8u24x01jt77pU5ORNCO35a5ua8
NYS Common Core Mathematics Curriculum, Lesson 3, 6•6. Date: 10/23/13. © 2013 Common Core, Inc. Some rights reserved. commoncore.org. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Remind students that they have used number lines before to graph ratios, on double number lines. More practice may be needed for the line plot, stem-and-leaf plot, and the histogram. A calculator can be used on the large number sets so the computations do not get in the way of learning the new skill. The base of the histogram can be given instead of having students generate their own, to ensure a proper scale and setup.

Give students this video to view, or send it to parents. It covers the shape and distribution of data: https://www.youtube.com/watch?v=lkaXWFvutQM&list=PLnIkFmW0ticMG9urjT2exiZNpUsROrZws&index=4

For ELL students, use Google Translate to translate notes, homework, and tests into Spanish. Provide guided notes to students in SPED and on 504s.

CCSS.MATH.CONTENT.6.SP.B.4 Display numerical data in plots on a number line, including dot plots, histograms, and box plots.

CCSS.MATH.CONTENT.6.SP.B.5 Summarize numerical data sets in relation to their context, such as by:

CCSS.MATH.CONTENT.6.SP.B.5.A Reporting the number of observations.

CCSS.MATH.CONTENT.6.SP.B.5.B Describing the nature of the attribute under investigation, including how it was measured and its units of measurement.

CCSS.MATH.CONTENT.6.SP.B.5.C Giving quantitative measures of center (median and/or mean) and variability (interquartile range and/or mean absolute deviation), as well as describing any overall pattern and any striking deviations from the overall pattern with reference to the context in which the data were gathered.

CCSS.MATH.CONTENT.6.SP.B.5.D Relating the choice of measures of center and variability to the shape of the data distribution and the context in which the data were gathered.

Check students’ graphs throughout to see where there may be misunderstandings. Use summative assessment A, modified from Carnegie Learning, to measure student understanding.

Lesson 4: Variability and the Mean Absolute Deviation – Duration: 3 – 5 hour-long periods

Lesson Goals:
Calculate the deviations of each data value from the mean of a data set.
Calculate the absolute deviations of each data value from the mean of a data set.
Calculate and interpret the mean absolute deviation for a data set.
Compare and contrast two small data sets that have the same mean but different amounts of variability.
Understand that a data distribution is not characterized only by its center. Its spread or variability must be considered as well.
Informally evaluate how precise the mean is as an indicator of the typical value of a distribution, based on the variability exhibited in the data.

Lesson:

Variability in a Data Distribution Begin with the following example: Example 1: Comparing Two Distributions Robert’s family is planning to move to either New York City or San Francisco. Robert has a cousin in San Francisco and asked her how she likes living in a climate as warm as San Francisco. She replied that it doesn’t get very warm in San Francisco. He was surprised, and since temperature was one of the criteria he was going to use to form his opinion about where to move, he decided to investigate the temperature distributions for New York City and San Francisco. The table below gives average temperatures (in degrees Fahrenheit) for each month for the two cities.

Read through the introductory paragraph as a class. Give students a moment to examine the table and then ask: How would you describe the temperatures in New York City?


The temperatures change a lot through the year. How would you describe the temperatures in San Francisco? The temperatures do not change much. Let students work independently and confirm their answer with a neighbor. Encourage calculator use when working with larger data sets. Exercises 1–2 Use the table above to answer the following: 1. Calculate the annual mean monthly temperature for each city. New York City: 𝟔𝟑 degrees San Francisco: 𝟔𝟒 degrees 2. Recall that Robert is trying to decide to which city he wants to move. What is your advice to him based on comparing the overall annual mean monthly temperatures of the two cities? Since the means are almost the same, it looks like Robert could move to either city. Example 2: Understanding Variability In Exercise 2, you found the overall mean monthly temperatures in both the New York City distribution and the San Francisco distribution to be about the same. That didn’t help Robert very much in making a decision between the two cities. Since the mean monthly temperatures are about the same, should Robert just toss a coin to make his decision? Is there anything else Robert could look at in comparing the two distributions? Although variability was introduced in an earlier lesson, write the meaning of it in the vocab section of the journal. Measures of variation or measures of variability are used in statistics to describe how spread out the data in a distribution are from some focal point in the distribution (such as the mean) – uses a single number to describe how a data set’s values vary. Maybe Robert should look at how spread out the New York City monthly temperature data are from its mean and how spread out the San Francisco monthly temperature data are from its mean. To compare the variability of monthly temperatures between the two cities, it may be helpful to look at dot plots. The dot plots for the monthly temperature distributions for New York City and San Francisco follow.


Read through the first paragraph as a class. Since the means are about the same, it would be helpful if Robert had more information as basis for a decision. He needs to go beyond comparing centers to incorporating variability into his decision-making process. Ask students:

Should he just toss a coin to make a decision? Answers will vary.

What else do you think Robert could use to make a decision? He could consider the range or variety of temperatures in each city.

Read through the second paragraph (above) and define variability. In this example, we want students to become familiar with the concept of variability by viewing how spread out the data are from their mean in dot plots. Give students a moment to examine the dot plots and ask:

How are the two dot plots different?

The temperatures for New York City are spread out, while the temperatures for San Francisco are clustered together. Exercises 3 – 7 For exercises 3 – 7, let students work independently and compare answers with a neighbor. Use the dot plots above to answer the following: 3. Mark the location of the mean on each distribution with the balancing ∆ symbol. How do the two distributions compare based on their means? Place Δ at 𝟔𝟑 for New York City and at 𝟔𝟒 for San Francisco. The means are about the same. 4. Describe the variability of the New York City monthly temperatures from the mean of the New York City temperatures. The temperatures are widespread around the mean. From a low of around 𝟒𝟎, to a high of 𝟖𝟓. 5. Describe the variability of the San Francisco monthly temperatures from the mean of the San Francisco monthly temperatures. The temperatures are compact around the mean. From a low of 𝟓𝟕, to a high of 𝟕𝟎. 6. Compare the amount of variability in the two distributions. Is the variability about the same, or is it different? If different, which monthly temperature distribution has more variability? Explain.


The variability is different. The variability in New York City is much greater compared to San Francisco. 7. If Robert prefers to choose the city where the temperatures vary the least from month to month, which city should he choose? Explain. He should choose San Francisco because the temperatures vary the least, from a low of 𝟓𝟕 to a high of 𝟕𝟎. New York City has temperatures with more variability, from a low of 𝟒𝟎, to a high of 𝟖𝟓. Example 3: Using Mean and Variability in a Data Distribution The mean is used to describe the “typical” value for the entire distribution. Sabina asks Robert which city he thinks has the better climate? He responds that they both have about the same mean, but that the mean is a better measure or a more precise measure of a typical monthly temperature for San Francisco than it is for New York City. She’s confused and asks him to explain what he means by this statement. Robert says that the mean of 𝟔𝟑 degrees in New York City (𝟔𝟒 in San Francisco) can be interpreted as the typical temperature for any month in the distributions. So, 𝟔𝟑 or 𝟔𝟒 degrees should represent all of the months’ temperatures fairly closely. However, the temperatures in New York City in the winter months are in the 𝟒𝟎s and in the summer months are in the 𝟖𝟎s. The mean of 𝟔𝟑 isn’t too close to those temperatures. Therefore, the mean is not a good indicator of typical monthly temperature. The mean is a much better indicator of the typical monthly temperature in San Francisco because the variability of the temperatures there is much smaller. The concept in this example may be challenging for some students. When Robert talks about the precision of the mean, Sabina asks him to explain what he means by a mean being precise. 
Although the means are about the same for the two distributions, Robert is suggesting that the mean of 64 degrees for San Francisco is a better indicator of the city’s typical monthly temperature than the mean of 63 degrees is of a typical monthly temperature in New York City. He bases this on the variability of the monthly temperatures in each city. He says that a mean is only a precise indicator of monthly temperatures if the variability in the data is very low. The higher the variability gets, the less precise the mean is as an indicator of typical monthly temperatures. If there is still confusion, draw two dot plots similar to Example 3 on the board and ask the following:

Which dot plot has greater variability?

If data points have a lot of variability, is the mean a good indicator of a “typical” value in the data set? No.

If the data points are clustered around the mean, is the mean a good indicator of a “typical” value in the data set? Yes.

Exercises 8 – 11: Let students work independently and confirm their answer with a neighbor. Consider the following two distributions of times it takes six students to get to school in the morning and to go home from school in the afternoon.


8. To visualize the means and variability, draw dot plots for each of the two distributions.

9. What is the mean time to get from home to school in the morning for these six students? The mean is 𝟏𝟒 minutes. (Note: It is visible from the graphs.) 10. What is the mean time to get from school to home in the afternoon for these six students? The mean is 𝟏𝟒 minutes. (Note: The sum of the negative deviations is −𝟏𝟑, and the sum of the positive deviations is +𝟏𝟑.) 11. For which distribution does the mean give a more precise indicator of a typical value? Explain your answer. The morning mean is a more precise indicator. The spread of the afternoon data is far greater around the mean. Only do exercises 12-14 if there is plenty of time and/or a need for more examples to work through. Exercises 12–14 Let students work in pairs or small groups. If time allows, discuss Exercise 13 as a class. Distributions can be ordered according to how much the data values vary around their means. Consider the following data on the number of green jellybeans in seven bags of jellybeans from each of five different candy manufacturers (AllGood, Best, Delight, Sweet, Yum). The mean in each distribution is 𝟒𝟐 green jellybeans.

12. Draw a dot plot of the distribution of number of green jellybeans for each of the five candy makers. Mark the location of the mean on each distribution with the balancing ∆ symbol. The dot plots should each have a balancing ∆ symbol located at 𝟒𝟐.


13. Order the candy manufacturers from the one you think has least variability to the one with most variability. Explain your reasoning for choosing the order. Note: Do not be critical, answers and explanations may vary. One possible answer: In order from least to greatest: AllGood, Sweet, Yum, Delight, Best. The data points are all close to the mean for AllGood, which indicates it has the least variability, followed by Sweet and Yum. The data points are spread further from the mean for Delight and Best, which indicates they have the greatest variability. 14. For which company would the mean be considered a better indicator of a typical value (based on least variability)? AllGood. ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

The Mean Absolute Deviation (MAD) Variability Variability was discussed informally in the previous lesson. This lesson focuses on developing a more formal measure of variability in a data


distribution called the mean absolute deviation, denoted by MAD. Discuss the concept of deviation from the mean, and point out that in previous lessons we used deviations to develop the idea of the mean as a balance point. How far away was each data point from the mean? That distance is the deviation. Add the definition and the following info about deviations into the Statistics Vocab section of the student journals.

Deviation: The distance of a data point from the mean. The formula for finding the deviation is: Data point – Mean = Deviation (remember subtraction is NOT commutative, so you have to put the values in this order in the formula). If the deviation is negative, the data point is to the left of the mean, and if the deviation is positive, the data point is to the right of the mean.

This lesson challenges students to answer why absolute values of deviations are used in calculating the MAD. Mean absolute deviation is the measure of variability used in the middle school curriculum. At the high school level, deviations are squared instead of using the absolute value. This leads to other important measures of variability called the variance and the standard deviation.

Example 1: Variability
In the previous lesson, Robert tried to decide to which of two cities he would rather move, based on comparing their mean annual temperatures. Since the mean yearly temperature for New York City and San Francisco turned out to be about the same, he decided instead to compare the cities based on the variability in their monthly temperatures from the overall mean. He looked at the two distributions and decided that the New York City temperatures were more spread out from their mean than were the San Francisco temperatures from their mean. Read through Example 1 as a class, and recall the main idea of the previous lesson. Then ask students:

What is variability? The spread of data in a distribution from some focal point in the distribution (such as the mean).

What does a distribution that has no variability look like? All of the data points are the same.

What does a distribution that has a lot of variability look like? The data points are spread far apart.

Suggest a visual way to order several data sets from the one with least variability to the one with most variability. Exercises 1–3: Let students work in small groups on this exercise. Then confirm answers as a class. The discussion of Exercise 3 leads into Example 2. Students are asked to order the seven data sets from least variability to most variability. Students will no doubt suggest different orderings. Several orderings are reasonable – focus on the students’ explanations for ordering the distributions. What is important is not their suggested orderings but rather their arguments to support their orderings. Also, the goal for this example is for students to realize that they need to have a more formal way of deciding the best ordering. Sabina suggests that a formula is needed, and she proceeds in this lesson to develop one. The following temperature distributions for seven other cities all have a mean temperature of approximately 𝟔𝟑 degrees. They do not have the same variability. Consider the following dot plots of the mean yearly temperatures of the seven cities in degrees Fahrenheit. (Give students one or


two copies of the seven dot plots to work with in teams at their tables)

1. Which distribution has the smallest variability of the temperatures from its mean of 𝟔𝟑 degrees? Explain your answer. City 𝑨, because all points are the same. 2. Which distribution(s) seems to have the most variability of the temperatures from the mean of 𝟔𝟑 degrees? Explain your answer. One or more of the following is acceptable: Cities 𝑫, 𝑬, and 𝑭. They appear to have data points spread furthest from the mean. 3. Order the seven distributions from least variability to most variability. Explain why you listed the distributions in the order that you chose. Several orderings are reasonable. Focus on students’ explanations for choosing the order. Example 2: Measuring Variability Based on just looking at the distributions, there are different orderings of variability that seem to make some sense. Sabina is interested in developing a formula that will give a number that measures the variability in a data distribution. She would then use the formula for each data set and order the distributions from lowest to highest. She remembers from a previous lesson that a deviation is found by subtracting the mean from a data point. The formula was summarized as: deviation = data point – mean. Using deviations to develop a formula measuring variability is a good


idea to consider. No doubt students had different orderings of variability for the seven cities in Exercise 3. Sabina suggests that, in this example, a formula is needed to give a formal ordering. Since variability is being viewed from the mean, it seems reasonable that a formula should be based on how far data points are from the mean. Recall that a deviation results from subtracting the mean from a data point, or deviation = data point – mean. She concludes that it seems to be a good idea to use deviations in developing a formula for a measure of variability. Ask students:

Do you think using deviations is a good basis for a formula to measure variability? Yes. A deviation measures how far a data point is from the mean of its distribution. That certainly addresses the concept of variability.

When are deviations negative? When they are located to the left of the mean.

When are deviations positive? When they are located to the right of the mean.
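The deviation formula and the sign rule in the questions above can be tried on a small example. The five temperatures below are invented for illustration (they are not one of the seven cities); they were chosen so that the mean is 63 degrees.

```python
# Hypothetical temperatures with a mean of 63 degrees.
data = [58, 61, 63, 64, 69]

mean = sum(data) / len(data)

# deviation = data point - mean (the order of subtraction matters)
deviations = [x - mean for x in data]

for x, d in zip(data, deviations):
    if d < 0:
        side = "left of the mean"
    elif d > 0:
        side = "right of the mean"
    else:
        side = "at the mean"
    print(f"{x}: deviation {d:+.0f}, {side}")

print("sum of deviations:", sum(deviations))
```

The positive deviations (+1 and +6) and the negative deviations (−5 and −2) cancel, so the sum is zero, which is the balance-point idea students revisit in the exercises that follow.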

Exercises 4–6 Let students work in pairs. In this exercise, City 𝐺 is used to focus on calculating deviations and verifying that the sum of deviations is equal to zero. This means summing deviations is not a good measure of variability because it always turns out to be zero (by the development of the mean as a balance). A graph is drawn of City 𝐺 to illustrate the values of the deviations. The dot plot for the temperatures in City 𝑮 is shown below. Use the dot plot and the mean temperature of 𝟔𝟑 degrees to answer the following questions.

4. Fill in the following table for City 𝑮 temperature deviations.


5. Why should the sum of your deviations column be equal to zero? (Hint: Recall the balance interpretation of the mean of a data set.) The mean is the value that makes the sum of the positive and negative deviations 𝟎. It is the balance point. 6. Another way to graph the deviations is to write them on a number line as follows. What is the sum of the positive deviations (the deviations to the right of the mean)? What is the sum of the negative deviations (the deviations to the left of the mean)? What is the total sum of the deviations?

Sum of the positive deviations = +𝟐𝟐 Sum of the negative deviations = −𝟐𝟐 Sum of all of the deviations = 0 Example 3: Finding the Mean Absolute Deviation (MAD) By the balance interpretation of the mean, the sum of the deviations for any data set will always be zero. Sabina is disappointed that her idea of developing a measure of variability using deviations isn’t working. She still likes the concept of using deviations to measure variability, but the problem is that the sum of the positive deviations is cancelling out the sum of the negative deviations. What would you suggest she do to keep the deviations as the basis for a formula but to avoid the deviations cancelling out each other?


This example asks students how they could still use deviations in developing a measure of variability but correct the difficulty of having the negative deviations offset the positive deviations when the deviations are summed. The operation of absolute value should come to mind because it is part of what students have previously studied in mathematics. The example leads them through the calculation of the deviations and then to taking the mean of the absolute deviations. Ask students:

If we were to treat the negative deviations as distances, what mathematical operation would do that? Remind students about absolute values. Finding the absolute value of a negative deviation makes it a positive distance. Emphasize that the concept of deviation has been maintained.

If we use the absolute value of all the deviations, what will happen to the sum? It will not be zero.

Add the definitions for absolute value – the distance from zero; absolute deviation – the absolute distance of a data value from the mean. Include examples of each. Define the mean absolute deviation as the sum of the absolute values of the deviations divided by the number of deviations. Add the definition of Mean Absolute Deviation (MAD) into the journal: a measure of variability in a data set To find the MAD *Add up the absolute values of the deviations *Divide the sum by the number of deviations Add an example (See Exercise 7, part (c).) Teachers may wish to work through Exercise 7, parts (a)–(c) as a class to develop this concept. Exercises 7–8 Let students continue to work in pairs or small groups. As previously indicated, teachers may decide to work through Exercise 7, parts (a)–(c) as a class. 7. One suggestion to possibly help Sabina is to take the absolute value of the deviations. a. Fill in the following table.


Leave out the teal numbers, having students fill those in. b. From the following graph, what is the sum of the absolute deviations?

The sum of the absolute deviations is +𝟒𝟒.
c. Sabina suggests that the mean of the absolute deviations could be a measure of the variability in a data set. Its value is the average distance that all the data values are from the mean temperature. It is called the Mean Absolute Deviation and is denoted by the letters MAD. Find the MAD for this data set of City 𝑮 temperatures. Round to the nearest tenth. The mean absolute deviation is 𝟒𝟒/𝟏𝟐 or 𝟑.𝟕 degrees to the nearest tenth of a degree.
d. Find the MAD for each of the temperature distributions in all seven cities, and use the values to order the distributions from least variability to most variability. Recall that the mean for each data set is 𝟔𝟑 degrees. Does the list that you made in Exercise 3 by just looking at the distributions match this list made by ordering MAD values?
*If time is a factor in completing this lesson, assign a city to individual students. After each student has calculated the mean absolute deviation, organize results for the whole class. Direct students to calculate the MAD to the nearest tenth of a degree.
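For checking the class results, the two MAD steps (add up the absolute deviations, then divide by how many there are) can be written as a short function. Because the City 𝑮 dot plot is not reproduced in this text, the list below is a hypothetical stand-in: twelve values picked so that the mean is 63 and the absolute deviations add up to 44, matching the totals reported above. It should not be read as the actual City 𝑮 data.

```python
def mean_absolute_deviation(data):
    """Average distance of the data values from their mean."""
    mean = sum(data) / len(data)
    total_distance = sum(abs(x - mean) for x in data)  # sum of absolute deviations
    return total_distance / len(data)

# Hypothetical stand-in for City G: 12 values, mean 63, absolute deviations sum to 44.
city_g_like = [55, 57, 59, 60, 62, 63, 63, 64, 66, 67, 69, 71]

# A city with no variability, like City A: every value equals the mean.
no_variability = [63] * 12

print(round(mean_absolute_deviation(city_g_like), 1))
print(mean_absolute_deviation(no_variability))
```

Running the same function on each city's twelve temperatures gives one number per city, and sorting those numbers gives the formal ordering from least to most variability.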

e. Which of the following is a correct interpretation of the MAD?
i. The monthly temperatures in City 𝑮 are spread 𝟑.𝟕 degrees from the approximate mean of 𝟔𝟑 degrees.
ii. The monthly temperatures in City 𝑮 are, on average, 𝟑.𝟕 degrees from the approximate mean temperature of 𝟔𝟑 degrees.
iii. The monthly temperatures in City 𝑮 differ from the approximate mean temperature of 𝟔𝟑 degrees by 𝟑.𝟕 degrees.
Answer is (ii).
8. The dot plot for City A temperatures follows:


a. How much variability is there in City A’s temperatures? Why? No variability. The deviations are all 𝟎.
b. Does the MAD agree with your answer in part (a)? Yes. The mean absolute deviation is 𝟎.

*A video lesson on mean absolute deviation can be viewed to reinforce this lesson. It has a good explanation of how the MAD describes the spread of the data. You can also give this link out to students who are struggling with the idea, and/or include it with the weekly email home so parents have an idea of how to help their students. The video can be found at: https://www.youtube.com/watch?v=zkv_VHGqKqI&list=PLnIkFmW0ticMG9urjT2exiZNpUsROrZws&index=3
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Describing Distributions Using the Mean and MAD

Example 1: Describing Distributions
In the previous lesson, Sabina developed the mean absolute deviation (MAD) as a number that measures variability in a data distribution. Using the mean and MAD with a dot plot allows you to describe the center, spread, and shape of a data distribution. For example, suppose that data on the number of pets for ten students are shown in the dot plot below.

There are several ways to describe the data distribution. The mean number of pets each student has is three, which is a measure of center. There is variability in the number of pets the students have, which is an average of 𝟐.𝟐 pets from the mean (the MAD). The shape of the distribution is heavy on the left and it thins out to the right.

Introduce the data set and explain that distributions can be described by their center, spread, and shape. Note that the mean is 3 pets and the MAD is 2.2 pets. The shape is described as well. In your discussion, you want your students to begin to conceptualize the measures. Have them draw a triangle over the 3 on the number line and see that the distribution is balanced there, with the sum of negative deviations (−11) balancing the sum of positive deviations (+11). Then ask:

Without extensive calculating, how is the MAD 2.2 pets? The total distance the data are from the mean is 2(11) = 22, and the mean of the absolute deviations is 22/10 = 2.2 pets.
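The balancing idea behind this shortcut can be checked on any data set. The sketch below is illustrative only: the pets data are not listed in this plan, so it uses data set 12 from Homework/Activity G (ten values, stated mean 25), and the helper name `deviation_sums` is an assumption made for this example.

```python
def deviation_sums(data):
    """Return (mean, sum of negative deviations, sum of positive deviations)."""
    mean = sum(data) / len(data)
    below = sum(value - mean for value in data if value < mean)
    above = sum(value - mean for value in data if value > mean)
    return mean, below, above


# Data set 12 from Homework/Activity G (stated mean: 25).
data_set_12 = [22, 26, 29, 23, 26, 21, 28, 24, 25, 26]
mean, below, above = deviation_sums(data_set_12)
print(mean, below, above)  # 25.0 -10.0 10.0

# Because the two sides balance, the total distance is twice one side.
mad = 2 * above / len(data_set_12)
print(mad)  # 2.0
```

The same reasoning gives the pets result in the text: 2(11) = 22, and 22/10 = 2.2.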

Let students work with a partner. Then discuss and confirm answers to Exercises 1–3.

Exercises 1–4
1. Suppose that the weights of seven middle-school students’ backpacks are given below.
a. Fill in the following table.

Have students fill in the deviations and the absolute deviations. b. Draw a dot plot for these data and calculate the mean and MAD.

The mean is 𝟏𝟖 pounds. The MAD is 𝟎.
c. Describe this distribution of weights of backpacks by discussing the center, spread, and shape.
The mean is 𝟏𝟖 pounds. There is no variability. All of the data is centered.

2. Suppose that the weight of Elisha’s backpack is 𝟏𝟕 pounds, rather than 𝟏𝟖.
a. Draw a dot plot for the new distribution.

b. Without doing any calculation, how is the mean affected by the lighter weight? Would the new mean be the same, smaller, or larger?
The mean will be smaller because the new point is smaller.
c. Without doing any calculation, how is the MAD affected by the lighter weight? Would the new MAD be the same, smaller, or larger?
Now there is variability, so the MAD is greater than zero.

3. Suppose that in addition to Elisha’s backpack weight having changed from 𝟏𝟖 to 𝟏𝟕 lb., Fred’s backpack weight is changed from 𝟏𝟖 to 𝟏𝟗 lb.
a. Draw a dot plot for the new distribution.

b. Without doing any calculation, what would be the value of the new mean compared to the original mean?
The mean is 𝟏𝟖 lb.
c. Without doing any calculation, would the MAD for the new distribution be the same, smaller, or larger than the original MAD?
Since there is more variability, the MAD is larger.
d. Without doing any calculation, how would the MAD for the new distribution compare to the one in Exercise 2?
There is more variability, so the MAD is greater than in Exercise 2.

4. Suppose that seven second-graders’ backpack weights were:

a. How is the distribution of backpack weights for the second-graders similar to the original distribution for sixth-graders given in Exercise 1? Both have no variability, so the MAD is zero. The dot plots look the same.


b. How are the distributions different?
The means are different. One mean is 𝟏𝟖 and the other is 𝟓.

Example 2: Using the Mean Versus the MAD
Decision-making by comparing distributions is an important function of statistics. Recall that Robert is trying to decide whether to move to New York City or to San Francisco based on temperature. Comparing the center, spread, and shape for the two temperature distributions could help him decide.

From the dot plots, Robert saw that monthly temperatures in New York City were spread fairly evenly from around 𝟒𝟎 degrees to the 𝟖𝟎s, but in San Francisco the monthly temperatures did not vary as much. He was surprised that the mean temperature was about the same for both cities. The MAD of 𝟏𝟒 degrees for New York City told him that, on average, a month’s temperature was 𝟏𝟒 degrees above or below 𝟔𝟑 degrees. That is a lot of variability, which was consistent with the dot plot. On the other hand, the MAD for San Francisco told him that San Francisco’s monthly temperatures differ, on average, only 𝟑.𝟓 degrees from the mean of 𝟔𝟒 degrees. So, the mean doesn’t help Robert very much in making a decision, but the MAD and dot plot are helpful. Which city should he choose if he loves hot weather and really dislikes cold weather?

Read through the example as a class. Note that, although the mean provides useful information, it does not give an accurate picture of the spread of monthly temperatures for New York City. It is important to consider the center, spread, and shape of distributions when making decisions. Let students answer the questions:

Which city should he choose if he loves hot weather and really dislikes cold weather?
San Francisco, because there is little variability and it does not get as cold as New York City.

What measure of the data would justify your decision? Why did you choose that measure?
The mean absolute deviation (MAD), as it provides a measure of the variability. On average, the monthly temperatures in San Francisco do not vary as much from the mean monthly temperature.

Exercises 5–7
Give students an opportunity to work independently. Let them confirm answers with a neighbor as needed. If time allows, discuss answers as a class. Allow students to use calculators for this exercise. Prioritize your discussion of questions, as Exercise 7 is the first time students see dot plots with the same mean and MAD.


5. Robert wants to compare temperatures for Cities B and C.

a. Draw a dot plot of the monthly temperatures for each of the cities.

b. Verify that the mean monthly temperature for each distribution is 𝟔𝟑 degrees.
The data is nearly symmetrical around 𝟔𝟑 degrees for City B. The sum of positive deviations is +𝟔𝟏, and the sum of the negative deviations is −𝟔𝟏 around the mean of 𝟔𝟑 for City C.
c. Find the MAD for each of the cities. Interpret the two MADs in words and compare their values.
The MAD is 𝟓.𝟑 degrees for City B, which means, on average, a month’s temperature differs from 𝟔𝟑 degrees by 𝟓.𝟑 degrees. The MAD is 𝟏𝟎.𝟐 degrees for City C, which means, on average, a month’s temperature differs from 𝟔𝟑 degrees by 𝟏𝟎.𝟐 degrees.

6. How would you describe the differences in the shapes of the monthly temperature distributions of the two cities?
The temperatures are nearly symmetric around the mean in City B. The temperatures are compact to the left of the mean for City C and then spread out to the right (skewed right).

7. Suppose that Robert had to decide between Cities D, E, and F.

a. Draw dot plots for each distribution.


b. Interpret the MAD for the distributions. What does this mean about variability?
The MADs are all the same, so Robert needs to look more at the shapes of the distributions to help him make a decision.
c. How will Robert decide to which city he should move? List possible reasons Robert might have for choosing each city.
City D – Appears to have “four seasons” with widespread temperatures.
City E – Has mainly cold weather and is only hot for 𝟑 months.
City F – Has mainly moderate weather and only a few cold months.

Homework/Activity:
Use Homework/Activity F for additional practice with comparing distributions
Use Homework/Activity G for additional practice with mean absolute deviations
Use Homework/Activity H for additional practice with describing distributions with the mean and MAD

Resources:
https://www.engageny.org/file/15836/download/math-g6-m6-teacher-materials.pdf?token=1MaOY3fBSNcufrNiv8u24x01jt77pU5ORNCO35a5ua8
Date: 10/23/13
© 2013 Common Core, Inc. Some rights reserved. commoncore.org
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Carnegie Learning: Math Course 1

Scaffolding Options/UDL:
View and/or send home the link to the video from LearnZillion on the MAD and data distribution: https://www.youtube.com/watch?v=zkv_VHGqKqI&list=PLnIkFmW0ticMG9urjT2exiZNpUsROrZws&index=3
For ELL students, use Google Translate to translate notes, homework, and tests into Spanish
Provide guided notes to students in SPED and on 504s

Idaho Core Standards Connection:
CCSS.MATH.CONTENT.6.SP.A.2 Understand that a set of data collected to answer a statistical question has a distribution which can be described by its center, spread, and overall shape.
CCSS.MATH.CONTENT.6.SP.A.3 Recognize that a measure of center for a numerical data set summarizes all of its values with a single number, while a measure of variation describes how its values vary with a single number.
Summarize and describe distributions.
CCSS.MATH.CONTENT.6.SP.B.5 Summarize numerical data sets in relation to their context, such as by:
CCSS.MATH.CONTENT.6.SP.B.5.C Giving quantitative measures of center (median and/or mean) and variability (interquartile range and/or mean absolute deviation), as well as describing any overall pattern and any striking deviations from the overall pattern with reference to the context in which the data were gathered.
CCSS.MATH.CONTENT.6.SP.B.5.D Relating the choice of measures of center and variability to the shape of the data distribution and the context in which the data were gathered.

Formative/Summative Assessment:
Use Summative Assessment B to measure student understanding


Lesson 5: Five Number Summary and the Box Plot – Duration: 3–5 hour-long periods

Lesson Goals:
Calculate and interpret the range, quartiles, and interquartile range as measures of variability for a data set.
Calculate and interpret the five number summary as a measure of variability for a data set.
Construct and interpret a box-and-whisker plot for a data set.
Determine if a data set has outliers, and discuss how outliers affect the display and analysis of the data.

Lesson:

Five Number Summary
Students examine variability in data. They will compute the range for a set of data and practice dividing data sets into quartiles. Students name and interpret the quartiles, and identify and interpret the data which has been divided into different quartiles. The five number summary is explained and interpreted for two data sets set in a context. Have students do this work in their math journal.

Sharpe Middle School is applying for a grant that will be used to add fitness equipment to the gym. The principal gave a survey to 15 anonymous students. The results from the 15 anonymous students are shown:

0 minutes, 40 minutes, 60 minutes, 30 minutes, 60 minutes
10 minutes, 45 minutes, 30 minutes, 300 minutes, 90 minutes
30 minutes, 120 minutes, 60 minutes, 0 minutes, 20 minutes

A dot plot of the sample of 15 students and the number of minutes they exercise each day is shown.

Time Spent Exercising Each Weekday
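Before drawing the dot plot by hand, the survey results can be tallied so that each stack of dots has the right height. The sketch below is one possible way to do this in Python; it prints one row per value that occurs, so it is a tally rather than a true number line with even spacing.

```python
from collections import Counter

# Minutes of exercise reported by the 15 surveyed students.
minutes = [0, 40, 60, 30, 60, 10, 45, 30, 300, 90, 30, 120, 60, 0, 20]

# Count how many students reported each number of minutes.
counts = Counter(minutes)

# One row per reported value, with one asterisk per student.
for value in sorted(counts):
    print(f"{value:>3} | " + "*" * counts[value])
```

Each asterisk corresponds to one dot above that value on the dot plot.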


Use the dot plot displaying the number of minutes spent exercising each weekday to answer Questions 1 through 11.

1. Describe the measure of variation of the data.

Previously, you examined the mean absolute deviation as a measure of variability to describe the spread of data values. In this lesson, you will examine other measures of variability. Another measure of variation is the range. The range is the difference between the maximum and minimum values of a data set.

2. Calculate the range for the exercise data.

Another set of values that helps to describe variability in a data set is the quartiles. When the data in a set are arranged in order, quartiles are the numbers that split the data into quarters (or fourths).

Add quartile to the journal: quartiles – numbers that divide a data set into 4 equal parts

3. If you’re determining the quartiles, which cut the data set into fourths, how many quartiles will you have? Hint: how many cuts do you make to cut something into fourths? How do you think you can calculate the quartiles for a data set?
4. Determine the median for the exercise data set.
5. What percent of students spend less than the median number of minutes exercising? What percent of students spend more than the median number of minutes exercising?
6. The exercise data for Sharpe Middle School is shown in ascending order.

a. Circle the median from Question 4 on the data set shown.
b. Determine the middle number for the students who spend less than the median number of minutes exercising. Circle this median on the data set.
c. Determine the middle number for the students who spend more than the median number of minutes exercising. Circle this median on the data set.
d. By calculating each of the medians, how many parts have the data now been divided into?

Recall, the numbers that divide a data set into four equal parts are called quartiles. Quartiles are often denoted by the letter Q followed by a number that indicates which fourth it represents. Since the median is the second quartile, it could be denoted Q2. The other quartiles are Q1 and Q3.

7. Add info about quartiles to the journal: quartiles are denoted by the letter Q followed by the number of the fourth they represent.
Q1 – the median of the lower half of the data. (When a single data value is the median [Q2], do not include it in either half: ignore it when crossing out numbers to find Q1. When two middle numbers are averaged to find the median [Q2], neither of them is the median itself, so you DO use them when crossing out numbers to find Q1.)
Q2 – the median of the entire data set
Q3 – the median of the upper half of the data set (see Q1 for specifics about finding Q3)
Give an example for each instance: one where there is only one median value, and one where two numbers are found when determining the median.

8. Name the value of each quartile for the Sharpe Middle School exercise data.
a. Q1
b. Q2
c. Q3
9. What percent of the data is below Q1? What percent of data is above Q1?
10. What percent of the data is below Q3? What percent of data is above Q3?
11. What percent of the data is between Q1 and Q3?

Have them create a chart in their journal showing the different percentages and where they are found.

Additional Practice:
12. Determine the quartiles for each data set. Explain the process you used to calculate your solutions. Then, explain what the quartiles tell you about the data.
a. 24 32 16 18 30 20
b. 200 150 260 180 300 240 280 250

Another measure of variation can be calculated once the quartiles of a data set are determined. The interquartile range, abbreviated IQR, is the difference between the third quartile, Q3, and the first quartile, Q1. The IQR indicates the range of the middle 50 percent of the data. Add to the vocab section of the journal.

13. What do you think is meant by the middle 50 percent?
14. Do you think it is possible for two sets of data to have the same range, but different IQRs? Explain your reasoning.

Practice with another example if students are in need of more practice. You could use the It’s a Snap data from the beginning of the unit. Find the five number summary and find the IQR.

Add to the vocab section of the journal: To summarize and describe the spread of the data values, you can use the five number summary. The five number summary includes the following 5 values from a data set:

Minimum: the least value in the data set

Q1: the first quartile

Q2: the median of the data set


Q3: the third quartile

Maximum: the greatest value in a data set.

Formative Assessment for the five number summary:
A newspaper reporter is writing an investigative story about the wait time at two local restaurants. With the help of her assistant, the reporter randomly selects a time on Saturday afternoon to collect information. She and her assistant then enter each restaurant at that same time for two consecutive Saturday afternoons. The reporter and her assistant randomly select 11 patrons at each restaurant and record how many minutes they had to wait before being served. The results are shown.

1. Calculate and interpret the range of the wait times at each restaurant.
2. Do you think there is much difference in the wait times at each restaurant?
3. Calculate the five number summary for each restaurant’s wait time data values.
4. What does the five number summary tell you about the spread of the data that the range does not tell you?
5. Calculate the IQR for the wait times at each restaurant.
6. What do the IQR values tell you about the time spent waiting in line at each restaurant?
7. If you were the reporter, what might you write in your report?
8. Assuming the food prices and service are the same at each restaurant, which restaurant would you go to if you were in a hurry? Explain your reasoning.
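The journal rule for quartiles in this lesson (leave the median out of both halves when a single value is the median, keep both middle values when two are averaged) can be sketched in Python for checking answers. This is an illustrative sketch, not a standard library routine: the function names are made up for this example, other textbooks and software use slightly different quartile rules, and the data are the Sharpe Middle School exercise minutes because the restaurant wait times are not listed here.

```python
def median(sorted_values):
    """Median of a list that is already sorted in ascending order."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(data):
    """Five number summary using the journal's quartile rule."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # With an odd count, the single median value is left out of both halves.
    upper = values[half + 1:] if n % 2 == 1 else values[half:]
    return {
        "min": values[0],
        "Q1": median(lower),
        "Q2": median(values),
        "Q3": median(upper),
        "max": values[-1],
    }


minutes = [0, 40, 60, 30, 60, 10, 45, 30, 300, 90, 30, 120, 60, 0, 20]
summary = five_number_summary(minutes)
print(summary)  # {'min': 0, 'Q1': 20, 'Q2': 40, 'Q3': 60, 'max': 300}

iqr = summary["Q3"] - summary["Q1"]
data_range = summary["max"] - summary["min"]
print(iqr, data_range)  # 40 300
```

For the exercise data, the range (300) is much larger than the IQR (40), which is a useful talking point when students compare what each measure says about spread.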

Box-and-Whisker Plot
Students continue to examine variability in data by calculating the five number summary and constructing its accompanying box-and-whisker plot. They will then construct and interpret box-and-whisker plots. Students also determine if a data set contains any outliers. Have students take notes in their journal.

Give students the following information: There is a special type of graph that displays the variation in a data set. A box-and-whisker plot, or just a box plot, is a graph that displays the data using the five number summary.

1. Examine the box plot shown:


a. How is the median represented in the box-and-whisker plot?
b. How are the quartiles represented in the box-and-whisker plot?
c. Describe the ‘box’ in the box-and-whisker plot.
d. How are the minimum and maximum values represented in the box-and-whisker plot?
e. Describe the whiskers in the box-and-whisker plot.

2. Use the box-and-whisker plot shown to answer each question.

a. Identify the values of the five number summary for the points scored on the math test. Then, explain what those values tell you about the scores on the test.


• Minimum:
• Q1:
• Median:
• Q3:
• Maximum:
b. Identify the range for the test scores.
c. Determine the IQR for the test scores. Then, explain what the IQR represents in this problem situation.
d. What percent of test scores are between Q1 and Q3?
e. How many students took the math test? Explain your answer.
f. Karyn says the median should be at 50 because it is in the middle of the number line values. Do you agree with Karyn’s claim? Explain how you determined your answer.
g. Jamal claims that more students scored between 15 and 40 than between 70 and 90 because the lower whisker is longer than the upper whisker. Do you agree or disagree with Jamal? Explain your reasoning.

You can describe the distribution of a box plot in the same way we described the shapes of stem-and-leaf plots or histograms.

3. How would you describe the distribution of the box plot? Why?
4. Create a box-and-whisker plot to represent the data about the amount of time students spend exercising at Sharpe Middle School. First, recall the values of the five number summary.
a. Minimum:
b. Q1:
c. Median:
d. Q3:
e. Maximum:
5. Use a number line to complete the graph.

a. Label the number line to include the maximum and minimum values of the data set.
b. Represent the minimum and maximum values on a box-and-whisker plot by placing dots above the minimum and maximum values of the number line.
c. Represent the median and quartile values on the box-and-whisker plot by placing vertical lines above the median and quartile values on the number line.
d. To complete the box-and-whisker plot, connect the box to the dots representing the minimum and maximum values. The box and dots are connected by drawing “whiskers.”
e. Name the box-and-whisker plot.
f. Describe the distribution of the box plot.
6. Why do you think the whisker connecting the minimum value with the first quartile is much shorter than the whisker connecting the third quartile with the maximum?
7. Re-examine the original time spent exercising data shown.

a. How many values are below Q1? How many values are above Q3?
b. Why is the right whisker so much longer than the left whisker?
8. Summarize what you can determine about the exercise data by examining the box plot.

Recall that a value that is far away from the other values in a data set is sometimes called an outlier. An outlier is a number in a data set that is significantly lesser or greater than the other numbers.

9. Do you think any of the values from the exercise data could be outliers?

One method for determining if a value is an outlier is to determine if that value is more than 1.5 times the interquartile range away from the quartiles. If a value is greater than Q3 + (1.5 × IQR), or less than Q1 – (1.5 × IQR), then that value is an outlier.

10. Determine if the exercise data set has any outliers using the formula. Explain your reasoning.

Formative Assessment for Box-and-Whisker Plot
Recall the reporter and her assistant who randomly selected 11 patrons at each of two restaurants and recorded how many minutes they had to wait before being served. The five number summaries for each restaurant were as follows:


1. Create box-and-whisker plots for the wait times at each of the restaurants. Use the same number line for each box plot, and place one on top of the other so that they can be compared.
2. What conclusions can you draw about the wait time at the two restaurants?
3. Describe the distribution of each box plot.
4. Would the mean wait time in each restaurant be greater than, less than, or about the same as the median wait time? How do you know?
5. Does either of the restaurants have outliers? How do you know?

You can purchase a box-and-whisker plot foldable from Teachers Pay Teachers for your students’ journal notes. Here’s the link: https://www.teacherspayteachers.com/Product/Box-and-whisker-plot-foldable-1195956
------------------------------------------------------------------------------------------------------------------------
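The 1.5 × IQR outlier rule from the box-and-whisker lesson can be checked with a short Python sketch. It is illustrative only: the function name `find_outliers` is made up for this example, and the quartiles Q1 = 20 and Q3 = 60 are an assumption that comes from applying the journal's quartile rule by hand to the sorted Sharpe Middle School exercise data.

```python
def find_outliers(data, q1, q3):
    """Return the values beyond 1.5 x IQR from the quartiles."""
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # A value must be strictly beyond a fence to count as an outlier.
    return [value for value in data if value < low_fence or value > high_fence]


minutes = [0, 40, 60, 30, 60, 10, 45, 30, 300, 90, 30, 120, 60, 0, 20]

# Assumed quartiles (journal rule): Q1 = 20, Q3 = 60, so IQR = 40
# and the fences are 20 - 60 = -40 and 60 + 60 = 120.
outliers = find_outliers(minutes, q1=20, q3=60)
print(outliers)  # [300]
```

Note that 120 minutes sits exactly on the upper fence, so under the rule as stated (greater than Q3 + 1.5 × IQR) it is not an outlier; only 300 minutes is.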

Culminating Project:

Class Project: Students will work as a class to determine how many hours a week the average 6th grader uses electronic devices. The class will be surveyed (we will discuss how there should be 26 or more participants, or the sample would be too small to really be valid), and the data will be organized into a table. We will then create a frequency table, a line plot, and a histogram. As a class, we will find the mean and median of the data, as well as the MAD, the five number summary, and the IQR. We will write a summary of what the mean and median (the measures of center) say about the data, as well as what the MAD (the measure of variation) says about the data. We will then create a box-and-whisker plot of our data, and finally write a summary of our findings. The summary will include our research question, research participants (who was surveyed), the summaries of the measures of center and the measure of variation, the description of the spread of our data, and the story of our data (what did we find out). We will then create a poster of the box-and-whisker plot of our data, and compare our class with the other sixth grade classes, as well as the sixth grade as a whole.


We will then use the rubric for the individual/partner project to self-assess our work on the class project. This will allow students to see exactly how their individual/partner project is going to be graded.

Individual/Partner Project
A student description of the individual/partner project is found at the end of this unit, as well as the rubric for grading the project.

Homework/Activity:
Use Homework/Activity I for additional practice with the five number summary
Use Homework/Activity J for additional practice with box-and-whisker plots

Resources:
Carnegie Learning – Math Course 1
Khan Academy for additional support: https://www.khanacademy.org/math/probability/descriptive-statistics/box-and-whisker-plots/v/reading-box-and-whisker-plots

Scaffolding Options/UDL:
For ELL students, use Google Translate to translate notes, homework, and tests into Spanish
Provide guided notes to students in SPED and on 504s
The class project done before the individual/partner project will set the guidelines and expectations for the culminating project

Idaho Core Standards Connection:
CCSS.MATH.CONTENT.6.SP.B.4 Display numerical data in plots on a number line, including dot plots, histograms, and box plots.
CCSS.MATH.CONTENT.6.SP.B.5 Summarize numerical data sets in relation to their context, such as by:
CCSS.MATH.CONTENT.6.SP.B.5.A Reporting the number of observations.
CCSS.MATH.CONTENT.6.SP.B.5.B Describing the nature of the attribute under investigation, including how it was measured and its units of measurement.
CCSS.MATH.CONTENT.6.SP.B.5.C Giving quantitative measures of center (median and/or mean) and variability (interquartile range and/or mean absolute deviation), as well as describing any overall pattern and any striking deviations from the overall pattern with reference to the context in which the data were gathered.
CCSS.MATH.CONTENT.6.SP.B.5.D Relating the choice of measures of center and variability to the shape of the data distribution and the context in which the data were gathered.

Formative/Summative Assessment:
Use Summative Assessment C to measure student understanding
Use the Culminating Project to assess student understanding of the unit

Name: Class: Date:

Homework/Activity A

Name: Class: Date:

Homework/Activity B

Hours of Sleep

Robert, a 6th grader at Roosevelt Middle School, usually goes to bed around 10:00 p.m. and gets up around 6:00 a.m. to get ready for school. That means that he gets about 𝟖 hours of sleep on a school night. He decided to investigate the statistical question: How many hours per night do 6th graders usually sleep when they have school the next day?

Robert took a survey of 𝟐𝟗 6th graders and collected the following data to answer the question:

𝟕 𝟖 𝟓 𝟗 𝟗 𝟗 𝟕 𝟕 𝟏𝟎 𝟏𝟎 𝟏𝟏 𝟗 𝟖 𝟖 𝟖 𝟏𝟐 𝟔 𝟏𝟏 𝟏𝟎 𝟖 𝟖 𝟗 𝟗 𝟗 𝟖 𝟏𝟎 𝟗 𝟗 𝟖

Robert decided to make a dot plot of the data to help him answer his statistical question. Robert first drew a number line and labeled it from 𝟓 to 𝟏𝟐 to match the lowest and highest number of hours slept. He then placed a dot above 𝟕 for the first piece of data he collected. He continued to place dots above the numbers until each number was represented by a dot.

1. Complete Robert’s dot plot by placing a dot above the number on the number line for each number of hours slept. If there is already a dot above a number, then add another dot above the dot already there.

2. What are the least and the most hours of sleep reported in the survey of 6th graders?

3. What is the most common number of hours slept?

4. How many hours of sleep describes the center of the data?

5. Think about how many hours of sleep you usually get on a school night. How does your number compare with the number of hours of sleep from the survey of 6th graders?

Here are the data for the number of hours 6th graders sleep when they don’t have school the next day:

𝟕 𝟖 𝟏𝟎 𝟏𝟏 𝟓 𝟔 𝟏𝟐 𝟏𝟑 𝟏𝟑 𝟕 𝟗 𝟖 𝟏𝟎 𝟏𝟐 𝟏𝟏 𝟏𝟐 𝟖 𝟗 𝟏𝟎 𝟏𝟏 𝟏𝟎 𝟏𝟐 𝟏𝟏 𝟏𝟏 𝟏𝟏 𝟏𝟐 𝟏𝟏 𝟏𝟏 𝟏𝟎

6. Make a dot plot of the number of hours slept when there is no school the next day.

7. How many hours of sleep with no school the next day describe the center of the data?

8. What are the least and most hours slept with no school the next day reported in the survey?

9. Do students sleep longer when they don’t have school the next day than they do when they do have school the next day? Explain your answer using the data in both dot plots.

Name: Class: Date:

Homework/Activity C

Stem-and-Leaf Plot

Create a stem-and-leaf plot, or a side-by-side stem-and-leaf plot displaying each given data set.

Name: Class: Date:

Homework/Activity D

Creating a Frequency Table & Line Plot

A biologist collected data to answer the question: “How many eggs do robins lay?”

The following is a frequency table of the collected data:

1. Complete the frequency column.

2. Draw a dot plot of the number of eggs a robin lays.

3. What number of eggs describes the center of the data?

Name: Class: Date:

Homework/Activity E

Creating a Histogram

The frequency table below shows the length of selected movies shown in a local theater over the past six months.

1. Construct a histogram for the length of movies data.

2. Describe the shape of the histogram.

3. What does the shape tell you about the length of movies?

Name: Class: Date:

Homework/Activity F

Variability in Data Distribution

1. Consider the following statement: Two sets of data with the same mean will also have the same variability.

Do you agree or disagree with this statement? Explain.

2. Suppose the dot plot on the left shows the number of goals a boys’ soccer team has scored in 6 games so far this season, and the dot plot on the right shows the number of goals a girls’ soccer team has scored in 6 games so far this season.

a. Compute the mean number of goals for each distribution.

b. For which distribution, if either, would the mean be considered a better indicator of a typical value? Explain your answer.

Name: Class: Date:

Homework/Activity G

Variability and the Mean Absolute Deviation

Calculate the mean absolute deviation for each data set. You may use a graphing calculator for larger data sets.

9. Data set: 4, 5, 9, 4, 8; Mean = 6

10. Data set: 7, 11, 8, 35, 14; Mean = 15

11. Data set: 60, 65, 66, 67, 67, 65; Mean = 65

12. Data set: 22, 26, 29, 23, 26, 21, 28, 24, 25, 26; Mean = 25

14. Data set: 180, 210, 155, 110, 230, 90, 400, 35, 190, 0, 10, 100, 90, 130, 200; Mean = 142

15. Data set: 55, 74, 90, 20, 47, 59, 26, 83, 77, 62, 58, 33, 57, 44, 31; Mean = 54.4

Name: Class: Date:

Homework/Activity H

Describing Distributions Using the Mean and MAD

1. A dot plot of times that five students studied for a test is displayed below.

a. Use the table to determine the mean number of hours that these five students studied. Then, complete the table.

b. Find and interpret the MAD for this data set.

2. The same five students are preparing to take a second test. Suppose that the data were the same except that Ben studied 2.5 hours for the second test (1.5 hours more) and Emma studied only 3 hours for the second test (1.5 hours less).

a. Without doing any calculations, is the mean for the second test the same, higher, or lower than the mean for the first test? Explain your reasoning.

b. Without doing any calculations, is the MAD for the second test the same, higher, or lower than the MAD for the first test? Explain your reasoning.

Name: Class: Date:

Homework/Activity I

Five Number Summary

Vocabulary

Write the term that best completes each statement.

1. The ______________________________ is the difference between the first quartile and the third quartile.

2. The ______________________________ for a set of data is the difference between the maximum and minimum values.

3. ___________________________ are values that divide a data set into four equal parts once the data are arranged in ascending order.

4. A(n) _________________________________ lists the minimum and maximum values, the median, and the quartiles for a set of data.

Determine the range for each given data set.

5. The data are 0, 3, 5, 6, 8, 12, 12, and 15.

6. The data are 5, 5, 6, 10, 13, 16, 16, 18, and 20.

7. The data are 20, 25, 25, 30, 40, 45, and 60.

8. The data are 18, 2, 10, 5, 22, 9, 15, 10, and 3.

Determine the median, Q1, and Q3 for each given data set.

9. The data are 0, 3, 5, 6, 8, 12, 12, and 15.

10. The data are 5, 5, 6, 10, 13, 16, 16, 18, and 20.

11. The data are 20, 25, 25, 30, 40, 45, and 60.

Determine the IQR for each given data set.

12. The data are 5, 5, 6, 10, 13, 16, 16, 18, and 20.

13. The data are 20, 25, 25, 30, 40, 45, and 60.

14. The data are 18, 2, 10, 5, 22, 9, 15, 10, and 3.

15. The data are 52, 33, 24, 68, 70, 39, 50, 32, 41, and 62.

Write the five number summary for each data set.

16. The data are 0, 4, 5, 6, 9, 10, and 12.

17. The data are 20, 28, 30, 31, 31, 32, and 34.

18. The data are 55, 80, 65, 78, 73, 74, 67, and 59.

19. The data are 200, 200, 225, 225, 225, 275, 350, 350, 350, and 400.

20. The data are 5, 3, 5, 0, 5, 8, 7, and 5.
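For checking answers to the exercises above, a five number summary can be sketched in a few lines of Python. This is only an illustration: the function name `five_number_summary` is hypothetical, and it assumes Q1 and Q3 are found as the medians of the lower and upper halves of the sorted data, with the middle value left out of both halves when the number of values is odd. Other quartile methods exist and can give slightly different results.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    and the middle value is left out of both halves when the count is odd.
    """
    values = sorted(data)
    half = len(values) // 2
    lower = values[:half]    # values below the middle
    upper = values[-half:]   # values above the middle
    return (values[0], median(lower), median(values), median(upper), values[-1])

# Problem 16 data
summary = five_number_summary([0, 4, 5, 6, 9, 10, 12])
print(summary)                            # (0, 4, 6, 10, 12)
print("Range:", summary[4] - summary[0])  # Range: 12
print("IQR:", summary[3] - summary[1])    # IQR: 6
```

The range comes from the first and last entries of the summary, and the IQR comes from the second and fourth entries.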

Name: Class: Date:

Homework/Activity J

Box-and-Whisker Plots

Determine the range, the median, Q1, Q3, and the IQR for each given box-and-whisker plot.

1.

Range:

Median:

Q1:

Q3:

IQR:

2.

Range:

Median:

Q1:

Q3:

IQR:

3.

Range:

Median:

Q1:

Q3:

IQR:

Construct a box-and-whisker plot for each data set.

4. The data are 0, 2, 6, 9, 11, 15, 17, and 20.

5. The data are 10, 13, 13, 14, 16, 20, 28, and 35.

6. The data are 50, 60, 60, 70, 75, 90, and 100.

Determine if each data set has any outliers. If so, list them.

7. The data are 0, 10, 11, 14, 15, 16, and 20.

8. The data are 5, 50, 60, 70, 75, 80, 85, 95, and 145.

9. The data are 10, 30, 33, 35, 38, 40, and 60.
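The worksheet does not state which outlier rule to use. The Python sketch below assumes the common 1.5 × IQR rule: a value is an outlier if it is more than 1.5 times the IQR below Q1 or above Q3. The function name `find_outliers` is hypothetical, and the quartiles are assumed to be the medians of the lower and upper halves of the sorted data, with the middle value left out when the count is odd.

```python
from statistics import median

def find_outliers(data):
    """List values beyond 1.5 * IQR from the quartiles (assumed rule)."""
    values = sorted(data)
    half = len(values) // 2
    q1 = median(values[:half])    # median of the lower half
    q3 = median(values[-half:])   # median of the upper half
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]

# Problem 7 data
print(find_outliers([0, 10, 11, 14, 15, 16, 20]))  # [0]
```

For problem 7, Q1 is 10 and Q3 is 16, so the IQR is 6 and the fences fall at 1 and 25. Only the value 0 lies outside them.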

Name: Class: Date:

Summative Assessment A

Name: Class: Date:

Summative Assessment B

Mean Absolute Deviation

1. There are nine judges currently serving on the Supreme Court of the United States. The following table lists how long (number of years) each judge has been serving on the court as of 2013.

a. Calculate the mean length of service for these nine judges. Show your work.

b. Calculate the mean absolute deviation (MAD) of the lengths of service for these nine judges. Show your work.

c. Explain why the mean may not be the best way to summarize a typical length of service for these nine judges.

2. The Lopez and Holland families each have 5 children, including a pair of twins. The names and ages of the children are given.

Lopez family: Rosa, 16; Jose, 8; Lucia, 11; Angel, 5; Carlotta, 5

Holland family: Danielle, 9; Eric, 10; Alexis, 7; Joshua, 10; Cody, 4

a. Calculate the mean age for the children in each family.

b. Complete the tables for the two families.

c. Calculate the mean absolute deviation of the children’s ages in each family.

d. Compare the two results. What does this tell you about the two families?

Name: Class: Date:

Summative Assessment C

Box-and-Whisker Plot

1. Brad wants to estimate the number of points each player earns while playing a math computer game. He decided to take a random sample of 15 anonymous players. The results are shown.

a. Display the data from the sample on the dot plot.

b. Identify the five-number summary for the data from the sample.

c. Construct a box-and-whisker plot to represent the data.

d. Why is the right whisker so much longer than the left whisker?

e. Determine if the data have any outliers. Explain your reasoning.

Use the box-and-whisker plots to answer the questions.

2. Steph made the box-and-whisker plot shown.

a. What is the range?

b. What is the median?

c. What is Q1?

d. What is Q3?

e. What is the IQR?

3. Mark made the box-and-whisker plot shown.

a. What is the range?

b. What is the median?

c. What is Q1?

d. What is Q3?

e. What is the IQR?

f. What percent of the data is over 40?

6th grade Research Project ~ Statistics

Students will work alone or (preferably) with one partner to create a statistical question, which will be the basis of this research project. Students will then do the research to answer their question, using surveys or personal interviews with a group that is large enough to provide a valid sample (26 or more students). The data will then be organized and compiled into graphs so that it can be displayed and analyzed. Finally, students will present their findings to the class.

This project will be a mirror of the 6th grade project in which we researched how many hours a week an average 6th grader uses electronic devices. All of the steps that were done for that class project will be done for this individual/team project. Each step must be checked off by the teacher before you can begin the next step. Here are the steps:

1. Create a statistical question that you will answer with your research.
2. Survey or interview a sample of students (26 or more participants).
3. Compile the data in a chart or other organizational manner.
4. Students create a frequency table of their data.
5. Students create a line plot of their data.
6. Students create a histogram of their data.
7. Students find 2 measures of center: the mean and the median.
8. Students find the measure of variation (MAD).
9. Students find the five number summary and create a box-and-whisker plot of their data.
10. Students describe the spread and variability of their data.
11. Students write a short summary (2 or more sentences) of both measures of center.
12. Students write a short summary (2 or more sentences) of the measure of variation.
13. Students summarize their findings. Summary includes: research question, research participants (who you surveyed), the summaries of the measures of center and the measure of variation, the description of the spread of your data, and the story of your data (what did you find out).
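Steps 4, 5, and 7 can be rehearsed on a small list of numbers before the real survey data are in. The Python sketch below is only an illustration: the values in `responses` are stand-ins borrowed from an earlier practice data set, not real survey results, and a real project needs 26 or more responses.

```python
from collections import Counter
from statistics import mean, median

# Stand-in values, not real survey results
responses = [22, 26, 29, 23, 26, 21, 28, 24, 25, 26]

# Step 4: frequency table (each value and how many times it appears)
frequency = Counter(responses)
for value in sorted(frequency):
    # Step 5: a row of X marks works as a simple text line plot
    print(f"{value:>3} | {'X' * frequency[value]}")

# Step 7: two measures of center
print("Mean:", mean(responses))      # Mean: 25
print("Median:", median(responses))  # Median: 25.5
```

The median lands between the two middle values (25 and 26) because there is an even number of responses.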

This is a big project, so don’t procrastinate! We will have class time to do this project, but some of it will have to be done on your own time, so don’t let time get away from you! I look forward to seeing your beautiful displays of data and hearing the presentations of your research findings! Happy Researching!

Mrs. Ahern

Due Date: ______________________________

Please have your parents/guardians sign this paper, so they are aware of the project and they can help you manage your time. The signature is worth 3 points!

___________________________________________ Student Name

___________________________________________ Parent/Guardian Name (please print)

___________________________________________ Parent/Guardian Signature

6th grade Research Project ~ Statistics Rubric

* Students must hand in the completed project to begin grading. Incomplete work will not be graded.

Description: each category below is scored 1, 2, or 3.

Research Question
1: A question was asked that was not statistical (could be answered with a yes or no).
2: A statistical question was asked, one that needed to be researched to be answered.
3: A thoughtful, specific statistical question was asked that could be answered only through the compilation of data.

Survey
1: Student does not conduct research, or sample size is too small (under 26 students).
2: Student conducts survey of at least 26 students, but no more than 30 students.
3: Student conducts survey of more than 30 students.

Data Compilation
1: Student’s data is compiled in an inefficient manner; data is not valid (sample too small, illegible).
2: Student’s data is compiled in an efficient manner; data is valid.
3: Student’s data is compiled in an organized, efficient manner with explanations in the form of legends, keys, etc.; data is valid.

Measure of Center
1: Student finds only one measure of center (mean or median); does not summarize meaning of either or incorrectly summarizes meaning.
2: Student finds both measures of center (mean and median); summarizes meaning of only one or incorrectly summarizes meaning.
3: Student finds both measures of center (mean and median) and summarizes meaning of both correctly.

Measure of Variation
1: Student finds the measure of variation (MAD) incorrectly and does not summarize meaning at all, or does not summarize meaning correctly.
2: Student finds the measure of variation (MAD) correctly but does not summarize meaning correctly.
3: Student finds the measure of variation (MAD) and summarizes meaning correctly.

Spread of Data
1: Student does not describe the spread of the data or describes it incorrectly, and does not summarize meaning or summarizes incorrectly.
2: Student describes the spread of the data correctly but does not explain its meaning or explains meaning incorrectly.
3: Student describes the spread of the data correctly and explains the meaning correctly.

Frequency Table
1: Student does not create frequency table, or there are major errors so the general idea is not correct.
2: Student does not create frequency table correctly; there are minor errors but the general idea is correct.
3: Student creates frequency table correctly.

Line Plot
1: Student does not create line plot, or there are major errors so the general idea is not correct.
2: Student does not create line plot correctly; there are minor errors but the general idea is correct.
3: Student creates line plot correctly.

Histogram
1: Student does not create histogram, or there are major errors so the general idea is not correct.
2: Student does not create histogram correctly; there are minor errors but the general idea is correct.
3: Student creates histogram correctly.

Boxplot
1: Both the five number summary and the box plot are not correct.
2: Either the five number summary or the box plot is not correct, but the other is.
3: Student finds the five number summary and creates boxplot correctly.

Summary of Findings
1: Summary of findings was too short for meaningful findings, or was not included.
2: Summary of findings was completed minimally but included most of the components that were required.
3: Summary was in-depth, with a rich, well-rounded explanation of findings. Summary includes: research question, research participants (who you surveyed), the summaries of the measures of center and the measure of variation, the description of the spread of your data, and the story of your data (what did you find out).

Parent Signature: 0 points / 3 points

Total points: