
Schoolopoly

Mathematical Goals: Teachers will be able to
o Understand how to simulate randomness with pseudo-random number generators, and use data from repeated trials to make informed estimates of reasonable intervals.
o Consider the effect of sample size and variability when comparing empirical results with expected theoretical outcomes.

Pedagogical Goals: Teachers will be able to
o Consider the benefits and drawbacks of using computing tools to generate pseudo-random numbers.
o Examine the benefits and drawbacks of using multiple representations to examine data resulting from a random process.
o Compare the use of graphing calculators and spreadsheets for conducting simulations and analyzing data generated from random processes.

Technological Goals: Teachers will be able to use a technological tool to
o Learn the importance of seed values in pseudo-random number generators.
o Use graphing calculators to generate a sequence of random integers, save the sequence as a List, and construct a histogram of the results.
o Use spreadsheets to generate a sequence of random integers, tally the occurrence of specific values, and construct graphs of the results.

Mathematical Practices: Make sense of problems and persevere in solving them. Reason abstractly and quantitatively. Construct viable arguments and critique the reasoning of others. Model with mathematics. Use appropriate tools strategically. Attend to precision. Look for and make use of structure.

Length of session: 90 minutes

Materials needed: Computer with TinkerPlots, CalibratedCubes.tp file, DazzlingDice.tp file, DeltasDice.tp file, DiceDepot.tp file, DiceRUs.tp file, HighRollersInc.tp file, Graphing calculators, Schoolopoly Participant handout

Overview: In this session, participants will use TinkerPlots simulations to determine which dice company actually sells fair dice. Then, participants will consider how simulations may be represented with real objects (such as coins) and how a graphing calculator can model a real-world situation and collect a large number of trials without having to flip a coin repeatedly. Participants will also consider the “randomness” of technology tools.

Adapted from:
Lee, H. S., Hollebrands, K. F., & Wilson, P. H. (2010). Designing and using probability simulations. In Preparing to teach mathematics with technology: An integrated approach to data analysis and probability (pp. 103-125). Dubuque: Kendall Hunt.
Tarr, J. E., Stohl Lee, H., & Rider, R. (2006). When data and chance collide: Drawing inferences from simulation data. In G. F. Burrill & P. C. Elliott (Eds.), Thinking and reasoning with data and chance: Sixty-eighth NCTM yearbook (pp. 139-150). Reston, VA: National Council of Teachers of Mathematics.

Estimated # of Minutes    Activity

5 minutes Introduction

Probabilistic reasoning is difficult and our intuitions often lead us astray. Probability is based on randomness and thus removes cause and effect relationships we often want to look for.

Engaging in simulation activities that are grounded in real world contexts can give students a better intuition for probability than they typically develop when computing a probability using rules and formulas.

25 minutes Schoolopoly

Consider the following problem:

Suppose your school is planning to create a board game modeled on the classic game of Monopoly. The game is to be called Schoolopoly and, like Monopoly, will be played with dice. Because many copies of the game are expected to be sold, companies are competing for the contract to supply dice for Schoolopoly. Some companies have been accused of making poor quality dice and these are to be avoided since players must believe the dice they are using are actually “fair.” Each company below has provided dice for analysis.

o Calibrated Cubes
o Dazzling Dice
o Delta’s Dice
o Dice Depot
o Dice R’ Us
o High Rollers, Inc.

Working with a partner, investigate whether the dice sent from each company are fair or biased.

Participants should find only one company with fair dice. They should be able to support their reasoning with data (graphs, tables, etc.). Allow participants to share their findings with each other and discuss how this task might help motivate students to want to generate their own simulations for determining the probability of events.

Questions to consider:
1. Do you believe the dice from each company are fair or biased? Which company would you recommend purchasing dice from? The only fair dice are from Dice R’ Us. The other companies all have biased dice and should not be recommended as the provider of dice for the game.

2. What compelling evidence do you have that the dice of each company are fair or unfair? Results may include tables of values, histograms, pie charts, etc.

3. Use your data to estimate the probability of each outcome, 1-6, of the dice for each company. See chart below. Participants’ values may differ slightly, but if enough trials were conducted, their empirical probabilities should be close to the theoretical probabilities.

Weight (theoretical probability) for each outcome, 1 through 6, by company:

Company             1           2           3           4           5           6
Calibrated Cubes    3 (0.15)    3 (0.15)    3 (0.15)    3 (0.15)    3 (0.15)    5 (0.25)
Dazzling Dice       2 (0.125)   3 (0.1875)  3 (0.1875)  3 (0.1875)  3 (0.1875)  2 (0.125)
Delta’s Dice        2 (0.1333)  3 (0.2)     2 (0.1333)  3 (0.2)     2 (0.1333)  3 (0.2)
Dice Depot          2 (0.1111)  3 (0.1667)  4 (0.2222)  4 (0.2222)  3 (0.1667)  2 (0.1111)
Dice R’ Us          1 (0.1667)  1 (0.1667)  1 (0.1667)  1 (0.1667)  1 (0.1667)  1 (0.1667)
High Rollers, Inc.  4 (0.16)    5 (0.2)     5 (0.2)     5 (0.2)     1 (0.04)    5 (0.2)

Note: The weights and corresponding theoretical probabilities are shown in the table above. To alter the task, you may unlock each of the samplers by typing “Fair” as the password and then selecting to show the contents of the sampler.
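For facilitators who want to check the table outside TinkerPlots, the following Python sketch rolls a weighted die for each company and compares the observed proportions with the theoretical ones. It is only an illustration: the weights are copied from the table, the assignment of weights to outcomes 1 through 6 in order is an assumption, and the function names, the seed, and the choice of 1000 trials are hypothetical rather than part of the session materials.

```python
import random
from collections import Counter

# Weights for outcomes 1 through 6, copied from the table above
# (the left-to-right order of outcomes is assumed).
WEIGHTS = {
    "Calibrated Cubes": [3, 3, 3, 3, 3, 5],
    "Dazzling Dice": [2, 3, 3, 3, 3, 2],
    "Delta's Dice": [2, 3, 2, 3, 2, 3],
    "Dice Depot": [2, 3, 4, 4, 3, 2],
    "Dice R' Us": [1, 1, 1, 1, 1, 1],
    "High Rollers, Inc.": [4, 5, 5, 5, 1, 5],
}


def theoretical_probabilities(weights):
    """Convert sampler weights into theoretical probabilities."""
    total = sum(weights)
    return [w / total for w in weights]


def empirical_probabilities(weights, trials, rng):
    """Roll a weighted die `trials` times and return the observed proportions."""
    rolls = rng.choices(range(1, 7), weights=weights, k=trials)
    counts = Counter(rolls)
    return [counts[face] / trials for face in range(1, 7)]


if __name__ == "__main__":
    rng = random.Random(2024)  # arbitrary seed so the run can be repeated
    for company, weights in WEIGHTS.items():
        theory = theoretical_probabilities(weights)
        observed = empirical_probabilities(weights, 1000, rng)
        print(company)
        print("  theoretical:", [round(p, 4) for p in theory])
        print("  empirical:  ", [round(p, 4) for p in observed])
```

Changing the number of trials from 1000 to 50 or to 50000 gives a quick feel for how many rolls are needed before a biased die can be told apart from a fair one.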

10 minutes Simulating Randomness in Technology Tools

Technology tools use deterministic algorithms to generate “random” numbers. They all start with an initial input, or seed value, which tells the computer where in the sequence of numbers to begin its computation. Thus, if you know the seed value and the algorithm being used, it is possible to predict the output. For this reason we call these tools pseudo-random number generators: the process is not truly random. However, if you do not know the seed value or the algorithm used, it is unlikely that you can predict the output of a computer’s “random” function, so pseudo-random generators are accepted for simulations where we want to model a random process.

Let’s consider a graphing calculator. When you first take the graphing calculator out of the package, the seed value is set to 0. So, prior to using the rand function, you will want to change the seed value on each calculator to ensure students receive different outputs.
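To make the idea of a deterministic algorithm with a seed concrete, here is a minimal Python sketch of a toy linear congruential generator. It is not the algorithm used by any particular calculator; the constants and the names make_lcg and first_rolls are illustrative assumptions chosen only to show that the same seed always reproduces the same “random” list.

```python
def make_lcg(seed, a=1103515245, c=12345, m=2**31):
    """Return a function that yields the next pseudo-random integer.

    Toy linear congruential generator: state -> (a * state + c) mod m.
    The constants are illustrative only.
    """
    state = seed % m

    def next_value():
        nonlocal state
        state = (a * state + c) % m
        return state

    return next_value


def first_rolls(seed, count=5):
    """Map the generator's output onto die faces 1 through 6."""
    generator = make_lcg(seed)
    # Use the higher bits; the low bits of this kind of generator repeat quickly.
    return [1 + (generator() >> 16) % 6 for _ in range(count)]


if __name__ == "__main__":
    print(first_rolls(0))       # a "calculator" seeded with 0 ...
    print(first_rolls(0))       # ... and a second one: the same list
    print(first_rolls(120714))  # a different seed starts from a different state
```

Anyone who knows the seed and the three constants can predict every value, which is exactly why the output is called pseudo-random.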

To set the seed value:
1. Type in a value. One way to choose a seed value is for each student to type in his or her birthday as a 4-digit number followed by a number assigned to them by counting off (e.g., 120714 would be a student whose birthday is December 7th and who is the 14th student in the class).
2. Press the “STO” key (store key).
3. Press the “MATH” key.
4. Choose “PRB.”
5. Select 1:rand.
6. Press enter and the seed value will be displayed.

Let’s simulate a die toss on the graphing calculator. The command randInt returns random integers between a lower bound and an upper bound, inclusive, a given number of times. For example, randInt(1,6,5) will choose an integer from 1 to 6 five times. Each integer is equally likely, so the command models a fair (equiprobable) die.
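The same experiment can be sketched in Python for comparison with the calculator. The helper rand_int below is a hypothetical stand-in for randInt built on Python's standard random module; the seed 120714 comes from the birthday example above, and the second seed is made up.

```python
import random


def rand_int(low, high, count, rng):
    """Return `count` integers chosen uniformly from low..high, inclusive."""
    return [rng.randint(low, high) for _ in range(count)]


if __name__ == "__main__":
    student_a = random.Random(120714)  # seed taken from the birthday example
    student_b = random.Random(120714)  # same seed on a second "calculator"
    student_c = random.Random(30102)   # hypothetical different seed

    print(rand_int(1, 6, 5, student_a))
    print(rand_int(1, 6, 5, student_b))  # identical to the first list
    print(rand_int(1, 6, 5, student_c))
```

Two generators given the same seed produce the same five rolls, which mirrors what happens when every calculator in the room still has its factory seed.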

Questions to consider:
1. Would you prefer to discuss the importance of a seed value with students or to set all of the graphing calculators with different seed values yourself and not discuss the issues with students? Explain your choice. Having students understand how to seed might make learning simulation procedures easier for students, as they will be able to follow along with the teacher or confirm with a neighboring student. Some teachers may prefer to set all the graphing calculators with different seed values in order to avoid class discussions with students who may doubt that computers can really produce random numbers. However, teachers who make this choice are knowingly avoiding the fact that computers must use an algorithm to produce the sequence of numbers.

2. How could you use the fact that many calculators and computers generate the same list of pseudo-random numbers given the same initial seed value to generate discussions with students about randomness in general and the use of computers to simulate probability experiments? Having an algorithm to produce pseudo-random numbers is contrary to the very idea of randomness. This contrast might help students understand stochastic versus deterministic processes. If every graphing calculator produces the same output, that can demonstrate a deterministic algorithm. Inputting different seed values can then illustrate how a stochastic process can be imitated, although it is not truly random. Because random number generators can be seeded differently, they can model random phenomena using computers. Users of computers need to be aware of the limitations of the random number generators used and the importance of seed values, particularly if one is creating a simulation tool (i.e., programming their own application). However, the algorithms used by most computing tools are so sophisticated that they can often simulate a random process more “reliably” than if students were tossing a coin by hand.

20 minutes Using Data to Estimate Probabilities and Design Simulations

Consider the graph below.

The graph has three schools highlighted. The rates for each school describe the actual percent of freshmen from an incoming class of full-time students (population) who enrolled in courses for a second year. College administrators could simply use the past retention rates to calculate a predicted number of freshmen who will return the next year. However, since we can expect some variability from year to year with classes of different sizes, it is helpful to use retention rates as an estimate of the probability of any given freshman returning to school for their sophomore year. This can help administrators make more informed estimates of a reasonable number of returning sophomores.

Questions to consider:
1. Use the retention rates at Chowan College, NC Central and NC State University to estimate the probability that a randomly chosen freshman will continue into their second year at each school. Estimates might be Chowan: 0.5; NCCU: 0.8; NCSU: 0.9.

2. Describe how you would use a coin to simulate the experiment of deciding whether or not any given freshman will continue on the next year at Chowan College. What other objects could you use to conduct a simulation? Since the estimated probability of retention at Chowan is 0.5, flipping a fair coin can model that scenario if heads is taken to be “Student returns” and tails is taken to be “Student does not return.”

3. If you conduct the simulation you described above with 30 trials to represent the number of freshmen, what is a reasonable number of these 30 freshmen you could expect to return the following year? Why? Most students will predict around 15, because 0.5*30=15 successes, though one should expect some variability, and an interval of about 10-20 is reasonable.

4. If you repeat a simulation of 30 trials several times (decide how many times), what similarities or differences do you expect across the results from the different samples of 30 trials? The results of each sample may vary considerably from 15. Most of the samples will likely not be exactly 15, and the results may have a large spread due to the small sample size. Note: Students may NOT actually anticipate a large spread; this is ok. They may predict little variation from 15, such as 13, 14, 15, 16, 17, 18 freshmen returning to school in each sample of 30 trials.

5. Use a coin to conduct a simulation of several samples of 30 trials and record your results. How do the results compare with what you anticipated? Students might be surprised to find more variability in the proportion of heads (or tails) across samples of 30 trials than they anticipated. For example, one such list of frequencies of freshmen returning, with the corresponding proportions (actual results from 10 samples of 30 trials), is

{13, 16, 16, 10, 10, 13, 18, 17, 13, 11}
{0.4333, 0.5333, 0.5333, 0.3333, 0.3333, 0.4333, 0.6, 0.5667, 0.4333, 0.3667}

6. What are some of the potential benefits and drawbacks of using a real world context like the freshman retention rate for introducing probability to students, especially terminology such as outcomes, sample space, experiment, trial, event, sample, and sample size? A real world context can help students see how probability is applied in a real world setting and can sometimes help students be engaged in the problem. The real world context can also help students attach experiential meaning to terms such as outcome or sample space, and can help them think about what the possible different outcomes are that make up the sample space. It also brings up important issues of using probability to model a real world context, in that assumptions have to be made about the repeatability of an experiment under identical constraints. It is often difficult to distinguish between trial and sample, and a real context can help with that. With the freshman retention context, every freshman represents a trial with a 50% chance of returning, and the freshman class is a sample of n freshmen (trials). Mapping the real context onto the coin toss simulation can help students simplify the experiment. However, this can also be a lot to think about in the problem and some students may still confuse some of the terminology.
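The coin-flipping experiment in questions 3 through 5 can also be sketched in Python, which makes it quick to repeat the ten samples many times. This is only an illustration under the stated 50% retention estimate; the function names are hypothetical, and each freshman is modeled as an independent trial.

```python
import random


def returning_count(class_size, p_return, rng):
    """Count simulated freshmen who return; each returns with chance p_return."""
    return sum(1 for _ in range(class_size) if rng.random() < p_return)


def repeated_samples(num_samples, class_size, p_return, rng):
    """Repeat the whole experiment and collect the count from each sample."""
    return [returning_count(class_size, p_return, rng) for _ in range(num_samples)]


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so each run gives different results
    counts = repeated_samples(10, 30, 0.5, rng)
    proportions = [round(c / 30, 4) for c in counts]
    print("returning:  ", counts)
    print("proportions:", proportions)
    print("range:", min(counts), "to", max(counts))
```

Running the script several times and comparing the printed ranges with the 13 to 18 that students often predict can help start the discussion in question 5.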

30 minutes Simulating Events with a Graphing Calculator

Still considering the retention rate situation of Chowan College, recall our sample space has two possible outcomes: student returns and student does not return. To simulate this situation using our graphing calculators, we will use the randInt function, assigning two consecutive integers to represent the two outcomes.

Questions to consider:
1. Suppose the freshman class at Chowan College has 500 students. What command would you use on the graphing calculator to conduct a simulation of whether or not each of the 500 freshmen stays in school with a 50% estimate for the probability for retention? Explain each part of the command in terms of the context of the situation. One possible command is randInt(0, 1, 500). This function generates 500 random integers of value 0 or 1. In this simulation, 0 could represent “student does not return”, 1 could represent “student returns”, and 500 is the size of the freshman class.

2. Given a 50% estimate for the probability for retention, out of 500 freshmen, what is a reasonable interval for the proportion of freshmen you would expect to return the following year? Defend your expectation. Estimates will vary. After running 10 simulations, one could expect between 46% and 54% (about 230 to 270 freshmen). Some may guess an interval as narrow as 49%-51%, while a wider and less reasonable spread (too wide considering 500 trials) would be 40%-60%.

Now we will model this situation using our graphing calculators. We will store the set of numbers as a list so that we can also view a histogram of our output and observe what happens when we repeat the simulation.
o Type randInt(0, 1, 500) → L1, using the “STO” key to enter the arrow. Here we will generate a list of 500 random integers, each either 0 or 1. A result of 0 will represent a student who does not return, while a result of 1 represents a student who returns.
o When you press enter, you will see the list of the 500 random integers. Now we want a histogram of this list to see the frequency of 0s and 1s.
o Activate Plot1 on the StatPlot menu.
o Select the Histogram icon for the Type of plot and make sure the Xlist displays L1.
o Set the window to range from 0 to 2 for x (scale of 1) and from -50 to 350 for y (scale of 50), with x resolution 1.
o Now if you press the graph key, you should see the histogram.
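For comparison with the calculator steps above, here is a rough Python sketch of the same workflow: generate the list, tally the 0s and 1s, and draw a simple two-bar text histogram. The names simulate_class and text_histogram are hypothetical, and the text bars only approximate what the calculator's histogram displays.

```python
import random
from collections import Counter


def simulate_class(class_size, rng):
    """Python stand-in for randInt(0, 1, class_size): a list of 0s and 1s."""
    return [rng.randint(0, 1) for _ in range(class_size)]


def text_histogram(values, width=40):
    """Return one text bar per outcome, scaled so a full list fills `width`."""
    counts = Counter(values)
    lines = []
    for outcome in sorted(counts):
        bar = "#" * round(counts[outcome] * width / len(values))
        lines.append(f"{outcome} | {bar} {counts[outcome]}")
    return lines


if __name__ == "__main__":
    L1 = simulate_class(500, random.Random())  # plays the role of list L1
    for line in text_histogram(L1):
        print(line)
    print("proportion returning:", sum(L1) / len(L1))
```

Because a 1 stands for a returning student, the sum of the list divided by its length is the sample proportion of freshmen who return.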

Questions to consider:
3. Explain why the values used in the Window setting create the appropriate graphical display of two bars with the frequency of the 0s in the first and the frequency of the 1s in the second. In your response, explain why a y max of 350 is appropriate. The first histogram bin contains all observations from 0 up to, but not including, 1. The second bin contains all observations from 1 up to, but not including, 2. Setting the x range from 0 to 2 allows both of those bins to be visible. The Ymax of 350 is a reasonable estimate of the maximum frequency expected for either outcome and ensures that the window will be high enough to capture the height of each of the bins.

4. Determine the proportion of freshmen who will return to Chowan College next year. Answers will vary but should be near 0.50.

5. For our problem we are interested in how much the proportion of freshmen returning to Chowan College will vary from the expected 50%. Use the skills you have learned to repeat the simulation of deciding the retention of 500 freshmen many times. Record the proportion of freshmen that continue on next year for each of the samples of 500 trials. How many of the sample proportions fall within the interval you predicted in question 2? Discuss why the results may or may not have been consistent with your earlier prediction. Some teachers may be surprised that the range from the simulation data does not match their initial prediction. They may have predicted too wide or too narrow an interval.

6. Recall that in our simulation, each freshman had a 50% chance of returning to school. Did you get exactly 50% of the freshmen returning in the simulation? Discuss how and why the empirical proportions from the simulation may have varied from 50%. Not often, because of expected variation. Note: it is not expected at this point for teachers to actually calculate probabilities. However, the probability of obtaining exactly 50% (250 freshmen, P(250) = 0.0357) is smaller than the probability of obtaining values close to 250 (e.g., P(249 or 251) = 0.0355 + 0.0355 = 0.071). The empirical results nevertheless clump around the value of 0.5.

7. If we reduced the number of trials to 200 freshmen, what do you anticipate would happen to the interval of proportions from the empirical data around the theoretical probability of 50%? Why? Conduct a few samples with 200 trials and compare your results with what you anticipated. Because the sample size decreased, the sample proportions will vary more and the interval will be wider.

8. If we increased the number of trials to 999 freshmen, what do you anticipate would happen to the interval of proportions from the empirical data around the theoretical probability of 50%? Why? Conduct a few samples with 999 trials and compare your results with what you anticipated. Because the sample size increased, the sample proportions will vary less and the interval will be narrower.

9. Based on your experience with the simulations and the empirical data you collected, what would be a reasonable interval for the proportion of freshmen that administrators can expect to return to Chowan College for freshman class sizes of 200, 500, and 999? Explain your predictions. For ten samples of each size, a sample response: from 0.4 to 0.535 with 200 freshmen; from 0.456 to 0.522 with 500 freshmen; from 0.476 to 0.526 with 999 freshmen. As the number of freshmen in a sample increases, more is known about the population of all freshmen that attend Chowan. Thus, we can expect less variability in the proportion of returning students.

10. Discuss why it might be beneficial to have students simulate the freshman retention problem for several samples of sample size 500, as well as sample sizes of 200 and 999. The first simulation might not be very far from the theoretical probability; several simulations increase the chances of students seeing a more extreme case. Also, conducting the simulation several times might help them understand sampling variability.

11. An important aspect of conducting a simulation of a repeated random event is to be sure that students have a conceptual understanding of how the simulation and the commands used in the computing tool represent the context of the problem. Consider the following context:

Suppose a university gives away token spirit gifts to all incoming freshmen. As they check in to pick up their class schedule, they get to randomly choose one of three cards. Each card displays a different gift: keychain, window decal for a car, t-shirt with college logo. Most freshmen would prefer the t-shirt.

Explain how you would help students use the graphing calculator to simulate this context. Explicitly describe what the commands represent and how the students should interpret the results. Let 0 represent the t-shirt, 1 represent the window decal, and 2 represent the keychain. The command randInt(0, 2, 100) → L1 would simulate randomly selecting 100 cards. The numbers recorded in L1 would represent the prizes that would be awarded for each card. Counting the number of 0s would give the number of t-shirts awarded.

12. Suppose that after conducting the above simulations, a student gets a result that a t-shirt was chosen 20% of the time. The student is surprised and claims that this proportion is too low compared with the expected 33.3%. What might this student be misunderstanding? What are some questions you could pose and further simulations you could suggest that might help this student? The student might not understand sampling variability, especially with a sample size of 30 and three possible outcomes. They may also not understand that although 20% is 13.33 percentage points less than the expected 33.33%, it is only a result of 4 fewer than the expected 10 students choosing a t-shirt. Sometimes it is helpful to ask students to consider sample sizes that are at a lower extreme (like 4 or 10) to get students to imagine and experience the variation that can occur from what is expected. One possible question might be “If you flipped a fair coin 4 times, how many heads would you expect? What about 10 times? Could you expect 2 out of 4 or 5 out of 10 every time you tried this?” Students can do these experiments several times to see the variation that occurs. Having them then do repeated samples of 30 freshmen choosing a card can help them notice that an outcome of 6 for any of the 3 possible choices can occur quite easily, especially if you have a whole class of students repeating the simulation several times until someone gets a result of 6/30 for one of the three choices.
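The figures quoted in the answers to questions 6 through 12 can be explored with a short Python sketch. It computes the exact binomial probabilities mentioned in question 6, repeats ten samples at each class size, and counts how often 30 simulated freshmen produce 6 or fewer t-shirts. The function names and the choice of 1000 repetitions are assumptions made for illustration, and the sketch treats every trial as independent.

```python
import math
import random


def binomial_pmf(n, k, p):
    """Exact probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def sample_counts(n, p, repeats, rng):
    """Number of successes in each of `repeats` simulated samples of n trials."""
    return [sum(1 for _ in range(n) if rng.random() < p) for _ in range(repeats)]


if __name__ == "__main__":
    rng = random.Random()  # unseeded; results change from run to run

    # Question 6: exact chance of exactly 250, and of 249 or 251, out of 500.
    print(f"P(250) = {binomial_pmf(500, 250, 0.5):.4f}")
    print(f"P(249 or 251) = {2 * binomial_pmf(500, 249, 0.5):.4f}")

    # Questions 7 to 9: ten samples at each class size.
    for n in (200, 500, 999):
        props = [count / n for count in sample_counts(n, 0.5, 10, rng)]
        print(f"n = {n}: proportions from {min(props):.3f} to {max(props):.3f}")

    # Question 12: 30 freshmen, each picks the t-shirt with chance 1/3.
    shirts = sample_counts(30, 1 / 3, 1000, rng)
    at_most_six = sum(1 for count in shirts if count <= 6)
    print(f"{at_most_six} of 1000 samples had 6 or fewer t-shirts out of 30")
```

The printed intervals will differ from run to run, which is itself a useful prompt for discussing why ten samples give only a rough picture of the spread.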





Schoolopoly Participant Handout

Consider the following problem:

Suppose your school is planning to create a board game modeled on the classic game of Monopoly. The game is to be called Schoolopoly and, like Monopoly, will be played with dice. Because many copies of the game are expected to be sold, companies are competing for the contract to supply dice for Schoolopoly. Some companies have been accused of making poor quality dice and these are to be avoided since players must believe the dice they are using are actually “fair.” Each company below has provided dice for analysis.

o Calibrated Cubes
o Dazzling Dice
o Delta’s Dice
o Dice Depot
o Dice R’ Us
o High Rollers, Inc.

Working with a partner, investigate whether the dice sent from each company are fair or biased.

Questions to consider:
1. Do you believe the dice from each company are fair or biased? Which company would you recommend purchasing dice from?

2. What compelling evidence do you have that the dice of each company are fair or unfair?

3. Use your data to estimate the probability of each outcome, 1-6, of the dice for each company.

Simulating Randomness in Technology Tools

Questions to consider:
1. Would you prefer to discuss the importance of a seed value with students or to set all of the graphing calculators with different seed values yourself and not discuss the issues with students? Explain your choice.

2. How could you use the fact that many calculators and computers generate the same list of pseudo-random numbers given the same initial seed value to generate discussions with students about randomness in general and the use of computers to simulate probability experiments?





Using Data to Estimate Probabilities and Design Simulations

Consider the graph below.

The graph has three schools highlighted. The rates for each school describe the actual percent of freshmen from an incoming class of full-time students (population) who enrolled in courses for a second year. College administrators could simply use the past retention rates to calculate a predicted number of freshmen who will return the next year. However, since we can expect some variability from year to year with classes of different sizes, it is helpful to use retention rates as an estimate of the probability of any given freshman returning to school for their sophomore year. This can help administrators make more informed estimates of a reasonable number of returning sophomores.

Questions to consider:
1. Use the retention rates at Chowan College, NC Central and NC State University to estimate the probability that a randomly chosen freshman will continue into their second year at each school.

2. Describe how you would use a coin to simulate the experiment of deciding whether or not any given freshman will continue on the next year at Chowan College. What other objects could you use to conduct a simulation?

3. If you conduct the simulation you described above with 30 trials to represent the number of freshmen, what is a reasonable number of these 30 freshmen you could expect to return the following year? Why?

4. If you repeat a simulation of 30 trials several times (decide how many times), what similarities or differences do you expect across the results from the different samples of 30 trials?

5. Use a coin to conduct a simulation of several samples of 30 trials and record your results. How do the results compare with what you anticipated?

6. What are some of the potential benefits and drawbacks of using a real world context like the freshman retention rate for introducing probability to students, especially terminology such as outcomes, sample space, experiment, trial, event, sample, and sample size?





Simulating Events with a Graphing Calculator

Still considering the retention rate situation of Chowan College, recall our sample space has two possible outcomes: student returns and student does not return. To simulate this situation using our graphing calculators, we will use the randInt function, assigning two consecutive integers to represent the two outcomes.

Questions to consider:
1. Suppose the freshman class at Chowan College has 500 students. What command would you use on the graphing calculator to conduct a simulation of whether or not each of the 500 freshmen stays in school with a 50% estimate for the probability for retention? Explain each part of the command in terms of the context of the situation.

2. Given a 50% estimate for the probability for retention, out of 500 freshmen, what is a reasonable interval for the proportion of freshmen you would expect to return the following year? Defend your expectation.

Model using your calculator.

Questions to consider:
3. Explain why the values used in the Window setting create the appropriate graphical display of two bars with the frequency of the 0s in the first and the frequency of the 1s in the second. In your response, explain why a y max of 350 is appropriate.

4. Determine the proportion of freshmen who will return to Chowan College next year.

5. For our problem we are interested in how much the proportion of freshmen returning to Chowan College will vary from the expected 50%. Use the skills you have learned to repeat the simulation of deciding the retention of 500 freshmen many times. Record the proportion of freshmen that continue on next year for each of the samples of 500 trials. How many of the sample proportions fall within the interval you predicted in question 2? Discuss why the results may or may not have been consistent with your earlier prediction.

6. Recall that in our simulation, each freshman had a 50% chance of returning to school. Did you get exactly 50% of the freshmen returning in the simulation? Discuss how and why the empirical proportions from the simulation may have varied from 50%.

7. If we reduced the number of trials to 200 freshmen, what do you anticipate would happen to the interval of proportions from the empirical data around the theoretical probability of 50%? Why? Conduct a few samples with 200 trials and compare your results with what you anticipated.

8. If we increased the number of trials to 999 freshmen, what do you anticipate would happen to the interval of proportions from the empirical data around the theoretical probability of 50%? Why? Conduct a few samples with 999 trials and compare your results with what you anticipated.

9. Based on your experience with the simulations and the empirical data you collected, what would be a reasonable interval for the proportion of freshmen that administrators can expect to return to Chowan College for freshman class sizes of 200, 500, and 999? Explain your predictions.

10. Discuss why it might be beneficial to have students simulate the freshman retention problem for several samples of sample size 500, as well as sample sizes of 200 and 999.

11. An important aspect of conducting a simulation of a repeated random event is to be sure that students have a conceptual understanding of how the simulation and the commands used in the computing tool represent the context of the problem. Consider the following context:

Suppose a university gives away token spirit gifts to all incoming freshmen. As they check in to pick up their class schedule, they get to randomly choose one of three cards. Each card displays a different gift: keychain, window decal for a car, t-shirt with college logo. Most freshmen would prefer the t-shirt.

Explain how you would help students use the graphing calculator to simulate this context. Explicitly describe what the commands represent and how the students should interpret the results.

12. Suppose that after conducting the above simulations, a student gets a result that a t-shirt was chosen 20% of the time. The student is surprised and claims that this proportion is too low from the expected 33.3%. What might this student be misunderstanding? What are some questions you could pose and further simulations you could suggest that might help this student?



