distinguishing features of simulation time (clk) dynamic focused on this aspect during the modeling...

42
Distinguishing Features of Simulation • Time (CLK) DYNAMIC focused on this aspect during the modeling section of the course • Pseudorandom variables (RND) STOCHASTIC will focus on this aspect in coming weeks

Post on 21-Dec-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

Distinguishing Features of Simulation

• Time (CLK) DYNAMICfocused on this aspect during the modeling section of the course

• Pseudorandom variables (RND) STOCHASTICwill focus on this aspect in coming weeks

(Pseudo) Random Number Generation

Properties of pseudo-random numbers• Continuous numbers between 0 and 1• Probability of selecting a number in interval (a,b) ~

(b-a) – i.e. Uniformly distributed• Numbers are statistically independent• Can’t really generate random numbers

∞ information – finite algorithm or table

Example: XL spreadsheet function =RAND()• Also, want fast and repeatable...

Random Number Generation

How to generate random numbers– Table look-up– Computer generation: these values cannot be

truly random and a computer cannot express a number to an infinite number of decimal places Pseudorandom numbers

Random Number Generation

Random number seed:

Virtually all computer methods of random number generation start with an initial random number seed. This seed is used to generate the next random number and then is transformed into a new seed value.

Random Generators

• Reasons for pseudorandom numbers:– Flexible policies– Lack of knowledge

• Generate stochastic processes

• Decision making (random decision)

• Numerical analysis (numerical integration)

• Monte Carlo integration

Desirable Properties of Random Number Generators

• Fast• Should not require much memory• Long cycle or period• Should support multiple streams• Sequence should be replicable

– Debugging– Compare various scenarios under similar conditions

• Numbers should come close to:– Uniformity (or known distribution)– Independence

Historical Generator

Midsquare method:1. Start with an initial seed (e.g. a 4-digit

integer).2. Square the number.3. Take the middle 4 digits.4. This value becomes the new seed. Divide

the number by 10,000. This becomes the random number. Go to 2.

Midsquare Method, example

x0 = 5497

x1: 54972 = 30217009 x1 = 2170, R1 = 0.2170

x2: 21702 = 04708900 x2 = 7089, R2 = 0.7089

x3: 70892 = 50253921 x3 = 2539, R3 = 0.2539

Drawback: Hard to state conditions for picking initial seed that will generate a “good” sequence.

Midsquare Generator, examples

“Bad” sequences:• x0 = 5197

x1: 51972 = 27008809 x1 = 0088, R1 = 0.0088 x2: 00882 = 00007744 x2 = 0077, R2 = 0.0077 x3: 00772 = 00005929 x3 = 0059, R3 = 0.0059

• xi = 6500xi+1: 65002=42250000 xi+1=2500, Ri+1= 0.0088xi+2: 25002=06250000 xi+2=2500, Ri+1= 0.0088

Linear Congruential Generator (LCG) Generator

Start with random seed Z0 < m = largest possible integer on machineRecursively generate integers between 0 and M Zi = (a Zi-1 + c) mod m

Use U = Z/m for pseudo-random number get (avoid 0 and 1)

When c = 0 Called Multiplicative Congruential Generator

When c > 0 Mixed LCG

Linear Congruential Generator (LCG) (Lehmer 1951)

Let Zi be the ith number (integer) in the sequence

Zi = (aZi-1+c)mod(m) Zi{0,1,2,…,m-1}

where Z0 = seed

a = multiplier

c = increment

m = modulus

Define Ui = Zi /m (to obtain U(0,1) value)

LCG, example

16-bit machine

a = 1217 c = 0 Z0 = 23 m = 215-1 = 32767

Z1 = (1217*23) mod 32767 = 27991

U1 = 27991/32767 = 0.85424

Z2 = (1217*27991) mod 32767 = 20134

U2 = 20134/32767 = 0.61446

An LCG can be expressed as a function of the seed Z0

THEOREM:

Zi = [aiZ0+c(ai-1)/(a-1)] mod(m)

Proof: By induction on i

i=0 Z0 = [a0Z0+c(a0-1)/(a-1)] mod(m)

Assume for i. Show that expression holds for i+1

Zi+1 = [aZi+c] mod(m)

= [a {[aiZ0+c(ai-1)/(a-1)] mod(m)} +c] mod(m)

= [ai+1Z0+ac(ai-1)/(a-1) +c] mod(m)

= [ai+1Z0+c(ai+1-1)/(a-1) ] mod(m)

Examples:

Zi = (69069Zi-1+1) mod 232 Ui = Zi /232

Zi = (65539Zi-1+76) mod 231 Ui = Zi/231

Zi = (630360016Zi-1) mod (231-1) Ui = Zi/231

Zi = 1313Zi-1 mod 259 Ui = Zi/259

What makes one LCG better than another?

A full period (full cycle) LCG generates all m values before it cycles.

Consider Zi = (3Zi-1+2) mod(9) with Z0 =7

Then Z1 = 5 Z2 = 8 Z3 = 8 Zj = 8 j = 3,4,5,6,…

On the other hand Zi = (4Zi-1+2) mod(9) has full period.

Why?

Random Number Generation

Mixed congruential generator is full period if

1. m = 2B (B is often # bits in word) fast

2. c and m relatively prime (g.c.d. = 1)

3. If 4 divides m, then 4 divides a – 1(e.g., a = 1, 5, 9, 13,…)

The period of an LCG is m (full period or full cycle) if and only if

— If q is a prime that divides m, then q divides a-1

— The only positive integer that divides both m and c is 1

— If 4 divides m, then 4 divides a-1.

ExamplesZi+1 = (16807Zi+3) mod (451605),

where 16807 =75, 16806 =(2)(3)(2801), 451605 =(3)(5)(7)(11)(17)(23)

This LCG does not satisfy the first two conditions.

Zi+1 = (16807Zi+5) mod (635493681)where 16807 =75, 16806 = (2)(3)(2801), 635493681 = (34)(28012)

This LCG satisfies all three conditions.

- m = 2B where B = # bits in the machine is often a good choice to maximize the period.

- If c = 0, we have a power residue or multiplicative generator.

Note that Zn = (aZn-1) mod(m) Zn = (anZ0) mod(m).

If m = 2B, where B = # bits in the machine, the longest period is m/4 (best one can do) if and only if

— Z0 is odd— a = 8k+ 3, kZ+ (5,11,13,19,21,27,…)

Random Number Generation

Other kinds of generators

• Quadratic Congruential Generator– Snew = (a1 Sold

2 + a2 Sold2 + b) mod L

• Combination of Generators– Shuffling – L’Ecuyer – Wichman/Hill

• Tausworthe Generator– Generates sequence of random bits

Feedback Shift Generators

• Tausworthe, Math of Computing 1965

• If {ak} is a sequence of binary digits (0 or 1) defined byak = (c1ak-1 + c2ak-2 + … + cpak-p)mod 2

and the c’s are relatively prime, then {ak} has period 2p-1

IBM - Randu

If c = 0 power residue generator (multiplicative generator)

un = anu0 mod m

un = a un-1 mod m (homework)

NOTES

—Never “invent” your own LCG. It will probably not be “good.”

—All simulation languages and many software packages have their own PRN generator. Most use some variation of a linear congruential generator.

—Power residue generators are the most common.

Tests of RNG, cont’d

• Theoretical tests– Prove sample moments over entire cycle are

correct– Lattice structure of LCGs

• “random numbers fall mainly in the planes” (Marsaglia)

• Spacing hyperplanes: the smaller, the better

Tests of Random Number Generators

• Empirical tests– Uniformity

• Compute sample moments• Goodness of fit

– Independence• Gap Test• Runs Test• Poker Test• Spectral Test• Autocorrelation Test

Testing Random Number Generators

Desirable Properties:

• Mean and Variance

Theorem: E 1/2 and V 1/12 as m+.

Proof:

For a full period LCG, every integer value from 0 to m-1 is represented. Therefore

E = (0+1+…+(m-1))/m2 = ((m-1)(m)/2)/m2 = (1/2)-(1/2m)

V = ((02+12+22+…+(m-1)2)/m3) - E2

= [(m)(m-1)(2m-1)/6]/m3 - [(1/2) - (1/2m)]2

= [(1/12) - (1/12m2)]

• Uniformity

2 Goodness of Fit Test

— Divide n observations into k (equal) intervals

— Do a frequency count fi, i=1,2,…,k

— Compute X2 = i (fi -n/k)2 / (n/k)

= i (fi -npi)2 / npi,

where pi = 1/k, i=1,2,…,k.

• • •

f1 f2 fk-1 fk

01

k

2

k

k 2

k

k 1

k

e1 e2 ek-1 ek

ei = expected number of observations in interval i = n pi = n / k, i = 1, 2, …, k

Data Classification

1

NOTE

— (fi -npi)/(npi)1/2 is the N(0,1) approximation of

a multinomial distribution for pi small, where

E[fi] = npi and Var [fi] = npi (1-pi)).

— For n large, X2 is distributed 2 with k-1 degrees of

freedom

— Reject randomness assumption X2 > 2

1k

NOTE: if X2 is too close to zero, it may be because the

numbers have been “fudged.”

BE WARY OF PRN WHICH LOOK TOO RANDOM

Do NotReject HO

Reject HO

k 1,2

2 Goodness of Fit Test

- Repeat test m times with independent samples of size n

- If H0 is true, test will reject H0 m times (on average)

Trouble Spots

—Choosing the intervals evenly

—Choosing the intervals such that you would expect each class to contain at least 5 or 10 observations

—pi should (ideally) be small (<.05)

Example

n = 1000

[0, .1) fi = 87 [.1, .2) fi = 93 [.2, .3) fi = 113

[.3, .4) fi = 106 [.4, .5) fi = 108 [.5, .6) fi = 99

[.6, .7) fi = 91 [.7, .8) fi = 95 [.8, .9) fi = 103

[.9, 1.] fi = 105

X2 = 628/100 = 6.28 Do not reject H0 : U(0,1).

9,.052 16.919

NOTE

—The 2 goodness of fit test is also used to fit

distributions to data, where

X2 = i (fi -ei)2 / ei

ei = expected number of observations in interval i.

Kolmogorov-Smirnov Goodness-of-fit Test

—Order n U[0,1] variates {x[i]}

—Construct an empirical CDF for the n variates {x[i]}(i.e., F(x[i]) = i/n i = 1,2,…,n)

—Construct a hypothesized CDF for n uniform variates

(i.e., = x, 0x1)

—Compute D = max {D+, D-}, where D+ = Max1<i<n [(i/n)-

D- = Max1<i<n [ -((i-1)/n)].

—Check tables• Reject if D is too large, with a risk , which

means that we reject (uniformity) falsely with probability .

(x)F~

˜ F (x[i])]

)(xF~

[i]

1.00

.75

.50

.25

0 .1 .2 .3 .9 1.0

D+ = max {.15, .30, .45, .10}=.45

D- = max {.10, -.05, -.20, .15}=.15

D+

D-

Examples

— If {Ui} = {.1, .2, .3, .9}, then D = .45.

— If {Ui} = {.2, .6, .8, .9}, then D = .35.

— If {Ui} = {.25, .5, .75, 1.}, then D = .25.

NOTE: The minimum value that D can take on is 1/2n. (How?)

Independence

— Sign Test

* Test Statistic: S = runs of numbers above or below median)

* For large N, S is distributed N( = 1+(N/2), 2 = N/2)

Example

N = 15, S = 7, distributed N( = 8.5, 2 = 15/2)

Maximum value for S: N (negative dependency)

Minimum value for S: 1 (positive dependency)

.87 .15 .23 .45 .69 .32 .30 .19 .24 .18 .65 .82 .93 .22 .81

+ - - - + - - - - - + + + - +

Normal Curve Rejection Regions

REJECT(+ve)

REJECT (-ve)Do

NotREJECT

Reject H0 in favor of HA if

Z = (S - (1+(N/2))) / (N/2)1/2 Z/2 or Z Z/2

Z/2-Z/2

H0 : IndependenceHA : Dependence

— Runs Up and Down Test (runs of increasing and decreasing numbers)

• Assign + if xi <xi+1, assign - if xi>xi+1

• Test Statistic: S = number of runs up AND down (sequence of + and -)

• E(S) = (2N-1)/3, V(S) = (16N-29)/90• Use Normal approximation for N>30.

Example:N = 15, S = 8, distributed N(µ = 29/3, 2 = 211/90)Maximum value for S: N-1 (negative dependency)Minimum value for S: 1 ?

.87 .15 .23 .45 .69 .32 .30 .19 .24 .18 .65 .82 .93 .22 .81

- + + + - - - + - + + + - +

Normal Curve Rejection Regions

REJECT (-ve)REJECT

Do NotREJECT

H0 : IndependenceHA : Dependence

Z/2-Z/2

Reject H0 in favor of HA if

Z = (S - (2N-1)/3) / (16N-29/90)1/2 Z/2 or Z Z/2

Test of Cycling

Floyd’s Test for Cycling

• Assume ui = G(ui-1)

• x0 = y0 = seeds

• xi = G(xi-1) yi = G(G(yi-1)), i.e. skip every other one so y will go twice as fast as x. Then check to see if there is some value of n for which xn = yn.

• If xn = yn, cycling occurred.

Marsaglia’s Theorem

All N-tuples generated by a congruential generator will fall in fewer than (N!m)1/N hyperplanes.

(Proc. Nat. Acad. Sci. 61, 1968 pp.25-28)e.g. all 10-tuples fall in fewer than 13 9-

dimensional planes for m = 216. Randu in ONLY 15 PLANES in 3D cube.

(Solution: Make m bigger – limited by computer word size.)

Plot of RNDi+1 vs RNDi using LCG in SIGMA