RANDOMNESS AND PSEUDORANDOMNESS
Omer Reingold, Microsoft Research and Weizmann
Randomness and Pseudorandomness
When Randomness is Useful
When Randomness can be reduced or eliminated - derandomization
Basic Tool: Pseudorandomness
An object is pseudorandom if it “looks random” (indistinguishable from uniform), though it is not.
Expander Graphs
Randomness In Computation (1)
Distributed computing (breaking symmetry)
Cryptography: Secrets, Semantic Security, …
Sampling, Simulations, …
Can’t live without you
Randomness In Computation (2)
Communication Complexity (e.g., equality)
Routing (on the cube [Valiant]) - drastically reduces congestion
You change my world
Randomness In Computation (3)
In algorithms – useful design tool, but many times can derandomize (e.g., PRIMES in P). Is it always the case?
BPP=P means that every randomized algorithm can be derandomized with only polynomial increase in time
RL=L means that every randomized algorithm can be derandomized with only a constant factor increase in memory
Do I really need you?
In Distributed Computing
Byzantine Agreement        Deterministic   Randomized
Synchronous, t failures    t+1 rounds      O(1)
Asynchronous               impossible      possible
Dining Philosophers: breaking symmetry
Attack Now
Don’t Attack
Randomness Saves Communication
Original File =? Copy
Deterministic: need to send the entire file!
Randomness in the Sky: O(1) bits (or logarithmic in 1/error)
Private Randomness: Logarithmic number of bits (derandomization).
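The shared-randomness ("Randomness in the Sky") equality test can be sketched as follows. This is a minimal illustration, not the protocol from the talk: it assumes both files have the same length, models the shared random string with a common seed, and uses random parities as the fingerprint. The names `parity_fingerprint` and `shared_seed` are hypothetical.

```python
import random

def parity_fingerprint(data: bytes, shared_seed: int, k: int) -> int:
    """Return k parity bits of `data` against k shared random bit strings.

    Assumption: both parties use the same `shared_seed` (the shared random
    string) and files of equal length. Two different files agree on all k
    parities with probability 2**-k, so k = log2(1/error) bits suffice.
    """
    rng = random.Random(shared_seed)
    x = int.from_bytes(data, "big")
    nbits = max(1, 8 * len(data))
    fingerprint = 0
    for _ in range(k):
        r = rng.getrandbits(nbits)             # one shared random string
        parity = bin(x & r).count("1") & 1     # inner product mod 2
        fingerprint = (fingerprint << 1) | parity
    return fingerprint

original = b"the quick brown fox jumps over the lazy dog"
good_copy = bytes(original)
bad_copy = b"the quick brown fox jumps over the lazy cog"

seed = 2024  # stands in for the shared random string
same = parity_fingerprint(original, seed, 32) == parity_fingerprint(good_copy, seed, 32)
differ = parity_fingerprint(original, seed, 32) != parity_fingerprint(bad_copy, seed, 32)
```

Only the k-bit fingerprint is transmitted, independent of the file size.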
In Cryptography
Private Keys: no randomness - no secrets and no identities
Encryption: two encryptions of same message with same key need to be different
Randomized (interactive) Proofs: Give rise to wonderful new notions: Zero-Knowledge, PCPs, …
Random Walks and Markov Chains
When in doubt, flip a coin:
Explore graph: minimal memory
Page Rank: stationary distribution of Markov Chains
Sampling vs. Approx counting. Estimating size of Web
Simulations of Physical Systems …
Shake Your Input
Communication network (n-dimensional cube)
Every deterministic routing scheme will incur exponentially busy links (in the worst case)
Valiant: To send a message from x → y, select node z at random, send x → z → y.
Now: O(1) expected load for every edge
Another example - randomized quicksort
Smoothed Analysis: small perturbations, big impact
In Private Data Analysis
Hide Presence/Absence of Any Individual
How many people in the database have the BC1 gene?
Add random noise to true answer distributed as Lap(Δ/ε)
More questions? More privacy? Need more noise.
[Figure: density of the Laplace noise; ratio bounded]
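A minimal sketch of answering a counting query with Laplace noise. It assumes the noise scale is sensitivity/epsilon, the usual reading of the Laplace mechanism; the parameter names `sensitivity` and `epsilon` and the function names are hypothetical, not taken from the slides.

```python
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Lap(scale): the difference of two i.i.d. exponentials
    with mean `scale` is Laplace-distributed with that scale."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count: int, epsilon: float, rng: random.Random,
                sensitivity: float = 1.0) -> float:
    """Answer a counting query with Laplace noise.

    Assumption: adding or removing one individual changes the count by at
    most `sensitivity` (1 for "how many people have the gene?").
    Smaller epsilon means more privacy and therefore more noise.
    """
    return true_count + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(0)
answers = [noisy_count(100, epsilon=0.5, rng=rng) for _ in range(20000)]
mean_answer = sum(answers) / len(answers)
```

Each individual answer is noisy, but the noise has mean zero, so averages over many independent answers concentrate around the true count. That is also why answering more questions requires more noise per answer.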
Randomness and Pseudorandomness
When Randomness is Useful
When Randomness can be reduced or eliminated - derandomization
Basic Tool: Pseudorandomness
An object is pseudorandom if it “looks random” (indistinguishable from uniform), though it is not.
Expander Graphs
Cryptography: Good Pseudorandom Generators are Crucial
With them, we have one-time pad (and more):
short key K0: 110
derived key K: 01100100
plaintext data: 00001111
E: ciphertext = plaintext ⊕ K = 01101011
D: plaintext data = (ciphertext ⊕ K) = 00001111
Without, keys are bad, algorithms are worthless (theoretical & practical)
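The short-key-to-long-pad idea can be sketched as below. As an assumption for illustration only, SHAKE-256 from Python's `hashlib` stands in for the pseudorandom generator; the slides do not name a specific generator, and `derive_pad`, `encrypt` and `decrypt` are hypothetical names.

```python
import hashlib

def derive_pad(short_key: bytes, length: int) -> bytes:
    """Stretch a short key K0 into a pad K of `length` bytes.

    Assumption: SHAKE-256 plays the role of the pseudorandom generator.
    """
    return hashlib.shake_256(short_key).digest(length)

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(short_key: bytes, plaintext: bytes) -> bytes:
    """ciphertext = plaintext XOR K, with K derived from the short key."""
    return xor_bytes(plaintext, derive_pad(short_key, len(plaintext)))

# XOR is its own inverse, so decryption is the same operation.
decrypt = encrypt

message = b"attack at dawn"
ciphertext = encrypt(b"K0", message)
recovered = decrypt(b"K0", ciphertext)

# The 8-bit example from the slide: plaintext XOR K = ciphertext.
slide_ciphertext = 0b00001111 ^ 0b01100100
```

As with any one-time pad, a derived pad must never be reused for a second message.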
Data Structures & Hash Functions
Linear Probing:
[Figure: hash table with entries Alice, Mike, Bob, Mary; Bob maps to slot F(Bob)]
If F is random then insertion time and query time are O(1) (in expectation).
But where do you store a random function ?!? Derandomize!
Heuristic: use SHA1, MD4, …
Recently (2007): 5-wise independent functions are sufficient*
Similar considerations all over: Bloom filters, cuckoo hashing, bit-vectors, …
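A minimal sketch of linear probing driven by a small hash family instead of a truly random function. It assumes a random degree-4 polynomial over a prime field as the 5-wise independent family (a standard construction, not one named in the slides). The final reduction modulo the table size makes it only approximately 5-wise independent, and `LinearProbingTable` is a hypothetical name.

```python
import random

PRIME = (1 << 61) - 1  # a Mersenne prime, used as the field size

class LinearProbingTable:
    def __init__(self, capacity: int = 16, seed=None):
        rng = random.Random(seed)
        # Five random coefficients define a degree-4 polynomial over Z_PRIME.
        self.coeffs = [rng.randrange(PRIME) for _ in range(5)]
        self.slots = [None] * capacity

    def _hash(self, key: str) -> int:
        x = int.from_bytes(key.encode("utf-8"), "big") % PRIME
        acc = 0
        for c in self.coeffs:          # Horner evaluation of the polynomial
            acc = (acc * x + c) % PRIME
        return acc % len(self.slots)

    def insert(self, key: str) -> int:
        """Place `key` in the first free slot at or after F(key)."""
        i = self._hash(key)
        for _ in range(len(self.slots)):
            if self.slots[i] is None or self.slots[i] == key:
                self.slots[i] = key
                return i
            i = (i + 1) % len(self.slots)
        raise RuntimeError("table is full")

    def contains(self, key: str) -> bool:
        """Scan from F(key) until the key or an empty slot is found."""
        i = self._hash(key)
        for _ in range(len(self.slots)):
            if self.slots[i] is None:
                return False
            if self.slots[i] == key:
                return True
            i = (i + 1) % len(self.slots)
        return False

table = LinearProbingTable(capacity=8, seed=1)
for name in ["Alice", "Mike", "Bob", "Mary"]:
    table.insert(name)
```

Storing the function costs only five field elements, in contrast to the table of values a truly random function would need.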
Weak Sources & Randomness Extractors
Available random bits are biased and correlated
Von Neumann sources: b1 b2 … bi … are i.i.d. 0/1 variables and bi = 1 with some probability p < 1, then translate
01 → 1
10 → 0
Randomness Extractors produce randomness from general weak sources, many other applications
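The von Neumann translation can be sketched as below, using the slide's convention (01 gives 1, 10 gives 0) and discarding the pairs 00 and 11. The function name and the bias value 0.8 are illustrative assumptions.

```python
import random

def von_neumann_extract(bits):
    """Turn i.i.d. biased bits into unbiased bits.

    Read the input in disjoint pairs: 01 -> 1, 10 -> 0, and drop 00 and 11.
    Both surviving pairs have probability p * (1 - p), so each output bit
    is uniform.
    """
    out = []
    for a, b in zip(bits[0::2], bits[1::2]):
        if a != b:
            out.append(b)   # pair (0, 1) yields 1, pair (1, 0) yields 0
    return out

small = von_neumann_extract([0, 1, 1, 0, 1, 1, 0, 0])

rng = random.Random(0)
biased = [1 if rng.random() < 0.8 else 0 for _ in range(200000)]
extracted = von_neumann_extract(biased)
fraction_of_ones = sum(extracted) / len(extracted)
```

The price is output length: on average only p(1 - p) output bits are produced per input pair.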
Algorithms: Can Randomness Save Time or Memory?
Conjecture - No* (*moderate overheads may still apply)
Examples of derandomization:
Primality Testing in Polynomial Time
Graph Connectivity in logarithmic Memory
Holdouts: Identity testing, approximation algorithms, …
(Bipartite) Expander Graphs
∀ S, |S| ≤ K: |Γ(S)| ≥ A |S|  (A > 1)
Important: every (not too large) set expands.
[Figure: bipartite graph with N left vertices, N right vertices, left degree D]
(Bipartite) Expander Graphs
∀ S, |S| ≤ K: |Γ(S)| ≥ A |S|  (A > 1)
Main goal: minimize D (i.e. constant D)
• Degree 3 random graphs are expanders! [Pin73]
(Bipartite) Expander Graphs
∀ S, |S| ≤ K: |Γ(S)| ≥ A |S|  (A > 1)
Also: maximize A.
Trivial upper bound: A ≤ D, even A ≲ D-1
Random graphs: A ≈ D-1
Applications of Expanders
These “innocent” looking objects are intimately related to various fundamental problems:
Network design (fault tolerance), Sorting networks, Complexity and proof theory, Derandomization, Error correcting codes, Cryptography, Ramsey theory, and more ...
Non-blocking Network with On-line Path Selection [ALM]
N (Inputs)
N (Outputs)
Depth O(log N), size O(N log N), bounded degree.
Allows connection between input nodes and output nodes using vertex disjoint paths.
Non-blocking Network with On-line Path Selection [ALM]
N (Inputs)
N (Outputs)
Every request for connection (or disconnection) is satisfied in O(log N) bit steps:
• On line. Handles many requests in parallel.
The Network
[Figure: network built from “lossless” expanders, N inputs to N outputs]
Slightly Unbalanced, “Lossless” Expanders
∀ S, |S| ≤ K: |Γ(S)| ≥ 0.9 D |S|
N left vertices, M = αN right vertices, left degree D
0 < α ≤ 1 is an arbitrary constant; D is constant & K = Ω(M/D) = Ω(αN/D).
[CRVW 02]: such expanders (with D = polylog(1/α))
Property 1: A Very Strong Unique Neighbor Property
∀ S, |S| ≤ K: |Γ(S)| ≥ 0.9 D |S|
S has ≥ 0.8 D |S| unique neighbors!
[Figure: a set S with its unique neighbors and non-unique neighbors marked]
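The 0.8 D |S| bound follows from a short edge count. In the sketch below, u and m are names introduced here for the numbers of unique and non-unique neighbors of S; they do not appear in the slides.

```latex
\begin{align*}
u + m &= |\Gamma(S)| \ge 0.9\,D\,|S|
  && \text{(lossless expansion)} \\
u + 2m &\le D\,|S|
  && \text{(each non-unique neighbor absorbs at least two of the $D|S|$ edges)} \\
u &= 2(u + m) - (u + 2m) \ge 1.8\,D\,|S| - D\,|S| = 0.8\,D\,|S|.
\end{align*}
```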
Using Unique Neighbors for Distributed Routing
Task: match S to its neighbors (|S| K)
S
Step I: match S to its unique neighbors.
S’
Continue recursively with unmatched vertices S’.
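The recursive matching step can be sketched as follows. This is a centralized toy version under stated assumptions: the graph is given as a hypothetical `neighbors` dictionary from left vertices to lists of right vertices, and the recursion is written as a loop. It is not the distributed protocol of the talk.

```python
from collections import Counter

def match_via_unique_neighbors(neighbors, S):
    """Match vertices of S to right-hand neighbors, round by round.

    In each round, every still-unmatched vertex that has a unique neighbor
    (a right vertex adjacent to exactly one unmatched vertex) is matched to
    it; the procedure then repeats on the remaining set S'.
    Returns (matching, unmatched).
    """
    matching = {}
    unmatched = set(S)
    while unmatched:
        # How many unmatched left vertices touch each right vertex?
        counts = Counter(w for v in unmatched for w in set(neighbors[v]))
        newly = {}
        for v in unmatched:
            for w in neighbors[v]:
                if counts[w] == 1:      # w is a unique neighbor of the set
                    newly[v] = w
                    break
        if not newly:                   # no unique neighbors left: give up
            break
        matching.update(newly)
        unmatched.difference_update(newly)
    return matching, unmatched

# Toy graph (not an expander from the talk): left vertex -> right neighbors.
toy = {0: [0, 1], 1: [1, 2], 2: [2, 3]}
matching, leftover = match_via_unique_neighbors(toy, [0, 1, 2])
```

A right vertex used in an earlier round had exactly one neighbor in the set at that time, so it has none in the remaining set and is never reused. With the strong unique-neighbor property, a constant fraction of the set is matched in every round.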
Reminder: The Network
Adding new paths: think of vertices used by previous paths as faulty.
Property 2: Incredibly Fault Tolerant
∀ S, |S| ≤ K: |Γ(S)| ≥ 0.9 D |S|
Remains a lossless expander even if adversary removes (0.7 D) edges from each vertex.
Simple Expander Codes [G63,Z71,ZP76,T81,SS96]
M = αN (Parity Checks), N (Variables)
Linear code; Rate ≥ 1 - M/N = (1 - α).
Minimum distance ≥ K.
Relative distance ≥ K/N = Ω(α/D) = α / polylog(1/α).
For small α beats the Zyablov bound and is quite close to the Gilbert-Varshamov bound of α / log(1/α).
[Figure: bipartite graph with variable bits on one side and parity-check (+) nodes on the other]
Simple Decoding Algorithm in Linear Time (& log n parallel phases) [SS 96]
M = αN (Constraints), N (Variables)
Error set B, |B| ≤ K/2
Algorithm: At each phase, flip every variable that “sees” a majority of 1’s (i.e., unsatisfied constraints).
|Γ(B)| > 0.9 D |B|
|Γ(B) ∩ Sat| < 0.2 D |B|
|Flip \ B| ≤ |B|/4, |B \ Flip| ≤ |B|/4 ⇒ |Bnew| ≤ |B|/2
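The parallel flip rule can be sketched as below. As a stand-in for a lossless expander, the example uses the seven lines of the Fano plane as parity checks (an assumption made so the example stays small and checkable). It illustrates the flip rule only, not the guarantees of the expander-based analysis.

```python
def flip_decode(checks, word, max_phases=100):
    """Parallel bit-flipping decoder.

    `checks` is a list of parity checks, each a list of variable indices.
    In every phase, all variables that see a strict majority of unsatisfied
    checks are flipped simultaneously.
    """
    word = list(word)
    n = len(word)
    checks_of = [[] for _ in range(n)]
    for j, check in enumerate(checks):
        for v in check:
            checks_of[v].append(j)
    for _ in range(max_phases):
        unsat = [sum(word[v] for v in check) % 2 == 1 for check in checks]
        if not any(unsat):
            break                       # all constraints satisfied
        flip = [v for v in range(n)
                if 2 * sum(unsat[j] for j in checks_of[v]) > len(checks_of[v])]
        if not flip:
            break                       # stuck: no variable sees a majority
        for v in flip:
            word[v] ^= 1
    return word

# Toy code: variables are the 7 points of the Fano plane, checks its 7 lines.
FANO_LINES = [[0, 1, 2], [0, 3, 4], [0, 5, 6], [1, 3, 5],
              [1, 4, 6], [2, 3, 6], [2, 4, 5]]
codeword = [0, 0, 0, 1, 1, 1, 1]        # complement of the line {0, 1, 2}
received = list(codeword)
received[4] ^= 1                        # a single error
decoded = flip_decode(FANO_LINES, received)
```

Here the corrupted variable sees three unsatisfied checks out of three, while every other variable shares exactly one line with it and sees one out of three, so a single phase removes the error.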
Random Walk on Expanders [AKS 87]
x0 → x1 → x2 → ... → xi → ...
xi converges to uniform fast (for arbitrary x0).
For a random x0: the sequence x0, x1, x2 . . . has interesting “random-like” properties.
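The fast convergence of the walk can be checked numerically on a small graph. The sketch below uses the Petersen graph (3-regular, connected, non-bipartite) as a stand-in; it is an assumption for illustration, not a graph from the talk, and the function names are hypothetical.

```python
def petersen_graph():
    """Adjacency lists of the Petersen graph on vertices 0..9."""
    adj = {v: [] for v in range(10)}
    edges = ([(i, (i + 1) % 5) for i in range(5)] +          # outer cycle
             [(i, i + 5) for i in range(5)] +                # spokes
             [(5 + i, 5 + (i + 2) % 5) for i in range(5)])   # inner pentagram
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj

def walk_distribution(adj, start, steps):
    """Exact distribution of x_i for a walk started at x0 = start."""
    dist = {v: 0.0 for v in adj}
    dist[start] = 1.0
    for _ in range(steps):
        nxt = {v: 0.0 for v in adj}
        for v, p in dist.items():
            for w in adj[v]:
                nxt[w] += p / len(adj[v])   # move to a uniform neighbor
        dist = nxt
    return dist

def distance_from_uniform(dist):
    """Total variation distance between `dist` and the uniform distribution."""
    u = 1.0 / len(dist)
    return 0.5 * sum(abs(p - u) for p in dist.values())

graph = petersen_graph()
distances = [distance_from_uniform(walk_distribution(graph, 0, t))
             for t in (0, 5, 10, 30)]
```

The values in `distances` shrink geometrically with the number of steps, whatever the starting vertex.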
Thanks