foundations of data science course ramesh hariharan jan 2014hariharan-ramesh.com/ppts/ndim.pdf ·...
TRANSCRIPT
High Dimensional Spaces
Foundations of Data Science Course
Ramesh Hariharan
Jan 2014
Ramesh Hariharan High Dimensional Spaces
What is Volume?
Volume of a cuboid with sides $l_1, \ldots, l_n$ is $l_1 \cdot l_2 \cdots l_n$
For a general object, integrate:
Decompose the object into infinitesimal n-dimensional cuboids
Count the number of such cuboids
Scaling each dimension by $r$ multiplies volume by $r^n$.
Volume of an n-Dimensional Sphere
$V_n(r) = f_n \times r^n$ for radius $r$
$f_1 = 2$
$f_2 = \pi$
$f_3 = \frac{4}{3}\pi$
Does $f_n$ increase or decrease with $n$?
Inductive View of $f_n$
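Inductive Derivation for $f_n$
$f_n = 2 f_{n-1} \int_0^{\pi/2} \sin^n(\theta)\, d\theta$, for $n \ge 2$
$f_1 = 2$
$f_n = 2^n \int_0^{\pi/2} \sin^n(\theta)\, d\theta \int_0^{\pi/2} \sin^{n-1}(\theta)\, d\theta \cdots \int_0^{\pi/2} \sin^1(\theta)\, d\theta$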
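The recurrence can be checked numerically. The following is a minimal Python sketch, not part of the slides: it evaluates each sine integral with Simpson's rule (an assumed choice of quadrature) and applies $f_n = 2 f_{n-1} I_n$ starting from $f_1 = 2$. The names `sine_power_integral` and `sphere_factor` are illustrative.

```python
import math

def sine_power_integral(n, steps=2000):
    """Approximate the integral of sin^n(theta) over [0, pi/2] by Simpson's rule."""
    h = (math.pi / 2) / steps  # steps must be even for Simpson's rule
    total = math.sin(0.0) ** n + math.sin(math.pi / 2) ** n  # endpoint terms
    for i in range(1, steps):
        weight = 4 if i % 2 == 1 else 2
        total += weight * math.sin(i * h) ** n
    return total * h / 3

def sphere_factor(n):
    """f_n from the recurrence f_n = 2 * f_{n-1} * I_n with f_1 = 2."""
    f = 2.0
    for j in range(2, n + 1):
        f *= 2 * sine_power_integral(j)
    return f

# Compare against 2, pi, 4*pi/3 and pi^2/2.
for n in range(1, 5):
    print(n, sphere_factor(n))
```

The printed values should agree with $2$, $\pi$, $\frac{4}{3}\pi$ and $\frac{\pi^2}{2}$ up to quadrature error.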
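Volume of a 1, 2, 3, 4-Dimensional Sphere
$f_1 = 2$
$f_2 = 2^2 \int_0^{\pi/2} \sin^2(\theta)\, d\theta \int_0^{\pi/2} \sin^1(\theta)\, d\theta = \pi$
$f_3 = 2^3 \int_0^{\pi/2} \sin^3(\theta)\, d\theta \int_0^{\pi/2} \sin^2(\theta)\, d\theta \int_0^{\pi/2} \sin^1(\theta)\, d\theta = \frac{4}{3}\pi$
$f_4 = 2^4 \int_0^{\pi/2} \sin^4(\theta)\, d\theta \int_0^{\pi/2} \sin^3(\theta)\, d\theta \int_0^{\pi/2} \sin^2(\theta)\, d\theta \int_0^{\pi/2} \sin^1(\theta)\, d\theta = \frac{\pi^2}{2}$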
Sine Power Integrals
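$\int_0^{\pi/2} \sin^n(\theta)\, d\theta = \frac{n-1}{n} \int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta$
$\int_0^{\pi/2} \sin^n(\theta)\, d\theta = \frac{n-1}{n} \cdot \frac{n-3}{n-2} \cdots \frac{1}{2} \cdot \frac{\pi}{2}$, for even $n$
$\int_0^{\pi/2} \sin^n(\theta)\, d\theta = \frac{n-1}{n} \cdot \frac{n-3}{n-2} \cdots \frac{2}{3}$, for odd $n$
$\int_0^{\pi/2} \sin^n(\theta)\, d\theta \int_0^{\pi/2} \sin^{n-1}(\theta)\, d\theta = \frac{\pi}{2n}$
$\sqrt{\frac{\pi}{2(n+1)}} \le \int_0^{\pi/2} \sin^n(\theta)\, d\theta \le \sqrt{\frac{\pi}{2n}}$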
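A quick numerical sanity check of these identities, as a sketch under the assumption that the base cases are $I_0 = \pi/2$ and $I_1 = 1$ (the slide leaves them implicit). The helper names `sine_power_integral` and `sandwich` are illustrative.

```python
import math

def sine_power_integral(n):
    """Integral of sin^n over [0, pi/2] via the reduction I_n = (n-1)/n * I_{n-2}."""
    value = math.pi / 2 if n % 2 == 0 else 1.0  # base cases I_0 and I_1
    for j in range(2 + n % 2, n + 1, 2):
        value *= (j - 1) / j
    return value

def sandwich(n):
    """Return (lower bound, I_n, upper bound) for the square-root sandwich."""
    lower = math.sqrt(math.pi / (2 * (n + 1)))
    upper = math.sqrt(math.pi / (2 * n))
    return lower, sine_power_integral(n), upper

for n in (1, 2, 10, 100):
    product = sine_power_integral(n) * sine_power_integral(n - 1)
    # The last two printed numbers should match: I_n * I_{n-1} and pi / (2n).
    print(n, sandwich(n), product, math.pi / (2 * n))
```

The sandwich follows from the product identity because $I_n$ decreases in $n$: $I_n I_{n+1} \le I_n^2 \le I_n I_{n-1}$.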
The Formula for $f_n$
$f_n = \frac{\pi^{n/2}}{(n/2)!}$, for even $n$
$f_n = \frac{\pi^{(n-1)/2}}{\frac{n}{2}\left(\frac{n}{2}-1\right)\cdots\frac{1}{2}}$, for odd $n$
$f_n \to 0$ as $n \to \infty$!
The biggest unit sphere sits in 5-d!
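Both cases can be evaluated with one expression. The sketch below assumes the standard Gamma-function form $f_n = \pi^{n/2} / \Gamma(n/2 + 1)$, which is not on the slide but reduces to the even and odd formulas above; it then looks for the dimension with the largest $f_n$.

```python
import math

def sphere_factor(n):
    """f_n = pi^(n/2) / Gamma(n/2 + 1); covers even and odd n in one expression."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

# Tabulate f_n and locate the dimension where it peaks.
factors = {n: sphere_factor(n) for n in range(1, 31)}
largest = max(factors, key=factors.get)
print("dimension with the largest f_n:", largest, factors[largest])
print("f_30:", factors[30])
```

The decay for large $n$ comes from the factorial-like denominator eventually outgrowing $\pi^{n/2}$.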
The Unit Sphere vs the Unit Cube
Corners of a unit cube are distance $\frac{\sqrt{n}}{2}$ from the origin
Center points of each side are distance $\frac{1}{2}$ from the origin
It looks like this
Where is the Volume Concentrated?
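How much of the volume is located outside a band of angle $2\alpha$ around the equator?
$\frac{\int_0^{\pi/2-\alpha} \sin^n(\theta)\, d\theta}{\int_0^{\pi/2} \sin^n(\theta)\, d\theta}$
Denominator: $\int_0^{\pi/2} \sin^n(\theta)\, d\theta \ge \sqrt{\frac{\pi}{2(n+1)}}$
Numerator: $\int_0^{\pi/2-\alpha} \sin^n(\theta)\, d\theta \le\ ?$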
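$\int_0^{\pi/2-\alpha} \sin^n(\theta)\, d\theta \le\ ?$
$\int_0^{\pi/2-\alpha} \sin^n(\theta)\, d\theta = \int_{\sin^2\alpha}^{1} \frac{1}{2\sqrt{y}} (1-y)^{\frac{n-1}{2}}\, dy$, with $y = \cos^2(\theta)$
$\le \frac{1}{2\sin\alpha} \int_{\sin^2\alpha}^{1} e^{-y\frac{n-1}{2}}\, dy$
$\le \frac{1}{(n-1)\sin\alpha}\, e^{-\frac{n-1}{2}\sin^2\alpha}$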
Volume Fraction outside the 2α-angle Equatorial Band
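$\frac{\int_0^{\pi/2-\alpha} \sin^n(\theta)\, d\theta}{\int_0^{\pi/2} \sin^n(\theta)\, d\theta} \le \sqrt{\frac{2(n+1)}{\pi}} \cdot \frac{1}{(n-1)\sin\alpha}\, e^{-\frac{n-1}{2}\sin^2\alpha}$
For $\alpha \sim \sin(\alpha) = \frac{1}{\sqrt{n}}$, this is $\sim \sqrt{\frac{2}{\pi e}} = .4839$
More than half the volume is in a $\frac{2}{\sqrt{n}}$ angle band around the equator.
For $\sin(\alpha) = \frac{a}{\sqrt{n}}$, the above bound is $\sim \sqrt{\frac{2}{\pi}} \cdot \frac{1}{a}\, e^{-\frac{a^2}{2}}$
Reminiscent of the Normal distribution?
$2\int_a^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}\, dx \le \sqrt{\frac{2}{\pi}} \cdot \frac{1}{a}\, e^{-\frac{a^2}{2}}$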
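To see how tight the bound is, the ratio can be evaluated directly. This is a rough sketch, assuming Simpson's rule is accurate enough for the smooth integrand $\sin^n\theta$; the function names and the choice $n = 100$ are illustrative and not from the slides.

```python
import math

def partial_sine_integral(n, upper, steps=4000):
    """Simpson's rule for the integral of sin^n(theta) over [0, upper], n >= 1."""
    h = upper / steps
    total = math.sin(upper) ** n  # the theta = 0 endpoint contributes 0 for n >= 1
    for i in range(1, steps):
        total += (4 if i % 2 == 1 else 2) * math.sin(i * h) ** n
    return total * h / 3

def fraction_outside_band(n, alpha):
    """The volume ratio from the slide, evaluated numerically."""
    return partial_sine_integral(n, math.pi / 2 - alpha) / partial_sine_integral(n, math.pi / 2)

def slide_bound(n, alpha):
    """Upper bound sqrt(2(n+1)/pi) * exp(-(n-1)/2 * sin^2(alpha)) / ((n-1) sin(alpha))."""
    s = math.sin(alpha)
    return math.sqrt(2 * (n + 1) / math.pi) * math.exp(-(n - 1) / 2 * s * s) / ((n - 1) * s)

n = 100
alpha = math.asin(1 / math.sqrt(n))  # sin(alpha) = 1 / sqrt(n)
print(fraction_outside_band(n, alpha), slide_bound(n, alpha))
```

The first number is the actual fraction outside the band and the second is the bound, so the first should not exceed the second.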
Do 2 Equators sum to more than the whole!
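Surface Area $A_n(r)$ of an n-Dimensional Sphere
$\int_0^r A_n(r)\, dr = V_n(r)$
$\frac{dV_n(r)}{dr} = A_n(r)$
$A_n(r) = a_n r^{n-1}$, and $a_n = n f_n$
$a_n = 2 a_{n-1} \int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta$
$a_2 = 2\pi$
$a_n = 2^{n-1}\pi \int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta \int_0^{\pi/2} \sin^{n-3}(\theta)\, d\theta \cdots \int_0^{\pi/2} \sin^1(\theta)\, d\theta$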
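The relation $a_n = n f_n$ and the recurrence for $a_n$ can be cross-checked numerically. The sketch below assumes the Gamma-function forms $f_n = \pi^{n/2}/\Gamma(n/2+1)$ and $\int_0^{\pi/2}\sin^m\theta\, d\theta = \frac{\sqrt{\pi}}{2}\,\Gamma\!\left(\frac{m+1}{2}\right)/\Gamma\!\left(\frac{m}{2}+1\right)$, which are standard identities rather than formulas from the slides.

```python
import math

def sphere_factor(n):
    """Volume factor f_n = pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def area_factor(n):
    """Surface-area factor a_n = n * f_n, from A_n(r) = dV_n(r)/dr."""
    return n * sphere_factor(n)

def sine_power_integral(m):
    """Integral of sin^m over [0, pi/2] in Gamma-function form."""
    return math.sqrt(math.pi) / 2 * math.gamma((m + 1) / 2) / math.gamma(m / 2 + 1)

for n in range(3, 8):
    # Both columns should agree if a_n = 2 * a_{n-1} * I_{n-2}.
    print(n, area_factor(n), 2 * area_factor(n - 1) * sine_power_integral(n - 2))
```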
Inductive View of $a_n$
Dot Product between a Fixed Unit Vector and a Random Unit Vector
A Spherically Symmetric Random Unit Vector: Probability of lying in any specific patch P on the surface is proportional to the area of P.
Dot Product is also the length of the projection of the fixed vector on the random vector.
Dot Product equals $\cos(\theta)$, where $\theta$ is the angle between the two vectors.
$E(\cos^2(\theta))$, $\mathrm{Var}(\cos^2(\theta))$, and tail bounds on $\cos^2(\theta)$?
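$E(\cos^2(\theta))$
$\frac{\int_0^{\pi/2} \sin^{n-2}(\theta)\cos^2(\theta)\, d\theta}{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta} = \frac{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta - \int_0^{\pi/2} \sin^n(\theta)\, d\theta}{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta}$
$= 1 - \frac{n-1}{n}$
$= \frac{1}{n}$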
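$\mathrm{Var}(\cos^2(\theta))$
$\frac{\int_0^{\pi/2} \sin^{n-2}(\theta)\cos^4(\theta)\, d\theta}{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta} - \frac{1}{n^2}$
$= \frac{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta - 2\int_0^{\pi/2} \sin^n(\theta)\, d\theta + \int_0^{\pi/2} \sin^{n+2}(\theta)\, d\theta}{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta} - \frac{1}{n^2}$
$= 1 - \frac{2(n-1)}{n} + \frac{(n-1)(n+1)}{n(n+2)} - \frac{1}{n^2} = \frac{2(n-1)}{n^2(n+2)} \le \frac{2}{n^2}$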
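The mean $\frac{1}{n}$ and variance $\frac{2(n-1)}{n^2(n+2)}$ can be checked by simulation. This is a Monte Carlo sketch, assuming that normalising a vector of independent standard Gaussians gives a spherically symmetric unit vector (a standard construction, not stated on the slides) and taking the fixed vector to be the first coordinate axis. The sample size and seed are arbitrary.

```python
import math
import random

def random_unit_vector(n, rng):
    """Spherically symmetric unit vector: normalise n independent Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cos2_moments(n, samples=20000, seed=0):
    """Sample mean and variance of cos^2(theta) against the fixed vector e_1."""
    rng = random.Random(seed)
    # The dot product with e_1 is just the first coordinate.
    values = [random_unit_vector(n, rng)[0] ** 2 for _ in range(samples)]
    mean = sum(values) / samples
    var = sum((x - mean) ** 2 for x in values) / samples
    return mean, var

n = 10
mean, var = cos2_moments(n)
print("mean:", mean, "expected:", 1 / n)
print("variance:", var, "expected:", 2 * (n - 1) / (n * n * (n + 2)))
```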
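Tail Bounds on $\cos^2(\theta)$
$\Pr\left(\cos^2(\theta) > \frac{a^2}{n}\right) = \frac{\int_0^{\cos^{-1}(a/\sqrt{n})} \sin^{n-2}(\theta)\, d\theta}{\int_0^{\pi/2} \sin^{n-2}(\theta)\, d\theta}$
$\le \sqrt{\frac{2(n-1)(n-2)}{\pi}} \cdot \frac{1}{(n-3)a}\, e^{-\frac{n-3}{2n}a^2} \sim \sqrt{\frac{2}{\pi}} \cdot \frac{1}{a}\, e^{-\frac{a^2}{2}}$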
Projection Length of Fixed Unit Vector on Random Unit Vector
With probability $1 - \sqrt{\frac{2}{\pi}} \cdot \frac{1}{a}\, e^{-\frac{a^2}{2}}$, the projected length is between $0$ and $\frac{a}{\sqrt{n}}$
With probability 0.946, the projected length is between $0$ and $\frac{2}{\sqrt{n}}$
Can we drive the projected length to be much more tightly distributed around $\frac{1}{\sqrt{n}}$?
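The figure 0.946 is the $a = 2$ case of the first expression. A small sketch that evaluates the large-$n$ probability for a few values of $a$; the function name `projection_probability` is illustrative.

```python
import math

def projection_probability(a):
    """Large-n value 1 - sqrt(2/pi) * exp(-a^2 / 2) / a for Pr(length <= a / sqrt(n))."""
    return 1 - math.sqrt(2 / math.pi) * math.exp(-a * a / 2) / a

for a in (1, 2, 3):
    print(a, round(projection_probability(a), 3))
```

Because the tail bound is an asymptotic upper bound, these values should be read as approximate lower bounds on the probability for large $n$.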
Project on to many Random Vectors
Let $X_1, \ldots, X_k$ be the projection lengths on to $k$ independent random unit vectors
The resulting $k$-tuple defines a mapping from n-dimensional space to k-dimensional space
$X = \sqrt{X_1^2 + \cdots + X_k^2}$ is the length of the vector post-mapping
Consider $X^2 = X_1^2 + \cdots + X_k^2$.
Sums of Random Variables
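Since $X_1^2, \ldots, X_k^2$ are i.i.d., $E\left(\frac{X^2}{k}\right) = E(X_1^2)$ and $\mathrm{Var}\left(\frac{X^2}{k}\right) = \frac{\mathrm{Var}(X_1^2)}{k}$
I.e., the distribution of $\frac{X^2}{k}$ preserves the mean but is much tighter around the mean.
$\Pr\left(\left|\frac{X^2}{k} - E\left(\frac{X^2}{k}\right)\right| \ge \alpha\right) \ll \Pr\left(|X_1^2 - E(X_1^2)| \ge \alpha\right)$
$\Pr\left(|X^2 - E(X^2)| \ge k\alpha\right) \ll \Pr\left(|X_1^2 - E(X_1^2)| \ge \alpha\right)$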
Approximate Length Preservation in k-Dimensional Random Projection
$E(X^2) = \frac{k}{n}$, by Linearity of Expectation
$\mathrm{Var}(X^2) \le \frac{2k}{n^2}$, by Linearity of Variance under Independence
With probability $1 - ?$, $X^2$ is in $(1-\varepsilon)\frac{k}{n} \ldots (1+\varepsilon)\frac{k}{n}$
If $?$ is as small as $m^{-3}$...
Union Bound: With probability $1 - m^{-1}$, lengths for $m^2$ distinct fixed vectors of arbitrary lengths are all simultaneously approximately preserved, modulo scaling by $\sqrt{\frac{n}{k}}$!!
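A small simulation illustrates $E(X^2) = \frac{k}{n}$. This is a sketch only: it assumes random unit vectors are produced by normalising Gaussian vectors, and the values $n = 50$, $k = 20$, the trial count and the seed are arbitrary choices, not from the slides.

```python
import math
import random

def random_unit_vector(n, rng):
    """Spherically symmetric unit vector: normalise n independent Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def projected_squared_length(v, k, rng):
    """X^2 = sum of squared dot products of v with k independent random unit vectors."""
    n = len(v)
    total = 0.0
    for _ in range(k):
        u = random_unit_vector(n, rng)
        total += sum(a * b for a, b in zip(u, v)) ** 2
    return total

def mean_rescaled_squared_length(n=50, k=20, trials=500, seed=1):
    """Average of (n/k) * X^2 over independent trials for one fixed unit vector."""
    rng = random.Random(seed)
    v = random_unit_vector(n, rng)  # an arbitrary fixed unit vector
    total = sum(projected_squared_length(v, k, rng) for _ in range(trials))
    return total * n / (k * trials)

print(mean_rescaled_squared_length())
```

After rescaling by $\frac{n}{k}$ the average squared length should come out close to 1, the squared length of the original unit vector. A single trial fluctuates more, which is what the tail bounds quantify.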
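Asymptotic Tight Concentration for $X^2$
By CLT, for $k \to \infty$, the distribution of $X^2 = \sum_{i=1}^{k} X_i^2$ tends to $N\left(\frac{k}{n}, \le \frac{2k}{n^2}\right)$
$\Pr\left(\left|X^2 - \frac{k}{n}\right| \ge \frac{\varepsilon k}{n}\right)$ should then be $\le \sqrt{\frac{4}{\varepsilon^2 k \pi}}\, e^{-\frac{\varepsilon^2 k}{4}}$
For $k > \frac{12 \log m}{\varepsilon^2}$, this is $\frac{1}{m^3}$
How do we show this for finite $k$?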
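To get a feel for the numbers, the heuristic bound and the threshold on $k$ can be evaluated directly. The sketch assumes "log" means the natural logarithm, which is what makes $e^{-\varepsilon^2 k/4}$ equal $m^{-3}$ at the threshold; the values of $m$ and $\varepsilon$ are arbitrary.

```python
import math

def clt_tail_bound(eps, k):
    """Heuristic bound sqrt(4 / (eps^2 k pi)) * exp(-eps^2 k / 4)."""
    return math.sqrt(4 / (eps * eps * k * math.pi)) * math.exp(-eps * eps * k / 4)

def dimension_threshold(m, eps):
    """Threshold 12 ln(m) / eps^2, reading 'log' as the natural logarithm."""
    return 12 * math.log(m) / (eps * eps)

m, eps = 1000, 0.1
k = math.ceil(dimension_threshold(m, eps))
print(k, clt_tail_bound(eps, k), m ** -3)
```

Note that the threshold depends on $m$ and $\varepsilon$ only, not on the original dimension $n$.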
Tight Concentration and Tail Bound Inequalities
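Markov’s Inequality for a non-negative random variable $Y$
$\Pr(Y > k) \le E(Y)/k$
Chebychev’s Inequality
$\Pr\left(\left|X^2 - \frac{k}{n}\right| \ge \varepsilon\frac{k}{n}\right) \le \frac{\mathrm{Var}(X^2)}{\left(\varepsilon\frac{k}{n}\right)^2} \le \frac{2}{\varepsilon^2 k}$
Not strong enough to yield negative exponential dependence on $k$.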
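The gap is easy to quantify. The sketch below compares the $k$ needed for the Chebyshev bound to reach a failure probability of $m^{-3}$ with the $\frac{12 \log m}{\varepsilon^2}$ threshold from the CLT heuristic, taking "log" as the natural logarithm (an assumption). The names and the sample values of $m$ and $\varepsilon$ are illustrative.

```python
import math

def chebyshev_bound(eps, k):
    """Chebyshev bound 2 / (eps^2 k) on Pr(|X^2 - k/n| >= eps k/n)."""
    return 2 / (eps * eps * k)

def chebyshev_k_for(delta, eps):
    """Value of k at which the Chebyshev bound drops to delta."""
    return 2 / (eps * eps * delta)

m, eps = 100, 0.1
print("k via Chebyshev:", chebyshev_k_for(m ** -3, eps))      # grows like m^3
print("k via CLT heuristic:", 12 * math.log(m) / (eps * eps))  # grows like log m
```

A polynomial-in-$m$ target dimension would defeat the purpose of projecting, which is why the exponential tail bounds are needed.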
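Lower Tail Bound for $X^2$
Using Markov’s inequality on $e^{-tX^2}$, where $t > 0$ (as in Chernoff Bounds):
$\Pr\left(X^2 < (1-\varepsilon)\frac{k}{n}\right) = \Pr\left(-tX^2 > -t(1-\varepsilon)\frac{k}{n}\right)$
$= \Pr\left(e^{-tX^2} > e^{-t(1-\varepsilon)\frac{k}{n}}\right) \le E(e^{-tX^2})\, e^{t(1-\varepsilon)\frac{k}{n}}$
Since $X^2 = \sum_{i=1}^{k} X_i^2$ and the $X_i$’s are identical and independent:
$E(e^{-tX^2})\, e^{t(1-\varepsilon)\frac{k}{n}} = E(e^{-tX_i^2})^k\, e^{t(1-\varepsilon)\frac{k}{n}}$
$E(e^{-tX_i^2}) \le\ ?$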
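$E(e^{-tX_i^2}) \le\ ?$
Using $1 - x \le e^{-x} \le 1 - x + \frac{x^2}{2}$, for all $x \ge 0$:
$E(e^{-tX_i^2}) \le E\left(1 - tX_i^2 + \frac{t^2 X_i^4}{2}\right)$
$\le 1 - \frac{t}{n} + \frac{3t^2}{2n^2} \le e^{-\frac{t}{n}\left(1 - \frac{3t}{2n}\right)}$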
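The middle step relies on the moments of $\cos^2(\theta)$ computed on the earlier slides. A short worked version, assuming $X_i = \cos(\theta)$ for a random unit vector in $n$ dimensions as defined there:

```latex
\begin{align*}
E(X_i^2) &= \frac{1}{n}, \\
E(X_i^4) &= \mathrm{Var}(X_i^2) + E(X_i^2)^2
          = \frac{2(n-1)}{n^2(n+2)} + \frac{1}{n^2}
          = \frac{3}{n(n+2)} \le \frac{3}{n^2}, \\
E\!\left(1 - tX_i^2 + \frac{t^2 X_i^4}{2}\right)
         &\le 1 - \frac{t}{n} + \frac{3t^2}{2n^2}
          = 1 - \frac{t}{n}\left(1 - \frac{3t}{2n}\right)
          \le e^{-\frac{t}{n}\left(1 - \frac{3t}{2n}\right)}.
\end{align*}
```

The final inequality is $1 - x \le e^{-x}$ applied with $x = \frac{t}{n}\left(1 - \frac{3t}{2n}\right)$.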
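Completing the Lower Tail Bound for $X^2$
$\Pr\left(X^2 < (1-\varepsilon)\frac{k}{n}\right) \le E(e^{-tX_i^2})^k\, e^{t(1-\varepsilon)\frac{k}{n}}$
$\le e^{-\frac{kt}{n}\left(1 - \frac{3t}{2n}\right) + \frac{kt}{n}(1-\varepsilon)} \le e^{-\frac{kt}{n}\left(\varepsilon - \frac{3t}{2n}\right)}$
Setting $t = \frac{n\varepsilon}{3} > 0$ to minimize the above
$\Pr\left(X^2 < (1-\varepsilon)\frac{k}{n}\right) \le e^{-\frac{k\varepsilon}{3}\left(\varepsilon - \frac{\varepsilon}{2}\right)} \le e^{-\frac{k\varepsilon^2}{6}}$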
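The choice of $t$ follows from maximising the exponent. A brief worked version, where $g(t)$ is a name introduced here for the exponent and is not notation from the slides:

```latex
\begin{align*}
g(t) &= \frac{kt}{n}\left(\varepsilon - \frac{3t}{2n}\right), &
g'(t) &= \frac{k}{n}\left(\varepsilon - \frac{3t}{n}\right) = 0
  \;\Longrightarrow\; t = \frac{n\varepsilon}{3}, \\
g\!\left(\frac{n\varepsilon}{3}\right)
  &= \frac{k\varepsilon}{3}\left(\varepsilon - \frac{\varepsilon}{2}\right)
   = \frac{k\varepsilon^2}{6}.
\end{align*}
```

Since $g$ is a downward-opening quadratic in $t$, this stationary point is its maximum, so $e^{-g(t)}$ is smallest there.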
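Upper Tail Bound for $X^2$
As for the Lower Tail Bound, with $t > 0$:
$\Pr\left(X^2 > (1+\varepsilon)\frac{k}{n}\right) = \Pr\left(tX^2 > t(1+\varepsilon)\frac{k}{n}\right)$
$= \Pr\left(e^{tX^2} > e^{t(1+\varepsilon)\frac{k}{n}}\right) \le E(e^{tX^2})\, e^{-t(1+\varepsilon)\frac{k}{n}}$
Since $X^2 = \sum_{i=1}^{k} X_i^2$ and the $X_i$’s are identical and independent:
$E(e^{tX^2})\, e^{-t(1+\varepsilon)\frac{k}{n}} = E(e^{tX_i^2})^k\, e^{-t(1+\varepsilon)\frac{k}{n}}$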
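The Upper Tail Bound for $X^2$
Setting $y = \cos^2\theta$:
$\frac{\int_0^{\pi/2} \sin^{n-2}\theta\, e^{t\cos^2\theta}\, d\theta}{\int_0^{\pi/2} \sin^{n-2}\theta\, d\theta} \le \sqrt{\frac{2(n-1)}{\pi}} \cdot \frac{1}{2} \int_0^1 \frac{(1-y)^{\frac{n-3}{2}}\, e^{ty}}{\sqrt{y}}\, dy$
Using $1 - y \le e^{-y}$, $\forall y$:
$\le \sqrt{\frac{2(n-1)}{\pi}} \cdot \frac{1}{2} \int_0^1 \frac{e^{-y\left(\frac{n-3}{2} - t\right)}}{\sqrt{y}}\, dy$
Using $\int_0^1 y^{-\frac{1}{2}} e^{-y}\, dy \le \sqrt{\pi}$:
$\le \sqrt{\frac{2(n-1)}{\pi}} \cdot \frac{1}{2\sqrt{\frac{n-3}{2} - t}} \cdot \sqrt{\pi} \le \sqrt{\frac{n-1}{n-3-2t}}$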
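Completing the Upper Tail Bound for $X^2$
$E(e^{tX_i^2})^k\, e^{-t(1+\varepsilon)\frac{k}{n}} \le \left(\sqrt{\frac{n-1}{n-3-2t}}\right)^k e^{-t(1+\varepsilon)\frac{k}{n}}$
Using $(1-x)^{-\frac{1}{2}} \le \sqrt{1 + x + 2x^2} \le e^{\frac{x}{2}(1+2x)}$, for $0 \le x \le \frac{1}{2}$, and constraining $0 < 2t < \frac{n-3}{2}$, $k \ll n$
$\left(\sqrt{\frac{n-1}{n-3-2t}}\right)^k \le \left(\sqrt{\frac{n-1}{n-3}}\right)^k \left(1 - \frac{2t}{n-3}\right)^{-\frac{k}{2}} \le e^{O\left(\frac{k}{n}\right) + \frac{tk}{n-3}\left(1 + \frac{4t}{n-3}\right)}$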
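Completing the Upper Tail Bound for $X^2$
So:
$$E\left(e^{tX_i^2}\right)^k e^{-t(1+\varepsilon)\frac{k}{n}} \le \left[e^{O\left(\frac{k}{n}\right) + \frac{tk}{n-3}\left(1 + \frac{4t}{n-3}\right)}\right]\left[e^{-t(1+\varepsilon)\frac{k}{n}}\right]$$
$$= e^{O\left(\frac{k}{n}\right) + \left[\frac{tk}{n-3} - \frac{tk}{n}\right] + \left[\frac{4t^2k}{(n-3)^2} - \frac{\varepsilon tk}{n}\right]}$$
$$\le e^{O\left(\frac{k}{n}\right) + \left[\frac{4t^2k}{(n-3)^2} - \frac{\varepsilon tk}{n}\right]}$$
(the first bracket equals $\frac{3tk}{n(n-3)}$, which is $O\left(\frac{k}{n}\right)$ for the choice of $t$ below and is absorbed into that term).
Setting $t = \frac{\varepsilon (n-3)^2}{8n}$ and assuming $k \ll n$, we get:
$$\le e^{-\frac{\varepsilon^2 k}{16} + O\left(\frac{k}{n}\right)} \le 2e^{-\frac{\varepsilon^2 k}{16}}$$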
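The substitution in the last step can be checked numerically. In the sketch below, `chernoff_exponent` and `slide_t` are hypothetical helper names, and the values of `n`, `k` and `eps` are illustrative only. It evaluates the bracket $\frac{4t^2k}{(n-3)^2} - \frac{\varepsilon tk}{n}$ at the slide's choice of $t$, which is also the minimiser of that quadratic in $t$.

```python
import math


def chernoff_exponent(t, n, k, eps):
    """The t-dependent bracket in the exponent: 4 t^2 k / (n-3)^2 - eps t k / n."""
    return 4 * t * t * k / (n - 3) ** 2 - eps * t * k / n


def slide_t(n, eps):
    """The choice t = eps (n-3)^2 / (8n) made on the slide."""
    return eps * (n - 3) ** 2 / (8 * n)


n, k, eps = 10_000, 200, 0.5            # illustrative values, with k << n
t = slide_t(n, eps)
value = chernoff_exponent(t, n, k, eps)

# Substituting t by hand gives -(eps^2 k / 16) * ((n-3)/n)^2, which differs
# from -eps^2 k / 16 only by a term of order k/n.
closed_form = -(eps ** 2) * k / 16 * ((n - 3) / n) ** 2
print(value, closed_form, -(eps ** 2) * k / 16)
```

Since $\left(\frac{n-3}{n}\right)^2 = 1 - O\left(\frac{1}{n}\right)$, the difference between the exact value and $-\frac{\varepsilon^2 k}{16}$ is again $O\left(\frac{k}{n}\right)$.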
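Wrapping Up: The Johnson-Lindenstrauss Theorem
Given $m$ points $a_1, \ldots, a_m$ in $n$-dimensional space, $m \ge n$, and given $\varepsilon$, $0 < \varepsilon \le 1$.
Choose $k$ random unit vectors $r_1, \ldots, r_k$, where $k = \frac{48 \ln m}{\varepsilon^2} \ll n$.
Define $k$-dimensional points $b_1, \ldots, b_m$, where $b_i = (a_i \cdot r_1, a_i \cdot r_2, \ldots, a_i \cdot r_k)$.
Consider any pair $a_i, a_j$. Then:
$$\frac{|b_i - b_j|}{|a_i - a_j|} = \sqrt{\left(\frac{a_i - a_j}{|a_i - a_j|} \cdot r_1\right)^2 + \left(\frac{a_i - a_j}{|a_i - a_j|} \cdot r_2\right)^2 + \cdots + \left(\frac{a_i - a_j}{|a_i - a_j|} \cdot r_k\right)^2}$$
Then $\sqrt{1-\varepsilon}\,\sqrt{\frac{k}{n}} \le \frac{|b_i - b_j|}{|a_i - a_j|} \le \sqrt{1+\varepsilon}\,\sqrt{\frac{k}{n}}$ with probability at least $1 - \frac{3}{m^3}$.
And this holds for all pairs simultaneously with probability at least $1 - \frac{3}{2m}$ (union bound over the $\binom{m}{2} \le \frac{m^2}{2}$ pairs).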
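The construction can be tried out with a small pure-Python sketch. All function names here are hypothetical, and the sizes `m`, `n`, `eps` are illustrative and kept small so the script runs quickly, which means $k \ll n$ holds only loosely. Drawing each $r_i$ by normalising a Gaussian vector is an assumption about how "random unit vectors" are chosen; it produces directions uniform on the unit sphere.

```python
import math
import random


def random_unit_vector(n, rng):
    """Uniform direction on the unit sphere in n dimensions (normalised Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def jl_project(points, k, rng):
    """Map each n-dimensional a_i to b_i = (a_i . r_1, ..., a_i . r_k)."""
    n = len(points[0])
    directions = [random_unit_vector(n, rng) for _ in range(k)]
    return [[dot(a, r) for r in directions] for a in points]


def distance(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def squared_distortions(points, projected, n, k):
    """(|b_i - b_j| / |a_i - a_j|)^2 * (n / k) for every pair i < j."""
    out = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            ratio = distance(projected[i], projected[j]) / distance(points[i], points[j])
            out.append(ratio * ratio * n / k)
    return out


if __name__ == "__main__":
    rng = random.Random(0)               # fixed seed, only for repeatability
    m, n, eps = 8, 1500, 0.75            # illustrative sizes, not from the slides
    k = math.ceil(48 * math.log(m) / eps ** 2)
    points = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(m)]
    projected = jl_project(points, k, rng)
    values = squared_distortions(points, projected, n, k)
    # The theorem says every value lies in [1 - eps, 1 + eps] with high probability.
    print(k, min(values), max(values))
```

Dividing the squared ratio by $\frac{k}{n}$ rescales the theorem's interval to $[1-\varepsilon, 1+\varepsilon]$, so the printed minimum and maximum can be compared with $\varepsilon$ directly.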