UNC, Stat & OR
??? Place ???
Object Oriented Data Analysis
J. S. Marron
Dept. of Statistics and Operations Research, University of North Carolina
May 3, 2023

Interdisciplinary Relationship
How does Statistics relate to Mathematics?
(probability, optimization, geometry, …)

Statistics - Mathematics Relationship
Mathematical Statistics:
• Validation of existing methods
• Asymptotics (n → ∞) & Taylor expansion
• Comparison of existing methods
(requires hard math, but really “accounting”???)

Statistics - Mathematics Relationship
Suggested New Relationship: Put Mathematics to work to Generate New Statistical Ideas/Approaches
(publishable in the Ann. Stat.???)

Personal Opinions on Mathematical Statistics
What is Mathematical Statistics?
• Validation of existing methods
• Asymptotics (n → ∞) & Taylor expansion
• Comparison of existing methods
(requires hard math, but really “accounting”???)

Personal Opinions on Mathematical Statistics
What could Mathematical Statistics be?
• Basis for invention of new methods
• Complicated data → mathematical ideas
• Do we value creativity?
• Since we don’t do this, others do…
(where are the $$$s???)

Personal Opinions on Mathematical Statistics
Since we don’t do this, others do…
• Pattern Recognition
• Artificial Intelligence
• Neural Nets
• Data Mining
• Machine Learning
• ???

Personal Opinions on Mathematical Statistics
Possible Litmus Test: Creative Statistics
Clinical Trials Viewpoint: Worst Imaginable Idea
Mathematical Statistics Viewpoint: ???

Object Oriented Data Analysis, I
What is the “atom” of a statistical analysis?
• 1st Course: Numbers
• Multivariate Analysis Course: Vectors
• Functional Data Analysis: Curves
• More generally: Data Objects

Functional Data Analysis, I
Curves as Data Objects
Important Duality: Curve Space ↔ Point Cloud Space
Illustrate with Travis Gaydos Graphics
• 2 dim’al curves (easy to visualize)

Functional Data Analysis, Toy EG I
Functional Data Analysis, Toy EG II
Functional Data Analysis, Toy EG III
Functional Data Analysis, Toy EG IV
Functional Data Analysis, Toy EG V
Functional Data Analysis, Toy EG VI
Functional Data Analysis, Toy EG VII
Functional Data Analysis, Toy EG VIII
Functional Data Analysis, Toy EG IX
Functional Data Analysis, Toy EG X
Functional Data Analysis, 10-d Toy EG 1
Functional Data Analysis, 10-d Toy EG 1
Functional Data Analysis, 10-d Toy EG 2
Functional Data Analysis, 10-d Toy EG 2

Object Oriented Data Analysis, I
What is the “atom” of a statistical analysis?
• 1st Course: Numbers
• Multivariate Analysis Course: Vectors
• Functional Data Analysis: Curves
• More generally: Data Objects

Object Oriented Data Analysis, II
Examples:
• Medical Image Analysis
  • Images as Data Objects?
  • Shape Representations as Objects
• Micro-arrays for Gene Expression
  • Just multivariate analysis?

Object Oriented Data Analysis, III
Typical Goals:
• Understanding population variation
  • Visualization
  • Principal Component Analysis +
• Discrimination (a.k.a. Classification)
• Time Series of Data Objects

Object Oriented Data Analysis, IV
Major Statistical Challenge, I: High Dimension Low Sample Size (HDLSS)
• Dimension d >> sample size n
• “Multivariate Analysis” nearly useless
  • Can’t “normalize the data”
• Land of Opportunity for Statisticians
  • Need for “creative statisticians”
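
One way to read “can’t normalize the data” is that when d >> n the sample covariance matrix is singular, so the usual sphering step (multiplying by the inverse square root of the covariance) is unavailable. The sketch below illustrates this on simulated data; the sizes `n = 10`, `d = 100` and all variable names are assumptions chosen for illustration, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical HDLSS sizes: dimension far larger than sample size.
n, d = 10, 100
X = rng.normal(size=(n, d))          # rows are cases, columns are features

# d x d sample covariance; rowvar=False treats rows as observations.
S = np.cov(X, rowvar=False)

# Centering uses up one degree of freedom, so the rank is at most n - 1.
rank = np.linalg.matrix_rank(S)

# Count eigenvalues that are numerically zero: these block inversion of S.
eigvals = np.linalg.eigvalsh(S)
n_zero = int(np.sum(eigvals < 1e-10 * eigvals.max()))
```

With these sizes at least `d - (n - 1)` eigenvalues are zero, so classical methods that rely on inverting the covariance cannot be applied directly.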

Object Oriented Data Analysis, V
Major Statistical Challenge, II: Data may live in non-Euclidean space
• Lie Group / Symmetric Spaces (manifold data)
• Trees/Graphs as data objects
Interesting Issues:
• What is “the mean” (pop’n center)?
• How do we quantify “pop’n variation”?

Statistics in Image Analysis, I
First Generation Problems:
• Denoising
• Segmentation
• Registration
(all about single images)

Statistics in Image Analysis, II
Second Generation Problems:
• Populations of Images
  • Understanding Population Variation
  • Discrimination (a.k.a. Classification)
• Complex Data Structures (& Spaces)
• HDLSS Statistics

HDLSS Statistics in Imaging
Why HDLSS (High Dim, Low Sample Size)?
• Complex 3-d Objects Hard to Represent
  • Often need d = 100’s of parameters
• Complex 3-d Objects Costly to Segment
  • Often have n = 10’s of cases

Medical Imaging – A Challenging Example
• Male Pelvis
  • Bladder – Prostate – Rectum
  • How do they move over time (days)?
  • Critical to Radiation Treatment (cancer)
• Work with 3-d CT
  • Very Challenging to Segment
  • Find boundary of each object?
  • Represent each Object?

Male Pelvis – Raw Data
One CT Slice (in 3d image)
Coccyx (Tail Bone)
Rectum
Bladder

Male Pelvis – Raw Data
Bladder:
• manual segmentation
• Slice by slice
• Reassembled

Male Pelvis – Raw Data
Bladder:
Slices: Reassembled in 3d
How to represent?
Thanks: Ja-Yeon Jeong

Object Representation
• Landmarks (hard to find)
• Boundary Rep’ns (no correspondence)
• Medial representations
  • Find “skeleton”
  • Discretize as “atoms” called M-reps

3-d m-reps
Bladder – Prostate – Rectum (multiple objects, J. Y. Jeong)
• Medial Atoms provide “skeleton”
• Implied Boundary from “spokes” → “surface”

3-d m-reps
M-rep model fitting
• Easy, when starting from binary (blue)
• But very expensive (30 – 40 minutes technician’s time)
• Want automatic approach
• Challenging, because of poor contrast, noise, …
• Need to borrow information across training sample
• Use Bayes approach: prior & likelihood → posterior
• ~Conjugate Gaussians, but there are issues:
  • Major HDLSS challenges
  • Manifold aspect of data
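
The “prior & likelihood → posterior” step with conjugate Gaussians has a closed form in the simplest scalar case. The following is a minimal sketch for a single real-valued parameter with known noise variance; it ignores the HDLSS and manifold issues the slide flags, and the function name and numbers are hypothetical.

```python
import numpy as np

def gaussian_posterior(prior_mean, prior_var, obs, noise_var):
    """Posterior for a scalar mean with a Gaussian prior and Gaussian noise.

    Precisions (inverse variances) add; the posterior mean is the
    precision-weighted average of the prior mean and the observations.
    """
    obs = np.asarray(obs, dtype=float)
    post_precision = 1.0 / prior_var + obs.size / noise_var
    post_mean = (prior_mean / prior_var + obs.sum() / noise_var) / post_precision
    return post_mean, 1.0 / post_precision

# Hypothetical numbers: prior N(0, 1), two observations equal to 2, unit noise.
post_mean, post_var = gaussian_posterior(0.0, 1.0, [2.0, 2.0], 1.0)
```

The posterior mean is pulled from the data average toward the prior mean, which is the scalar analogue of borrowing information across the training sample.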

Illuminating Viewpoint
Object Space: Focus here on collection of data objects
Feature Space: Here conceptualize population structure via “point clouds”

Personal HDLSS Viewpoint: Data
• Images (cases) are “Points” in Feature Space
• Features are Axes
• Data set is “Point Clouds”
• Use Proj’ns to visualize

Personal HDLSS Viewpoint: PCA
Rotated Axes
Often Insightful
One set of Dir’ns
Others Useful, too
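
The “rotated axes” view of PCA can be sketched with a singular value decomposition of the centered data matrix. The toy point cloud, the helper name `pca`, and the chosen spreads are illustrative assumptions, not data from the talk.

```python
import numpy as np

def pca(X):
    """PCA via SVD: returns directions (rows), scores, and variances."""
    Xc = X - X.mean(axis=0)                       # center the point cloud
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt.T                            # coordinates in rotated axes
    variances = s**2 / (X.shape[0] - 1)           # variance along each axis
    return Vt, scores, variances

rng = np.random.default_rng(1)

# Hypothetical 2-d cloud stretched along the (1, 1) diagonal.
z = rng.normal(size=(200, 2)) * np.array([3.0, 0.3])
rotation = np.array([[1.0, 1.0], [-1.0, 1.0]]) / np.sqrt(2)
X = z @ rotation

directions, scores, variances = pca(X)
```

Here `directions[0]` should line up (up to sign) with the diagonal, and the rows of `directions` form one orthonormal set of axes; other direction sets, such as discrimination directions, can be used for projection in the same way.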

Cornea Data, I
Images as data
• ~42 Cornea Images
• Outer surface of eye
• Heat map of curvature (in radial direction)
• Hard to understand “population structure”

Cornea Data, II
PC 1
• Starts at Pop’n Mean
• Overall Curvature
• Vertical Astigmatism
• Correlated!
• Gaussian Projections
• Visualization: Can’t Overlay (so use movie)

Cornea Data, III
PC 2
• Horrible Outlier! (present in data)
• But look only in center: Steep at top -- bottom
• Want Robust PCA
• For HDLSS data ???

Cornea Data, IV
Robust PC 2
• No outlier impact
• See top – bottom variation
• Projections now Gaussian

PCA for m-reps, I
Major issue: m-reps live in $\mathbb{R}^3 \times \mathbb{R}^+ \times SO(3) \times SO(2)$
(locations, radius and angles)
E.g. “average” of 2°, 3°, 358°, 359° = ???
Natural Data Structure is: Lie Groups ~ Symmetric spaces
(smooth, curved manifolds)
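
The angle example shows why ordinary averaging fails on curved spaces. A minimal sketch, assuming the four values are angles in degrees: the arithmetic mean lands on the opposite side of the circle, while averaging the corresponding unit vectors (one common circular mean, used here purely for illustration) stays near 0°.

```python
import numpy as np

angles_deg = np.array([2.0, 3.0, 358.0, 359.0])

# Naive Euclidean average treats 358 and 359 as far from 2 and 3.
arithmetic_mean = angles_deg.mean()

# Circular mean: average the points on the unit circle, then read off the angle.
theta = np.deg2rad(angles_deg)
mean_direction = np.arctan2(np.sin(theta).mean(), np.cos(theta).mean())
circular_mean = np.rad2deg(mean_direction) % 360.0
```

The arithmetic mean is 180.5°, pointing away from every data point, whereas the circular mean is close to 0.5°.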

PCA for m-reps, II
PCA on non-Euclidean spaces?
(i.e. on Lie Groups / Symmetric Spaces)
T. Fletcher: Principal Geodesic Analysis
Idea: replace “linear summary of data” with “geodesic summary of data”…
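
To make the “geodesic summary” idea concrete, here is a rough sketch on the unit sphere rather than on m-rep space. It uses the common tangent-space approximation: compute an intrinsic (Fréchet) mean, map the data to the tangent plane there with the log map, and run ordinary PCA on the tangent vectors. The sphere, the toy data, and all function names are assumptions for illustration; this is not the m-rep implementation from the talk.

```python
import numpy as np

def exp_map(p, v):
    """Move from point p on the unit sphere along tangent vector v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def log_map(p, q):
    """Tangent vector at p pointing to q, with length equal to geodesic distance."""
    c = np.clip(p @ q, -1.0, 1.0)
    u = q - c * p                      # component of q orthogonal to p
    norm_u = np.linalg.norm(u)
    if norm_u < 1e-12:
        return np.zeros_like(p)
    return np.arccos(c) * u / norm_u

def frechet_mean(points, iters=50):
    """Iteratively shift the estimate along the average log-map vector."""
    mean = points[0]
    for _ in range(iters):
        step = np.mean([log_map(mean, q) for q in points], axis=0)
        mean = exp_map(mean, step)
    return mean

def principal_geodesics(points):
    """Tangent-space PCA at the Frechet mean."""
    mean = frechet_mean(points)
    tangents = np.array([log_map(mean, q) for q in points])
    _, s, Vt = np.linalg.svd(tangents, full_matrices=False)
    return mean, Vt, s**2 / (len(points) - 1)

rng = np.random.default_rng(2)
north = np.array([0.0, 0.0, 1.0])

# Hypothetical data: spread mostly along the x-direction great circle.
coeffs = rng.normal(size=(100, 2)) * np.array([0.4, 0.05])
points = np.array([exp_map(north, np.array([a, b, 0.0])) for a, b in coeffs])

mean, directions, variances = principal_geodesics(points)
```

The first principal geodesic is the great circle through `mean` in direction `directions[0]`; points along it are obtained as `exp_map(mean, t * directions[0])` for scalar `t`.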

PGA for m-reps, Bladder-Prostate-Rectum
Bladder – Prostate – Rectum, 1 person, 17 days
PG 1, PG 2, PG 3
(analysis by Ja Yeon Jeong)

HDLSS Classification (i.e. Discrimination)
Background: Two Class (Binary) version:
• Using “training data” from Class +1, and from Class -1
• Develop a “rule” for assigning new data to a Class
Canonical Example: Disease Diagnosis
• New Patients are “Healthy” or “Ill”
• Determined based on measurements

HDLSS Classification (Cont.)
Ineffective Methods:
• Fisher Linear Discrimination
• Gaussian Likelihood Ratio
Less Useful Methods:
• Nearest Neighbors
• Neural Nets
(“black boxes”, no “directions” or intuition)

HDLSS Classification (Cont.)
Currently Fashionable Methods:
• Support Vector Machines
• Trees Based Approaches
New High Tech Method:
• Distance Weighted Discrimination (DWD)
  • Specially designed for HDLSS data
  • Avoids “data piling” problem of SVM
  • Solves more suitable optimization problem

HDLSS Classification (Cont.)
Currently Fashionable Methods:
• Trees Based Approaches
• Support Vector Machines:

Kernel Embedding Idea
Aizerman, Braverman, Rozonoer (1964)
Make data linearly separable by embedding in higher dimensional space
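
A small sketch of the embedding idea, using an explicit polynomial feature map x → (x, x²) rather than a kernel function. The seven data values, the labeling rule, and the separating line are made up for illustration.

```python
import numpy as np

# Hypothetical 1-d data: class -1 sits in the middle, class +1 on both sides.
x = np.array([-3.0, -2.5, -0.5, 0.0, 0.4, 2.2, 3.1])
y = np.where(np.abs(x) > 1.5, 1, -1)

def separable_by_threshold(x, y):
    """A 1-d linear rule is a single threshold, so sorted labels may change once."""
    labels = y[np.argsort(x)]
    return np.count_nonzero(np.diff(labels)) <= 1

# Embed in 2-d: the second coordinate x**2 lifts the outer points upward.
embedded = np.column_stack([x, x**2])

# In the embedded space the horizontal line x**2 = 2.25 separates the classes.
w, b = np.array([0.0, 1.0]), -2.25
pred = np.sign(embedded @ w + b).astype(int)
```

No single threshold separates the original data, but a linear rule in the embedded space classifies every point correctly.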

Kernel Embedding Idea
Linearly separable by embedding in higher dimensions
Distributional Assumptions in Embedded Space?
⇓
Support Vector Machine

HDLSS Classification (Cont.)
Comparison of Linear Methods (toy data):
• Optimal Direction: Excellent, but need dir’n in dim = 50
• Maximal Data Piling (J. Y. Ahn, D. Peña): Great separation, but generalizability???
• Support Vector Machine: More separation, gen’ity, but some data piling?
• Distance Weighted Discrimination: Avoids data piling, good gen’ity, Gaussians?
Toy data: $N_d(\mu, I)$, $\mu_1 = 2.2$, $n_1 = n_2 = 20$, $d = 50$
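
The data piling phenomenon can be reproduced on simulated data of the stated size. This sketch assumes the two classes are mirror images with means ±2.2 in the first coordinate, and it takes the maximal data piling direction to be the pseudo-inverse of the global sample covariance applied to the mean difference; the slides state neither detail, so both are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
d, n1, n2, shift = 50, 20, 20, 2.2

mu = np.zeros(d)
mu[0] = shift
X_pos = rng.normal(size=(n1, d)) + mu    # class +1
X_neg = rng.normal(size=(n2, d)) - mu    # class -1 (assumed mirror image)

def unit(v):
    return v / np.linalg.norm(v)

# Theoretically optimal direction for this toy model: the first coordinate axis.
optimal = np.zeros(d)
optimal[0] = 1.0

# Pseudo-inverse of the global covariance, built from the SVD of centered data.
mean_diff = X_pos.mean(axis=0) - X_neg.mean(axis=0)
X_all = np.vstack([X_pos, X_neg])
Xc = X_all - X_all.mean(axis=0)
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
keep = s > 1e-8 * s[0]                   # drop the numerically zero direction
V = Vt[keep].T
mdp = unit(V @ ((V.T @ mean_diff) / s[keep] ** 2))

# Projections of the training data onto the data piling direction.
proj_pos = X_pos @ mdp
proj_neg = X_neg @ mdp
cos_to_optimal = abs(mdp @ optimal)
```

Each class projects to a single value, which is the “great separation” on the training data; `cos_to_optimal` indicates how far the direction is from the first axis, which bears on generalizability.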

Distance Weighted Discrimination
Maximal Data Piling

Distance Weighted Discrimination
Based on Optimization Problem: $\min_{w,b} \sum_{i=1}^{n} \frac{1}{r_i}$
More precisely: work in appropriate penalty for violations
Optimization Method (Michael Todd):
• Second Order Cone Programming
• Still Convex gen’tion of quadratic prog’ing
• Fast greedy solution
• Can use existing software
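
One standard way to write the penalized problem is sketched below. The slide gives only the objective; the symbols $x_i$ (data vectors), $y_i \in \{-1, +1\}$ (class labels), $\xi_i$ (violation slack) and $C$ (penalty weight) are introduced here as assumptions about the notation.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \sum_{i=1}^{n} \frac{1}{r_i} \;+\; C \sum_{i=1}^{n} \xi_i \\
\text{subject to}\quad & r_i = y_i\,(x_i^{\top} w + b) + \xi_i, \qquad r_i \ge 0, \quad \xi_i \ge 0, \\
& \|w\|^2 \le 1 .
\end{aligned}
```

Here $r_i$ is the signed distance of case $i$ from the separating hyperplane, adjusted by its slack. Every case contributes to the objective, with nearer cases weighted more heavily, in contrast to a margin-only criterion that depends on the closest cases alone.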

Simulation Comparison
E.G. Above Gaussians:
• Wide array of dim’s
• SVM Subst’ly worse
• MD – Bayes Optimal
• DWD close to MD

Simulation Comparison
E.G. Outlier Mixture:
• Disaster for MD
• SVM & DWD much more solid
• Dir’ns are “robust”
• SVM & DWD similar

Simulation Comparison
E.G. Wobble Mixture:
• Disaster for MD
• SVM less good
• DWD slightly better
Note: All methods come together for larger d ???

DWD Bias Adjustment for Microarrays
Microarray data:
• Simult. Measur’ts of “gene expression”
• Intrinsically HDLSS
  • Dimension d ~ 1,000s – 10,000s
  • Sample Sizes n ~ 10s – 100s
• My view: Each array is “point in cloud”

DWD Batch and Source Adjustment
• For Perou’s Stanford Breast Cancer Data
• Analysis in Benito, et al (2004) Bioinformatics
  https://genome.unc.edu/pubsup/dwd/
• Adjust for Source Effects
  • Different sources of mRNA
• Adjust for Batch Effects
  • Arrays fabricated at different times
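
A rough sketch of the adjustment step, assuming it works by projecting the data onto a separating direction and shifting each batch so that its projected mean is zero. The direction `w` is taken as given (a DWD solver is not included here), and the function name and toy data are hypothetical.

```python
def adjust_along_direction(X, batch, w):
    """Shift each batch along direction w so its mean projection onto w is zero."""
    norm = sum(wj * wj for wj in w) ** 0.5
    u = [wj / norm for wj in w]                      # unit vector along w
    proj = [sum(uj * xj for uj, xj in zip(u, x)) for x in X]

    # Mean projection of each batch.
    batch_mean = {}
    for b in set(batch):
        vals = [p for p, bb in zip(proj, batch) if bb == b]
        batch_mean[b] = sum(vals) / len(vals)

    # Subtract the batch mean along u only; other directions are untouched.
    return [[xj - batch_mean[b] * uj for xj, uj in zip(x, u)]
            for x, b in zip(X, batch)]


# Toy data: two batches offset along the first coordinate.
X = [[0, 0], [2, 0], [10, 1], [12, 1]]
batch = [0, 0, 1, 1]
adjusted = adjust_along_direction(X, batch, w=[1, 0])
print(adjusted)
```

Only the component along `w` changes, so structure in the remaining directions is preserved.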

DWD Adj: Raw Breast Cancer data
DWD Adj: Source Colors
DWD Adj: Batch Colors
DWD Adj: Biological Class Colors
DWD Adj: Biological Class Colors & Symbols
DWD Adj: Biological Class Symbols
DWD Adj: Source Colors
DWD Adj: PC 1-2 & DWD direction
DWD Adj: DWD Source Adjustment
DWD Adj: Source Adj’d, PCA view
DWD Adj: Source Adj’d, Class Colored
DWD Adj: Source Adj’d, Batch Colored
DWD Adj: Source Adj’d, 5 PCs
DWD Adj: S. Adj’d, Batch 1,2 vs. 3 DWD
DWD Adj: S. & B1,2 vs. 3 Adjusted
DWD Adj: S. & B1,2 vs. 3 Adj’d, 5 PCs
DWD Adj: S. & B Adj’d, B1 vs. 2 DWD
DWD Adj: S. & B Adj’d, B1 vs. 2 Adj’d
DWD Adj: S. & B Adj’d, 5 PC view
DWD Adj: S. & B Adj’d, 4 PC view
DWD Adj: S. & B Adj’d, Class Colors
DWD Adj: S. & B Adj’d, Adj’d PCA

DWD Bias Adjustment for Microarrays
• Effective for batch and source adjustment
• Also works for cross-platform adjustment
  • E.g. cDNA & Affy
  • Despite literature claiming contrary
  • “Gene by Gene” vs. “Multivariate” views
• Funded as part of caBIG, “cancer Biomedical Informatics Grid”
  • “Data Combination Effort” of NCI

Interesting Benchmark Data Set
• NCI 60 Cell Lines
• Interesting benchmark, since same cells
• Data available on the web:
  http://discover.nci.nih.gov/datasetsNature2000.jsp
• Both cDNA and Affymetrix platforms
• 8 major cancer subtypes
• Use DWD now for visualization

NCI 60: Fully Adjusted Data, Leukemia Cluster
LEUK.CCRFCEM LEUK.K562 LEUK.MOLT4 LEUK.HL60 LEUK.RPMI8266 LEUK.SR

NCI 60: Views using DWD Dir’ns (focus on biology)

Why not adjust by means?
• DWD is complicated: value added?
• Xuxin Liu example…
• Key is sizes of biological subtypes
• Differing ratio trips up mean
• But DWD more robust (although still not perfect)
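
A toy illustration of how differing subtype ratios trip up mean adjustment. The numbers are made up for this sketch: two batches with no batch effect at all, where subtype A expresses at 0 and subtype B at 10, but the batches contain the subtypes in different proportions.

```python
def center_by_batch_mean(values, batch):
    """Subtract each batch's own mean from its values."""
    means = {}
    for b in set(batch):
        vals = [v for v, bb in zip(values, batch) if bb == b]
        means[b] = sum(vals) / len(vals)
    return [v - means[b] for v, b in zip(values, batch)]


# Batch 1: 8 of subtype A, 2 of subtype B. Batch 2: 2 of subtype A, 8 of subtype B.
values = [0] * 8 + [10] * 2 + [0] * 2 + [10] * 8
batch = [1] * 10 + [2] * 10

centered = center_by_batch_mean(values, batch)
a_in_batch1 = centered[0]    # a subtype A sample from batch 1
a_in_batch2 = centered[10]   # a subtype A sample from batch 2
spurious_gap = a_in_batch1 - a_in_batch2
print(a_in_batch1, a_in_batch2, spurious_gap)
```

Before centering, subtype A sits at 0 in both batches. After centering it sits at -2 in one batch and -8 in the other, so the mean adjustment has created a batch difference that was not in the data.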

Twiddle ratios of subtypes

DWD in Face Recognition, I
• Face Images as Data (with M. Benito & D. Peña)
• Registered using landmarks
• Male – Female Difference?
• Discrimination Rule?

DWD in Face Recognition, II
• DWD Direction
• Good separation
• Images “make sense”
• Garbage at ends? (extrapolation effects?)

DWD in Face Recognition, III
• Interesting summary: Jump between means (in DWD direction)
• Clear separation of Maleness vs. Femaleness

DWD in Face Recognition, IV
• Fun Comparison: Jump between means (in SVM direction)
• Also distinguishes Maleness vs. Femaleness
• But not as well as DWD

DWD in Face Recognition, V
Analysis of difference: Project onto normals
• SVM has “small gap” (feels noise artifacts?)
• DWD “more informative” (feels real structure?)

DWD in Face Recognition, VI
Current Work:
• Focus on “drivers” (regions of interest)
• Relation to discrimination?
• Which is “best”?
• Lessons for human perception?

Time Series of Curves
• Chemical Spectra, evolving over time (with J. Wendelberger & E. Kober)
• Mortality curves changing in time (with Andres Alonzo)

Discrimination for m-reps
• Classification for Lie Groups – Symm. Spaces
• S. K. Sen, S. Joshi & M. Foskey
• What is “separating plane” (for SVM-DWD)?

Blood vessel tree data
Marron’s brain:
• Segmented from MRA
• Reconstruct trees in 3d
• Rotate to view

Blood vessel tree data
• Now look over many people (data objects)
• Structure of population (understand variation?)
• PCA in strongly non-Euclidean Space???

Blood vessel tree data
Possible focus of analysis:
• Connectivity structure only (topology)
• Location, size, orientation of segments
• Structure within each vessel segment

Blood vessel tree data
Present Focus: Topology only
• Already challenging
• Later address others
• Then add attributes to tree nodes
• And extend analysis

Blood vessel tree data
The tree team: Very Interdisciplinary
• Neurosurgery: Bullitt, Ladha
• Statistics: Wang, Marron
• Optimization: Aydin, Pataki

Blood vessel tree data
Recall from above: Marron’s brain
• Focus on back
• Connectivity (topology) only

Blood vessel tree data
Present Focus: Topology only
• Raw data as trees
• Marron’s reduced tree
• Back tree only

Blood vessel tree data
Topology only
• E.g. Back Trees
• Full Population
• Study as movie
• Understand variation?

Strongly Non-Euclidean Spaces
Statistics on Population of Tree-Structured Data Objects?
• Mean???
• Analog of PCA???
Strongly non-Euclidean, since:
• Space of trees not a linear space
• Not even approximately linear (no tangent plane)

Mildly Non-Euclidean Spaces
Useful View of Manifold Data: Tangent Space
• Center: Fréchet Mean
• Reason for terminology “mildly non-Euclidean”

Strongly Non-Euclidean Spaces
Mean of Population of Tree-Structured Data Objects?
• Natural approach: Fréchet mean
• Requires a metric (distance) on tree space

$$\bar{X} = \arg\min_{x} \sum_{i=1}^{n} d(X_i, x)^2$$
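
A small sketch of the Fréchet mean as a minimizer of summed squared distances. Two simplifications are assumed here and are not from the slides: the search is restricted to a finite candidate set, and trees are represented as sets of node labels with the size of the symmetric difference as the metric. All names are hypothetical.

```python
def frechet_mean(sample, candidates, dist):
    """Return the candidate minimizing the sum of squared distances to the sample."""
    return min(candidates, key=lambda x: sum(dist(xi, x) ** 2 for xi in sample))


def symmetric_difference_distance(s, t):
    """Number of nodes present in exactly one of the two trees."""
    return len(s ^ t)


# Three nested toy trees, each given by its set of node labels.
trees = [
    frozenset({"root"}),
    frozenset({"root", "a"}),
    frozenset({"root", "a", "b"}),
]

# Search among the sample trees themselves.
mean_tree = frechet_mean(trees, trees, symmetric_difference_distance)
print(sorted(mean_tree))
```

The middle tree wins here: its summed squared distance is 2, against 5 for each of the other two.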

Strongly Non-Euclidean Spaces
PCA on Tree Space?
• Recall Conventional PCA:
• Directions that explain structure in data
• Data are points in point cloud
• 1-d and 2-d projections allow insights about population structure

Illust’n of PCA View: PC1 Projections
Illust’n of PCA View: Projections on PC1,2 plane
Source Batch Adj: PC 1-3 & DWD direction
Source Batch Adj: DWD Source Adjustment

Strongly Non-Euclidean Spaces
PCA on Tree Space?
Key Idea (Jim Ramsay):
• Replace 1-d subspace that best approximates data
• By 1-d representation that best approximates data
Wang and Marron (2007) define notion of Treeline (in structure space)

Strongly Non-Euclidean Spaces
PCA on Tree Space: Treeline
• Best 1-d representation of data
Basic idea:
• From some starting tree
• Grow only in 1 “direction”
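
One way to picture "grow only in 1 direction", as a sketch under assumptions not stated on the slide: binary trees are stored as sets of level-order node indices (root is 1, children of node k are 2k and 2k+1), and each step adds a single node that is a child of the node added just before it. The function names are hypothetical.

```python
def is_child(child, parent):
    """Level-order indexing: the parent of node k is k // 2."""
    return child // 2 == parent


def build_treeline(start, path):
    """Return the nested trees start, start + path[0], start + path[0] + path[1], ..."""
    start = frozenset(start)
    if path and (path[0] in start or path[0] // 2 not in start):
        raise ValueError("first added node must attach to the starting tree")
    for prev, nxt in zip(path, path[1:]):
        if not is_child(nxt, prev):
            raise ValueError(f"node {nxt} is not a child of node {prev}")
    line = [start]
    for node in path:
        line.append(line[-1] | {node})   # grow by exactly one node per step
    return line


treeline = build_treeline({1, 2, 3}, [4, 8, 17])
for tree in treeline:
    print(sorted(tree))
```

The result is a chain of nested trees, the tree-space stand-in for points along a line.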

Strongly Non-Euclidean Spaces
PCA on Tree Space: Treeline
• Best 1-d representation of data
Problem: Hard to compute
• In particular: to solve optimization problem
Wang and Marron (2007)
• Maximum 4 vessel trees
• Hard to tackle serious trees (e.g. blood vessel trees)

Strongly Non-Euclidean Spaces
PCA on Tree Space: Treeline
Problem: Hard to compute
Solution: Burcu Aydin & Gabor Pataki
• Linear time algorithm
• Based on clever “reformulation” of problem

PCA for blood vessel tree data
PCA on Tree Space: Treelines
Interesting to compare:
• Population of Left Trees
• Population of Right Trees
• Population of Back Trees
And to study 1st, 2nd, 3rd & 4th treelines

PCA for blood vessel tree data
Study “Directions” 1, 2, 3, 4
For sub-populations B, L, R
(interpret later)

PCA for blood vessel tree data
Notes on Treeline Directions:
• PC1 always to left
• BACK has most variation to right (PC2)
• LEFT has more variation to 2nd level (PC2)
• RIGHT has more variation to 1st level (PC2)
See these in the data?

PCA for blood vessel tree data
Notes:
• PC1: all left
• PC2:
  • BACK: right
  • LEFT: 2nd level
  • RIGHT: 1st level
See these??

Strongly Non-Euclidean Spaces
PCA on Tree Space: Treeline
Next represent data as projections
• Define as closest point in tree line (same as Euclidean PCA)
• Have corresponding score (length of projection along line)
• And analog of residual (distance from data point to projection)
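
The three quantities above can be sketched as follows, assuming trees are sets of node indices, a treeline is a list of nested trees, and distance is the size of the symmetric difference. These representations and names are illustrative choices, not taken from the slides.

```python
def tree_distance(s, t):
    """Number of nodes present in exactly one of the two trees."""
    return len(set(s) ^ set(t))


def project_onto_treeline(tree, treeline):
    """Return (projection, score, residual) for one data tree."""
    dists = [tree_distance(tree, u) for u in treeline]
    score = min(range(len(treeline)), key=dists.__getitem__)  # position along the line
    return treeline[score], score, dists[score]


treeline = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 4, 8}),
]
data_tree = {1, 2, 4, 5}

projection, score, residual = project_onto_treeline(data_tree, treeline)
print(sorted(projection), score, residual)
```

The score counts how many nodes along the line the projection sits from the starting tree, and the residual is what the line fails to explain. Ties are broken toward the smaller score.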

PCA for blood vessel tree data
Individual (each PC separately) Scores Plot

PCA for blood vessel tree data
Data Analytic Goals: Age, Gender
See these?
No…

PCA for blood vessel tree data
Directly study age PC scores
PC1 + PC2
- Thickness: Not Sig’t
- Descendants: Left Sig’t

Upcoming New Approach
Replace Tree-Lines by Tree-Curves:

Upcoming New Approach
Projections on Tree-Curves:

Preliminary Tree-Curve Results
First correlation of structure to age! (Back Trees)

Preliminary Tree-Curve Results
But does not appear everywhere (Left Trees)
Finding locality!
![Page 143: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/143.jpg)
HDLSS Asymptotics
Why study asymptotics?
![Page 144: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/144.jpg)
HDLSS Asymptotics
Why study asymptotics?
An interesting (naïve) quote:
“I don’t look at asymptotics, because I don’t have an infinite sample size”
![Page 145: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/145.jpg)
HDLSS Asymptotics
Why study asymptotics?
An interesting (naïve) quote:
“I don’t look at asymptotics, because I don’t have an infinite sample size”
Suggested perspective:
Asymptotics are a tool for finding simple structure underlying complex entities
![Page 146: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/146.jpg)
HDLSS Asymptotics
Which asymptotics?
- n → ∞ (classical, very widely done)
- d → ∞ ???
  - Sensible?
  - Follow typical “sampling process”?
  - Say anything, as noise level increases???
![Page 147: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/147.jpg)
HDLSS Asymptotics
Which asymptotics?
- n → ∞ & d → ∞
  - n >> d: a few results around (still have classical info in data)
  - n ~ d: random matrices (Iain J., et al) (nothing classically estimable)
- HDLSS asymptotics: n fixed, d → ∞
![Page 148: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/148.jpg)
HDLSS Asymptotics
HDLSS asymptotics: n fixed, d → ∞
- Follow typical “sampling process”?
![Page 149: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/149.jpg)
HDLSS Asymptotics
HDLSS asymptotics: n fixed, d → ∞
- Follow typical “sampling process”?
  - Microarrays: # genes bounded
  - Proteomics, SNPs, …
- A moot point, from perspective: Asymptotics are a tool for finding simple structure underlying complex entities
![Page 150: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/150.jpg)
HDLSS Asymptotics
HDLSS asymptotics: n fixed, d → ∞
- Say anything, as noise level increases???
![Page 151: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/151.jpg)
HDLSS Asymptotics
HDLSS asymptotics: n fixed, d → ∞
- Say anything, as noise level increases???
Yes, there exists simple, perhaps surprising, underlying structure
![Page 152: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/152.jpg)
HDLSS Asymptotics: Simple Paradoxes, I
For $d$ dim’al “Standard Normal” dist’n:
$$Z = (Z_1, \ldots, Z_d)^T \sim N_d(0, I_d)$$
Euclidean Distance to Origin (as $d \to \infty$):
$$\|Z\| = \sqrt{d} + O_p(1)$$
- Data lie roughly on surface of sphere of radius $\sqrt{d}$
- Yet origin is point of “highest density”???
- Paradox resolved by: “density w. r. t. Lebesgue Measure”
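A quick numerical check can make the distance-to-origin statement concrete. The sketch below is not from the slides: it assumes NumPy, and the function name `norm_summary`, the seed, and the number of draws are illustrative choices. It draws standard normal vectors in dimension $d$ and compares their average length with $\sqrt{d}$. The spread of the lengths should stay bounded as $d$ grows, which is what the $O_p(1)$ term expresses.

```python
import numpy as np

def norm_summary(d, n_draws=200, seed=0):
    """Mean and standard deviation of ||Z|| over draws of Z ~ N_d(0, I_d)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_draws, d))   # each row is one d-dimensional draw
    norms = np.linalg.norm(z, axis=1)       # Euclidean distance to the origin
    return norms.mean(), norms.std()

for d in (10, 100, 1000, 10000):
    mean, std = norm_summary(d)
    print(f"d={d:6d}  sqrt(d)={np.sqrt(d):8.2f}  mean ||Z||={mean:8.2f}  sd={std:.2f}")
```

The mean tracks $\sqrt{d}$ while the standard deviation does not grow, so in relative terms the draws concentrate on a thin shell.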
![Page 153: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/153.jpg)
HDLSS Asymptotics: Simple Paradoxes, II
For $d$ dim’al “Standard Normal” dist’n: $Z_1$ indep. of $Z_2 \sim N_d(0, I_d)$
Euclidean Dist. between $Z_1$ and $Z_2$ (as $d \to \infty$):
Distance tends to non-random constant:
$$\|Z_1 - Z_2\| = \sqrt{2d} + O_p(1)$$
Can extend to $Z_1, \ldots, Z_n$
Where do they all go??? (we can only perceive 3 dim’ns)
![Page 154: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/154.jpg)
HDLSS Asymptotics: Simple Paradoxes, III
For $d$ dim’al “Standard Normal” dist’n: $Z_1$ indep. of $Z_2 \sim N_d(0, I_d)$
High dim’al Angles (as $d \to \infty$):
$$\mathrm{Angle}(Z_1, Z_2) = 90^\circ + O_p(d^{-1/2})$$
- “Everything is orthogonal”???
- Where do they all go??? (again our perceptual limitations)
- Again 1st order structure is non-random
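The two-point statements (distance near $\sqrt{2d}$, angle near $90^\circ$) can be checked the same way. This is a minimal sketch assuming NumPy. The helper name `distance_and_angle` and the seed are hypothetical and are not part of the original talk.

```python
import numpy as np

def distance_and_angle(d, seed=0):
    """Distance and angle (degrees) between two independent N_d(0, I_d) draws."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(d)
    z2 = rng.standard_normal(d)
    dist = np.linalg.norm(z1 - z2)
    cos = z1 @ z2 / (np.linalg.norm(z1) * np.linalg.norm(z2))
    # clip guards against round-off pushing the cosine outside [-1, 1]
    angle = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return dist, angle

for d in (10, 1000, 100000):
    dist, angle = distance_and_angle(d)
    print(f"d={d:6d}  dist/sqrt(2d)={dist / np.sqrt(2 * d):.3f}  angle={angle:.2f} deg")
```

As $d$ increases, the distance ratio should approach 1 and the angle should settle near 90 degrees, with fluctuations in the angle shrinking at roughly the $d^{-1/2}$ rate.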
![Page 155: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/155.jpg)
HDLSS Asy’s: Geometrical Representation, I
Assume $Z_1, \ldots, Z_n \sim N_d(0, I_d)$, let $d \to \infty$
Study Subspace Generated by Data
a. Hyperplane through 0, of dimension $n$
b. Points are “nearly equidistant to 0”, & dist $\sim \sqrt{d}$
c. Within plane, can “rotate towards $\sqrt{d} \times$ Unit Simplex”
d. All Gaussian data sets are “near Unit Simplex Vertices”!!!
“Randomness” appears only in rotation of simplex
With P. Hall & A. Neeman
![Page 156: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/156.jpg)
HDLSS Asy’s: Geometrical Representation, II
Assume $Z_1, \ldots, Z_n \sim N_d(0, I_d)$, let $d \to \infty$
Study Hyperplane Generated by Data
a. $n - 1$ dimensional hyperplane
b. Points are pairwise equidistant, dist $\sim \sqrt{2d}$
c. Points lie at vertices of “regular $n$-hedron”
d. Again “randomness in data” is only in rotation
e. Surprisingly rigid structure in data?
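The “regular $n$-hedron” picture says that, for fixed $n$ and large $d$, all pairwise distances among the data points are nearly equal to $\sqrt{2d}$. The sketch below illustrates this under the Gaussian assumption stated on the slide. It uses NumPy, and `pairwise_distance_ratios` is an illustrative name, not something from the talk.

```python
import numpy as np

def pairwise_distance_ratios(n, d, seed=0):
    """All n(n-1)/2 pairwise distances among n draws from N_d(0, I_d), divided by sqrt(2d)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, d))
    gram = z @ z.T                               # n x n matrix of inner products
    sq_norms = np.diag(gram)
    # ||zi - zj||^2 = ||zi||^2 + ||zj||^2 - 2 <zi, zj>
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * gram
    iu = np.triu_indices(n, k=1)                 # each unordered pair once
    return np.sqrt(np.maximum(sq_dists[iu], 0.0)) / np.sqrt(2.0 * d)

for d in (10, 1000, 100000):
    ratios = pairwise_distance_ratios(n=5, d=d)
    print(f"d={d:6d}  min ratio={ratios.min():.3f}  max ratio={ratios.max():.3f}")
```

When the minimum and maximum ratios are both close to 1, the five points are close to the vertices of a regular simplex with edge length $\sqrt{2d}$.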
![Page 157: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/157.jpg)
HDLSS Asy’s: Geometrical Representation, III
Simulation View: shows “rigidity after rotation”
![Page 158: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/158.jpg)
HDLSS Asy’s: Geometrical Representation, III
Straightforward Generalizations:
- non-Gaussian data: only need moments
- non-independent: use “mixing conditions” (with P. Hall & A. Neeman)
- Mild Eigenvalue condition on Theoretical Cov. (with J. Ahn, K. Muller & Y. Chi)
- Mixing Condition on Stand’d & Permuted Var’s (with S. Jung)
All based on simple “Laws of Large Numbers”
![Page 159: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/159.jpg)
2nd Paper on HDLSS Asymptotics
Ahn, Marron, Muller & Chi (2007) Biometrika
- Assume 2nd Moments (and Gaussian)
- Assume no eigenvalues too large in sense:
For $\epsilon = \dfrac{\left(\sum_{j=1}^{d} \lambda_j\right)^2}{d \sum_{j=1}^{d} \lambda_j^2}$ assume $\epsilon^{-1} = o(d)$, i.e. $\epsilon \gg 1/d$ (min possible)
(much weaker than previous mixing conditions…)
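To get a feel for the eigenvalue condition, it can help to evaluate the quantity $\epsilon = (\sum_j \lambda_j)^2 / (d \sum_j \lambda_j^2)$ for a few covariance spectra. The snippet below is an illustrative sketch assuming NumPy. The function name `sphericity` and the example spectra (a single spike of size $d^{\alpha}$ on top of unit eigenvalues) are choices made here, not taken from the paper. The quantity equals 1 for identity covariance and $1/d$ when a single eigenvalue carries everything, so the product $d \cdot \epsilon$ shows how far a spectrum is from the minimum.

```python
import numpy as np

def sphericity(eigenvalues):
    """epsilon = (sum lambda_j)^2 / (d * sum lambda_j^2), which lies between 1/d and 1."""
    lam = np.asarray(eigenvalues, dtype=float)
    d = lam.size
    return lam.sum() ** 2 / (d * (lam ** 2).sum())

for alpha in (0.5, 1.0):
    for d in (100, 10000):
        lam = np.ones(d)
        lam[0] = d ** alpha          # hypothetical single-spike spectrum
        eps = sphericity(lam)
        print(f"alpha={alpha}  d={d:6d}  epsilon={eps:.4f}  d*epsilon={d * eps:.1f}")
```

For the smaller spike, $d \cdot \epsilon$ grows with $d$ (the condition $\epsilon \gg 1/d$ holds). For $\alpha = 1$ it stays bounded, so the condition fails in that example.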
![Page 160: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/160.jpg)
HDLSS Asy’s: Geometrical Representation, IV
Explanation of Observed (Simulation) Behavior: “everything similar for very high d”
- 2 popn’s are 2 simplices (i.e. regular n-hedrons)
- All are same distance from the other class
- i.e. everything is a support vector
- i.e. all sensible directions show “data piling”
- so “sensible methods are all nearly the same”
- Including 1-NN
![Page 161: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/161.jpg)
HDLSS Asy’s: Geometrical Representation, V
Further Consequences of Geometric Representation
1. Inefficiency of DWD for uneven sample size (motivates “weighted version”, work in progress)
2. DWD more “stable” than SVM (based on “deeper limiting distributions”) (reflects intuitive idea “feeling sampling variation”) (something like “mean vs. median”)
3. 1-NN rule inefficiency is quantified.
![Page 162: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/162.jpg)
HDLSS Math. Stat. of PCA, I
Consistency & Strong Inconsistency:
Spike Covariance Model (Johnstone & Paul)
For Eigenvalues: $\lambda_{1,d} = d^{\alpha},\ \lambda_{2,d} = 1,\ \ldots,\ \lambda_{d,d} = 1$
1st Eigenvector: $u_1$
How good are empirical versions, $\hat\lambda_{1,d}, \ldots, \hat\lambda_{d,d}, \hat u_1$, as estimates?
![Page 163: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/163.jpg)
HDLSS Math. Stat. of PCA, II
Consistency (big enough spike): For $\alpha > 1$, $\mathrm{Angle}(u_1, \hat u_1) \to 0$
Strong Inconsistency (spike not big enough): For $\alpha < 1$, $\mathrm{Angle}(u_1, \hat u_1) \to 90^\circ$
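The consistency dichotomy can be seen in a small simulation of the single-spike model with eigenvalues $d^{\alpha}, 1, \ldots, 1$ and true first eigenvector $u_1 = e_1$. The following is a sketch under assumptions made here (Gaussian data, known mean zero so no centering, NumPy's SVD). The function name `pc1_angle`, the sample size, and the seed are illustrative.

```python
import numpy as np

def pc1_angle(d, alpha, n=20, seed=0):
    """Angle (degrees) between the true first eigenvector e_1 and the first
    sample eigenvector, for n draws from N_d(0, diag(d^alpha, 1, ..., 1))."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, d))
    x[:, 0] *= np.sqrt(float(d) ** alpha)     # first coordinate gets variance d^alpha
    # The first right singular vector of x is the first eigenvector of x^T x / n
    # (uncentered sample covariance; the mean is assumed known to be zero).
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    cos = min(abs(vt[0, 0]), 1.0)             # |<u1_hat, e_1>|, sign is arbitrary
    return np.degrees(np.arccos(cos))

for alpha in (0.2, 1.5):
    for d in (100, 2000, 20000):
        print(f"alpha={alpha}  d={d:6d}  angle={pc1_angle(d, alpha):6.2f} deg")
```

With the larger exponent the angle should shrink toward 0 as $d$ grows, and with the smaller exponent it should drift toward 90 degrees, in line with the two cases on the slide.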
![Page 164: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/164.jpg)
HDLSS Math. Stat. of PCA, III
Consistency of eigenvalues?
$$\frac{\hat\lambda_{1,d}}{\lambda_{1,d}} \xrightarrow{\ L\ } \frac{\chi_n^2}{n} \quad (d \to \infty)$$
- Eigenvalues Inconsistent
- But known distribution
- Unless $n \to \infty$ as well
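The limiting law for the first sample eigenvalue can also be probed by simulation. The sketch below assumes the single-spike Gaussian model with a large spike ($\alpha = 2$, chosen here for illustration), reads the slide's statement as $\hat\lambda_{1,d} / \lambda_{1,d} \to \chi^2_n / n$ in distribution, and uses NumPy. The name `top_eigenvalue_ratios` and all parameter values are hypothetical. A $\chi^2_n / n$ variable has mean 1 and variance $2/n$, so those two moments give a simple check.

```python
import numpy as np

def top_eigenvalue_ratios(d=2000, alpha=2.0, n=10, n_rep=500, seed=0):
    """Simulated values of lambda1_hat / lambda1 under the single-spike model."""
    rng = np.random.default_rng(seed)
    lam1 = float(d) ** alpha
    ratios = np.empty(n_rep)
    for i in range(n_rep):
        x = rng.standard_normal((n, d))
        x[:, 0] *= np.sqrt(lam1)                  # spike of size d^alpha in coordinate 1
        # Nonzero eigenvalues of x^T x / n equal those of the small n x n matrix x x^T / n.
        top = np.linalg.eigvalsh(x @ x.T / n)[-1]  # eigvalsh sorts ascending
        ratios[i] = top / lam1
    return ratios

n = 10
r = top_eigenvalue_ratios(n=n)
print(f"mean ratio = {r.mean():.3f}  (chi2_n/n has mean 1)")
print(f"var  ratio = {r.var():.3f}  (chi2_n/n has variance {2 / n:.3f})")
```

The ratio does not concentrate at 1 for fixed $n$, which is the sense in which the eigenvalue estimate is inconsistent, yet its distribution is known.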
![Page 165: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/165.jpg)
HDLSS Work in Progress, I
Batch Adjustment: Xuxin Liu
Recall Intuition from above:
- Key is sizes of biological subtypes
- Differing ratio trips up mean
- But DWD more robust
Mathematics behind this?
![Page 166: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/166.jpg)
Liu: Twiddle ratios of subtypes
![Page 167: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/167.jpg)
HDLSS Data Combo Mathematics
Xuxin Liu Dissertation Results:
- Simple Unbalanced Cluster Model
- Growing at rate $d^{\alpha}$ as $d \to \infty$
- Answers depend on $\alpha$
Visualization of setting….
![Page 168: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/168.jpg)
HDLSS Data Combo Mathematics
![Page 169: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/169.jpg)
HDLSS Data Combo Mathematics
![Page 170: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/170.jpg)
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
For $\alpha > 1/2$, DWD Consistent: $\mathrm{Angle}(\mathrm{DWD}, \mathrm{Truth}) \to 0$
For $\alpha < 1/2$, DWD Strongly Inconsistent: $\mathrm{Angle}(\mathrm{DWD}, \mathrm{Truth}) \to 90^\circ$
![Page 171: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/171.jpg)
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
For $\alpha > 1/2$, PAM Inconsistent: $\mathrm{Angle}(\mathrm{PAM}, \mathrm{Truth}) \to C_r \neq 0$
For $\alpha < 1/2$, PAM Strongly Inconsistent: $\mathrm{Angle}(\mathrm{PAM}, \mathrm{Truth}) \to 90^\circ$
![Page 172: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/172.jpg)
HDLSS Data Combo Mathematics
Value of $C_r$, for sample size ratio $r$:
$C_r =$ [inverse-cosine expression in $r$; formula not recoverable from this transcript]
$C_r = 0$, only when $r = 1$
Otherwise for $r \neq 1$, PAM Inconsistent
Verifies intuitive idea in strong way
![Page 173: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/173.jpg)
HDLSS Work in Progress, II
Canonical Correlations: Myung Hee Lee
- Results similar to those for PCA
- Singular values inconsistent
- But directions converge under a much milder spike assumption.
![Page 174: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/174.jpg)
HDLSS Work in Progress, III
Conditions for Geo. Rep’n & PCA Consist.:
John Kent example:
$$X \sim \tfrac{1}{2} N_d(0, I_d) + \tfrac{1}{2} N_d(0, 100 \cdot I_d)$$
Can only say:
$$\|X\| = O_p(d^{1/2}) = \begin{cases} d^{1/2} & \text{w.p. } 1/2 \\ 10\, d^{1/2} & \text{w.p. } 1/2 \end{cases}$$
not deterministic
Conclude: need some flavor of mixing
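The Kent mixture example is easy to simulate: each draw is either standard normal or standard normal scaled to variance 100, with probability 1/2 each, so the rescaled length $\|X\| / d^{1/2}$ should sit near 1 or near 10 rather than at a single value. The sketch below assumes NumPy and uses the fact that $\|X\|^2$ is a scaled chi-square with $d$ degrees of freedom, so the $d$-dimensional vectors never need to be stored. The name `kent_scaled_norms` and the seed are illustrative.

```python
import numpy as np

def kent_scaled_norms(d, n_draws=2000, seed=0):
    """Draws of ||X|| / sqrt(d) for X ~ 0.5 N_d(0, I_d) + 0.5 N_d(0, 100 I_d)."""
    rng = np.random.default_rng(seed)
    # standard deviation 1 or 10 (variance 1 or 100), each with probability 1/2
    sd = np.where(rng.random(n_draws) < 0.5, 1.0, 10.0)
    # ||sd * Z||^2 = sd^2 * chi2_d for Z ~ N_d(0, I_d)
    chi2 = rng.chisquare(d, size=n_draws)
    return sd * np.sqrt(chi2 / d)

r = kent_scaled_norms(d=10000)
near_one = np.abs(r - 1.0) < 0.1
near_ten = np.abs(r - 10.0) < 1.0
print(f"fraction near 1: {near_one.mean():.3f}   fraction near 10: {near_ten.mean():.3f}")
```

The lengths split into two tight groups, so no single deterministic radius describes the data. This is the failure of the geometric representation that motivates a mixing-type condition.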
![Page 175: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/175.jpg)
HDLSS Work in Progress, III
Conditions for Geo. Rep’n & PCA Consist.:
Conclude: need some flavor of mixing
Challenge: Classical mixing conditions require notion of time ordering
Not always clear, e.g. microarrays
![Page 176: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/176.jpg)
HDLSS Work in Progress, III
Conditions for Geo. Rep’n & PCA Consist.:
Sungkyu Jung Condition:
$X_d \sim (0, \Sigma_d)$ where $\Sigma_d = U_d \Lambda_d U_d^t$
Define: $Z_d = \Lambda_d^{-1/2} U_d^t X_d$
Assume: ∃ a permutation, $\pi_d$, so that $\pi_d Z_d$ is ρ-mixing
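The standardization step in this condition, $Z_d = \Lambda_d^{-1/2} U_d^t X_d$ with $\Sigma_d = U_d \Lambda_d U_d^t$, can be written in a few lines. The sketch below covers only that step (not the permutation or the ρ-mixing check), assumes NumPy and a known positive definite $\Sigma_d$, and uses an arbitrary small covariance matrix invented here for the demonstration. The name `standardize` is illustrative.

```python
import numpy as np

def standardize(x, sigma):
    """Return Z = Lambda^{-1/2} U^T x, where sigma = U Lambda U^T.

    x has shape (d, n_obs): each column is one observation.
    """
    lam, u = np.linalg.eigh(sigma)              # eigenvalues lam, eigenvectors in columns of u
    return (u.T @ x) / np.sqrt(lam)[:, None]    # scale each rotated coordinate to unit variance

rng = np.random.default_rng(0)
d, n_obs = 4, 20000
a = rng.standard_normal((d, d))
sigma = a @ a.T + np.eye(d)                     # arbitrary positive definite covariance
x = np.linalg.cholesky(sigma) @ rng.standard_normal((d, n_obs))   # columns ~ N_d(0, sigma)
z = standardize(x, sigma)
print(np.round(np.cov(z), 2))                   # should be close to the identity matrix
```

After this transformation the coordinates of $Z_d$ are uncorrelated with unit variance, which is the starting point for asking whether some reordering of them is ρ-mixing.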
![Page 177: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/177.jpg)
HDLSS Deep Open Problem
In PCA Consistency:
- Strong Inconsistency - spike $\alpha < 1$
- Consistency - spike $\alpha > 1$
What happens at boundary ($\alpha = 1$)???
![Page 178: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/178.jpg)
The Future of HDLSS Asymptotics?
1. Address your favorite statistical problem…
2. HDLSS versions of classical optimality results?
3. Contiguity Approach (~Random Matrices)
4. Rates of convergence?
5. Improved Discrimination Methods?
It is early days…
![Page 179: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/179.jpg)
The Future of Geometrical Representation?
- HDLSS version of “optimality” results?
- “Contiguity” approach? Params depend on d?
- Rates of Convergence?
- Improvements of DWD? (e.g. other functions of distance than inverse)
It is still early days …
![Page 180: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/180.jpg)
Some Carry Away Lessons
- Atoms of the Analysis: Object Oriented
- Viewpoint: Object Space ↔ Feature Space
- DWD is attractive for HDLSS classification
- “Randomness” in HDLSS data is only in rotations (Modulo rotation, have constant simplex shape)
- How to put HDLSS asymptotics to work?
![Page 181: 1 UNC, Stat & OR ??? Place ??? Object Oriented Data Analysis J. S. Marron Dept. of Statistics and Operations Research, University of North Carolina January](https://reader036.vdocuments.net/reader036/viewer/2022070605/5a4d1ae47f8b9ab059978877/html5/thumbnails/181.jpg)
Object Oriented Data Analysis
Potential Future Opportunity:
OODA SAMSI Program 2010-2011
Interested in joining? Let’s talk