Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15
TRANSCRIPT
![Page 1: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/1.jpg)
Shape as Organizing Principle for Data
MLConf Seattle 2015
Anthony Bak, Principal Data Scientist
![Page 2: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/2.jpg)
The Data Problem: Complexity
![Page 3: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/3.jpg)
Solution: Topological Summaries
![Page 4: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/4.jpg)
Shape as Organizing Principle for Data
![Page 5: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/5.jpg)
Shape as Organizing Principle
![Page 6: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/6.jpg)
Reduce Bias, Discover Models
TDA tells you the data you have, not the data you want to have.
![Page 7: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/7.jpg)
Generating Topological Summaries
![Page 8: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/8.jpg)
Generating Topological Summaries
![Page 9: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/9.jpg)
Generating Topological Summaries
![Page 10: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/10.jpg)
Generating Topological Summaries
![Page 11: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/11.jpg)
Generating Topological Summaries
![Page 12: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/12.jpg)
Generating Topological Summaries
![Page 13: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/13.jpg)
Generating Topological Summaries
![Page 14: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/14.jpg)
Generating Topological Summaries
![Page 15: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/15.jpg)
Generating Topological Summaries
![Page 16: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/16.jpg)
Generating Topological Summaries
![Page 17: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/17.jpg)
Generating Topological Summaries
![Page 18: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/18.jpg)
Generating Topological Summaries
![Page 19: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/19.jpg)
Generating Topological Summaries
![Page 20: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/20.jpg)
Generating Topological Summaries
![Page 21: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/21.jpg)
Generating Topological Summaries
![Page 22: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/22.jpg)
Generating Topological Summaries
![Page 23: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/23.jpg)
Generating Topological Summaries
![Page 24: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/24.jpg)
Remember/Forget
Use multiple lenses/metrics to get the complete picture
Different lenses provide different summaries
![Page 25: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/25.jpg)
Generating Topological Summaries
![Page 26: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/26.jpg)
Lenses: where do they come from?
Statistics: Mean/Max/Min, Variance, n-Moment, Density, …
Machine Learning: PCA/SVD, Autoencoders, Isomap/MDS/TSNE, …
Geometry: Centrality, Curvature, Harmonic Cycles, …
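To make the idea of a lens concrete, here is a minimal sketch of a few of the lenses named on this slide, each written as a function that assigns one number to every data point. The function names, the Gaussian bandwidth, and the random test data are assumptions made for illustration; the slides do not specify how any lens is implemented.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def lens_mean(X):
    """Statistics lens: mean of each point's features."""
    return X.mean(axis=1)

def lens_variance(X):
    """Statistics lens: variance of each point's features."""
    return X.var(axis=1)

def lens_density(X, bandwidth=1.0):
    """Statistics lens: unnormalized Gaussian kernel density at each point.
    The bandwidth is a hypothetical default, not a value from the talk."""
    D = pairwise_distances(X)
    return np.exp(-(D ** 2) / (2 * bandwidth ** 2)).sum(axis=1)

def lens_pca1(X):
    """Machine learning lens: score on the first principal component."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[0]

def lens_centrality(X):
    """Geometry lens: distance from each point to the farthest other point."""
    return pairwise_distances(X).max(axis=1)

# Illustrative data only: 200 random points with 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

lenses = {
    "mean": lens_mean(X),
    "variance": lens_variance(X),
    "density": lens_density(X),
    "pca1": lens_pca1(X),
    "centrality": lens_centrality(X),
}
for name, values in lenses.items():
    print(name, values.shape)
```

Each lens returns one value per point, so any of them (or several together) can be used to slice the data before building a summary.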
![Page 27: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/27.jpg)
Why Topology?
![Page 28: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/28.jpg)
Key Properties of TDA
Deformation Invariance
Compressed Representation
Coordinate Freeness
![Page 29: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/29.jpg)
Coordinate Invariance
1. Topology of shape doesn’t depend on the coordinates used to describe the shape
2. Different feature sets can describe the same phenomena
3. While processing data, we frequently alter coordinates: scaling, rotating, whitening
You want to study properties of your data that are invariant under coordinate changes
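As a small illustration of the point above, the sketch below checks two of the coordinate changes listed: an orthogonal change of coordinates (a rotation, possibly with a reflection) leaves all pairwise distances unchanged, and a uniform scaling multiplies them all by the same factor, so each point keeps the same nearest neighbor. The data, the scale factor, and the helper names are assumptions chosen for the example. Whitening is not distance-preserving in general, so it is left out of this check.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def random_orthogonal(dim, rng):
    """Orthogonal matrix taken from the QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    return Q

def nearest_neighbor(D):
    """Index of the closest other point for every row of a distance matrix."""
    D = D.copy()
    np.fill_diagonal(D, np.inf)
    return D.argmin(axis=1)

# Illustrative data only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
Q = random_orthogonal(3, rng)

D_original = pairwise_distances(X)
D_rotated = pairwise_distances(X @ Q)    # new orthogonal coordinates
D_scaled = pairwise_distances(2.0 * X)   # uniform rescaling of every feature

rotation_preserves = np.allclose(D_original, D_rotated)
scaling_rescales = np.allclose(2.0 * D_original, D_scaled)
same_neighbors = np.array_equal(nearest_neighbor(D_original),
                                nearest_neighbor(D_scaled))

print(rotation_preserves, scaling_rescales, same_neighbors)
```

Anything built purely from relative distances, such as neighbor graphs, therefore gives the same answer in either coordinate system.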
![Page 30: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/30.jpg)
Coordinate Invariance: Gene Expression
NKI
GSE230
![Page 31: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/31.jpg)
Coordinate Invariance: Disease State
![Page 32: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/32.jpg)
Deformation Invariance
• Topological features don’t change when you stretch and distort the data
Advantage: Makes problems easier
Noise resistance
Less pre-processing of data
Robust (stable) data
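A minimal sketch of deformation invariance for the simplest topological feature, the number of connected components. Two grid-shaped clusters are stretched and bent by a hypothetical `deform` map, and the component count of the neighborhood graph is the same before and after. The data, the map, and the scale `eps` are all assumptions picked for the example; a strong enough distortion at a fixed `eps` could change the count.

```python
import numpy as np

def count_components(X, eps):
    """Connected components of the graph joining points at distance <= eps."""
    n = len(X)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    close_pairs = np.argwhere(np.triu(D <= eps, k=1)).tolist()
    for i, j in close_pairs:
        parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

def deform(X):
    """Hypothetical distortion: stretch x by 1.5 and bend it with sin(y)."""
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([1.5 * x + 0.2 * np.sin(y), y])

# Two 5x5 grids with spacing 0.5, the second shifted 10 units to the right.
g = np.arange(5) * 0.5
grid = np.array([(x, y) for x in g for y in g])
X = np.vstack([grid, grid + [10.0, 0.0]])

before = count_components(X, eps=0.8)
after = count_components(deform(X), eps=0.8)
print(before, after)
```

Both counts come out as 2: the coordinates of every point changed, but the fact that the data forms two pieces did not.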
![Page 33: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/33.jpg)
Deformation Invariance
![Page 34: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/34.jpg)
Deformation Invariance
![Page 35: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/35.jpg)
Deformation Invariance
![Page 36: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/36.jpg)
Deformation Invariance
![Page 37: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/37.jpg)
Compressed Representation
• Replace the metric space with a combinatorial summary: a simplicial complex.
• Data becomes easier to manage, search, and query while maintaining essential features.
• Leverages many known algorithms from graph theory, computational topology, computational geometry.
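One way to picture the compression is a Mapper-style construction: slice the data with a lens, cluster each slice, and connect clusters that share points. The slides do not spell out the algorithm, so this is a sketch under that assumption rather than the speaker's implementation; the function names, the hand-picked intervals, and `eps` are all hypothetical.

```python
import numpy as np
from itertools import combinations

def eps_clusters(points, indices, eps):
    """Split `indices` into connected components of the eps-neighborhood graph."""
    unvisited = set(indices)
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        cluster = set(stack)
        while stack:
            i = stack.pop()
            near = [j for j in unvisited
                    if np.linalg.norm(points[i] - points[j]) <= eps]
            unvisited.difference_update(near)
            cluster.update(near)
            stack.extend(near)
        clusters.append(frozenset(cluster))
    return clusters

def mapper_graph(points, lens_values, intervals, eps):
    """Nodes are clusters within each lens interval; edges join overlapping clusters."""
    nodes = []
    for lo, hi in intervals:
        inside = np.flatnonzero((lens_values >= lo) & (lens_values <= hi))
        nodes.extend(eps_clusters(points, inside.tolist(), eps))
    edges = [(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]]
    return nodes, edges

# Toy data: 60 evenly spaced points on the unit circle, lens = x-coordinate.
theta = 2 * np.pi * np.arange(60) / 60
circle = np.column_stack([np.cos(theta), np.sin(theta)])

# Hand-picked overlapping intervals covering the lens range [-1, 1].
intervals = [(-1.1, -0.3), (-0.7, 0.2), (-0.2, 0.7), (0.3, 1.1)]

nodes, edges = mapper_graph(circle, circle[:, 0], intervals, eps=0.15)
print(len(circle), "points ->", len(nodes), "nodes,", len(edges), "edges")
```

On this toy circle the 60 points collapse to 6 nodes joined by 6 edges in a single loop, so the summary is far smaller than the data but still records that the shape is a circle.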
![Page 38: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/38.jpg)
Compressed Representation
![Page 39: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/39.jpg)
Baby Steps: PCA
![Page 40: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/40.jpg)
PCA
![Page 41: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/41.jpg)
PCA
![Page 42: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/42.jpg)
Data Stories
![Page 43: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/43.jpg)
Model Introspection
![Page 44: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/44.jpg)
Model Introspection
![Page 45: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/45.jpg)
Predictive Maintenance
![Page 46: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/46.jpg)
Customer Churn
![Page 47: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/47.jpg)
Customer Churn
![Page 48: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/48.jpg)
Transaction Fraud
![Page 49: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/49.jpg)
Transaction Fraud
![Page 50: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/50.jpg)
Transaction Fraud
![Page 51: Anthony Bak, Principal Data Scientist at Ayasdi at MLconf SEA - 5/01/15](https://reader034.vdocuments.net/reader034/viewer/2022042607/55a686eb1a28ab501e8b456b/html5/thumbnails/51.jpg)
We’re Hiring! http://www.ayasdi.com/company/careers/
Data Has Shape and Shape Has Meaning