Topological Data Analysis and Network Coverage
EE122 Project, Spring 2014
Rey Blume, Eric Chu
TDA: Motivation and Intro
• http://www.coloquios.info/ponencias/MBGT-TopologicalDataAnalysis.pdf
• Algebraic topology
• Increasing amounts of data are being produced, much of it high-dimensional
• Qualitative information helps make sense of the data and reveal its structure
• Metrics and coordinates used to calculate statistical values (means, distances, etc.) are often unjustified, e.g. in biological problems
• Clustering algorithms are brittle to the choice of epsilon
• Why topology?
– Study of qualitative geometric information, connectivity
– Less sensitive to the actual choice of metric, coordinate-free
– Functoriality (inclusion maps between spaces, in our case complexes, allow us to draw conclusions at the global level from local pieces)
• TDA: Applications
Basic Approach
Ex. Torus
Algebraic Topology: Topology
• ‘Rubber-sheet geometry’
• Continuity
• Different levels of ‘sameness’
– Homeomorphism, homotopy equivalence
– Homology computationally tractable
– Ex.
Simplicial Complexes
• Triangulation of a space X is a simplicial complex K with a homeomorphism |K| -> X
– Thm. Choice of triangulation doesn’t matter for the homology
– Therefore, can go in reverse: point-cloud -> complex -> space
• A simplicial complex K is a set of simplices where
– Any face of a simplex in K is also in K
– Intersection of any two simplices i, j in K is either empty or a face of both i and j
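The face condition above can be checked mechanically. The following is a minimal sketch, assuming simplices are given abstractly as tuples of vertex labels (in that setting the intersection condition holds automatically, so only closure under faces is tested). The function names `is_simplicial_complex` and `closure` are illustrative, not taken from the slides.

```python
from itertools import combinations


def is_simplicial_complex(simplices):
    """Return True if every non-empty face of every simplex is also present."""
    complex_ = {frozenset(s) for s in simplices}
    for simplex in complex_:
        # Proper faces have between 1 and len(simplex) - 1 vertices.
        for size in range(1, len(simplex)):
            for face in combinations(simplex, size):
                if frozenset(face) not in complex_:
                    return False
    return True


def closure(simplices):
    """Add every missing non-empty face so the result is a valid complex."""
    closed = set()
    for simplex in simplices:
        simplex = tuple(simplex)
        for size in range(1, len(simplex) + 1):
            closed.update(frozenset(f) for f in combinations(simplex, size))
    return closed
```

For example, a lone edge `[(0, 1)]` is not a complex until its two endpoints are added, and the closure of one triangle contains 7 simplices (3 vertices, 3 edges, 1 face).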
Simplicial Complexes
• Čech Complex
– Intersection of epsilon/2 balls = edge
– Too computationally intensive for any n-simplex with n > 1 (must test whether n+1 balls share a common point)
• Therefore, Rips Complex
– Pair-wise computations only: add an edge if distance < epsilon
– Add higher-dimensional simplices whenever possible (all of their faces have already been added)
– Not guaranteed to be homotopy equivalent to the union of balls covering the point set, but seems to work reasonably well
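The Rips rule (pairwise distances only, then fill in every simplex whose edges are all present) can be sketched in a few lines. This is a brute-force illustration under assumed conventions, not the JavaPlex implementation: points are coordinate tuples, the strict `distance < epsilon` test from the slide is used, and `rips_complex` and `max_dim` are hypothetical names.

```python
from itertools import combinations
from math import dist


def rips_complex(points, epsilon, max_dim=2):
    """Brute-force Vietoris-Rips complex up to dimension max_dim.

    Simplices are returned as sorted tuples of point indices.
    """
    n = len(points)
    # Pairwise step: an edge exists when two points are closer than epsilon.
    close = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if dist(points[i], points[j]) < epsilon
    }
    simplices = [(i,) for i in range(n)]
    # A set of k vertices spans a (k-1)-simplex when all of its pairs are edges.
    for k in range(2, max_dim + 2):
        for verts in combinations(range(n), k):
            if all(pair in close for pair in combinations(verts, 2)):
                simplices.append(verts)
    return simplices
```

On the four corners of a unit square, epsilon = 1.1 gives only the four sides (a loop), while epsilon = 1.5 also admits the diagonals and therefore all four triangles. The enumeration is exponential in `max_dim`, which is consistent with the construction becoming a bottleneck on larger inputs.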
Rips Complex
Homology Groups
• Groups
– Set of elements with an operation that satisfies closure, associativity, identity element, inverse element
– Ex. Integers with addition; integers with multiplication is NOT a group (no inverses); symmetry group (e.g. square with rotations)
• Chain groups
– k-chain = sum of oriented k-simplices
– C_k (K) := kth chain group, set of all k-chains
– Relate chain groups of successive dimensions through the boundary operator
• Alternating sum of its faces
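Written out, the "alternating sum of its faces" is the standard boundary formula below. The symbols (vertices v_0, ..., v_k, and a hat marking the omitted vertex) are notation assumed here, since the slide's own formula did not survive extraction. The triangle example also shows why applying the boundary twice gives zero.

```latex
\partial_k [v_0, v_1, \dots, v_k]
  = \sum_{i=0}^{k} (-1)^i \, [v_0, \dots, \hat{v}_i, \dots, v_k]

% Example: an oriented triangle [a, b, c]
\partial_2 [a, b, c] = [b, c] - [a, c] + [a, b]

\partial_1 \partial_2 [a, b, c] = (c - b) - (c - a) + (b - a) = 0
```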
Homology Groups
• Chain complex C*
• kth cycle group
• kth boundary group
• Homology groups from chain groups
– “cycles mod boundaries” - boundaries become identity element in new group
– Cycles that aren’t boundaries are holes
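The formulas for these bullets appear to have been images on the original slide. In standard notation, writing \partial_k for the boundary operator on C_k(K), they can be reconstructed as follows (a textbook statement, not a transcription of the slide):

```latex
% Chain complex C_*: boundary maps linking successive chain groups
\cdots \xrightarrow{\partial_{k+1}} C_k(K) \xrightarrow{\partial_k} C_{k-1}(K)
       \xrightarrow{\partial_{k-1}} \cdots,
\qquad \partial_k \circ \partial_{k+1} = 0

% kth cycle group and kth boundary group
Z_k = \ker \partial_k, \qquad B_k = \operatorname{im} \partial_{k+1},
\qquad B_k \subseteq Z_k

% kth homology group: cycles mod boundaries
H_k = Z_k / B_k
```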
Betti Numbers
• Rank of nth homology group = nth Betti number
• Computation of rank is just linear algebra given simplices
– Rank-Nullity Thm.
• Euler-Poincare Formula
– Alternating sum of Betti numbers equals the Euler characteristic (topological invariant)
• Betti-0 = # connected components
• Betti-1 = # loops
• Betti-2 = # cavities
• Ex. Circle, Torus, Solid Sphere
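To make the "just linear algebra" point concrete, here is a minimal sketch that computes Betti numbers from a list of simplices via rank-nullity: Betti-k = dim C_k - rank(boundary_k) - rank(boundary_{k+1}). It assumes coefficients in Z/2 (so orientation signs can be ignored) and uses integer bitmasks as matrix rows. The names `rank_gf2` and `betti_numbers` are illustrative.

```python
from itertools import combinations


def rank_gf2(rows):
    """Rank over Z/2 of a matrix whose rows are given as integer bitmasks."""
    pivots = {}  # leading-bit position -> reduced row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = row
                break
            row ^= pivots[lead]  # eliminate the leading bit and keep reducing
    return len(pivots)


def betti_numbers(simplices):
    """Betti numbers (Z/2 coefficients) of the complex generated by simplices."""
    # Close the input under faces so it is a valid simplicial complex.
    faces = set()
    for s in simplices:
        s = tuple(sorted(s))
        for size in range(1, len(s) + 1):
            faces.update(combinations(s, size))
    top = max(len(f) for f in faces) - 1
    by_dim = [sorted(f for f in faces if len(f) == k + 1) for k in range(top + 1)]

    def boundary_rank(k):
        """Rank of the boundary map from k-chains to (k-1)-chains."""
        if k == 0 or k > top:
            return 0
        index = {f: i for i, f in enumerate(by_dim[k - 1])}
        rows = []
        for simplex in by_dim[k]:
            mask = 0
            for face in combinations(simplex, k):  # the (k-1)-dimensional faces
                mask |= 1 << index[face]
            rows.append(mask)
        return rank_gf2(rows)

    return [
        len(by_dim[k]) - boundary_rank(k) - boundary_rank(k + 1)
        for k in range(top + 1)
    ]
```

A hollow triangle (a circle) gives `[1, 1]`: one component, one loop. Filling the triangle kills the loop and gives `[1, 0, 0]`. The four faces of a tetrahedron (a hollow sphere) give `[1, 0, 1]`, whose alternating sum 1 - 0 + 1 = 2 matches the Euler characteristic 4 - 6 + 4.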
Persistent Homology and Barcodes
• Want to ignore noise, capture features that persist
• Increasing epsilon creates different complexes over time
• Barcodes are simply a neat way of capturing that information
– Current research includes statistical methods to analyze barcodes
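For dimension 0 the barcode can be computed without any matrix work: every point is born as its own component at epsilon = 0, and a bar ends each time growing epsilon merges two components. The sketch below uses a union-find over edges sorted by length. It covers Betti-0 only and is an illustration under those assumptions, not the algorithm JavaPlex uses; `h0_barcode` is a hypothetical name.

```python
from itertools import combinations
from math import dist, inf


def h0_barcode(points):
    """Dimension-0 persistence bars (birth, death) of a Rips filtration."""
    n = len(points)
    if n == 0:
        return []
    parent = list(range(n))

    def find(i):
        # Find the component representative, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j) for i, j in combinations(range(n), 2)
    )
    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            bars.append((0.0, length))  # one component dies at this scale
    bars.append((0.0, inf))  # the last component never dies
    return bars
```

For the points (0,0), (1,0), (10,0) this yields bars ending at 1 and 9 plus one infinite bar. The short bar reflects two nearby points merging almost immediately, while the long bar is the kind of feature that persists.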
Persistent Homology and Barcodes
Network Coverage: Problem, Classical Solutions
• Most work fell into one of two groups: approaches that utilized geometric analysis to obtain an exact answer, and those that sought a non-deterministic approximation but assumed significant capabilities of the sensors.
• The former approach requires a great deal of prior knowledge about the geometry of the domain and the exact location of the sensors, or at least exact distances for every pair of sensors. The latter does not require this exactness, but often requires a uniform distribution of nodes or a high level of intelligence in the sensors.
• http://www.elizabethmunch.com/math/research/ElizabethMunch-TimeVaryingPersistence.pdf
Network Coverage: Why TDA?, Applications
• Ghrist
– GPS can be unattractive due to: cost, power consumption, accuracy limitations
• Coverage problem can be solved if we have:
– Exact knowledge of the coverage area shape,
– Exact knowledge of each sensor’s position, and
– Centralized information gathering and processing
• But, using TDA, can solve even if we have:
– Unknown coverage area shape
– Crude proximity information
– Centralized information gathering and processing (still need this one)
• Topology gives global information from local inputs
Network Coverage: Why TDA?, Applications
• Especially applicable for ad-hoc networks, which are a hot area of research, entering public usage
– new iPhone mesh network functionality
– Egypt, openmeshnetwork
• Robotic sensors
Simulation: Intel Lab Data
• http://db.csail.mit.edu/labdata/labdata.html
• 54 sensors in Intel Berkeley Research Lab collecting humidity, temperature, light, etc.
• Computations done using Javaplex and Matlab
Simulation: Intel Lab Data, Euclidean Position
Max_filtration = 100, num_divisions = 100, vietoris rips
Simulation: Intel Lab Data, Euclidean Position
Max_filtration = 100, num_divisions = 50, vietoris rips
Smaller num_divisions = larger gap between homology calculations = lose some granularity
Simulation: Intel Lab Data, Euclidean Position
Max_filtration = 100, num_divisions = 10, vietoris rips
Simulation: Intel Lab Data, Complexity
• Rips complex construction can be a bottleneck
– # of simplices for intel example
• Witness/Lazy-witness creates far fewer simplices than Rips
– Landmark points
• 1) random sampling of landmark points L
• 2) greedy inductive selection process called sequential maxmin
• Formal definitions of each
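The two landmark selection strategies can be sketched as follows. Sequential maxmin starts from a seed point and repeatedly adds the point farthest from the landmarks chosen so far. This is a minimal illustration assuming Euclidean points stored as coordinate tuples. The function names and the `first` and `seed` parameters are hypothetical, not JavaPlex's API.

```python
import random
from math import dist


def random_landmarks(points, num_landmarks, seed=None):
    """Option 1: choose landmark indices uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(range(len(points)), num_landmarks)


def sequential_maxmin(points, num_landmarks, first=0):
    """Option 2: greedily add the point farthest from the current landmarks."""
    if not 0 < num_landmarks <= len(points):
        raise ValueError("num_landmarks must be between 1 and len(points)")
    landmarks = [first]
    # nearest[i] = distance from point i to its closest landmark so far
    nearest = [dist(p, points[first]) for p in points]
    while len(landmarks) < num_landmarks:
        farthest = max(range(len(points)), key=nearest.__getitem__)
        landmarks.append(farthest)
        nearest = [
            min(d, dist(p, points[farthest])) for d, p in zip(nearest, points)
        ]
    return landmarks
```

Maxmin tends to spread landmarks evenly and picks up outliers early, whereas random sampling follows the density of the data. The sketch assumes distinct points; with duplicates it could select the same location twice.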
Simulation: Intel Lab Data, Complexity
Simulation: Intel Lab Data, Complexity
• Results: number of simplices and run-time for each complex under different parameters
• Num_divisions = 50
– Rips: t = 21.2005 s, num_simplices = 342540
– Witness; Lazy Witness (run-time in s, num_simplices):
• 20 L pts: 0.9828, 6195; 0.6864, 6195
• 30 L pts: 1.2792, 31930; 1.0920, 31930
• 40 L pts: 3.9156, 102090; 3.6816, 102090
Witness with only 20 landmark pts is still largely accurate
Simulation: Intel Lab Data, Connectivity Data
• Data: probability that sensor A will be able to talk to sensor B
– Asymmetric
• Create 1-simplex if P(A,B) > thres && P(B,A) > thres
• Create 2-simplex if all directional pairs in triplet > thres
• Global connectivity data from local data
• Studies show poor correlation between distance and signal anyway
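The thresholding rules above translate directly into code. The sketch assumes the connectivity data is an n x n matrix `prob` where `prob[a][b]` is the probability that sensor a reaches sensor b, and it reads the 2-simplex rule as requiring all six directional probabilities in the triplet to exceed the threshold. Both the matrix layout and that reading are assumptions, and `connectivity_complex` is a hypothetical name.

```python
from itertools import combinations


def connectivity_complex(prob, thres):
    """Build vertices, edges and triangles from asymmetric link probabilities."""
    n = len(prob)

    def linked(a, b):
        # Asymmetric data: require the link to be reliable in both directions.
        return prob[a][b] > thres and prob[b][a] > thres

    vertices = [(i,) for i in range(n)]
    edges = [(a, b) for a, b in combinations(range(n), 2) if linked(a, b)]
    edge_set = set(edges)
    # A triangle needs all three of its edges, i.e. all six directed links.
    triangles = [
        t
        for t in combinations(range(n), 3)
        if all(pair in edge_set for pair in combinations(t, 2))
    ]
    return vertices, edges, triangles
```

Raising the threshold removes edges and triangles, so the number of connected components can only grow as the threshold increases.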
Threshold = 0.05, 3, 0, 2026
Threshold = 0.3, 3, 349
Threshold = 0.1, 3, 2, 981
Threshold = 0.3, 4, 118
Threshold = 0.3, 4, 7, 96
Threshold = 0.5, 11, 18
Threshold = 0.5, 11, 1, 3
Threshold = 0.7, 46, 0
Another Dataset/Moving Network?
• http://crawdad.cs.dartmouth.edu/all-byname.html
– Dataset of mobility traces of taxi cabs in San Francisco, USA. !!
– Dataset of WiFi-based connectivity between basestations and vehicles in urban settings.
– Dataset of received signal strength indication (RSSI) collected from within an indoor office building.
– Dataset of coverage and performance-related information of MetroFi, an 802.11x municipal wireless mesh network in Portland, Oregon in 2007. !!
– Data set consisting of measurements from two different wireless mesh network testbeds (802.11g and 802.11a).
• http://www.wings.cs.sunysb.edu/wiki/doku.php?id=mutli-channel-dataset
Future Research
• ns-3, mobile ad-hoc network
– Google Loon, Facebook Drones
• Distributed homology calculation
• Pursuit evasion problem: Betti-0 = 0 over time