Research in MIT’s Laboratory for Information and Decision Systems and in The Stochastic Systems Group
Alan S. Willsky
Director, LIDS
Head, SSG
[email protected]
http://ssg.mit.edu
November 2010
A Brief History of LIDS
The oldest continuing laboratory on campus
  Servomechanism Lab founded in 1940
Major contributions to crucial applications
  Military fire control, numerically controlled machines, …
Pushing emerging computer technologies
  Whirlwind, APT, …
Broadened agenda and name changes: ESL (1950s) and LIDS (1970s)
  Radar - Porcupine point defense
  INTREX - One of the first database systems
  Modern control and optimization
  Robust and adaptive control
  Large-scale and decentralized systems
Continuing history of involvement and partnerships with industry and government (including a number of successful start-ups)
Continuing history of major impact on academic programs and development of widely-used texts
LIDS Now
A center of gravity for research on the analytical information and decision sciences
Our mission: Pushing the envelope and foundations of information and decision sciences in the large
Research “centers of gravity” and traditional core disciplines
  Systems and control
  Optimization
  Networks
  Inference, estimation, learning, and fusion
  Communications and information theory
Major push to work across disciplines, e.g., The Science of Networked Systems
A sampling of application areas
  Coordination/control of autonomous vehicles
  Energy and economic information and decision systems
  Situational awareness
  Biological and biomedical signal and image analysis and modeling
  Large-scale data assimilation for the geosciences
The DARPA Urban Challenge (Joint CSAIL/LIDS): Team MIT (02/13/2008)
Example: Evasive Maneuvering
  Intention of other cars not always clear
  Have to believe that other vehicles will behave rationally
  Still need to be able to avoid accordingly
Video shows safe avoidance maneuver
First demonstration of UWB Localization
From 9% to 87%
C-LOC
Increase in coverage
Improved precision
Gossip algorithm: P2P networks
• Peer-to-peer networks
  o Architecture of choice for content dissemination, e.g. BBC iPlayer
  o Need extremely simple algorithms
• Randomized gossip algorithm
  • Local, iterative and very simple
  • Robust through randomness
• Efficient gossip solutions for
  • Content dissemination
  • Code-based distributed storage
  • Separable function computation
• Performance is determined by
  • Spectral properties of network
cut
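To make "local, iterative and very simple" concrete, here is a minimal sketch of randomized pairwise gossip computing an average, one example of a separable function. The ring topology, the starting values, and the function name `gossip_average` are illustrative assumptions, not taken from the slides.

```python
import random

def gossip_average(values, edges, rounds, seed=0):
    """Randomized pairwise gossip: a random edge wakes up and its two
    endpoints replace their values with the pair's average."""
    rng = random.Random(seed)
    x = list(values)
    for _ in range(rounds):
        i, j = rng.choice(edges)
        x[i] = x[j] = 0.5 * (x[i] + x[j])
    return x

# Hypothetical 8-node ring; node k starts with the value k.
ring_edges = [(k, (k + 1) % 8) for k in range(8)]
initial = [float(k) for k in range(8)]
final = gossip_average(initial, ring_edges, rounds=2000)
print(final)  # every entry should be close to the network-wide mean
```

Each exchange preserves the sum of the values, so the only fixed point is the global average. How quickly the values get there depends on the graph, which is consistent with the slide's remark about spectral properties.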
MAC Protocol that finally works!
Contention resolution or medium access
  Fundamental to any well-engineered system, e.g. emerging wireless networks
Challenge: need efficient & implementable MAC
  Unresolved quest for over four decades.
A new queue-based MAC protocol
  Insights from learning, statistical physics & theory of Markov chains
  Essentially, each transmitter transmits or not
    Independently, with probability that is a function of its own backlog
  And that’s it!
Theorem. This MAC protocol is efficient.
Received ACM Sigmetrics best paper award 2009
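A toy sketch of the idea that each transmitter decides from its own backlog alone. Everything specific here is an assumption made for illustration: slotted time, a three-node line conflict graph, one randomly chosen node re-deciding per slot, carrier sensing of neighbors, and the weight `log(1 + backlog)`. It is not the protocol from the award-winning paper, only something in its spirit.

```python
import math
import random

def transmit_probability(backlog):
    """Assumed rule: weight w = log(1 + backlog), probability w / (1 + w)."""
    w = math.log(1.0 + backlog)
    return w / (1.0 + w)

def simulate(neighbors, arrival_rate, slots, seed=0):
    rng = random.Random(seed)
    n = len(neighbors)
    queue = [0] * n
    active = [False] * n
    conflicts = 0
    for _ in range(slots):
        # Bernoulli packet arrivals at every node.
        for i in range(n):
            if rng.random() < arrival_rate:
                queue[i] += 1
        # One randomly chosen node re-decides, using only local information.
        i = rng.randrange(n)
        if any(active[j] for j in neighbors[i]):
            active[i] = False                      # medium sensed busy
        else:
            active[i] = rng.random() < transmit_probability(queue[i])
        # Count pairs of interfering nodes that are active together.
        conflicts += sum(1 for a in range(n) for b in neighbors[a]
                         if a < b and active[a] and active[b])
        # Every active node with a packet serves one packet.
        for k in range(n):
            if active[k] and queue[k] > 0:
                queue[k] -= 1
    return queue, conflicts

# Hypothetical conflict graph: node 1 interferes with nodes 0 and 2.
line_graph = {0: [1], 1: [0, 2], 2: [1]}
queues, conflicts = simulate(line_graph, arrival_rate=0.2, slots=5000)
print(queues, conflicts)
```

Because a node only switches on when it senses no active neighbor, this toy never schedules two interfering transmitters together. Whether the queues stay small depends on the arrival rate and on the assumed weight function.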
Learning a Large Circuit
Evaluating yield of an SRAM cell
  To a high degree of accuracy for low failure probability
Our approach
  Identify effective failure event, inspired by the theory of Large Deviations
    Rare events happen only in a typical manner
  Efficient sampling mechanism based on importance sampling
before: 2 months
now: 2 minutes
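As a generic illustration of importance sampling for a rare event (not the SRAM yield method itself), the sketch below estimates the probability that a standard normal variable exceeds 5. The proposal is a unit-variance normal centered on the threshold, an assumed choice that follows the "rare events happen in a typical manner" intuition. The function names are hypothetical.

```python
import math
import random

def tail_probability_is(threshold, n_samples, seed=0):
    """Estimate P(X > threshold), X ~ N(0, 1), by sampling from
    N(threshold, 1) and reweighting by the likelihood ratio."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        y = rng.gauss(threshold, 1.0)
        if y > threshold:
            # Ratio of N(0, 1) to N(threshold, 1) densities at y.
            total += math.exp(-threshold * y + 0.5 * threshold ** 2)
    return total / n_samples

def exact_tail(threshold):
    """Closed-form Gaussian tail probability, used only for comparison."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))

estimate = tail_probability_is(5.0, n_samples=20000)
reference = exact_tail(5.0)
print(estimate, reference)
```

Naive Monte Carlo would need on the order of millions of samples to see even one such event. The shifted proposal places about half of its samples in the failure region, which is the kind of saving the slide's before/after timing points to.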
SMART IRG#4: Future Mobility
Objectives:
  Develop in and beyond Singapore new paradigms for the planning, design, and operation of future urban transportation systems
  Sustainability, societal, and environmental well-being in a high-density, livable urban environment
Multi-disciplinary foundational elements
  Pillar 1: Networked Computing and Control [Frazzoli, Jaillet, Dahleh]
    Enable transformative technologies for urban transportation by collecting, storing, securely processing, and exploiting fine-granularity mobility data through the increasingly powerful Internet “cloud” and personal devices
  Pillar 2: Integrated Models of Land Use, Mobility, and Energy and Resource Use [Jaillet, Frazzoli]
    Develop advanced integrated behavioral models to predict the effects of system interventions. Development of new simulation, optimization, and evaluation tools for real-time services and system controls.
  Pillar 3: Performance assessment and implementation
    Developing “metrics that matter” to enable scalable system assessment approaches to validly and reliably measure sustainability impacts.
Hybrid Electric Vehicle (HEV)
The HEV draws power from two sources
  Internal Combustion Engine: Primary power source
  Battery: Secondary power source
    assists engine at high torques – engine is less fuel-efficient at high torques
    allows for engine shut-off while idle -- waiting at a red light
    recharges through regenerative braking
Challenge: optimal battery usage utilizing GPS data
Networked Control
From classical co-located systems to networked systems
[Figure labels: System, Controller; System, Network, Controller]
Vehicle Routing Problems in a Dynamic World
“User” or “environment” model:
  Events of interest driven over time by exogenous processes.
  Stochastic or adversarial models.
  Complex task specifications (e.g., temporal logic constraints).
“System” model
  Vehicles subject to algebraic, differential, and integral constraints.
  Local sensing and communications.
  Limited computational resources.
  Heterogeneous systems: different vehicles, human agents.
Performance Criteria
  Quality of Service: minimize delays, maximize capacity/throughput.
Approach
  Design polynomial time approximation algorithms.
  Novel tools combining systems and control theory, combinatorial optimization, queueing theory, stochastic processes, game theory, learning and estimation.
Traveling Repairperson, Dial-A-Ride, Environmental Monitoring, Mobile sensor networks, Surveillance, Search and Rescue, Area Denial, Crime prevention, Security, Network connectivity, Emergency Relief, Traffic congestion management.
Dynamics in Social Networks
Spread of different “epidemics” may have similar structures
Models for understanding dynamics of fads, opinions, conventions, technological innovations, and implications of network structure
Tuberculosis outbreak
Word-of-mouth product recommendations
Statistical methods for locating “rumor” sources
Verification of Real-Time Embedded Software
Proving desired performance and absence of run-time errors in real-time embedded software is critical.
Software can be modeled as a dynamical system. Specific Lyapunov-like functions can prove critical properties such as absence of variable overflow and termination in finite time. Optimization methods such as semidefinite programming or linear programming can be used to find these Lyapunov-like functions.
[Figure: flowchart with the labels: Computer Program; Suitable Dynamical System Model; Optimization-Based Search (e.g. Semidefinite Programming) for Lyapunov-like Invariants; Lyapunov-like Functions Can Prove Certain Invariant Properties of Dynamical Systems; Terminates in Finite-time?; Runs without Overflow?; Undecidable Problems; Scale up for Application to Large Programs]
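A minimal sketch of what a Lyapunov-like overflow certificate looks like, under heavy assumptions: the "program" is a hypothetical two-variable linear loop, and the candidate function V(x, y) = x^2 + y^2 is guessed by hand rather than found by semidefinite programming. The code only checks the certificate. The search step described on the slide would need an SDP or LP solver and is not shown.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def decrease_matrix(A, P):
    """Return A^T P A - P; V(x) = x^T P x cannot grow along the loop
    x <- A x when this matrix is negative semidefinite."""
    APA = matmul(matmul(transpose(A), P), A)
    return [[APA[i][j] - P[i][j] for j in range(2)] for i in range(2)]

def is_nsd_2x2(M):
    """Negative semidefinite test for a symmetric 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return M[0][0] <= 0 and M[1][1] <= 0 and det >= 0

# Hypothetical program loop: (x, y) <- (0.5x - 0.4y, 0.4x + 0.5y).
A = [[0.5, -0.4], [0.4, 0.5]]
P = [[1.0, 0.0], [0.0, 1.0]]          # candidate V(x, y) = x^2 + y^2
certified = is_nsd_2x2(decrease_matrix(A, P))

# Sanity check by simulation: V never exceeds its initial value, so
# |x| and |y| stay below sqrt(V(initial state)) and cannot overflow.
state = (3.0, -4.0)
bound = state[0] ** 2 + state[1] ** 2
worst = bound
for _ in range(100):
    state = (A[0][0] * state[0] + A[0][1] * state[1],
             A[1][0] * state[0] + A[1][1] * state[1])
    worst = max(worst, state[0] ** 2 + state[1] ** 2)
print(certified, worst <= bound)
```

The certificate covers every initial state at once, which the simulation cannot do. The simulation is included only as a spot check of the algebra.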
Bandgap optimization
Systematic design of materials for wave propagation
Decision variables are dielectric composition
“Good” material properties determined by spectral bandgap
Mangan, et al., OFC 2004 PDP24
SSG Themes
Representation and extraction of information in complex data and phenomena
  Models that capture rich classes of phenomena and also lead to scalable algorithms
    Graphical models represent a major component of our efforts
  Representation and extraction of geometric information
Learning, model discovery, and data mining
  Fusion, segmentation, etc., when models aren’t available (or trustworthy) a priori
  Or when we desire models that have desirable properties (e.g., sparsity, tractability)
Statistical methods for distributed phenomena
  Graphical models/Markov random fields
  Sensor networks and fusion
Application areas
  Situational awareness/multisensor fusion in complex environments
  Computer vision
  Sensor networks
  Geophysical data assimilation and remote sensing
  Medical imaging
  …
Inference algorithms for graphical models on trees
Message-passing algorithms for “estimation” (marginal computation)
  Two-sweep algorithms (leaves-root-leaves)
    For linear/Gaussian models, these are the generalizations of Kalman filters and smoothers
  Belief propagation, sum-product algorithm
    Non-directional (no root; all nodes are equal)
    Lots of freedom in message scheduling
Message-passing algorithms for “optimization” (MAP estimation)
  Two sweep: Generalization of Viterbi/dynamic programming
  Max-product algorithm
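The two-sweep (leaves-root-leaves) sum-product computation can be sketched on a small discrete tree. The four-node tree, the binary variables, and the potential values below are made up for illustration. A brute-force enumeration is included only to confirm that the messages give exact marginals on a tree.

```python
from itertools import product

# Hypothetical tree with binary variables: edges 0-1, 1-2, 1-3.
edges = [(0, 1), (1, 2), (1, 3)]
node_pot = {0: [1.0, 2.0], 1: [1.0, 1.0], 2: [3.0, 1.0], 3: [1.0, 4.0]}

def edge_pot(a, b):
    return 2.0 if a == b else 1.0      # neighbors prefer to agree

neighbors = {u: [] for u in node_pot}
for u, v in edges:
    neighbors[u].append(v)
    neighbors[v].append(u)

def sum_product(root=0):
    # Depth-first order from the root; parents precede their children.
    order, parent, stack = [], {root: None}, [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in neighbors[u]:
            if v not in parent:
                parent[v] = u
                stack.append(v)
    msg = {}                            # msg[(u, v)][x_v]: message from u to v

    def send(u, v):
        out = []
        for xv in (0, 1):
            total = 0.0
            for xu in (0, 1):
                term = node_pot[u][xu] * edge_pot(xu, xv)
                for w in neighbors[u]:
                    if w != v:
                        term *= msg[(w, u)][xu]
                total += term
            out.append(total)
        msg[(u, v)] = out

    for u in reversed(order):           # upward sweep: leaves to root
        if parent[u] is not None:
            send(u, parent[u])
    for u in order:                     # downward sweep: root to leaves
        if parent[u] is not None:
            send(parent[u], u)

    marginals = {}
    for u in node_pot:
        belief = list(node_pot[u])
        for w in neighbors[u]:
            belief = [belief[x] * msg[(w, u)][x] for x in (0, 1)]
        z = sum(belief)
        marginals[u] = [b / z for b in belief]
    return marginals

def brute_force():
    marg = {u: [0.0, 0.0] for u in node_pot}
    for x in product((0, 1), repeat=len(node_pot)):
        weight = 1.0
        for u in node_pot:
            weight *= node_pot[u][x[u]]
        for u, v in edges:
            weight *= edge_pot(x[u], x[v])
        for u in node_pot:
            marg[u][x[u]] += weight
    return {u: [m / sum(marg[u]) for m in marg[u]] for u in marg}

bp_marginals = sum_product()
exact_marginals = brute_force()
print(bp_marginals)
```

Replacing the inner sum over `xu` with a maximum turns the same two sweeps into max-product, the Viterbi-style variant named on the slide.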
What do people do when there are loops?
Turn graphs into trees
  Junction trees and cutset models
  Dimensionality/combinatorial explosion in many cases
Learn (or approximate with) models with tractable structure
  Multiresolution models and others with hidden variables
Another well-oiled approach
  Belief propagation (and max-product) are algorithms whose local form is well defined for any graph
  However, for a loopy graph, BP fuses information based on invalid assumptions of conditional independence
  When does this converge? What does it converge to?
Come up with new algorithms
  Recursive Cavity Models
Graphical Model Example #1
Near-optimal, scalable, and very large-scale data assimilation for geophysical mapping (and uncertainty quantification)
Multiresolution/Hierarchical Models: A Continuing SSG Theme
Earliest work: MR models on pyramidal trees
Subsequent:
  Algorithms that use embedded trees as the kernel of iterative algorithms
  MR models but with sparse in-scale conditional graphical structure or conditional correlations
    Iterative algorithms with good properties
And there’s more “in the works”
Graphical Model Example #2
• Fusion of multi-modal, multi-resolution data (and estimation of critical aggregate variables)
• Learning of hierarchical relationships/dependencies
Graphical Models Example #3: Fast algorithms supporting expert analysts
[Figure panels: initial estimates; re-estimates]
1757 × 1284 surface, 377384 measurements
3 million nodes in the pyramidal graph
Introduce 100 new measurements in a 17 × 17 square region
Use adaptive multipole methods to update in 10 iterations, each of which involves fewer than 1000 nodes
Walk-sum analysis for Gaussian models
Gaussian models are specified in terms of the inverse covariance, J
  Sparsity pattern determines graph structure
  Computing estimates involves solving linear equations involving J
Message-passing algorithms involve “information walks” along paths in a graph
  For Gaussian problems these correspond to the computation of walk sums – easiest to see if J is normalized so that J = I - R:
    J^{-1} = (I - R)^{-1} = I + R + R^2 + … (makes sense if absolutely summable)
Walk-sum analysis and walk-summability
  Provides conditions for convergence of algorithms such as Belief Propagation (which only captures some of the walks)
  Provides a very clear picture of when BP fails and why it does so catastrophically
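A small numerical sketch of the walk-sum picture, assuming a normalized model J = I - R on a hypothetical three-node cycle. It sums the series I + R + R^2 + … and checks walk-summability as "spectral radius of the entrywise absolute value of R is below 1", which is the standard form of the absolute-summability condition. Helper names are illustrative, and the power iteration is a simple stand-in for an eigenvalue routine.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def spectral_radius_of_abs(R, iters=500):
    """Power iteration on |R| (entrywise). Adequate for small connected,
    non-bipartite graphs such as the examples below."""
    n = len(R)
    v = [1.0] * n
    rho = 0.0
    for _ in range(iters):
        w = [sum(abs(R[i][j]) * v[j] for j in range(n)) for i in range(n)]
        rho = max(w)
        if rho == 0.0:
            return 0.0
        v = [x / rho for x in w]
    return rho

def walk_sum_series(R, terms):
    """Partial sum I + R + ... + R^terms; entry (i, j) sums the weights
    of all walks from i to j with at most `terms` steps."""
    n = len(R)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    power = [row[:] for row in total]
    for _ in range(terms):
        power = matmul(power, R)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

def cycle3(weight):
    return [[0.0 if i == j else weight for j in range(3)] for i in range(3)]

R = cycle3(0.3)
J = [[float(i == j) - R[i][j] for j in range(3)] for i in range(3)]
rho = spectral_radius_of_abs(R)
S = walk_sum_series(R, terms=60)
JS = matmul(J, S)                       # should be close to the identity
error = max(abs(JS[i][j] - float(i == j)) for i in range(3) for j in range(3))

# With weights of -0.6, J = I - R is still positive definite (a valid
# model), but the absolute walk sums diverge.
rho_frustrated = spectral_radius_of_abs(cycle3(-0.6))
print(rho, error, rho_frustrated)
```

The second example is a valid Gaussian model that is not walk-summable, the regime in which the slide says walk-sum analysis explains when and why BP can fail.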
A simple example
Extensions - I
Embedded subgraph iterations
  Cut some edges to get a tractable graph (e.g., tree)
  Perform exact inference (collect all walks) in the subgraph
  Richardson iteration: Correction term for effects of edges left out (corresponds to single hop across cut edges)
  Repeat – although one can cut different edges
Result: Can collect all walks this way and get exact answer asymptotically
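The iteration above can be sketched for a Gaussian model by splitting J = J_tree - K, where K holds the cut edges, and repeating x ← J_tree^{-1}(h + K x). The four-node cycle, the vector h, and the helper names are assumptions. A dense Gaussian-elimination solve stands in for exact tree inference, which in practice would be done by message passing.

```python
def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (stand-in for
    exact inference on the tractable subgraph)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def embedded_tree_solve(J, h, cut_edges, iters=50):
    n = len(h)
    J_tree = [row[:] for row in J]
    K = [[0.0] * n for _ in range(n)]
    for i, j in cut_edges:
        for a, b in ((i, j), (j, i)):
            K[a][b] = -J[a][b]          # J = J_tree - K on the cut edges
            J_tree[a][b] = 0.0
    x = [0.0] * n
    for _ in range(iters):
        # Correction term: a single hop across each cut edge.
        rhs = [h[i] + sum(K[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = solve(J_tree, rhs)          # exact solve on the remaining chain
    return x

# Hypothetical 4-node cycle 0-1-2-3-0 with J = I - R, edge weights 0.2.
n = 4
J = [[0.0] * n for _ in range(n)]
for i in range(n):
    J[i][i] = 1.0
    J[i][(i + 1) % n] = J[(i + 1) % n][i] = -0.2
h = [1.0, 0.0, -1.0, 2.0]
x_hat = embedded_tree_solve(J, h, cut_edges=[(3, 0)])
residual = max(abs(sum(J[i][j] * x_hat[j] for j in range(n)) - h[i]) for i in range(n))
print(x_hat, residual)
```

Cutting a different edge at each pass, as the slide suggests, only changes which entries move into K from one iteration to the next.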
Extensions - II
Segregating “feedback nodes”
  Find a set of nodes so that removing them cuts all (most) cycles
  Then have three-step algorithm
    BP in remaining graph – exact (approximate) if all cycles removed
    Solve inference on the set of feedback nodes
    Correction BP step for the remainder of the graph
Exact if have complete feedback vertex set
Can yield excellent results even if don’t use complete set
  Can work even for non-walk-summable models
  Experiments indicate can get very good results with log(n) feedback nodes
Nonparametric Inference for General Graphs
Problem: What is the product of two collections of particles?
Belief Propagation
• General graphs
• Discrete or Gaussian
Particle Filters
• Markov chains
• General potentials
Nonparametric BP
• General graphs
• General potentials
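One way to see why the product of two particle collections is a problem: if each collection is smoothed into an equal-weight Gaussian kernel mixture (an assumption made here for illustration), the exact product of an M-component and an N-component mixture has M × N components. The sketch below builds that exact product in one dimension using the standard product-of-Gaussians identity. The particle locations and bandwidths are made up.

```python
import math

def gauss(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture(x, means, var):
    """Equal-weight Gaussian kernel density built from particles."""
    return sum(gauss(x, m, var) for m in means) / len(means)

def product_of_mixtures(means_a, var_a, means_b, var_b):
    """Exact (unnormalized) product as a list of (weight, mean, var).
    Uses N(x; a, va) N(x; b, vb) = N(a; b, va + vb) N(x; m, v)."""
    var = 1.0 / (1.0 / var_a + 1.0 / var_b)
    comps = []
    for a in means_a:
        for b in means_b:
            weight = gauss(a, b, var_a + var_b) / (len(means_a) * len(means_b))
            mean = var * (a / var_a + b / var_b)
            comps.append((weight, mean, var))
    return comps

# Hypothetical particle sets and kernel bandwidths.
means_a, var_a = [-1.0, 0.5, 2.0], 0.4
means_b, var_b = [0.0, 1.0, 1.5, 3.0], 0.25
components = product_of_mixtures(means_a, var_a, means_b, var_b)

def product_density(x):
    return sum(w * gauss(x, m, v) for w, m, v in components)

print(len(components))  # 3 * 4 = 12 components
```

With d incoming messages of M particles each, the exact product has M^d components, which is why practical schemes sample from the product rather than enumerate it.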
Graphical Models Example #4: Dynamic fusion in complex, constrained contexts
Multisensor Data Association in Sensor Networks
Organized network data association
Self-organization with region-based representation
Hierarchical Dirichlet Processes and Graphical Models: From Scene/context to objects to parts/shape to features
speaker label
speaker state
observations
Speaker-specific transition densities
Speaker-specific mixture weights
Mixture parameters
Speaker-specific emission distribution – infinite Gaussian mixture
Emission distribution conditioned on speaker state
[Figure: normalized Hamming error versus Gibbs iteration]
[Figure: speaker label versus time at Gibbs iteration 100; ground truth and estimated]
Unsupervised extraction of structure in dynamic processes, signals, and images
Hierarchical Dirichlet Processes for Object Recognition and Extraction of Switching Dynamic Behavior
NEED MIKE S. VIDEO
Geometry Extraction #1: Curve evolution methods for “blind” segmentation
MCMC-Curve evolution methods for aided gravity inversion
[Figure panels: Top salt constraint; With additional constraint]
Principal Modes of Shape Uncertainty
Some other things
Learning graphical models
  Error exponents for learning tree models
  Learning discriminative tree models
  Learning tree models with hidden nodes
    Applications to computer vision
  Learning models with hidden variables that expose sparse conditional structure for the observed variables
More nonparametrics
  Learning hidden semi-Markov models
  Identifying more complex hidden structures
    Can we learn that the motion of 11 “objects” corresponds to two basketball teams and a basketball – AND can we learn the difference between offense and defense…
Exploiting sparsity
  Sparse reconstruction with uncertain forward operators