

Distributed Algorithms and Biological Systems
Nancy Lynch, Saket Navlakha
BDA-2014, October 2014, Austin, Texas

Picture by Scott Camazine.

Distributed Algorithms + Biological Systems
Distributed algorithms researchers have been considering biological systems, looking for:
- Biological problems and behaviors that they can model and study using distributed algorithms methods, and
- Biological strategies that can be adapted for use in computer networks.
This yields interesting distributed algorithms results.
Q: But what can distributed algorithms contribute to the study of biological systems?

This talk: Overview fundamental ideas from the distributed algorithms research area, for biology researchers. Consider how these might contribute to biology research. What about the other way around?

What are distributed algorithms?
Abstract models for systems consisting of many interacting components, working toward a common goal.

Examples:
- Wired or wireless network of computers, communicating or managing data.
- Robot swarm, searching an unknown terrain, cleaning up, gathering resources, ...

Common goals: compute something, agree on a course of action, form a pattern.

Pictures: computers, robots, insects.

What are distributed algorithms?
Abstract models for systems consisting of many interacting components, working toward a common goal.

Examples:
- Wired or wireless network of computers, communicating or managing data.
- Robot swarm, searching an unknown terrain, cleaning up, gathering resources, ...
- Social insect colony, foraging, feeding, finding new nests, resisting predators, ...
Components generally interact directly with nearby components only, using local communication.

Common goals: compute something, agree on a course of action, form a pattern.

Pictures: computers, robots, insects.

Distributed algorithms research
- Models for distributed systems.
- Problems to be solved.
- Algorithms, analysis.
- Lower bounds, impossibility results.
Problems: communication, consensus, data management, resource allocation, synchronization, failure detection, ...
Models: interacting automata. Local communication: individual message-passing, local broadcast, or shared memory.
Metrics: time, amount of communication, local storage.

The research field started around 1970.

Algorithms, lower bounds, and impossibility results for problems to be solved in distributed systems.

Distributed algorithms research
Algorithms:
- Some use simple rules, some use complex logical constructions.
- Local communication.
- Designed to minimize costs, according to the cost metrics.
- Often designed to tolerate limited failures.
Analyze correctness, costs, and fault-tolerance mathematically. Try to optimize, or at least obtain algorithms that perform very well according to the metrics.

Simple rules, like primitive insects. More complex rules can be sophisticated programs.

Correctness proofs, formal analysis. Not generally automated analysis.

Distributed algorithms research
Lower bounds, impossibility results:
- Theorems that say that you can't solve a problem in a particular system model, or you can't solve it with a certain cost.
- Distributed computing theory includes hundreds of impossibility results, unlike the theory of traditional sequential algorithms.
Why? Distributed models are hard to cope with. Locality of knowledge and action imposes strong limitations.

Unlike Irit's area.

No one knows what the entire system is doing.

Formal modeling and analysis

Distributed algorithms can get complicated:
- Many components act concurrently.
- Components may have different speeds, failures.
- Local knowledge and action.
In order to reason about them carefully, we need clear, rigorous mathematical foundations. Impossibility results are meaningless without rigorous foundations. We need formal modeling frameworks.

Local knowledge and action: No one knows what the entire system is doing.

Key ideas
- Model systems using interacting automata. Not just finite-state automata, but more elaborate automata that may include complex states, timing information, probabilistic behavior, and both discrete and continuous state changes: Timed I/O Automata, Probabilistic I/O Automata.
- Formal notions of composition and abstraction.
- Support rigorous correctness proofs and cost analysis.
These are key ideas from distributed algorithms research that may be useful for biology research.

Models: Mention Alur-Dill-style Timed Automata and other restricted models that support automated tools. But that is not emphasized much in the distributed algorithms community---our models are more general and powerful.

Key ideas
Distinguish among:
- the problems to be solved,
- the physical platforms on which the problems should be solved, and
- the algorithms that solve the problems on the platforms.
Define cost metrics, such as time, local storage space, and amount of communication, and use them to analyze and compare algorithms and prove lower bounds.
Many kinds of algorithms. Many kinds of analysis. Many lower bound methods.

Q: How can this help biology research? We maintain a clear separation among: the problems to be solved, the physical platform on which the problems should be solved, and the algorithms that run on the platforms to solve the problems.

Problems of communication, building network structures, function computation, consensus, data management, resource allocation, robot coordination, ...

Biology research
Model a system of insects, or cells, using interacting automata.

Define formally:
- the problems the systems are solving (distinguishing cells, building structures, foraging, reaching consensus, task allocation, ...),
- the physical capabilities of the systems, and
- the strategies (algorithms) that are used by the systems.
Identify cost metrics (time, energy, ...). Analyze and compare strategies. Prove lower bounds expressing inherent limitations.
Use this to:
- Predict system behavior.
- Explain why a biological system has evolved to have the structure it has, and has evolved or learned to behave as it does.

One could model a system of insects, or cells, using interacting automata. One could articulate the problems formally. One could define suitable cost metrics, and analyze strategies in terms of these metrics.

One might hope thereby to predict system behavior, or to explain why a particular biological system has evolved to behave as it does.

Predicting system behavior: We can describe the platform and algorithm, analyze the behavior, and conjecture that this is really what the system is doing. Then we can experimentally examine the system to see if it meets the platform assumptions, and to see if its behavior is consistent with the conjectured algorithm.

Explaining:

If an algorithm is shown to be optimal, or very good compared to simple modifications, that would help explain why the system behaves as it does. This behavior could be a result of evolution or just simple learning.

Some platforms also admit better algorithms than others. Such analysis could explain why the platform evolved to have the structural features that it has. Suitability would depend on particular environments, e.g., deserts vs. jungle would lead to different kinds of ants.

Example 1: Leader election
Ring of processes: computers, programs, robots, insects, cells, ... They communicate with neighbors by sending messages, in synchronous rounds.

Illustrate these ideas using examples from traditional distributed computing theory, and then from some new biologically inspired distributed algorithms work on agent exploration and task allocation.

Traditional:

Describe the problem(s), outline algorithms, say how we analyze them.

Do in some detail:

- Leader election
- Maximal independent set
- Spanning tree

Then mention some others, but no details:

- Building spanning trees and other network structures
- Fault-tolerant consensus
- Task allocation
- Data management

Leader election: A simple algorithm (LeLann, Chang, Roberts) that discovers the largest unique identifier (UID).

This is a problem of breaking symmetry.

What are processes?

Could show little pictures of a computer, program, robot, insect. Just draw a process as a circle.

Example 1: Leader election
Ring of processes: computers, programs, robots, insects, cells, ... They communicate with neighbors by sending messages, in synchronous rounds.
Problem: Exactly one process should (eventually) announce that it is the leader.
Motivation:
- A leader in a computer network or robot swarm could take charge of a computation, telling everyone else what to do.
- A leader ant could choose a new nest.
This is a problem of breaking symmetry.

What are processes?

Could show little pictures of a computer, program, robot, insect. Just draw a process as a circle.

Leader election
Problem: Exactly one process should announce that it is the leader.

Suppose:
- Processes start out identical.
- The behavior of each process at each round is determined by its current state and incoming messages.
Theorem: In this case, it's impossible to elect a leader. No distributed algorithm could possibly work.
Proof: By contradiction. Suppose we have an algorithm that works, and reach a contradictory conclusion.

Impossibility holds even under the most favorable assumptions: bidirectional communication, ring size n is known to all.

Discuss possible strategies, see that they don't work.

Leader election
Problem: Exactly one process should announce that it is the leader.
Initially identical, deterministic processes.

Theorem: Impossible.
Proof: Suppose we have an algorithm that works. All processes start out identical. At round 1, they all do the same thing (send the same messages, make the same changes to their states), so they are again identical. The same holds for round 2, and so on. Since the algorithm solves the problem, some process must eventually announce that it is the leader. But then every process does, a contradiction.

The "same for all rounds" step is actually a proof by mathematical induction.
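
The induction can also be seen concretely. A minimal Python sanity check, in which the round function is an arbitrary deterministic stand-in (not any particular algorithm): identically initialized deterministic processes on a ring stay identical after every round.

    def run_rounds(n, rounds=5):
        states = [0] * n                                  # identical initial states
        for _ in range(rounds):
            msgs = list(states)                           # every process sends the same thing
            incoming = [(msgs[(i - 1) % n], msgs[(i + 1) % n]) for i in range(n)]
            # apply the same deterministic rule everywhere (an arbitrary stand-in rule)
            states = [hash((states[i],) + incoming[i]) % 97 for i in range(n)]
            assert len(set(states)) == 1                  # still identical after each round
        return states

    print(run_rounds(6))

So no process can ever be the unique one to announce "leader".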

So what to do?
Suppose the processes aren't identical---they have unique ID numbers.

Unique identifiers, like Social Security numbers.

Notice that these are simple local rules. This is a kind of program, not real code that runs, but pseudocode.

Why does this work? The largest ID gets all the way around the ring, causing its owner to elect itself. Every other ID gets discarded somewhere. This is not hard to prove.
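
The slide's pseudocode is not reproduced in this transcript, so here is a minimal, illustrative Python simulation of the max-UID rule described above (LeLann/Chang/Roberts style), in a synchronous unidirectional ring. The function name and message handling details are my own.

    def elect_leader(uids):
        """uids[i] is the unique ID of process i; process i sends to process (i+1) mod n."""
        n = len(uids)
        pending = list(uids)            # the UID each process will forward this round (None = nothing)
        leader = [False] * n
        for _ in range(n):              # n synchronous rounds: the max UID travels all the way around
            incoming = [pending[(i - 1) % n] for i in range(n)]
            for i in range(n):
                m = incoming[i]
                if m is None:
                    pending[i] = None
                elif m == uids[i]:
                    leader[i] = True    # my own UID came back around: I am the leader
                    pending[i] = None
                elif m > uids[i]:
                    pending[i] = m      # relay the larger UID
                else:
                    pending[i] = None   # discard smaller UIDs
        return [i for i in range(n) if leader[i]]

    print(elect_leader([3, 17, 8, 42, 5]))   # -> [3]: the process holding the largest UID, 42

Each process forwards only UIDs larger than its own, so exactly one UID---the maximum---survives a full trip around the ring.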

What if we don't have IDs?
Suppose processes are identical, with no IDs. Allow them to make random choices.

Assume the processes know n, the total number of processes in the ring.
Algorithm: Toss an unfair coin, with probability 1/n of landing on heads. If you toss heads, become a leader candidate. It's pretty likely that there is exactly one candidate. The processes can verify this by passing messages around and seeing what they receive. If they did not succeed, they try again.

If we don't have IDs, we must weaken the determinism assumption, or else we are up against the impossibility result. So we allow the processes to be nondeterministic; specifically, we let them be probabilistic and make random choices.

Probability 1/n means that approximately 1 out of n times, it will land on heads. In a real coin, we could achieve this by weighting the coin. In a computer algorithm, we just use an unfair random number generator program.
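
A small sketch of this attempt-and-verify scheme. For brevity the "exactly one candidate" check is done centrally here; in the ring algorithm it would be done by circulating messages for n rounds, as described above. Names and the seed are illustrative.

    import random

    def probabilistic_election(n, seed=0):
        rng = random.Random(seed)
        attempts = 0
        while True:
            attempts += 1
            # each identical process flips a biased coin with heads-probability 1/n
            candidates = [i for i in range(n) if rng.random() < 1.0 / n]
            # in the ring, the processes would verify this by passing messages around
            if len(candidates) == 1:
                return candidates[0], attempts

    leader, attempts = probabilistic_election(8)
    print(f"process {leader} elected after {attempts} attempt(s)")

Each attempt succeeds with probability n * (1/n) * (1 - 1/n)^(n-1), which is roughly 1/e, so the expected number of attempts is a small constant.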

Example 2: Maximal Independent Set
Assume a general graph network, with processes at the nodes.
Problem: Select some of the processes, so that they form a Maximal Independent Set (MIS).

Independent: No two neighbors are both in the set.
Maximal: We can't add any more nodes without violating independence.
Motivation:
- Communication networks: Selected processes can take charge of communication, conveying information to all the other processes.
- Developmental biology: Distinguish cells in the fruit fly's nervous system to become Sensory Organ Precursor cells.

More precisely, the processes should select themselves.

A recent Science magazine paper by Bar-Joseph et al. gives experimental results showing that in the development of the fly's nervous system, certain cells are singled out as Sensory Organ Precursor cells. These form an MIS. The paper develops a theoretical distributed MIS algorithm whose behavior is consistent with what is observed experimentally. It's somewhat similar to the algorithm I'm about to present, but it uses a strategy of waiting a random amount of time.

Talk with Saket about this, to coordinate with what he will say later about Ziv's fly-MIS work.

Maximal Independent Set
Problem: Select some of the processes, so that they form a Maximal Independent Set.
Independent: No two neighbors are both in the set.
Maximal: We can't add any more nodes without violating independence.


Distributed MIS problem
Problem: Processes should cooperate in a distributed (synchronous, message-passing) algorithm to compute an MIS of the graph. Processes in the MIS should output "in" and the others should output "out".

Unsolvable by deterministic algorithms in some graphs.
Probabilistic algorithm:

Assume processes know a good upper bound on n. No IDs. (A bound on the local degree would also do.)

The unsolvability proof uses arguments like the impossibility of symmetry-breaking in the ring.

Probabilistic MIS algorithm
Algorithm idea:
- Each process chooses a random ID from 1 to N. N should be large enough that it's likely all IDs are distinct.
- Neighbors exchange IDs.

- If a process's ID is greater than all its neighbors' IDs, then the process declares itself "in" and notifies its neighbors.
- Anyone who hears that a neighbor is "in" declares itself "out" and notifies its neighbors.
- Processes reconstruct the remaining graph, omitting those who have already decided.
- Continue with the reduced graph, until no nodes remain.

This is a version of Luby's algorithm.

In another version, each node chooses 1 with a probability that depends on the maximum node degree, and 0 otherwise. If a node chooses 1 and all its neighbors choose 0, the node elects itself.

Example (a bound on the local degree would do):
- All nodes start out identical.
- Everyone chooses an ID (in the pictured graph: 16, 5, 13, 2, 10, 11, 1, 9, 8, 7).
- The processes that chose 16 and 13 are in; the processes that chose 11, 5, 2, and 10 are out.
- The undecided (gray) processes choose new IDs (4, 12, 18, 7).
- The processes that chose 12 and 18 are in; the process that chose 7 is out.
- The remaining undecided (gray) process chooses a new ID (12).
- It is in.

Properties of the algorithm
Ideas illustrated:
- A more complicated version of leader election, extending to a general graph network.
- Symmetry-breaking.
- Probabilistic algorithms.
- An interesting and clever analysis (not done here).
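
Before moving on, here is a runnable sketch of the random-ID rounds just described (a Luby-style algorithm). The ID range, tie handling (tied nodes simply retry), and the example graph are my own illustrative choices.

    import random

    def random_id_mis(graph, N=10**6, seed=0):
        """graph is an adjacency list {node: [neighbors]}; returns the set of 'in' nodes."""
        rng = random.Random(seed)
        undecided, in_mis = set(graph), set()
        while undecided:
            ids = {v: rng.randint(1, N) for v in undecided}          # choose random IDs
            winners = {v for v in undecided
                       if all(ids[v] > ids[u] for u in graph[v] if u in undecided)}
            in_mis |= winners                                        # winners declare themselves "in"
            losers = {u for v in winners for u in graph[v]} & undecided
            undecided -= winners | losers                            # neighbors of winners go "out"
        return in_mis

    g = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}            # 6-cycle example
    print(random_id_mis(g))

Two neighbors can never both win a round (each would need the strictly larger ID), so the returned set is independent, and every removed node has a neighbor in the set, so it is maximal.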

More examples
Building spanning trees that minimize various network cost measures.
Motivation:
- Communication networks: Use the tree for sending messages from the leader to everyone else.
- Slime molds: Build a system of tubes that can convey food efficiently from a source to all the mold cells.
Building other network structures:
- Routes
- Clusters with leaders

Now we start talking about failures. So far, the settings and algorithms have not involved failures or other bad network behavior, and no requirements of flexibility, robustness, or adaptiveness.

More examples
Here is where we start studying failures. So far the platforms have been reliable.

Explain the fault-prone model, covering both stopping and Byzantine failures.

There are many other algorithms. A simple algorithm based on persuasion, as in the Angluin paper? Randomized algorithms like Ben-Or's?

Consensus has been very widely studied, in synchronous, asynchronous, and partially synchronous settings.

Other examples
- Communication
- Resource allocation
- Task allocation
- Synchronization
- Data management
- Failure detection

It's a big field.

Communication: Problems of reliable communication over less reliable networks.

Resource allocation: A number of processes need some resources (could be anything, say, data or equipment) to perform their work. There aren't enough resources for everyone to have their own, so they have to share. Protocols are needed to share correctly and efficiently.

Task allocation: Lots of tasks need doing. Assign them to different processes so they all get done. See the work of Shvartsman and Georgiou. Relevant to ant task allocation.

Synchronization: Do stuff at the same time.

Data management problems: Try to maintain coherent shared data in a large distributed network. Coherent means that it should look like data that is stored and accessed from one place, but it is actually managed in a very distributed way. We need good algorithms to make this work. They are subtle, and we need good modeling, proof, and analysis methods to show that they work right. Relevant? The algorithms use quorums, an idea that occurs in some biology work.

Failure detection: Detect failures, and use information about detection to help solve problems (like consensus, data management, resource allocation, ...) in settings with failures. Monitoring for failures; one can also monitor for other environmental characteristics.

All are studied with this type of theoretical modeling and analysis.

Anything else traditional that foreshadows anything discussed later in the talk?

Recent Work: Dynamic Networks
Most of distributed computing theory deals with fixed, wired networks. Now we study dynamic networks, which change while they are operating, e.g., wireless networks. Participants may join, leave, fail, and recover. They may move around (mobile systems).

Computing in Dynamic Graph Networks
The graph is always connected.

We studied all these problems.

The diagram is just a graphic here---it depicts a situation that arises in the proofs for some of these algorithms.

Robot Coordination Algorithms
We developed distributed algorithms to:
- Keep the swarm connected for communication.
- Achieve flocking behavior.
- Map an unknown environment.
- Determine global coordinates, working from local sensor readings.

A swarm of cooperating robots, engaged in:
- Search and rescue
- Exploration
Robots communicate, learn about their environment, and perform coordinated activities.

Flocking behavior is like bird flocks, or fish schools.

55 minutes

Biological Systems as Distributed Algorithms
Biological systems consist of many components, interacting with nearby components to achieve common goals:
- Colonies of bacteria, bugs, birds, fish, ...
- Cells within a developing organism.
- Brain networks.

Biological systems are like distributed algorithms.

Biological Systems as Distributed Algorithms
They are special kinds of distributed algorithms:
- They use simple chemical messages.
- Components have simple state and follow simple rules.
- They are flexible, robust, adaptive.

Bio systems are special kinds of distributed algorithms: the components follow simple rules, not complicated programs.

They are also resilient, flexible, and fault-tolerant.

Problems
- Leader election: Ants choose a queen.
- Maximal Independent Set: In fruit fly development, some cells become sensory organs.
- Building communication structures: Slime molds build tubes to connect to food. Brain cells form circuits to represent memories.

MIS: Cells in the nervous system of a fruit fly start out identical. Then some cells develop into sensory organs. They are organized like an MIS.

Communication structures are like spanning trees.

More problems
- Consensus: Bees agree on the location of a new hive.
- Reliable local communication: Cells use chemical signals.
- Robot swarm coordination: Birds, fish, and bacteria travel in flocks / schools / colonies.

The chemical signals avoid problems with collisions. We just get higher concentrations of the chemicals.

Biological Systems as Distributed Algorithms
So, study biological systems as distributed algorithms: models, problem statements, algorithms, impossibility results.
Goals:
- Use distributed algorithms to understand biological system behavior.
- Use biological systems to inspire new distributed algorithms.

60 minutes

Example 3: Ant (Agent) Exploration
But now, assume bounds on:
- The size of an ant's memory.
- The fineness of probabilities used in an ant's random choices.

Describe model, problem, algorithm, and lower bound.

No feedback from the environment; in particular, ants don't know their coordinates.

Include a picture of an ant with memory.

A(ge)nt Exploration

A(ge)nt Exploration

Use properties of the enhanced Markov chain to analyze probabilities of reaching various locations in the grid within certain amounts of time.
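
The detailed model and algorithm are on slides not reproduced in this transcript, so the following is purely illustrative: a constant-memory random walker on the integer grid (no coordinates, no environmental feedback), with a Monte Carlo estimate of the probability of reaching a given cell within a time budget---the kind of quantity the Markov-chain analysis above bounds.

    import random

    def reaches(target, steps, rng):
        x = y = 0
        for _ in range(steps):
            dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])  # uniform random step
            x, y = x + dx, y + dy
            if (x, y) == target:
                return True
        return False

    rng = random.Random(0)
    trials = 2000
    hits = sum(reaches((5, 5), steps=400, rng=rng) for _ in range(trials))
    print(f"estimated P[reach (5,5) within 400 steps] ~ {hits / trials:.3f}")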

Explain why this lower bound doesn't contradict the upper bound.

Example 4: Ant Task Allocation

Describe model, problem, algorithm.

Ant Task Allocation

Limited ant memory size

Extra stuff (maybe for Saket's talk)

Characteristics of the new examples
What are important characteristics of these examples that distinguish them from the usual prior work on distributed algorithms?
- They assume agents with minimal capabilities, such as a small number of states and limitations on the fineness of probabilistic choice. (Other possibilities are limitations on the precision of probabilistic choices, accuracy of counters, and accuracy of estimates of other environmental factors.)
- Results tend to be flexible (they work in different environments), robust, adaptive.
- Algorithms work well "on the average", and little changes don't have much effect.
- No UIDs.
Saket can ...

Comparing Distributed Algorithms and Biological Systems
Models:
Distributed algorithms:
- UIDs.
- Large state spaces; states may be elaborately structured.
- May be deterministic or probabilistic. If probabilistic, they may make fine use of precise probabilities.
- Communication: message-passing with possibly-large, structured messages, usually local; shared memory.

Bio systems:
- No UIDs.
- Small state spaces, simply structured states.
- May be deterministic or probabilistic. If probabilistic, they may have coarse, approximate probabilities.
- Communication: message-passing with small signals, usually local.

Metrics and requirements:

Distributed algorithms: run time, total amount of communication.

Bio systems: getting done "soon enough"; flexibility, robustness, adaptability; self-stabilization.

Algorithm strategies: Saket

4 examples

Slime mold foraging & MST construction [Tero et al. Science 2010]

Fly brain development & MIS [Afek et al. Science 2010]

E. coli foraging & consensus navigation [Shklarsh et al. PLoS Comput. Biol. 2013]

Synaptic pruning & network design

SOP selection in fruit flies
During nervous system development, some cells are selected as sensory organ precursors (SOPs). SOPs are later attached to the fly's sensory bristles.
Like MIS, each cell is either:
- Selected as an SOP, or
- Laterally inhibited by a neighboring SOP so it cannot become an SOP.

Recent findings suggest that Notch is also suppressed in cis by Deltas from the same cell. Only when a cell is elected does it communicate its decision to the other cells.
Miller et al. Current Biology 2009; Sprinzak et al. Nature 2010; Barad et al. Science Signaling 2010.
Trans model vs. cis+trans model; Notch, Delta; trans vs. cis inhibition.

MIS vs. SOP
- Stochastic: proven for MIS, experimentally validated for SOP.
- Constrained by time: an uninhibited cell eventually becomes an SOP.
- Reduced communication: a node (cell) only sends messages if it joins the MIS.
Compared to previous algorithms: unlike Luby's, SOP cells do not know their number of neighbors (nor the network topology).

For SOP, messages are binary. Can we improve MIS algorithms by understanding how the biological process is performed?

Movie

Simulations
- 2-by-6 grid
- Each cell touches all adjacent and diagonal neighbors

Simulations
All models assume a cell becomes an SOP by accumulating the protein Delta until it passes some threshold.
Four different models:
1. Accumulation - accumulate Delta based on a Gaussian distribution.
2. Fixed Accumulation - randomly select an accumulation rate only once.
3. Rate Change - increase the accumulation probability as time goes by, using a feedback loop.
4. Fixed Rate - fix the accumulation probability and use the same probability in all rounds.

Observation: Comparing the time of experimental and simulated selection

New MIS Algorithm + Demo
MIS Algorithm (n, D)    // n - upper bound on the number of nodes; D - upper bound on the number of neighbors
- p = 1/D
- round = round + 1
- if round > log(n): p = p * 2; round = 0    // we start a new phase
- Each processor flips a coin with probability p
- If the result is 0, do nothing
- If the result is 1, send to all other processors
- If no collisions, Leader; all processes exit
- Otherwise, ...
Afek et al. Science 2011, Afek et al. DISC 2011

- W.h.p., the algorithm computes an MIS in O(log^2 n) rounds.
- All messages are 1 bit.

Demo: http://jberryman.github.io/fly-mis/
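
As a hedged, runnable sketch of the algorithm summarized above (my own simplification, not the authors' exact protocol): each undecided node signals with probability p, joins the MIS when no neighbor signals in the same round, and p doubles after roughly log n rounds. The graph, the cap on p, and the collision handling are illustrative assumptions.

    import math, random

    def fly_mis(graph, n, D, seed=0):
        """graph is an adjacency list; n, D are the upper bounds mentioned on the slide."""
        rng = random.Random(seed)
        undecided, in_mis = set(graph), set()
        p, rounds_in_phase = 1.0 / D, 0
        while undecided:
            rounds_in_phase += 1
            if rounds_in_phase > math.log2(n):        # new phase: double p (capped so progress continues)
                p, rounds_in_phase = min(0.5, 2 * p), 0
            signalers = {v for v in undecided if rng.random() < p}       # a "1" is a 1-bit broadcast
            winners = {v for v in signalers
                       if not any(u in signalers for u in graph[v])}     # no collision with a neighbor
            in_mis |= winners                                            # winners join the MIS and exit
            undecided -= winners | {u for v in winners for u in graph[v]}  # their neighbors exit as "out"
        return in_mis

    g = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}   # 8-cycle example
    print(fly_mis(g, n=8, D=2))

Note that, unlike the random-ID version, each node only needs coin flips and 1-bit signals, not knowledge of its neighbors' IDs.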

Luby's algorithm is O(log n) worst case.

4 examples

Slime mold foraging & MST construction [Tero et al. Science 2010]

Fly brain development & MIS [Afek et al. Science 2010]

E. coli foraging & consensus navigation [Shklarsh et al. PLoS Comput. Biol. 2013]

Synaptic pruning & network design

Slime mold network vs. Tokyo rail network: very similar transport efficiency and resilience. Steiner points.

Slime-mold model
Slime mold operates without centralized control or global information processing.

Simple feedback loop between the thickness of a tube (i.e., edge weight) and the internal protoplasmic flow (i.e., the amount it is used).

Idea: reinforce preferred routes; remove unused or overly redundant connections.

Edges = tubes
Nodes = junctions between tubes

Slime-mold model

Start with a randomly meshed lattice with fine spacing (t = 0). At each time step, choose a random source (node 1) and sink (node 2).

The flux through a tube (edge) is calculated as
$Q_{ij} = \frac{D_{ij}}{L_{ij}} (p_i - p_j)$,
where $p_i - p_j$ is the pressure difference between the ends of the tube, $L_{ij}$ is the length of the tube, and $D_{ij}$ is the conductance of the tube. Conservation of flux at each node $j$ gives
$\sum_i Q_{ij} = \begin{cases} -I_0 & \text{if } j \text{ is the source} \\ +I_0 & \text{if } j \text{ is the sink} \\ 0 & \text{otherwise.} \end{cases}$

Think network flow: the source pumps flow out, the sink consumes it; everyone else just passes the flow along (conservation: what enters must exit).

The pressures are node potentials that regulate the flow of current. Set $p_2(t) = 0$ at the sink for all $t$; then you can solve the conservation equations for the remaining potentials $p_j$. This gives the values of $Q_{ij}$ and $D_{ij}$ to plug into the next equation.

Updated tube/edge weights:
$\frac{d}{dt} D_{ij} = f(|Q_{ij}|) - r\, D_{ij}$

The first term: the expansion of tubes in response to the flux

The second term: the rate of tube constriction, so that if there is no flow, the tube gradually disappears
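
A hedged numerical sketch of one Euler step of these dynamics: solve the conservation equations for the pressures, compute the fluxes, and update the conductances. The graph, step size, decay rate r, and the particular sigmoidal reinforcement function f (defined on the next slide) are illustrative assumptions, not the parameters from Tero et al.

    import numpy as np

    def physarum_step(L, D, source, sink, I0=1.0, r=1.0, dt=0.01, gamma=1.8):
        """One update of tube conductances D on a graph with tube lengths L (L[i,j]=0 means no edge)."""
        n = L.shape[0]
        edge = L > 0
        G = np.where(edge, D / np.where(edge, L, 1.0), 0.0)       # g_ij = D_ij / L_ij
        A = G - np.diag(G.sum(axis=1))                            # row j encodes sum_i g_ij (p_i - p_j)
        b = np.zeros(n)
        b[source], b[sink] = -I0, I0                              # source pumps I0 out, sink absorbs I0
        A[sink, :], A[sink, sink], b[sink] = 0.0, 1.0, 0.0        # fix the gauge: p_sink = 0
        p = np.linalg.solve(A, b)
        Q = G * (p[:, None] - p[None, :])                         # flux Q_ij = g_ij (p_i - p_j)
        f = np.abs(Q) ** gamma / (1.0 + np.abs(Q) ** gamma)       # assumed sigmoidal f(|Q|)
        return np.where(edge, D + dt * (f - r * D), 0.0)          # dD/dt = f(|Q|) - r D

    # Tiny demo: a 4-node ring of unit-length tubes, repeatedly fed from node 0 to node 2.
    L = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)
    D = np.where(L > 0, 1.0, 0.0)
    for _ in range(500):
        D = physarum_step(L, D, source=0, sink=2)
    print(np.round(D, 3))

Used tubes settle at a conductance where reinforcement balances decay; tubes that carry no flow shrink toward zero, which is the pruning effect described above.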

Here f(|Q|) is a sigmoidal curve.

Measuring network quality:
- TL = wiring length
- MD = average minimum distance between any pair of food sources
- FT = tolerance to disconnection after a single link failure

Each metric is normalized to its value on a minimum spanning tree (MST) baseline.

Legend (figure): final slime mold network; Tokyo railway network; minimum spanning tree; minimum spanning tree + added links.

Legend (figure): actual railway network; slime mold designed network (with light); slime mold designed network (no light); simulated networks.

Cost: TL_MST(rail network) = 1.80 and TL_MST(slime mold) = 1.75 +/- 0.30 -- SIMILAR.
Efficiency: MD_MST(rail network) = 0.85 and MD_MST(slime mold) = 0.85 +/- 0.04 -- SIMILAR.
Fault tolerance: 4% of link failures cause disconnection in the real network, but 14-20% for the slime mold networks.

Same performance at lower cost for the slime mold; overall better fault tolerance for the real network. Alpha = benefit-cost ratio FT/TL_MST: the real network has slightly higher alpha for the same MD_MST.

4 examples

Slime mold foraging & MST construction [Tero et al. Science 2010]

Fly brain development & MIS [Afek et al. Science 2010]

E. coli foraging & consensus navigation [Shklarsh et al. PLoS Comput. Biol. 2013]

Synaptic pruning & network design

Bacteria foraging
Imagine a complicated terrain with food in some local minima, and a collection of bacteria that want to find the food. How can they quickly succeed?

They need to agree on a movement trajectory to collectively find food sources, based on:
- Individual knowledge / sensory information
- Knowledge from others

They need to account for dynamic and varied environments with little or no central coordination. This may also be helpful for robot coordination problems.

Bacterial chemotaxis
Bacteria navigate via chemotaxis, i.e., they move according to gradients in the chemical concentration (food). In areas of low concentration, bacteria tumble more (i.e., they move randomly).

Bacteria acquire cues from neighbors:
- Repulsion, to avoid collision
- Orientation
- Attraction, to avoid fragmentation

Bacteria as automata
Treat bacteria as individual automata with two sources of information:
- Individual belief about the food source
- Interaction with neighbors' beliefs about where the food source is
A parameter $w_i(t)$ controls how much bacterium i listens to its neighbors at time t.

Independent vs. fixed weights. [[maybe include videos?]]

Problem: erroneous positive feedback leads bacteria astray: a subgroup gets bad information and convinces the others along an incorrect trajectory.
Solution: adaptive weights. Bacteria adjust their communication weights based on their own internal confidence:
- When a bacterium finds a beneficial path (a strong gradient), it downweights, i.e., listens less to its neighbors.
- When it is unsure, it increases its interaction with its neighbors.
This simple idea requires only short-term memory and a very simple rule.
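
A minimal sketch of this adaptive-weight rule, heavily simplified from the Shklarsh et al. setup: each agent mixes its own gradient estimate with its neighbors' headings, listening less to neighbors when its own gradient signal is strong. The food field, neighborhood radius, confidence threshold, and weight values are illustrative assumptions.

    import numpy as np

    def step(pos, heading, food_grad, radius=2.0, strong=0.5, dt=0.1):
        """One synchronous update of all agents' positions and headings."""
        n = len(pos)
        new_heading = np.zeros_like(heading)
        for i in range(n):
            g = food_grad(pos[i])                                        # individual sensory information
            neigh = [j for j in range(n)
                     if j != i and np.linalg.norm(pos[j] - pos[i]) < radius]
            social = heading[neigh].mean(axis=0) if neigh else heading[i]  # neighbors' average heading
            w = 0.1 if np.linalg.norm(g) > strong else 0.9               # adaptive weight: confident -> listen less
            v = (1 - w) * g + w * social
            new_heading[i] = v / (np.linalg.norm(v) + 1e-9)
        return pos + dt * new_heading, new_heading

    rng = np.random.default_rng(0)
    pos = rng.normal(size=(20, 2)) * 5.0                   # 20 agents scattered in the plane
    heading = rng.normal(size=(20, 2))
    heading /= np.linalg.norm(heading, axis=1, keepdims=True)
    food_grad = lambda x: -x / (1.0 + np.linalg.norm(x))   # toy gradient pointing toward food at the origin
    print("mean distance to food before:", round(float(np.linalg.norm(pos, axis=1).mean()), 2))
    for _ in range(300):
        pos, heading = step(pos, heading, food_grad)
    print("mean distance to food after: ", round(float(np.linalg.norm(pos, axis=1).mean()), 2))

The key design point is that the weight w is set from the agent's own confidence, which damps the erroneous positive feedback described above.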

4 examples

Slime mold foraging & MST construction [Tero et al. Science 2010]

Fly brain development & MIS [Afek et al. Science 2010]

E. coli foraging & consensus navigation [Shklarsh et al. PLoS Comput. Biol. 2013]

Synaptic pruning & network design

From human birth to age 2: thousands of neurons & synapses are generated per minute.

Synaptic pruning & brain development
Adolescence:

The density of synapses decreases by 50-60%. Pruning occurs in almost every brain region and organism studied. But it is unlike previous models of network development, so why does it happen?

But what about neural networks? Well, it starts off the same. Thousands of neurons are generated and connected by synapses per minute in early and pre-natal life, but then 50-60% of the original synapses grown are just pruned away! The nodes remain the same, but the majority of synapses are lost.

This is a fundamental process of brain development that occurs in every brain region and organism studied that exhibits learning. But it's clearly unlike any of the popular generative models previously proposed.

From an engineering perspective, it's also intriguing. Why grow out all these connections, and then eliminate most of them? Isn't this wasteful? We clearly don't do this when designing road networks or communication networks, so why is the brain doing it?

Pruning has in fact been known about in neuroscience for decades, so I started reading through some old literature and came across

Learning (i.e., network rewiring) is happening continuously, but this is when the most radical changes are happening.

Idea: Source-target pairs are drawn from a distribution D that represents a prior/likelihood on signals.
- D_train is fixed beforehand (but unknown).
- D_test is derived from D, but with different s-t pairs.

Constraints:
- Distributed: no centralized controller
- Streaming: process s-t pairs online

\noindent \underline{Problem}: \textrm{Given $n$ nodes and a set of source-target pairs $\{(s_i,t_i)\}_{i=1}^p$ drawn from some distribution $\mathcal{D}_{\textrm{train}}$, return an ``efficient'' and ``robust'' network $G$ (evaluated on $\mathcal{D}_{\textrm{test}}$) with $B$ edges, where $B \ll p$.}

Synaptic pruning as an algorithm ("use it or lose it"):
1. Start with all-equivalent nodes and blanket the space with connections. (You don't know what the requests are beforehand, so you start with all-equivalent cells.)
2. Receive a source-target request.
3. Route the request and keep track of usage.
4. Process the next pair.
5. Repeat, incrementing the usage count of every edge on each routed path.
6. Prune at some rate: high-usage edges are important [keep]; low-usage edges are not [prune].
(The original slides step through a small example graph, showing the per-edge usage table growing and then being pruned.)
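
A runnable sketch of this "use it or lose it" loop under assumed parameters: a dense initial graph, shortest-path routing, and a decreasing pruning schedule are my illustrative choices, not the authors' exact model.

    import random
    from collections import deque

    def bfs_path(adj, s, t):
        """Shortest path from s to t in an undirected graph given as {node: set(neighbors)}."""
        parent, frontier = {s: None}, deque([s])
        while frontier:
            u = frontier.popleft()
            if u == t:
                path = []
                while u is not None:
                    path.append(u); u = parent[u]
                return list(reversed(path))
            for v in adj[u]:
                if v not in parent:
                    parent[v] = u; frontier.append(v)
        return None

    def prune_network(n_nodes, requests, prune_fracs):
        # 1. Start with all-equivalent nodes and blanket the space with connections.
        adj = {v: {u for u in range(n_nodes) if u != v} for v in range(n_nodes)}
        usage = {}
        batch = len(requests) // len(prune_fracs)
        for k, (s, t) in enumerate(requests):
            # 2-5. Route each source-target request and count how often each edge is used.
            path = bfs_path(adj, s, t)
            if path:
                for u, v in zip(path, path[1:]):
                    usage[frozenset((u, v))] = usage.get(frozenset((u, v)), 0) + 1
            # 6. Periodically prune the lowest-usage edges at the scheduled rate.
            if (k + 1) % batch == 0:
                frac = prune_fracs[(k + 1) // batch - 1]
                edges = sorted({frozenset((u, v)) for u in adj for v in adj[u]},
                               key=lambda e: usage.get(e, 0))
                for e in edges[:int(frac * len(edges))]:
                    u, v = tuple(e)
                    adj[u].discard(v); adj[v].discard(u)
        return adj

    rng = random.Random(0)
    reqs = [(rng.randrange(30), rng.randrange(30)) for _ in range(300)]
    net = prune_network(30, reqs, prune_fracs=[0.4, 0.2, 0.1])   # decreasing pruning rates
    print(sum(len(v) for v in net.values()) // 2, "edges remain")

The surviving edges are the ones that lay on many routed paths for the training distribution, which is the intuition behind evaluating the pruned network on test pairs drawn from a related distribution.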

Hubel and Wiesel (1970s)
Left eye, right eye: two sets of neurons that each respond to stimuli from one eye.

Hubel and Wiesel (1970s): left eye, right eye. What happens to the neurons that now receive no input?


Hubel and Wiesel (1970s): left eye, right eye. Both sets of neurons respond to activity from the same eye.

Why does this happen?

Pool resources to improve acuity in the left eye as a means to compensate for the loss of the right eye

The connections adapt to the structure of the input.

Hubel and Wiesel (1970s)
How does this happen?

Via a competition: one target wins, and that neuron becomes dominantly responsive to the active eye.

Initially, each neuron is connected to both eyes, but with some bias. Strength and repetition of input signals can cause rewiring.

Human frontal cortex [Huttenlocher 1979]

Mouse somatosensory cortex [White et al. 1997]

Pruning rates have been ignored in the literature. Rapid elimination early, then a taper-off.

Data: 16 time-points, 41 animals, 9754 images, 42709 synapses. (Plot: # of synapses per image vs. postnatal day.) Decreasing pruning rates in the cortex.

(Plot: efficiency (avg. routing distance) vs. cost (# of edges).)

(Plot: robustness (# of alternate paths) vs. cost (# of edges).)
- Decreasing rates are > 20% more efficient than other rates.
- Decreasing rates have slightly better fault tolerance.
- Decreasing rates optimize network function.

Take-aways [clean up / split]
Biology-inspired algorithms:
- Evaluated using standard cost metrics, based on formal models.
- Sometimes worse performance in some aspects, but often simpler algorithms: e.g., the fly MIS algorithm had higher run-time, but required fewer assumptions; synaptic pruning is initially wasteful, but is more adaptive than growth-based models.
- Often no UIDs.
- Dynamic and variable network topologies, similar to new wireless devices & networks.
- Formal models can be used to evaluate performance and predict future system behavior.

Contribution to biology:
- Global understanding of local processes (Notch-Delta signaling, slime mold optimization criteria).
- Raised new testable hypotheses (e.g., the role of pruning rates in learning).
[add more]

Common algorithmic strategies we've seen:
- Stochasticity, to overcome noise and adversaries.
- Feedback processes, to reinforce good solutions/edges/paths.
- Rates of communication/contact are vital to inform decision-making (MIS, brain, ants, etc.).

Conclusion