CS305/503, Spring 2009 Graphs Michael Barnathan

Post on 21-Dec-2015


CS305/503, Spring 2009
Graphs

Michael Barnathan


Here’s what we’ll be learning:

• Data Structures:
  – Graphs.
• Theory:
  – Graph nomenclature (there is a lot of it).
  – Depth-first search.
  – Breadth-first search.
  – Best-first search.


Review: Trees

• A tree is a data structure in which every node points to a set of “children”.

• A binary tree is a special case in which a node may have up to 2 children.

• Each node has exactly one parent, except the root, which has no parent.

• There is thus exactly one path from the root to every node.
  – This is nice; it simplifies many of the algorithms.
  – You very seldom need to backtrack.


Unique Paths

This is a tree:

[Figure: a graph on vertices 1–5.]

This is not a tree:

[Figure: a graph on vertices 1–5.]

4 has two parents and there are two ways to access it.


There goes another assumption!

• What if we get rid of the assumption that each node has one parent and one path?

• We’re not assuming much anymore… now we’re just looking at connected nodes.

[Figure: a graph on vertices 1–5.]

Weird.


Graphs

• This data structure is called a graph.
• It is the most general data structure.
  – Trees are special cases of graphs.
  – Linked lists are special cases of graphs.
• Formally, a graph is simply a set of nodes V connected by a set of lines E: G = <V,E>.
  – The nodes are called vertices.
  – The lines connecting them are edges.
  – The number of edges incident to a vertex is called the degree of that vertex.
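The definition G = <V,E> can be sketched directly in Java. This is only an illustration: the class name SimpleGraph, its methods, and the five edges in example() are assumptions made up here, not something taken from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; the slides define only G = <V,E> and "degree".
class SimpleGraph {
    final List<Integer> vertices = new ArrayList<>();   // V
    final List<int[]> edges = new ArrayList<>();        // E: each edge is a pair {u, v}

    void addVertex(int v) { vertices.add(v); }

    void addEdge(int u, int v) { edges.add(new int[] {u, v}); }

    // Degree of v: count every edge end that touches v (a self-loop counts twice).
    int degree(int v) {
        int d = 0;
        for (int[] e : edges) {
            if (e[0] == v) d++;
            if (e[1] == v) d++;
        }
        return d;
    }

    // Example graph on vertices 1..5; the edges are invented for illustration.
    static SimpleGraph example() {
        SimpleGraph g = new SimpleGraph();
        for (int v = 1; v <= 5; v++) g.addVertex(v);
        g.addEdge(1, 2);
        g.addEdge(1, 3);
        g.addEdge(2, 4);
        g.addEdge(3, 4);
        g.addEdge(4, 5);
        return g;
    }

    public static void main(String[] args) {
        SimpleGraph g = example();
        for (int v : g.vertices)
            System.out.println("degree(" + v + ") = " + g.degree(v));
    }
}
```

Storing E as a flat list of pairs mirrors the formal definition; it is not the fastest layout when the edges of one vertex must be looked up often.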


Example

[Figure: an example graph G on vertices 1–5, with its vertex set V and its edge set E labeled.]


Why are they useful?

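• Networks:
  – Computer networks (routers!)
  – Social networks.
  – Spread of disease.
• Roads, paths, travel.

[Figure: a road graph with vertices Woodland, 71, Larchwood, Jonathon, and Palmer.]

[Figure: a social network with vertices You, Bob, Alice, Trudy, and Mallory.]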


Undirected Graphs

• These are all two-way streets. Traffic can flow both ways. We can turn from 71 onto Larchwood, or Larchwood onto 71.

• The graph is therefore called undirected. The edges can be traversed in either direction.

[Figure: the road graph (Woodland, 71, Larchwood, Jonathon, Palmer) with undirected edges.]


Directed Graphs

• What if Larchwood were one way only?
• You could not turn onto 71 from Larchwood, but could turn onto Larchwood from 71.
• This is represented by adding arrows to edges to signify that the edge only flows one way. Edges cannot be traversed against the direction of the arrow.
• These are called directed edges and a graph containing at least one of them is called a directed graph or digraph.

[Figure: the road graph (Woodland, 71, Larchwood, Jonathon, Palmer) with an arrow marking the one-way edge from 71 to Larchwood.]
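One way to see the difference between directed and undirected edges is an adjacency list in which an undirected edge is simply stored in both directions. The sketch below is illustrative only: RoadGraph, its methods, and the two-way connection between 71 and Woodland are assumptions, since the slide text does not say which roads meet.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical adjacency-list graph keyed by road name.
class RoadGraph {
    private final Map<String, List<String>> adj = new HashMap<>();

    // A directed edge can only be followed from 'from' to 'to'.
    void addDirectedEdge(String from, String to) {
        adj.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        adj.computeIfAbsent(to, k -> new ArrayList<>());   // make sure 'to' exists as a vertex
    }

    // An undirected edge is stored as two directed edges, one each way.
    void addUndirectedEdge(String a, String b) {
        addDirectedEdge(a, b);
        addDirectedEdge(b, a);
    }

    boolean canTurn(String from, String to) {
        return adj.getOrDefault(from, Collections.emptyList()).contains(to);
    }

    static RoadGraph example() {
        RoadGraph g = new RoadGraph();
        g.addDirectedEdge("71", "Larchwood");     // one-way, as on the slide
        g.addUndirectedEdge("71", "Woodland");    // assumed two-way connection
        return g;
    }

    public static void main(String[] args) {
        RoadGraph g = example();
        System.out.println("71 -> Larchwood allowed: " + g.canTurn("71", "Larchwood"));
        System.out.println("Larchwood -> 71 allowed: " + g.canTurn("Larchwood", "71"));
    }
}
```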


Cycles

• It is possible for a graph to loop back on itself, directly or indirectly.

• The loop is called a cycle or closed walk.
• The number of vertices in the loop is called the length of the cycle.
• A graph with cycles is known as a cyclic graph, while one that contains none is called acyclic.

[Figure: a cycle of length 1 (vertex 1 looping back to itself) and a cycle of length 3 (through vertices 1, 2, and 3).]


Trees

• Since you don’t have a pointer back to the parent, trees are directed acyclic graphs.

[Figure: a tree on vertices 1–5 drawn with directed edges.]


Connected Components

• It is possible for some vertices to be isolated from others within the same graph:

[Figure: a single graph on vertices 1–5 drawn as separate groups. Caption: This is one graph.]

• Each group is called a connected component. Formally, two vertices are in the same connected component if one may be reached from the other. A connected graph has only one connected component.
• A strongly connected component is a group in which every vertex in the group can be reached from every other vertex in the group.
• Question: are the connected components of the graph shown above strongly connected? Why or why not?
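Counting connected components of an undirected graph can be sketched as follows. Everything here is an assumption for illustration: the class name Components, the use of an explicit stack to explore each group, and the sample edges (vertices 1–5 are stored at indices 0–4, and the slide does not say which of them are joined).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class Components {
    // Counts the connected components of an undirected graph on vertices 0..n-1.
    static int count(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {            // undirected: record each edge both ways
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }

        boolean[] seen = new boolean[n];
        int components = 0;
        for (int start = 0; start < n; start++) {
            if (seen[start]) continue;     // already reached from an earlier start
            components++;                  // a vertex nobody reached begins a new group
            Deque<Integer> todo = new ArrayDeque<>();
            todo.push(start);
            seen[start] = true;
            while (!todo.isEmpty()) {      // mark everything reachable from 'start'
                int v = todo.pop();
                for (int w : adj.get(v)) {
                    if (!seen[w]) {
                        seen[w] = true;
                        todo.push(w);
                    }
                }
            }
        }
        return components;
    }

    public static void main(String[] args) {
        // Assumed edges: 1-2, 2-3 and 4-5 (0-based indices below).
        int[][] edges = { {0, 1}, {1, 2}, {3, 4} };
        System.out.println("components = " + count(5, edges));
    }
}
```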


Path Length

• A traversal starting at one vertex and ending at another is called a path.

• The number of edges traversed to get from the start to the end vertex is the path length.

• The minimal path length between two vertices is the length of the shortest path that connects them.


Path Length Example

• What is the shortest path from 71 to Palmer?

[Figure: the road graph (Woodland, 71, Larchwood, Jonathon, Palmer) annotated with the path lengths 1, 2, 2, 3, 3, and 2.]


The Problem With Path Length

• Of course, not all roads are created equal.
• Which is closer, Colorado or West Virginia?

Path Length = 27. Path Length = 30.

Colorado, here we come!


Weighted Path Length

• In order to represent things like distance (I-95 != Route 36) or “cost” of walking down a certain path, we can assign weights to edges.

• Instead of counting each edge as “1”, we count it by its weight:

[Figure: the road graph (Woodland, 71, Larchwood, Jonathon, Palmer) with edge weights in miles: 0.4, 0.2, 0.4, 0.2, 0.2, and 0.3.]

Shortest path length: 0.4 + 0.3 = 0.7 mi
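The difference between counting edges and summing weights can be shown in a few lines of Java. This is a sketch under assumptions: the class PathLengths and its methods are invented here, and routing the 0.4 and 0.3 mile roads through Jonathon is a guess, because the slide text does not say which two roads make up the 0.7 mile path.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PathLengths {
    private final Map<String, Double> weight = new HashMap<>();

    // Two-way road between a and b with the given length in miles.
    void addRoad(String a, String b, double miles) {
        weight.put(a + "->" + b, miles);
        weight.put(b + "->" + a, miles);
    }

    // Unweighted path length: the number of edges traversed.
    static int edgeCount(List<String> path) {
        return Math.max(0, path.size() - 1);
    }

    // Weighted path length: each edge counts by its weight instead of as 1.
    double weightedLength(List<String> path) {
        double total = 0.0;
        for (int i = 0; i + 1 < path.size(); i++) {
            String key = path.get(i) + "->" + path.get(i + 1);
            Double w = weight.get(key);
            if (w == null) throw new IllegalArgumentException("no road " + key);
            total += w;
        }
        return total;
    }

    static PathLengths example() {
        PathLengths g = new PathLengths();
        g.addRoad("71", "Jonathon", 0.4);       // assumed placement of the 0.4 mi edge
        g.addRoad("Jonathon", "Palmer", 0.3);   // assumed placement of the 0.3 mi edge
        return g;
    }

    public static void main(String[] args) {
        List<String> path = Arrays.asList("71", "Jonathon", "Palmer");
        System.out.println("edges traversed: " + edgeCount(path));
        System.out.printf("weighted length: %.1f mi%n", example().weightedLength(path));
    }
}
```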


Weighted Path Length

• Edge weights can also be negative in some cases (maybe a certain road bypasses traffic and saves you driving time?)
• Finding the shortest path length is obviously an important problem.
  – If you’re UPS, you want your truck drivers to deliver packages on time in as short a distance as possible (to conserve fuel).
  – If you are routing a packet, you want to select the fastest route that can get it to its destination.
• Intuitively, how would you find the shortest weighted path length between two vertices?
• We’ll give some formal strategies for this next time.


Traversing a Graph.

• Very often, we will want to scan the vertices of a graph (for example, to find the path length).

• There are three common ways of traversing a graph:
  – Depth-first.
  – Breadth-first.
  – Best-first.

• There are also popular variations on best-first search, such as A* search, which are used frequently in AI.

• A “root” (vertex to start at) must be selected in order to give the traversal a place to begin.


Depth-First Search

• DFS is equivalent to preorder traversal of a tree. Because graphs may be cyclic, it requires keeping track of which vertices were visited.

• The idea: when encountering an unvisited vertex, traverse down it immediately.

• Only once that traversal finishes do you traverse down the remaining edges of the current vertex.

• This is usually done recursively.


DFS Example

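[Figure: a graph whose vertices are numbered 1–5 in the order DFS visits them, beginning at the vertex marked Start.]

When we traverse 3, 3 becomes the new current vertex. We then traverse its edges (to 4) before returning and finishing up with 2’s other neighbor (5).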


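DFS Algorithm

void dfs(Vertex v) {
    if (v == null)
        return;

    visit(v);           // We can do anything with v here.
    v.visited = true;   // Mark v so that a cycle cannot bring us back to it.

    for (Edge e : v.edges()) {
        Vertex w = e.getOtherVertex(v);   // The vertex at the far end of e.
        if (!w.visited)
            dfs(w);                       // Go all the way down w before trying the next edge.
    }
}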
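The recursive routine above leaves Vertex, Edge, and visit() undefined. A self-contained variant is sketched below; it is not the course's reference code. The Vertex class here stores neighbors directly instead of Edge objects, "visiting" just records the id, and the edges 1-2, 2-3, 3-4, 2-5 are an assumption chosen to match the description of the DFS example.

```java
import java.util.ArrayList;
import java.util.List;

class DfsDemo {
    // Minimal hypothetical vertex: an id, a visited flag, and its neighbors.
    static class Vertex {
        final int id;
        boolean visited = false;
        final List<Vertex> neighbors = new ArrayList<>();
        Vertex(int id) { this.id = id; }
    }

    // Undirected edge: each endpoint lists the other as a neighbor.
    static void connect(Vertex a, Vertex b) {
        a.neighbors.add(b);
        b.neighbors.add(a);
    }

    static void dfs(Vertex v, List<Integer> order) {
        if (v == null) return;
        order.add(v.id);      // "visit" v: here we only record the order
        v.visited = true;     // mark before recursing so cycles terminate
        for (Vertex w : v.neighbors)
            if (!w.visited)
                dfs(w, order);   // finish everything below w before the next neighbor
    }

    // Builds the assumed example graph and returns the DFS visiting order from vertex 1.
    static List<Integer> example() {
        Vertex[] v = new Vertex[6];               // index 0 unused
        for (int i = 1; i <= 5; i++) v[i] = new Vertex(i);
        connect(v[1], v[2]);
        connect(v[2], v[3]);
        connect(v[3], v[4]);
        connect(v[2], v[5]);
        List<Integer> order = new ArrayList<>();
        dfs(v[1], order);
        return order;
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

The visiting order depends on the order in which neighbors were added; a different insertion order would explore 5 before 3.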


Breadth-First Search

• Where depth-first search scanned down the entire path before checking additional edges, breadth-first search does the opposite.

• Idea: scan every vertex adjacent to the current one before traversing deeper into any of them.

• Whereas DFS used a stack to traverse (you did realize it was using the system stack to keep track of the history, right?), BFS uses a queue.

• Also, while DFS is usually written recursively, BFS is usually written iteratively.


BFS Example

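[Figure: a graph whose vertices are numbered 1–5 in the order BFS visits them, beginning at the vertex marked Start.]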

All of 2’s adjacent vertices (3 and 4) are labeled before we traverse into 3 and check its adjacent vertices (5).


BFS Algorithm

// Requires: import java.util.LinkedList; import java.util.Queue;
void bfs(Vertex v) {
    if (v == null)
        return;

    // Queue is an interface in Java; LinkedList is one class that implements it.
    Queue<Vertex> vqueue = new LinkedList<Vertex>();
    vqueue.add(v);        // Start with the start vertex.
    v.visited = true;

    while (!vqueue.isEmpty()) {
        v = vqueue.remove();   // Dequeue the next element and store it in v.
        visit(v);              // We can do anything with v here.

        for (Edge e : v.edges()) {
            Vertex w = e.getOtherVertex(v);
            if (!w.visited) {
                vqueue.add(w);
                w.visited = true;   // Mark on enqueue so no vertex is queued twice.
            }
        }
    }
}
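A self-contained version of the same idea is sketched below, using plain integers for vertices so that no Vertex or Edge class is needed. The class name BfsDemo, the method bfsOrder, and both sample edge lists are assumptions; the first edge list (1-2, 2-3, 2-4, 3-5) is chosen to match the description of the BFS example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class BfsDemo {
    // Returns the order in which BFS visits vertices 1..n of an undirected graph.
    static List<Integer> bfsOrder(int n, int[][] edges, int start) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i <= n; i++) adj.add(new ArrayList<>());   // index 0 unused
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }

        boolean[] visited = new boolean[n + 1];
        List<Integer> order = new ArrayList<>();
        Queue<Integer> queue = new ArrayDeque<>();
        queue.add(start);
        visited[start] = true;

        while (!queue.isEmpty()) {
            int v = queue.remove();        // oldest queued vertex comes out first
            order.add(v);                  // "visit" v
            for (int w : adj.get(v)) {
                if (!visited[w]) {
                    visited[w] = true;     // mark on enqueue so w is queued only once
                    queue.add(w);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        int[][] bfsFigure = { {1, 2}, {2, 3}, {2, 4}, {3, 5} };   // assumed edges
        System.out.println(bfsOrder(5, bfsFigure, 1));
    }
}
```

Swapping the queue for a stack in this loop gives an iterative depth-first traversal, which is the sense in which the two searches differ only in the container they use.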


Best-First Search

• Best-first search uses a user-chosen heuristic function which ranks nodes based on how “promising” they are in achieving a goal.
• The heuristic function may be based on the value or position of the vertex or weight of the edges.
• For example, in a game of checkers, a move that results in jumping an opponent’s piece may be ranked highly by the heuristic function, since it makes progress towards attaining a goal (winning the game).
• Best-first search always chooses the “best” next move at each step.
  – What do we call those sorts of algorithms again?
• Whereas a stack is used in depth-first search and a queue is used in breadth-first search, a priority queue can be used in best-first search.
• The priority would be how “good” a vertex is ranked.
• Other than that change, the algorithm is the same as breadth-first search.
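Replacing the queue with a priority queue might look like the sketch below. It is illustrative only: the class BestFirstDemo, the convention that a lower heuristic score h[v] means "more promising", the early stop at a goal vertex, and the sample graph are all assumptions, not part of the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class BestFirstDemo {
    // Visits vertices 0..n-1 in order of heuristic score h[v] (lower = more promising)
    // and stops as soon as the goal is visited. Returns the visiting order.
    static List<Integer> search(int n, int[][] edges, int[] h, int start, int goal) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }

        boolean[] visited = new boolean[n];
        // Same loop as BFS, but the container hands back the best-ranked vertex first.
        PriorityQueue<Integer> frontier =
                new PriorityQueue<>((a, b) -> Integer.compare(h[a], h[b]));
        frontier.add(start);
        visited[start] = true;

        List<Integer> order = new ArrayList<>();
        while (!frontier.isEmpty()) {
            int v = frontier.remove();
            order.add(v);
            if (v == goal) break;          // assumed stopping rule for this sketch
            for (int w : adj.get(v)) {
                if (!visited[w]) {
                    visited[w] = true;
                    frontier.add(w);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        int[][] edges = { {0, 1}, {0, 2}, {1, 3}, {2, 4} };   // made-up graph
        int[] h = { 2, 3, 1, 4, 0 };                          // made-up scores; the goal scores 0
        System.out.println(search(5, edges, h, 0, 4));
    }
}
```

On this sample graph the search goes from 0 to 2 to 4 and never touches vertices 1 and 3, whereas breadth-first search would visit all of 0, 1, 2, and 3 before reaching 4.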


A Classical Problem

• This is called the “7 Bridges of Königsberg”. You may have seen it on IQ tests.
• Euler first solved it in 1736. We’ll walk through his solution.
• The problem: find a route that allows you to cross each of the 7 bridges exactly once, or demonstrate that none exists.


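Euler’s Solution

• The configuration of the city is irrelevant; only how one can move from one part of it to another is important.
• So we can represent the problem as a graph.

[Figure: a map of the city with land masses A, B, C, and D, next to the equivalent graph with vertices A, B, C, and D and edges labeled AB, AC, CD, AD, and BD.]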


As “States”

• Think of each vertex as a “state”.
• One bridge is required to enter that state.
• One bridge is required to leave that state.
• Clearly, there would be no solution (except to swim!) if the graph were not connected.
• Because the graph is connected, a solution will involve both entering and leaving every state at least once.
  – With potentially two exceptions: the state you start in (you don’t need to enter it) and the one you end in (you don’t need to leave it).
• Each pass through a state uses up two of its bridges: one to enter and one to leave. This means that either every vertex, or every vertex but the starting and ending vertices, must have an even degree for this to work!
• There are four vertices in this graph: A, B, C, and D.
• We can start at any one of them and finish at any one of them.
• So if at least two of these four vertices have an even degree, we can cross each bridge exactly once. Otherwise, we can’t.


Degrees of Each Vertex

• Recall: The degree of a vertex is the number of edges incident to that vertex.
• What is the degree of each of the four vertices?

[Figure: the bridge graph with vertices A, B, C, and D.]


Eulerian Graphs

• A has degree 5, all others have degree 3.
• In order to cross each bridge exactly once, either all vertices in the graph or all but two vertices in the graph must have even degrees.
  – A path that crosses each edge in a graph exactly once is called an “Eulerian path”.
  – Graphs that satisfy the above condition (i.e. they have an Eulerian path) are called Eulerian graphs here. (Many texts reserve “Eulerian” for graphs whose path ends where it started.)
• Every vertex in this graph has an odd degree.
  – So this graph is not Eulerian.
  – Therefore, it is impossible to walk each bridge only once.
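The degree argument is easy to check mechanically. In the sketch below, the class KonigsbergDemo and its methods are invented for illustration, and the bridge list (two A–B bridges, two A–C bridges, and one each for A–D, B–D, and C–D) is an assumption chosen to agree with the stated degrees of 5, 3, 3, and 3. The check also assumes the graph is connected, as the earlier slide requires.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class KonigsbergDemo {
    // Assumed bridge list: 7 bridges giving A degree 5 and B, C, D degree 3.
    static final String[][] BRIDGES = {
        {"A", "B"}, {"A", "B"}, {"A", "C"}, {"A", "C"},
        {"A", "D"}, {"B", "D"}, {"C", "D"}
    };

    // Degree of each vertex: every bridge adds one to both of its endpoints.
    static Map<String, Integer> degrees(String[][] bridges) {
        Map<String, Integer> deg = new LinkedHashMap<>();
        for (String[] b : bridges) {
            deg.merge(b[0], 1, Integer::sum);
            deg.merge(b[1], 1, Integer::sum);
        }
        return deg;
    }

    // For a connected graph: an Eulerian path exists when 0 or 2 vertices have odd degree.
    static boolean hasEulerianPath(String[][] bridges) {
        int odd = 0;
        for (int d : degrees(bridges).values())
            if (d % 2 != 0) odd++;
        return odd == 0 || odd == 2;
    }

    public static void main(String[] args) {
        System.out.println("degrees: " + degrees(BRIDGES));
        System.out.println("can cross each bridge exactly once: " + hasEulerianPath(BRIDGES));
    }
}
```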


A Bridge Too Far

• We discussed some basic graph theory today.
• Next time, we’ll cover algorithms for finding the shortest path between two vertices and an alternate representation of a graph.
• The lesson:
  – Particularly in mathematics, it is possible to simplify a problem by removing irrelevant information. The clutter may make a problem seem more difficult than it really is.


Assignment 4

• This assignment will have you writing a heap from scratch.

• You may not use the Java Set or Map classes for this assignment.

• The assignment handout is located on the course website.

• The deadline is next Tuesday, April 14.