MAXIMUM PATH SUM 1
Maximum Path Sum of a Triangle
Design & Analysis of Optimization Algorithms
William Jacobs, Samnang Pann and Andrew Mason
30 November 2017
University of North Carolina Wilmington
Abstract
This maximum path sum problem involves a triangle of numeric values where each
value, except those on the final level, has an adjacent relationship to the two values on the level
below. A path can move from one value to only one of the adjacent values below. The sum of all
values located on a single path from the top level of the triangle to the bottom level of a triangle
is called a path sum. The goal of this problem is finding the maximum path sum. We sought to
accomplish this through the design and implementation of four different types of algorithms:
brute force, dynamic programming, simulated annealing, and a generic greedy algorithm. After
the analysis and comparison of each algorithm, dynamic programming proved to find the exact
solution in a reasonable amount of time; it can, for example, find the maximum path sum in a
triangle with 2^999 possible paths (1,000 levels) in about 2.3 seconds. For an estimated value in a larger triangle, simulated annealing demands less time than dynamic programming and generally returns a more accurate approximation than the greedy algorithm. Exhaustive search
returns the exact solution through brute force, but does so in exponential time, rendering it very
computationally expensive for triangles of more than a dozen levels.
Key Words: levels, path, path sum, dynamic programming, simulated annealing,
algorithms, greedy algorithms, exhaustive search
1. Introduction
The goal of the triangle maximum path sum problem is to find what path must be taken in
the triangle, from top to bottom, that yields the maximum total amongst all the paths. To find the
maximum path sum, we begin at the apex and move to one of the adjacent values on the level
below, until we reach the bottom level of the triangle. Using optimization algorithms, the sequence of vertices that yields the maximum path sum can be determined from the weights associated with each vertex.
Formally, let T be a triangle representation of a directed, connected graph. Let V refer to a vertex and W(V) to the weight of V. Let Li denote level i of the triangle, so that VLi refers to a vertex on level Li. Let p represent an ordered sequence of vertices along a path:
p = (VL1, VL2, VL3, ..., VLn)
Let S(p) represent the sum of the weights along p:
S(p) = Σ_{i=1}^{n} W(VLi)
Let P represent the set of path sums over all paths p1, p2, ..., pm:
P = {S(p1), S(p2), S(p3), ..., S(pm)}
Problem: Given T, find the maximum S(p) in P.
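To make the formalization concrete, here is a minimal sketch in Python (the triangle values and the sample path are invented for illustration): a triangle is stored level by level, a path p is encoded by its sequence of left/right moves, and S(p) is the sum of the weights visited.

```python
# Illustrative triangle of weights; row i holds the i+1 vertices on level i+1.
triangle = [
    [3],
    [7, 4],
    [2, 4, 6],
    [8, 5, 9, 3],
]

def path_sum(triangle, moves):
    """Evaluate S(p): sum the weights along a path described by
    0 (move to the left child) / 1 (move to the right child)."""
    col = 0
    total = triangle[0][0]              # start at the apex, VL1
    for level, move in enumerate(moves, start=1):
        col += move                     # 0 keeps the column, 1 shifts right
        total += triangle[level][col]
    return total

print(path_sum(triangle, [1, 0, 1]))    # S(p) for p = (3, 4, 4, 9), i.e. 20
```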
2. Brute Force – Exhaustive Search
One method to find the global optimum is with brute force. This algorithm simply checks
every possible path and returns the maximum sum. Every vertex in a triangle of N levels may
move in either of two directions, except for the vertices located on the Nth level which mark the
end of a path. There are therefore 2^(N-1) possible paths in a triangle of N levels.
2.1. Implementation
A convenient way of representing a single path is with a bit vector of length N-1. The path is
described by this vector as such: the path begins at the apex of the triangle, and moves as
according to the vector bit by bit, where a 0 bit indicates a move to the left adjacent vertex, and a
1 bit indicates a move to the right adjacent vertex, descending levels until a vertex is reached on
the Nth level. Since there are 2^(N-1) unique bit vectors of length N-1, the set of bit vectors maps one-to-one onto the set of possible paths in a triangle of N levels. The solution
is found by navigating each path and detecting the maximum path sum.
2.2. Pseudo Code
Max = -infinity
For p in permutations:
    V = V(L1)
    Temp_max = V.value()
    For bit in p:
        If bit:
            V = V.rightChild()
        Else:
            V = V.leftChild()
        Temp_max += V.value()
    If Temp_max > Max:
        Max = Temp_max
Return Max
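The pseudo code above can be made runnable as follows. This is a sketch under the assumption that the triangle is stored as a list of levels; the helper name `brute_force_max` and the sample data are ours, not from the paper.

```python
from itertools import product

def brute_force_max(triangle):
    """Exhaustive search: try all 2**(N-1) bit vectors, where a 0 bit
    moves to the left child and a 1 bit moves to the right child."""
    n = len(triangle)
    best = float("-inf")
    for bits in product((0, 1), repeat=n - 1):
        col, total = 0, triangle[0][0]          # restart at the apex
        for level, bit in enumerate(bits, start=1):
            col += bit
            total += triangle[level][col]
        best = max(best, total)
    return best

triangle = [[3], [7, 4], [2, 4, 6], [8, 5, 9, 3]]
print(brute_force_max(triangle))                # maximum path sum: 23
```

`itertools.product((0, 1), repeat=n - 1)` generates exactly the 2^(N-1) bit vectors described above, so the loop visits every possible path once.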
2.3. Complexity
The algorithm must iterate through N-1 bits for each of the 2^(N-1) bit vectors, yielding an overall time complexity of O((N-1) * 2^(N-1)). This makes exhaustive search extremely time consuming for triangles with more than about 15 levels, and its memory demands grow just as quickly beyond that. As an example, even if a computer could check the sum of one trillion routes a second, it would take over twenty billion years to check every route of a triangle of 100 levels.
3. Greedy Algorithm
A greedy algorithm follows a problem-solving heuristic whereby a solution is sought by
making a locally optimal decision at each stage of the problem. Paul Black (2005) states that
these algorithms “may find the globally optimal solution for some optimization problems, but
may find less-than-optimal solutions for some instances of other problems.” Because local maxima need not lie on the globally optimal path, our greedy algorithm falls into the latter category.
3.1. Implementation
For a maximum path sum problem, the most obvious implementation of a greedy algorithm
would be to forge the solution path by always moving to the current node’s child with the larger
value. Starting from the apex of the triangle, we move to an adjacent vertex with the highest
weight, and repeat until we reach the final level of the triangle. The respective sum of this path is
then an approximation of the globally optimal solution.
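The approach above can be sketched in a few lines of Python (the triangle layout and the function name are our own illustrative assumptions):

```python
def greedy_path_sum(triangle):
    """Greedy heuristic: at every level, step to whichever of the two
    adjacent children carries the larger weight."""
    col = 0
    total = triangle[0][0]
    for level in range(1, len(triangle)):
        left = triangle[level][col]
        right = triangle[level][col + 1]
        if right > left:
            col += 1                    # move right; otherwise stay left
        total += max(left, right)
    return total

# A small triangle where the locally best first move hides the optimum:
print(greedy_path_sum([[1], [2, 3], [10, 1, 1]]))   # returns 5; optimum is 13
```

The example illustrates the section's caveat: moving to the larger child 3 hides the path 1 + 2 + 10 = 13 behind the smaller child 2.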
3.2. Complexity
The benefit to this approach is that the algorithm only requires a computation of order 1
(comparing the max of two values) per each level of a triangle with N levels. The time
complexity is therefore O(N), which is the least expensive of the algorithms implemented in this
study.
Total Levels    Time (s)    Answer
50      0.000251    3385
100     0.000358    6771
150     0.000561    10061
200     0.000723    13468
250     0.001159    16658
300     0.001161    20386
350     0.001408    23471
400     0.001515    26419
450     0.001895    29712
500     0.002371    33778
550     0.002063    37935
600     0.002708    39925
650     0.003034    44088
700     0.003079    46568
750     0.003314    47690
800     0.003659    53738
850     0.004470    56808
900     0.004048    60058
950     0.004341    63446
1000    0.004175    66689
This algorithm is easy to design and implement. The drawback to the greedy algorithm is that the
returned sum is not guaranteed to be the maximum path sum. Local maxima will always be
selected, which means that values which are included in the global optimum may be “hidden” if
any one of their parent nodes was not a local maximum over its respective sibling node.
Simulated annealing, another algorithm discussed later, provides a more accurate estimation than
this approach, due to its ability to escape local optima, at the expense of a slightly higher time
complexity.
4. Dynamic Programming
Dynamic programming is an algorithm design technique that appeared in the 1950s. It was invented by the American mathematician Richard Bellman, who designed it as a general method for optimizing multistage decision processes
(Levitin, 2012, p. 283-284). Our team decided to use this algorithm because we would be able to
break down the triangle into smaller triangles (See Figure 1 below).
Figure 1
By breaking the triangle down into smaller triangles, we can find the optimal solution of each smaller subproblem. Combining all of these optimal solutions then provides the optimal solution for the entire triangle.
4.1. Implementation
The following section describes the way dynamic programming is implemented in a
single array:
[null, V1, V2, V3, …, Vn]
The length of the array is determined by the formula:

length = TL(TL + 1)/2 + 1
The first term counts the values contained within the triangle; the addition of 1 at the end accounts for the placeholder of null at index 0. This placeholder carries no value in the overall scheme of the algorithm; it only keeps the indexing one-based.
Here is a reinterpretation of the problem formalization, with adjustments for the single-array design. Let I represent the index of a vertex in the array, L its level, and TL the total number of levels. Let v represent the array of values. Let LC and RC denote a vertex's left and right child on the level below. For the vertex at index I on level L, the child indices are:
I(LC) = I + L
I(RC) = I + L + 1
The maximum path sum is then reached through the recurrence, applied level by level from the bottom of the triangle upward, in which each value absorbs the larger of its two children:
v[I] = v[I] + max(v[I(LC)], v[I(RC)])
When the apex is processed, v[1] holds the maximum path sum S(p).
4.2. Pseudo Code
L = TL - 1
while L != 0:
    for I in range(endIndex, startIndex, -1):
        if v[I(RC)] > v[I(LC)]:
            v[I] += v[I(RC)]
        else:
            v[I] += v[I(LC)]
    L -= 1
return v[1]

where startIndex and endIndex bound the indices of the values on level L.
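The pseudo code can be fleshed out as follows. This is a sketch under our own assumptions (the triangle arrives as a list of levels and is flattened into the 1-indexed array described above; the function name is ours):

```python
def dynamic_max(triangle):
    """Bottom-up dynamic programming over a flat, 1-indexed array.

    Index 0 is the null placeholder; the vertex at index I on level L
    has children at I(LC) = I + L and I(RC) = I + L + 1. Each value
    absorbs the larger of its two children, so after the top level is
    processed, index 1 holds the maximum path sum."""
    tl = len(triangle)                                   # TL, total levels
    v = [None] + [x for row in triangle for x in row]    # len = TL(TL+1)/2 + 1
    level = tl - 1
    while level != 0:
        start = level * (level - 1) // 2 + 1             # first index on level
        for i in range(start + level - 1, start - 1, -1):
            if v[i + level + 1] > v[i + level]:
                v[i] += v[i + level + 1]                 # take the right child
            else:
                v[i] += v[i + level]                     # take the left child
        level -= 1
    return v[1]

print(dynamic_max([[3], [7, 4], [2, 4, 6], [8, 5, 9, 3]]))   # 23
```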
4.3. Complexity
The big-O of this algorithm is O(N^2), where N is the total number of levels: the array holds N(N+1)/2 values, and each is visited once. This is significant because the number of routes grows exponentially; a triangle with 15 levels already has 16,384 (2^14) possible routes. In our tests, this algorithm searches a triangle with 2^999 possible routes for the greatest route in about 2.3 seconds, with 100 percent accuracy.
Total Levels    Time (s)    Answer
50      0.006372    3748
100     0.030149    7209
150     0.061810    11043
200     0.094686    14884
250     0.146335    18417
300     0.220056    22304
350     0.288344    26231
400     0.381511    29910
450     0.497373    33568
500     0.579599    37915
550     0.746548    40699
600     0.858713    44822
650     0.987607    48546
700     1.158356    52280
750     1.326067    56128
800     1.511303    59896
850     1.701190    63832
900     1.905026    67778
950     2.127929    70998
1000    2.375468    75101
A downfall of this algorithm is the amount of contiguous memory taken up by the array: its length is limited by the memory the operating system can provide, so very large data sets may be unable to run reliably. Even so, this approach is advantageous over a double-array implementation: it is slightly faster and easier to implement, since only a single array has to be managed.
5. Simulated Annealing
The simulated annealing (SA) algorithm gets its name due to its ties with the simulation
of the annealing of solids. According to van Laarhoven and Aarts (1987), annealing, in
condensed matter physics, refers to the process where a solid is heated to its maximum
temperature and then slowly cooled, according to a schedule, to create the optimal crystalline
structure. It should be no surprise, then, that the SA algorithm is an optimization technique capable of solving large combinatorial problems. Kirkpatrick, Gelatt, and Vecchi (1983) were the first to truly formalize the SA algorithm, which provided a unique perspective on how to handle optimization problems. In their article, Kirkpatrick et al. reveal how the SA algorithm provides:
“…efficient techniques for finding minimum or maximum values of a function of very many independent variables” (p. 671). SA has been shown to work very well on traditionally complicated problems, like the traveling salesman problem, by providing a way to escape local minima, an issue that many other algorithms were not able to overcome. Due to its ability to
solve large problems, with close to optimal solutions, our team decided to implement this
algorithm to solve the maximum path sum problem.
5.1. Implementation
In order to implement SA to solve our problem, we need to understand how the algorithm
operates. Pulling from the ties to metallurgy, we set a maximum temperature that is lowered, according to a cooling schedule, until it reaches a frozen state close to zero. At high
temperatures, the algorithm behaves more randomly, similar to the random configuration of
atoms at high temperatures. As we slowly lower the temperature, or cool it, the algorithm
becomes less random and begins to behave like the hill climbing algorithm.
We will begin by choosing an initial solution path, p, and then evaluate S(p), which is the
combined weight of all the vertices in our initial solution path. The initial solution path is set as
our current configuration until we can find a better one. Next, we will choose a neighboring path
to our current one, and evaluate the difference in weights between them. A path is a neighbor to
our current configuration if it has only one, random change in the configuration. We will refer to
the difference in weights between our current solution path and its neighboring path as ΔE:

ΔE = S(pn) − S(pc)

In this equation, S(pn) refers to the weight of our neighboring configuration and S(pc) refers to the weight of our current configuration. We will use ΔE to decide whether we should keep our current configuration or set the neighboring configuration as our current one. If ΔE ≥ 0, we will accept the neighboring configuration as our current one. If ΔE < 0, we still might change to the next solution, which is what allows SA to behave randomly. We decide whether to accept
the non-optimal solution using the sigmoid function, where T represents our current temperature:

Probability(pn, pc) = 1 / (1 + e^(−ΔE/T))

Next, we will choose a random decimal between 0 and 1 and check whether Probability(pn, pc) > rand(0, 1).
Here is the effect of the sigmoid function. Suppose our temperature is equal to 10 and S(pc) = 107.

S(pn)    ΔE     e^(−ΔE/T)    Probability(pn, pc)
80       −27    14.88        0.06
100      −7     2.01         0.33
107      0      1.00         0.50
120      13     0.27         0.78
150      43     0.01         0.99

If the move is bad, Probability(pn, pc) will be low, but if the move is good, the probability will be high. As the table shows, the probability always falls between zero and one, which is where the comparison to a random decimal between zero and one comes into play. The better the move, the more likely Probability(pn, pc) will be greater than rand(0, 1), and thus the more likely we will make that move. The worse the solution is, the lower the probability of moving to it.
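The acceptance rule is easy to check numerically. This is a minimal sketch (the function name is ours) that recomputes the worked example with T = 10 and S(pc) = 107; the printed probabilities agree with the values above to within rounding.

```python
import math

def acceptance_probability(delta_e, temperature):
    """Sigmoid acceptance rule: 1 / (1 + e^(-dE/T))."""
    return 1.0 / (1.0 + math.exp(-delta_e / temperature))

# Reproduce the worked example with T = 10 and S(pc) = 107:
for s_pn in (80, 100, 107, 120, 150):
    delta_e = s_pn - 107
    print(s_pn, round(acceptance_probability(delta_e, 10.0), 2))
```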
Now that we have decided whether or not to make the move, we either keep our current configuration or set it equal to the neighboring configuration. We repeat this process a preset number of times and then lower our temperature, because as the temperature decreases, so does our chance of moving to a random, worse configuration. We lower it according to our cooling schedule, which can be extremely complex or relatively simple. For this project, we chose a simple schedule: multiply the current temperature by a static cooling rate between zero and one. The cooling rate we chose is 0.97. We then repeat the entire process as long as our temperature is greater than our minimum allowed temperature, also called the frozen state.
5.2. Pseudo Code
Choose pc
Evaluate S(pc)
while Temperature >= frozenState:
    iterations = 0
    while iterations < maxIterations:
        Choose pn
        Evaluate S(pn)
        Evaluate ΔE = S(pn) - S(pc)
        if ΔE >= 0 or 1 / (1 + e^(-ΔE / T)) > random(0, 1):
            pc = pn
            S(pc) = S(pn)
        iterations += 1
    Temperature *= coolingRate
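Putting the pieces together, here is one possible end-to-end sketch in Python. The triangle layout, helper names, and the default temperature, frozen state, and iteration count are our own illustrative assumptions; only the 0.97 cooling rate comes from the text.

```python
import math
import random

def simulated_annealing(triangle, temperature=10000.0, frozen_state=0.001,
                        cooling_rate=0.97, max_iterations=100):
    """Estimate the maximum path sum with the SA scheme described above."""
    n = len(triangle)
    if n == 1:
        return triangle[0][0]

    def path_sum(moves):                      # S(p) for a 0/1 move vector
        col, total = 0, triangle[0][0]
        for level, move in enumerate(moves, start=1):
            col += move
            total += triangle[level][col]
        return total

    current = [random.randint(0, 1) for _ in range(n - 1)]   # initial pc
    score = path_sum(current)
    best = score
    while temperature >= frozen_state:
        for _ in range(max_iterations):
            neighbor = current[:]             # pn: flip one random move
            i = random.randrange(n - 1)
            neighbor[i] ^= 1
            delta_e = path_sum(neighbor) - score
            exponent = -delta_e / temperature
            # Accept better moves outright; accept worse moves with the
            # sigmoid probability 1 / (1 + e^(-dE/T)).
            if delta_e >= 0 or (exponent < 700 and
                                1.0 / (1.0 + math.exp(exponent))
                                > random.random()):
                current, score = neighbor, score + delta_e
                best = max(best, score)
        temperature *= cooling_rate
    return best

print(simulated_annealing([[3], [7, 4], [2, 4, 6], [8, 5, 9, 3]]))
```

The `exponent < 700` guard simply avoids floating-point overflow at very low temperatures, where the acceptance probability is effectively zero anyway.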
5.3. Complexity
Simulated annealing goes through O(log N) temperature steps for a triangle of N levels. Initial path selection is done in O(N); generating a neighboring path is O(1), but evaluating its path sum is O(N), and comparing and accepting a configuration change is O(1). With a constant number of iterations per temperature step, the total complexity is therefore O(N log N).
Total Levels    Time (s)    Answer
50      0.032269    2758
100     0.040408    4907
150     0.057557    7696
200     0.073300    9696
250     0.085859    12521
300     0.094135    15280
350     0.115234    17138
400     0.127033    20105
450     0.135507    23047
500     0.158376    24672
550     0.161347    27000
600     0.175320    29706
650     0.201472    32545
700     0.216549    34759
750     0.219645    36807
800     0.234253    40361
850     0.253339    40782
900     0.258885    45922
950     0.278723    48559
1000    0.289096    49925
6. Results & Comparison of Algorithms
By examining how all of the algorithms performed on several different triangles, we are able to determine with confidence which is best suited to this problem. When an exact answer is sought, dynamic programming is the optimal choice over exhaustive search. Exhaustive search cannot solve a triangle with 200 levels without requiring more memory than most machines can handle, and it began taking more than ten minutes per run after about 14 levels. Dynamic programming, on the other hand, solves the same triangles exactly, producing the sum for 1,000 levels in approximately two seconds. For extremely large data sets where an approximate answer is acceptable, simulated annealing overtakes the greedy algorithm in general accuracy, but the greedy algorithm completes in quicker, linear time.
Levels  Simulated Annealing (s)    Dynamic (s)    Greedy (s)
50      0.032269    0.006372    0.000251
100     0.040408    0.030149    0.000358
150     0.057557    0.061810    0.000561
200     0.073300    0.094686    0.000723
250     0.085859    0.146335    0.001159
300     0.094135    0.220056    0.001161
350     0.115234    0.288344    0.001408
400     0.127033    0.381511    0.001515
450     0.135507    0.497373    0.001895
500     0.158376    0.579599    0.002371
550     0.161347    0.746548    0.002063
600     0.175320    0.858713    0.002708
650     0.201472    0.987607    0.003034
700     0.216549    1.158356    0.003079
750     0.219645    1.326067    0.003314
800     0.234253    1.511303    0.003659
850     0.253339    1.701190    0.004470
900     0.258885    1.905026    0.004048
950     0.278723    2.127929    0.004341
1000    0.289096    2.375468    0.004175
[Figure: Comparison of running time (seconds) versus triangle levels, 50 through 1000, for simulated annealing, dynamic programming, and the greedy algorithm.]
Exhaustive search would likely take over a few hours to compute even at 50 levels, and is thus not listed in the graph above, where the maximum time is under 2.5 seconds.
Depending on the user's needs, each remaining algorithm has its pros and cons, excepting exhaustive search, which has no demonstrated benefit over dynamic programming: simulated annealing offers a big-O of N log(N) and a closer approximation, while the greedy algorithm offers a big-O of N with an approximation that may be far off. By analyzing and comparing different algorithms, finding the optimal solution to a problem that at first seemed daunting is made easier. We were able to determine with confidence the efficiency of each algorithm by comparing its time to complete and the accuracy of the solution it provided.
References
1. Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671-680. Retrieved November 29, 2017, from http://www.jstor.org/stable/1690046
2. van Laarhoven, P. J., & Aarts, E. (1987). Simulated annealing: Theory and applications. Dordrecht: Kluwer Academic.
3. Levitin, A. (2012). Dynamic programming. In Introduction to the design & analysis of algorithms (pp. 283-285). Boston, MA: Pearson.
4. Black, P. E. (2005, February 2). Greedy algorithm. In V. Pieterse & P. E. Black (Eds.), Dictionary of Algorithms and Data Structures [online].