MAXIMUM PATH SUM 1
Maximum Path Sum of a Triangle
Design & Analysis of Optimization Algorithms
William Jacobs, Samnang Pann and Andrew Mason
30 November 2017
University of North Carolina Wilmington
Abstract
This maximum path sum problem involves a triangle of numeric values where each
value, except those on the final level, has an adjacent relationship to the two values on the level
below. A path can move from one value to only one of the adjacent values below. The sum of all
values located on a single path from the top level of the triangle to the bottom level of a triangle
is called a path sum. The goal of this problem is finding the maximum path sum. We sought to
accomplish this through the design and implementation of four different types of algorithms:
brute force, dynamic programming, simulated annealing, and a generic greedy algorithm. After
the analysis and comparison of each algorithm, dynamic programming proved to find the exact
solution in a reasonable amount of time; it can, for example, find the maximum path sum in a
triangle with 2^999 possible paths (1,000 levels) in about 2.3 seconds. For an estimated value in a larger triangle, simulated annealing demands less time than dynamic programming and generally returns a more accurate approximation than the greedy algorithm. Exhaustive search
returns the exact solution through brute force, but does so in exponential time, rendering it very
computationally expensive for triangles of more than a dozen levels.
Key Words: levels, path, path sum, dynamic programming, simulated annealing,
algorithms, greedy algorithms, exhaustive search
1. Introduction
The goal of the triangle maximum path sum problem is to find what path must be taken in
the triangle, from top to bottom, that yields the maximum total amongst all the paths. To find the
maximum path sum, we begin at the apex and move to one of the adjacent values on the level
below, until we reach the bottom level of the triangle. Using optimization algorithms, the sequence of vertices that yields the maximum path sum can be determined from the weights associated with each vertex.
Formally, let T be a triangle representation of a directed, connected graph. Let V refer to a vertex and W(V) to the weight of V. Let Li denote level i of the triangle, so that VLi refers to a vertex on level Li. Let p represent an ordered sequence of vertices along a path:
p = (VL1, VL2, VL3, ..., VLn)
Let S(p) represent the sum of the weights along p:
S(p) = Σ_{i=1}^{n} W(VLi)
Let P represent the set of path sums over all paths p1, p2, ..., pm:
P = {S(p1), S(p2), S(p3), ..., S(pm)}
Problem: Given T, find the maximum S(p) in P.
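To make the formalization concrete, here is a minimal sketch in Python (the triangle values and the sample path are invented for illustration): a triangle is stored level by level, a path p is encoded by its sequence of left/right moves, and S(p) is the sum of the weights visited.

```python
# Illustrative triangle of weights; row i holds the i+1 vertices on level i+1.
triangle = [
    [3],
    [7, 4],
    [2, 4, 6],
    [8, 5, 9, 3],
]

def path_sum(triangle, moves):
    """Evaluate S(p): sum the weights along a path described by
    0 (move to the left child) / 1 (move to the right child)."""
    col = 0
    total = triangle[0][0]              # start at the apex, VL1
    for level, move in enumerate(moves, start=1):
        col += move                     # 0 keeps the column, 1 shifts right
        total += triangle[level][col]
    return total

print(path_sum(triangle, [1, 0, 1]))    # S(p) for p = (3, 4, 4, 9), i.e. 20
```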
2. Brute Force – Exhaustive Search
One method to find the global optimum is with brute force. This algorithm simply checks
every possible path and returns the maximum sum. Every vertex in a triangle of N levels may
move in either of two directions, except for the vertices located on the Nth level which mark the
end of a path. There are therefore 2^(N-1) possible paths in a triangle of N levels.
2.1. Implementation
A convenient way of representing a single path is with a bit vector of length N-1. The path is
described by this vector as such: the path begins at the apex of the triangle, and moves as
according to the vector bit by bit, where a 0 bit indicates a move to the left adjacent vertex, and a
1 bit indicates a move to the right adjacent vertex, descending levels until a vertex is reached on
the Nth level. Since there are 2^(N-1) unique bit vectors of length N-1, the set of bit vectors maps one-to-one onto the set of possible paths in a triangle of N levels. The solution
is found by navigating each path and detecting the maximum path sum.
2.2. Pseudo Code
Max = -infinity
For p in permutations:
    V = V(L1)
    Temp_max = V.value()
    For bit in p:
        If bit:
            V = V.rightChild()
        Else:
            V = V.leftChild()
        Temp_max += V.value()
    If Temp_max > Max:
        Max = Temp_max
Return Max
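The pseudo code above can be made runnable as follows. This is a sketch under the assumption that the triangle is stored as a list of levels; the helper name `brute_force_max` and the sample data are ours, not from the paper.

```python
from itertools import product

def brute_force_max(triangle):
    """Exhaustive search: try all 2**(N-1) bit vectors, where a 0 bit
    moves to the left child and a 1 bit moves to the right child."""
    n = len(triangle)
    best = float("-inf")
    for bits in product((0, 1), repeat=n - 1):
        col, total = 0, triangle[0][0]          # restart at the apex
        for level, bit in enumerate(bits, start=1):
            col += bit
            total += triangle[level][col]
        best = max(best, total)
    return best

triangle = [[3], [7, 4], [2, 4, 6], [8, 5, 9, 3]]
print(brute_force_max(triangle))                # maximum path sum: 23
```

`itertools.product((0, 1), repeat=n - 1)` generates exactly the 2^(N-1) bit vectors described above, so the loop visits every possible path once.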
2.3. Complexity
The algorithm must iterate through N-1 bits for each of the 2^(N-1) bit vectors, yielding an overall time complexity of O((N-1) * 2^(N-1)). This makes exhaustive search extremely time consuming for triangles with more than about 15 levels, and its memory demands grow just as quickly beyond that. As an example, even if a computer could check the sum of one trillion routes a second, it would take over twenty billion years to check every route of a triangle of 100 levels.
3. Greedy Algorithm
A greedy algorithm follows a problem-solving heuristic whereby a solution is sought by
making a locally optimal decision at each stage of the problem. Paul Black (2005) states that
these algorithms “may find the globally optimal solution for some optimization problems, but
may find less-than-optimal solutions for some instances of other problems.” Because local maxima need not lie on the globally optimal path, our greedy algorithm falls into the latter category.
3.1. Implementation
For a maximum path sum problem, the most obvious implementation of a greedy algorithm
would be to forge the solution path by always moving to the current node’s child with the larger
value. Starting from the apex of the triangle, we move to an adjacent vertex with the highest
weight, and repeat until we reach the final level of the triangle. The respective sum of this path is
then an approximation of the globally optimal solution.
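The approach above can be sketched in a few lines of Python (the triangle layout and the function name are our own illustrative assumptions):

```python
def greedy_path_sum(triangle):
    """Greedy heuristic: at every level, step to whichever of the two
    adjacent children carries the larger weight."""
    col = 0
    total = triangle[0][0]
    for level in range(1, len(triangle)):
        left = triangle[level][col]
        right = triangle[level][col + 1]
        if right > left:
            col += 1                    # move right; otherwise stay left
        total += max(left, right)
    return total

# A small triangle where the locally best first move hides the optimum:
print(greedy_path_sum([[1], [2, 3], [10, 1, 1]]))   # returns 5; optimum is 13
```

The example illustrates the section's caveat: moving to the larger child 3 hides the path 1 + 2 + 10 = 13 behind the smaller child 2.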
3.2. Complexity
The benefit to this approach is that the algorithm only requires a computation of order 1
(comparing the max of two values) per each level of a triangle with N levels. The time
complexity is therefore O(N), which is the least expensive of the algorithms implemented in this
study.
Total Levels    Time (s)    Answer
50      0.000251    3385
100     0.000358    6771
150     0.000561    10061
200     0.000723    13468
250     0.001159    16658
300     0.001161    20386
350     0.001408    23471
400     0.001515    26419
450     0.001895    29712
500     0.002371    33778
550     0.002063    37935
600     0.002708    39925
650     0.003034    44088
700     0.003079    46568
750     0.003314    47690
800     0.003659    53738
850     0.004470    56808
900     0.004048    60058
950     0.004341    63446
1000    0.004175    66689
This algorithm is easy to design and implement. The drawback to the greedy algorithm is that the
returned sum is not guaranteed to be the maximum path sum. Local maxima will always be
selected, which means that values which are included in the global optimum may be “hidden” if
any one of their parent nodes was not a local maximum over its respective sibling node.
Simulated annealing, another algorithm discussed later, provides a more accurate estimation than
this approach, due to its ability to escape local optima, at the expense of a slightly higher time
complexity.
4. Dynamic Programming
Dynamic programming is an algorithm design technique that appeared in the 1950s. It was invented by the American mathematician Richard Bellman, who designed it as a general method for optimizing multistage decision processes
(Levitin, 2012, p. 283-284). Our team decided to use this algorithm because we would be able to
break down the triangle into smaller triangles (See Figure 1 below).
Figure 1
By breaking the triangle down into smaller triangles, we can find the optimal solution of each smaller subproblem. Combining all of these optimal solutions then provides the optimal solution for the entire triangle.
4.1. Implementation
The following section describes the way dynamic programming is implemented in a
single array:
[null, V1, V2, V3, …, Vn]
The length of the array is determined by the formula:

length = TL(TL + 1)/2 + 1
The first term counts the values contained within the triangle; the addition of 1 at the end accounts for the placeholder of null at index 0. This placeholder carries no value in the overall scheme of the algorithm; it only keeps the indexing one-based.
Here is a reinterpretation of the problem formalization, with adjustments for the single-array design. Let I represent the index of a vertex in the array, L its level, and TL the total number of levels. Let v represent the array of values. Let LC and RC denote a vertex's left and right child on the level below. For the vertex at index I on level L, the child indices are:
I(LC) = I + L
I(RC) = I + L + 1
The maximum path sum is then reached through the recurrence, applied level by level from the bottom of the triangle upward, in which each value absorbs the larger of its two children:
v[I] = v[I] + max(v[I(LC)], v[I(RC)])
When the apex is processed, v[1] holds the maximum path sum S(p).
4.2. Pseudo Code
L = TL - 1
while L != 0:
    for I in range(endIndex, startIndex, -1):
        if v[I(RC)] > v[I(LC)]:
            v[I] += v[I(RC)]
        else:
            v[I] += v[I(LC)]
    L -= 1
return v[1]

where startIndex and endIndex bound the indices of the values on level L.
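The pseudo code can be fleshed out as follows. This is a sketch under our own assumptions (the triangle arrives as a list of levels and is flattened into the 1-indexed array described above; the function name is ours):

```python
def dynamic_max(triangle):
    """Bottom-up dynamic programming over a flat, 1-indexed array.

    Index 0 is the null placeholder; the vertex at index I on level L
    has children at I(LC) = I + L and I(RC) = I + L + 1. Each value
    absorbs the larger of its two children, so after the top level is
    processed, index 1 holds the maximum path sum."""
    tl = len(triangle)                                   # TL, total levels
    v = [None] + [x for row in triangle for x in row]    # len = TL(TL+1)/2 + 1
    level = tl - 1
    while level != 0:
        start = level * (level - 1) // 2 + 1             # first index on level
        for i in range(start + level - 1, start - 1, -1):
            if v[i + level + 1] > v[i + level]:
                v[i] += v[i + level + 1]                 # take the right child
            else:
                v[i] += v[i + level]                     # take the left child
        level -= 1
    return v[1]

print(dynamic_max([[3], [7, 4], [2, 4, 6], [8, 5, 9, 3]]))   # 23
```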
4.3. Complexity
The big-O of this algorithm is O(N^2), where N is the total number of levels: the array holds N(N+1)/2 values, and each is visited once. This is significant because the number of routes grows exponentially; a triangle with 15 levels already has 16,384 (2^14) possible routes. In our tests, this algorithm searches a triangle with 2^999 possible routes for the greatest route in about 2.3 seconds, with 100 percent accuracy.
Total Levels    Time (s)    Answer
50      0.006372    3748
100     0.030149    7209
150     0.061810    11043
200     0.094686    14884
250     0.146335    18417
300     0.220056    22304
350     0.288344    26231
400     0.381511    29910
450     0.497373    33568
500     0.579599    37915
550     0.746548    40699
600     0.858713    44822
650     0.987607    48546
700     1.158356    52280
750     1.326067    56128
800     1.511303    59896
850     1.701190    63832
900     1.905026    67778
950     2.127929    70998
1000    2.375468    75101
A downfall of this algorithm is the amount of contiguous memory taken up by the array: its length is limited by the memory the operating system can provide, so very large data sets may be unable to run reliably. Even so, this approach is advantageous over a double-array implementation: it is slightly faster and easier to implement, since only a single array has to be managed.
5. Simulated Annealing
The simulated annealing (SA) algorithm gets its name due to its ties with the simulation
of the annealing of solids. According to van Laarhoven and Aarts (1987), annealing, in
condensed matter physics, refers to the process where a solid is heated to its maximum
temperature and then slowly cooled, according to a schedule, to create the optimal crystalline
structure. It should be no surprise, then, that the SA algorithm is an optimization technique capable of solving large combinatorial problems. Kirkpatrick, Gelatt, and Vecchi (1983) were the first to truly formalize the SA algorithm, which provided a unique perspective on how to handle optimization problems. In their article, Kirkpatrick et al. reveal how the SA algorithm provides:
“…efficient techniques for finding minimum or maximum values of a function of very many independent variables” (p. 671). SA has been shown to work very well on traditionally complicated problems, like the traveling salesman problem, by providing a way to escape local minima, an issue that many other algorithms were not able to overcome. Due to its ability to
solve large problems, with close to optimal solutions, our team decided to implement this
algorithm to solve the maximum path sum problem.
5.1. Implementation
In order to implement SA to solve our problem, we need to understand how the algorithm
operates. Pulling from the ties to metallurgy, we set a maximum temperature that is lowered, according to a cooling schedule, until it reaches a frozen state close to zero. At high
temperatures, the algorithm behaves more randomly, similar to the random configuration of
atoms at high temperatures. As we slowly lower the temperature, or cool it, the algorithm
becomes less random and begins to behave like the hill climbing algorithm.
We will begin by choosing an initial solution path, p, and then evaluate S(p), which is the
combined weight of all the vertices in our initial solution path. The initial solution path is set as
our current configuration until we can find a better one. Next, we will choose a neighboring path
to our current one, and evaluate the difference in weights between them. A path is a neighbor to
our current configuration if it has only one, random change in the configuration. We will refer to
the difference in weights between our current solution path and its neighboring path as ΔE:

ΔE = S(pn) − S(pc)

In this equation, S(pn) refers to the weight of our neighboring configuration and S(pc) refers to the weight of our current configuration. We will use ΔE to decide whether we should keep our current configuration or set the neighboring configuration as our current one. If ΔE ≥ 0, we will accept the neighboring configuration as our current one. If ΔE < 0, we still might change to the next solution, which is what allows SA to behave randomly. We decide whether to accept
the non-optimal solution using the sigmoid function, where T represents our current temperature:

Probability(pn, pc) = 1 / (1 + e^(−ΔE/T))

Next, we will choose a random decimal between 0 and 1 and check whether Probability(pn, pc) > rand(0, 1).
Here is the effect of the sigmoid function. Suppose our temperature is equal to 10 and S(pc) = 107.

S(pn)    ΔE     e^(−ΔE/T)    Probability(pn, pc)
80       −27    14.88        0.06
100      −7     2.01         0.33
107      0      1.00         0.50
120      13     0.27         0.78
150      43     0.01         0.99

If the move is bad, Probability(pn, pc) will be low, but if the move is good, the probability will be high. As the table shows, the probability always falls between zero and one, which is where the comparison to a random decimal between zero and one comes into play. The better the move, the more likely Probability(pn, pc) will be greater than rand(0, 1), and thus the more likely we will make that move. The worse the solution is, the lower the probability of moving to it.
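The acceptance rule is easy to check numerically. This is a minimal sketch (the function name is ours) that recomputes the worked example with T = 10 and S(pc) = 107; the printed probabilities agree with the values above to within rounding.

```python
import math

def acceptance_probability(delta_e, temperature):
    """Sigmoid acceptance rule: 1 / (1 + e^(-dE/T))."""
    return 1.0 / (1.0 + math.exp(-delta_e / temperature))

# Reproduce the worked example with T = 10 and S(pc) = 107:
for s_pn in (80, 100, 107, 120, 150):
    delta_e = s_pn - 107
    print(s_pn, round(acceptance_probability(delta_e, 10.0), 2))
```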
Now that we have decided whether or not to make the move, we either keep our current configuration or set it equal to the neighboring configuration. We repeat this process a preset number of times and then lower our temperature, because as the temperature decreases, so does our chance of moving to a random, worse configuration. We lower it according to our cooling schedule, which can be extremely complex or relatively simple. For this project, we chose a simple schedule: multiply the current temperature by a static cooling rate between zero and one. The cooling rate we chose is 0.97. We then repeat the entire process as long as our temperature is greater than our minimum allowed temperature, also called the frozen state.
5.2. Pseudo Code
Choose pc
Evaluate S(pc)
while Temperature >= frozenState:
    iterations = 0
    while iterations < maxIterations:
        Choose pn
        Evaluate S(pn)
        Evaluate ΔE = S(pn) - S(pc)
        if ΔE >= 0 or 1 / (1 + e^(-ΔE / T)) > random(0, 1):
            pc = pn
            S(pc) = S(pn)
        iterations += 1
    Temperature *= coolingRate
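Putting the pieces together, here is one possible end-to-end sketch in Python. The triangle layout, helper names, and the default temperature, frozen state, and iteration count are our own illustrative assumptions; only the 0.97 cooling rate comes from the text.

```python
import math
import random

def simulated_annealing(triangle, temperature=10000.0, frozen_state=0.001,
                        cooling_rate=0.97, max_iterations=100):
    """Estimate the maximum path sum with the SA scheme described above."""
    n = len(triangle)
    if n == 1:
        return triangle[0][0]

    def path_sum(moves):                      # S(p) for a 0/1 move vector
        col, total = 0, triangle[0][0]
        for level, move in enumerate(moves, start=1):
            col += move
            total += triangle[level][col]
        return total

    current = [random.randint(0, 1) for _ in range(n - 1)]   # initial pc
    score = path_sum(current)
    best = score
    while temperature >= frozen_state:
        for _ in range(max_iterations):
            neighbor = current[:]             # pn: flip one random move
            i = random.randrange(n - 1)
            neighbor[i] ^= 1
            delta_e = path_sum(neighbor) - score
            exponent = -delta_e / temperature
            # Accept better moves outright; accept worse moves with the
            # sigmoid probability 1 / (1 + e^(-dE/T)).
            if delta_e >= 0 or (exponent < 700 and
                                1.0 / (1.0 + math.exp(exponent))
                                > random.random()):
                current, score = neighbor, score + delta_e
                best = max(best, score)
        temperature *= cooling_rate
    return best

print(simulated_annealing([[3], [7, 4], [2, 4, 6], [8, 5, 9, 3]]))
```

The `exponent < 700` guard simply avoids floating-point overflow at very low temperatures, where the acceptance probability is effectively zero anyway.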
5.3. Complexity
Simulated annealing goes through O(log N) temperature steps for a triangle of N levels. Initial path selection is done in O(N); generating a neighboring path is O(1), but evaluating its path sum is O(N), and comparing and accepting a configuration change is O(1). With a constant number of iterations per temperature step, the total complexity is therefore O(N log N).
Total Levels    Time (s)    Answer
50      0.032269    2758
100     0.040408    4907
150     0.057557    7696
200     0.073300    9696
250     0.085859    12521
300     0.094135    15280
350     0.115234    17138
400     0.127033    20105
450     0.135507    23047
500     0.158376    24672
550     0.161347    27000
600     0.175320    29706
650     0.201472    32545
700     0.216549    34759
750     0.219645    36807
800     0.234253    40361
850     0.253339    40782
900     0.258885    45922
950     0.278723    48559
1000    0.289096    49925
6. Results & Comparison of Algorithms
By examining how all of the algorithms performed on several different triangles, we are able to determine with confidence which is best suited to this problem. When an exact answer is sought, dynamic programming is the optimal choice over exhaustive search. Exhaustive search cannot solve a triangle with 200 levels without requiring more memory than most machines can handle, and it began taking more than ten minutes per run after about 14 levels. Dynamic programming, on the other hand, solves the same triangles exactly, producing the sum for 1,000 levels in approximately two seconds. For extremely large data sets where an approximate answer is acceptable, simulated annealing overtakes the greedy algorithm in general accuracy, but the greedy algorithm completes in quicker, linear time.
Levels  Simulated Annealing (s)    Dynamic (s)    Greedy (s)
50      0.032269    0.006372    0.000251
100     0.040408    0.030149    0.000358
150     0.057557    0.061810    0.000561
200     0.073300    0.094686    0.000723
250     0.085859    0.146335    0.001159
300     0.094135    0.220056    0.001161
350     0.115234    0.288344    0.001408
400     0.127033    0.381511    0.001515
450     0.135507    0.497373    0.001895
500     0.158376    0.579599    0.002371
550     0.161347    0.746548    0.002063
600     0.175320    0.858713    0.002708
650     0.201472    0.987607    0.003034
700     0.216549    1.158356    0.003079
750     0.219645    1.326067    0.003314
800     0.234253    1.511303    0.003659
850     0.253339    1.701190    0.004470
900     0.258885    1.905026    0.004048
950     0.278723    2.127929    0.004341
1000    0.289096    2.375468    0.004175
[Figure: Comparison of running time (seconds) versus triangle levels, 50 through 1000, for simulated annealing, dynamic programming, and the greedy algorithm.]
Exhaustive search would likely take over a few hours to compute even at 50 levels, and is thus not listed in the graph above, where the maximum time is under 2.5 seconds.
Depending on the user's needs, each remaining algorithm has its pros and cons, excepting exhaustive search, which has no demonstrated benefit over dynamic programming: simulated annealing offers a big-O of N log(N) and a closer approximation, while the greedy algorithm offers a big-O of N with an approximation that may be far off. By analyzing and comparing different algorithms, finding the optimal solution to a problem that at first seemed daunting is made easier. We were able to determine with confidence the efficiency of each algorithm by comparing its time to complete and the accuracy of the solution it provided.
References
1. Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671-680. Retrieved November 29, 2017, from http://www.jstor.org/stable/1690046
2. van Laarhoven, P. J., & Aarts, E. (1987). Simulated annealing: Theory and applications. Dordrecht: Kluwer Academic.
3. Levitin, A. (2012). Dynamic programming. In Introduction to the design & analysis of algorithms (pp. 283-285). Boston, MA: Pearson.
4. Black, P. E. (2005, February 2). Greedy algorithm. In V. Pieterse & P. E. Black (Eds.), Dictionary of Algorithms and Data Structures [online].