introduction to computer science and programming for ...szapudi/astro735/lecture3.pdf ·...
TRANSCRIPT
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Introduction to Computer Science andProgramming for Astronomers
Lecture 3.
István Szapudi
Institute for AstronomyUniversity of Hawaii
January 30, 2020
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Outline
1 Reminder
2 Basic Data Structures: Continued
3 Algorithmic AnalysisFibonacci
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
ReminderWhere were we last time...
We have looked at testing and debugging.Elements of object oriented design.We started to look at data structuresThe two basic ones we finished were stacks and queues,next continue with binary trees.FAQ: how are lists implemented? A: array of C-pointers inthe reference CPython implementation. (There are otherimplementations, e.g. ironpython, pypy, etc, they all shouldwork the same, we take the “black box” approach)
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Binary TreesThis is the most basic and useful tree data structure
Graphs are networks consisting of nodes connected byedges or arcs.Trees are graphs which have no loopsA simple linked list can be thought of as a graph or a tree(sometimes called “snake topology” in tree graphslanguage)Binary trees are trees with two branchesRoot: the top of the tree. Leaves: tips with null references.(Sometimes parent/childre/siblings is used).
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Binary TreeThis tree is upside down!
Note the similarity of construction with linked lists.The “leafs” are marked with None.The binary nature would be easy to generalize.
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
python implementationIt cannot get easier then python ...
class Tree:def __init__(self, cargo, left=None, right=None):self.cargo = cargoself.left = leftself.right = right
def __str__(self):return str(self.cargo)
tree=Tree(1,Tree(2),Tree(3))
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Traversing a treeSimply take left and right
def totalTree(tree):if tree == None: return 0return totalTree(tree.left) +
totalTree(tree.right) +tree.cargo
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Traversing a treeSimply take left and right
Python 2.3.3 (#1, May 7 2004, 10:31:40)[GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2Type "help", "copyright", "credits" or "license" for more information.>>> import lecture2>>> Tree = lecture2.Tree>>> tree = Tree(1,Tree(2),Tree(3))>>> lecture2.totalTree(tree)6
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Traversing a treeE.g. preorder
def printTreePreorder(tree):if tree == None: returnprint tree.cargo,printTreePreorder(tree.left)printTreePreorder(tree.right)
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
An Expression TreeA simple example of a common application.
The following represents 1 + 2 ∗ 3
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Traversing an expression treePreorder (prefix), postorder (postfix) , inorder (infix)
>>> tree = Tree(’+’, Tree(1),Tree(’*’, Tree(2), Tree(3)))
>>> lecture2.printTreePreorder(tree)+ 1 * 2 3>>> lecture2.printTreePostorder(tree)1 2 3 * +>>> lecture2.printTreeInorder(tree)1 + 2 * 3
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
What’s left outA lot...
I only covered thre three most basic data structures. Some ofthe important structures you might encounter are
General GraphsOrdered Lists and Sorted ListsHashing, Hash Tables, and Scatter Tables (usepythondictionaries)General Trees/Search TreesHeaps and Priority QueuesSets, Multisets, and Partitions (python2.4 > has nowdirect support for sets)
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
Analysis of Algorithms
The algorithm is the basic idea or outline behind aprogram.We express algorithms in python in this class but youshould be able to translate any algorithm, to code in anylanguage via easy (and mechanical) steps (or you couldwrite a cross compiler :-)We will talk about both the design and the analysis ofalgorithms
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
Why Analysing of Algorithms?
Analysis is more reliable than experimentation.It helps choosing among different solutions to problems.Predict the performance of a program before coding(especially important for large projects).By analyzing an algorithm, we gain a better understandingof where the fast and slow parts are, and what to work onor work around in order to speed it up.
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
The Fibonacci NumbersA classic problem to introduce algorithms.
Problem: calculate the Fibonacci numbers. They are defined bythe following recursion:
Fn = Fn−1 + Fn−2 (1)
with initial conditions F0 = F1 = 1.
The orginal problem is a toy model for populationdynamics.This is simple example is rich enough to demonstratemany aspects of algorithmic analysis.
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
The Fibonacci NumbersMathematical Analysis
This is second order, homogenous, linear recursion.Linear recursions (difference equations) are a a bit likelinear differential equations. For this one, we can look forthe solution in the form of Fn = qn. This gives rise to thefollowing characteristic equation:
qn = qn−1 + qn−2 → (q 6= 0)q2 = q + 1 (2)
The solution: q± = 1±√
52 (golden ratio)
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
The Fibonacci NumbersInitial Conditions
Since the equation is linear, superposition of solutions is asolution.We need to use the initial conditions, (second order needstwo condition), assuming Fn = aqn
+ + bqn−
a + b = 1aq+ + bq− = 1 (3)
The solution:
a =1− q−
q+ − q−, b =
q+ − 1q+ − q−
(4)
Note that you can solve all equations of the formcFn + dFn−1 + Fn−2 = 0 exactly the same way.
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
The Fibonacci NumbersLet’s write a class!
class Fibonacci:
def __init__(self):self.qp = 0.5+math.sqrt(5.0)/2.0self.qm = 0.5-math.sqrt(5.0)/2.0dq = self.qp-self.qmself.a = (1.0-self.qm)/dqself.b = (self.qp-1.0)/dq
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
The Fibonacci NumbersPhysicist’s solution
class Fibonacci:...
def fibQ(self,n):"""arithmetic"""return self.a*self.qp**n+
self.b*self.qm**n
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
Is this accurate enough?
>>> import lecture3>>> f = lecture3.Fibonacci()>>> f.fibQ(100)5.7314784401381897e+20>>> f.fibNonRecS(100) # we’ll see this later573147844013817084101L
For many purposes probably...but what if want accurate, integerarithmetic?
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
An Archetypal Recursive AlgorithmTogether with factorial...
def fibRec(self,n):"""simple recursive version"""if n == 0 or n == 1:
return 1else:
return self.fibRec(n-1) + self.fibRec(n-2)
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
Is this fast enough?How do we compare the efficiency of algorithms?
If n > 1 this algorithm executes 2 lines, and then itselftwice, simbolically
tn = 2 + tn−1 + tn−2 (5)
This can be visualized as a binary tree.This is almost the same equation (it is inhomogenous, tosolve it we would need to find a particular solution, e.g.with the method of “varying the constants”.)Even without solving it, it is clear that it will giveexponential solution tn ' const .n. This can be very slow.(try to run it for n = 40).
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
Dynamic ProgrammingThis is a useful paradigm to find efficient algorithms
The problem with the above algorithm is that we keepcalculating the same things over and over. E.g. to calculaten = 100, we have to calcuate each n before, and for eachwe calculate n = 2, etc.Dynamic programming means breaking the program intosmaller problems, calculating and storing the resultsIn this case this is achieved easiest by opening up therecursionThere are alternative definitions for “dynamicprogramming”, but the essence is the same.
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
An iterative algorithmUses loops instead of recursion
def fibNonRec(self,n):"""non-recursive version"""if n == 0 or n == 1:
return 1else:
f = (n+1)*[0]f[0] = f[1] = 1;for i in range(2,n+1):
f[i] = f[i-1] + f[i-2]return f[n];
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
A More Pythonic SolutionMemoization with dictionaries
This is still dynamic programming, but without resolving therecursion.
self.cacheDic = 0:1, 1:1 #__init__...def fibCacheDic(self,n):
if self.cacheDic.has_key(n):return self.cacheDic[n]
else:newFib = self.fibCacheDic(n-1) + self.fibCacheDic(n-2)self.cacheDic[n] = newFib
return newFib
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
Comparison
This is much faster: e.g. for n = 45 the (uncached)recursive algorithm takes over a billion stepsThe iterative algorithm takes about 2n = 90 steps. It is 10million times fasterNotice that the difference is quite subtle between the two!
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
Space Complexity
Speed is not everything: we can also compare algorithmson the basis of how much other resources they are using(e.g. memory, disk, etc.)If an algorithm takes too much memory you may not beable to run (even if it was theoretically faster than anotheralgorithm)In the iterative algorithm we used an array of length nIt is more complicated to analyse the space complexity of arecursive algorithm: at any one time we have to take intoaccount all the active calls. This form a path from theactive call to the leaves of the tree. This is at most n.The second algorithm can be modified to use less space.
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
Temporary Variables Instead of Lists
def fibNonRecS(self,n):"""non-recursive version, space saving"""if n == 0 or n == 1:
return 1else:
a = b = 1;i = 2while i < n+1:
c = a + ba = bb = ci = i+1
return b;735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
The O(Big O) notation
Def: f (n) = O(g(n)) iff there exists a constant c s.t.f (n) ≤ c ∗ g(n).This notation does not care about fine details, but allowsus to compare two algorithms easily.E.g. two algorithms with 2n and 10n respectively areequivalent according to this.2n2 and 10n are not. The idea is that the algorithm withfaster asymptotics will win for large n anyway.For completelness: Ω: lower bound, θ: both O and Ω,(little) o: O but not Ω. In practice, we use O is if it was θ.
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
A Matrix Trick
One can check with direct calculation and prove withinduction that the following identity is true:(
1 11 0
)n
=
(Fn Fn−1
Fn−1 Fn−2
)(6)
We can use this to construct an algorithm, whichcalculates Fibonacci numbers through matrix multiplication(but what good does that do?)
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
Recursive Powering
For calculating the Fibonacci numbers with the above trick,we need to calcuate Mn, where M is a two-by-two matrix.If we had A = Mn/2, we could just do it one onemultiplication: Mn = A2.Use this to calculate the powers in O(log(n)) time.Note: log is always base-2 unless otherwise noted!
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
The Basic Idea
def fibMat(self,n):if n == 0 or n == 1:
return 1else:
self.m = [[1,1],[1,0]]self.matRecPow(n)return self.m[0][0]
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
The Heart of the Algorithm
def matRecPow(self,n):if n > 1:
self.matRecPow(n/2)self.m = self.matSqr2x2(self.m)if n % 2 is 1: #n odd
self.m = self.matMul2x2(self.m,[[1,1],[1,0]])
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
The Actual Matrix Multiplication
def matMul2x2(self,m1,m2):"""multiply two 2x2 matrices"""return [[m1[0][0]*m2[0][0]+m1[0][1]*m2[1][0],
m1[0][0]*m2[0][1]+m1[0][1]*m2[1][1]],[m1[1][0]*m2[0][0]+m1[1][1]*m2[1][0],m1[1][0]*m2[0][1]+m1[1][1]*m2[1][1]]]
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
Squaring...
def matSqr2x2(self,m):"""square of a 2x2 matrix"""return self.matMul2x2(m,m)
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Fibonacci
Summary of Fibonnacci Algorithms
The arithmetic algorithm is O(1), but uses floating point.The recursive one the most natural algorithm, however ithas exponential scaling (the only really slow one)The dynamic algorithms are O(n) (linear). The second(memoization) version is O(1) after a number higher then nis called. A variation of the first (for loop) version which hadO(1) space complexity.The matrix algorithm is O(log(n)). In principle this wouldbe the fastest integer algorithm for very large n. (However,can we store the results, say, for n = 1000000?)
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Summary
We finished basic data structuresStarted to study analysis of algorithmsDemonstrated analysis through an assortment ofFibonacci algorithms
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Homework # 3
Reading: read the rest of the python tutorial athttp://http://docs.python.org/2/tutorial/.(E1) A full node in a binary tree is a node with twonon-empty subtrees. Let l be the number of leaf nodes in abinary tree. Show that the number of full nodes is l-1. (hint:induction)(E2) Write a program which will create a lookup table forhiearchical tree-indexing of a 2-dimensional 2n × 2n array.Hierarchical indices have the advantage that two pixelswhich are close physically also have indices which areclose. This can be thought of as a list representation of a(quad)-tree. Optionally, make your program work inarbitrary dimensions.
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Homework # 3continued
The recursive definition of the hierarchical index fortwo-dimensions is the following: The first two bits of the indextells you if the pixel is in the upper/lower or left/right portion ofthe full array. Once we know these bits, we know the pixel is inone of four quadrants of size 2n−1 × 2n−1. Then the next twobits carry the same information with respect to this smallerrectangle, etc. untill we arrive at the lowest level. This needs abit of digesting; also look at the figurehttp://healpix.jpl.nasa.gov/html/intronode3.htm
735 Computational Astronomy
ReminderBasic Data Structures: Continued
Algorithmic AnalysisSummary
Homework # 3continued
(E3) Measure the speed of the algorithms given inlecture3.py, verify the scalings. Plot your measured timeson a figure, and fit a formula for each programs runningtime. Comment on the results.(E4) Solve the recursions yn + yn−1 ± 2yn−2 = 2−n, withinitial conditions y0 = y1 = 1. Derive analytic solutions, anddevise an algorithm which calculates the recursion.Compare the two solutions on a figure.(hint: solve thehomogenous equation first, and find a particular solution).
735 Computational Astronomy