kk20503 1 introduction
TRANSCRIPT
Dr James Mountstephens
Introductions
I am Dr James Mountstephens and I will be teaching Algorithm Analysis (AA) this semester
Who are you?
I will be teaching in English
Please tell me if you don’t understand what I am saying
Some rules for lectures
Please be punctual.
Students more than 10 minutes late will not be allowed to enter the lecture or tutorial
Please, no talking.
Please ask questions.
I will be asking you questions.
Please think about the material!
About The Course
AA is a crucial and useful course for computer scientists
AA is a difficult course
I am going to assume that you are intelligent people. I will not dumb the subject down and I will discuss complex topics. Again, please ask questions.
You will fail if you are lazy but each of you can pass if you work hard and really think about the material
The first two lectures will probably be the hardest …!
Course Delivery
14 Lectures and 14 Tutorials
Quizzes (5%)
You will be in groups, to be assigned soon
Group Assignment (15%), Individual Assignment (10%)
Midterm (20%) and Final Exam (50%)
All students, please sign up to Opencommittee
I will post lecture slides, assignments etc.
I will make announcements so please check regularly
http://www.opencommittee.com
Join the committee for this course (code 102923)
USE YOUR PROPER NAME!
Lectures
1. Introduction
2. Fundamentals of the Analysis of Algorithm Efficiency
3. Brute Force
4. Divide and Conquer
5. Decrease and Conquer
6. Transform and Conquer I
7. Transform and Conquer II
8. Space and Time Tradeoffs
9. Dynamic Programming
10. Greedy Technique
11. Iterative Improvement
12. Limitations of Algorithmic Power
13. Coping with the Limitations of Algorithmic Power
14. Revision
The Textbook
Please read the textbook
“Introduction to The Design and Analysis of Algorithms” by Anany Levitin (Second Edition)
Expensive but please buy it and read it
1: Introduction
Contents
What is an Algorithm?
Fundamentals of Algorithmic Problem Solving
Important Problem Types
Fundamental Data Structures
What is an Algorithm?
An algorithm is a sequence of unambiguous instructions for solving a problem in a finite time
Algorithmic problem solving means obtaining a required output for any legitimate input
[Figure: the notion of an algorithm. Labels: problem, algorithm, “computer”, input, output, solution]
What is an Algorithm?
Algorithms are the core of Computer Science
Computers are algorithmic machines
EVERY piece of software that you use is an instance of an algorithm
Take time to think about that… Google, Facebook, MS Word etc. are all algorithms.
Every Computer Scientist/Software Engineer/IT person should have a toolkit of known algorithms
For practically doing your job!
Provide a framework for designing and analysing algorithms for new problems
What is an Algorithm?
An algorithm is a sequence of unambiguous instructions for solving a problem in a finite time
Solving a problem
Sequence
Instructions
Non-ambiguity
Finite time
The range of inputs must be carefully specified
We can consider algorithms to be procedural solutions to problems
What is an Algorithm?
Algorithms are not computer programs
Programs represent and implement algorithms
The same algorithm may be represented in several different ways
Natural language, Programming language, Pseudocode, flowcharts…
Several different algorithms for solving the same problem may exist
Each may be based on different design ideas and have very different performance
Example Problem: Greatest Common Divisor
GCD of two integers m,n is the largest positive integer that divides them both with no remainder
m,n are nonnegative and not both zero
Examples:
gcd(60,24) = 12
gcd(60,0) = 60
gcd(57,13) = 1
GCD is used in number theory, solution of equations and cryptography (RSA algorithm)
Example: Euclid’s Algorithm for GCD
Euclid’s Algorithm for the GCD of two numbers is one of the earliest recorded algorithms (c. 300 BC)
It is based on repeated application of the following equality until the second number becomes 0, leaving the first number as the answer
gcd(m,n) = gcd(n, m mod n)
m mod n is the remainder of m/n
Examples
gcd(60,24) = gcd(24,12) = gcd(12,0) = 12
gcd(57,13) = gcd(13,5) = gcd(5,3) = gcd(3,2) = gcd(2,1) = gcd(1,0) = 1
Two Descriptions of Euclid’s algorithm
while n ≠ 0 do
    r ← m mod n
    m ← n
    n ← r
return m
Step 1 If n = 0, return m and stop; otherwise go to Step 2
Step 2 Divide m by n and assign the value of the remainder to r
Step 3 Assign the value of n to m and the value of r to n. Go to Step 1.
Remember, the same algorithm may be represented in several different ways…
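A third representation of the same algorithm, as a program. This is only a sketch in Python; the function name gcd_euclid and the input check are illustrative choices, not part of the slides.

```python
def gcd_euclid(m: int, n: int) -> int:
    """Euclid's algorithm: apply gcd(m, n) = gcd(n, m mod n) until n is 0."""
    # The slides require m, n nonnegative and not both zero.
    if m < 0 or n < 0 or (m == 0 and n == 0):
        raise ValueError("m and n must be nonnegative and not both zero")
    while n != 0:
        # r <- m mod n; m <- n; n <- r, written as one tuple assignment
        m, n = n, m % n
    return m
```

Calling gcd_euclid(60, 24) follows the chain gcd(60,24) = gcd(24,12) = gcd(12,0) shown earlier.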
Other Algorithms for GCD
Consecutive Integer Checking Algorithm
Step 1 Assign the value of min{m,n} to t
Step 2 Divide m by t. If the remainder is 0, go to Step 3; otherwise, go to Step 4
Step 3 Divide n by t. If the remainder is 0, return t and stop; otherwise, go to Step 4
Step 4 Decrease t by 1 and go to Step 2
Remember, several different algorithms for solving the same problem may exist
Each may be based on different design ideas and have very different performance…
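The four steps can be sketched in Python as follows. The function name gcd_consecutive is an illustrative choice. Note one assumption made explicit here: as stated, the procedure does not work when one input is 0 (Step 2 would divide by t = 0), so this sketch accepts positive inputs only.

```python
def gcd_consecutive(m: int, n: int) -> int:
    """Consecutive integer checking: try t = min(m, n), min(m, n) - 1, ... down to 1."""
    if m <= 0 or n <= 0:
        # Assumption: inputs restricted to positive integers to avoid dividing by 0.
        raise ValueError("both inputs must be positive for this method")
    t = min(m, n)                          # Step 1
    while True:
        if m % t == 0 and n % t == 0:      # Steps 2 and 3
            return t
        t -= 1                             # Step 4 (t = 1 always divides both, so the loop ends)
```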
Consecutive Integer Checking Algorithm for GCD
Consecutive Integer Checking Algorithm example: gcd(60,24)
t = min(60,24) = 24
t = 24: 60/24 = 2 remainder 12
t = 23: 60/23 = 2 remainder 14
t = 22: 60/22 = 2 remainder 16
t = 21: 60/21 = 2 remainder 18
t = 20: 60/20 = 3 remainder 0, 24/20 = 1 remainder 4
…..(t = 19-16 not shown)…
t = 15: 60/15 = 4 remainder 0, 24/15 = 1 remainder 9
…..(t = 14-13 not shown)…
t = 12: 60/12 = 5 remainder 0, 24/12 = 2 remainder 0
gcd(60,24) = 12
How does the performance of this algorithm compare to Euclid’s?
It is clearly worse for this pair of numbers, at least
Soon we shall see that Euclid’s is O(log n) and this is O(n)
Other Algorithms for GCD
“Middle-school” procedure
Step 1 Find the prime factorisation of m
Step 2 Find the prime factorisation of n
Step 3 Find all the common prime factors
Step 4 Compute the product of all the common prime factors and return it as gcd(m,n)
Is this an algorithm?
Consider the non-ambiguity requirement…
Example: 48 = 2·2·2·2·3 and 180 = 2·2·3·3·5, so the common prime factors are 2, 2, 3 and gcd(48,180) = 12
Other Algorithms for GCD
We need an algorithm for prime factorisation to make the middle-school procedure into an algorithm
We just repeatedly divide n by each prime number less than or equal to it that divides it with no remainder, until the quotient becomes 1
But to do this prime factorisation, we need an algorithm to find all the prime numbers, up to n…
This can be done with the “Sieve of Eratosthenes”
Other Algorithms for GCD
Prime Factorisation
Input: Integer x ≥ 2
Output: List F of prime factors of x
P ← Sieve(x)
F ← empty list
i ← 0 (index of the first prime in P)
while x > 1 do
    while x mod P[i] = 0 do
        F ← F + P[i]
        x ← x / P[i]
    i ← i + 1
return F

Sieve of Eratosthenes
Input: Integer x ≥ 2
Output: List L of primes less than or equal to x
for p ← 2 to x do A[p] ← p
for p ← 2 to x do
    if A[p] ≠ 0
        j ← p * p
        while j ≤ x do
            A[j] ← 0
            j ← j + p
L ← all A[p] with A[p] ≠ 0
return L
How does the performance of this algorithm compare to Euclid’s and Consecutive Integer Checking?
Note also that an algorithm will be required to find all common prime factors too…
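A sketch of the whole middle-school procedure in Python, including the missing "find common prime factors" step. The function names are illustrative, and the use of collections.Counter to match common factors with their multiplicities is this sketch's own choice, not something the slides prescribe. Both inputs are assumed to be at least 2, as the factorisation requires.

```python
from collections import Counter

def sieve(x: int) -> list:
    """Sieve of Eratosthenes: all primes less than or equal to x (x >= 2)."""
    is_prime = [True] * (x + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= x:
        if is_prime[p]:
            # Cross out multiples of p, starting from p*p.
            for j in range(p * p, x + 1, p):
                is_prime[j] = False
        p += 1
    return [p for p in range(2, x + 1) if is_prime[p]]

def prime_factors(x: int) -> list:
    """Prime factorisation of x >= 2 by repeated division by sieve primes."""
    factors = []
    for p in sieve(x):
        while x % p == 0:
            factors.append(p)
            x //= p
        if x == 1:
            break
    return factors

def gcd_middle_school(m: int, n: int) -> int:
    """Multiply the prime factors common to m and n (assumes m, n >= 2)."""
    # Counter intersection keeps each prime with the smaller of its two multiplicities.
    common = Counter(prime_factors(m)) & Counter(prime_factors(n))
    result = 1
    for prime, count in common.items():
        result *= prime ** count
    return result
```

Resolving how repeated factors are matched is exactly the ambiguity in Step 3 that the "Is this an algorithm?" question points at.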
Fundamentals of Algorithmic Problem Solving
Understand the Problem
Ascertain the Capabilities of the Computational Device
Choosing between Exact and Approximate Problem Solving
Deciding on Appropriate Data Structures
Algorithm Design Techniques
Methods of Specifying an Algorithm
Proving an Algorithm’s Correctness
Analysing an Algorithm
Coding an Algorithm
Please read this section in the textbook in much more detail…
Algorithm Design Techniques
An algorithm design technique (or “strategy” or “paradigm”) is…
a general approach to solving problems algorithmically that is applicable to a wide range of problems from different areas of computing
This course is organised by design techniques
See the lecture topics from earlier…
Usually, AA courses are organised by problem type not algorithm design technique
Analysing an Algorithm
Analysis is essential. We want to know that our algorithms have good characteristics:
Good Time Efficiency (or Complexity)
Good Space Efficiency (or Complexity)
Simplicity
Generality
Analysis allows comparison of algorithms and allows us to know if they can be used practically or not
We will study a framework for analysis in the next lecture
Important Problem Types
Sorting
Searching
String Processing
Graph Problems
Combinatorial Problems
Geometric Problems
Numerical Problems
Sorting Problems
Problem: rearrange the items of a given list in ascending order.
Input: A sequence of n items <a1, a2, …, an>
Output: A reordering <a*1, a*2, …, a*n> of the input sequence such that a*1 ≤ a*2 ≤ … ≤ a*n.
Importance of sorting
Can help searching tremendously
Algorithms often use sorting as a key subroutine.
Sorting key
A specially chosen piece of information used to guide sorting. E.g., sort student records by names.
Sorting Problems
Examples of sorting algorithms
Selection sort
Bubble sort
Insertion sort
Merge sort
Heap sort
…and others
We usually evaluate sorting algorithm complexity by the number of key comparisons.
Two important properties
Stability: A sorting algorithm is called stable if it preserves the relative order of any two equal elements in its input.
In place: A sorting algorithm is in place if it does not require extra memory, except, possibly, for a few memory units.
Sorting Problems
SelectionSort(A[0..n-1])
//Input: An array A[0..n-1] of orderable elements
//Output: Array A[0..n-1] sorted in ascending order
for i ← 0 to n – 2 do
    min ← i
    for j ← i + 1 to n – 1 do
        if A[j] < A[min] min ← j
    swap A[i] and A[min]
An example of a simple (and often low-performance) sorting algorithm is Selection Sort
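The SelectionSort pseudocode translates almost line for line into Python. This is a sketch; the name selection_sort and the choice to sort the list in place are illustrative.

```python
def selection_sort(a: list) -> None:
    """Sort list a in ascending order, in place, by repeated selection of the minimum."""
    n = len(a)
    for i in range(n - 1):                 # for i <- 0 to n - 2
        smallest = i                       # index of the smallest item seen so far
        for j in range(i + 1, n):          # for j <- i + 1 to n - 1
            if a[j] < a[smallest]:         # one key comparison
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]   # swap A[i] and A[min]
```

The inner comparison runs (n-1) + (n-2) + … + 1 = n(n-1)/2 times regardless of the input order, which is why the method is often low-performance.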
Searching Problems
Problem: find a given value, called a search key, in a given set.
E.g. ID, ref number, name, etc.
Search is a hugely important problem
Retrieving data from database storage
Student records, company accounts etc.
And more general search…
Searching the web for information
Searching to solve problems in AI
Searching Problems
Examples of searching algorithms
Sequential search (Time: O(n))
Binary search (Time: O(log n))
Binary Search
// Input: sorted array a_i < … < a_j and key x
while i ≤ j do
    m ← ⌊(i+j)/2⌋
    if x = a_m then output a_m and stop
    if x < a_m then j ← m-1 else i ← m+1
// x is not in the array
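A runnable sketch of binary search in Python. It recomputes the midpoint on every iteration and returns the position of the key rather than the key itself; the function name and the choice of None for "not found" are illustrative.

```python
def binary_search(a: list, x):
    """Return an index of x in the sorted list a, or None if x is absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2        # midpoint of the current search range
        if a[mid] == x:
            return mid
        if x < a[mid]:
            hi = mid - 1            # continue in the left half
        else:
            lo = mid + 1            # continue in the right half
    return None
```

Each iteration halves the range, which is where the O(log n) time comes from.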
String Processing Problems
A string is a sequence of characters from an alphabet.
“This is a string” is a string
Text strings: letters, numbers, and special characters.
String matching: searching for a given word/pattern in a text.
Examples
searching for a word or phrase on WWW or in a Word document
searching for a particular sequence of {A,C,G,T} in a reference genomic sequence
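As a concrete instance of string matching, here is a sketch of the simplest (brute-force) approach: try every alignment of the pattern against the text. The function name find_pattern and the sample strings are illustrative; the slides do not specify a particular matching algorithm.

```python
def find_pattern(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):          # each possible starting position
        j = 0
        while j < m and text[i + j] == pattern[j]:
            j += 1                      # characters match so far
        if j == m:                      # the whole pattern matched
            return i
    return -1
```

For example, find_pattern("GATTACAGATTACA", "TACA") scans a short {A,C,G,T} sequence for the pattern.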
Graph Problems
(Informally) A graph is a collection of points called vertices, some of which are connected by line segments called edges.
Graphs are extremely important for modeling real-life problems
Modeling WWW
Social relationships
Communication networks
Project scheduling …
Examples of graph algorithms
Graph traversal algorithms
Shortest-path algorithms
Topological sorting
[Figure: example graph on vertices 1, 2, 3, 4]
Graph Problems
Example: Travelling Salesman Problem
For a set of cities linked by roads, find the shortest route that visits each of them exactly once.
Important to:
Businesses and route planning
Circuit board and VLSI design and manufacture
X-ray crystallography
Genetic engineering
TSP, like many other graph problems, is extremely difficult and no general efficient algorithms are known
In fact, they may not exist…
Combinatorial Problems
Finding a Combinatorial Object that satisfies certain constraints or has certain properties
Sets, subsets
Permutations
Set P of all ordered sequences of length r from a set S of n elements
E.g. for S = {1,2,3} and r=3, P = {(1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1)}
E.g. for S = {1,2,3} and r=2, P = {(1,2), (1,3), (2,1), (2,3), (3,1), (3,2)}
The number of permutations is nPr = n!/(n-r)!
Combinations
Set C of all unordered selections of r elements from a set S of n elements
E.g. for S = {1,2,3} and r=2, C = {(1,2), (1,3), (2,3)}
E.g. for S = {1,2,3} and r=3, C = {(1,2,3)}
The number of combinations is nCr = n!/((n-r)! r!)
Combinatorial problems are among the hardest known
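The examples for S = {1,2,3} can be checked with Python's standard library. This is a sketch: itertools generates the objects and math.perm / math.comb (Python 3.8+) give the counts; the variable names are illustrative.

```python
from itertools import combinations, permutations
from math import comb, perm

S = [1, 2, 3]

# Ordered sequences of length r = 2: nPr = 3!/(3-2)! = 6 of them.
P2 = list(permutations(S, 2))

# Unordered selections of r = 2 elements: nCr = 3!/((3-2)! 2!) = 3 of them.
C2 = list(combinations(S, 2))

print(P2, perm(3, 2))
print(C2, comb(3, 2))
```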
Combinatorial Problems
Example: The Knapsack Problem
Given a set of items, each with a weight and a value, determine the most valuable subset of items with total weight less than or equal to a given limit
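To make the problem statement concrete, here is a sketch that solves a tiny knapsack instance by checking every subset. This exhaustive approach is chosen only for clarity (it examines 2^n subsets); the function name and the item data are made up for illustration.

```python
from itertools import combinations

def knapsack_exhaustive(items, capacity):
    """items: list of (weight, value) pairs. Returns (best value, indices of chosen items)."""
    best_value, best_subset = 0, ()
    for r in range(len(items) + 1):
        for subset in combinations(range(len(items)), r):
            weight = sum(items[i][0] for i in subset)
            value = sum(items[i][1] for i in subset)
            # Keep the subset only if it fits and beats the best found so far.
            if weight <= capacity and value > best_value:
                best_value, best_subset = value, subset
    return best_value, best_subset

# Hypothetical instance: (weight, value) for four items, weight limit 10.
example_items = [(7, 42), (3, 12), (4, 40), (5, 25)]
```

With the limit 10, the best choice here is the items at indices 2 and 3 (total weight 9, total value 65).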
Geometric Problems
Problems dealing with Geometric Objects such as points, lines, polygons
Examples
Finding closest pair of points
Finding convex hull of a set of points
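A minimal sketch of the closest-pair problem, comparing every pair of points. The brute-force approach and the sample points are assumptions for illustration; math.dist requires Python 3.8+.

```python
from itertools import combinations
from math import dist

def closest_pair(points):
    """Return the pair of points with the smallest Euclidean distance."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    # Examine all n(n-1)/2 pairs and keep the one with minimal distance.
    return min(combinations(points, 2), key=lambda pair: dist(pair[0], pair[1]))

# Hypothetical sample points in the plane.
sample_points = [(0, 0), (5, 5), (1, 1), (9, 0)]
```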
Numerical Problems
Problems dealing with mathematical objects
Numbers, Equations, Matrices…
Examples
Efficient arithmetic
Evaluating functions
Solving systems of linear equations with Gaussian Elimination
Evaluating Definite Integrals
Optimising a function subject to constraints
Fundamental Data Structures
Linear Data Structures
  Array
  String
  Linked list
  Stack
  Queue
  Priority queue/heap
Graph
  General Graph
  Tree
  Binary Tree
Set and Dictionary
Linear Data Structures
Arrays
A sequence of n items of the same data type that are stored contiguously in computer memory and made accessible by specifying a value of the array’s index.
Array properties
fixed length (need preliminary reservation of memory)
contiguous memory locations
direct access
Operations of insert/delete
Often used to implement Lists and Strings
Linear Data Structures
Linked List
A sequence of zero or more nodes each containing two kinds of information: some data and one or more links called pointers to other nodes of the linked list.
Singly linked list (next pointer)
Doubly linked list (next + previous pointers)
Linked List Properties
dynamic length
arbitrary memory locations
access by following links
Operations of insert/delete
Stacks and Queues
Stacks
A stack of plates
insertion/deletion can be done only at the top.
LIFO
Two operations (push and pop)
Queues
A queue of customers waiting for services
Insertion/enqueue from the rear and deletion/dequeue from the front.
FIFO
Two operations (enqueue and dequeue)
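A short Python sketch of both structures. Using a plain list as the stack and collections.deque as the queue is one common choice, not something the slides prescribe; the plate and customer names are made up.

```python
from collections import deque

# Stack (LIFO): push and pop both happen at the top, here the end of the list.
stack = []
for plate in ["plate 1", "plate 2", "plate 3"]:
    stack.append(plate)          # push
top = stack.pop()                # pop removes the most recently pushed item

# Queue (FIFO): enqueue at the rear, dequeue from the front.
queue = deque()
for customer in ["customer 1", "customer 2", "customer 3"]:
    queue.append(customer)       # enqueue
first = queue.popleft()          # dequeue removes the earliest arrival

print(top, first)
```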
Priority Queues and Heaps
Priority queues (implemented using heaps)
A data structure for maintaining a set of elements, each associated with a key/priority, with the following operations
Operations
Finding the element with the highest priority
Deleting the element with the highest priority
Inserting a new element
[Figure: a heap drawn as a tree (root 9; second level 6, 8; third level 5, 2, 3) and as an array]
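The three priority-queue operations can be sketched on top of Python's heapq module. heapq maintains a min-heap, so this sketch stores negated keys to make the largest key the highest priority; the class name MaxPriorityQueue and its method names are illustrative.

```python
import heapq

class MaxPriorityQueue:
    """Priority queue where the largest key has the highest priority."""

    def __init__(self):
        self._heap = []                      # min-heap of negated keys

    def insert(self, key):
        heapq.heappush(self._heap, -key)     # insert a new element

    def find_max(self):
        return -self._heap[0]                # highest-priority element, not removed

    def delete_max(self):
        return -heapq.heappop(self._heap)    # remove and return the highest priority

    def __len__(self):
        return len(self._heap)

pq = MaxPriorityQueue()
for key in [5, 9, 2, 6, 8, 3]:
    pq.insert(key)
```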
Graphs
Formal definition
A graph G = <V, E> is defined by a pair of sets: a finite set V of items called vertices and a set E of vertex pairs called edges.
Undirected and directed graphs (digraphs).
What’s the maximum number of edges in an undirected graph with |V| vertices?
Complete graphs
A graph with every pair of its vertices connected by an edge is called complete, K|V|
Dense and sparse graphs
Dense graphs have edges between most nodes
Sparse graphs have few edges between nodes
[Figure: a complete, undirected graph and an incomplete, directed graph, each on vertices 1, 2, 3, 4]
Both graphs here are dense
Graph Representation
Nodes are said to be adjacent if an edge exists between them
Adjacency Matrix
n x n boolean matrix if number of nodes |V| is n.
The element on the ith row and jth column is 1 if there’s an edge from ith vertex to the jth vertex; otherwise 0.
The adjacency matrix of an undirected graph is symmetric.
Adjacency Linked Lists
A collection of linked lists, one for each vertex, that contain all the vertices adjacent to the list’s vertex.
Which data structure would you use for dense and sparse graphs?
Incomplete, directed graph G (vertices 1, 2, 3, 4)
Adjacency Matrix for G
0 1 1 1
0 0 0 1
0 0 0 1
0 0 0 0
Adjacency Linked List for G
1 → 2 → 3 → 4
2 → 4
3 → 4
4
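Both representations of G can be written down directly in Python. In this sketch the matrix rows are transcribed from the slide's figure, a dict of lists stands in for the linked lists, and the helper name matrix_to_lists is illustrative.

```python
# Adjacency matrix for G: row i, column j is 1 if there is an edge from
# vertex i+1 to vertex j+1 (Python indices start at 0, vertex labels at 1).
matrix = [
    [0, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

def matrix_to_lists(matrix):
    """Convert an adjacency matrix to adjacency lists keyed by vertex label."""
    n = len(matrix)
    return {u + 1: [v + 1 for v in range(n) if matrix[u][v] == 1] for u in range(n)}

adjacency_lists = matrix_to_lists(matrix)
print(adjacency_lists)
```

The matrix always stores n x n entries, while the lists store one entry per edge, which is the trade-off behind the dense versus sparse question.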
Weighted Graphs
Weighted graphs
Graphs or digraphs with numbers assigned to the edges.
[Figure: weighted graph on vertices 1, 2, 3, 4 with edge weights 5, 6, 7, 8, 9]
Graph Properties -- Paths and Connectivity
Paths
A path from vertex u to v of a graph G is defined as a sequence of adjacent vertices that starts with u and ends with v.
Simple paths: All edges of a path are distinct.
Path lengths: the number of edges, or the number of vertices – 1.
[Figure: example graph on vertices 1, 2, 3, 4, 5, shown twice]
Simple Path from 4 to 3: (4,1,2,3), Path Length 3
Path from 4 to 3: (4,1,2,5,4,1,2,3), Path Length 7
Graph Properties -- Paths and Connectivity
Connected graphs
A graph is said to be connected if for every pair of its vertices u and v there is a path from u to v.
Connected component
A maximal connected subgraph of a given graph, i.e. one that cannot be enlarged by adding another vertex and stay connected.
[Figure: two graphs on vertices 1, 2, 3, 4, 5: Connected Graph G1 and Non-Connected Graph G2]
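Connected components can be found by a traversal from each not-yet-visited vertex. The sketch below uses an iterative depth-first search; the function name and the example graph are hypothetical (the edges of G1 and G2 are not recoverable from the slide), and the graph is assumed undirected with each edge listed in both directions.

```python
def connected_components(graph):
    """graph: dict mapping each vertex to a list of adjacent vertices (undirected)."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component, stack = [], [start]
        seen.add(start)
        while stack:                      # depth-first traversal from start
            u = stack.pop()
            component.append(u)
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(sorted(component))
    return components

# Hypothetical non-connected graph: a triangle 1-2-4 plus a separate edge 3-5.
example_graph = {1: [2, 4], 2: [1, 4], 4: [1, 2], 3: [5], 5: [3]}
```

A graph is connected exactly when this function returns a single component.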
Graph Properties -- Acyclicity
Cycle
A simple path of a positive length that starts and ends at the same vertex.
Acyclic graph
A graph without cycles
DAG (Directed Acyclic Graph)
[Figure: a Directed Acyclic Graph on vertices 1, 2, 3, 4, and a graph on vertices 1, 2, 3, 4, 5 containing the Cycle (1,2,5,4,1)]
Trees
A tree (or free tree) is a connected acyclic graph.
Forest: a graph that has no cycles but is not necessarily connected.
Properties of trees
|E| = |V| - 1
For every two vertices in a tree there always exists exactly one simple path from one of these vertices to the other….
[Figure: two example graphs, one on vertices 1 to 5 captioned “Free Tree” and one on vertices 1 to 7 captioned “Free Tree that is also a forest”]
Rooted Trees
This “single path” property makes it possible to select an arbitrary vertex in a free tree and consider it as the root of the so-called rooted tree.
A root then defines levels in the tree.
The level of a vertex is its number of edges from the root
[Figure: a free tree on vertices 1 to 5, captioned “Free Tree”, and the same vertices redrawn with a root, captioned “Rooted Tree with 2 levels”]
Rooted Trees
Ancestors
For any vertex v in a tree T, all the vertices on the simple path from the root to that vertex are called ancestors.
Descendants
All the vertices for which a vertex v is an ancestor are said to be descendants of v.
Parent, child and siblings
If (u, v) is the last edge of the simple path from the root to vertex v, u is said to be the parent of v and v is called a child of u.
Vertices that have the same parent are called siblings.
Leaves
A vertex without children is called a leaf.
Subtree
A vertex v with all its descendants is called the subtree of T rooted at v.
[Figure: rooted tree on vertices 1 to 9]
Examples:
•1 is an ancestor of 6 and 7
•8 is a descendant of 5
•1 is parent (and ancestor) of 6 and 7
•8 and 9 are children (and descendants) of 5
•6 and 7 are siblings
•8 and 9 are siblings
•4,6,7,8,9 are leaves
•The subtree rooted at 5 consists of 5,8,9
Rooted Trees
Depth of a vertex
The length of the simple path from the root to the vertex.
Height of a tree
The length of the longest simple path from the root to a leaf.
[Figure: rooted tree on vertices 1 to 9]
Examples:
•Vertex 2 has depth d=2
•Vertex 7 has depth d=3
•The tree itself has height h=3
Ordered Trees
Ordered trees
An ordered tree is a rooted tree in which all the children of each vertex are ordered.
Binary trees
A binary tree is an ordered tree in which every vertex has no more than two children and each child is designated as either a left child or a right child of its parent.
[Figure: Binary Tree with root 9, second level 6 and 8, third level 5, 2 and 3; the third-level children are labelled L, R, L]
Ordered Trees
Binary search trees
Each vertex is assigned a number.
A number assigned to each parental vertex is larger than all the numbers in its left subtree and smaller than all the numbers in its right subtree.
⌊log2 n⌋ ≤ h ≤ n – 1, where h is the height of a binary tree and n the size.
[Figure: binary search tree with root 6; 6 has children 3 and 9; 3 has children 2 and 5; 9 has left child 8]
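The binary search tree property can be sketched in Python as follows. Inserting the keys 6, 3, 9, 2, 5, 8 in that order produces a tree with root 6, children 3 and 9, then 2, 5 and 8 on the next level. The class and function names are illustrative, and ignoring duplicate keys is an assumption of this sketch.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key so that smaller keys go left and larger keys go right."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root                           # duplicate keys are ignored

def search(root, key):
    """Follow one root-to-leaf path, going left or right at each vertex."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root is not None

def height(root):
    """Number of edges on the longest root-to-leaf path (-1 for an empty tree)."""
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))

root = None
for key in [6, 3, 9, 2, 5, 8]:
    root = insert(root, key)
```

Here n = 6 and the height is 2, consistent with ⌊log2 6⌋ = 2 ≤ h ≤ 5.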
Sets and Dictionaries
I will assume you already know what a set is!
Elements, subsets, union, intersection etc.
Dictionaries are Sets with 3 operations:
Searching for a given element
Inserting a new element
Deleting an existing element
Please read the textbook for more detail…