kk20503 1 introduction

Dr James Mountstephens

Upload: low-ying-hao

Post on 11-May-2015


Page 1: Kk20503 1 introduction

Dr James Mountstephens

Page 2: Kk20503 1 introduction

Introductions
• I am Dr James Mountstephens and I will be teaching Algorithm Analysis (AA) this semester
• Who are you?
• I will be teaching in English
  • Please tell me if you don’t understand what I am saying
• Some rules for lectures
  • Please be punctual. Students more than 10 minutes late will not be allowed to enter the lecture or tutorial
  • Please, no talking.
  • Please ask questions. I will be asking you questions.
  • Please think about the material!

Page 3: Kk20503 1 introduction

About The Course
• AA is a crucial and useful course for computer scientists
• AA is a difficult course
  • I am going to assume that you are intelligent people. I will not dumb the subject down and I will discuss complex topics. Again, please ask questions.
  • You will fail if you are lazy but each of you can pass if you work hard and really think about the material
  • The first two lectures will probably be the hardest …!

Page 4: Kk20503 1 introduction

Course Delivery
• 14 Lectures and 14 Tutorials
• Quizzes (5%)
• You will be in groups, to be assigned soon
• Group Assignment (15%), Individual Assignment (10%)
• Midterm (20%) and Final Exam (50%)
• All students, please sign up to Opencommittee
  • I will post lecture slides, assignments etc.
  • I will make announcements so please check regularly
  • http://www.opencommittee.com
  • Join the committee for this course (code 102923)
  • USE YOUR PROPER NAME!

Page 5: Kk20503 1 introduction

Lectures
1. Introduction
2. Fundamentals of the Analysis of Algorithm Efficiency
3. Brute Force
4. Divide and Conquer
5. Decrease and Conquer
6. Transform and Conquer I
7. Transform and Conquer II
8. Space and Time Tradeoffs
9. Dynamic Programming
10. Greedy Technique
11. Iterative Improvement
12. Limitations of Algorithmic Power
13. Coping with the Limitations of Algorithmic Power
14. Revision

Page 6: Kk20503 1 introduction

The Textbook
• Please read the textbook
• “Introduction to The Design and Analysis of Algorithms” by Anany Levitin (Second Edition)
• Expensive but please buy it and read it

Page 7: Kk20503 1 introduction

Dr James Mountstephens

1: Introduction

Page 8: Kk20503 1 introduction

Contents
• What is an Algorithm?
• Fundamentals of Algorithmic Problem Solving
• Important Problem Types
• Fundamental Data Structures

Page 9: Kk20503 1 introduction

What is an Algorithm?
• An algorithm is a sequence of unambiguous instructions for solving a problem in a finite time
• Algorithmic problem solving means obtaining a required output for any legitimate input
[Figure: diagram with the labels problem, algorithm, input, “computer”, output and solution]

Page 10: Kk20503 1 introduction

What is an Algorithm?
• Algorithms are the core of Computer Science
  • Computers are algorithmic machines
  • EVERY piece of software that you use is an instance of an algorithm
  • Take time to think about that… Google, Facebook, MS Word etc. are all algorithms.
• Every Computer Scientist/Software Engineer/IT person should have a toolkit of known algorithms
  • For practically doing your job!
  • Provide a framework for designing and analysing algorithms for new problems

Page 11: Kk20503 1 introduction

What is an Algorithm?
• An algorithm is a sequence of unambiguous instructions for solving a problem in a finite time
  • Solving a problem
  • Sequence
  • Instructions
  • Non-ambiguity
  • Finite time
• The range of inputs must be carefully specified
• We can consider algorithms to be procedural solutions to problems

Page 12: Kk20503 1 introduction

What is an Algorithm?
• Algorithms are not computer programs
  • Programs represent and implement algorithms
• The same algorithm may be represented in several different ways
  • Natural language, programming language, pseudocode, flowcharts…
• Several different algorithms for solving the same problem may exist
  • Each may be based on different design ideas and have very different performance

Page 13: Kk20503 1 introduction

Example Problem: Greatest Common Divisor
• The GCD of two integers m, n is the largest positive integer that divides them both with no remainder
  • m, n are nonnegative and not both zero
• Examples:
  • gcd(60,24) = 12
  • gcd(60,0) = 60
  • gcd(57,13) = 1
• GCD is used in number theory, solution of equations and cryptography (RSA algorithm)

Page 14: Kk20503 1 introduction

Example: Euclid’s Algorithm for GCD
• Euclid’s Algorithm for the GCD of two numbers is one of the earliest recorded algorithms (c. 300 BC)
• It is based on repeated application of the following equality until the second number becomes 0, leaving the first number as the answer
    gcd(m,n) = gcd(n, m mod n)
• Examples
    gcd(60,24) = gcd(24,12) = gcd(12,0) = 12
    gcd(57,13) = gcd(13,5) = gcd(5,3) = gcd(3,2) = gcd(2,1) = gcd(1,0) = 1
• m mod n is the remainder of m/n

Page 15: Kk20503 1 introduction

Two Descriptions of Euclid’s algorithm

while n ≠ 0 do
    r ← m mod n
    m ← n
    n ← r
return m

Step 1 If n = 0, return m and stop; otherwise go to Step 2

Step 2 Divide m by n and assign the value of the remainder to r

Step 3 Assign the value of n to m and the value of r to n. Go to Step 1.

Remember, the same algorithm may be represented in several different ways…
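As a third representation of the same algorithm, here is a minimal Python sketch of the pseudocode above. The function name `gcd_euclid` and the explicit input check are illustrative additions, not part of the slides.

```python
def gcd_euclid(m: int, n: int) -> int:
    """GCD of two nonnegative integers that are not both zero."""
    if m < 0 or n < 0 or (m == 0 and n == 0):
        raise ValueError("m and n must be nonnegative and not both zero")
    while n != 0:
        # One application of gcd(m, n) = gcd(n, m mod n)
        m, n = n, m % n
    return m


print(gcd_euclid(60, 24))  # 12
print(gcd_euclid(57, 13))  # 1
```

The tuple assignment plays the role of the temporary variable r in the pseudocode.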

Page 16: Kk20503 1 introduction

Other Algorithms for GCD

Consecutive Integer Checking Algorithm for GCD
Step 1 Assign the value of min{m,n} to t
Step 2 Divide m by t. If the remainder is 0, go to Step 3; otherwise, go to Step 4
Step 3 Divide n by t. If the remainder is 0, return t and stop; otherwise, go to Step 4
Step 4 Decrease t by 1 and go to Step 2

Remember, several different algorithms for solving the same problem may exist
• Each may be based on different design ideas and have very different performance…

Page 17: Kk20503 1 introduction

Other Algorithms for GCD
Consecutive Integer Checking Algorithm
Step 1 Assign the value of min{m,n} to t
Step 2 Divide m by t. If the remainder is 0, go to Step 3; otherwise, go to Step 4
Step 3 Divide n by t. If the remainder is 0, return t and stop; otherwise, go to Step 4
Step 4 Decrease t by 1 and go to Step 2

Consecutive Integer Checking Algorithm example: gcd(60,24)
t = min(60,24) = 24
60/24 = 2 remainder 12, t = 23
60/23 = 2 remainder 14, t = 22
60/22 = 2 remainder 16, t = 21
60/21 = 2 remainder 18, t = 20
60/20 = 3 remainder 0, 24/20 = 1 remainder 4, t = 19
…..(t = 19-16 not shown)…
t = 15: 60/15 = 4 remainder 0, 24/15 = 1 remainder 9
…..(t = 14-13 not shown)…
t = 12: 60/12 = 5 remainder 0, 24/12 = 2 remainder 0
gcd(60,24) = 12

How does the performance of this algorithm compare to Euclid’s?

It is clearly worse for this pair of numbers, at least

Soon we shall see that Euclid’s is O(log n) and this is O(n)
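The four steps of consecutive integer checking can be sketched in Python as follows. The name `gcd_consecutive` is an illustrative choice. Note one assumption made explicit here: the sketch requires both inputs to be positive, because Step 1 would set t to 0 if either input were 0 and Step 2 would then divide by zero.

```python
def gcd_consecutive(m: int, n: int) -> int:
    """GCD by consecutive integer checking (both inputs must be positive)."""
    if m <= 0 or n <= 0:
        raise ValueError("this method needs m > 0 and n > 0")
    t = min(m, n)                      # Step 1
    while m % t != 0 or n % t != 0:    # Steps 2 and 3: t must divide both
        t -= 1                         # Step 4
    return t


print(gcd_consecutive(60, 24))  # 12
```

The loop always terminates because t = 1 divides every positive integer.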

Page 18: Kk20503 1 introduction

Other Algorithms for GCD
“Middle-school” procedure
Step 1 Find the prime factorisation of m
Step 2 Find the prime factorisation of n
Step 3 Find all the common prime factors
Step 4 Compute the product of all the common prime factors and return it as gcd(m,n)

Is this an algorithm?
• Consider the non-ambiguity requirement…

[Figure: prime factorisations of 48 and 180, giving gcd = 12]

Page 19: Kk20503 1 introduction

Other Algorithms for GCD
• We need an algorithm for prime factorisation to make the middle-school procedure into an algorithm
• We just repeatedly divide n by the prime numbers up to n, whenever the division leaves no remainder, until the quotient becomes 1
• But to do this prime factorisation, we need an algorithm to find all the prime numbers up to n…
• This can be done with the “Sieve of Eratosthenes”

Page 20: Kk20503 1 introduction

Other Algorithms for GCD

“Middle-school” procedure
Step 1 Find the prime factorisation of m
Step 2 Find the prime factorisation of n
Step 3 Find all the common prime factors
Step 4 Compute the product of all the common prime factors and return it as gcd(m,n)

Prime Factorisation
Input: Integer x ≥ 2
Output: List F of prime factors of x
P ← Sieve(x)
F ← empty list
i ← 0
while x > 1 do
    while x mod P[i] = 0 do
        F ← F + P[i]
        x ← x / P[i]
    i ← i + 1
return F

Sieve of Eratosthenes
Input: Integer x ≥ 2
Output: List of primes less than or equal to x
for p ← 2 to x do A[p] ← p
for p ← 2 to x do
    if A[p] ≠ 0
        j ← p * p
        while j ≤ x do
            A[j] ← 0
            j ← j + p
return the nonzero elements of A

How does the performance of this algorithm compare to Euclid’s and Consecutive Integer Checking?

Note also that an algorithm will be required to find all common prime factors too…
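The pieces of the middle-school procedure can be put together in a short Python sketch. The function names are illustrative. One assumption resolves the ambiguity in Step 3: "common prime factors" is read as a multiset intersection, so a prime that appears a times in m and b times in n is used min(a, b) times. The sketch also assumes m ≥ 2 and n ≥ 2, matching the input requirement of the factorisation pseudocode.

```python
from collections import Counter


def sieve(x: int) -> list[int]:
    """Primes less than or equal to x (Sieve of Eratosthenes)."""
    a = list(range(x + 1))             # a[p] = p; 0 will mean "crossed out"
    for p in range(2, x + 1):
        if a[p] != 0:
            j = p * p
            while j <= x:
                a[j] = 0               # cross out multiples of p
                j += p
    return [p for p in a[2:] if p != 0]


def prime_factors(x: int) -> list[int]:
    """Prime factors of x >= 2, with repetition, in ascending order."""
    primes = sieve(x)
    factors = []
    i = 0
    while x > 1:
        while x % primes[i] == 0:
            factors.append(primes[i])
            x //= primes[i]
        i += 1
    return factors


def gcd_middle_school(m: int, n: int) -> int:
    """Product of the common prime factors of m and n (both >= 2)."""
    common = Counter(prime_factors(m)) & Counter(prime_factors(n))
    result = 1
    for prime, count in common.items():
        result *= prime ** count
    return result


print(prime_factors(48), prime_factors(180))  # [2, 2, 2, 2, 3] [2, 2, 3, 3, 5]
print(gcd_middle_school(48, 180))             # 12
```

`Counter & Counter` keeps each key with the smaller of its two counts, which is one way to supply the "find all common prime factors" algorithm that the slide says is still required.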

Page 21: Kk20503 1 introduction

Fundamentals of Algorithmic Problem Solving
• Understand the Problem
• Ascertain the Capabilities of the Computational Device
• Choosing between Exact and Approximate Problem Solving
• Deciding on Appropriate Data Structures
• Algorithm Design Techniques
• Methods of Specifying an Algorithm
• Proving an Algorithm’s Correctness
• Analysing an Algorithm
• Coding an Algorithm

Please read this section in the textbook in much more detail…

Page 22: Kk20503 1 introduction

Algorithm Design Techniques
• An algorithm design technique (or “strategy” or “paradigm”) is…
  • a general approach to solving problems algorithmically that is applicable to a wide range of problems from different areas of computing
• This course is organised by design techniques
  • See the lecture topics from earlier…
  • Usually, AA courses are organised by problem type, not algorithm design technique

Page 23: Kk20503 1 introduction

Analysing an Algorithm
• Analysis is essential. We want to know that our algorithms have good characteristics:
  • Good Time Efficiency (or Complexity)
  • Good Space Efficiency (or Complexity)
  • Simplicity
  • Generality
• Analysis allows comparison of algorithms and allows us to know if they can be used practically or not
• We will study a framework for analysis in the next lecture

Page 24: Kk20503 1 introduction

Important Problem Types
• Sorting
• Searching
• String Processing
• Graph Problems
• Combinatorial Problems
• Geometric Problems
• Numerical Problems

Page 25: Kk20503 1 introduction

Sorting Problems
• Problem: rearrange the items of a given list in ascending order.
  • Input: A sequence of n items <a_1, a_2, …, a_n>
  • Output: A reordering <a*_1, a*_2, …, a*_n> of the input sequence such that a*_1 ≤ a*_2 ≤ … ≤ a*_n.
• Importance of sorting
  • Can help searching tremendously
  • Algorithms often use sorting as a key subroutine.
• Sorting key
  • A specially chosen piece of information used to guide sorting. E.g., sort student records by names.

Page 26: Kk20503 1 introduction

Sorting Problems
• Examples of sorting algorithms
  • Selection sort
  • Bubble sort
  • Insertion sort
  • Merge sort
  • Heap sort
  • …and others
• We usually evaluate sorting algorithm complexity by the number of key comparisons.
• Two important properties
  • Stability: A sorting algorithm is called stable if it preserves the relative order of any two equal elements in its input.
  • In place: A sorting algorithm is in place if it does not require extra memory, except possibly for a few memory units.

Page 27: Kk20503 1 introduction

Sorting Problems

SelectionSort(A[0..n-1])
//Input: An array A[0..n-1] of orderable elements
//Output: Array A[0..n-1] sorted in ascending order
for i ← 0 to n – 2 do
    min ← i
    for j ← i + 1 to n – 1 do
        if A[j] < A[min]
            min ← j
    swap A[i] and A[min]

An example of a simple (and often low-performance) sorting algorithm is Selection Sort
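A direct Python rendering of the SelectionSort pseudocode might look like the sketch below. The function name and the sample list are illustrative; the list is sorted in place, so no second array is needed.

```python
def selection_sort(a: list) -> None:
    """Sort the list a in ascending order, in place."""
    n = len(a)
    for i in range(n - 1):
        smallest = i                     # index of the smallest item seen so far
        for j in range(i + 1, n):
            if a[j] < a[smallest]:       # one key comparison
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]


data = [89, 45, 68, 90, 29, 34, 17]
selection_sort(data)
print(data)  # [17, 29, 34, 45, 68, 89, 90]
```

The inner loop always runs to the end of the list, so the number of key comparisons depends only on n, not on the initial order of the input.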

Page 28: Kk20503 1 introduction

Searching Problems
• Problem: find a given value, called a search key, in a given set.
  • E.g. ID, ref number, name, etc.
• Search is a hugely important problem
  • Retrieving data from database storage
  • Student records, company accounts etc.
• And more general search…
  • Searching the web for information
  • Searching to solve problems in AI

Page 29: Kk20503 1 introduction

Searching Problems
• Examples of searching algorithms
  • Sequential search (Time: O(n))
  • Binary search (Time: O(log n))

Binary Search
// Input: sorted array a_i < … < a_j and key x
while i ≤ j do
    m ← ⌊(i+j)/2⌋
    if x = a_m then output a_m and stop
    if x < a_m then j ← m-1
    else i ← m+1
// if the loop ends without output, x is not in the array
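A runnable Python sketch of binary search is given below, assuming a list sorted in ascending order. Two choices here are illustrative rather than taken from the slide: the function returns the index of the key instead of the key itself, and it returns -1 when the key is absent.

```python
def binary_search(a: list, x) -> int:
    """Index of x in the sorted list a, or -1 if x is not present."""
    i, j = 0, len(a) - 1
    while i <= j:
        m = (i + j) // 2          # middle of the remaining range
        if a[m] == x:
            return m
        if x < a[m]:
            j = m - 1             # continue in the left half
        else:
            i = m + 1             # continue in the right half
    return -1


values = [3, 14, 27, 31, 39, 42, 55, 70]
print(binary_search(values, 31))  # 3
print(binary_search(values, 5))   # -1
```

Each iteration halves the range still under consideration, which is where the O(log n) running time comes from.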

Page 30: Kk20503 1 introduction

String Processing Problems
• A string is a sequence of characters from an alphabet.
  • “This is a string” is a string
• Text strings: letters, numbers, and special characters.
• String matching: searching for a given word/pattern in a text.
• Examples
  • searching for a word or phrase on WWW or in a Word document
  • searching for a particular sequence of {A,C,G,T} in a reference genomic sequence

Page 31: Kk20503 1 introduction

Graph Problems
• (Informally) A graph is a collection of points called vertices, some of which are connected by line segments called edges.
• Graphs are extremely important for modeling real-life problems
  • Modeling WWW
  • Social relationships
  • Communication networks
  • Project scheduling …
• Examples of graph algorithms
  • Graph traversal algorithms
  • Shortest-path algorithms
  • Topological sorting
[Figure: example graph on vertices 1, 2, 3, 4]

Page 32: Kk20503 1 introduction

Graph Problems
• Example: Travelling Salesman Problem
  • For a set of cities linked by roads, find the shortest route that visits each city exactly once.
• Important to
  • Businesses and route planning
  • Circuit board and VLSI design and manufacture
  • X-ray crystallography
  • Genetic engineering
• TSP, like many other graph problems, is extremely difficult and no general efficient algorithms are known
  • In fact, they may not exist…

Page 33: Kk20503 1 introduction

Combinatorial Problems
• Finding a Combinatorial Object that satisfies certain constraints or has certain properties
  • Sets, subsets
• Permutations
  • Set P of all ordered sequences of r distinct elements from a set S of n elements
  • E.g. for S = {1,2,3} and r=3, P = {(1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1)}
  • E.g. for S = {1,2,3} and r=2, P = {(1,2), (1,3), (2,1), (2,3), (3,1), (3,2)}
  • The number of permutations is nPr = n!/(n-r)!
• Combinations
  • Set C of all unordered selections of r distinct elements from a set S of n elements
  • E.g. for S = {1,2,3} and r=2, C = {(1,2), (1,3), (2,3)}
  • E.g. for S = {1,2,3} and r=3, C = {(1,2,3)}
  • The number of combinations is nCr = n!/((n-r)! r!)

Combinatorial problems are among the hardest known
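The permutation and combination examples for S = {1,2,3} can be reproduced with Python's standard library. This is only a quick check of the counts nPr and nCr, not an algorithm from the slides; `itertools.permutations`, `itertools.combinations`, `math.perm` and `math.comb` are standard library functions (the last two need Python 3.8 or later).

```python
from itertools import combinations, permutations
from math import comb, perm

S = [1, 2, 3]

perms_r2 = list(permutations(S, 2))   # ordered selections of 2 elements
combs_r2 = list(combinations(S, 2))   # unordered selections of 2 elements

print(perms_r2)  # [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]
print(combs_r2)  # [(1, 2), (1, 3), (2, 3)]

# The list sizes agree with nPr = n!/(n-r)! and nCr = n!/((n-r)! r!)
print(len(perms_r2) == perm(3, 2), len(combs_r2) == comb(3, 2))  # True True
```

Both counts grow very quickly with n, which is one reason exhaustive approaches to combinatorial problems become impractical.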

Page 34: Kk20503 1 introduction

Combinatorial Problems
• Example: The Knapsack Problem
  • Given a set of items, each with a weight and a value, determine the most valuable subset of items with total weight less than or equal to a given limit
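To make the problem statement concrete, here is a brute-force Python sketch that simply tries every subset of the items. This is one straightforward approach under stated assumptions, not necessarily the method the course will present; the function name and the item data (weight, value pairs) are invented for illustration.

```python
from itertools import combinations


def knapsack_brute_force(items, capacity):
    """Return (best_value, best_subset) over all subsets that fit.

    items is a list of (weight, value) pairs; capacity is the weight limit.
    """
    best_value, best_subset = 0, ()
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            weight = sum(w for w, _ in subset)
            value = sum(v for _, v in subset)
            if weight <= capacity and value > best_value:
                best_value, best_subset = value, subset
    return best_value, best_subset


# Hypothetical data: four items as (weight, value), weight limit 10
items = [(7, 42), (3, 12), (4, 40), (5, 25)]
print(knapsack_brute_force(items, 10))  # (65, ((4, 40), (5, 25)))
```

With n items there are 2^n subsets to examine, so this sketch is only usable for very small inputs.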

Page 35: Kk20503 1 introduction

Geometric Problems
• Problems dealing with Geometric Objects such as points, lines, polygons
• Examples
  • Finding closest pair of points
  • Finding convex hull of a set of points

Page 36: Kk20503 1 introduction

Numerical Problems
• Problems dealing with mathematical objects
  • Numbers, Equations, Matrices…
• Examples
  • Efficient arithmetic
  • Evaluating functions
  • Solving systems of linear equations with Gaussian Elimination
  • Evaluating Definite Integrals
  • Optimising a function subject to constraints

Page 37: Kk20503 1 introduction

Fundamental Data Structures
• Linear Data Structures
  • Array
  • String
  • Linked list
  • Stack
  • Queue
  • Priority queue/heap
• Graph
  • General Graph
  • Tree
  • Binary Tree
• Set and Dictionary

Page 38: Kk20503 1 introduction

Linear Data Structures
• Arrays
  • A sequence of n items of the same data type that are stored contiguously in computer memory and made accessible by specifying a value of the array’s index.
• Array properties
  • fixed length (need preliminary reservation of memory)
  • contiguous memory locations
  • direct access
  • Operations of insert/delete
  • Often used to implement Lists and Strings

Page 39: Kk20503 1 introduction

Linear Data Structures
• Linked List
  • A sequence of zero or more nodes each containing two kinds of information: some data and one or more links called pointers to other nodes of the linked list.
  • Singly linked list (next pointer)
  • Doubly linked list (next + previous pointers)
• Linked List Properties
  • dynamic length
  • arbitrary memory locations
  • access by following links
  • Operations of insert/delete
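A singly linked list with the insert and delete operations listed above could be sketched in Python like this. The class and method names are illustrative, and Python object references stand in for the pointers.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    data: Any
    next: Optional["Node"] = None      # link to the following node


class SinglyLinkedList:
    def __init__(self) -> None:
        self.head: Optional[Node] = None

    def insert_front(self, data: Any) -> None:
        """Insert at the head; no other node has to move."""
        self.head = Node(data, self.head)

    def delete(self, data: Any) -> bool:
        """Remove the first node holding data; report whether one was found."""
        prev, cur = None, self.head
        while cur is not None and cur.data != data:
            prev, cur = cur, cur.next   # access by following links
        if cur is None:
            return False
        if prev is None:
            self.head = cur.next
        else:
            prev.next = cur.next
        return True

    def to_list(self) -> list:
        out, cur = [], self.head
        while cur is not None:
            out.append(cur.data)
            cur = cur.next
        return out


lst = SinglyLinkedList()
for item in (3, 2, 1):
    lst.insert_front(item)
lst.delete(2)
print(lst.to_list())  # [1, 3]
```

Unlike an array, reaching an element means walking the links from the head, but inserting or deleting only rewires a constant number of links.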

Page 40: Kk20503 1 introduction

Stacks and Queues
• Stacks
  • A stack of plates
  • Insertion/deletion can be done only at the top.
  • LIFO
  • Two operations (push and pop)
• Queues
  • A queue of customers waiting for services
  • Insertion/enqueue from the rear and deletion/dequeue from the front.
  • FIFO
  • Two operations (enqueue and dequeue)
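The LIFO and FIFO behaviours can be seen in a few lines of Python. This sketch uses a plain list as the stack and `collections.deque` as the queue; the two helper function names are illustrative.

```python
from collections import deque


def lifo_order(items):
    """Push every item onto a stack, then pop until it is empty."""
    stack = []
    for item in items:
        stack.append(item)             # push onto the top
    return [stack.pop() for _ in range(len(items))]   # pop from the top


def fifo_order(items):
    """Enqueue every item, then dequeue until the queue is empty."""
    queue = deque()
    for item in items:
        queue.append(item)             # enqueue at the rear
    return [queue.popleft() for _ in range(len(items))]  # dequeue at the front


print(lifo_order([1, 2, 3]))  # [3, 2, 1]
print(fifo_order([1, 2, 3]))  # [1, 2, 3]
```

A deque is used for the queue because removing from the front of a Python list takes time proportional to its length.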

Page 41: Kk20503 1 introduction

Priority Queues and Heaps
• Priority queues (implemented using heaps)
  • A data structure for maintaining a set of elements, each associated with a key/priority, with the following operations
• Operations
  • Finding the element with the highest priority
  • Deleting the element with the highest priority
  • Inserting a new element
[Figure: a heap drawn as a binary tree with root 9, second level 6 and 8, third level 5, 2 and 3, together with its array representation]
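The three priority queue operations can be sketched on top of Python's standard `heapq` module. Because `heapq` maintains a min-heap, this sketch stores negated keys so that the largest key comes out first; it therefore assumes numeric keys. The class and method names are illustrative.

```python
import heapq


class MaxPriorityQueue:
    """Priority queue where a larger key means a higher priority."""

    def __init__(self) -> None:
        self._heap = []                 # min-heap of negated keys

    def insert(self, key) -> None:
        heapq.heappush(self._heap, -key)

    def find_max(self):
        return -self._heap[0]           # smallest negated key = largest key

    def delete_max(self):
        return -heapq.heappop(self._heap)

    def __len__(self) -> int:
        return len(self._heap)


pq = MaxPriorityQueue()
for key in (5, 9, 2, 6, 8, 3):
    pq.insert(key)
print(pq.find_max())    # 9
print(pq.delete_max())  # 9
print(pq.delete_max())  # 8
```

`find_max` only reads the root of the heap, while insertion and deletion do work proportional to the height of the heap.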

Page 42: Kk20503 1 introduction

Graphs
• Formal definition
  • A graph G = <V, E> is defined by a pair of sets: a finite set V of items called vertices and a set E of vertex pairs called edges.
• Undirected and directed graphs (digraphs).
  • What’s the maximum number of edges in an undirected graph with |V| vertices?
• Complete graphs
  • A graph with every pair of its vertices connected by an edge is called complete, K_|V|
• Dense and sparse graphs
  • Dense graphs have edges between most nodes
  • Sparse graphs have few edges between nodes
[Figures: a complete, undirected graph and an incomplete, directed graph, each on vertices 1 to 4. Both graphs here are dense.]

Page 43: Kk20503 1 introduction

Graph Representation
• Nodes are said to be adjacent if an edge exists between them
• Adjacency Matrix
  • n x n boolean matrix if number of nodes |V| is n.
  • The element on the ith row and jth column is 1 if there’s an edge from ith vertex to the jth vertex; otherwise 0.
  • The adjacency matrix of an undirected graph is symmetric.
• Adjacency Linked Lists
  • A collection of linked lists, one for each vertex, that contain all the vertices adjacent to the list’s vertex.
• Which data structure would you use for dense and sparse graphs?

[Figure: incomplete, directed graph G on vertices 1, 2, 3, 4]

Adjacency Matrix for G
    1 2 3 4
1   0 1 1 1
2   0 0 0 1
3   0 0 0 1
4   0 0 0 0

Adjacency Linked List for G
1 → 2 → 3 → 4
2 → 4
3 → 4
4
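The two representations of G can be written down and converted in Python. In this sketch the matrix is a list of rows, the adjacency lists are a dictionary from vertex to list of neighbours (Python lists stand in for the linked lists), and vertices are numbered from 1 as on the slide. The function names are illustrative.

```python
# Adjacency matrix of the directed graph G; row u, column v is 1
# when there is an edge from vertex u+1 to vertex v+1.
G_MATRIX = [
    [0, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]


def matrix_to_lists(matrix):
    """Build adjacency lists {vertex: [neighbours]} from an adjacency matrix."""
    n = len(matrix)
    return {u + 1: [v + 1 for v in range(n) if matrix[u][v]] for u in range(n)}


def is_symmetric(matrix):
    """True when the matrix equals its transpose (as for undirected graphs)."""
    n = len(matrix)
    return all(matrix[u][v] == matrix[v][u] for u in range(n) for v in range(n))


print(matrix_to_lists(G_MATRIX))  # {1: [2, 3, 4], 2: [4], 3: [4], 4: []}
print(is_symmetric(G_MATRIX))     # False, because G is directed
```

The matrix always stores n × n entries, while the lists store one entry per edge, which is the trade-off behind the dense versus sparse question.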

Page 44: Kk20503 1 introduction

Weighted Graphs
• Weighted graphs
  • Graphs or digraphs with numbers assigned to the edges.
[Figure: weighted graph on vertices 1, 2, 3, 4 with edge weights 5, 6, 7, 8 and 9]

Page 45: Kk20503 1 introduction

Graph Properties -- Paths and Connectivity
• Paths
  • A path from vertex u to v of a graph G is defined as a sequence of adjacent vertices that starts with u and ends with v.
  • Simple paths: All edges of a path are distinct.
  • Path lengths: the number of edges, or the number of vertices – 1.
[Figures: the same graph on vertices 1 to 5, shown twice]
• Simple Path from 4 to 3: (4,1,2,3), Path Length 3
• Path from 4 to 3: (4,1,2,5,4,1,2,3), Path Length 7

Page 46: Kk20503 1 introduction

Graph Properties -- Paths and Connectivity
• Connected graphs
  • A graph is said to be connected if for every pair of its vertices u and v there is a path from u to v.
• Connected component
  • A maximal connected subgraph of a given graph.
[Figures: Connected Graph G1 and Non-Connected Graph G2, each on vertices 1 to 5]
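Connectivity of an undirected graph can be tested by starting at any vertex and checking whether every other vertex can be reached. The sketch below uses breadth-first search, which is one of several traversals that would work. The edge sets of the two sample graphs are assumptions made for illustration, since the slide figures are not reproduced here.

```python
from collections import deque


def is_connected(adj):
    """True if every vertex of the undirected graph adj is reachable
    from an arbitrary start vertex. adj maps vertex -> list of neighbours."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)


# Hypothetical example graphs on vertices 1 to 5
g1 = {1: [2, 4], 2: [1, 3, 5], 3: [2], 4: [1, 5], 5: [2, 4]}
g2 = {1: [2, 4], 2: [1], 3: [5], 4: [1], 5: [3]}

print(is_connected(g1))  # True
print(is_connected(g2))  # False: {1, 2, 4} and {3, 5} are separate components
```

The set `seen` at the end of the search is exactly the connected component containing the start vertex.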

Page 47: Kk20503 1 introduction

Graph Properties -- Acyclicity
• Cycle
  • A simple path of a positive length that starts and ends at the same vertex.
• Acyclic graph
  • A graph without cycles
  • DAG (Directed Acyclic Graph)
[Figures: a Directed Acyclic Graph on vertices 1 to 4, and a graph on vertices 1 to 5 containing the cycle (1,2,5,4,1)]

Page 48: Kk20503 1 introduction

Trees
• Trees
  • A tree (or free tree) is a connected acyclic graph.
  • Forest: a graph that has no cycles but is not necessarily connected.
• Properties of trees
  • |E| = |V| - 1
  • For every two vertices in a tree there always exists exactly one simple path from one of these vertices to the other….
[Figures: two example graphs, on vertices 1 to 5 and 1 to 7, captioned “Free Tree” and “Free Tree that is also a forest”]

Page 49: Kk20503 1 introduction

Rooted Trees
• This “single path” property makes it possible to select an arbitrary vertex in a free tree and consider it as the root of the so-called rooted tree.
• A root then defines levels in the tree.
  • Number of edges from the root
[Figures: a free tree on vertices 1 to 5 and a rooted version of it, captioned “Free Tree” and “Rooted Tree with 2 levels”]

Page 50: Kk20503 1 introduction

Rooted Trees
• Ancestors
  • For any vertex v in a tree T, all the vertices on the simple path from the root to that vertex are called ancestors of v.
• Descendants
  • All the vertices for which a vertex v is an ancestor are said to be descendants of v.
• Parent, child and siblings
  • If (u, v) is the last edge of the simple path from the root to vertex v, u is said to be the parent of v and v is called a child of u.
  • Vertices that have the same parent are called siblings.
• Leaves
  • A vertex without children is called a leaf.
• Subtree
  • A vertex v with all its descendants is called the subtree of T rooted at v.
[Figure: rooted tree on vertices 1 to 9]
Examples:
• 1 is an ancestor of 6 and 7
• 8 is a descendant of 5
• 1 is parent (and ancestor) of 6 and 7
• 8 and 9 are children (and descendants) of 5
• 6 and 7 are siblings
• 8 and 9 are siblings
• 4, 6, 7, 8, 9 are leaves
• The subtree rooted at 5 consists of 5, 8, 9

Page 51: Kk20503 1 introduction

Rooted Trees
• Depth of a vertex
  • The length of the simple path from the root to the vertex.
• Height of a tree
  • The length of the longest simple path from the root to a leaf.
[Figure: rooted tree on vertices 1 to 9]
Examples:
• Vertex 2 has depth d=2
• Vertex 7 has depth d=3
• The tree itself has height h=3

Page 52: Kk20503 1 introduction

Ordered Trees
• Ordered trees
  • An ordered tree is a rooted tree in which all the children of each vertex are ordered.
• Binary trees
  • A binary tree is an ordered tree in which every vertex has no more than two children and each child is designated as either a left child or a right child of its parent.
[Figure: Binary Tree with root 9, second level 6 and 8, and third-level vertices 5, 2 and 3 marked L, R and L]

Page 53: Kk20503 1 introduction

Ordered Trees
• Binary search trees
  • Each vertex is assigned a number.
  • A number assigned to each parental vertex is larger than all the numbers in its left subtree and smaller than all the numbers in its right subtree.
• ⌊log2 n⌋ ≤ h ≤ n – 1, where h is the height of a binary tree and n the size.
[Figure: binary search tree with root 6, second level 3 and 9, third level 2, 5 and 8]
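The binary search tree property and the height bounds can be tried out with a small Python sketch. The class and function names are illustrative, duplicate keys are ignored as a simplifying assumption, and the height of an empty tree is taken to be -1 so that a single vertex has height 0.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key below root, keeping smaller keys left and larger keys right."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    elif key > root.key:
        root.right = bst_insert(root.right, key)
    return root                          # duplicates are ignored


def bst_search(root, key):
    """Follow one root-to-leaf path; True if key is found on it."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root is not None


def height(root):
    """Number of edges on the longest path from root to a leaf (-1 if empty)."""
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))


def build(keys):
    root = None
    for key in keys:
        root = bst_insert(root, key)
    return root


balanced = build([6, 3, 9, 2, 5, 8])     # insertion order gives root 6
chain = build([1, 2, 3, 4, 5, 6])        # sorted input gives a right-leaning chain
print(height(balanced), height(chain))   # 2 5
print(bst_search(balanced, 5), bst_search(balanced, 7))  # True False
```

With n = 6, the first tree reaches the lower bound ⌊log2 6⌋ = 2 and the second reaches the upper bound n – 1 = 5, so the same keys can give very different search costs depending on insertion order.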

Page 54: Kk20503 1 introduction

Sets and Dictionaries
• I will assume you already know what a set is!
  • Elements, subsets, union, intersection etc.
• Dictionaries are Sets with 3 operations:
  • Searching for a given element
  • Inserting a new element
  • Deleting an existing element

Please read the textbook for more detail…