Programming Interest Group http://www.comp.hkbu.edu.hk/~chxw/pig/index.htm Tutorial Two Data Structures

Upload: kelly-dorsey

Post on 28-Dec-2015



Data Structures

Basic data types:
  Integral: integer, character, boolean
  Floating-point types: float, double, long double

Data structures are methods of organizing large amounts of data.
  Array
  List, Stack, Queue, Deque
  Trees: binary tree, binary search tree, AVL tree
  Priority Queues
  Hash table
  Set
  Graph

COMP1200: Data Structures and Algorithms

Elementary Data Structures

A data type is a set of values and a collection of operations on those values.

Basic data types in C and C++:
  Integers (ints): short int, int, long int
  Floating-point numbers (floats): float, double
  Characters (chars): char

Structure in C and C++

Example 1: Basic Data Types

    #include <iostream>
    #include <stdlib.h>
    #include <math.h>

    using namespace std;

    typedef int Number;

    /* return a random integer in [0, RAND_MAX] */
    Number randNum()
    {
        return rand();
    }

    int main(int argc, char *argv[])
    {
        int N = atoi(argv[1]);      /* how many numbers to generate */
        float m1 = 0.0, m2 = 0.0;   /* mean and mean of squares */
        for (int i = 0; i < N; i++) {
            Number x = randNum();
            m1 += ((float)x) / N;
            m2 += ((float)x*x) / N;
        }
        cout << "RAND_MAX.: " << RAND_MAX << endl;
        cout << "Avg.:" << m1 << endl;
        cout << "Std. dev.: " << sqrt(m2 - m1 * m1) << endl;
    }

This program computes the average and standard deviation of a sequence of integers generated by the library function rand( ).

Question: how can you modify the program to handle a sequence of random floating-point numbers in the range [0, 1]?
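One possible answer to the question, as a minimal sketch rather than the intended solution: scale rand() by RAND_MAX to get a value in [0, 1], then apply the same mean and standard deviation formula as Example 1. The names randUnit, Stats and summarize are hypothetical and do not appear in the tutorial.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <vector>

// Hypothetical helper: a pseudo-random value in [0, 1],
// obtained by scaling rand() by RAND_MAX.
double randUnit()
{
    return static_cast<double>(std::rand()) / RAND_MAX;
}

// Hypothetical result type for the two statistics.
struct Stats {
    double mean;
    double stddev;
};

// Same formula as Example 1: stddev = sqrt(E[x^2] - E[x]^2).
Stats summarize(const std::vector<double>& xs)
{
    assert(!xs.empty());
    double m1 = 0.0, m2 = 0.0;
    for (double x : xs) {
        m1 += x / xs.size();
        m2 += x * x / xs.size();
    }
    double var = m2 - m1 * m1;
    if (var < 0.0) var = 0.0;   // guard against tiny negative rounding errors
    return Stats{m1, std::sqrt(var)};
}
```

To mirror Example 1, fill a vector with N calls to randUnit() and print the two fields of the returned Stats.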

Example 2: Structure

    #include <iostream>
    #include <stdlib.h>
    #include <math.h>

    using namespace std;

    struct mypoint { float x; float y; };

    float mydistance(mypoint, mypoint);
    void mypolar(mypoint, float *r, float *theta);

    int main(int argc, char *argv[])
    {
        struct mypoint a, b;
        a.x = 1.0; a.y = 1.0;
        b.x = 4.0; b.y = 5.0;
        cout << "Distance is " << mydistance(a, b) << endl;
        float r, theta;
        mypolar(a, &r, &theta);
        cout << "r : " << r << endl;
        cout << "theta: " << theta << endl;
    }

    /* return the distance between two points */
    float mydistance(mypoint a, mypoint b)
    {
        float dx = a.x - b.x;
        float dy = a.y - b.y;
        return sqrt(dx*dx + dy*dy);
    }

    /* convert from Cartesian to polar coordinates;
       the results are written through the pointers r and theta */
    void mypolar(mypoint p, float *r, float *theta)
    {
        *r = sqrt(p.x*p.x + p.y*p.y);
        *theta = atan2(p.y, p.x);
    }

Result:

    [chxw@csr40 cplus]$ ./a.out
    Distance is 5
    r : 1.41421
    theta: 0.785398

Arrays

The array is the most fundamental data structure.

An array is a fixed collection of same-type data that are stored contiguously and are accessible by an index.

It is the responsibility of the programmer to use indices that are nonnegative and smaller than the array size.

Two ways to create an array:
  Static allocation: size known to and set by the programmer
  Dynamic allocation: size unknown to the programmer and set by the user at execution time

Example: Sieve of Eratosthenes

    #include <iostream>
    using namespace std;

    static const int N = 1000;

    int main( )
    {
        int i, a[N];
        /* initialization: assume every i >= 2 is prime */
        for (i = 2; i < N; i++) a[i] = 1;
        for (i = 2; i < N; i++)
            if (a[i])
                /* sieve i's multiples up to N-1 */
                for (int j = i; j*i < N; j++) a[i*j] = 0;
        /* whatever is still marked 1 is prime */
        for (i = 2; i < N; i++)
            if (a[i]) cout << " " << i;
        cout << endl;
    }

The Sieve of Eratosthenes is a classical method for computing a table of prime numbers.

Basic idea: set a[i] to 1 if i is prime, and to 0 if i is not prime.

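Dynamic Memory Allocation

  C language: malloc( ) and free( )
  C++ language: operator new and operator delete

    #include <iostream>
    #include <stdlib.h>
    #include <new>
    using namespace std;

    int main(int argc, char *argv[])
    {
        int N = atoi(argv[1]);
        /* plain new throws std::bad_alloc on failure;
           the nothrow form returns a null pointer instead */
        int *a = new (nothrow) int[N];
        if (a == 0) { cout << "out of memory" << endl; return 0; }
        …
        delete [] a;
    }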

Array of Structures

    #include <iostream>
    #include <stdlib.h>
    #include <math.h>

    using namespace std;

    struct mypoint { float x; float y; };

    float mydistance(mypoint, mypoint);
    float randfloat( );

    int main(int argc, char *argv[])
    {
        float d = atof(argv[2]);              /* distance threshold */
        int i, cnt = 0, N = atoi(argv[1]);    /* N random points */
        mypoint *a = new mypoint[N];
        for (i = 0; i < N; i++) {
            a[i].x = randfloat();
            a[i].y = randfloat();
        }
        /* check every unordered pair (i, j) once */
        for (i = 0; i < N; i++)
            for (int j = i+1; j < N; j++)
                if (mydistance(a[i], a[j]) < d) cnt++;
        cout << cnt << " pairs within " << d << endl;
        delete [] a;
    }

    /* return the distance between two points */
    float mydistance(mypoint a, mypoint b)
    {
        float dx = a.x - b.x;
        float dy = a.y - b.y;
        return sqrt(dx*dx + dy*dy);
    }

    /* return a random number between 0 and 1 */
    float randfloat( )
    {
        return 1.0 * rand() / RAND_MAX;
    }

This program counts the pairs of points whose distance is shorter than a threshold.

List

A general list of elements A1, A2, …, AN, associated with a set of operations:
  Insert: add an element
  Delete: remove an element
  Find: find the position of an element (search)
  FindKth: find the kth element

Each element has a fixed position.

Two different implementations:
  Array-based list
  Linked list
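The four operations can be sketched for the array-based implementation as follows. This is only an illustration: the class name ArrayList, the int element type and the 0-based positions are assumptions, and the storage is a std::vector rather than a raw array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical array-based list of ints (positions are 0-based).
class ArrayList {
public:
    // Insert: x becomes the element at position pos.
    // Later elements shift one slot to the right, so this is O(N).
    void insert(std::size_t pos, int x) {
        assert(pos <= items_.size());
        items_.insert(items_.begin() + pos, x);
    }

    // Delete: remove the first occurrence of x; returns false if x is absent.
    // Later elements shift one slot to the left, so this is O(N).
    bool remove(int x) {
        int pos = find(x);
        if (pos < 0) return false;
        items_.erase(items_.begin() + pos);
        return true;
    }

    // Find: position of the first occurrence of x, or -1 (linear search).
    int find(int x) const {
        for (std::size_t i = 0; i < items_.size(); ++i)
            if (items_[i] == x) return static_cast<int>(i);
        return -1;
    }

    // FindKth: the element at position k, in constant time.
    int findKth(std::size_t k) const {
        assert(k < items_.size());
        return items_[k];
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<int> items_;
};
```

The trade-off against a linked list shows up in the comments: FindKth is a single index operation here, while Insert and Delete have to move elements.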

List

Linked list: A1 -> A2 -> A3

Linked list with a header: header -> A1 -> A2 -> A3

Doubly linked list: A1 <-> A2 <-> A3

Sample C Implementation of Linked List with a Header

Header files:
  http://www.comp.hkbu.edu.hk/~chxw/pig/code/fatal.h
  http://www.comp.hkbu.edu.hk/~chxw/pig/code/list.h

Source file: http://www.comp.hkbu.edu.hk/~chxw/pig/code/list.h
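The linked files are not reproduced here. As a rough illustration of the idea only, a linked list with a header node might look like the sketch below. It is written in C++ rather than C, and the names Node, LinkedList, findPrevious and insertAfter are assumptions, not the interface declared in list.h.

```cpp
#include <cassert>

// Hypothetical node type: one element plus a link to its successor.
struct Node {
    int item;
    Node* next;
};

// Hypothetical singly linked list with a header (dummy) node.
// The header holds no element, so inserting or deleting at the
// front needs no special case.
class LinkedList {
public:
    LinkedList() : header_(new Node{0, nullptr}) {}
    ~LinkedList() {
        while (header_ != nullptr) {
            Node* t = header_;
            header_ = header_->next;
            delete t;
        }
    }
    LinkedList(const LinkedList&) = delete;
    LinkedList& operator=(const LinkedList&) = delete;

    Node* header() const { return header_; }

    // Find: the first node holding x, or nullptr if x is absent.
    Node* find(int x) const {
        Node* p = header_->next;
        while (p != nullptr && p->item != x) p = p->next;
        return p;
    }

    // The node just before the first node holding x;
    // if x is absent this is the last node of the list.
    Node* findPrevious(int x) const {
        Node* p = header_;
        while (p->next != nullptr && p->next->item != x) p = p->next;
        return p;
    }

    // Insert x right after node p (p may be the header).
    void insertAfter(Node* p, int x) {
        assert(p != nullptr);
        p->next = new Node{x, p->next};
    }

    // Delete the first occurrence of x; returns false if x is absent.
    bool remove(int x) {
        Node* p = findPrevious(x);
        if (p->next == nullptr) return false;
        Node* t = p->next;
        p->next = t->next;
        delete t;
        return true;
    }

private:
    Node* header_;
};
```

Once the position is known, insertion and deletion only relink pointers; finding the position is still a linear scan.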

Circular List Example

Josephus problem: N people decide to elect a leader as follows:
  Arrange themselves in a circle
  Eliminate every Mth person around the circle
  The last remaining person will be the leader

Simulation of Josephus problem

    #include <iostream>
    #include <stdlib.h>

    using namespace std;

    struct mynode {
        int item;
        mynode* next;
        /* constructor */
        mynode(int x, mynode* t) { item = x; next = t; }
    };

    typedef mynode *mylink;

    int main(int argc, char *argv[])
    {
        int i, N = atoi(argv[1]), M = atoi(argv[2]);

        /* create the first node */
        mylink t = new mynode(1, 0);
        t->next = t;
        mylink x = t;

        /* insert the next N-1 nodes */
        for (i = 2; i <= N; i++)
            x = (x->next = new mynode(i, t));

        /* simulate the election process */
        while (x != x->next) {
            for (i = 1; i < M; i++) x = x->next;
            /* delete the next node */
            t = x->next;
            x->next = t->next;
            delete t;
        }
        cout << x->item << endl;
    }

Stacks

A stack is a list with the restriction that insertions and deletions can be performed only at one end of the list, called the top.
  LIFO: last in, first out

Operations:
  Push(x, s)
  Pop(s)
  MakeEmpty(s)
  IsEmpty(s)
  Top(s)

Stack Implementations

Using a linked list
  http://www.comp.hkbu.edu.hk/~chxw/pig/code/stackli.h
  http://www.comp.hkbu.edu.hk/~chxw/pig/code/stackli.c

Using an array
  http://www.comp.hkbu.edu.hk/~chxw/pig/code/stackar.h
  http://www.comp.hkbu.edu.hk/~chxw/pig/code/stackar.c

Remark: you need to define the maximum stack size when creating the stack
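To make the remark about the maximum size concrete, here is a minimal sketch of an array-based stack. It is written in C++ and is not the code in stackar.c; the class name ArrayStack, the int element type and the choice to make push() return false when the stack is full are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity stack of ints stored in an array.
class ArrayStack {
public:
    // The maximum size is fixed when the stack is created.
    explicit ArrayStack(std::size_t maxSize) : data_(maxSize), top_(0) {}

    bool isEmpty() const { return top_ == 0; }
    bool isFull() const { return top_ == data_.size(); }
    void makeEmpty() { top_ = 0; }

    // Push x; returns false and leaves the stack unchanged if it is full.
    bool push(int x) {
        if (isFull()) return false;
        data_[top_++] = x;
        return true;
    }

    // Top: the most recently pushed element (the stack must not be empty).
    int top() const {
        assert(!isEmpty());
        return data_[top_ - 1];
    }

    // Pop: remove the most recently pushed element (the stack must not be empty).
    void pop() {
        assert(!isEmpty());
        --top_;
    }

private:
    std::vector<int> data_;
    std::size_t top_;   // number of stored elements; the top is data_[top_ - 1]
};
```

Every operation touches only the end of the array, so each one runs in constant time.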

Queues

A queue is a list with the restriction that insertion is done at one end, whereas deletion is done at the other end.
  FIFO: first in, first out

Operations:
  CreateQueue(x): create a queue with maximum size x
  Enqueue(x, q): insert an element x at the end of the list
  Dequeue(q): return and remove the element at the start of the list
  IsEmpty(q) and IsFull(q)

Queue Implementation

Implemented by a circular array
  Need to specify the maximum size of the queue when creating the queue
  One variable for the front of the queue, another one for the rear of the queue

Sample code
  http://www.comp.hkbu.edu.hk/~chxw/pig/code/queue.h
  http://www.comp.hkbu.edu.hk/~chxw/pig/code/queue.c
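The circular-array idea can be sketched as below. This is not the code in queue.c: the class name ArrayQueue and the int element type are assumptions, and the sketch stores the front index plus an element count, deriving the rear position from them instead of keeping a separate rear variable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity queue of ints stored in a circular array.
class ArrayQueue {
public:
    // CreateQueue: the maximum size is fixed at creation time.
    explicit ArrayQueue(std::size_t maxSize)
        : data_(maxSize), front_(0), size_(0) {}

    bool isEmpty() const { return size_ == 0; }
    bool isFull() const { return size_ == data_.size(); }

    // Enqueue x at the rear; returns false if the queue is full.
    bool enqueue(int x) {
        if (isFull()) return false;
        // The rear slot wraps around to index 0 after the last array slot.
        std::size_t rear = (front_ + size_) % data_.size();
        data_[rear] = x;
        ++size_;
        return true;
    }

    // Dequeue: return and remove the front element (the queue must not be empty).
    int dequeue() {
        assert(!isEmpty());
        int x = data_[front_];
        front_ = (front_ + 1) % data_.size();
        --size_;
        return x;
    }

private:
    std::vector<int> data_;
    std::size_t front_;   // index of the oldest element
    std::size_t size_;    // number of stored elements
};
```

Keeping a count avoids the usual ambiguity of a front/rear pair, where front == rear could mean either an empty or a full queue.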

Priority Queues

A priority queue is a data structure that allows the following operations:
  Insert(x, p): insert item x into priority queue p
  Maximum(p): return the item with the highest priority in priority queue p
  ExtractMax(p): return and remove the item with the highest priority in p

Note:
  Each element contains a key which represents its priority
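One common way to support these three operations is a binary heap. The sketch below is an illustration under assumptions: the class name MaxPriorityQueue is hypothetical, each item is just its integer key, and the heap ordering is delegated to the standard std::push_heap and std::pop_heap algorithms.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical max-priority queue of integer keys kept as a binary heap.
class MaxPriorityQueue {
public:
    // Insert: append the key, then restore the heap property. O(log N).
    void insert(int key) {
        heap_.push_back(key);
        std::push_heap(heap_.begin(), heap_.end());
    }

    // Maximum: the largest key sits at the front of the heap. O(1).
    int maximum() const {
        assert(!heap_.empty());
        return heap_.front();
    }

    // ExtractMax: pop_heap moves the largest key to the back,
    // where it can be read and removed. O(log N).
    int extractMax() {
        assert(!heap_.empty());
        std::pop_heap(heap_.begin(), heap_.end());
        int key = heap_.back();
        heap_.pop_back();
        return key;
    }

    bool isEmpty() const { return heap_.empty(); }

private:
    std::vector<int> heap_;
};
```

A sorted array or list would also work, but then either Insert or ExtractMax costs linear time.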

Sets

A set is a collection of unordered elements drawn from a given universal set U.

Operations:
  Member(x, S): is an item x an element of set S?
  Union(A, B)
  Intersection(A, B)
  Insert(x, S)
  Delete(x, S)
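As an illustration, these operations map fairly directly onto std::set and the standard set algorithms. The helper names member, setUnion and setIntersection are hypothetical wrappers chosen for this sketch; Insert(x, S) and Delete(x, S) correspond to s.insert(x) and s.erase(x).

```cpp
#include <algorithm>
#include <iterator>
#include <set>

typedef std::set<int> IntSet;

// Member(x, S): is x an element of s?
bool member(int x, const IntSet& s)
{
    return s.count(x) > 0;
}

// Union(A, B): every element that is in a or in b.
IntSet setUnion(const IntSet& a, const IntSet& b)
{
    IntSet out;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::inserter(out, out.begin()));
    return out;
}

// Intersection(A, B): every element that is in both a and b.
IntSet setIntersection(const IntSet& a, const IntSet& b)
{
    IntSet out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::inserter(out, out.begin()));
    return out;
}
```

std::set keeps its elements in sorted order, which is what set_union and set_intersection require of their input ranges.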

Dictionaries

Dictionaries permit content-based retrieval.

Operations:
  Insert(x, d)
  Delete(x, d)
  Search(k, d): return an item with key k

Note:
  Dictionaries can be implemented with many techniques, such as linked lists, arrays, trees, and hashing.
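A small sketch of the three operations, assuming a tree-based implementation through std::map. The class name Dictionary, the string keys and the int values are assumptions made for the example.

```cpp
#include <map>
#include <string>

// Hypothetical dictionary from string keys to int values,
// backed by std::map (a balanced search tree).
class Dictionary {
public:
    // Insert: add the pair, or overwrite the value if the key already exists.
    void insert(const std::string& key, int value) {
        items_[key] = value;
    }

    // Delete: remove the entry with this key, if there is one.
    void remove(const std::string& key) {
        items_.erase(key);
    }

    // Search: pointer to the value stored under key, or nullptr if absent.
    const int* search(const std::string& key) const {
        std::map<std::string, int>::const_iterator it = items_.find(key);
        return it == items_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, int> items_;
};
```

With std::map all three operations take logarithmic time; a hash-based container would trade the sorted order for expected constant time.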

C++ Standard Template Library

The C++ STL provides implementations of many data structures.

Reference:
  http://www.sgi.com/tech/stl/
  http://www.cppreference.com/

Data structures (containers in C++):
  Sequential containers (see Workshop 7): Vectors, Lists, Double-ended Queues
  Associative containers (see Workshop 7): Sets, Multisets, Maps, Multimaps
  Container adaptors: Stacks, Queues, Priority Queues

List in C++

List is implemented as a doubly linked list of elements.
  Each element in a list has its own segment of memory and refers to its predecessor and its successor.

Disadvantage: lists do not provide random access.
  General access to an arbitrary element takes linear time. Hence lists don't support the [ ] operator.

Advantage: insertion or removal of an element is fast at any position.

http://www.cplusplus.com/reference/stl/list/
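The advantage and the disadvantage can be seen together in a short sketch. The function name editMiddle is hypothetical; the std::list and std::find calls are standard.

```cpp
#include <algorithm>
#include <cassert>
#include <list>

// Build the list a b c d e, then edit it in the middle.
std::list<char> editMiddle()
{
    std::list<char> coll;
    for (char c = 'a'; c <= 'e'; ++c)
        coll.push_back(c);

    // No [ ] operator: reaching 'c' means walking the list (linear time).
    std::list<char>::iterator pos = std::find(coll.begin(), coll.end(), 'c');
    assert(pos != coll.end());

    // With an iterator in hand, insertion and removal are constant time
    // and do not move any other element.
    coll.insert(pos, 'X');   // a b X c d e (pos still refers to 'c')
    coll.erase(pos);         // a b X d e
    return coll;
}
```

Doing the same edit on a vector would shift every element after the insertion point.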


List Example 1

    // list1.cpp
    #include <iostream>
    #include <list>
    using namespace std;

    int main()
    {
        list<char> coll;

        // append the letters 'a' to 'z'
        for (char c = 'a'; c <= 'z'; ++c)
            coll.push_back(c);

        // print and remove the first element until the list is empty
        while (! coll.empty() ) {
            cout << coll.front() << ' ';
            coll.pop_front();
        }
        cout << endl;

        return 0;
    }

    $ g++ list1.cpp
    $ ./a.out
    a b c d e f g h i j k l m n o p q r s t u v w x y z
    $


List Example 2

    // list2.cpp
    #include <iostream>
    #include <list>
    using namespace std;

    int main()
    {
        list<char> coll;

        for (char c='a'; c<='z'; ++c)
            coll.push_back(c);

        // read-only traversal with a const_iterator
        list<char>::const_iterator pos;
        for (pos = coll.begin(); pos != coll.end(); ++pos)
            cout << *pos << ' ';
        cout << endl;
    }

    $ g++ list2.cpp
    $ ./a.out
    a b c d e f g h i j k l m n o p q r s t u v w x y z
    $

Diagram: the iterator pos starts at begin() and is advanced with ++ until it reaches end().


List Example 3

    // list3.cpp
    #include <iostream>
    #include <list>
    #include <cctype>    // toupper
    using namespace std;

    int main()
    {
        list<char> coll;

        for (char c='a'; c<='z'; ++c)
            coll.push_back(c);

        // a non-const iterator allows the elements to be modified in place
        list<char>::iterator pos;
        for (pos = coll.begin(); pos != coll.end(); ++pos) {
            *pos = toupper(*pos);
            cout << *pos << ' ';
        }
        cout << endl;
    }

Stack in C++

    // stack.cpp
    #include <iostream>
    #include <stack>
    using namespace std;

    int main()
    {
        stack<int> s;

        for (int i=1; i<=10; ++i)
            s.push(i);

        while( !s.empty() ) {
            cout << s.top() << endl;
            s.pop();
        }

        return 0;
    }

  push(): insert an element
  pop(): remove the top element
  top(): access the top element
  size(): return the number of elements
  empty(): check whether the container is empty

Remark: pop() removes the top element and returns nothing. So usually we need to call top() to get the top element, then call pop() to remove it.

Queue in C++

    // queue.cpp
    #include <iostream>
    #include <queue>
    using namespace std;

    int main()
    {
        queue<int> s;

        for (int i=1; i<=10; ++i)
            s.push(i);

        while( !s.empty() ) {
            cout << s.front() << endl;
            s.pop();
        }

        return 0;
    }

  push(): insert an element
  pop(): remove the first element
  front(): access the first element
  back(): access the last element
  size(): return the number of elements
  empty(): check whether the container is empty

Queue Example II

    // queue2.cpp
    #include <iostream>
    #include <queue>
    #include <string>
    using namespace std;

    int main()
    {
        queue<string> q;

        q.push("These ");
        q.push("are ");
        q.push("more than ");

        // print and remove the first two elements
        cout << q.front(); q.pop();
        cout << q.front(); q.pop();

        q.push("four ");
        q.push("words!");

        // skip one element
        q.pop();
        cout << q.front(); q.pop();
        cout << q.front(); q.pop();

        cout << "number of elements in the queue: " << q.size() << endl;

        return 0;
    }

Priority Queue in C++

    // pqueue.cpp
    #include <iostream>
    #include <queue>
    using namespace std;

    int main()
    {
        priority_queue<int> s;

        s.push(5); s.push(4); s.push(8);
        s.push(9); s.push(2); s.push(7);
        s.push(6); s.push(3); s.push(10);

        while( !s.empty() ) {
            cout << s.top() << endl;
            s.pop();
        }

        return 0;
    }

  push(): insert an element
  pop(): remove the element with the highest priority
  top(): access the element with the highest priority
  size(): return the number of elements
  empty(): check whether the container is empty

By default, elements are sorted by operator < in descending order, i.e., the largest element has the highest priority.

Different Sorting Criterion

    // pqueue.cpp
    #include <iostream>
    #include <queue>
    #include <vector>
    #include <functional>   // greater
    using namespace std;

    int main()
    {
        // with greater<int>, the smallest element has the highest priority
        priority_queue<int, vector<int>, greater<int> > s;

        s.push(5); s.push(4); s.push(8);
        s.push(9); s.push(2); s.push(7);
        s.push(6); s.push(3); s.push(10);

        while( !s.empty() ) {
            cout << s.top() << endl;
            s.pop();
        }

        return 0;
    }

Three parameters when defining a priority queue:
  int: type of element
  vector<int>: the container that is used internally
  greater<int>: the sorting criterion (by default, it is less<>)
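The third parameter does not have to be a standard functor. As a sketch under assumptions, the example below orders a user-defined type by one of its fields; the names Task, LowerPriority and runOrder are hypothetical.

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Hypothetical element type: a task with an integer priority.
struct Task {
    int priority;
    std::string name;
};

// Sorting criterion: a is "less than" b when its priority is lower,
// so the task with the largest priority ends up on top.
struct LowerPriority {
    bool operator()(const Task& a, const Task& b) const {
        return a.priority < b.priority;
    }
};

// Return the task names in the order the priority queue hands them out.
std::vector<std::string> runOrder(const std::vector<Task>& tasks)
{
    std::priority_queue<Task, std::vector<Task>, LowerPriority> pq;
    for (std::size_t i = 0; i < tasks.size(); ++i)
        pq.push(tasks[i]);

    std::vector<std::string> order;
    while (!pq.empty()) {
        order.push_back(pq.top().name);
        pq.pop();
    }
    return order;
}
```

Reversing the comparison inside LowerPriority would give the same effect as greater<int> in the example above: the smallest priority would come out first.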

Java

java.util package
  http://java.sun.com/products/jdk
  http://java.sun.com/j2se/1.4.2/docs/api/java/util/package-summary.html

  Data structure    java.util class
  Stack             Stack
  Queue             ArrayList, LinkedList
  Dictionaries      HashMap, Hashtable
  Priority Queue    TreeMap
  Sets              HashSet

What to do now?

Choose your own weapon:
  C: write your own set of data structures
  C++: learn the STL
  Java: learn the java.util package

Try to solve at least one exercise. If you still have time, solve more exercises.

Practice

  http://acm.uva.es/p/v100/10038.html
  http://acm.uva.es/p/v100/10044.html
  http://acm.uva.es/p/v100/10050.html
  http://acm.uva.es/p/v101/10149.html
  http://acm.uva.es/p/v102/10205.html
  http://acm.uva.es/p/v102/10258.html
  http://acm.uva.es/p/v103/10315.html
  http://acm.uva.es/p/v8/843.html