data structures and algorithm - gascwkm.org
DATA STRUCTURES AND
ALGORITHM
ARRAYS
• An array stores a group of data items together in one place.
• In this data structure all the elements are stored
in contiguous memory locations.
Shown in figure:
Definition
• An array is a finite, ordered collection of
homogeneous data elements.
• Finite: it contains a limited number of elements.
• Ordered: the elements are stored one by one in contiguous
locations of the computer memory, in a linear ordered fashion.
• Homogeneous: all the elements of an array are of the same data
type.
The following are some examples:
An array of integers to store the ages of the students
in a class.
An array of strings to store the names of all villagers
in a village.
The array is generally a built-in data type in programming languages:
BASIC : DIM A(100)
FORTRAN : DIMENSION A(100)
Pascal : A: ARRAY[1..100] of Integer
C : int A[100];
Terminology
• Size:
The number of elements in an array is called the size of the array.
The size is also called the length or dimension.
• Type:
It represents the kind of data the array holds,
for example:
array of strings, array of characters, array of integers.
• Index:
All the elements in an array can be referenced by a subscript, like Ai or A[i]; this subscript is known as the index.
• Range of Indices:
The indices of array elements may range from a lower bound (L) to an upper bound (U).
The index of the i-th element is
Index(Ai) = L + i – 1
The size of the array is calculated as
Size(A) = U – L + 1
• Word:
It denotes the size of an element.
In each memory location a computer can store
an element of word size w.
The word size varies from machine to machine,
such as from 1 byte to 8 bytes.
If the size of an element is double the
word size of the machine, then storing such an element
requires two consecutive memory
locations.
ONE DIMENSIONAL ARRAY
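• If only one subscript/index is required to
reference all the elements in an array, then it is
termed a one dimensional array or simply an
array.
Memory Allocation of an Array:
• If the array is stored starting at memory location M, its indices start at 1 and each element occupies one word:
Address (A[i]) = M + (i – 1)
• Indexing formula for lower bound L and word size w:
Address (A[i]) = M + (i – L) × w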
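The size and indexing formulas can be checked with a few lines of Python. This is a minimal sketch: the function names and the sample base address 1000 are made up for illustration, and the bounds are assumed to be inclusive.

```python
def array_size(lower, upper):
    """Number of elements when the indices run from lower to upper inclusive."""
    return upper - lower + 1


def element_address(base, i, lower, word_size=1):
    """Address of A[i] when A[lower] is stored at address `base`."""
    return base + (i - lower) * word_size


print(array_size(1, 100))              # 100
print(element_address(1000, 5, 1, 4))  # 1016
```

With a lower bound of 1 and a word size of 1 the second function reduces to M + (i – 1).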
Operations on Arrays
The operations on arrays are subdivided into six categories:
1. Traversing
2. Sorting
3. Searching
4. Inserting
5. Deleting
6. Merging
Traversing
Searching
Sorting
Insertion
Deletion
Merging
Applications of Arrays
• There are numerous applications of arrays in
computation.
• This is why almost every programming
language includes the array as a built-in
data type.
Example:
• We can store the records as shown in figure:
MULTI DIMENSIONAL ARRAY
• Two dimensional array:
Two dimensional arrays (alternatively termed
matrices) are collections of homogeneous
elements in which the elements are arranged in a number of
rows and columns.
An example is an M x N matrix, where
M denotes the number of rows
N denotes the number of columns
Memory Representation of a
Matrix
• Like one dimensional arrays, matrices are
also stored in contiguous memory
locations.
• There are two conventions for storing
any matrix in memory:
1. Row-major order
2. Column-major order
• Row-major order:
The elements of a matrix are stored on a row
by row basis: all the elements of the first row, then those of the
second row, and so on.
• Column-major order:
The elements are stored column by column:
the elements of the first column in their order of rows,
then the second column, the third column, and so
on.
Reference of elements in a matrix
• Logically a matrix appears as a two dimensional
array.
• Physically it is stored in a linear fashion.
• To map from the logical view to the physical
structure we need an indexing formula.
The indexing formulas for the two orders, for an m x n matrix
with indices starting at 1, are stated below. Each gives the position
of aij counted from the first stored element.
Row-major order:
Address (aij) = (i – 1) × n + j
Column-major order:
Address (aij) = (j – 1) × m + i
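A small sketch of the two indexing formulas, assuming 1-based row and column indices and positions counted from 1. The function names are illustrative only.

```python
def row_major_position(i, j, n):
    """Position of a[i][j] in row-major order; n is the number of columns."""
    return (i - 1) * n + j


def column_major_position(i, j, m):
    """Position of a[i][j] in column-major order; m is the number of rows."""
    return (j - 1) * m + i


# Element a[2][3] of a 3 x 4 matrix (m = 3 rows, n = 4 columns)
print(row_major_position(2, 3, n=4))     # 7
print(column_major_position(2, 3, m=3))  # 8
```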
Sparse Matrices:
A sparse matrix is a two dimensional array in which the majority of the
elements have the value null (zero).
Example:
Sparse matrices can be classified as:
STACKS
• A stack is a linear data structure that is very
useful in various applications of computer
science.
DEFINITION
• A stack is an ordered collection of
homogeneous data elements where the insertion
and deletion operations take place at one end only.
• The insertion and deletion operations in the
case of a stack are specially termed PUSH and
POP.
• The position of the stack where these operations
are performed is known as the TOP.
REPRESENTATION OF A
STACK
• Stack may be represented in the memory in
various ways.
• There are two main ways:
1. One Dimensional Array.
2. Single Linked List.
Array Representation of Stacks
• First we have to allocate a memory block of
sufficient size to accommodate the full
capacity of the stack.
• Item i denotes the i-th item in the stack; l and u
denote the index range of the array.
• Starting from the first location of the
memory block, the items of the stack can be
stored in a sequential fashion.
• TOP is a pointer to the position in the
array of the top item.
EMPTY :
TOP < l
FULL:
TOP >= u
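The EMPTY and FULL tests can be sketched as a small class. This is an illustrative sketch, not the PUSH_ARRAY / POP_ARRAY algorithms of the slides; it assumes the index range starts at l = 1, and the class and method names are made up.

```python
class ArrayStack:
    """Fixed-capacity stack stored in positions l..u of an array."""

    def __init__(self, capacity):
        self.l = 1                            # lower index (assumed to be 1)
        self.u = capacity                     # upper index
        self.items = [None] * (capacity + 1)  # slot 0 is unused
        self.top = self.l - 1                 # empty stack: TOP < l

    def is_empty(self):
        return self.top < self.l

    def is_full(self):
        return self.top >= self.u

    def push(self, item):
        if self.is_full():
            raise OverflowError("stack is full")
        self.top += 1
        self.items[self.top] = item

    def pop(self):
        if self.is_empty():
            raise IndexError("stack is empty")
        item = self.items[self.top]
        self.top -= 1
        return item


s = ArrayStack(2)
s.push(10)
s.push(20)
print(s.is_full())  # True
print(s.pop())      # 20
```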
Linked List Representation of
Stacks
• The array representation of a stack is very easy and
convenient, but it allows only fixed-size stacks.
• In several applications the size of the stack may
vary during program execution.
• The obvious solution to this problem is to
represent a stack using a linked list.
• In the linked list representation, the first node of
the list is the current item, that is, the item at
the top of the stack.
• The last node contains the bottom-most item.
• The PUSH operation adds a new node at
the front of the list.
• The POP operation removes the node at the front of the
list.
• The SIZE of the stack is not important here:
this representation allows a dynamic stack,
as opposed to the static stack obtained with arrays.
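A minimal sketch of a stack kept in a single linked list, where PUSH and POP both work on the front node. The names `Node`, `LinkedStack`, `data` and `link` are chosen for this illustration.

```python
class Node:
    def __init__(self, data, link=None):
        self.data = data
        self.link = link  # pointer to the next node (towards the bottom)


class LinkedStack:
    def __init__(self):
        self.top = None   # first node of the list is the top of the stack

    def is_empty(self):
        return self.top is None

    def push(self, item):
        # The new node becomes the first node of the list.
        self.top = Node(item, self.top)

    def pop(self):
        if self.top is None:
            raise IndexError("stack is empty")
        item = self.top.data
        self.top = self.top.link
        return item


s = LinkedStack()
for value in (1, 2, 3):
    s.push(value)
print(s.pop(), s.pop(), s.pop())  # 3 2 1
```

No capacity is fixed in advance, so there is no FULL test in this version.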
OPERATION ON STACKS
• PUSH:
To insert an item into a stack.
• POP:
To remove an item from the stack.
• STATUS:
To know the present state of the stack.
• PUSH _ARRAY:
• POP_ARRAY
• STATUS _ARRAY
• PUSH_LINKED LIST
POP_LINKED LIST
STATUS _LINKED LIST
APPLICATIONS OF STACK
• Various applications of stacks are known.
• A classical application in compiler design is the
evaluation of arithmetic expressions.
• Here the compiler uses a stack to translate an input
arithmetic expression into its corresponding
object code.
• A machine with built-in stack hardware is called a stack machine.
• Another important application of a stack is
during the execution of recursive programs.
• Some programming languages use a stack to run
recursive programs.
• An important feature of any programming
language is the binding of memory variables.
• There are two scope rules:
static scope rule
dynamic scope rule.
• The implementation of such scope rules is
possible using a stack known as the runtime
stack.
EVALUATION OF ARITHMETIC
EXPRESSION
• An arithmetic expression consists of operands
and operators.
• Operands are variables or constants.
• Operators are of various types:
Arithmetic (+, -, *, /, %, ^)
Unary
Binary
Boolean (AND, OR, NOT, XOR)
Relational
Notation of Arithmetic
Expression
• There are three notations to represent an arithmetic expression:
– Infix
– Postfix
– Prefix
• INFIX NOTATION:
<operand> <operator> <operand>
Example:
A+B, C-D, E*F, G/H
This is called infix because the operator comes in between
the operands.
• PREFIX NOTATION:
<operator> <operand> <operand>
Example:
+AB, -CD, *EF, /GH, etc.
The prefix notation is also termed Polish notation.
• POSTFIX NOTATION:
<operand> <operand> <operator>
Example:
AB+, CD-, EF*, GH/, etc.
The postfix notation is just the reverse of the Polish
notation, hence it is also termed reverse Polish
notation.
ALGORITHM –INFIX TO
POSTFIX
• EXAMPLE:
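A minimal sketch of infix to postfix conversion with a stack. It is one common formulation and not necessarily the algorithm of the slides. Assumptions: operands are single letters or digits, the operators are + - * / ^, ^ is right associative, and the input is well formed.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}
RIGHT_ASSOCIATIVE = {"^"}


def infix_to_postfix(expression):
    output = []
    stack = []  # holds operators and left parentheses
    for symbol in expression.replace(" ", ""):
        if symbol.isalnum():          # operand goes straight to the output
            output.append(symbol)
        elif symbol == "(":
            stack.append(symbol)
        elif symbol == ")":           # pop back to the matching "("
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()
        else:                         # operator
            while stack and stack[-1] != "(" and (
                PRECEDENCE[stack[-1]] > PRECEDENCE[symbol]
                or (PRECEDENCE[stack[-1]] == PRECEDENCE[symbol]
                    and symbol not in RIGHT_ASSOCIATIVE)
            ):
                output.append(stack.pop())
            stack.append(symbol)
    while stack:                      # flush the remaining operators
        output.append(stack.pop())
    return "".join(output)


print(infix_to_postfix("A+B*C"))    # ABC*+
print(infix_to_postfix("(A+B)*C"))  # AB+C*
```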
ALGORITHM –EVALUATE POSTFIX
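A minimal sketch of postfix evaluation with a stack. Assumptions: the expression is given as a list of tokens, operands are numbers, only binary operators are supported, and the function name is illustrative.

```python
def evaluate_postfix(tokens):
    operations = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
        "^": lambda a, b: a ** b,
    }
    stack = []
    for token in tokens:
        if token in operations:
            right = stack.pop()   # the operand pushed last is the right one
            left = stack.pop()
            stack.append(operations[token](left, right))
        else:
            stack.append(float(token))
    return stack.pop()


print(evaluate_postfix("2 3 4 * +".split()))  # 14.0
```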
• Conversion of a postfix expression to a code:
Code generation for stack
machine
Algorithm postfix to code for stack machine:
Implementation of recursion
• Recursion is an important tool to describe a procedure having several repetitions of the same step.
• A procedure is termed recursive if the procedure is defined in terms of itself.
• As an example, consider the factorial of an integer n.
• Algorithm Factorial_I (iterative)
• Algorithm Factorial_R (recursive)
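The two versions of the factorial can be sketched as follows, assuming n is a non-negative integer. The Python function names are illustrative; the recursive version relies on the runtime stack to remember the pending multiplications.

```python
def factorial_iterative(n):
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def factorial_recursive(n):
    if n == 0:                     # base case stops the recursion
        return 1
    return n * factorial_recursive(n - 1)


print(factorial_iterative(5), factorial_recursive(5))  # 120 120
```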
QUEUES
Introduction:
A queue is a simple but very powerful data
structure used to solve numerous computer applications.
Like stacks, queues are also useful in various system
programs.
Let us discuss some simple applications of
queues in our everyday life.
DEFINITION:
A queue is a linear data structure, like an array, a stack and a linked list, where the ordering of elements is in a linear fashion.
Difference between stacks and queues
              STACK                        QUEUE
Operations    PUSH, POP                    INSERTION, DELETION
Position      At one end only (TOP)        At two ends (REAR and FRONT)
Order         LIFO (last in, first out)    FIFO (first in, first out)
REPRESENTATION OF
QUEUES
There are two ways to represent a queue in memory:
1. Using an array.
2. Using a linked list.
The first kind of representation uses a one
dimensional array and is a better choice where a queue
of fixed size is required.
The other representation uses a double linked list
and provides a queue whose size can vary during
processing.
Representation of a queue using an array:
One dimensional array: Q[1…N]
Pointers: REAR and FRONT indicate the two ends.
Three states of a queue with this representation are given
below:
Queue is empty: FRONT = 0, REAR = 0
Queue is full: REAR = N, FRONT = 1
Queue contains elements (>= 1): FRONT <= REAR
Number of elements = REAR – FRONT + 1
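A minimal sketch of a queue in an array Q[1…N] with FRONT and REAR pointers. The class and method names are illustrative, and this simple version denies insertion as soon as REAR reaches N.

```python
class ArrayQueue:
    def __init__(self, n):
        self.n = n
        self.q = [None] * (n + 1)  # slot 0 is unused so indices run 1..N
        self.front = 0             # FRONT = 0, REAR = 0 means empty
        self.rear = 0

    def is_empty(self):
        return self.front == 0

    def size(self):
        return 0 if self.is_empty() else self.rear - self.front + 1

    def enqueue(self, item):
        if self.rear == self.n:
            raise OverflowError("no room at the rear")
        if self.front == 0:        # first element of an empty queue
            self.front = 1
        self.rear += 1
        self.q[self.rear] = item

    def dequeue(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        item = self.q[self.front]
        if self.front == self.rear:  # last element removed
            self.front = self.rear = 0
        else:
            self.front += 1
        return item


q = ArrayQueue(3)
for value in ("a", "b", "c"):
    q.enqueue(value)
print(q.dequeue(), q.size())  # a 2
```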
Algorithm: Enqueue
Algorithm: Dequeue
Representation of a queue using a linked list:
The two states of the queue, either empty or containing some
elements, can be judged by the following tests:
Queue is empty:
FRONT = REAR = HEADER
HEADER→RLINK = NULL
Queue contains at least one element:
HEADER→RLINK ≠ NULL
VARIOUS QUEUE DATA STRUCTURES:
We have discussed two different queue data
structures, that is, using an array or using a
linked list. Other than these, there are some more
known queue structures.
Circular queue:
For a queue represented using an array, when the REAR
pointer reaches the end, insertion will be denied
even if room is available at the front.
One way to avoid this is to use a circular array.
Physically, a circular array is the same as an ordinary array, say
A[1…n], but logically A[1] is taken to follow A[n].
With this principle the two states of the queue,
that is empty or full, are decided as
follows:
Circular queue is empty:
FRONT = 0, REAR = 0
Circular queue is full:
FRONT = (REAR mod LENGTH) + 1
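A minimal sketch of a circular queue on indices 1..LENGTH. It assumes the full test FRONT = (REAR mod LENGTH) + 1, that is, the position after REAR is FRONT; the class and method names are illustrative and are not the Enqueue_CQ / Dequeue_CQ algorithms of the slides.

```python
class CircularQueue:
    def __init__(self, length):
        self.length = length
        self.cq = [None] * (length + 1)  # slot 0 is unused
        self.front = 0
        self.rear = 0

    def is_empty(self):
        return self.front == 0

    def is_full(self):
        return self.front == (self.rear % self.length) + 1

    def enqueue(self, item):
        if self.is_full():
            raise OverflowError("circular queue is full")
        if self.front == 0:
            self.front = 1
        self.rear = (self.rear % self.length) + 1  # wraps from LENGTH to 1
        self.cq[self.rear] = item

    def dequeue(self):
        if self.is_empty():
            raise IndexError("circular queue is empty")
        item = self.cq[self.front]
        if self.front == self.rear:      # last element removed
            self.front = self.rear = 0
        else:
            self.front = (self.front % self.length) + 1
        return item


cq = CircularQueue(3)
for value in ("a", "b", "c"):
    cq.enqueue(value)
cq.dequeue()       # frees position 1
cq.enqueue("d")    # REAR wraps around to position 1
print(cq.rear, cq.is_full())  # 1 True
```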
Algorithm: Enqueue_CQ
Algorithm :Dequeue_CQ
DEQUE:
Another variation of the queue is known as the deque
(may be pronounced 'deck').
Unlike a queue, in a deque both insertion and deletion
operations can be made at either end of the structure.
Actually, the term deque originated from
double ended queue.
Shown in fig.,
There are various ways of representing a deque on the
computer.
One simple way to represent it is using a double
linked list.
Another popular representation is using a circular
array.
The following four operations are possible on a deque:
1. Push_DQ(Item): to insert Item at the front end of a
deque.
2. Pop_DQ(): to remove the front Item from a deque.
3. Injection(Item): to insert Item at the rear end of a
deque.
4. Eject(): to remove the rear Item from the deque.
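The four operations map directly onto Python's built-in `collections.deque`. This sketch only illustrates the behaviour of the operations, not the circular array or double linked list representations.

```python
from collections import deque

d = deque()
d.appendleft("B")    # Push_DQ: insert at the front end
d.appendleft("A")    # Push_DQ
d.append("C")        # Injection: insert at the rear end

front = d.popleft()  # Pop_DQ: remove the front item
rear = d.pop()       # Eject: remove the rear item
print(front, rear, list(d))  # A C ['B']
```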
Algorithm:PUSH_DQ
Algorithm: Pop_DQ()
Same as the algorithm DEQUEUE_CQ
Algorithm: Eject()
There are, however, two known variations of the deque:
1. Input-restricted deque: allows insertion at one end only
(say the REAR end), but allows deletion at both ends.
2. Output-restricted deque: deletion takes place at one end only
(say the FRONT end), but insertion is allowed at both ends.
Priority Queue:
A priority queue is another variation of the queue
structure. Here, each element is assigned a
value called the priority of the element.
Elements can be inserted or deleted not only at the
ends but at any position in the queue.
Here, processing an element means one of two basic operations, namely
insertion or deletion.
There are various ways of implementing
a priority queue:
1. Using a simple/circular array
2. Multi-queue implementation
3. Using a double linked list
4. Using a heap tree
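As an illustration of the heap based option, here is a minimal sketch built on Python's `heapq` module. Assumptions made for this example: a smaller number means a higher priority, and elements of equal priority leave in their order of arrival. The class name is illustrative.

```python
import heapq
import itertools


class PriorityQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # arrival order breaks ties

    def insert(self, item, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), item))

    def delete(self):
        """Remove and return the item with the highest priority."""
        priority, _, item = heapq.heappop(self._heap)
        return item

    def is_empty(self):
        return not self._heap


pq = PriorityQueue()
pq.insert("batch job", 3)
pq.insert("interrupt", 1)
pq.insert("interactive user", 2)
print(pq.delete())  # interrupt
```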
Priority queue using an array:
Multi queue implementation:
Linked list representation of a priority queue:
Algorithm: insert _PQ
Algorithm: Delete_PQ
APPLICATIONS OF QUEUES:
One major application of queues is in
simulation.
Another important application of queues is
observed in various aspects of operating systems.
A multiprogramming environment uses
several queues to control the various programs.
Simulation:
Simulation is the modeling of a real life problem,
that is, a model of a real life situation in the form
of a computer program.
Its aim is to study the real life situation under the control of
various parameters which affect the real problem;
it is a research interest of system analysts and
operations research scientists.
Based on the results of the simulation, the actual problem
can be solved in an optimized way.
Another advantage of simulation is the ability to experiment in
danger areas. For example, areas such as military
operations are safer to simulate than to field test;
simulation is free from any risk as well as inexpensive.
Systems can be divided into:
Discrete system: e.g. ticket reservation
Continuous system: e.g. water flow through a pipe to a
reservoir
Deterministic system: for a given set of initial inputs, the final
outcome can be predicted.
Stochastic system: the outcome involves randomness and cannot be predicted exactly.
Simulation models can be divided into two kinds:
1. Event driven simulation.
2. Time driven simulation.
CPU scheduling in a multiprogramming environment:
A single CPU has to serve more than one program
simultaneously.
In a multiprogramming environment the possible
jobs for the CPU are categorized into three groups:
1. Interrupts to be serviced.
2. Interactive users to be serviced.
3. Batch jobs to be serviced.
ROUND ROBIN ALGORITHM: this is a
well known scheduling algorithm, designed
especially for time sharing systems.
Here we will see how a circular queue can be used to
implement such an algorithm.
First we describe the algorithm with an illustration. Suppose there are
n processes p1, p2, p3, …, pn.
The algorithm first decides a small unit of time called a
time quantum or time slice (τ).
A time quantum is generally from 10 to 100
milliseconds.
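A minimal sketch of round robin scheduling. Assumptions: all processes are ready at time 0, the burst times are invented for the example, and a `collections.deque` stands in for the circular queue (a process that is not finished is put back at the rear).

```python
from collections import deque


def round_robin(burst_times, quantum):
    """Return the names of the processes in the order they finish."""
    ready = deque(burst_times.items())  # (name, remaining time) pairs
    finished = []
    while ready:
        name, remaining = ready.popleft()
        if remaining > quantum:
            # Time slice used up: rejoin the queue at the rear.
            ready.append((name, remaining - quantum))
        else:
            finished.append(name)
    return finished


print(round_robin({"p1": 5, "p2": 2, "p3": 4}, quantum=2))
# ['p2', 'p3', 'p1']
```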
LINKED LIST
The linked list is called a dynamic structure,
where the amount of memory required can vary
during its use.
In a linked list, the adjacency between the
elements is maintained by means of links or pointers.
Definition:
A linked list is an ordered collection of
a finite number of homogeneous data elements called nodes,
where the linear order is maintained by means of
links or pointers.
Linked lists can be classified into three
major groups:
1. Single linked list.
2. Circular linked list.
3. Double linked list.
SINGLE LINKED LIST
In a single linked list each node contains
only one link, which points to the subsequent node in
the list.
Here, HEADER is an empty node: it holds no data, only the
pointer to the first node.
Representation of a linked list in memory:
There are two ways to represent a linked list in
memory:
1. Static representation using arrays.
2. Dynamic representation using a free pool of storage.
Static representation:
In the static representation of a single linked list, two arrays
are maintained: one array for data and the other for links.
Two parallel arrays of equal size are allocated, which should
be sufficient to store the entire linked list.
In some programming languages, for example ALGOL,
FORTRAN, BASIC, etc., such a representation is the only
way to manage a linked list.
Dynamic representation:
The efficient way of representing a linked list is
using the free pool of storage. It involves:
Memory bank: nothing but a collection of
free memory spaces.
Memory manager: a program, in fact. On a request for a node,
the memory manager searches the
memory bank for the requested block and, if found,
grants the desired block to the caller.
Garbage collector: a program that returns nodes which are no
longer in use to the memory bank.
The mechanism of the dynamic representation of a single
linked list is illustrated in fig., A and B.
The list of available memory spaces is maintained, and its
pointer is stored in AVAIL.
On a request for a node, the list AVAIL is searched for
a block of the right size.
If AVAIL is null or if a block of the desired size is not
found, the memory manager returns a message
accordingly.
Suppose the newly availed node is XY.
The memory manager returns the pointer to XY
to the caller in a temporary buffer.
Operations on a single linked list:
1.Traversing.
2.Inserting.
3.Deleting.
4.Copying.
5.Merging.
6.Searching.
Traversing a single linked list:
We visit every node, from the starting node to the end node.
Insert a node into a single linked list:
There are various positions where a node can be inserted:
1. At the front
2. At the end
3. At any other position
At front:
At end:
At any position:
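A minimal sketch of traversal and of the three insertion cases for a single linked list with an empty header node. The names `Node`, `SingleLinkedList`, `data` and `link` are chosen for this illustration, and "any position" is taken to mean "after the first node holding a given key".

```python
class Node:
    def __init__(self, data=None, link=None):
        self.data = data
        self.link = link


class SingleLinkedList:
    def __init__(self):
        self.header = Node()  # empty header node, points to the first node

    def traverse(self):
        items = []
        node = self.header.link
        while node is not None:
            items.append(node.data)
            node = node.link
        return items

    def insert_front(self, data):
        self.header.link = Node(data, self.header.link)

    def insert_end(self, data):
        node = self.header
        while node.link is not None:  # walk to the last node
            node = node.link
        node.link = Node(data)

    def insert_after(self, key, data):
        node = self.header.link
        while node is not None and node.data != key:
            node = node.link
        if node is None:
            raise KeyError(key)
        node.link = Node(data, node.link)


lst = SingleLinkedList()
lst.insert_front(2)
lst.insert_front(1)
lst.insert_end(4)
lst.insert_after(2, 3)
print(lst.traverse())  # [1, 2, 3, 4]
```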
TREES
Linear data structures give a one dimensional representation of data.
Non linear data structures give a two dimensional
representation of data.
A tree is a non linear data structure.
Family hierarchy of tree:
Algebraic expression:
Basic Terminology:
Node:
This is the main component of any tree structure. The
concept of a node is the same as in a linked list.
Parent:
The parent of a node is the immediate predecessor
of the node.
Child:
If the immediate predecessor of a node is the parent of the
node, then all immediate successors of a node are
known as its children.
The child on the left side of a node is the LEFT CHILD.
The child on the right side of a node is the RIGHT CHILD.
Link:
This is a pointer to a node in a tree. Left Child and Right Child are the two links
of a node.
Root:
This is a specially designated node which has no parent.
Leaf:
A node which is at the end and does not have any child is called a leaf.
Level:
The level is the rank of a node in the hierarchy.
Height:
The maximum number of nodes that is possible in a path starting from the root
node to a leaf node is called the height.
Degree:
The maximum number of children that is possible for a node is known as the
degree of the node.
Sibling:
Nodes which have the same parent are called siblings.
Definition and concept:
A tree is a finite set of one or more nodes such that:
• There is a specially designated node called the root.
• The remaining nodes are partitioned into n (n >= 0) disjoint sets
T1, T2, T3, …, Tn, where each Ti is a tree; T1, T2, …, Tn are called the sub trees of the root.
In a sample tree T, there is a set of 12 nodes.
A is the root node.
The remaining nodes are partitioned into 3 sets T1, T2 and T3.
By definition each sub tree is also a tree.
Observe that a tree is defined recursively.
The same tree can be expressed in a string notation as shown
below:
BINARY TREE:
A binary tree is a special form of a tree.
Binary trees are important and frequently used in
various applications of computer science.
A binary tree T can be defined as a finite set of
nodes such that:
• T is empty, or
• T contains a specially designated node called the root
of T, and
• the remaining nodes of T form two disjoint binary trees T1 and T2, which are
called the left sub tree and the right sub tree.
Difference between Tree and Binary Tree:
TREE                       BINARY TREE
Never empty.               May be empty.
Any number of children.    At most two children.
A binary tree can take two special forms:
1. FULL BINARY TREE.
2. COMPLETE BINARY TREE.
FULL BINARY TREE:
It contains the maximum possible number of nodes at all levels.
COMPLETE BINARY TREE:
All its levels, except possibly the last
level, have the maximum number of possible nodes,
and all the nodes in the last level appear as far left
as possible.
REPRESENTATION OF BINARY TREE:
A representation must capture the hierarchical relationship
between a parent node and its child nodes.
There are two common methods used for
representing this conceptual structure:
1. Linear (sequential) representation: it uses an
array, so we do not require the overhead of maintaining
pointers (links).
2. Linked representation: it uses pointers. The main
objective is that one should have direct access to the
root node of the tree, and for any given node, one
should have direct access to its children.
LINEAR REPRESENTATION OF A BINARY
TREE:
• This type of representation is static in the sense that
memory for an array is allocated before storing the
actual tree; once the memory is allocated, the size of
the tree is restricted to what the memory permits.
• The nodes are stored level by level, starting from
level 0.
• The root node is stored in the first memory location,
that is, at location 1.
• For any node with index i, 1 < i <= n
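The usual index arithmetic for a level by level array layout can be sketched as below. The rules used here (parent at i div 2, left child at 2i, right child at 2i + 1) are the conventional ones for a root stored at location 1 and are stated as an assumption, since the slide's own rules are not reproduced here. Function names are illustrative.

```python
def parent(i):
    """Index of the parent of node i, or None for the root."""
    return i // 2 if i > 1 else None


def left_child(i, n):
    """Index of the left child of node i in an array of size n, if any."""
    child = 2 * i
    return child if child <= n else None


def right_child(i, n):
    """Index of the right child of node i in an array of size n, if any."""
    child = 2 * i + 1
    return child if child <= n else None


tree = [None, "A", "B", "C", "D", "E"]  # slot 0 unused, root at index 1
n = len(tree) - 1
print(tree[parent(5)])                                  # B
print(tree[left_child(2, n)], tree[right_child(2, n)])  # D E
```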
How can the size of the array be estimated?
The value can be obtained easily if the binary tree is a full
binary tree.
A binary tree of height h can have at most 2^h – 1 nodes.
So the size of the array needed to fit such a binary tree is 2^h – 1.
For a binary tree with n nodes:
Size max = 2^n – 1
Size min = 2^⌈log2(n+1)⌉ – 1
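A short sketch of the two bounds for a binary tree with n nodes (n >= 1). The function names are illustrative. The ceiling of log2(n + 1) is computed with integer arithmetic, using the fact that it equals `n.bit_length()` for n >= 1, which avoids floating point rounding.

```python
def max_array_size(n):
    """Worst case: every node has only one child, so the height is n."""
    return 2 ** n - 1


def min_array_size(n):
    """Best case: the tree is as compact as possible."""
    height = n.bit_length()  # equals ceil(log2(n + 1)) for n >= 1
    return 2 ** height - 1


print(max_array_size(5), min_array_size(5))  # 31 7
```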
ADVANTAGES:
• Any node can be accessed from any other node by calculating its index,
which is efficient from the execution point of view.
• Only data are stored, without any pointers to successors or
ancestors; these are implied by the index.
• In programming languages where dynamic memory allocation is not
possible (such as BASIC, FORTRAN), the array representation is the only
means to store a tree.
DISADVANTAGES:
• The majority of array entries may be empty.
• It allows only a static representation; the array size is limited.
• Inserting a new node and deleting a node are inefficient with this
representation, because they require data movement up and down the array, which
demands an excessive amount of processing time.
LINKED REPRESENTATION OF BINARY TREE:
• It is simple and easy to implement.
• It has a number of overheads.
Structure diagram :
• Here LC and RC store the memory addresses of the left child and the right child.
• DATA contains the information content of the node.
• With this representation, if one knows the address of the
root node, then any other node can be
accessed from it.
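A minimal sketch of the linked representation with the three fields LC, DATA and RC, together with two recursive traversals that reach every node from the root. The class and function names and the sample expression tree are made up for this illustration.

```python
class TreeNode:
    def __init__(self, data, lc=None, rc=None):
        self.lc = lc      # link to the left child
        self.data = data  # information content of the node
        self.rc = rc      # link to the right child


def inorder(node):
    if node is None:
        return []
    return inorder(node.lc) + [node.data] + inorder(node.rc)


def preorder(node):
    if node is None:
        return []
    return [node.data] + preorder(node.lc) + preorder(node.rc)


# Tree for the expression A + B * C
root = TreeNode("+", TreeNode("A"), TreeNode("*", TreeNode("B"), TreeNode("C")))
print(inorder(root))   # ['A', '+', 'B', '*', 'C']
print(preorder(root))  # ['+', 'A', '*', 'B', 'C']
```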
ADVANTAGES:
• It allows dynamic memory allocation.
• The size of the tree can be changed as and when the
need arises, without any limitation except the memory available.
OPERATIONS ON A BINARY TREE:
Operations are as follows:
Insertion:-
To include a node into an existing binary tree (which may be empty).
Deletion:-
To delete a node from a non empty binary tree.
Traversal:-
To visit all the nodes in a binary tree.
Merge:-
To merge two binary trees into a larger one.
INSERTION:
INSERT BINARY TREE IN SEQUENTIAL(LINEAR):
SEARCH-SEQUENTIAL(LINEAR)
INSERT BINARY TREE IN LINK:
SEARCH-LINK:
TYPES OF BINARY TREE:
1.Expression tree
2.Binary search tree
3.Heap tree
4.Threaded binary tree
5.Huffman tree
6.Height balanced tree
7.Red black tree
8.Splay tree
9.Decision tree.
Expression tree:
BINARY SEARCH TREE:
OPERATIONS:
• Search
• Insert
• Delete
• Traversal
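A minimal sketch of search, insert and inorder traversal on a binary search tree. Assumptions: keys smaller than a node go to its left sub tree, larger keys go to its right sub tree, and duplicate keys are ignored. Deletion is left out to keep the sketch short; all names are illustrative.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.lc = None
        self.rc = None


def bst_insert(root, key):
    """Insert key and return the (possibly new) root."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.lc = bst_insert(root.lc, key)
    elif key > root.key:
        root.rc = bst_insert(root.rc, key)
    return root


def bst_search(root, key):
    while root is not None and root.key != key:
        root = root.lc if key < root.key else root.rc
    return root is not None


def bst_inorder(root):
    if root is None:
        return []
    return bst_inorder(root.lc) + [root.key] + bst_inorder(root.rc)


root = None
for key in (50, 30, 70, 20, 40):
    root = bst_insert(root, key)
print(bst_inorder(root))     # [20, 30, 40, 50, 70]
print(bst_search(root, 40))  # True
```

The inorder traversal of a binary search tree visits the keys in ascending order.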
HEAP TREE:
Suppose H is a complete binary tree. It is termed a heap tree if it satisfies the following properties:
• For each node N in H, the value at N is greater than or equal
to the value of each of the children of N.
• In other words, N has a value which is greater than or equal to the value of
every successor of N.
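A minimal sketch of the heap property on an array that stores a complete binary tree level by level. Assumptions: the root is at index 1, slot 0 is unused, and the parent of node i is at index i div 2. The function names and the sample values are illustrative.

```python
def is_max_heap(h):
    """Check that every node is >= each of its children."""
    n = len(h) - 1
    return all(h[i // 2] >= h[i] for i in range(2, n + 1))


def heap_insert(h, value):
    """Append value as the last leaf and move it up while it beats its parent."""
    h.append(value)
    i = len(h) - 1
    while i > 1 and h[i // 2] < h[i]:
        h[i // 2], h[i] = h[i], h[i // 2]
        i //= 2


heap = [None, 90, 70, 80, 30, 60]
print(is_max_heap(heap))  # True
heap_insert(heap, 95)
print(heap[1:])           # [95, 70, 90, 30, 60, 80]
```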
Representation of heap tree:
operations:
• Insert
• Delete
• Merge
• Sort
RED BLACK TREE:
A red black tree is a balanced binary search tree.
OPERATIONS:
• INSERT
• DELETE
• SEARCH
GRAPHS:
A graph is a non linear data structure.
In a graph a node may have many parents as well as many children.
Examples:
AIRLINES:
Cities are connected through airlines. This can be
represented through a graph structure, which is
shown in fig.; here airports are represented by solid dots
and airlines by lines.
SOURCE DESTINATION NETWORK:
The fig. represents the network connections of
three commodities, electricity, gas and water, among
three distant destinations D1, D2 and D3.
KÖNIGSBERG'S BRIDGES:
In Eastern Prussia, there is a city named
Königsberg. This city is surrounded by four lands,
A, B, C and D, which are divided by the river Pregel.
Seven bridges connect the four lands. This
geographical description can be easily represented
through a graph structure, which is shown in fig. (c).
FLOW CHART OF A PROGRAM:
The flow chart of a program is in fact the graphical
representation of an algorithm for a problem, shown in
fig. (d).
GRAPH TERMINOLOGIES:
GRAPH:
A graph G consists of two sets:
A set V, called the set of all vertices (nodes).
A set E, called the set of all edges (arcs). This set E is a
set of pairs of elements from V.
For example, for the graph g1 in the figure:
V = {v1, v2, v3, v4}
E = {(v1,v2), (v1,v3), (v1,v4), (v2,v3), (v3,v4)}
Digraph:
Weighted graph:
The edges in it are labeled with some weights, for
example g3 and g4.
Adjacent vertices: fig., g3 and g4
Self loop: fig., g5
Parallel edges: fig., g5
Simple graph: fig., g5 and g10
Complete graph: fig., g6 and g9
Acyclic graph: fig., g4 and g7
Isolated vertex: fig., g8
Degree of a vertex: fig., g6
Pendant vertex: fig., g7
Connected graph: fig., g1, g3 and g6
REPRESENTATION OF GRAPH:
1.Set representation
2.Linked representation
3.Sequential representation(matrix)
Set Representation:
This is one of the most straightforward methods of representing
a graph. With this method two sets are maintained:
1. V, the set of vertices.
2. E, the set of edges, which is a subset of V x V.
If the graph is weighted, the set E is an ordered
collection of three-tuples, that is, E is a subset of W x V x V,
where W is the set of weights.
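A minimal sketch of the set representation, using the vertices and edges listed for graph g1, and of how an adjacency matrix can be derived from it. The graph is assumed to be undirected; the function name and the weights in the last lines are invented for illustration.

```python
V = ["v1", "v2", "v3", "v4"]
E = {("v1", "v2"), ("v1", "v3"), ("v1", "v4"), ("v2", "v3"), ("v3", "v4")}


def adjacency_matrix(vertices, edges):
    index = {v: k for k, v in enumerate(vertices)}
    matrix = [[0] * len(vertices) for _ in vertices]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # undirected: mirror the entry
    return matrix


for row in adjacency_matrix(V, E):
    print(row)
# [0, 1, 1, 1]
# [1, 0, 1, 0]
# [1, 1, 0, 1]
# [1, 0, 1, 0]

# Weighted graph: each edge is a three-tuple (weight, vertex, vertex).
# The weights below are hypothetical.
weighted_E = {(5, "v1", "v2"), (3, "v2", "v3")}
```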
Linked Representation:
Matrix Representation:
OPERATIONS ON GRAPHS:
Insertion:
• To insert a vertex and hence establish connectivity
with other vertices in the existing graph.
• To insert an edge between two vertices in the graph.
Deletion:
• To delete a vertex from the graph.
• To delete an edge from the graph.
Merging:
To merge two graphs G1 and G2 into a single graph.
Traversal:
To visit all the vertices in the graph.
Operations on the linked list representation of a graph:
Insertion:
Deletion:
Graph traversal: to visit all the vertices in the graph exactly
once.
Methods:
DFS - Depth First Search
BFS - Breadth First Search
DFS:
This traversal is similar to the inorder traversal of a binary tree: starting from a given node, it visits all the nodes up to the deepest level.
Consider the two graphs G1 and G2.
Starting from vertex v1, the paths of traversal are indicated by thick lines; the sequences in which the vertices are visited are:
DFS(G1) = v1-v2-v5-v7-v4-v8-v6-v3
DFS(G2) = v1-v2-v5-v7-v4-v8-v3-v6
BFS:
This traversal is very similar to the level by level
traversal of a tree.
BFS(G1) = v1-v2-v8-v3-v5-v4-v6-v7
BFS(G2) = v1-v2-v3-v5-v4-v6-v7-v8
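A minimal sketch of both traversals on a small hypothetical graph stored as adjacency lists. It is not G1 or G2 from the figures, whose edges are not listed in the text. DFS uses a stack and BFS uses a queue, which ties the traversals back to the two earlier structures.

```python
from collections import deque


def dfs(graph, start):
    visited = []
    stack = [start]
    while stack:
        v = stack.pop()
        if v not in visited:
            visited.append(v)
            # Push neighbours in reverse so the first listed is visited first.
            stack.extend(reversed(graph[v]))
    return visited


def bfs(graph, start):
    visited = [start]
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in visited:
                visited.append(w)
                queue.append(w)
    return visited


# Hypothetical undirected graph: a square v1-v2-v4-v3-v1
graph = {
    "v1": ["v2", "v3"],
    "v2": ["v1", "v4"],
    "v3": ["v1", "v4"],
    "v4": ["v2", "v3"],
}
print(dfs(graph, "v1"))  # ['v1', 'v2', 'v4', 'v3']
print(bfs(graph, "v1"))  # ['v1', 'v2', 'v3', 'v4']
```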