Data Structure: Syllabus and Basic Concepts 2017 Instructor: Prof. Young-guk Ha Dept. of Computer Science & Engineering



Contents

• Course syllabus

• Introduction to basic concepts for studying data structure


Course Syllabus

Course Introduction

• Course title: “Data Structure”

• Objective

– To study how to organize and process data to implement scalable and effective computer software

• E.g., searching 10 numbers vs. 1,000,000,000,000 numbers

– I.e., to improve programming skill

• Schedule

– Class A: Tue. 13:30~15:00 / Wed. 13:30~15:00 (#502)

– Class B: Tue. 15:30~17:00 / Wed. 15:30~17:00 (#502)

– Class C (SCSC): Thu. 17:30~19:00 / Fri. 17:30~19:00 (#402)


Course Introduction (cont’d)

• Prerequisite

– Knowledge of Java programming

• Lectures will be given with PowerPoint presentations

– Presentations (in PDF) can be downloaded from the course homepage before the corresponding class

– Practice after each major topic is covered

• Language

– Lectures are given in Korean so that students can follow them clearly

– Syllabus and class materials will be given in English


Topics and Schedule

Weeks Major Topics

Week 1 Syllabus and Basic concepts / Analysis of Algorithms

Week 2 Recursion / Stacks and Queues

Week 3 Lists and Iterators

Week 4 Trees

Week 5 Priority queues and Heaps

Week 6 Maps and Dictionaries

Week 7 Search trees 1 (Binary Search Trees and AVL Trees)

Week 8 Midterm exam

Week 9 Search trees 2 ([2, 4] Trees and Red-Black Trees)

Week 10 Sorting (Merge sort, Quick sort, Radix sort)

Week 11 Sets and Selection

Week 12 Text processing

Week 13 Heap structure

Week 14 Graphs and Graph traversals

Week 15 Directed graphs

Week 16 Final exam

Grading Policy

• Exams: 70%

– Midterm exam: 35%

– Final exam: 35%

• Assignments: 20%

– No submission, no grade

• Class attendance: 10%

– 2 late arrivals = 1 absence

– 5 absences = no grade


Course Homepage

• How to access

– http://sclab.konkuk.ac.kr → Classes → Data Structure

• Downloading class material

– You can download syllabus and lecture notes in PDF format

• Class announcement

– Assignments

– Exam schedule and result

– And so on

Contact Information

• Professor Young-guk Ha

– Office

• Room 903, New Millennium Bldg.

– Phone (office)

• 02-450-3273

– Email

[email protected]

– Office hours

• One hour after the class

• Otherwise, make an appointment before visiting my office


Teaching Assistants

• TA for Class A

– Name: Myung-jae Lee

– Office: New Engineering Bldg. #1216 (SCLab)

– Email: [email protected]

• TA for Class B

– Name: Soo-young Choi

– Office: New Engineering Bldg. #1216 (SCLab)

– Email: [email protected]

• TA for Class C

– Name: Cheol-jin Kim

– Office: New Engineering Bldg. #1216 (SCLab)

– Email: [email protected]


Introduction to Basic Concepts for Data Structure

Basic Concepts

• Data

• Data Structure

• Data type

• Algorithm

• Performance analysis


Data

• Definition: data is a set of values of qualitative or quantitative variables

– Integer data: ..., -3, -2, -1, 0, 1, 2, 3, ...

– Floating-point data: 0.1, 0.2, 0.3, 0.4, 0.5, ...

– Character data: ‘a’, ‘b’, ‘c’, ‘d’, ‘e’, ‘f’, ‘g’, ‘h’, ...

– String data: “string”, “data”, “value”, …

– Boolean data: true, false

– …


Data Type

• Definition: a data type is a particular kind of data, which consists of data and a set of operations that act on those data values

– Fundamental data types

• char, int, float, double, …

– Derived data types

• array, pointer, reference, …

– User-defined data types

• struct, union, class, …

• Example: an int data type

– Data values: { INT_MIN, … , -2, -1, 0, 1, 2, … , INT_MAX }

– Operations: + - * / > < == …
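The int example above can be tried directly in Java. This is a minimal sketch, assuming that Java's `Integer.MIN_VALUE` and `Integer.MAX_VALUE` play the role of the C-style names INT_MIN and INT_MAX used on the slide; the class name `IntTypeDemo` and its helper methods are hypothetical.

```java
class IntTypeDemo {
    // Smallest and largest values of Java's 32-bit int type.
    static int minValue() { return Integer.MIN_VALUE; }
    static int maxValue() { return Integer.MAX_VALUE; }

    public static void main(String[] args) {
        int a = 7, b = 2;
        System.out.println(a + b);  // arithmetic operation on two int values
        System.out.println(a / b);  // integer division discards the fraction
        System.out.println(a > b);  // comparison operation yields a boolean
        System.out.println(minValue() + " .. " + maxValue());
    }
}
```

The point is that the type fixes both the set of values and what each operation means: `/` on two int values is integer division, not real division.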


Data Structure

• Definition: a data structure is a particular way of organizing data in a computer so that it can be used efficiently

– Stack, Queue, Deque

– Array List, Node List, Sequence

– Map, Dictionary

– Priority Queue

– Tree, Binary Tree, Heap, Search Tree

– Graph

– …
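To make the idea of "organizing data" concrete, the sketch below stores the same three values in a stack and in a queue and shows that they come back out in different orders. It assumes the standard library class `java.util.ArrayDeque` as the underlying container (the course may build its own implementations instead); the class name `StackQueueDemo` is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class StackQueueDemo {
    // Pushes 1, 2, 3 onto a stack and returns the pop order (last in, first out).
    static int[] stackOrder() {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int v = 1; v <= 3; v++) stack.push(v);
        int[] out = new int[3];
        for (int i = 0; i < 3; i++) out[i] = stack.pop();
        return out;
    }

    // Enqueues 1, 2, 3 and returns the dequeue order (first in, first out).
    static int[] queueOrder() {
        Deque<Integer> queue = new ArrayDeque<>();
        for (int v = 1; v <= 3; v++) queue.addLast(v);
        int[] out = new int[3];
        for (int i = 0; i < 3; i++) out[i] = queue.removeFirst();
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(stackOrder())); // [3, 2, 1]
        System.out.println(Arrays.toString(queueOrder())); // [1, 2, 3]
    }
}
```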


Algorithm

• Definition: an algorithm is a set of instructions (operations) that accomplishes a particular task (using data structure), and all algorithms must satisfy the following five criteria

– Input: there are zero or more inputs that are externally supplied

– Output: at least one output is produced

– Definiteness: each instruction is clear and unambiguous

– Finiteness: algorithm terminates after finite steps for all cases

– Effectiveness: every instruction must be basic and feasible

• Program vs. algorithm

– A program does not have to satisfy the finiteness condition (however, we use these two terms interchangeably)

– What about operating systems: algorithm or program?
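As an illustration of the five criteria, here is a short sketch of Euclid's greatest-common-divisor algorithm. Euclid's algorithm is not part of the slides; it is chosen here only because each criterion is easy to point at. The class name `GcdDemo` is hypothetical.

```java
class GcdDemo {
    // Input: two non-negative integers a and b, not both zero.
    // Output: their greatest common divisor.
    static int gcd(int a, int b) {
        while (b != 0) {      // Finiteness: b strictly decreases and never goes below 0
            int r = a % b;    // Definiteness and effectiveness: one basic, unambiguous step
            a = b;
            b = r;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(gcd(48, 18)); // 6
        System.out.println(gcd(17, 5));  // 1
    }
}
```

A loop such as `while (true) { ... }` that waits for requests forever would fail the finiteness criterion, which is the distinction the operating system question above is probing.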


Data Structure and Algorithm

• Problem solving with a computer

1) Organizing the given problem as a good data structure

2) Solving the given problem by applying a good algorithm to the data structure

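E.g.) Finding the shortest path: Graph Data Structure + Dijkstra Algorithm

[Figure: a graph data structure (node labels include A and J) to which the Dijkstra algorithm is applied]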
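The two steps can be sketched in code: step 1 organizes the problem as a graph (here an adjacency list), and step 2 applies Dijkstra's algorithm to it. The graph on the original slide is not recoverable from the transcript, so the 4-vertex graph below, its edge weights, and all names (`DijkstraSketch`, `shortestDistances`, `sampleGraph`, `addEdge`) are hypothetical. The sketch assumes non-negative edge weights.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class DijkstraSketch {
    // adj.get(u) holds {v, weight} pairs, one per edge u -> v.
    // Returns the shortest distance from source to every vertex.
    static int[] shortestDistances(List<List<int[]>> adj, int source) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, Integer.MAX_VALUE);   // MAX_VALUE marks "not reached yet"
        dist[source] = 0;
        // Queue entries are {vertex, distance}, ordered by distance.
        PriorityQueue<int[]> pq =
            new PriorityQueue<>((x, y) -> Integer.compare(x[1], y[1]));
        pq.add(new int[] {source, 0});
        while (!pq.isEmpty()) {
            int[] cur = pq.poll();
            int u = cur[0];
            if (cur[1] > dist[u]) continue;     // stale entry: a shorter path is already known
            for (int[] edge : adj.get(u)) {
                int v = edge[0];
                int candidate = dist[u] + edge[1];
                if (candidate < dist[v]) {      // found a shorter path to v through u
                    dist[v] = candidate;
                    pq.add(new int[] {v, candidate});
                }
            }
        }
        return dist;
    }

    // Hypothetical undirected graph, used only for illustration.
    static List<List<int[]>> sampleGraph() {
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < 4; i++) adj.add(new ArrayList<>());
        addEdge(adj, 0, 1, 4);
        addEdge(adj, 0, 2, 1);
        addEdge(adj, 2, 1, 2);
        addEdge(adj, 1, 3, 5);
        return adj;
    }

    static void addEdge(List<List<int[]>> adj, int u, int v, int w) {
        adj.get(u).add(new int[] {v, w});
        adj.get(v).add(new int[] {u, w});
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(shortestDistances(sampleGraph(), 0))); // [0, 3, 1, 8]
    }
}
```

In the sample graph the direct edge 0-1 costs 4, but the path 0-2-1 costs 1 + 2 = 3, so the algorithm reports 3 for vertex 1.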

Description of Algorithm

• We can describe an algorithm in many ways

– Natural languages: Korean, English, …

– Graphic representations: Flowcharts, UML, VPL, …

– Programming languages: Java, C, C++, JavaScript, Python, …

– Pseudo-code: a combination of natural and programming languages

• In this course, we will mainly use Java as well as a kind of pseudo-code introduced in the textbook

– Note that the basic syntax of the Java language will not be covered in this course


Performance Analysis

• How do we evaluate whether a data structure or an algorithm is good or bad?

– We need to know how much time and space is required to complete the algorithm with the data structure

• Performance analysis: to obtain machine-independent estimates of required time and space (theoretical calculation)

– Definition: the time complexity of an algorithm is the total amount of computer time that it needs to run to completion

– Definition: the space complexity of an algorithm is the total amount of memory space that it needs to run to completion

Cf.) Performance measurement

• To obtain exact running time and required memory space using specific computer systems (machine-dependent)


Analyzing Space Complexity

• Components for analyzing space complexity

– Fixed space requirements

• Not dependent on the number and size of the inputs

– Instruction space (needed to store the code)

– Space for simple variables, arrays, fixed size structured variables, constants

– Variable space requirements

• Consisting of space for structured variables whose size depends on the particular instance of the problem being solved (inputs)

– Dynamically allocated space for storing and handling input data with random size

– Additional stack space when a function uses recursion
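The last item, stack space used by recursion, can be observed with a small sketch. The recursive sum below is not from the slides; the names `RecursiveSumDemo`, `rsum`, `depthFor`, and the extra `depth` parameter are hypothetical and exist only to record how many calls are pending at once.

```java
class RecursiveSumDemo {
    static int maxDepth = 0;

    // Recursive sum of list[0..n-1]. Each pending call occupies one stack frame,
    // so the variable space grows with n even though no array is allocated.
    static float rsum(float[] list, int n, int depth) {
        maxDepth = Math.max(maxDepth, depth);
        if (n == 0) return 0;
        return rsum(list, n - 1, depth + 1) + list[n - 1];
    }

    // Returns the deepest recursion level reached for an input of size n.
    static int depthFor(int n) {
        maxDepth = 0;
        rsum(new float[n], n, 1);
        return maxDepth;
    }

    public static void main(String[] args) {
        System.out.println(rsum(new float[] {1f, 2f, 3f, 4f}, 4, 1)); // 10.0
        System.out.println(depthFor(4));                              // 5 frames: n = 4, 3, 2, 1, 0
    }
}
```

An iterative loop computing the same sum needs only a fixed number of simple variables, so its variable space requirement would not grow with n.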


Total Space Requirements

• The total space requirement S of a program P is defined as the sum of the two components as follows

S(P) = c + S_P(I)

– where c is a constant representing the fixed space requirements

– and S_P(I) is the variable space requirements of P for the problem instance (input) I

• When analyzing the space complexity, we are usually concerned with only the variable space requirements S_P(I)

• Usually S_P(I) is given as a function of some variable characteristics of the problem instance

Note) We normally use S_P(n) where n is the number of inputs


Space Complexity Example 1

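public float abc(float a, float b, float c)
{
    // 4.00f: a bare 4.00 is a double literal, which would make the
    // whole expression a double and fail to compile in a float method
    return a + b + b * c + (a + b + c) / (a + b) + 4.00f;
}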

• We have a function abc that accepts three simple variables as inputs and returns a simple value as an output

• This function has only fixed space requirements

• Therefore, S_abc(I) = S_abc(n) = 0


Space Complexity Example 2

public int[] reverse(int[] list, int n)
{
    int[] rlist = new int[n];   // temporary array of size n
    for (int i = 0; i < n; i++)
        rlist[n-i-1] = list[i]; // copy each element to its mirrored position
    return rlist;
}

• We have a function reverse that reverses a list of int numbers

• The input includes an array of variable size n

• Therefore, the variable space requirements depend on the size of the input array

– The temporary array (rlist) of size n is required to copy the entire input array in reverse order, so S_reverse(I) = S_reverse(n) = n
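For contrast, here is a sketch of a variant that reverses the array in place by swapping elements from both ends. It is not from the slides, and the names `ReverseInPlaceDemo` and `reverseInPlace` are hypothetical. It uses only a fixed number of simple variables, so under the convention above its variable space requirement would be 0 instead of n.

```java
class ReverseInPlaceDemo {
    // Reverses list[0..n-1] by swapping symmetric elements.
    // Only the simple variables i, j and tmp are used; no extra array is allocated.
    static void reverseInPlace(int[] list, int n) {
        for (int i = 0, j = n - 1; i < j; i++, j--) {
            int tmp = list[i];
            list[i] = list[j];
            list[j] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        reverseInPlace(data, data.length);
        System.out.println(java.util.Arrays.toString(data)); // [5, 4, 3, 2, 1]
    }
}
```

The trade-off is that this version overwrites its input, whereas reverse leaves the original list untouched and returns a new array.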


Analyzing Time Complexity

• The time T taken by a program P is the sum of compile time and running time of P

T(P) = T_c + T_P

– where T_c is the compile time of P

– and T_P is the running time (machine instruction count) of P

• Usually we may run a program many times without recompilation, so we are really concerned only with the program’s running time T_P

• Determining T_P in detail (exact instruction count) is not an easy task

– It is rarely worth the effort because we do not know how each compiler translates the source program into machine code

• Alternatively, we can count the primitive operations, which are the basic and most important computations composing an algorithm


Time Complexity Example

private int count = 0;

public float sum(float[] list, int n)
{
    float tempsum = 0; count++;          // the 1st assignment: count += 1
    int i;
    for (i = 0; i < n; i++) { count++;   // in the for-loop: count += n
        tempsum += list[i]; count++;     // each assignment: count += n
    } count++;                           // for-loop termination: count += 1
    return tempsum;
}

• This example illustrates how to count primitive operations (assignments “=”) for the sum function

• Note that we need to be concerned only with the primitive operations, which eliminates other operations (e.g., parameter passing, variable declaration, int addition, array indexing, and so on) from counting

• So, we simply increment the count variable by one for each assignment

• The final value of the count variable is 2n + 2; thus, each invocation of sum executes a total of 2n + 2 primitive operations and T(sum) = 2n + 2
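The 2n + 2 figure can be checked by running the instrumented function for a few input sizes. The sketch below wraps the sum function from the slide in a class; the class name `SumCountDemo` and the helper `countFor` are hypothetical additions.

```java
class SumCountDemo {
    private int count = 0;

    // Same instrumented function as on the slide.
    public float sum(float[] list, int n) {
        float tempsum = 0; count++;          // 1 assignment
        int i;
        for (i = 0; i < n; i++) { count++;   // n loop-control assignments
            tempsum += list[i]; count++;     // n accumulations
        } count++;                           // 1 for loop termination
        return tempsum;
    }

    // Runs sum on a fresh counter with an input of size n and returns the count.
    static int countFor(int n) {
        SumCountDemo demo = new SumCountDemo();
        demo.sum(new float[n], n);
        return demo.count;
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, 1, 10, 100}) {
            System.out.println("n=" + n + " count=" + countFor(n) + " 2n+2=" + (2 * n + 2));
        }
    }
}
```

For every n tried, the measured count and the formula agree, and the count does not depend on the values stored in the array, only on its size.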


Types of Complexity Analysis

• Normally, the time (or space) complexity is not dependent solely on a specified characteristic

– That is, the time (or space) complexity of the same algorithm could vary with the composition of given inputs even though each input has the same size

E.g.) Time complexity of a linear search algorithm

• To return the position of search_num in the given list of size n by probing each element in the list one by one from the beginning

public int linearSearch(int[] list, int n, int search_num)

• Running time of linearSearch varies with

– not only the size n of the input list

– but also the position i of search_num in the input list

[Figure: array positions 0, 1, 2, …, i, …, n-1, with search_num at position i]
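The slide gives only the signature of linearSearch. A minimal sketch of a body consistent with the description above follows; returning -1 when search_num is absent is an assumption, as is the class name `LinearSearchDemo`.

```java
class LinearSearchDemo {
    // Probes list[0], list[1], ... in order and returns the first position
    // holding search_num, or -1 (an assumed convention) if it is not present.
    static int linearSearch(int[] list, int n, int search_num) {
        for (int i = 0; i < n; i++) {
            if (list[i] == search_num) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] list = {7, 3, 9, 4};
        System.out.println(linearSearch(list, list.length, 7)); // 0: found on the first probe
        System.out.println(linearSearch(list, list.length, 4)); // 3: found on the last probe
        System.out.println(linearSearch(list, list.length, 5)); // -1: every element was probed
    }
}
```

All three calls use the same list of size 4, yet they probe 1, 4, and 4 elements respectively, which is why the input size alone does not determine the running time.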

Types of Complexity Analysis (cont’d)

• To avoid the difficulty of pinning down a single complexity value, we define three kinds of complexity analysis

– Best case analysis

• Analyzes the minimum time (or space) needed

• Provides little information

– Average case analysis

• Analyzes the average time (or space) needed

• Hard to determine

– Worst case analysis

• Analyzes the maximum time (or space) needed

• Most important and frequently used
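The three kinds of analysis can be compared numerically for linear search by counting element comparisons. The sketch below is an illustration under stated assumptions: the list elements are distinct, only successful searches are considered, and every position is treated as equally likely when averaging. The names `CaseAnalysisDemo`, `comparisons`, and `analyze` are hypothetical.

```java
class CaseAnalysisDemo {
    // Number of element comparisons a linear search makes before finding target
    // (or before running off the end of the list).
    static int comparisons(int[] list, int target) {
        int count = 0;
        for (int value : list) {
            count++;
            if (value == target) break;
        }
        return count;
    }

    // Returns {best, worst, average} comparison counts, taken over one
    // successful search for each element of the list.
    static double[] analyze(int[] list) {
        int best = Integer.MAX_VALUE;
        int worst = 0;
        long total = 0;
        for (int target : list) {
            int c = comparisons(list, target);
            best = Math.min(best, c);
            worst = Math.max(worst, c);
            total += c;
        }
        return new double[] {best, worst, (double) total / list.length};
    }

    public static void main(String[] args) {
        double[] r = analyze(new int[] {5, 8, 2, 9, 1});
        System.out.println("best=" + r[0] + " worst=" + r[1] + " average=" + r[2]);
        // best=1.0 worst=5.0 average=3.0
    }
}
```

For n = 5 the best case is 1 comparison, the worst case is n = 5, and the average is (1 + 2 + 3 + 4 + 5) / 5 = 3, that is (n + 1) / 2. The average depends on the assumed distribution of inputs, which is one reason it is harder to determine than the worst case.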
