Computer Science: An Overview - Allen C.-H. Wu, Computer Science Department, Tsing Hua University

Upload: gabriel-montgomery

Post on 26-Dec-2015

1

Computer Science: An Overview

Allen C.-H. Wu
Computer Science Department
Tsing Hua University

2

Preface

Beginning computer science students need exposure to the breadth of the subject in which they are planning to major.

They need a foundation from which they can understand the relevance and interrelationships of future courses.

3

Introduction

Computer science is the discipline that seeks to build a scientific foundation for a variety of topics.

Computer science provides the underpinnings for today’s computer applications as well as the foundations for tomorrow’s applications.

4

The Study of Algorithms

An algorithm is a set of steps that defines how a task is performed.

In the domain of computing machinery, algorithms are represented as programs within computers.

Algorithms + Data Structure -> Programs, Programs -> Software <=> Hardware.

5

The Study of Algorithms

The study of algorithms began as a subject in mathematics.

The major goal was to find a single set of directions that describes how any problem of a particular type can be solved.

E.g., the long division algorithm and the Euclidean algorithm.
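As an illustration of one set of directions that solves every problem of a given type, the Euclidean algorithm can be sketched in Python. The function name `gcd_euclid` is chosen here for illustration and does not come from the slides.

```python
def gcd_euclid(a: int, b: int) -> int:
    """Greatest common divisor of two non-negative integers."""
    while b != 0:
        # Replace (a, b) by (b, a mod b); the gcd is unchanged by this step.
        a, b = b, a % b
    return a


print(gcd_euclid(48, 18))
```

The same few lines work for any pair of non-negative integers, which is the sense in which the algorithm covers a whole problem type.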

6

The Study of Algorithms

Machine Architecture: Data storage (Ch. 1). Data manipulation (Ch. 2).

Software: Operating systems and networks (Ch. 3). Algorithms (Ch. 4). Programming languages (Ch. 5). Software engineering (Ch. 6).

Data Organization: Data structures (Ch. 7). File structures (Ch. 8). Database structures (Ch. 9).

AI and Theory of Computation.

7

The Development of Algorithmic Machines

Abacus. Babbage’s difference engine. Jacquard’s loom. Herman Hollerith (holes in paper cards). Mark I at Harvard University. ENIAC at the University of Pennsylvania.

8

The Evolution of Computer Science

[Figure: algorithms at the center, surrounded by the questions studied about them: limitations of, execution of, analysis of, communication of, discovery of, and representation of algorithms.]

9

The Evolution of Computer Science

[Figure: Algorithms, Hardware, Software, Languages, Applications.]

10

Abstraction and Other Issues

Abstraction - the distinction between the external properties of a component and the internal details of the component’s construction.

Ethical issues. Social issues. Legal issues.

11

Part I: Machine Architecture

A major process in the development of a science is the construction of theories that are confirmed or rejected by experimentation.

In some cases these theories lie dormant for extended periods, waiting for technology to develop to the point that they can be tested.

12

Ch. 1 Data Storage

Storage of bits. Main memory. Mass storage. Coding information for storage. The binary system. Storing integers. Storing Fractions. Communication errors.

13

Storage of bits

Boolean operations, e.g., AND, NOT, and OR.

Gates are devices that produce the output of a Boolean operation when given the operation’s input values.

A flip-flop is a circuit that has one of two output values (i.e., 0 or 1); the output flips or flops between the two values under the control of external stimuli.

14

Storage of Bits

A flip-flop is ideal for the storage of a bit within a computer (on a single wafer or chip). A flip-flop loses data when its power is turned off.

Cores, donut-shaped rings of magnetic material, are obsolete today because of their size and power requirements.

A magnetic or laser storage device is commonly used when longevity is important.

Hexadecimal notation.

15

Main Memory

Cells - a typical cell size is 8 bits, called a byte. An address is used to identify an individual cell in main memory. Random access memory (RAM). Read-only memory (ROM). Most significant bit (MSB) and least significant bit (LSB).

16

Mass Storage

Secondary memory. Storing large units of data (called files). Mass storage systems are slow because they require mechanical motion. On-line vs. off-line operations.

17

Mass Storage

Disk storage. Compact disks and CD-ROM. Tape storage. Physical Vs. logical records.

18

Coding Information for Storage

American Standard Code for Information Interchange (ASCII) - 7-bit codes, commonly stored as 8-bit patterns.

International Organization for Standardization (ISO) - 16-bit codes.

Binary-decimal number conversion. Bit map representations - Tagged Image File Format (TIFF), Graphics Interchange Format (GIF), and Joint Photographic Experts Group (JPEG).

19

The Binary System

Binary addition. Fractions in binary. Radix point (plays the same role as the decimal point in decimal notation).

20

Storing Integers

Excess notation. Two’s complement notation. Addition in two’s complement notation. Overflow problem. Double precision. Memory size vs. accuracy of number representation.
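The following sketch shows two's complement encoding, addition, and the overflow problem. The 4-bit width and the function names are assumptions made for this example only.

```python
def to_twos_complement(value: int, bits: int = 4) -> str:
    """Encode an integer as a two's complement bit pattern."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise OverflowError(f"{value} does not fit in {bits} bits")
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Decode a two's complement bit pattern."""
    raw = int(pattern, 2)
    # A leading 1 marks a negative value.
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw


def add_patterns(a: str, b: str) -> str:
    """Add two equal-length patterns, discarding any carry out of the top bit."""
    bits = len(a)
    return format((int(a, 2) + int(b, 2)) & ((1 << bits) - 1), f"0{bits}b")


print(to_twos_complement(-3))                            # pattern for -3
print(from_twos_complement(add_patterns("0011", "1101")))  # 3 + (-3)
print(from_twos_complement(add_patterns("0101", "0100")))  # 5 + 4 overflows 4 bits
```

In the last line the true sum, 9, is outside the 4-bit range of -8 to 7, so the decoded result is negative. This is the overflow problem named on the slide.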

21

Storing Fractions

Floating-point notation. Sign bit => Exponent => Mantissa. Round-off errors.
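To make the sign, exponent, and mantissa fields concrete, the sketch below splits a number into those fields. The slides do not name a particular format; the 32-bit IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 mantissa bits) is assumed here, and `float_fields` is a hypothetical helper name.

```python
import struct


def float_fields(x: float) -> tuple:
    """Return (sign, exponent, mantissa) of x stored as a 32-bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # stored in excess-127 notation
    mantissa = bits & 0x7FFFFF       # fraction bits after the implied leading 1
    return sign, exponent, mantissa


print(float_fields(0.15625))   # 0.15625 = 1.25 * 2**-3
print(float_fields(-2.0))

# Round-off error: 0.1 and 0.2 have no exact binary representation.
print(0.1 + 0.2 == 0.3)
```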

22

Communication Errors

How can you make sure the information you receive is correct?

Coding techniques for error detection and correction.

Parity bits. Error-correcting codes.
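A minimal sketch of error detection with a parity bit. Odd parity is assumed here (the slides do not say which convention is used), and the function names are illustrative.

```python
def add_odd_parity(bits: str) -> str:
    """Prefix a parity bit so the whole pattern has an odd number of 1s."""
    parity = "0" if bits.count("1") % 2 == 1 else "1"
    return parity + bits


def parity_ok(pattern: str) -> bool:
    """A received pattern passes the check if it has an odd number of 1s."""
    return pattern.count("1") % 2 == 1


sent = add_odd_parity("1000001")
corrupted = sent[:-1] + ("0" if sent[-1] == "1" else "1")  # flip one bit
print(sent, parity_ok(sent), parity_ok(corrupted))
```

A single flipped bit is detected, but two flipped bits would go unnoticed, and the parity bit alone cannot say which bit is wrong. That limitation is what error-correcting codes address.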

23

Ch. 2 Data Manipulation

The central processing unit. The stored-program concept. Program execution. Other architectures. Arithmetic/logic instructions. Computer-peripheral communication.

24

The Central Processing Unit

[Figure: the CPU (ALU, registers, and control unit) connected to main memory by a bus.]

25

The Central Processing Unit

General-purpose registers - temporary holding places for data being manipulated by the CPU.

Cache memory (memory hierarchy!). Bus - CPU/memory interface. Machine instructions - data transfer, arithmetic/logic, and control.

26

The Stored-Program Concept

In early computing, the program was built into the control unit as a part of the machine. The user rewired the control unit to run a different program.

Instructions as bit patterns - a program and data can be coded and stored in main memory. A computer’s program can be changed merely by changing the contents of the computer’s memory instead of rewiring the control unit.

27

The Stored-Program Concept

The main idea of the stored-program concept is that both the program and its data are stored in main memory, rather than the data being stored in memory and the program being part of the control unit.

A machine instruction consists of two fields: the op-code and the operand.

28

The Stored-Program Concept

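[Figure: the CPU (ALU, registers, and control unit, with program counter and instruction register) connected by a bus to main memory, whose cells have addresses 00 through FF. An instruction consists of an op-code and an operand.]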

29

Program Execution

The machine cycle:

1. Fetch: retrieve the next instruction from memory and then increment the program counter.

2. Decode: decode the bit pattern in the instruction register.

3. Execute: perform the action requested by the instruction in the instruction register.
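The three steps of the machine cycle can be sketched as a small simulator. The machine below is hypothetical: its three op-codes (LOAD, ADD, HALT), the two-cell instruction format, and the single accumulator register are assumptions made for this example and are not the machine language used in the course.

```python
LOAD, ADD, HALT = 1, 2, 0   # hypothetical op-codes


def run(memory: list) -> int:
    """Execute a program stored in memory; return the accumulator at HALT."""
    program_counter = 0
    accumulator = 0
    while True:
        # Fetch: copy the next instruction and advance the program counter.
        instruction_register = (memory[program_counter],
                                memory[program_counter + 1])
        program_counter += 2
        # Decode: split the instruction into op-code and operand fields.
        op_code, operand = instruction_register
        # Execute: perform the requested action.
        if op_code == LOAD:
            accumulator = operand
        elif op_code == ADD:
            accumulator += operand
        elif op_code == HALT:
            return accumulator
        else:
            raise ValueError(f"unknown op-code {op_code}")


program = [LOAD, 5, ADD, 7, HALT, 0]
print(run(program))
```

Because the program is just the contents of `memory`, changing the program means changing a list, not rewiring the machine. This is the stored-program concept in miniature.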

30

Other Architectures

The design of a machine’s language - complex instruction set Vs. simple instruction set.

CISC Vs. RISC. CISC - microprogram. RISC - simple CPU design.

31

Other Architectures

Pipelining - the throughput concept. Multiprocessor machines - parallel processing. SISD, SIMD, MIMD. Load balancing problem in multiprocessor machines. Distributed systems.

32

Arithmetic/Logic Instructions

Logic operations - AND, OR, XOR, and so on. Masking (AND operation) and bit maps. Rotation and shift operations - logical shift and arithmetic shift (which leaves the sign bit unchanged).

Arithmetic operations - add, subtract, and so on.
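A short sketch of masking, shifting, and rotation on 8-bit patterns. The 8-bit width and the function names are assumptions for illustration.

```python
def mask_low_nibble(byte: int) -> int:
    """AND with 00001111 keeps the low four bits and clears the rest."""
    return byte & 0x0F


def logical_shift_right(byte: int, n: int = 1) -> int:
    """Shift right, filling the vacated high bits with 0."""
    return (byte & 0xFF) >> n


def arithmetic_shift_right(byte: int, n: int = 1) -> int:
    """Shift right while leaving the sign bit unchanged."""
    byte &= 0xFF
    sign = byte & 0x80
    for _ in range(n):
        byte = (byte >> 1) | sign
    return byte


def rotate_right(byte: int, n: int = 1) -> int:
    """Bits that fall off the right end re-enter on the left."""
    byte &= 0xFF
    n %= 8
    return ((byte >> n) | (byte << (8 - n))) & 0xFF


print(format(mask_low_nibble(0b10101101), "08b"))
print(format(logical_shift_right(0b10010000), "08b"))
print(format(arithmetic_shift_right(0b10010000), "08b"))
print(format(rotate_right(0b00000001), "08b"))
```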

33

Computer-Peripheral Communication

Controllers handle communication between a machine’s CPU and its peripheral devices.

A controller is often a small stand-alone computer with its own memory and CPU, running a program that converts messages and data back and forth between the machine and a peripheral device.

34

Computer-Peripheral Communication

[Figure: the CPU, main memory, and two controllers attached to a common bus; each controller is connected to its own peripheral device.]

35

Computer-Peripheral Communication

Direct memory access (DMA) - the ability of a controller to access main memory directly.

Buffering - a buffer is any location where one system leaves data to be picked up later by another.

Von Neumann bottleneck - contention for the central communication bus.

36

Computer-Peripheral Communication

[Figure: memory-mapped I/O. The CPU, main memory, and a controller with its peripheral device share the bus.]

37

Computer-Peripheral Communication

Port - the block of addresses associated with a controller. Handshaking - the two-way communication that takes place between devices. Parallel and serial communications. Bits per second (bps) and baud rate. Data compression. Huffman code. Lempel-Ziv encoding.
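As a sketch of the idea behind a Huffman code, the function below gives frequent symbols shorter bit patterns by repeatedly merging the two least frequent groups. The function name and the sample text are assumptions for illustration; Lempel-Ziv encoding is not shown.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict:
    """Build a prefix code in which frequent symbols get shorter codes."""
    if not text:
        return {}
    counts = Counter(text)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (frequency, tie-breaker, {symbol: code built so far}).
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Prefix 0 to every code in one group and 1 to every code in the other.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]


codes = huffman_codes("aaaabbc")
encoded = "".join(codes[ch] for ch in "aaaabbc")
print(codes, len(encoded), "bits instead of", 8 * len("aaaabbc"))
```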

38

Part II: Software

In part II, we focus on topics associated with software. In particular, we will investigate the discovery, representation, and communication of algorithms.

Operating systems and networks. Algorithms. Programming languages. Software engineering.

39

Ch. 3 Operating Systems and Networks

The evolution of operating systems. Operating system architecture. Coordinating the machine’s activities. Handling Competition among processes. Networks. Network protocols.

40

Operating Systems

Why do we need an operating system? Computer applications often require a single machine to perform activities that may compete with one another for the machine’s resources. A high degree of coordination is required to ensure that unrelated activities do not interfere with one another and that communication between related activities is efficient and reliable.

What is an operating system? A software system that handles this coordination task.

41

The Evolution of Operating Systems

Single-processor systems. Batch processing - the execution of jobs (programs) by collecting them in a single batch, then executing them without further interaction with the user.

A job queue (FIFO) and a job control language (JCL).

The main drawback of batch processing is the lack of interaction between the user and the job.

42

The Evolution of Operating Systems

Interactive processing. Real-time processing. Time-sharing. Multitasking - time-sharing on a single-user system. Multiprocessor systems - networks such as the Internet. Load balancing and scaling problems.

43

Operating System Architecture

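[Figure: software classification. Software divides into application software and system software; system software divides into utility software and the operating system; the operating system divides into the shell and the kernel.]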

44

Operating System Architecture

A machine’s software can be divided into two categories: application software and system software.

Application software - the programs for performing tasks particular to the machine’s utilization.

System software - performs tasks which are common to computer systems in general.

45

Operating System Architecture

System software can be divided into two categories: operating-system software and utility software.

Utility software consists of software units that extend the capabilities of the operating system. For example, the ability to format a disk or software for communicating through a modem over telephone lines.

46

Operating System Architecture

Shell - the portion of an operating system that defines the interface between the operating system and its users.

Graphical user interface (GUI). Importance of uniformity in the human-machine interface across a variety of machines.

UNIX vs. MS-DOS and Windows.

47

Operating System Architecture

Kernel - the internal part of an operating system, which contains those software components that perform the very basic functions required by the computer installation.

File manager - directory (folder) and path. Device drivers. Memory manager.

48

Operating System Architecture

Main memory vs. virtual memory. Pages. Scheduler and dispatcher. Booting (bootstrapping). Bootstrap - a short program stored in ROM that is executed automatically when the machine is turned on.

49

Coordinating the Machine’s Activities

Process - a dynamic activity whose properties change as time progresses. Process state - a snapshot of the activity at a particular time, for example the current position in the program being executed and the values in the CPU registers.

A program vs. a process. Interprocess communication.

50

Coordinating the Machine’s Activities

Process administration - the tasks associated with process coordination are handled by the scheduler and dispatcher within the operating system’s kernel.

Process table - holds information about each process from the time it is created (its assigned memory area, its priority, and its status: ready or waiting).

51

Coordinating the Machine’s Activities

The dispatcher is the component of the kernel that ensures that the scheduled processes are actually executed.

In a time-sharing system, the dispatcher divides time into short segments called time slices or quanta.

When a process uses up its time slice, the dispatcher interrupts it and assigns a time slice to another process (a process switch).
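A minimal simulation of time slicing: each ready process receives one time slice in turn, and a process that has not finished goes back to the end of the queue. The round-robin ordering, the process names, and the function name are assumptions for illustration; a real dispatcher also saves and restores process state.

```python
from collections import deque


def round_robin(burst_times: dict, quantum: int) -> list:
    """Return the order in which processes receive time slices."""
    ready = deque(burst_times.items())
    order = []
    while ready:
        name, remaining = ready.popleft()
        order.append(name)          # the dispatcher gives this process a slice
        remaining -= quantum
        if remaining > 0:
            # Time slice used up: interrupt and requeue (process switch).
            ready.append((name, remaining))
    return order


print(round_robin({"A": 3, "B": 1, "C": 2}, quantum=1))
```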

52

Coordinating the Machine’s Activities

The client/server model. A client - makes requests of other units. A server - satisfies the requests made by clients. Using the client/server model in software design leads to uniformity among the types of communication taking place in the system.

53

Handling Competition Among Processes

Competition for resources among processes. Semaphores. Test-and-set. Critical region - a sequence of instructions that can be executed by only one process at a time.
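The sketch below guards a critical region with a semaphore. It uses threads within one Python program as a stand-in for processes, and the shared counter and function names are assumptions made for this example.

```python
import threading

counter = 0
semaphore = threading.Semaphore(1)   # at most one thread inside the region


def deposit(times: int) -> None:
    global counter
    for _ in range(times):
        with semaphore:              # enter the critical region
            counter += 1             # read-modify-write on shared data


def run_workers(workers: int = 4, times: int = 1000) -> int:
    """Start several threads that all update the shared counter."""
    global counter
    counter = 0
    threads = [threading.Thread(target=deposit, args=(times,))
               for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter


print(run_workers())
```

Without the semaphore, two threads could read the same old value of `counter` and one of the updates would be lost.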

54

Handling Competition Among Processes

Deadlock - two or more processes are blocked from progressing because each is waiting for access to resources allocated to another.

Deadlock cannot occur unless all three of the following conditions are satisfied:

1. There is competition for non-shareable resources.

55

Handling Competition Among Processes

2. The resources are requested on a partial basis; that is, having received some resources, a process will return later to request more.

3. Once a resource has been allocated, it cannot be forcibly retrieved.

Spooling - holding data for output at a later but more convenient time.

56

Networks

Local area networks (LAN). Wide area networks (WAN). Proprietary networks. Open networks. Network topology - ring, bus, star, and irregular.

57

Networks

Internet - initiated in 1973 by the Defense Advanced Research Projects Agency (DARPA). Goal: develop the ability to connect a variety of computer networks so that they can function as a single network.

Internet addressing - domains (a collection of network clusters), network identifier, host address; e.g., [email protected].

58

Networks

Email and name servers. The World Wide Web - hypertext and hypermedia documents. A browser - a client. Uniform resource locator (URL) - the address through which a browser can contact the proper server and request the desired document.

Hypertext Markup Language (HTML).

59

Networks

Unauthorized access to information and vandalism.

Passwords and data encryption. Virus. Worm.

60

Network Protocols

Protocols - the rules that govern the communication between different components within a computer system.

Token ring protocol for networks with the ring topology.

CSMA/CD (carrier sense, multiple access with collision detection) in an Ethernet.

61

Network Protocols

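[Figure: package-shipping analogy for layered protocols. At the origin the package passes from you to the shipper to the airline; at the destination it passes from the airline to the shipper to the customer. The package travels inside a container, and the container travels inside an aircraft.]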

62

Network Protocols: The Internet Software Layers

[Figure: a message passes down through the application, transport, network, and link layers at the message source, and up through the link, network, transport, and application layers at the message destination.]

63

Network Protocols

Open Systems Interconnection (OSI). International Organization for Standardization (ISO). TCP/IP (Transmission Control Protocol/Internet Protocol). UDP (User Datagram Protocol).

64

Ch. 4 Algorithms

The concept of an algorithm. Algorithm representation. Algorithm discovery. Iterative structures. Recursive structures. Efficiency and correctness.

65

The Concept of an Algorithm

An algorithm is an ordered set of unambiguous, executable steps, defining a terminating process.

Parallel algorithms. Program Vs. algorithm Vs. process.

66

Algorithm Representation

Primitives are a set of well-defined building blocks from which algorithm representations can be constructed.

Primitives - graphical and textual. Primitives => programming language. Primitives - syntax and semantics.

67

Algorithm Representation

Pseudocode - a notational system in which ideas can be expressed informally during the algorithm development process.

Ex. If you have more than $10, buy a cake; otherwise buy nothing => if (cond) then (act1) else (act2)

Ex. As long as you have money, you can spend => while (having money) do (spend)

68

Algorithm Representation

Ex. assign name the value price + tax. A pseudocode procedure begins with its name. Ex. The pseudocode for Greetings:

procedure Greetings
  assign Count the value 3;
  while Count > 0 do
    (print the message “Hello” and
     assign Count the value Count - 1)
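For comparison, the Greetings procedure might be written in Python as follows. This is one possible translation, not part of the original slides; returning the list of messages is an addition made here so that the result can be inspected.

```python
def greetings() -> list:
    """Print "Hello" three times, following the pseudocode step by step."""
    messages = []
    count = 3                      # assign Count the value 3
    while count > 0:               # while Count > 0 do
        print("Hello")             # print the message "Hello"
        messages.append("Hello")
        count = count - 1          # assign Count the value Count - 1
    return messages


greetings()
```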

69

Algorithm Discovery

The development of a program consists of two activities - discovering the underlying algorithm and representing that algorithm as a program.

The basic principles for problem-solving:

1. Understand the problem.

2. Get an idea as to how an algorithmic procedure might solve the problem.

70

Algorithm Discovery

3. Formulate the algorithm and represent it as a program.

4. Evaluate the program for accuracy and for its potential as a tool for solving other problems.

Conscious work vs. inspiration. Stepwise refinement - a top-down methodology.

71

Iterative Structures

Iterative structures - a collection of instructions is repeated in a looping manner.

The while loop structure. The repeat loop structure. The insertion sort algorithm.
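A sketch of the insertion sort algorithm written with while loops, so that the loop structure named above is visible. The function works on a copy of its input; that choice, and the names used, are assumptions for illustration.

```python
def insertion_sort(items: list) -> list:
    """Return a sorted copy of items using insertion sort."""
    result = list(items)
    i = 1
    while i < len(result):
        pivot = result[i]
        j = i - 1
        # Move larger entries one position to the right to open a hole.
        while j >= 0 and result[j] > pivot:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = pivot      # drop the pivot into the hole
        i += 1
    return result


print(insertion_sort([5, 2, 4, 6, 1, 3]))
```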

72

Recursive Structures

Recursive structures provide an alternative to the loop paradigm for repetitive structures: a procedure repeats its work by invoking itself.

The binary search algorithm. The quick sort algorithm.
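A minimal recursive sketch of the binary search algorithm on a sorted list. The function signature and the convention of returning -1 when the target is absent are assumptions made for this example.

```python
def binary_search(items: list, target, low: int = 0, high: int = None) -> int:
    """Return the index of target in the sorted list items, or -1 if absent."""
    if high is None:
        high = len(items) - 1
    if low > high:                 # empty range: the target is not present
        return -1
    middle = (low + high) // 2
    if items[middle] == target:
        return middle
    if target < items[middle]:
        # Recursive call on the lower half.
        return binary_search(items, target, low, middle - 1)
    # Recursive call on the upper half.
    return binary_search(items, target, middle + 1, high)


print(binary_search([1, 3, 5, 7, 9], 7))
```

Each call halves the range still to be searched, which is why the recursion terminates.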

73

Efficiency and Correctness

You can develop a variety of algorithms to solve the same problem. However, the choice between efficient and inefficient algorithms can make the difference between a practical solution to a problem and an impractical one.

Time and storage complexity of the algorithm.

74

Efficiency and Correctness

How can we make sure that the algorithm and the program developed from it are correct?

Difference between testing and verification.

Precondition, assertions, loop invariant.

75

Ch. 5 Programming Languages

Historical perspective. Traditional programming concepts. Program units. Language implementation. Parallel computing. Declarative programming.

76

Historical Perspective

Machine language - binary form that directly controls the hardware.

Assembly language - mnemonic form of the machine language.

High-level programming language - English-like language.

Evolution?

77

Historical Perspective

[Figure: a machine-independent high-level language (HLL) is translated by a compiler; machine-dependent assemblers 1 through n then target architectures 1 through n.]

78

Historical Perspective

1st-generation - machine language. 2nd-generation - assembly language. 3rd-generation - machine independent. 4th-generation - software packages that allow users to customize computer software to their applications without needing technical expertise.

5th-generation - declarative (logic) programming.

79

Historical Perspective

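[Figure: a scale from the 1st to the 4th generation. At the 1st-generation end, problems are solved in an environment in which the human must conform to the machine’s characteristics; at the 4th-generation end, problems are solved in an environment in which the machine conforms to the human’s characteristics.]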

80

Historical Perspective

Imperative paradigm - the procedural paradigm: machine languages, FORTRAN, COBOL, ALGOL, BASIC, APL, C, Pascal, Ada.

Functional paradigm - views the process of program development as the construction of “black boxes,” each of which accepts inputs and produces outputs: LISP, ML, Scheme.

81

Historical Perspective

Object-oriented paradigm - units of data are viewed as active “objects” rather than the passive units envisioned by the imperative paradigm, SIMULA, Smalltalk, C++, Ada95, Java.

Declarative paradigm - discover and implement a general problem-solving algorithm, GPSS, Prolog.

82

Traditional Programming Concepts

Statements in programming languages tend to fall into three categories: declarative statements, imperative statements, and comments.

Declarative statements - define customized terminology used in the program.

Imperative statements - describe steps in the underlying algorithm.

Comments.

83

Traditional Programming Concepts

Variables, constants, and literals. Data types - integer, real, Boolean, character, and so on. Data structures - array, queue, list, and so on. Assignment statements. Control statements. Comments - internal documentation.

84

Program Units

Breaking large programs into manageable units: modules, functions, objects.

Procedures and functions. Parameter passing - formal parameters and actual parameters, call by address and call by value.

I/O statements.

85

Language Implementation

Translation - converting a program from one language to another.

Translation involves three activities: 1. Lexical analysis, 2. Parsing, and 3. Code generation.

Lexical analysis - recognizing which strings of symbols from the source program represent a single entity.

86

Language Implementation

Parsing - identifying the grammatical structure of the program and recognizing the role of each component.

Fixed-format languages Vs. free-format languages.

Key words, reserved words, syntax diagram, parse tree.

Coercion and strongly typed.

87

Language Implementation

Code generation - constructing the machine language instructions to simulate the statements recognized by the parser.

Code optimization. Linker - links all necessary object programs to produce a complete, executable program. Loader - places the program in memory for execution (what about multitasking?)

88

Parallel Computing

Developing languages for describing processes that execute simultaneously.

Ada. Linda - tuple space (a shared storage area), in which each process in the system can deposit and retrieve data bundles.

89

Declarative Programming

Logical deduction - resolution. Resolution can be applied only to pairs of statements that appear in clause form. Inconsistent - a collection of statements is inconsistent if it is impossible for all the statements to be true at the same time.

Prolog - a declarative programming language based on repeated resolution.

90

Ch. 6 Software Engineering

The software engineering discipline. The software life cycle. Modularity. Development tools and techniques. Documentation. Software ownership and liability.

91

The Software Engineering Discipline

How can a large program (>100K lines of code) or a huge program (>1M lines of code) be developed and managed?

What is the software engineering discipline? What quantitative measures (metrics) can assess the quality and success of the underlying software development?

Developing techniques for immediate applications and for future applications.

92

The Software Life Cycle

[Figure: the software life cycle: development, followed by a repeating cycle of use and modification.]

93

The Software Life Cycle

[Figure: the development phase: analysis, design, implementation, testing.]

94

The Software Life Cycle

Waterfall model. Computer-aided software engineering (CASE). Prototyping.

95

Modularity

Modular implementation - structure chart. Coupling - control and data coupling. Implicit coupling through global data - why is it undesirable? Side effects! Cohesion - the internal binding within a module. Logical cohesion and functional cohesion.

96

Development Tools and Techniques

Top-down design. Bottom-up design. Dataflow diagrams - a pictorial representation of data paths. Entity-relationship diagrams - a pictorial representation of the items of information (entities) within the system and the relationships between these pieces of information.

97

Development Tools and Techniques

Data dictionaries - a central repository of information about the data items appearing throughout the system.

Enhancing communication with the potential users of the system.

Establishing uniformity throughout the system.

98

Documentation, Software Ownership and Liability

User documentation and system documentation.

Copyright and patent laws.

99

Part III: Data Organization

Data structures. File structures. Database structures.

100

Ch. 7 Data Structures

Arrays. Lists. Stacks. Queues. Trees. Customized data types. Object-oriented programming.

101

Arrays

One dimensional arrays. Multidimensional arrays.

102

Lists

Pointers. Contiguous lists. Linked lists.

103

Stacks

Last-in first-out. Push and pop. Using stacks for maintaining procedure calls. Other applications?

104

Queues

First-in first-out. Head and tail. Circular queue. Applications?
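A sketch of a circular queue stored in a fixed block of cells: the head index wraps around to the start of the block instead of the entries being shifted. The class name, method names, and fixed capacity are assumptions made for this example.

```python
class CircularQueue:
    """First-in first-out queue stored in a fixed number of cells."""

    def __init__(self, capacity: int):
        self._cells = [None] * capacity
        self._head = 0      # index of the oldest entry
        self._size = 0      # number of entries currently stored

    def enqueue(self, item) -> None:
        if self._size == len(self._cells):
            raise OverflowError("queue is full")
        # The tail position wraps around the end of the block.
        tail = (self._head + self._size) % len(self._cells)
        self._cells[tail] = item
        self._size += 1

    def dequeue(self):
        if self._size == 0:
            raise IndexError("queue is empty")
        item = self._cells[self._head]
        self._cells[self._head] = None
        self._head = (self._head + 1) % len(self._cells)
        self._size -= 1
        return item


queue = CircularQueue(3)
for job in ("job1", "job2", "job3"):
    queue.enqueue(job)
print(queue.dequeue())      # the first job in is the first job out
queue.enqueue("job4")       # reuses the cell freed at the front
```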

105

Trees

Trees - an organization chart; e.g., a family tree or a company’s organization chart.

Root node, leaf nodes, arc, subtrees. Parent, children, siblings. Depth of a tree. Tree implementation. Binary tree. Applications?

106

Customized Data Types

User-defined types - allow programmers to define additional data types using the primitive types and structures as building blocks.

Abstract data types - encompass both the storage system and the associated operations.

Encapsulation.

107

Object-Oriented Programming

Objects. Methods (or member functions). Class. Inheritance.

108

Ch. 8 File Structures

Sequential files. Text files. Indexed files. Hashed files. The role of the operating system.

109

Sequential Files

When should a sequential file be used? When all the records need to be processed and it makes no difference which records are processed first.

If the storage device is a tape system, we normally follow the sequential order because of the sequential nature of the tape itself. What about a disk system?

EOF and sentinel. How do we update a sequential file?

110

Sequential Files

In Pascal, the statements read() and write() are used to retrieve and deposit information.

[Figure: a transaction file and the old master file are merged to produce the new master file.]

For the algorithm, see Figure 8.3.

111

Text Files

Text file - a sequential file in which each logical record is a single byte (character).

How is a text file manipulated? With a word processor?

How can text files be used as the input and output files of a program?

112

Indexed Files

If you need to retrieve records in the file in an arbitrary order throughout the day, what is the main problem when you use a sequential file to store the records?

What is the fastest way to find the subject you are interested in within a book? Answer: use the index.

113

Indexed Files

An index for a file consists of a listing of the key field values occurring in the file along with the location in mass storage of the corresponding record.

Key field. An inverted file - primary key and secondary key. When records are inserted and deleted, all indexes must be updated.

114

Indexed Files

Index size - since the index must be moved to main memory to be searched, it must remain small enough to fit within a reasonable memory area.

What if the index is too large? The partial-index structure. An index to the index.

115

Hashed Files

Sequential files - processed in serial order. Indexed files - direct access (random access). Overhead: maintaining an index table.

Hashed files - reduce the overhead by computing the location of a record in mass storage by applying an algorithm to the value of the key field in question.

116

Hashed Files

A particular hashing technique:

1. Divide the mass storage area allotted to the file into several sections called buckets.

2. Convert the key field value into a numeric value.

3. Divide that numeric value by the number of buckets.

4. Use the remainder of the division as the integer that identifies the bucket in which the record is stored.
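The division-based hashing technique can be sketched as follows. Interpreting the key's bytes as one large integer is an assumption made for this example (the slides do not say how the numeric value is obtained), as are the function name and the default of 41 buckets.

```python
def bucket_for(key: str, bucket_count: int = 41) -> int:
    """Map a key field value to a bucket number by the division method."""
    # Convert the key into a numeric value (here: its bytes read as an integer).
    numeric = int.from_bytes(key.encode("utf-8"), "big")
    # The remainder after dividing by the number of buckets picks the bucket.
    return numeric % bucket_count


for key in ("A", "Smith", "Jones"):
    print(key, "->", bucket_for(key))
```

Whatever the key, the result is always in the range 0 to `bucket_count - 1`, so every record has a bucket.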

117

Hashed Files

What is the main concern when using hashed files?

Distribution problems - once we have chosen the hash algorithm, we have no control over the distribution of records in mass storage.

Clustering problem - the majority of records are placed in the same bucket and the rest of the buckets contain almost no records.

118

Hashed Files

Overflow problem - unless the buckets are extremely large, overflow may occur.

Goal - how to select a hash algorithm that evenly distributes the records among the buckets.

Division method. The midsquare method. The extraction method.

119

Hashed Files

Collision - more than one record will hash to the same bucket.

Assume records are inserted into 41 buckets: the probability of placing the 1st record into an empty bucket is 41/41, the 2nd is 40/41, the 3rd is 39/41, and so on. The probability of placing 8 records into 8 different empty buckets is (41/41)(40/41)(39/41)...(34/41) = 0.482, which is less than 50%: a collision is more likely than not.
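The product in the calculation above can be checked with a few lines of Python. The function name is chosen for illustration, and `math.prod` requires Python 3.8 or later.

```python
from math import prod


def no_collision_probability(records: int, buckets: int) -> float:
    """Probability that every record lands in a different bucket."""
    return prod((buckets - k) / buckets for k in range(records))


p = no_collision_probability(8, 41)
print(round(p, 3))
```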

120

Hashed Files

The high probability of collisions indicates that a hashed file should never be implemented under the assumption that clustering will never occur.

How should the overflow problem be handled? Reserve an additional area of mass storage to hold overflow records. Double hashing method.

121

The Role of the Operating System

Operating systems need to manipulate files to perform designated tasks. An operating system maintains a table called a file descriptor or file control block for each file being processed.

In Pascal, file descriptors can be created by assign() and reset().

122

Ch. 9 Database Structures

General issues. The layered approach to database implementation. The relational model. Object-oriented databases. Maintaining database integrity.

123

General Issues

A file vs. a database organization. Why is a database system needed? The consolidation approach - advantage: central control, disadvantage: security. Database administrator (DBA). Access privileges - schema and subschema. Other issues - size and scope, privacy.

124

The Layered Approach to Database Implementation

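[Figure: layers of a database implementation. The end user works with application software and sees data in terms of the applications; application software works with the database management system and sees data in terms of a database model; the database management system works with the actual database and sees data in its actual organization.]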

125

The Layered Approach to Database Implementation

Database management system (DBMS). The advantages of separating application software from the database management system:

1. Simplifies the design process - for example, for a distributed database.

2. Provides a central means of controlling access to the database.

126

The Layered Approach to Database Implementation

3. Data independence - the ability to change the organization of the database itself without changing the application software.

4. Allows the application software to be written based on a simplified, conceptual view of the database (database model) instead of the actual complex database structure.

Host languages.

127

The Relational Model

Relation - tuple (row) and attribute (column).

How to make up the database using the relations of data?

Extending the relation - pros and cons? Dividing information into various relations (nonloss decomposition) - pros and cons?

128

The Relational Model

Relational operations: The SELECT operation. The PROJECT operation. The JOIN operation. The SQL (Structured Query Language).
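The three relational operations can be sketched in Python by treating a relation as a list of rows (dictionaries). The sample relations, attribute names, and function signatures are hypothetical and chosen only for illustration; a real DBMS would express these operations in SQL.

```python
def select(relation: list, predicate) -> list:
    """SELECT: keep the tuples (rows) that satisfy a condition."""
    return [row for row in relation if predicate(row)]


def project(relation: list, attributes: list) -> list:
    """PROJECT: keep only some attributes (columns), dropping duplicate rows."""
    seen, result = set(), []
    for row in relation:
        values = tuple(row[a] for a in attributes)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


def join(left: list, right: list, left_attr: str, right_attr: str) -> list:
    """JOIN: combine rows from two relations whose attributes match."""
    return [{**l, **r} for l in left for r in right
            if l[left_attr] == r[right_attr]]


# Hypothetical sample data.
employees = [{"EmplId": "25X15", "Name": "Joe"},
             {"EmplId": "34Y70", "Name": "Cheryl"}]
assignments = [{"AssignEmpl": "25X15", "Dept": "Sales"},
               {"AssignEmpl": "34Y70", "Dept": "Personnel"}]

print(select(employees, lambda row: row["Name"] == "Joe"))
print(project(employees, ["Name"]))
print(join(employees, assignments, "EmplId", "AssignEmpl"))
```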

129

Object-Oriented Databases

Why object-oriented databases:

1. Data independence can be achieved by encapsulation.

2. The concepts of classes and inheritance fit schemas and subschemas of databases.

3. Intelligent data objects that can answer questions themselves.

4. It may overcome some of the restrictions inherent in other database models.

130

Maintaining Database Integrity

Why is database integrity important? The commit/rollback protocol. Cascading rollback. Locking protocol - shared locks and exclusive locks. Wound-wait protocol.

131

PART IV: The Potential of Algorithmic Machines

Artificial Intelligence. Theory of Computation.

132

Ch. 10 Artificial Intelligence

Some philosophical issues. Image analysis. Reasoning. Control system activities. Using Heuristics. Artificial neural networks. Applications of AI.

133

Some Philosophical Issues

Machines vs. humans. Performance vs. simulation. Intelligence as an interior characteristic - Turing test and program DOCTOR (ELIZA).

How to create an intelligent machine?

134

An Intelligent Puzzle-Solving Machine

This machine takes the form of a metal box equipped with a gripper, a video camera, and a finger with a rubber end so that it does not slip when pushing something.

Actions: 1. Turn on the machine. 2. Place the puzzle. 3. The finger pushes the tiles back to the original order. 4. Turn off the machine.

135

Image Analysis

The first intelligent behavior required by the puzzle-solving machine is the extraction of information through a visual medium.

Perception - the ability to determine the current status of the puzzle.

Optical character readers.

Character recognition based on matching the geometric characteristics.

136

Reasoning

Is it possible to develop a separate program for each possible initial configuration (181,440 of them in total)?
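The figure of 181,440 is consistent with a 3x3 puzzle of eight tiles and one hole. The puzzle size is an assumption here, since the slide does not state it. Nine positions can be filled in 9! ways, and only half of those arrangements can be reached by sliding tiles:

```python
import math

# Assumption: a 3x3 board with eight tiles and one empty position.
arrangements = math.factorial(9)   # every way to fill the nine positions
reachable = arrangements // 2      # only half can be reached by sliding tiles

print(arrangements, reachable)
```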

Develop a program which can solve the problem itself - the ability to make decisions, draw conclusions, and in short, perform elementary reasoning activities.

137

Reasoning

A production system consists of three main components:

1. A collection of states - start/goal states.
2. A collection of productions (rules).
3. A control system - which consists of the logic that solves the problem of moving from the start state to the goal state.

State graph - conceptualizing all states, rules, and preconditions in a production system.

138

Reasoning

Start state:
Socrates is a man. All men are humans. All humans are mortal.

Intermediate state:
Socrates is a man. All men are humans. All humans are mortal. Socrates is a human.

Goal state:
Socrates is a man. All men are humans. All humans are mortal. Socrates is a human. Socrates is mortal.
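The move from the start state to the goal state can be sketched as repeated rule application. This is a minimal illustration, assuming facts are stored as (individual, category) pairs and each rule reads "all X are Y"; the representation and the function name are not from the slides.

```python
def forward_chain(facts, rules):
    """Apply rules of the form "all X are Y" until no new fact appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for individual, category in list(known):
            for premise, conclusion in rules:
                new_fact = (individual, conclusion)
                if category == premise and new_fact not in known:
                    known.add(new_fact)   # one production moves to a new state
                    changed = True
    return known

start_facts = {("Socrates", "man")}
rules = [("man", "human"), ("human", "mortal")]

goal_facts = forward_chain(start_facts, rules)
```

Each added fact corresponds to one step in the state graph: first "Socrates is a human", then "Socrates is mortal".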

139

Control System Activities

A state-graph traversal problem.
Search tree.
How to build a search tree?
It is impractical to develop a full search tree for a complex problem.
Using depth-first construction instead of breadth-first construction.
Avoiding redundancy.
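Building a search tree while avoiding redundancy can be sketched as follows. This sketch assumes the 3x3 eight-puzzle, writes states as tuples with 0 for the hole, and uses breadth-first construction because it is the simplest to show; for larger problems the full tree is impractical, as noted above. All names are illustrative.

```python
from collections import deque

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the hole

def neighbors(state):
    """Yield every state reachable by sliding one tile into the hole."""
    hole = state.index(0)
    row, col = divmod(hole, 3)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            swap = r * 3 + c
            nxt = list(state)
            nxt[hole], nxt[swap] = nxt[swap], nxt[hole]
            yield tuple(nxt)

def solve(start):
    """Breadth-first search; returns the states from start to GOAL, or None."""
    parents = {start: None}          # doubles as the set of states already seen
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        if state == GOAL:
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        for nxt in neighbors(state):
            if nxt not in parents:   # avoid redundancy: never revisit a state
                parents[nxt] = state
                frontier.append(nxt)
    return None

path = solve((1, 2, 3, 4, 5, 6, 0, 7, 8))
```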

140

Using Heuristics

Heuristics - the use of intuition: a rule of thumb that may lead in the right direction but offers no guarantee of doing so.

How to develop a heuristic - first develop a quantitative measure (a cost function) by which a program can determine which of several states is closest to the goal.
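One possible cost function, assuming the 3x3 eight-puzzle with 0 marking the hole, adds up how far each tile is from its goal position. The choice of this particular measure and the names below are assumptions; the slide only asks for some quantitative measure.

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the hole

def cost(state):
    """Sum of the row and column distances of each tile from its goal position."""
    total = 0
    for index, tile in enumerate(state):
        if tile == 0:
            continue  # the hole is not a tile
        goal_index = GOAL.index(tile)
        total += abs(index // 3 - goal_index // 3)   # rows away
        total += abs(index % 3 - goal_index % 3)     # columns away
    return total

def most_promising(states):
    """Pick the state the heuristic considers closest to the goal."""
    return min(states, key=cost)
```

A control system can then extend the search tree from the most promising state first, accepting that the estimate may sometimes point in the wrong direction.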

141

Artificial Neural Networks

Neural networks - model networks of neurons in living biological systems.

[Figure: a processing unit computes its effective input I1W1 + … + InWn, compares it with a threshold value, and produces an output of 0 or 1.]
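A single processing unit of this kind can be sketched directly from the formula. The sketch assumes the unit outputs 1 when the effective input exceeds the threshold; whether the comparison is strict is an assumption, and the weights in the example are made up.

```python
def neuron_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    effective_input = sum(i * w for i, w in zip(inputs, weights))
    return 1 if effective_input > threshold else 0

# With weights of 1 and a threshold of 1.5, the unit behaves like logical AND.
and_results = [neuron_output([a, b], [1, 1], 1.5) for a in (0, 1) for b in (0, 1)]
```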

142

Applications of Artificial Intelligence

Language processing.
Robotics.
Database systems.
Expert systems.

143

Ch. 11 Theory of Computation

A bare bones programming language. Turing machines. Computable functions. A noncomputable function. Complexity and its measure. Problem classification.

144

A Bare Bones Programming Language

A universal programming language - a language that encompasses the power of algorithmic processes themselves; i.e., if a problem can be solved algorithmically, then an algorithm for solving the problem can be expressed in the language. Conversely, if a solution to the problem cannot be expressed in the language, there is no algorithm that solves the problem.

145

A Bare Bones Programming Language

Data description statements - all variables are considered to be of type “bit pattern of any length,” so no declarative part is needed.

Process description statements - three assignment statements (clear, incr, decr) and one control structure (while-end).

146

A Bare Bones Programming Language

“move tax to extra”

clear aux;
clear extra;
while tax not 0 do;
  incr aux;
  decr tax;
end;
while aux not 0 do;
  incr tax;
  incr extra;
  decr aux;
end;
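To see what the program above does, it can be traced with a small interpreter. This is a rough sketch rather than a reference implementation: it stores variables as non-negative integers instead of bit patterns, assumes that decr leaves a zero value at zero, and the function name is made up.

```python
def run_bare_bones(source, variables=None):
    """Execute clear/incr/decr/while-end statements and return the variables."""
    env = dict(variables or {})
    statements = [s.split() for s in source.split(";") if s.strip()]

    # Pair each "while" with its "end" so execution can jump between them.
    jump, open_loops = {}, []
    for position, words in enumerate(statements):
        if words[0] == "while":
            open_loops.append(position)
        elif words[0] == "end":
            start = open_loops.pop()
            jump[start], jump[position] = position, start

    pc = 0
    while pc < len(statements):
        words = statements[pc]
        op = words[0]
        if op == "clear":
            env[words[1]] = 0
        elif op == "incr":
            env[words[1]] = env.get(words[1], 0) + 1
        elif op == "decr":
            env[words[1]] = max(0, env.get(words[1], 0) - 1)  # assumed: never below 0
        elif op == "while":                 # form: while X not 0 do
            if env.get(words[1], 0) == 0:
                pc = jump[pc]               # skip to the matching end
        elif op == "end":
            pc = jump[pc] - 1               # go back and re-test the condition
        else:
            raise ValueError("unknown statement: " + " ".join(words))
        pc += 1
    return env

program = """
clear aux; clear extra;
while tax not 0 do; incr aux; decr tax; end;
while aux not 0 do; incr tax; incr extra; decr aux; end;
"""
result = run_bare_bones(program, {"tax": 3})
```

Starting with tax = 3, the first loop drains tax into aux and the second loop restores tax while building extra, so tax and extra both end at 3 and aux ends at 0.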

147

Turing Machines

Turing machines are conceptual devices for studying the power of algorithmic processes.

A Turing machine consists of a control unit that can read and write symbols on a tape.

At any time the machine is in one of a finite number of states, which include a start state and a halt state.
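The pieces named above (tape, control unit, states) can be sketched as a short simulator. The table format, the state names START and HALT, the blank symbol, and the step limit are all assumptions made for this illustration.

```python
def run_turing_machine(transitions, tape, state="START", blank="*", max_steps=1000):
    """Run until the HALT state is reached and return the tape contents.

    transitions maps (state, symbol) to (symbol to write, move, next state),
    where move is "L", "R", or "N" (no move).
    """
    cells = dict(enumerate(tape))   # tape position -> symbol
    head = 0
    steps = 0
    while state != "HALT":
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        symbol = cells.get(head, blank)          # unwritten cells read as blank
        write, move, state = transitions[(state, symbol)]
        cells[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
        steps += 1
    return "".join(cells[i] for i in sorted(cells)).strip(blank)

# Example machine: flip every bit, then halt at the first blank cell.
invert_bits = {
    ("START", "0"): ("1", "R", "START"),
    ("START", "1"): ("0", "R", "START"),
    ("START", "*"): ("*", "N", "HALT"),
}
flipped = run_turing_machine(invert_bits, "1011")
```

The step limit exists only so the sketch cannot loop forever; a real Turing machine has no such bound.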

148

Turing Machines

Today’s computers <=> Turing machines
finite memories <=> infinite supply of tape
CPU <=> the control unit
bit patterns <=> states

The significance of Turing machines in theoretical computer science - the computational power of Turing machines is as great as that of any algorithmic system.

149

Computable Functions

How to measure computing power?

Goal: using Turing machines to investigate the power of the bare bones language.

Computing a function is the process of determining the output of the function from its inputs.

If one machine is capable of computing more functions than another, the former is considered the more powerful.

150

Computable Functions

Ex. A system in which function outputs are predetermined and recorded in a table.

Ex. A system in which function outputs are found by describing how to compute them.

Computable - the functions whose output values can be determined algorithmically from their input values.

Noncomputable functions!

151

Computable Functions

Turing computable.

The Church-Turing thesis.

If a computational system is capable of computing all the Turing-computable functions, it is considered to be a universal system.

Apply the Church-Turing thesis to confirm that the bare bones language is a universal programming language.

152

A Noncomputable Function

Computing the Gödel number. The halting problem.

153

Complexity and Its Measure

Time and storage complexities (Big O).
Order of complexity.
Polynomial and nonpolynomial problems.
NP problems - nondeterministic polynomial problems.
NP-complete problems.

154

Roadmap to Computer Science Study

Fundamental courses: Physics, Mathematics, and Introduction to Computer Science.

Software:
1. Fundamental: Problem Solving and Programming, Data Structure, Algorithm, and Software Engineering.
2. Language: Assembly Language, Programming Language, C, and JAVA.

155

Roadmap to Computer Science Study

3. Theory: Formal Language and Theory of Computation.
4. System: Operating System, Compiler, Networking, Database, and Multimedia.

Hardware:
1. Fundamental: Electronics, Logic Design, Digital System Design, and Computer Architecture.

156

Roadmap to Computer Science Study

2. System: Microprocessors and VLSI design.

Applications:
1. Consumer products.
2. Artificial Intelligence.
3. Networking.
4. Image Processing.
5. Computer Architecture and Compiler.

157

Roadmap to Computer Science Study

6. VLSI and Computer-Aided Design.
7. Biological (Medical) Computing.
8. Multimedia.
9. Databases.
10. Education.
11. Business and management.
12. And more!!!