Introduction to Information Theory and Applications

Upload: lamthuan

Post on 03-Jul-2018


Page 1: Introduction to Information Theory and Applicationsjnujprdistance.com/assets/lms/LMS JNU/B.Sc.(IT)/Sem I/Introduction...Summary ... 8.3.6 Project VENONA ... Table 8.2 VENONA Decrypt

Introduction to Information Theory and Applications

Board of Studies

Prof. H. N. Verma, Vice-Chancellor, Jaipur National University, Jaipur
Prof. M. K. Ghadoliya, Director, School of Distance Education and Learning, Jaipur National University, Jaipur
Dr. Rajendra Takale, Prof. and Head Academics, SBPIM, Pune

___________________________________________________________________________________________

Subject Expert Panel

Dr. Ramchandra G. Pawar, Director, SIBACA, Lonavala
Ashwini Pandit, Subject Matter Expert, Pune

___________________________________________________________________________________________

Content Review Panel

Gaurav Modi, Subject Matter Expert
Shubhada Pawar, Subject Matter Expert

___________________________________________________________________________________________

Copyright ©

This book contains the course content for Introduction to Information Theory and Applications.

First Edition 2013

Printed by: Universal Training Solutions Private Limited

Address: 5th Floor, I-Space, Bavdhan, Pune 411021.

All rights reserved. This book or any portion thereof may not be reproduced, distributed, transmitted, broadcast or stored in a retrieval system, in any form or by any means, including electronic, mechanical, photocopying or recording.

___________________________________________________________________________________________

Index

I. Content
II. List of Figures
III. List of Tables
IV. Abbreviations
V. Case Study
VI. Bibliography
VII. Self Assessment Answers

Book at a Glance

Contents

Chapter I
Introduction to Information Theory
Aim
Objectives
Learning outcome
1.1 Introduction
1.2 Data and Information
1.3 Information System
    1.3.1 Characteristics of Useful Information
    1.3.2 Information System Process
    1.3.3 Computer Based Information Systems
1.4 Information Theory
    1.4.1 Efficient Encodings
    1.4.2 Measuring Information Content
    1.4.3 The Intuition Behind the –P log P Formula
    1.4.4 Applications of Information Theory
1.5 Software Concepts
    1.5.1 Importance of Software Application
    1.5.2 Programming Language
    1.5.3 Types of Software
Summary
References
Recommended Reading
Self Assessment

Chapter II
Computer Fundamentals
Aim
Objectives
Learning outcome
2.1 Introduction
2.2 Definition of Computer
2.3 Essential Features of Computers
2.4 Characteristics of a Computer
2.5 History of Computer
2.6 Computer Generations
2.7 Computer Classification
Summary
References
Recommended Reading
Self Assessment

Chapter III
Computer Peripherals
Aim
Objectives
Learning outcome
3.1 Introduction
3.2 Basic Computer Components
3.3 Functional Units
    3.3.1 Arithmetic Logical Unit (ALU)
    3.3.2 Control Unit (CU)
    3.3.3 Central Processing Unit (CPU)
3.4 Types of Computer Memory
3.5 Primary Memory
    3.5.1 Random Access Memory (RAM)
    3.5.2 Read Only Memory (ROM)
    3.5.3 Cache Memory (Small, Fast RAM)
    3.5.4 Registers
3.6 Secondary Storage
    3.6.1 Magnetic Tape
    3.6.2 Magnetic Disk
    3.6.3 Floppy Disk
    3.6.4 Optical Disk
    3.6.5 Flash Memory
    3.6.6 USB Drives
    3.6.7 Removable Hard Drives
    3.6.8 Smart Cards
    3.6.9 Optical Cards
3.7 Input Output Devices
3.8 Input Devices
3.9 Output Devices
Summary
References
Recommended Reading
Self Assessment

Chapter IV
Computer Operations and Languages
Aim
Objectives
Learning outcome
4.1 Introduction
4.2 Computer Arithmetic
4.3 Binary Number System
    4.3.1 Counting in Binary
    4.3.2 Binary Arithmetic
    4.3.3 Conversion of Binary, Decimal, Hexadecimal and Octal Number Systems
    4.3.4 1’s and 2’s Complement of Binary Number
4.4 Floating Point Arithmetic
4.5 Arithmetic through Stacks
4.6 Computer Language
    4.6.1 Machine Language
    4.6.2 Assembly Language
    4.6.3 High-Level Language
4.7 Operating System (OS)
4.8 Instruction Cycle
4.9 Program Flow of Control with and without Interrupts
Summary
References
Recommended Reading
Self Assessment

Chapter V
Communication
Aim
Objectives
Learning outcome
5.1 Introduction
5.2 Analog and Digital Communication
    5.2.1 Transmission Impairments
    5.2.2 Signal to Noise Ratio
    5.2.3 Hamming Error-Correction Codes
    5.2.4 Channel Capacity
5.3 Communication Channels
    5.3.1 Wired Channels (Twisted-pair Wire, Coaxial Cable and Fibre-optic Cable)
    5.3.2 Wireless Channels (Radio Link, Microwave Link, Satellite Communication)
5.4 Transmission Technology
    5.4.1 Broadcast Networks
    5.4.2 Point-to-Point or Switched Networks
5.5 Modulation
Summary
References
Recommended Reading
Self Assessment

Chapter VI
Computer Networks
Aim
Objectives
Learning outcome
6.1 Introduction
6.2 Types of Networks
    6.2.1 Local Area Network
    6.2.2 Wide Area Network
    6.2.3 Difference Between LAN and WAN
    6.2.4 Other Types of Networks
    6.2.5 Network Topology
6.3 ISO OSI model
    6.3.1 Layers of OSI Model
    6.3.2 Protocol
    6.3.3 IP Address
    6.3.4 TCP/IP Protocol
6.4 The Internet
6.5 World Wide Web (WWW)
6.6 Clients and Servers
6.7 Ports
    6.7.1 Uses of Computer Ports
    6.7.2 Types of Ports
6.8 Domain Name Service
6.9 WWW, Browsers Connections
6.10 WWW Browsers
    6.10.1 Web Page
    6.10.2 URL
    6.10.3 Web Server
    6.10.4 HTTP
    6.10.5 HTML
6.11 Using the WWW
    6.11.1 Web Browser
    6.11.2 Searching for Information
    6.11.3 Search Techniques
6.12 Blogs
    6.12.1 Wikis
    6.12.2 Electronic Social Network
    6.12.3 Micro Blogging
    6.12.4 RSS
    6.12.5 Web 3.0
6.13 Electronic Mail
Summary
References
Recommended Reading
Self Assessment

Chapter VII
Introduction to Information Security
Aim
Objectives
Learning outcome
7.1 Introduction
7.2 Information Security and Privacy Lifecycle
7.3 Costs of Data Loss and Disclosure
    7.3.1 Cryptography
    7.3.2 Access Control
    7.3.3 Protocols
    7.3.4 Software
7.4 The People Problem
Summary
References
Recommended Reading
Self Assessment

Chapter VIII
Crypto Basics
Aim
Objectives
Learning outcome
8.1 Introduction
    8.1.1 Encryption and Decryption
8.2 How to Speak Crypto?
8.3 Classic Crypto
    8.3.1 Simple Substitution Cipher
    8.3.2 Cryptanalysis of a Simple Substitution
    8.3.3 Definition of Secure
    8.3.4 Double Transposition Cipher
    8.3.5 One-Time Pad
    8.3.6 Project VENONA
    8.3.7 Codebook Cipher
    8.3.8 Ciphers of the Election of 1876
8.4 Modern Crypto History
8.5 A Taxonomy of Cryptography
8.6 A Taxonomy of Cryptanalysis
Summary
References
Recommended Reading
Self Assessment

List of Figures

Fig. 1.1 Data and information .......... 2
Fig. 1.2 Audio signal to sound conversion .......... 3
Fig. 1.3 Information system process .......... 3
Fig. 1.4 Various steps in data processing .......... 5
Fig. 2.1 Personal computer .......... 14
Fig. 2.2 Block diagram of a computer .......... 15
Fig. 2.3 Abacus computer .......... 17
Fig. 2.4 Napier’s bones .......... 18
Fig. 2.5 Slide rule .......... 18
Fig. 2.6 Leibniz’s multiplication and dividing machine .......... 19
Fig. 2.7 Babbage’s analytical engine .......... 19
Fig. 2.8 Vacuum tube, transistor, chips .......... 20
Fig. 3.1 Basic computer organisation .......... 29
Fig. 3.2 Computer architecture .......... 31
Fig. 3.3 Magnetic Tape .......... 34
Fig. 3.4 Disk sectors .......... 35
Fig. 3.5 Track sectors .......... 35
Fig. 3.6 Floppy disk .......... 36
Fig. 3.7 Keyboard .......... 39
Fig. 3.8 Pointing devices .......... 39
Fig. 3.9 Types of printers .......... 41
Fig. 3.10 Plotters .......... 42
Fig. 4.1 Binary clock using LED to express binary values .......... 49
Fig. 4.2 Stack processing .......... 57
Fig. 4.3 Addition using stack arithmetic .......... 57
Fig. 4.4 Instruction format .......... 58
Fig. 4.5 Illustrating the translation process of an assembler .......... 60
Fig. 4.6 Illustrating the translation process of a compiler .......... 63
Fig. 4.7 Illustrating the role of an interpreter .......... 65
Fig. 4.8 A diagram of the fetch execute cycle .......... 70
Fig. 4.9 Program flow of control without and with interrupts .......... 71
Fig. 5.1 Analog and digital signal .......... 78
Fig. 5.2 Causes of impairment .......... 78
Fig. 5.3 Attenuation .......... 79
Fig. 5.4 Distortion .......... 79
Fig. 5.5 Noise .......... 79
Fig. 5.6 Two cases of SNR: a high SNR and a low SNR .......... 81
Fig. 5.7 Modulation .......... 87
Fig. 6.1 Star topology .......... 95
Fig. 6.2 Bus topology .......... 96
Fig. 6.3 Ring topology .......... 96
Fig. 6.4 OSI model .......... 97
Fig. 6.5 Internet .......... 101
Fig. 6.6 Client-server computing environment .......... 102
Fig. 6.7 Ports .......... 103
Fig. 6.8 Basic hypertext enhanced by searches .......... 106
Fig. 8.1 Encryption and decryption .......... 131
Fig. 8.2 Crypto as a black box .......... 131
Fig. 8.3 English letter frequency counts .......... 134
Fig. 8.4 Ciphertext frequency counts .......... 135
Fig. 8.5 The Zimmermann telegram .......... 139

List of Tables

Table 2.1 Functions of a computer .......... 15
Table 2.2 Examples of second generation computers .......... 21
Table 2.3 Computer evolution .......... 22
Table 4.1 Various numerals or numeration systems .......... 47
Table 4.2 Binary numeric value of 667 .......... 48
Table 4.3 Representation of decimal and binary counting .......... 49
Table 4.4 Addition in binary .......... 50
Table 4.5 Subtraction in binary .......... 51
Table 4.6 Conversion of binary to decimal .......... 53
Table 4.7 Decimal, binary, octal and hexadecimal numbers .......... 54
Table 4.8 Pseudo-instructions in the system .......... 60
Table 4.9 A subset of the set of instructions supported by a computer .......... 61
Table 4.10 A sample assembly language program for adding two numbers and storing the result .......... 61
Table 4.11 Mapping table set up by the assembler for the data items of the assembly language of table 4.10 .......... 62
Table 4.12 The equivalent machine language program for the assembly language program given in table 4.10 .......... 62
Table 5.1 Bit and data bit in error .......... 82
Table 7.1 High-level information security and privacy requirements .......... 119
Table 7.2 High-level components of a typical information security and privacy policy .......... 120
Table 7.3 Minimum set of controls .......... 123
Table 8.1 Abbreviated alphabet .......... 136
Table 8.2 VENONA Decrypt of message of September 21, 1944 .......... 138
Table 8.3 Excerpt from a German codebook .......... 138

Abbreviations

A/D - Analog to Digital
ALU - Arithmetic Logic Unit
ARPA - Advanced Research Projects Agency
BASIC - Beginner's All-purpose Symbolic Instruction Code
BIOS - Basic Input/Output System
bps - Bits per Second
CAN - Controller Area Network
CD - Compact Disc
C-DAC - Centre for Development of Advanced Computing
CP - Central Processor
CPU - Central Processing Unit
CRT - Cathode Ray Tube Display
CU - Control Unit
dB - Decibel
DNS - Domain Name System
DRAM - Dynamic RAM
DTP - Desktop Publishing
DVD - Digital Versatile Disc
EDO RAM - Extended Data Out RAM
EDSAC - Electronic Delay Storage Automatic Calculator
EDVAC - Electronic Discrete Variable Automatic Computer
EEPROM - Electrically Erasable PROM
EHF - Extremely High Frequency
E-mail - Electronic mail
ENIAC - Electronic Numerical Integrator and Computer
EPROM - Erasable Programmable Read Only Memory
FDM - Frequency Division Multiplexing
FDX - Fetch-Decode-Execute
Gbps - Gigabits Per Second
GUI - Graphical User Interface
HTML - Hypertext Markup Language
HTTP - Hypertext Transfer Protocol
Hz - Hertz
I/O - Input/Output
IAS - Immediate Access Storage
ICC - Integrated Circuit Card
ICs - Integrated Circuits
IDE - Integrated Development Environment
IP - Internet Protocol
IQ - Intelligence Quotient
IS - Information System
ISO - International Organization for Standardization
IT - Information Technology
KB - Kilobyte
LAN - Local Area Networks
LCD - Liquid Crystal Display
LED - Light Emitting Diode
LLC - Logical Link Control
LSB - Least Significant Bit
LSIC - Large Scale Integrated Circuits
MAC - Media Access Control

MAN - Metropolitan Area Network
MAR - Memory Address Register
MB - Megabyte
MDR - Memory Data Register
MICR - Magnetic Ink Character Recognition
MSB - Most Significant Bit
MSRL - Multimedia Systems Research Laboratory
OCR - Optical Character Reader
OMR - Optical Mark Reader
OS - Operating System
OSI - Open Systems Interconnection
PAN - Personal Area Network
PC - Personal Computer
PDA - Personal Digital Assistant
PIM - Personal Information Managers
PROM - Programmable Read Only Memory
RAM - Random Access Memory
RDRAM - Rambus Dynamic Random Access Memory
RF signal - Radio-frequency Signal
RMS - Root Mean Square
ROM - Read Only Memory
RSS - Really Simple Syndication
SAN - Storage Area Network
SDRAM - Synchronous DRAM
SGML - Standard Generalised Markup Language
SHF - Super High Frequency
SNR, S/N - Signal-to-Noise Ratio
SRAM - Static RAM
STP - Shielded Twisted Pair
TCP/IP - Transmission Control Protocol/Internet Protocol
URI - Uniform Resource Identifier
URL - Uniform Resource Locator
USB - Universal Serial Bus
UTP - Unshielded Twisted Pair
VDT - Visual Display Terminal
VDU - Visual Display Unit
VGA - Video Graphics Array
VLF - Very Low Frequency
VLSIC - Very Large Scale Integrated Circuits
WAN - Wide Area Networks
WORM - Write Once Read Many
WWW - World Wide Web

Chapter I

Introduction to Information Theory

Aim

The aim of this chapter is to:

• explain information technology
• elucidate the concepts of data and information
• introduce software concepts and programming language

Objectives

The objectives of this chapter are to:

• explain the difference between data and information
• explicate information system and information theory
• list the characteristics of useful information

Learning outcome

At the end of this chapter, you will be able to:

• understand the importance of information theory
• identify the types of data
• define the importance of software application

1.1 Introduction

The term “information technology” came into existence in the 1970s. However, its basic idea can be traced back even further. Throughout the 20th century, an alliance between the military and various industries has existed in the development of electronics, computers, and information theory. The military has encouraged such research by providing motivation and funding for innovation in the field of mechanisation and computing.

The field of engineering involves computer-based hardware and software systems and communication systems, to allow the acquisition, representation, storage, transmission and use of information. It also depends on the ability to successfully convert information into knowledge. “The acquisition, processing, storage and dissemination of vocal, pictorial, textual and numerical information by a microelectronics-based combination of computing and telecommunications” is known as information technology (IT). In short, IT is the technology which is used to acquire, store, organise and process data into a form which can be used in specified applications, and to disseminate the processed data.

The knowledge and skills essential in information technology come from the applied engineering sciences, in particular information, computer, and systems engineering sciences, and from professional practice. The hardware and software of computing and communications form the basic tools for information technology. Thus, efforts in this area include not only interactivity in working with clients to satisfy present needs, but also awareness of future technological, organisational, and human concerns so as to support transition over time to new information technology-based services.

1.2 Data and Information

The word information is derived from the Latin verb ‘informare’, which means ‘to instruct’. It also means giving shape to an idea or fact. Data is the plural of the Latin word datum, which means ‘something given’.

Data is defined as the raw input which, when processed or arranged, makes meaningful output. It is the group or chunks which signify quantitative and qualitative traits pertaining to variables. Data is raw material for data processing. Data relates to facts, events and transactions. Information is defined as the processed outcome of data. It is derived from data. Information is a concept and can be used in many domains. Thus, information is the data that has been processed in such a way as to be meaningful to the person who receives it.

For example, researchers who conduct a market research survey might ask members of the public to complete questionnaires about a product or a service. These completed questionnaires are data; they are then processed and analysed in order to prepare a report on the survey. The resulting report is information.

[Figure: expenses (data) are written in a diary (stored data), the expenses are added each day (processing), and the total daily expense is compared to the budget (information).]

Fig. 1.1 Data and information
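The diary example in Fig. 1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the diary entries, the `DAILY_BUDGET` value and the function name `daily_report` are hypothetical and do not come from the text.

```python
# Hypothetical diary entries: (day, item, amount). These raw records are the "data".
diary = [
    ("Mon", "bus fare", 40),
    ("Mon", "lunch", 120),
    ("Tue", "lunch", 110),
    ("Tue", "notebook", 60),
]
DAILY_BUDGET = 150  # assumed budget, for illustration only


def daily_report(entries, budget):
    """Process raw entries into per-day totals and the difference from the budget."""
    totals = {}
    for day, _item, amount in entries:
        totals[day] = totals.get(day, 0) + amount
    return {day: (total, total - budget) for day, total in totals.items()}


# The processed result is the "information" a reader can act on.
report = daily_report(diary, DAILY_BUDGET)
for day, (total, over) in report.items():
    status = "over" if over > 0 else "within"
    print(f"{day}: spent {total}, {status} budget")
```

The individual entries say little on their own; the per-day totals compared with the budget are what the person keeping the diary actually uses.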

Types of data

The versatility of IT comes from the ability to process a variety of data types. The types of data are:
• numeric data
• text data
• image data
• audio data
• video data

The varieties of data types, besides numbers, are:
• Text: For example, a paragraph in this book is textual data.
• Picture or image: For example, your photograph (both black and white and colour). Other types of pictures are a map of India, a fingerprint, a line drawing, an image transmitted by a satellite and an X-ray of your chest. Their main characteristics are that they are two dimensional and static.
• Audio or sound: For example, speeches, songs, telephone conversations, street noise, etc. Their main property is that they are continuous (i.e., vary with time) and cause pressure waves in the air which enter our ears, and we hear the sound. This is shown in the figure below.

[Figure: an audio signal, shown as a voltage varying with time, drives a loudspeaker, which produces sound waves that reach the ear.]

Fig. 1.2 Audio signal to sound conversion

• Video or moving pictures: When a number of images (each one slightly different from the other) are shown one after another at a rate of about 30–60 pictures per second, persistence of vision gives us an illusion of movement. Examples are silent movies and animations used in computer games. Video is usually combined with audio to give a better effect.

1.3 Information System

The systems concept and systems thinking can be used for understanding information systems (IS) better and for designing and developing effective and efficient IS to suit the requirements of different organisations. A system is defined as a collection of interrelated parts forming a synergistic whole that jointly serves the desired purpose. The parts which form the whole system, also called components or elements of the system, can be things, people or both.

The system as a whole receives inputs from sources outside the system and processes these inputs within the system. The results of these processes are then given out of the system as its output. A part of the output of a system may be fed back to it as input. This is called feedback. The purpose of feedback is to determine how a system is performing and to guide action on improving system performance. These actions intended to improve system performance are called control actions.

[Figure: block diagram of INPUT, PROCESS and OUTPUT, with feedback paths through a CONTROL block.]

Fig. 1.3 Information system process
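The input, process, output and feedback cycle described above can be sketched as a small loop. This is a minimal illustration only: the `process` function, the `gain` value and the target are hypothetical choices made for the example, not part of the text.

```python
def process(setting):
    """Hypothetical process: the output is simply twice the control setting."""
    return 2.0 * setting


def run_with_feedback(target, setting=0.0, gain=0.25, steps=20):
    """Repeatedly measure the output and adjust the input setting."""
    history = []
    for _ in range(steps):
        output = process(setting)   # input -> process -> output
        error = target - output     # feedback: compare the output with the goal
        setting += gain * error     # control action: correct the input
        history.append(output)
    return history


outputs = run_with_feedback(target=10.0)
print(outputs[0], outputs[-1])  # starts at 0.0 and ends very close to 10
```

Each pass through the loop feeds part of the output back in, and the control action moves the system towards the desired performance.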

1.3.1 Characteristics of Useful Information

Various characteristics that determine the quality and suitability of information for the proposed use are:
• Relevant information is important for taking a decision. Information should be relevant to the purpose under consideration. Having too much information is also undesirable, since it leads to wasted effort in sorting relevant information from irrelevant.
• It often happens that information is collected but not used. As a result, effort is wasted and the user is distracted from the information they actually need.
• It is always recommended to exclude data for which a clear use and user cannot be identified.
• Inaccurate information can cause more harm than good. However, too much insistence on accuracy can cause excessive delays and increase the cost of information.
• An optimum balance between accuracy, cost and time must be maintained for precise and timely delivery of information.
• Information has to be complete to be accurate. Incomplete information can be called correct but not accurate from the user’s point of view.
• Reliability of information depends on the reliability of data collection and on the source of the information. Information must be gathered from reliable sources and rechecked with related sources.
• Information must be concise: it should cover only the facts required by the user.
• Too much information causes information overload, where a decision maker has too much information and is unable to determine what is really important.
• Information becomes more meaningful to the user when it is appropriately analysed.
• When people use information, they have to process it mentally to make it a part of their internal knowledge. At times they may also process it externally to have a better understanding of the issues involved.
• The analysis and presentation should also be decided on the basis of the user characteristics. The user must understand the information.
• Information must be made available when required. It is advantageous to have information available as early as possible.
• The value of information declines with time.
• Another significant consideration in deciding the nature of information is the degree to which information is secured from unauthorised access and modification.

1.3.2 Information System Process

Any information system has three basic components: input, process and output. Both the input and the output are types of information, though the input is called data in relation to the system under consideration, to indicate that it has not yet been processed.

Input to the system can also be the knowledge or understanding of individuals, which gets transformed into information only when coded or represented in a form suitable for information processing. Various types of processes may take place within the system: it accepts data, converts it into information, and provides that information to users.

There are five main categories of information system processes:
• data capture
• data storage and retrieval
• analysis
• information presentation
• transmission

Data capture involves coding the input data in a form suitable for processing by the system. The most common form of manual data capture is a person observing something and describing the observations orally or in writing. The captured data may need to be stored in the system for later analysis and presentation. The most common method of storage is the paper record, stored as books and files.

Analysis is the core of information processing, which transforms raw data into information. Presentation of information in a more easily understandable form is also one of the analytical tasks. Thus analysis also covers tasks like preparing graphs and reports. Presentation involves converting the information to a form understood by the user. Giving an oral report and giving a written report are two of the ways of presenting information in manual systems.

Transmission transfers data in frames and aims to ensure error-free delivery. It also controls the timing of the information transmission and adds frame type, address, and error control information to each frame.

[Figure: input data (numbers, text, images, audio, video) pass through data acquisition units (ACQUIRE), data storage units (STORE), a data processor, usually called a CPU (PROCESS), and a result display unit (OUTPUT), which disseminates the results (numbers, text, images, audio, video).]

Fig. 1.4 Various steps in data processing

1.3.3 Computer Based Information Systems

Not all information systems use computers, though most business organisations today use computer based information systems for their management. Computer based information systems are composed of hardware, software, telecommunications, people and procedures that are configured to collect, manipulate, store, and process data into information.

Hardware is the equipment used to perform input, processing, and output activities. Software is the set of predefined instructions to the computer that determines the sequence of operations performed by the computer. Telecommunications allow organisations to link computer systems into effective networks. Networks can connect computers and computer equipment in a building, around the country, or across the world. Computer based information systems have enabled managers to automate some of the decisions that earlier required the expert judgment of experienced managers.

1.4 Information Theory

Information theory is a branch of applied mathematics and electrical engineering involving the quantification of information. Information theory was developed by Claude E. Shannon to find fundamental limits on signal processing operations such as compressing data and on reliably storing and communicating data. Since its inception it has broadened to find applications in many other areas, including statistical inference, natural language processing, cryptography, networks other than communication networks (as in neurobiology), the evolution and function of molecular codes, model selection in ecology, thermal physics, quantum computing, plagiarism detection and other forms of data analysis.

A key measure of information in the theory is known as entropy, which is usually expressed by the average number of bits needed for storage or communication. Intuitively, entropy quantifies the uncertainty involved when encountering a random variable. For example, a fair coin flip (2 equally likely outcomes) will have less entropy than a roll of a die (6 equally likely outcomes).
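The coin and die comparison can be checked numerically. The sketch below uses the standard Shannon entropy formula in bits; the function name `entropy_bits` is an illustrative choice, not something defined in the text.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy in bits: -sum(p * log2(p)), skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


coin = [1 / 2] * 2  # fair coin: 2 equally likely outcomes
die = [1 / 6] * 6   # fair die: 6 equally likely outcomes

print(entropy_bits(coin))  # 1 bit
print(entropy_bits(die))   # about 2.585 bits, i.e. log2(6)
```

The die has more equally likely outcomes, so there is more uncertainty to resolve and its entropy is higher.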

Although information is sometimes measured in characters, as when describing the length of an email message, or in digits (as in the length of a phone number), the convention in information theory is to measure information in bits. A “bit” (the term is a contraction of binary digit) is either a zero or a one. Because there are 8 possible configurations of three bits (000, 001, 010, 011, 100, 101, 110, and 111), we can use three bits to encode any integer from 1 to 8. So when we refer to a “3-bit number”, what we mean is an integer in the range 1 through 8. All logarithms used in this chapter will be to the base two, so log 8 is 3. Similarly, log 1000 is slightly less than 10, and log 1,000,000 is slightly less than 20.
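These counts are easy to verify. The short sketch below lists the eight 3-bit configurations, assigns one to each integer from 1 to 8, and evaluates the base-two logarithms mentioned above. The mapping used (the binary form of n - 1) is just one possible assignment, chosen for illustration.

```python
import math
from itertools import product

# All configurations of three bits: 2**3 = 8 of them.
configs = ["".join(bits) for bits in product("01", repeat=3)]
print(configs)

# Give each integer 1..8 its own 3-bit pattern (n - 1 written in binary).
codes = {n: format(n - 1, "03b") for n in range(1, 9)}
print(codes[1], codes[8])

print(math.log2(8))          # exactly 3
print(math.log2(1000))       # slightly less than 10
print(math.log2(1_000_000))  # slightly less than 20
```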

1.4.1 Efficient Encodings

Suppose you flip a coin one million times and write down the sequence of results. If you want to communicate this sequence to another person, how many bits will it take? If it’s a fair coin, the two possible outcomes, heads and tails, occur with equal probability. Therefore, each flip requires 1 bit of information to transmit. To send the entire sequence will require one million bits.

But suppose the coin is biased so that heads occur only 1/4 of the time, and tails occur 3/4 of the time. Then the entire sequence can be sent in about 811,300 bits, on average. (The formula for computing this will be explained below.) This would seem to imply that each flip of the coin requires just 0.8113 bits to transmit. How can you transmit a coin flip in less than one bit, when the only language available is that of zeros and ones? Obviously, you can’t. However, if the goal is to transmit an entire sequence of flips and the distribution is biased in some way, then you can use your knowledge of the distribution to select a more efficient code. Another way to look at it is that a sequence of biased coin flips contains less “information” than a sequence of unbiased flips, so it should take fewer bits to transmit.
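The figures for the biased coin can be reproduced with the standard entropy formula, and a small code over pairs of flips shows how the average can drop below one bit per flip. The pair code below is an illustrative assumption, not a code given in the text, and it is not claimed to be optimal.

```python
import math

p_heads, p_tails = 0.25, 0.75

# Entropy per flip in bits.
h = -(p_heads * math.log2(p_heads) + p_tails * math.log2(p_tails))
print(round(h, 4))           # 0.8113
print(round(h * 1_000_000))  # 811278, which the text rounds to 811,300

# Illustrative prefix code over PAIRS of flips (an assumption for this sketch):
# the most likely pair gets the shortest codeword.
pair_code = {"TT": "0", "TH": "10", "HT": "110", "HH": "111"}
prob = {"H": p_heads, "T": p_tails}

bits_per_pair = sum(prob[a] * prob[b] * len(word) for (a, b), word in pair_code.items())
print(bits_per_pair / 2)     # 0.84375 bits per flip
```

No single flip is sent in a fraction of a bit; the saving comes from encoding several flips together. Coding longer blocks can bring the average closer to the entropy, but not below it.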

What if we invent an even cleverer encoding? What is the limit on how efficient any encoding can be? The information content of a sequence is defined as the number of bits required to transmit that sequence using an optimal encoding. We are always free to use a less efficient coding, which will require more bits, but that does not increase the amount of information transmitted.

Since the flipping of coins is repeated many times, it makes sense to say that there’s a probability $p_i$ of getting outcome number $i$ on any given trial, i.e. $\mathrm{Prob}(S = i) = p_i$. With $N$ possible outcomes, the number of yes/no (head-tail) questions needed to identify the outcome is, at most, $\log N$, taking the logarithm to base two. (If you were allowed to ask questions with three possible answers, it’d be log to the base three.) But one can do better than that: if outcome $i$ is more frequent than outcome $j$ (if $p_i > p_j$), it makes sense to ask whether it is $i$ before considering the possibility that it’s $j$; you’ll save time. One can in fact show, with a bit of algebra, that the smallest average number of yes/no questions is $-\sum_i p_i \log p_i$. This gives us $\log N$ when all the $p_i$ are equal, which makes sense, since then there is no bias. The sum is called, variously, the information, the information content, the self-information, the entropy or the Shannon entropy of the message, conventionally written $H[S]$.
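As a check on the equal-probability case, the standard Shannon entropy formula can be worked through by hand. The symbols $N$, $p_i$ and $H[S]$ follow the paragraph above, and base-two logarithms are assumed throughout.

```latex
\begin{align*}
H[S] &= -\sum_{i=1}^{N} p_i \log_2 p_i \\
     &= -\sum_{i=1}^{N} \frac{1}{N} \log_2 \frac{1}{N}
        \qquad \text{(all } p_i = 1/N \text{)} \\
     &= -N \cdot \frac{1}{N} \cdot \left(-\log_2 N\right) \\
     &= \log_2 N .
\end{align*}
```

For a fair coin, $N = 2$ and $H[S] = 1$ bit; for eight equally likely outcomes, $H[S] = 3$ bits.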

Claude Shannon and Norbert Wiener worked on very serious and practical problems of coding, code-breaking, communication and automatic control. The real justification for regarding the entropy as the amount of information is that, though it abstracts away all the content of the message and almost all of the context (except for the distribution over messages), it works. You can try to design a communication channel which doesn’t respect the theorems of information theory, but it will fail.

The Kolmogorov complexity of a sequence of symbols is the length of the shortest computer program which will generate that sequence as its output. For certain classes of random processes, the Kolmogorov complexity per symbol converges, on average, to the entropy per symbol, which in that case is the entropy rate, i.e. the entropy of the latest symbol conditioned on all the previous ones. This gives us a pretty profound result: random sequences are incompressible and, conversely, an incompressible sequence looks random. In fact, it turns out that one can write down formal analogues of almost all the usual theorems about information which talk not about the entropy but about the length of the Kolmogorov program. For this reason it is also called the algorithmic information.
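The incompressibility idea can be illustrated roughly in Python. This is only a sketch under stated assumptions: Kolmogorov complexity itself cannot be computed, so the length of zlib-compressed output is used here as a crude upper-bound stand-in, and the helper name compressed_size and the sample size are illustrative choices, not part of the text.

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed form of data (a crude upper bound)."""
    return len(zlib.compress(data, 9))

n = 10_000
rng = random.Random(0)  # fixed seed so the run is repeatable
# Bytes that look random to the compressor.
random_bytes = bytes(rng.getrandbits(8) for _ in range(n))
# A highly regular sequence: "HT" repeated.
patterned_bytes = b"HT" * (n // 2)

random_size = compressed_size(random_bytes)
patterned_size = compressed_size(patterned_bytes)
print("random-looking:", random_size, "bytes; patterned:", patterned_size, "bytes")
```

The patterned sequence shrinks to a small fraction of its length, while the random-looking one does not shrink at all. Note that the pseudo-random bytes are in fact produced by a short program, so their true Kolmogorov complexity is small; a general-purpose compressor simply cannot discover that.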

Norbert Wiener worked out the continuous case of the standard entropy/coding/communication channel part of information theory at the same time as Shannon was doing the discrete version. In addition to its use in communications and technology, information theory is also of use in statistical physics, in dynamics, and in probability and statistics generally.

1.4.2 Measuring Information Content
For example, suppose we use a die with eight faces. Since eight is a power of two, the length of the optimal code for a uniform probability distribution is easy to calculate: log 8 = 3 bits. Information theory provides us with a formula for determining the number of bits required in an optimal code even when we don’t know the code.

Let’s first consider uniform probability distributions where the number of possible outcomes is not a power of two. Suppose we had a conventional die with six faces. The number of bits required to transmit one throw of a fair six-sided die is: log 6 = 2.58. Once again, we can’t really transmit a single throw in less than 3 bits, but a sequence of such throws can be transmitted using 2.58 bits on average. The optimal code in this case is complicated, but here’s an approach that’s fairly simple and yet does better than 3 bits/throw. Instead of treating throws individually, consider them three at a time. The number of possible three-throw sequences is 6^3 = 216. Using 8 bits we can encode a number between 0 and 255, so a three-throw sequence can be encoded in 8 bits with a little to spare. This is better than the 9 bits we’d need if we encoded each of the three throws separately.
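The three-throws-in-8-bits idea can be sketched in Python. This is a minimal illustration, assuming each group of three throws is treated as a three-digit base-6 number; the function names pack_throws and unpack_throws are hypothetical and not taken from the text.

```python
import math

def pack_throws(throws):
    """Pack three die throws (each 1..6) into one integer 0..215 using base 6."""
    a, b, c = (t - 1 for t in throws)
    return a * 36 + b * 6 + c

def unpack_throws(code):
    """Invert pack_throws: recover the three throws from the integer code."""
    a, rest = divmod(code, 36)
    b, c = divmod(rest, 6)
    return (a + 1, b + 1, c + 1)

largest_code = pack_throws((6, 6, 6))   # the biggest value any group can produce
bits_per_throw = 8 / 3                  # 8 bits shared by three throws
entropy_per_throw = math.log2(6)        # theoretical minimum per throw

print("largest code:", largest_code)
print("bits per throw with grouping:", round(bits_per_throw, 3))
print("theoretical minimum:", round(entropy_per_throw, 3))
```

Every group fits in one byte because the largest code is below 256. Grouping gives about 2.67 bits per throw, which is better than 3 but still above the 2.58-bit limit; larger groups would get closer to the limit.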

In probability terms, each possible value of the six-sided die occurs with equal probability P = 1/6. Information theory tells us that the minimum number of bits required to encode a throw is -log P = 2.58. If you look back at the eight-sided die example, you’ll see that in the optimal code that was described, every message had a length exactly equal to -log P bits. Now let’s look at how to apply the formula to biased (non-uniform) probability distributions. Let the variable x range over the values to be encoded, and let P(x) denote the probability of that value occurring. The expected number of bits required to encode one value is the weighted average of the number of bits required to encode each possible value, where the weight is the probability of that value:

\[ H = \sum_{x} P(x) \times \bigl(-\log_2 P(x)\bigr) \]

Now we can revisit the case of the biased coin. Here the variable ranges over two outcomes namely heads and tails. If heads occur only 1/4 of the time and tails 3/4 of the time, then the number of bits required to transmit the outcome of one coin toss is:

1/4 × -log (1/4) + 3/4 × -log (3/4) = 0.8113 bits

A fair coin is said to produce more “information” because it takes an entire bit to transmit the result of the toss:

1/2 × -log (1/2) + 1/2 × -log (1/2) = 1 bit
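The weighted-average calculation for the biased and fair coins can be checked with a few lines of Python. This is a small sketch, not an optimal encoder: it only evaluates the sum of P(x) × -log P(x) with logarithms to base two, and the function name entropy_bits is an illustrative choice.

```python
import math

def entropy_bits(probabilities):
    """Expected bits per outcome: the sum of P(x) * -log2 P(x)."""
    # Outcomes with probability 0 never occur and contribute nothing.
    return sum(p * -math.log2(p) for p in probabilities if p > 0)

biased_coin = entropy_bits([1/4, 3/4])
fair_coin = entropy_bits([1/2, 1/2])
eight_sided_die = entropy_bits([1/8] * 8)

print("biased coin:", round(biased_coin, 4), "bits")
print("fair coin:", fair_coin, "bits")
print("eight-sided die:", eight_sided_die, "bits")
```

For a uniform distribution over n outcomes the result reduces to log n, which is why the eight-sided die comes out at 3 bits.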

1.4.3 The Intuition Behind the -P log P Formula
The key to gaining an intuitive understanding of the -P log P formula for calculating information content is to see the duality between the number of messages to be encoded and their probabilities. If we want to encode any of eight possible messages, we need 3 bits, because log 8 = 3. We are implicitly assuming that the messages are drawn from a uniform distribution.

The alternate way to express this is: the probability of a particular message occurring is 1/8, and -log (1/8) = 3, so we need 3 bits to transmit any of these messages. Algebraically, log n = -log (1/n), so the two approaches are equivalent when the probability distribution is uniform. The advantage of using the probability approach is that when the distribution is non-uniform, and we can’t simply count the number of messages, the information content can still be expressed in terms of probabilities.

Sometimes we write about rare events as carrying a high number of bits of information. For example, in the case where a coin comes up heads only once in every 1,000 tosses, the signal that a heads has occurred is said to carry about 10 bits of information. How is that possible, since the result of any particular coin toss takes 1 bit to describe? Transmitting when a rare event occurs, if it happens only about once in a thousand trials, will take about 10 bits. Using our message counting approach, if a value occurs only 1/1000 of the time in a uniform distribution, there will be 999 other possible values, all equally likely, so transmitting any one value would indeed take about 10 bits.

With a coin there are only two possible values. What information theory says we can do is consider each value separately. If a particular value occurs with probability P, we assume that it is drawn from a uniformly distributed set of values when calculating its information content. The size of this set would be 1/P elements. Thus, the number of bits required to encode one value from this hypothetical set is -log P. Since the actual distribution which we’re trying to encode is not uniform, we take the weighted average of the estimated information content of each value (heads or tails, in the case of a coin), weighted by the probability P of that value occurring. Information theory tells us that an optimal encoding can do no better than this. Thus, with the heavily biased coin we have the following:

P(heads) = 1/1000, so heads takes -log (1/1000) = 9.96578 bits to encode
P(tails) = 999/1000, so tails takes -log (999/1000) = 0.00144 bits to encode

Avg. bits required = P(heads) × -log P(heads) + P(tails) × -log P(tails)
= (1/1000) × 9.96578 + (999/1000) × 0.00144 = 0.01141 bits per coin toss
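The figures for the heavily biased coin can be reproduced as follows. This is a minimal sketch that treats each outcome separately, as the text describes; the name self_information is an illustrative label for -log P with the logarithm taken to base two.

```python
import math

def self_information(p):
    """Bits needed for one outcome of probability p, i.e. -log2 p."""
    return -math.log2(p)

p_heads = 1 / 1000
p_tails = 1 - p_heads

bits_heads = self_information(p_heads)   # rare outcome, many bits
bits_tails = self_information(p_tails)   # common outcome, almost no bits

# Weight each outcome's bit count by how often that outcome occurs.
average_bits = p_heads * bits_heads + p_tails * bits_tails

print(f"heads: {bits_heads:.5f} bits, tails: {bits_tails:.5f} bits")
print(f"average: {average_bits:.5f} bits per coin toss")
```

The rare outcome is expensive to signal but almost never happens, so the average cost per toss is dominated by the nearly free common outcome.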

1.4.4 Applications of Information Theory
Applications of fundamental topics of information theory include:

• lossless data compression (e.g. ZIP files)
• lossy data compression (e.g. MP3s)
• channel coding (e.g. for DSL lines)

The field is at the intersection of mathematics, statistics, computer science, physics, neurobiology, and electrical engineering. Its impact has been crucial to the success of the Voyager missions to deep space, the invention of the compact disc, the feasibility of mobile phones, the development of the Internet, the study of linguistics and of human perception, the understanding of black holes, and numerous other fields. Important sub-fields of information theory are source coding, channel coding, algorithmic complexity theory, algorithmic information theory, information-theoretic security, and measures of information.

1.5 Software Concepts
Software, or a program, enables a computer to perform specific tasks, as opposed to the physical components of the system, or the hardware. This includes application software such as a word processor, which enables a user to perform a task, and system software such as an operating system, which enables other software to run properly by interfacing with hardware and with other software or custom software made to user specifications. In computers, software is loaded into RAM and executed in the central processing unit. At the lowest level, software consists of a machine language specific to an individual processor.

In practical terms, a computer program might include anywhere from a dozen instructions to many millions of instructions for something like a word processor or a web browser. A typical modern computer can execute billions of instructions every second and almost never make a mistake over years of operation. Errors in computer programs are called bugs. Sometimes bugs are benign and do not affect the usefulness of the program; in other cases they might cause the program to crash completely.

1.5.1 Importance of Software Application
At the operational level of any organisation, there are thousands of transactions to be performed daily. The transactions carried out help to improve the routine business activity and affect the overall performance of the organisation. The transactions may include calculations, summarising or sorting of data. Most organisations have automated computer systems for handling their transactions. The use of computers drastically increases the speed at which the transactions occur and provides greater accuracy. The main advantage is that the computers can be programmed and changed from time to time as activities change.

The middle level management benefits the most from the use of computers and automated systems. The computer helps the manager to take crucial decisions and helps in solving problems. With computers the manager can take better decisions and can draw conclusions with the help of precise data in no time. Preparing daily reports in graphical format becomes easier for the manager. The rise and fall in an employee’s performance can be easily traced with the several automated systems.

1.5.2 Programming Language
From the moment you turn on your computer, it runs programs: carrying out instructions, testing your RAM, resetting all attached devices and loading the operating system from hard disk or CD-ROM. Each and every operation that your computer performs has instructions that someone had to write in a programming language. These had to be created, compiled and tested, which is a long and complex task. In other words, a program is written as a series of human-understandable computer instructions that can be read by a compiler and linker, and translated into machine code so that a computer can understand and run it.

1.5.3 Types of Software
Practical computer systems divide software into three major classes, namely systems software, programming software and application software, although the distinction is arbitrary and often blurred.

• Systems software is a generic term referring to any computer software which manages and controls the hardware, so that application software can perform a task. It is an essential part of the computer system. It includes operating systems, device drivers, diagnostic tools, servers, windowing systems, utilities and more. The purpose of systems software is to insulate the applications programmer as much as possible from the details of the particular computer complex being used, especially memory and other hardware features, and such accessory devices as communications, printers, readers, displays, keyboards, etc.
• Programming software usually provides tools to assist a programmer in writing computer programs and software using different programming languages in a more convenient way. The tools include text editors, compilers, interpreters, linkers, debuggers, and so on. An Integrated Development Environment (IDE) merges those tools into a software bundle, and a programmer may not need to type multiple commands for compiling, interpreting, debugging, tracing, etc., because the IDE usually has an advanced graphical user interface, or GUI.
• Application software allows end users to accomplish one or more specific (non-computer related) tasks. Application software or applications are what most people think of when they think of software. Typical applications include industrial automation, business software, educational software, medical software, databases, and computer games. Businesses are probably the biggest users of application software, but almost every field of human activity now uses some form of application software. It is used to automate all sorts of functions.

Summary
• The technology which is used to acquire, store, organise and process data into a form which can be used in specified applications and disseminate the processed data is called information technology.
• Data is defined as the raw input which, when processed or arranged, makes meaningful output. It is the group or a chunk which signifies quantitative and qualitative traits pertaining to variables. The types of data are numeric data, text data, image data, audio data and video data.
• Information is defined as the processed outcome of data. It is derived from data. Information is a concept and can be used in many domains. Thus, information is the data that has been processed in such a way as to be meaningful to the person who receives it.
• A system is defined as a collection of interrelated parts forming a synergistic whole that jointly serves the desired purpose. The parts which form the whole system, also called components or elements of the system, can be things, people or both.
• Information must be concise, accurate, reliable and meaningful, and must be made available when required. It is advantageous to have information available as early as possible. The value of information declines with time.
• There are five main categories of information system processes, which are data capture, data storage and retrieval, analysis, information presentation and transmission.
• Information theory was developed by Claude E. Shannon to find fundamental limits on signal processing operations such as compressing data and on reliably storing and communicating data.
• A key measure of information in the theory is known as entropy, which is usually expressed by the average number of bits needed for storage or communication.
• The transactions may include calculations, summarising or sorting of data. Most organisations have automated computer systems for handling their transactions.
• Each and every operation that your computer performs has instructions that someone had to write in a programming language. These had to be created, compiled and tested, a long and complex task.
• Practical computer systems divide software into three major classes: systems software, programming software and application software.


Recommended Reading
• Coope, R. and Mukai, K., 1993. Theory and Its Applications, Center for the Study of Language.
• Paul, C., 2004. Information Theory And Statistics, Now Publishers Inc.
• Bell, D. A., 1968. Information theory and its engineering applications, Pitman.

Self Assessment
1. “The acquisition, processing, storage and dissemination of vocal, pictorial, textual and numerical information by a microelectronics-based combination of computing and telecommunications” is known as _______.
a. Biotechnology
b. Nanotechnology
c. Information technology
d. Biometrics

2. Which of the following is not a type of data?
a. Numeric data
b. Book data
c. Image data
d. Audio data

3. Which of the following statements is true?
a. The system as a whole receives inputs from sources outside the system and processes these inputs within the system.
b. The system as a whole sends inputs from sources outside the system and processes these inputs within the system.
c. The system as a whole receives inputs from sources inside the system and processes these inputs within the system.
d. The system as a whole receives outputs from sources outside the system and processes these outputs within the system.

4. A part of the output of a system may be ________ to it as input.
a. review
b. report
c. data
d. feedback

5. An optimum balance between accuracy, cost and time must be maintained for precise in time delivery of _________.
a. information
b. data
c. boxes
d. systems

6. The knowledge or understanding of individuals which gets transformed into information only when coded or represented in a form suitable for information processing is called __________ to the system.
a. output
b. input
c. data
d. information

7. Match the following
1. Data capture        A. easily understandable form
2. Data storage        B. involves coding of the input data in a form suitable for system processing
3. Analysis            C. captured data kept for analysing and presentation later
4. Presentation        D. core of information processing
a. 1-A, 2-B, 3-C, 4-D
b. 1-C, 2-D, 3-A, 4-B
c. 1-B, 2-C, 3-D, 4-A
d. 1-D, 2-A, 3-B, 4-C

8. ___________ is a branch of applied mathematics and electrical engineering involving the quantification of information.
a. Information theory
b. Management theory
c. Software theory
d. Business theory

9. _________ allows end users to accomplish one or more specific tasks.
a. Function software
b. Purpose software
c. Database software
d. Application software

10. The Kolmogorov complexity of a ___________ is the shortest computer program which will generate that sequence as its output.
a. sequence of numbers
b. sequence of symbols
c. sequence of words
d. sequence of codes

Chapter II

Computer Fundamentals

Aim

The aim of this chapter is to:

• define computer
• explain the characteristic features of computers
• explicate the history and evolution of computers

Objectives

The objectives of this chapter are to:

• explain important stages of computer evolution
• elucidate computer classification and data representation
• explicate the significance of computers in daily life

Learning outcome

At the end of this chapter, you will be able to:

• understand computers and study 'data representation'
• identify the features of computer
• recognise important stages of computer evolution

2.1 Introduction
The word computer comes from the word “compute”, which means “to calculate”. We are all familiar with calculations in our daily routine. We apply mathematical operations like addition, subtraction, multiplication, etc. for calculations. Simpler calculations take less time. On the other hand, complex calculations take much longer. Another factor is accuracy in calculations. Therefore, man explored the idea of developing a machine which can perform this type of arithmetic calculation faster and with complete accuracy. This gave birth to a device or machine called the ‘computer’. The computer we see today is quite different from the one made in the beginning. The number of applications of a computer has increased. One must appreciate the impact of computers on our day to day life. Reservation of tickets in airlines and railways, payment of telephone and electricity bills, deposits and withdrawals of money from banks, business data processing, medical diagnosis, weather forecasting, etc. are some of the areas where the computer has become extremely useful.

However, there is one limitation of the computer. Human beings do calculations on their own by understanding the information and the final results required, but a computer is a machine and it has to be given proper instructions to carry out its calculation. Hence, one should know how a computer works.

2.2 Definition of Computer
A computer is defined as a programmable machine that receives input, stores and manipulates data, and provides output in a useful format. A computer is an electronic device which can perform arithmetic calculations faster. It can be compared to a magic box, which serves different purposes for different people. For a common man, a computer is simply a calculator which works automatically and quite fast. For a person who knows much about it, a computer is a machine capable of solving problems and manipulating data. It accepts data, processes the data by doing some mathematical and logical operations and gives us the desired output. Therefore, we may define a computer as a device that transforms data. Data can be anything, like marks obtained by an individual in various subjects. It can also be the name, age, sex, weight, height, etc. of all the students in a class or the income, savings, investments, etc. of a country.

A computer can also be defined in terms of its functions. It can:
• accept data
• store data
• process data as desired
• retrieve the stored data when required
• print the result in desired format

Fig. 2.1 Personal computer (labelled parts: monitor, screen, speakers, system unit, microphone, mouse, keyboard)

2.3 Essential Features of Computers
Following are the essential features of computers:

• With the advancement in technologies, there are computers which are smaller than a palm. There are also super computers which need large buildings to house them.
• A computer is a piece of equipment that produces information based on a set of data and processing instructions that are stored within the equipment.
• Once data and processing instructions are fed into the computer, it can be instructed to start the processing, and it will perform the operation on its own and determine the desired output.
• A computer system consists of two broad subsystems: hardware and software. Computer hardware is the physical part of a computer, including the digital circuitry, as distinguished from the computer software that executes within the hardware.
• The hardware of a computer is infrequently changed, in comparison with software and data, which are “soft” in the sense that they are readily created, modified or erased on the computer.
• The computer must be able to receive the data and instructions to be stored within it. The computer also needs the capability to give out the information resulting from the processing in a usable form.
• If data and information cannot be fed into the computer, it cannot do any processing; and if it cannot give out the results of processing to the user, there is no purpose served by processing.

Function      Description
Input         Accepts data
Processing    Processes data
Output        Produces output
Storage       Stores results

Table 2.1 Functions of a computer

• A basic computer has three essential features: an input facility to accept input data, a central processing unit to store and process data, and an output facility to supply the required data or information.
• Here we have not mentioned any separate facility to input or store the processing instructions, because the facility for these is common with that for data to be processed.
• The part that stores and processes the data is called the central processor (CP). This central processor also controls or manages operations of all other components of the computer.
• In addition to the memory within the CP, a computer today has a secondary data storage facility. The difference between the memory within the computer and secondary storage is that the contents of the memory are totally controlled within the CP as per the requirements of the processing being done.

Fig. 2.2 Block diagram of a computer, showing the input unit, storage, control unit, ALU and output unit across the input, processing and output stages (Source: http://www.careergears.com/block-diagram-of-computer-and-its-explanation/)

2.4 Characteristics of a Computer
The major characteristics of a computer are:

Accuracy
• If someone calculates faster but commits a lot of errors in computing, the result is useless. Suppose one wants to divide 15 by 7. One person may work it out to 2 decimal places and say the quotient is 2.14. Another may calculate up to 4 decimal places and say that the result is 2.1428. Hence, in addition to speed, the computer should have accuracy or correctness in computing.
• The degree of accuracy of a computer is very high and every calculation is performed with the same accuracy.
• The accuracy level is determined on the basis of the design of the computer. Computer errors are due to wrong inputs by humans and inaccurate data.
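The division example can be reproduced with Python's decimal module. This is a small sketch, assuming the digits are simply cut off (truncated) after the chosen number of decimal places, which is what the figures 2.14 and 2.1428 suggest; the variable names are illustrative.

```python
from decimal import Decimal, ROUND_DOWN

quotient = Decimal(15) / Decimal(7)   # 2.142857... to the default precision

# Keep 2 and 4 decimal places, dropping the remaining digits.
two_places = quotient.quantize(Decimal("0.01"), rounding=ROUND_DOWN)
four_places = quotient.quantize(Decimal("0.0001"), rounding=ROUND_DOWN)

print("2 decimal places:", two_places)
print("4 decimal places:", four_places)
```

Carrying more decimal places brings the stated result closer to the true quotient, which is the sense of accuracy used here.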

Diligence
• A computer is free from tiredness, lack of concentration, fatigue, etc. It can work for hours without creating any error.
• If millions of calculations are to be performed, a computer will perform every calculation with the same accuracy. Due to this capability it outperforms human beings in routine work.

Speed
• A computer can work very fast. It takes only a few seconds for calculations that we take hours to complete.
• For example, suppose a person is asked to calculate the average monthly income of one thousand individuals in his/her neighbourhood. For this, he/she has to add income from all sources for all individuals on a day to day basis and find out the average for each one of them. How long will it take for the person to do this? One day, two days or maybe one week, but a computer can finish this work in just a few seconds.
• The weather forecast that we see every day on television is the result of compilation and analysis of a huge amount of data on temperature, humidity, pressure, etc. of various places on computers. It takes a few minutes for the computer to process this huge amount of data and provide the result.
• One will be surprised to know that the computer can perform millions of instructions, and even more, per second. Therefore, we determine the speed of a computer in terms of microseconds (10^-6 of a second), nanoseconds (10^-9 of a second) or picoseconds (10^-12 of a second). Imagine how fast a computer performs work.

Memory
• A computer has the power of storing a large amount of information or data. Any information can be stored and recalled as long as one requires it.
• It depends entirely upon an individual how much data he/she wants to store in a computer and when to lose or retrieve this data.

Versatility
• It means the capacity to perform completely different types of work.
• One may use the computer to prepare payroll slips. The next moment it may be used for inventory management or to prepare electric bills.

Storage
• The computer has an in-built memory where it can store a large amount of data.
• Users can also store data in secondary storage devices such as floppies, which can be kept outside one’s computer and can be carried to other computers.

No intelligence quotient (IQ) and no feeling
• A computer is a machine and it cannot do any work without instruction from the user. It performs the instructions at tremendous speed and with accuracy.

• It is the user who decides what he/she wants to do and in what sequence. So a computer cannot take its own decisions as humans can.
• It does not have feelings or emotion, taste, knowledge and experience. Thus it does not get tired even after long hours of work. It does not distinguish between users.

2.5 History of Computer
The history of the computer can be traced back to the effort of man to count large numbers. This process of counting large numbers generated various systems of numeration like the Babylonian system of numeration, Greek system of numeration, Roman system of numeration and Indian system of numeration. Out of these the Indian system of numeration has been accepted universally. It is the basis of the modern decimal system of numeration (0, 1, 2, 3, 4, 5, 6, 7, 8 and 9). One might assume that the computer performs all calculations in the decimal system. But one would be surprised to know that the computer does not understand the decimal system and uses the binary system of numeration for processing. Some of the path-breaking inventions in the field of computing devices are mentioned below.

Calculating machines
It took generations for early man to build mechanical devices for counting large numbers. The first calculating device, called ABACUS, was developed by the Egyptian and Chinese people. The word ABACUS means calculating board. It consisted of sticks in horizontal positions on which were inserted sets of pebbles. A modern form of ABACUS is given in the figure below. It has a number of horizontal bars, each having ten beads. Horizontal bars represent units, tens, hundreds, etc.

Fig. 2.3 Abacus computer, shown set to the digits 6 3 0 2 7 1 5 4 0 8 (Source: http://www.master-your-computer.com/abacus-history-of-computer.php)

Napier’s bones
Scottish mathematician John Napier built a mechanical device for the purpose of multiplication in 1617 AD. The device was known as Napier’s bones.

X   1   2   3   4   5   6   7   8   9
1   01  02  03  04  05  06  07  08  09
2   02  04  06  08  10  12  14  16  18
3   03  06  09  12  15  18  21  24  27
4   04  08  12  16  20  24  28  32  36
5   05  10  15  20  25  30  35  40  45
6   06  12  18  24  30  36  42  48  54
7   07  14  21  28  35  42  49  56  63
8   08  16  24  32  40  48  56  64  72
9   09  18  27  36  45  54  63  72  81

Fig. 2.4 Napier’s bones (Source: http://ictlisfun.blogspot.in/2011/07/evolution-of-computer_2802.html)
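The way such a table of two-digit products supports multiplication can be sketched in Python. This is an illustration under assumptions rather than a description taken from the text: it assumes the usual way of reading the rods, where the units digit of each product is added to the tens digit of the product to its right, and the function names rod_row and multiply_with_rods are hypothetical.

```python
def rod_row(digit, multiplier):
    """The (tens, units) pair shown on the rod for `digit` in row `multiplier`."""
    return divmod(digit * multiplier, 10)

def multiply_with_rods(number, multiplier):
    """Multiply a whole number by a single digit 1..9 by adding along diagonals."""
    result_digits = []
    carry = 0
    previous_tens = 0
    for ch in reversed(str(number)):            # read the rods right to left
        tens, units = rod_row(int(ch), multiplier)
        total = units + previous_tens + carry   # one diagonal of the rods
        carry, digit = divmod(total, 10)
        result_digits.append(str(digit))
        previous_tens = tens
    leading = previous_tens + carry             # leftmost diagonal
    if leading:
        result_digits.append(str(leading))
    return int("".join(reversed(result_digits)))

print(multiply_with_rods(425, 6))
```

Only single-digit products and small additions are needed, which is what made the device practical for hand calculation.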

Slide rule
English mathematician Edmund Gunter devised the logarithmic scale on which the slide rule is based, and William Oughtred developed it into the slide rule. The device was used mainly for operations like multiplication and division. It was widely used in Europe from the 17th century.

Fig. 2.5 Slide rule (Source: http://leemartinauthor.com/blog/2012/12/slide-rules-and-typewriters-a-memoir-of-christmas-presents-past/sliderule/)

Pascal’s adding and subtracting machine
Blaise Pascal developed a machine at the age of 19 that could add and subtract. The machine consisted of wheels, gears and cylinders.

Fig. 2.6 Leibniz’s multiplication and dividing machine (Source: http://www2.lv.psu.edu/ojj/courses/ist-240/reports/spring2001/fa-cb-bc-kf/1200-1940.html)

The German philosopher and mathematician Gottfried Leibniz built a mechanical device that could both multiply and divide in the year 1673.

Fig. 2.7 Babbage’s analytical engine (Source: http://www.sciencemuseum.org.uk/objects/computing_and_data_processing/1878-3.aspx)

In 1823, the Englishman Charles Babbage began work on a mechanical machine to perform complex mathematical calculations. It was called the difference engine. Later he designed a general-purpose calculating machine called the analytical engine. Charles Babbage is called the father of the computer.

Mechanical and electrical calculator
In the beginning of the 19th century, the mechanical calculator was developed to perform all sorts of mathematical calculations. It was widely used up to the 1960s. Later, the rotating part of the mechanical calculator was replaced by an electric motor, so it was called the electrical calculator.

Modern electronic calculator
The electronic calculator used in the 1960s ran on electron tubes, which made it quite bulky. Later, the tubes were replaced with transistors and, as a result, calculators became very small. The modern electronic calculator can perform all kinds of mathematical computations and evaluate mathematical functions. It can also be used to store some data permanently. Some calculators have built-in programs to perform complicated calculations.


Fig. 2.8 Vacuum tube, transistor, chips (Source: http://encyclopedia2.thefreedictionary.com/chip)

2.6 Computer Generations
The evolution of the computer started in the 16th century and resulted in the form that is seen today. The present day computer, however, has also undergone rapid change during the last fifty years. This period, during which the evolution of the computer took place, can be divided into five distinct phases known as generations of computers. Each phase is distinguished from the others on the basis of the type of switching circuits used.

First generation computers
First generation computers used thermionic valves (vacuum tubes). These computers were large in size and writing programs on them was difficult. Some of the computers of this generation were:

• ENIAC: It was the first electronic computer, built in 1946 at the University of Pennsylvania, USA, by John Eckert and John Mauchly. It was named Electronic Numerical Integrator and Computer (ENIAC). The ENIAC was 30 to 50 feet long, weighed 30 tons, contained 18,000 vacuum tubes, 70,000 resistors and 10,000 capacitors, and required 150,000 watts of electricity. A present day computer is many times as powerful as ENIAC, yet very much smaller.
• EDVAC: It stands for Electronic Discrete Variable Automatic Computer and was developed in 1950. The concept of storing data and instructions inside the computer was introduced here. This allowed much faster operation, since the computer had rapid access to both data and instructions. Another advantage of storing instructions was that the computer could make logical decisions internally.
• EDSAC: It stands for Electronic Delay Storage Automatic Computer and was developed by M.V. Wilkes at Cambridge University in 1949.
• UNIVAC-1: Eckert and Mauchly produced the Universal Automatic Computer (UNIVAC-1) in 1951.

Limitations of first generation computers are:
• the operating speed was quite slow
• power consumption was very high
• it required large space for installation
• the programming capability was quite low


Second generation computers
In around 1955, a device called the transistor replaced the bulky vacuum tubes of the first generation computer. Transistors are smaller than vacuum tubes and have a higher operating speed. They have no filament and require no heating. Their manufacturing cost was also very low. Thus the size of the computer was reduced considerably.

It is in the second generation that the concepts of the Central Processing Unit (CPU), memory, programming languages and input and output units were developed. Programming languages such as COBOL and FORTRAN were developed during this period.

Computer   Description
IBM 1620   Smaller than first generation computers; mostly used for scientific purposes.
IBM 1401   Small to medium in size; used for business applications.
CDC 3600   Large in size; used for scientific purposes.

Table 2.2 Examples of second generation computers

Third generation computers
• The third generation computers were introduced in 1964. They used Integrated Circuits (ICs). These ICs are popularly known as chips. A single IC has many transistors, resistors and capacitors built on a single thin slice of silicon, so the size of the computer was reduced further.
• Some of the computers developed during this period were the IBM-360, ICL-1900, IBM-370 and VAX-750. Higher level languages such as BASIC (Beginner's All-purpose Symbolic Instruction Code) were developed during this period.
• Computers of this generation were small in size and low in cost, and had large memory and very high processing speed.

Fourth generation computers
• The present day computers are the fourth generation computers, which were introduced in approximately 1975. They use Large Scale Integrated Circuits (LSIC) built on a single silicon chip, called microprocessors.
• Due to the development of the microprocessor, it is possible to place a computer’s central processing unit (CPU) on a single chip. These computers are called microcomputers. Later, Very Large Scale Integrated Circuits (VLSIC) replaced LSICs.
• Thus, the computer which occupied a very large room in earlier days can now be placed on a table. The personal computer (PC) used in school is a fourth generation computer.

Fifth generation computers
• The computers of the 1990s are said to be fifth generation computers. The speed of a fifth generation computer is extremely high. Apart from this, it can perform parallel processing.
• The concept of artificial intelligence has been introduced to allow the computer to take its own decisions. It is still in a developmental stage.


Computer evolution as per the generations has been tabulated below:

Generation | Years | Switching device | Storage device | Software | Applications
First | 1949-55 | Vacuum tubes | Acoustic delay lines and later magnetic drum. 1 KB memory | Machine and assembly languages. Simple monitors | Mostly scientific, later simple business systems
Second | 1956-65 | Transistors | Magnetic core main memory, tapes and disk peripheral memory. 100 KB main memory | High level languages: Fortran, Cobol, Algol. Batch operating system | Extensive business applications, engineering design optimisation
Third | 1966-75 | Integrated circuits (IC) | High speed magnetic cores. Large disks (100 MB). 1 MB main memory | Fortran IV, Cobol 68, PL/1. Time shared operating system | Database management systems, online systems
Fourth (first decade) | 1975-84 | Large scale integrated circuits, microprocessors | Semiconductor memory. Winchester disk. 10 MB main memory, 1000 MB disks | Fortran 77, Pascal, Cobol 74 | Personal computers, integrated CAD/CAM. Real time control. Graphics oriented systems
Fourth (second decade) | 1985-91 | Very large scale IC. Over 3 million transistors per chip | Semiconductor memory. 1 GB main memory. 100 GB disk | C, C++, Java, Prolog | Simulation, visualisation, parallel computing, multimedia
Fifth | 1991-present | Parallel computing and superconductors | Attachable hard drives, USB drives used to add memory | Use of artificial intelligence | Voice recognition and response to natural language

Table 2.3 Computer evolution

2.7 Computer Classification
There are varieties of computers seen today. Although they belong to the fifth generation, they can be divided into different categories depending upon size, efficiency, memory and number of users. Broadly, they can be divided into the following categories:

• Microcomputer: It is at the lowest end of the computer range in terms of speed and storage capacity. Its CPU is a microprocessor. The first microcomputers were built with 8-bit microprocessor chips. The personal computer (PC), the most common kind of computer, falls in this category. The PC supports a number of input and output devices. The 8-bit chip was later improved upon by 16-bit and 32-bit chips. Examples of microcomputers are the IBM PC and PC-AT.
• Mini computer: This is designed to support more than one user at a time. It possesses large storage capacity and operates at a higher speed. The mini computer is used in multi-user systems in which various users can work at the same time. This type of computer is generally used for processing large volumes of data in an organisation. Mini computers are also used as servers in Local Area Networks (LAN).
• Mainframes: These types of computers generally use 32-bit processors. They operate at very high speed, have very large storage capacity and can handle the work load of many users. They are generally used for centralised databases. They are also used as controlling nodes in Wide Area Networks (WAN). Examples of mainframes are DEC, ICL and the IBM 3000 series.


• Supercomputer: These are the fastest and most expensive machines. They have a high processing speed compared to other computers. They also use multiprocessing techniques. One of the ways in which supercomputers are built is by interconnecting hundreds of microprocessors. Supercomputers are mainly used for weather forecasting, biomedical research, remote sensing, aircraft design and other areas of science and technology. Examples of supercomputers are CRAY YMP, CRAY2, CRAY XMP and PARAM from India.


Summary
• A computer is an electronic device that transforms data. It is also defined as a programmable machine that receives input, stores and manipulates data, and provides output in a useful format.
• A computer can also be defined in terms of its functions. It can accept data, store data, process data as desired, retrieve the stored data as and when required and print the result in the desired format.
• A computer system consists of two broad subsystems: hardware and software. Computer hardware is the physical part of a computer, including the digital circuitry, as distinguished from the computer software that executes within the hardware.
• The major characteristics of a computer are accuracy, diligence, speed, memory, versatility, storage, no IQ and no feeling.
• The history of the computer can be traced back to the effort of man to count large numbers. This process of counting large numbers generated various systems of numeration, like the Babylonian, Greek, Roman and Indian systems of numeration.
• The word ABACUS means calculating board. It consisted of sticks in horizontal positions on which sets of pebbles were inserted.
• Blaise Pascal developed a machine at the age of 19 that could add and subtract. The machine consisted of wheels, gears and cylinders.
• Charles Babbage built a mechanical machine to do complex mathematical calculations. It was called the difference engine. Later he developed a general-purpose calculating machine called the analytical engine. Charles Babbage is called the father of the computer.
• The electronic calculator used in the 1960s ran on electron tubes, which made it quite bulky. Later the tubes were replaced with transistors and, as a result, calculators became very small.
• The period during which the evolution of the computer took place can be divided into five distinct phases known as generations of computers.
• First generation computers used thermionic valves. These computers were large in size and writing programs on them was difficult.
• Transistors are smaller than vacuum tubes and have a higher operating speed. They have no filament and require no heating. It is in the second generation that the concept of the Central Processing Unit (CPU) was developed.
• The third generation computers were introduced in 1964. They used Integrated Circuits (ICs).
• The computers of the 1990s are said to be fifth generation computers. The speed of a fifth generation computer is extremely high. Apart from this, it can perform parallel processing.
• Computers are broadly classified as microcomputers, mini computers, mainframes and supercomputers.



Recommended Reading
• Attiya, H. and Welch, J., 2004. Distributed Computing: Fundamentals, Simulations, and Advanced Topics, John Wiley & Sons.
• Ram, B., 2000. Computer Fundamentals: Architecture and Organization, New Age International.
• Mano, M. M. and Kime, C. R., 2004. Logic and Computer Design Fundamentals, Volume 1, Pearson/Prentice Hall.


Self Assessment
1. Match the following

   1. Input         A. Development of data
   2. Processing    B. Accepts data
   3. Output        C. Stocks results
   4. Storage       D. Produces results

   a. 1-A, 2-B, 3-C, 4-D
   b. 1-B, 2-A, 3-D, 4-C
   c. 1-C, 2-D, 3-A, 4-B
   d. 1-D, 2-C, 3-B, 4-A

2. A programmable machine that receives input, stores and manipulates data, and provides output in a useful format is called a/an ______.
   a. information
   b. calculator
   c. transistor
   d. computer

3. The part that stores and processes the data is called the ______.
   a. CP
   b. VDU
   c. PC
   d. IT

4. The word ABACUS means ________.
   a. adding board
   b. dividing board
   c. calculating board
   d. multiplication board

5. Match the following

   1. First generation computers     A. Transistor
   2. Second generation computers    B. Integrated circuits
   3. Third generation computers     C. Microprocessors
   4. Fourth generation computers    D. Perform parallel processing
   5. Fifth generation computers     E. Thermionic valves

   a. 1-A, 2-B, 3-C, 4-D, 5-E
   b. 1-B, 2-C, 3-D, 4-A, 5-E
   c. 1-D, 2-E, 3-A, 4-B, 5-C
   d. 1-E, 2-A, 3-B, 4-C, 5-D

6. __________ are smaller than electric tubes and have higher operating speed.
   a. Chips
   b. Silicon
   c. Transistors
   d. Electrode


7. Fourth generation computers use Large Scale Integrated Circuits (LSIC) built on a single silicon chip called ___________.
   a. macroprocessor
   b. microprocessor
   c. nanoprocessor
   d. miniprocessor

8. The concept which allows the computer to take its own decision is known as ________.
   a. synthetic intelligence
   b. artificial intelligence
   c. imitation intelligence
   d. simulated intelligence

9. ___________ means the capacity to perform completely different types of work.
   a. Memory
   b. Storage
   c. Versatility
   d. Accuracy

10. ____________ is at the lowest possible end of the computer range in terms of speed and storage capacity.
   a. Microcomputer
   b. Minicomputer
   c. Super computer
   d. Mainframes


Chapter III

Computer Peripherals

Aim

The aim of this chapter is to:

• explain the basic organisation of a computer system
• define the meaning of arithmetic logical unit, control unit and central processing unit
• explicate input and output devices

Objectives

The objectives of this chapter are to:

• elucidate basic computer operations
• explain computer memory
• define various functional units of a computer

Learning outcome

At the end of this chapter, you will be able to:

• recognise the term computer organisation
• understand primary and secondary memory
• identify input devices and output devices


3.1 Introduction
A peripheral is a device attached to a host computer, but not part of it, and is more or less dependent on the host. It expands the host's capabilities, but does not form part of the core computer architecture. A personal computer or workstation processes information and, strictly speaking, that is all the computer does. Data (unprocessed information) must get into the computer, and the processed information must get out. Entering and displaying information is carried out on a wide variety of accessory devices called peripherals, also known as Input/Output (I/O) devices. Some peripherals, such as keyboards, are only input devices; other peripherals, such as printers, are only output devices; and some are both.

3.2 Basic Computer Components
A computer performs basically five major operations or functions, irrespective of its size and make (refer to the figure below). These are:

• it accepts data
• it stores data
• it can process data as required
• it gives results
• it controls all operations inside a computer

Fig. 3.1 Basic computer organisation (block labels: program and data, input unit, storage unit, control unit, arithmetic logic unit, central processing unit, output unit, results)

Input
• This is the process of entering data and programs into the computer system.
• A computer is an electronic machine which, like any other machine, takes raw data as input and performs some processing, giving out processed data.
• Therefore, the input unit takes data to the computer in an organised manner for processing.

Storage
• The process of saving data and instructions permanently is known as storage.
• Data has to be fed into the system before the actual processing starts. As the processing speed of the central processing unit (CPU) is very fast, the data has to be provided to the CPU at the same speed.
• Therefore, the data is first stored in the storage unit for faster access and processing.
• The storage unit performs the following major functions:
   - All data and instructions are stored here before and after processing.
   - Intermediate results of processing are also stored here.


Processing
• The task of performing operations such as arithmetic and logical operations is called processing.
• The CPU takes data and instructions from the storage unit and makes all types of calculations based on the instructions given and the type of data provided. The result is then sent back to the storage unit.

Output
• This is the process of producing results from the data for getting useful information.
• Similarly, the output produced by the computer after processing must also be kept somewhere inside the computer before being given in a readable form. The output is also stored inside the computer for further processing.

Control
• The manner in which instructions are executed and the above operations are performed is known as control.
• Controlling of all operations like input, processing and output is performed by the control unit. It takes care of the step by step processing of all operations inside the computer.

3.3 Functional Units
In order to carry out the operations mentioned in the previous section, the computer allocates the task between its various functional units. The computer system is divided into three separate units for its operation. They are:

• Arithmetic Logical Unit
• Control Unit
• Central Processing Unit

3.3.1 Arithmetic Logical Unit (ALU)
After data is entered through the input device, it is stored in the primary storage unit. The actual processing of the data and instructions is performed by the arithmetic logical unit. The major operations performed by the ALU are addition, subtraction, multiplication, division, logic and comparison.

Data is transferred to the ALU from the storage unit when required. After processing, the output is returned to the storage unit for further processing or storage.

The ALU consists of:

• Accumulator, the main data register where all the intermediate results of a calculation are kept (accumulated) until the final result is determined (which is then stored in memory).
• Data registers, which are supplemental storage registers that support the operations of the accumulator.
• Computational circuits (e.g. a binary adder), which perform mathematical operations.
• Operational circuits, which perform logic operations. All mathematical operations are performed on binary numbers and all logic operations are performed using binary operations. Mathematical operations include addition, subtraction, multiplication and division.
• Logical operations allow programs to contain repetition and selection, the two essential control structures of programming. Logical operations performed by the ALU include comparing two quantities, keeping a counter and deciding the further route.
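
The role of the accumulator described above can be illustrated with a small sketch. This is a toy model in Python, not a description of any real ALU; the class name `ToyALU` and its method names are hypothetical and chosen only for illustration.

```python
class ToyALU:
    """Toy ALU: every operation works on, and updates, the accumulator."""

    def __init__(self):
        self.accumulator = 0  # main data register for intermediate results

    def load(self, value):
        self.accumulator = value
        return self.accumulator

    def add(self, value):
        self.accumulator += value
        return self.accumulator

    def subtract(self, value):
        self.accumulator -= value
        return self.accumulator

    def multiply(self, value):
        self.accumulator *= value
        return self.accumulator

    def compare(self, value):
        """Logical operation: -1, 0 or 1 if the accumulator is less than,
        equal to or greater than value."""
        return (self.accumulator > value) - (self.accumulator < value)


# Compute (6 + 4) * 3 - 5 step by step; each intermediate result
# stays in the accumulator until the final result is ready.
alu = ToyALU()
alu.load(6)
alu.add(4)
alu.multiply(3)
final_result = alu.subtract(5)
print(final_result)  # 25
```

The comparison result is what a program would use to decide "the further route", for example whether to repeat a loop.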

3.3.2 Control Unit (CU)

• The next component of the computer is the Control Unit, which acts like a supervisor seeing that things are done in the proper fashion.
• The control unit determines the sequence in which computer programs and instructions are executed.
• It handles tasks such as processing programs stored in the main memory, interpreting the instructions and issuing signals for other units of the computer to execute them.


• It also acts as a switchboard operator when several users access the computer simultaneously. Thereby, it coordinates the activities of the computer’s peripheral equipment as they perform input and output. Therefore, it is the manager of all operations mentioned in the previous section.
• The components of the CU (greatly oversimplified for illustrative purposes) are:
   - Decoders: Decoders interpret program instructions (object code written in machine language).
   - Timer (or clock): It sequences all CPU activities.
   - Logical gates and circuits: They distribute signals which activate various components of the CPU.
   - Program counter/register: It keeps track of the next instruction to be executed.
   - Register: It is a group of (usually) bistable devices that are used to store information like instructions, addresses and so on, within a computer system for high-speed access.
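
How a program counter, a decoder and a clock work together can be sketched as a simple fetch, decode and execute loop. The following Python sketch is a simplified illustration under stated assumptions: the instruction names (`LOAD`, `ADD`, `JUMP_IF_ZERO`, `HALT`) and the function `run` are hypothetical and do not correspond to any real machine language.

```python
def run(program):
    """Execute a list of (opcode, operand) pairs.

    Returns the final accumulator value and the number of clock ticks used.
    """
    program_counter = 0  # index of the next instruction to be executed
    accumulator = 0
    clock_ticks = 0      # the timer sequences activity, one tick per instruction

    while program_counter < len(program):
        opcode, operand = program[program_counter]  # fetch
        program_counter += 1
        clock_ticks += 1

        # decode and execute
        if opcode == "LOAD":
            accumulator = operand
        elif opcode == "ADD":
            accumulator += operand
        elif opcode == "JUMP_IF_ZERO":
            if accumulator == 0:
                program_counter = operand  # change which instruction comes next
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")

    return accumulator, clock_ticks


straight_line = [("LOAD", 2), ("ADD", 3), ("HALT", None), ("ADD", 100)]
print(run(straight_line))  # (5, 3)
```

The instruction after `HALT` is never fetched, because the loop stops before the program counter reaches it.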

3.3.3 Central Processing Unit (CPU)

• The ALU and the CU of a computer system are jointly known as the central processing unit (CPU). The CPU is known as the brain of any computer system.
• It is just like a brain that takes all major decisions, makes all sorts of calculations and directs the different parts of the computer by activating and controlling their operations.

Fig. 3.2 Computer architecture (labels: speakers, monitor, mouse, keyboard, RAM, hard drive, CD-ROM, floppy disk, network card, modem, printer, CPU; hardware and software)

3.4 Types of Computer Memory
Computer memory is used to store two things:

• instructions to execute a program
• data

There are two kinds of computer memory:

• primary memory
• secondary memory

When the computer is doing any job, the data that has to be processed is stored in the primary memory. This data may come from an input device like a keyboard or from a secondary storage device like a floppy disk. Primary memory, also called main memory or internal memory, provides temporary storage of programs in execution and the data being processed.


Auxiliary storage holds what is not currently being processed. This is the part that is ‘filed away’, but is readily available when needed. It is non-volatile, meaning that turning the power off does not erase it.

Primary memory is accessible directly by the processing unit. RAM is an example of primary memory. As soon as the computer is switched off, the contents of the primary memory are lost.

One can store and retrieve data much faster with primary memory than with secondary memory. Secondary memory, such as floppy disks and magnetic disks, is not directly accessible by the processing unit. Primary memory is more expensive than secondary memory, which is why the capacity of primary memory is usually smaller than that of secondary memory.

Capacity of a memory:
• As the program, or set of instructions, is kept in primary memory, the computer is able to follow the set of instructions instantly. In the computer’s memory, both programs and data are stored in binary form. The decimal number system, that is, the digits 1 to 9 and 0, was introduced in the previous chapter.
• The binary system has only two values, 0 and 1. These are called bits. This is because the large number of integrated circuits inside the computer can be considered as switches, which can be made ON or OFF. If a switch is ON, it is considered 1 and if it is OFF, it is 0.
• A number of switches in different states will give a message like this: 110101....10. So the computer takes input in the form of 0s and 1s and gives output in the form of 0s and 1s only. This may seem absurd to a user, but it is not a problem in practice.
• Every number in the binary system can be converted to the decimal system and vice versa; for example, binary 1010 means decimal 10. Therefore, the computer takes information or data in decimal form from the user, converts it into binary form, processes it to give output in binary form and then converts the output back to decimal form.
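
The conversion between binary and decimal mentioned above can be sketched in a few lines of Python. The function names `binary_to_decimal` and `decimal_to_binary` are illustrative only; the sketch handles non-negative whole numbers and is not meant to show how a real computer performs the conversion internally.

```python
def binary_to_decimal(bits: str) -> int:
    """Read the bits left to right: double the running value, then add the bit."""
    value = 0
    for bit in bits:
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit}")
        value = value * 2 + int(bit)
    return value


def decimal_to_binary(number: int) -> str:
    """Divide by 2 repeatedly; the remainders, read in reverse, are the bits."""
    if number < 0:
        raise ValueError("only non-negative numbers are handled in this sketch")
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        digits.append(str(number % 2))
        number //= 2
    return "".join(reversed(digits))


print(binary_to_decimal("1010"))  # 10, since 1*8 + 0*4 + 1*2 + 0*1 = 10
print(decimal_to_binary(10))      # 1010
```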

3.5 Primary Memory
Primary memory, also called main memory or internal memory, provides temporary storage of programs in execution and the data being processed. It is also known as Immediate Access Storage (IAS), because the CPU can access it directly. Main memory keeps track of what is currently being processed. These memory chips are the fastest, but most expensive, type of storage. Primary memory is volatile, meaning that turning the power off erases all of the data.

From the hardware point of view, the primary memory is formed by a large number of basic units referred to as ‘memory cells’. Each memory cell is a device or an electronic circuit that has two or more stable states, which represent the binary numbers 0 (zero) or 1 (one).

Capacity of primary memory
Each memory location holds one character, or 1 byte, of data, so capacity is defined in terms of bytes or words. Thus, a 64 kilobyte (KB) memory is capable of storing 64 × 1024 = 65,536 bytes (1 kilobyte is 1024 bytes). Memory size ranges from a few kilobytes in small systems to several thousand kilobytes in large mainframes and supercomputers. In a personal computer, the memory capacity may be 64 KB, 4 MB, 8 MB or even 16 MB (MB = megabyte, about one million bytes).
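
The capacity arithmetic can be checked with a short calculation. The sketch below uses 1 KB = 1024 bytes as stated in the text; treating 1 MB as 1024 KB is an assumption made here (the common binary convention), and the function names are illustrative.

```python
BYTES_PER_KB = 1024            # as stated in the text
KB_PER_MB = 1024               # assumption: binary convention for a megabyte


def kilobytes_to_bytes(kilobytes: int) -> int:
    return kilobytes * BYTES_PER_KB


def megabytes_to_bytes(megabytes: int) -> int:
    return megabytes * KB_PER_MB * BYTES_PER_KB


print(kilobytes_to_bytes(64))  # 65536
print(megabytes_to_bytes(16))  # 16777216
```

So a 64 KB memory holds 65,536 bytes, and a 16 MB memory holds a little under 16.8 million bytes under this convention.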

The computer can retrieve any item of data or any instruction stored in primary memory at lightning speed. The modern computer does this in a few nanoseconds. Primary memory can be further grouped into:

• Random Access Memory (RAM)
• Read Only Memory (ROM)
• Cache memory (small, fast RAM), which is designed to hold frequently used data

3.5.1 Random Access Memory (RAM)
This memory allows writing as well as reading of data, unlike ROM, on which data cannot be written. It is volatile storage, because the contents of RAM are lost when the power (computer) is turned off. To keep the data for later use, the user has to transfer the contents to a secondary storage device. There are several types of RAM, the most popular of which include:


• Dynamic RAM (DRAM), although its name sounds sophisticated, is the oldest and simplest (and therefore the slowest) type of RAM used today. The word ‘dynamic’ comes from the fact that it must be electronically ‘refreshed’ constantly in order to maintain the stored data.
• Static RAM (SRAM), unlike DRAM, does not need to be refreshed: its storage is fixed (as long as power is supplied to the computer). This newer, more dependable type of RAM is faster but more expensive than DRAM. SRAM is often used for cache memory.
• Extended Data Out DRAM (EDO DRAM) is a type of RAM that improves the memory access time on faster microprocessors such as the Intel Pentium. EDO DRAM was initially optimised for the 66 MHz Pentium.
• Synchronous DRAM (SDRAM) is a newer form of RAM that can be synchronised to the clock speed of the computer, a powerful feature that optimises data access by the system buses.
• Rambus DRAM (RDRAM) is Intel’s designated successor to SDRAM, having an effective speed of 800 MHz and a peak data transfer rate of 1.6 GB/s. However, it has yet to prove itself, and there are several rivals, e.g. DDR SDRAM, that are slower but have 64-bit bus widths, thus providing comparable transfer rates.
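
The comparison between a fast, narrow memory bus and a slower, wider one comes down to one multiplication: peak transfer rate = transfers per second × bytes per transfer. The sketch below is a rough illustration only. The 16-bit bus width for RDRAM and the 200 MHz effective speed for DDR SDRAM are assumptions introduced here, not figures from the text, and the function name is hypothetical.

```python
def peak_transfer_rate_gb_per_s(effective_mhz: float, bus_width_bits: int) -> float:
    """Peak rate in GB/s (1 GB = 10**9 bytes) for a given speed and bus width."""
    bytes_per_transfer = bus_width_bits / 8
    bytes_per_second = effective_mhz * 1_000_000 * bytes_per_transfer
    return bytes_per_second / 1_000_000_000


# Assumed figures, for illustration only:
rdram = peak_transfer_rate_gb_per_s(effective_mhz=800, bus_width_bits=16)
ddr = peak_transfer_rate_gb_per_s(effective_mhz=200, bus_width_bits=64)
print(rdram, ddr)
```

Under these assumptions both designs reach the same peak rate: one by moving 2 bytes very often, the other by moving 8 bytes less often.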

3.5.2 Read Only Memory (ROM)
Another type of microcomputer memory is read only memory. Data is ‘burnt’ into the ROM chip at the time of manufacturing. Unlike RAM, the data on the ROM is non-volatile, i.e., data is not lost when the computer is switched off. Following are the popular ROMs:

• Programmable Read Only Memory (PROM) can be programmed to record information using a facility known as a PROM-programmer. Once the chip has been programmed, the recorded information cannot be changed and remains intact even if power is switched off. Therefore programs or instructions written in PROM or ROM cannot be erased or changed.
• Erasable Programmable Read Only Memory (EPROM) overcomes this problem of PROM and ROM. An EPROM chip can be programmed time and again by erasing the information stored earlier in it. It is erased by shining ultraviolet light on the exposed chip. To write to an EPROM, one must use a PROM burner.
• Electrically Erasable PROM (EEPROM) is more convenient than EPROM, because it can be erased electrically and can be written to in bytes.
• Flash Memory, a special type of EEPROM, can be erased and rewritten in multi-byte blocks rather than the single bytes characteristic of EEPROM. Flash memory is most often used to hold control code such as the Basic Input/Output System (BIOS) in a personal computer: such a BIOS is often called a ‘flash BIOS’.

3.5.3 Cache Memory (Small, Fast RAM)

The speed of the CPU is extremely high compared to the access time of main memory, so the CPU is often left waiting for main memory and its effective performance drops. To reduce this mismatch in operating speed, a small memory chip whose access time is very close to the processing speed of the CPU is placed between the CPU and main memory. It is called cache memory.

It is designed to hold frequently used data. In general, cache (high speed RAM that is configured to hold the most frequently used data) is used to improve system performance. Memory cache or CPU cache is a dedicated bank of high-speed RAM chips used to cache data from primary memory. When data is read from primary memory, a block larger than immediately necessary is stored in the cache under the assumption that the next data needed by a program will be located near the data being read: when that data is needed, it will then be waiting in the high speed cache. Memory cache may be either built into the CPU (level 1, or L1, cache, e.g. Pentiums and PowerPCs) or contained in separate chips (level 2, or L2, cache).

After the data received through the input devices is processed in the central processing unit, the data in the form of result is dispatched through the output devices.
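The effect of fetching a whole block on every miss can be illustrated with a toy simulation. This is a simplified sketch and not a model of any real CPU cache: the function name, the block size, the number of cached blocks and the least-recently-used replacement rule are all assumptions chosen for illustration.

```python
def simulate_block_cache(addresses, block_size=4, num_blocks=2):
    """Count (hits, misses) for a tiny cache that loads whole blocks on a miss."""
    cached_blocks = []  # block numbers in the cache, least recently used first
    hits = misses = 0
    for address in addresses:
        block = address // block_size  # the block that contains this address
        if block in cached_blocks:
            hits += 1
            cached_blocks.remove(block)  # re-appended below as most recent
        else:
            misses += 1
            if len(cached_blocks) == num_blocks:
                cached_blocks.pop(0)  # evict the least recently used block
        cached_blocks.append(block)
    return hits, misses

# Neighbouring addresses: most reads find their block already cached.
print(simulate_block_cache(list(range(16))))
# Widely scattered addresses: every read has to go to main memory.
print(simulate_block_cache([0, 8, 16, 24, 0, 8, 16, 24]))
```

The sequential pattern benefits from the "next data will be nearby" assumption, while the scattered pattern defeats it.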


3.5.4 Registers

The CPU processes data and instructions at high speed; there is also movement of data between various units of the computer. It is necessary to transfer the processed data at high speed. So the computer uses a number of special memory units called registers. They are not part of the main memory but they store data or information temporarily and pass it on as directed by the control unit.

3.6 Secondary Storage

Often it is necessary to store hundreds of millions of bytes of data for the CPU to process. Therefore, additional memory is required in all computer systems. This memory is called auxiliary memory or secondary storage. In this type of memory the cost per bit of storage is low. However, the operating speed is slower than that of primary storage. Huge volumes of data are stored here on a permanent basis and transferred to primary storage as and when required.

Auxiliary storage holds what is not currently being processed. This is the part that is ‘filed away’, but is readily available when needed. It is non-volatile, which means that turning the power off does not erase it. The most widely used secondary storage devices are magnetic tapes and magnetic disks.

3.6.1 Magnetic Tape

Magnetic tapes are used with large computers such as mainframes, where large volumes of data are stored for a long time. On a PC, one can also use tapes in the form of cassettes. Storing data on tape is inexpensive. Magnetic tapes are mounted on reels, cartridges or cassettes to store large volumes of data or backups. They are cheap and, since they are removable from the drive, they provide effectively unlimited storage capacity. Information retrieval from tapes is sequential and not random.

Tapes are not suitable for on-line retrieval of data, since sequential searching takes a long time. They are convenient for archival storage or for backup. Tapes are one of the earliest storage devices, offering low cost, low speed and portability.

Advantages of magnetic tape are:

• Compact: A 10-inch diameter reel of tape is 2400 feet long and is able to hold 800, 1600 or 6250 characters in each inch of its length. The maximum capacity of such tape is 180 million characters. Thus data are stored much more compactly on tape.
• Economical: The cost of storing characters is very low compared to other storage devices.
• Fast: Copying of data is easy and fast.
• Long term storage and re-usability: Magnetic tapes can be used for long term storage and a tape can be used repeatedly without loss of data.
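The 180 million character figure follows directly from the reel length and the highest recording density quoted above. The short sketch below reproduces the arithmetic; the function name is illustrative, and the calculation assumes the whole tape length holds data, ignoring any gaps a real drive leaves between records.

```python
INCHES_PER_FOOT = 12

def tape_capacity_chars(length_feet, chars_per_inch):
    """Upper bound on characters stored if every inch of tape holds data."""
    return length_feet * INCHES_PER_FOOT * chars_per_inch

for density in (800, 1600, 6250):
    print(density, "characters per inch ->", tape_capacity_chars(2400, density), "characters")
```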

Fig. 3.3 Magnetic Tape (reel diameter: 10 1/2 inches)


3.6.2 Magnetic Disk

A magnetic disk is a circular platter of plastic, which is coated with magnetisable material. One of the key components of a magnetic disk is a conducting coil called the head (read-write head), which performs the job of reading and writing on the magnetic surface. The head remains stationary while the disk rotates below it for a reading or writing operation.

All magnetic disks are similarly formatted, or divided into areas called tracks, sectors and cylinders. A disk sector is a wedge-shaped piece of the disk (shown in grey in Fig. 3.4). Each sector is numbered. A track sector is the area of intersection of a track and a sector (shown in grey in Fig. 3.5). The head of the disk is a small coil that reads or writes on the portion of the disk rotating below it: therefore, the data is stored in a concentric set of rings called tracks. The width of a track is equal to the width of the head. To minimise the interference of magnetic fields and to minimise the errors of misalignment of the head, adjacent tracks are separated by inter-track gaps. As we go towards the outer tracks, the size of a track increases, but to simplify the electronics the same number of bits is stored on each track.
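One consequence of storing the same number of bits on every track is that bits are packed more tightly on the inner tracks than on the outer ones. The sketch below makes this concrete; the function name, the bit count and the two radii are hypothetical values chosen only for illustration.

```python
import math

def linear_bit_density(bits_per_track, track_radius_mm):
    """Bits per millimetre along a circular track of the given radius."""
    circumference_mm = 2 * math.pi * track_radius_mm
    return bits_per_track / circumference_mm

BITS_PER_TRACK = 100_000  # hypothetical: the same for every track

inner = linear_bit_density(BITS_PER_TRACK, track_radius_mm=20.0)
outer = linear_bit_density(BITS_PER_TRACK, track_radius_mm=40.0)
print(f"inner track: {inner:.1f} bits/mm, outer track: {outer:.1f} bits/mm")
```

With these assumed radii the outer track is twice as long, so each bit on it occupies twice the length it would on the inner track.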

Floppy disks and hard disks are commonly used types of magnetic disk. Several other kinds of removable magnetic media are in use, such as the Zip disk. All of these have a much higher capacity than floppy disks, but each type of media requires its own drive.

Fig. 3.4 Disk sectors (Source: http://www.jegsworks.com/Lessons/lesson6/lesson6-3.htm)

Fig. 3.5 Track sectors (Source: http://www.jegsworks.com/Lessons/lesson6/lesson6-3.htm)

3.6.3 Floppy Disk

A floppy disk is made of a flexible thin sheet of plastic material with a magnetic coating and grooves arranged in concentric circles with tracks. The disk is removable from the reading device attached to the computer and therefore provides unlimited storage capacity.


Floppy disks are available in two sizes, 5.25 inches and 3.5 inches, with capacity ranging from 360 KB to 1.44 MB per disk. However, the use of floppy disks has practically stopped, owing to their unreliability and the availability of improved media.
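The two capacities quoted above can be reproduced from the disk geometry. The sketch below uses the commonly cited layouts for these formats (sides, tracks per side, sectors per track and 512-byte sectors); those geometry figures and the function name are assumptions added for illustration and do not appear in the text.

```python
def disk_capacity_bytes(sides, tracks_per_side, sectors_per_track, bytes_per_sector=512):
    """Formatted capacity of a disk as the product of its geometry."""
    return sides * tracks_per_side * sectors_per_track * bytes_per_sector

# Assumed geometry of a 5.25-inch double-density disk ("360 KB").
low = disk_capacity_bytes(sides=2, tracks_per_side=40, sectors_per_track=9)
# Assumed geometry of a 3.5-inch high-density disk ("1.44 MB").
high = disk_capacity_bytes(sides=2, tracks_per_side=80, sectors_per_track=18)

print(low, "bytes =", low // 1024, "KB")
print(high, "bytes =", high // 1024, "KB")
```

Note that the "1.44 MB" label corresponds to 1440 KB in this calculation, with 1 KB taken as 1024 bytes.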

Fig. 3.6 Floppy disk

3.6.4 Optical Disk

In optical storage devices, the information is written using a laser beam on a plastic-coated disk, which can store digital data as tiny pits etched in the surface.

Characteristics of optical disks
• They are formed of layers.
• Data is in a spiral groove, starting from the centre of the disk.
• Data is in digital format (1s and 0s).
• 1s and 0s are formed by how the disk absorbs or reflects light from the tiny laser beam.

Working of optical disks
• An optical disk is made up of polycarbonate (a plastic). The data is stored on a layer inside the polycarbonate. A metal layer reflects the laser light back to a sensor.
• To read the data on a disk, laser light shines through the polycarbonate and hits the data layer. How the laser light is reflected or absorbed is read as a 1 or 0 by the computer.
• In a Compact Disk (CD), the data layer is near the top of the disk, the label side.
• In a DVD, the data layer is in the middle of the disk. A DVD can actually have data in two layers. It can access the data from one side or from both sides.
• This is how a double-sided, double-layered DVD can hold 4 times the data that a single-sided, single-layered DVD can.

Types of optical disks

Following are the types of optical disks:

Read only

• The most common type of optical disk is the CD-ROM, which stands for Compact Disk Read Only Memory. It looks just like an audio CD but the recording format is quite different. CD-ROM disks are used for storing computer software.
• DVD stands for Digital Video Disc or Digital Versatile Disc: DVDs are used for recording movies and storing large amounts of data.
• Write Once Read Many: The CDs and DVDs that are commercially produced are of the Write Once Read Many (WORM) variety. They cannot be changed once they are created. That is, they allow writing only once, while data may be read many times.


• The data layer is physically moulded into the polycarbonate. Pits (depressions) and lands (surfaces) form the digital data. A metal coating (usually aluminium) reflects the laser light back to the sensor. Oxygen can seep into the disk, especially in high temperatures and high humidity. This corrodes the aluminium, making it too dull to reflect the laser correctly.
• CD-ROM and DVD-ROM disks are readable for many years, if stored in good condition.

Write once
• The optical disks that can be recorded on one’s own computer are CD-R, DVD-R and DVD+R disks, which are called writable or recordable disks.
• Here, the metal and data layers are separate and the metal layer can be of gold, silver, or a silver alloy.
• Gold layers are best because gold does not corrode. Naturally, the best is more expensive. Sulphur dioxide in air can seep in and corrode silver over time.
• The data layer is an organic dye that the writing laser changes. Once the laser modifies the dye, it cannot be changed again. Ultraviolet light and heat can degrade the organic dye.
• A writable disk is useful as a backup medium when long-term storage of data is required. It is less efficient for data that changes often, since a new recording is required each time the changed data is saved.

Rewrite
• An option for backup storage of changing data is rewritable disks, i.e. CD-RW, DVD-RW, DVD+RW and DVD-RAM.
• The data layer for these disks uses a phase-changing metal alloy film. This film can be melted by the laser’s heat to level out the marks made by earlier data, and then new data can be recorded with the laser.
• We can erase and write on these disks as many as 1000 times for CD-RW, and even 100,000 times for the DVD-RW types.

Advantages of optical disks
• Physical: An optical disk is much sturdier than a tape or a floppy disk. It is physically harder to break or melt or warp.
• Delicacy: It is not sensitive to being touched, though it can get too dirty or scratched to be read, but it can be cleaned.
• Magnetic: It is entirely unaffected by magnetic fields.
• Capacity: Optical disks hold a lot of data, especially the double-sided DVDs.
• For software providers, an optical disk is a great way to store the software and data that they want to distribute or sell.

Disadvantages of optical disks
• Cost: The cost of a CD-RW drive has dropped drastically in a short period of time, but the cost of disks can add up. Recordable disks (one time only) are also getting cheaper, but we have to be careful about the capacity and maximum recording speed. For commercial use, the read/write drives are quite cost effective. For personal use, they are available and are cheap enough for everyone to use for data storage.
• Duplication: It is not quite as easy or as fast to copy an optical disk as it is to copy files to a USB flash drive. One needs the software and hardware for writing disks.

3.6.5 Flash Memory

Several different brands of removable storage cards, also called memory cards, are available in the market. These are solid state devices that read and write data electrically and not magnetically. Devices like digital cameras, mobile phones, etc. may use compact flash, smart media, memory stick or another flash memory card. Laptops use PCMCIA cards, which are a type of flash memory. They are as solid as hard disks.


3.6.6 USB Drives

A USB drive is also known as a flash drive, jump drive, flash pen, key drive, etc. It can be plugged into a USB port of a computer without any additional drivers. Storage capacities vary from 8 MB to 128 GB or more.

3.6.7 Removable Hard Drives

Various types of special drives that compress data are available. Since they provide high storage capacity, they can be used for backup as well.

3.6.8 Smart Cards

A smart card, chip card, or integrated circuit card (ICC) is any pocket-sized card with embedded integrated circuits. There are two broad categories of ICCs: memory cards, which contain only non-volatile memory storage components and perhaps dedicated security logic, and microprocessor cards, which contain volatile memory and microprocessor components. The card is made of plastic. The most common smart card applications are credit cards, electronic cash, computer security systems and so on.

3.6.9 Optical Cards

The card material comprises several layers that react when a laser light is directed at them. The laser burns a tiny hole (2.25 microns in diameter) in the material, which can then be sensed by a low power laser during the read cycle. The presence or absence of the burn spot indicates a 1 or 0. Because the material is actually burned during the write cycle, the media is a ‘Write Once Read Many’ type and the data is non-volatile (not lost when power is removed).

Optical cards hold on the order of 1,000 times the amount of information of a typical smart card, and the data, once written, is permanent and cannot be erased or altered in any way. Optical cards, unlike smart cards, are also impervious to electric and magnetic fields and also to static electricity.

3.7 Input Output Devices

A computer is only useful when it is able to communicate with the external environment. When one works with the computer, data and instructions are fed into it through some devices. These devices are called input devices. Similarly, after processing the data, the computer gives output through other devices called output devices.

Input and output devices are collectively called I/O devices. Input devices (and also output devices) are the hardware interfaces between the human user and computer system, but (as always) hardware is ‘driven’ by software, so when one talks about an I/O device, remember there is an associated ‘device driver.’

3.8 Input Devices

Input devices are necessary to convert our information or data into a form which can be understood by the computer. A good input device should provide timely, accurate and useful data to the main memory of the computer. For such processing, following are the most useful input devices:

Keyboard

• The keyboard is the most common data entry device, having more than 100 keys on it. Almost all general-purpose computers are supplied with a keyboard.
• When a key is pressed, a number (code) is sent to the computer to tell it which key has been pressed. Keyboards are often used in conjunction with a screen on which the data entered are displayed.
• The keys on a keyboard are usually arranged in the same order as those on a typewriter. This layout of keys is called QWERTY because Q-W-E-R-T-Y is the order in which the letters occur on the top row of the keyboard.
• Keyboards are widely used because they provide a flexible method of data entry and can be used in most applications. However, they do have limitations: entry using a keyboard is a slow form of data entry and is prone to error.


Fig. 3.7 Keyboard

Pointing devices

These are also called Cursor Control Devices. Cursor control devices are used to place the cursor (a highlighted screen location indicating where the next action will occur), select menu items, and control the computer by ‘clicking buttons’ on the screen. If these are built into the computer they are called Integrated Pointing Devices. A few such devices available are:

• Mouse: A standard device of the GUI (Graphical User Interface). Newer versions are optical and have no moving parts: light from an LED (Light Emitting Diode) is reflected off a flat surface and sensed to detect motion. Older mechanical versions roll on a small ball. A mouse has two or three buttons on the top. When the mouse is moved across a flat surface, the cursor on the screen moves in the direction of the mouse movement. The cursor moves very fast with a mouse, giving more freedom to work in any direction. It is easier and faster to move around the screen with a mouse.
• Trackballs: Like an ‘upside-down mouse’, a trackball has the advantage of being stationary.
• Joysticks: A hand-held stick that pivots about one end, indicating directions through 360 degrees.
• Trackpoint or pointing stick: A miniature joystick that responds to the touch of a single finger.
• Trackpads: A touch sensitive surface that translates finger motion into cursor motion.

Fig. 3.8 Pointing devices (joystick, mouse, trackball)

Pen input devices

These are based on screens that sense the location of a special pen that is connected to the terminal. Following are some of the devices:

• Light pens either detect the monitor’s light or emit light that can be picked up by a specially designed monitor.
• Styluses are pens with electronic point heads which activate pixels on the monitor, usually an LCD display.
• Handwriting recognition software translates handwritten alphanumeric characters to digitised equivalents: normally it needs to be ‘trained’ to recognise an individual’s carefully printed letters, numbers, and symbols. Such systems have been rather primitive, but significant advances have been made recently. They are the primary input method of hand-held PDAs (Personal Digital Assistants) and PIMs (Personal Information Managers). State-of-the-art readers reportedly are very accurate.

• Digitising tablets are similar to light pens or styluses except one draws on a tablet rather than the screen.
• Touch screen recognises human touch and allows selection of menu items displayed on a monitor by touching them.


Video input devices

Some different types of video input devices are:

• Digital cameras
  - Digital cameras have optics like regular photographic cameras; however, they record the single images electronically (rather than on photographic film) in digital form. These images are stored in the camera’s RAM (Random Access Memory), which like that in a computer is volatile.
  - The images can be displayed immediately or stored on a secondary storage medium, e.g. a diskette, and processed later using image processing software.

• Digital video cameras
  - These are digital cameras which can store sequences of digital images on magnetic tape and play them back as ‘movies’. They are similar to camcorders, but camcorders store their images as analogue data.
  - Digital video cameras are essential features of video conferencing, where remote computers can actually control a remote camera and remote users can share applications and collaborate on ‘whiteboards’.

Optical input methods
• Scanner: The keyboard can input only text through the keys provided on it. If we want to input a picture, the keyboard cannot do that. A scanner is an optical device that can input any graphical matter so that it can be displayed back. The common optical scanner devices are:

  - Magnetic ink character recognition (MICR): This is widely used by banks to process large volumes of cheques and drafts. Cheques are put inside the MICR reader. As they enter the reading unit, the cheques pass through a magnetic field which enables the read head to recognise the characters on the cheques.
  - Optical mark reader (OMR): This technique is used when students have appeared in objective type tests and had to mark their answers by darkening a square or circular space with a pencil. These answer sheets are fed directly to a computer for grading, where OMR is used.
  - Optical character recognition (OCR): This technique permits the direct reading of any printed character. Suppose a set of hand written characters on a piece of paper is given. It is put inside the scanner of the computer. Each scanned pattern is compared with a set of patterns stored inside the computer. Whichever pattern is matched is called a character read. Patterns that cannot be identified are rejected. OCRs are expensive though better than MICR.
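The compare-and-reject procedure described for OCR can be sketched as simple template matching. This is a toy illustration, not how a production OCR system works: the 3x5 bitmaps, the function names and the mismatch threshold are all hypothetical.

```python
# Hypothetical stored patterns: "#" marks ink, "." marks blank paper.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}

def mismatches(bitmap_a, bitmap_b):
    """Count pixel positions at which two equally sized bitmaps differ."""
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(bitmap_a, bitmap_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )

def recognise(scanned, max_mismatches=1):
    """Return the closest stored character, or None if the pattern is rejected."""
    best_char = min(TEMPLATES, key=lambda char: mismatches(scanned, TEMPLATES[char]))
    if mismatches(scanned, TEMPLATES[best_char]) <= max_mismatches:
        return best_char
    return None

noisy_l = ["#..", "#..", "#.#", "#..", "###"]    # an "L" with one stray pixel
solid_block = ["###", "###", "###", "###", "###"]  # matches no stored pattern well
print(recognise(noisy_l), recognise(solid_block))
```

The threshold controls the trade-off between tolerating noise in the scan and rejecting patterns that do not belong to the stored set.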

3.9 Output Devices

Output devices are the means by which computer systems communicate with people. Output devices accept data from the processor and convert them into the required output format. The convenience of use of these devices and the quality of their results have a significant impact on the effectiveness of a computer system. In other words, output devices translate the data in the processor into a format that is suitable for people to use. Most ‘real world’ data is analogue, i.e., it consists of continuous signals like sounds, pictures, voltage and so forth.

However, computers can only process digital data (discrete signals): therefore, input usually involves analogue to digital conversion (A/D hardware) and output reverses the process using D/A converters. Output can be sub-classified as either direct (to/from I/O devices) or indirect (to/from secondary storage). Output can also be divided into another two kinds: hard copy output and soft copy output. Hard copy output (paper, microfilm, etc.) provides a permanent record, while soft copy output (visual, audio, tactile, or action) is transient.
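The idea of converting a continuous signal to a discrete code and back can be sketched as uniform quantisation. The example is a simplified illustration rather than a description of any particular A/D hardware: the function names, the 0 to 5 volt range and the 3-bit resolution are assumptions.

```python
def analogue_to_digital(voltage, v_max=5.0, bits=3):
    """Quantise a voltage in [0, v_max] to the nearest of 2**bits integer codes."""
    top_code = 2 ** bits - 1
    clamped = min(max(voltage, 0.0), v_max)
    return int(clamped / v_max * top_code + 0.5)  # round to the nearest code

def digital_to_analogue(code, v_max=5.0, bits=3):
    """Convert an integer code back to the voltage level it represents."""
    top_code = 2 ** bits - 1
    return code / top_code * v_max

for volts in (0.0, 1.2, 2.5, 5.0):
    code = analogue_to_digital(volts)
    print(volts, "->", format(code, "03b"), "->", round(digital_to_analogue(code), 3))
```

The value recovered by the D/A step generally differs slightly from the original voltage; using more bits reduces this quantisation error.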

Action output facilitates control of electromechanical devices, e.g. robotics. For the sake of convenience, let us follow the given classification to discuss the output devices. Some output devices are as follows:

• Terminals: A terminal is a very popular interactive input-output unit. Terminals can be divided into two types: hard copy terminals and soft copy terminals. A hard copy terminal provides a printout on paper whereas soft copy terminals provide visual copy on a monitor. A terminal, when connected to a CPU, sends instructions directly to the computer. Terminals are also classified as dumb terminals or intelligent terminals depending upon the work situation.
• Cathode ray tube displays (CRTs)


  - These are the most commonly seen output devices. The computer screen is made of a CRT. They are also called monitors, visual display terminals (VDTs) or visual display units (VDUs).
  - Monitors look identical to a television screen. They produce fast and virtually costless output of information.
  - CRTs use raster scan technology to portray images as bitmapped graphics on a phosphorescent screen. Electrons are fired at the screen and light up tiny dots of phosphor, which then glow for a short period of time. Each point is called a picture element or pixel.
  - Since the phosphors glow only momentarily, the electron gun keeps on firing the electron beam at regular intervals. The rate of this refreshing is measured in Hertz (Hz) or cycles per second. A low refresh rate leads to screen flicker.
  - Monochrome monitors use one colour images (usually black) on a one colour background (usually white), e.g. old mainframe monitors. These are now virtually obsolete in PCs.
  - On the other hand, colour monitors use a triad of red, green and blue phosphor dots which are stimulated in varying degrees to produce a wide range of colours.

• Liquid Crystal Display (LCD): It is the most popular type, which has a thin layer of liquid crystal molecules divided into small squares forming pixels that are held between two glass sheets. When power is applied to a square it turns opaque. LCDs used to be limited in size, brightness and clarity, but current technology has significantly improved.
• Gas-plasma: Gas-plasma displays give the best image (though with low contrast), but they cannot be battery operated.
• Printer: It is an important output device which can be used to get a printed copy of the processed text or result on paper. There are various types of printers that are designed for different types of applications. Depending on their speed and approach of printing, printers are classified as impact and non-impact printers.

  - Impact printers use the familiar typewriter approach of hammering a typeface against the paper and inked ribbon. Dot-matrix printers are of this type.
  - Non-impact printers do not hit or impact a ribbon to print. They use electrostatic, chemical and ink-jet technologies. Laser printers and ink-jet printers are of this type. This type of printer can produce colour printing and elaborate graphics.

Fig. 3.9 Types of printers (impact printer, non-impact printer)

• Plotters: Though a few of the printers listed above are capable of producing graphics, there are special plotters made exclusively to print good quality drawings and graphs. There are two types of plotters:

  - Flatbed plotters have a drawing instrument (pen, ink-jet, electrostatic head, or heater element) that moves both horizontally and vertically, under the control of input voltages, over a flat piece of stationary paper.
  - Drum plotters have a drawing pen that moves vertically, while the paper on a drum rotates under it.


Fig. 3.10 Plotters


Summary
• A computer performs five major operations. These are: it accepts data, stores data, it can process data as required, it gives results and it controls all internal operations.
• Basic computer organisation includes input, storage, processing, output and control.
• Input is the process of entering data and programs into the computer system. The process of saving data and instructions permanently is known as storage. The task of performing operations like arithmetic and logical operations is called processing.
• Output is the process of producing results from the data for getting useful information. The manner in which instructions are executed and the above operations are performed is known as control.
• The computer system is divided into three separate units for its operation. They are: arithmetic logical unit, control unit and central processing unit.
• Computer memory is used to store two things: instructions to execute a program and data.
• There are two kinds of computer memory: primary memory and secondary memory.
• Primary memory is accessible directly by the processing unit. RAM is an example of primary memory. As soon as the computer is switched off, the contents of the primary memory are lost.
• Primary memory can be further grouped into Random Access Memory (RAM), Read Only Memory (ROM) and cache memory. Cache memory (small, fast RAM) is designed to hold frequently used data.
• Often it is necessary to store hundreds of millions of bytes of data for the CPU to process. Therefore additional memory is required in all computer systems. This memory is called auxiliary memory or secondary storage. It is non-volatile, meaning that turning the power off does not erase it.
• A computer is only useful when it is able to communicate with the external environment. When one works with the computer, data and instructions are fed into it through some devices. These devices are called input devices. Similarly, after processing, the computer gives output through other devices called output devices. Input and output devices are collectively called I/O devices.




Self Assessment

1. Match the following

1. ALU A. The brain of any computer system

2. CU B. Memory allows writing as well as reading of data

3. CPU C. Determines the sequence in which computer programs and instructions are executed

4. RAM D. The major operations performed are addition, subtraction, multiplication, division and comparison.

a. 1-A, 2-B, 3-C, 4-D
b. 1-D, 2-C, 3-A, 4-B
c. 1-B, 2-A, 3-D, 4-C
d. 1-C, 2-D, 3-B, 4-A

2. A hard copy terminal provides a printout on paper whereas __________ terminals provide visual copy on monitor.
a. soft copy
b. hard copy
c. flexible copy
d. stiff copy

3. The process of producing results from the data for getting useful information is _______.
a. output
b. input
c. processing
d. storage

4. The task of performing arithmetic and logical operations is called ________.
a. ALU
b. editing
c. storage
d. output

5. The ALU and CU jointly are known as ________.
a. RAM
b. ROM
c. CPU
d. CD

6. Which of the following statements is false?
a. Secondary memory is called Auxiliary memory.
b. A CD-ROM is read only memory.
c. Printer is an important output device.
d. The magnetic tapes and magnetic disk are primary memories.


7. Which of the following statements is false?
a. There are two kinds of computer memory: primary and secondary.
b. The computer takes inputs in the form of 0 and 1.
c. The storage of program and data in the RAM is permanent.
d. The memories which do not lose their content on failure of power supply are known as non-volatile memories.

8. The memories which are erased if there is a power failure are known as ______.
a. non-volatile memories
b. volatile memories
c. stable memories
d. unstable memories

9. ROM is a __________.
a. non-volatile memory
b. volatile memory
c. stable memory
d. unstable memory

10. _________ printers do not hit or impact a ribbon to print.
a. Typewriters
b. Dot-matrix
c. Impact
d. Non-impact


Chapter IV

Computer Operations and Languages

Aim

The aim of this chapter is to:

• introduce computer arithmetic
• explain computer language
• explicate computer operation

Objectives

The objectives of this chapter are to:

• explain binary number system, octal and hexadecimal
• define assembler, compiler and interpreter
• elucidate instruction cycle and program flow

Learning outcome

At the end of this chapter, you will be able to:

• identify characteristic features of computer arithmetic
• recognise floating point representation and arithmetic through stacks
• understand binary arithmetic


4.1 Introduction
A computer cannot do anything without instructions from the user. In order to do any specific job, a sequence of instructions has to be given to the computer. This set of instructions is called a computer program. Computer programs are formed using computer arithmetic and computer language. Software refers to the set of computer programs, computer operations and computer applications. Software guides the computer at every step, telling it where to start and stop during a particular job.

4.2 Computer Arithmetic
Arithmetic is a branch of mathematics that deals with numbers and numerical computation. Arithmetic operations on pairs of numbers x and y include addition, producing the sum s = x + y; subtraction, yielding the difference d = x − y; multiplication, resulting in the product p = x × y; and division, generating the quotient q = x / y (and, in the case of integer division, the remainder z = x mod y). Subtraction and division can be viewed as operations that undo the effects of addition and multiplication, respectively.

Computer arithmetic is a branch of computer engineering that deals with methods of representing integers and real values (e.g., fixed- and floating-point numbers) in digital systems and efficient algorithms for manipulating such numbers by means of hardware circuits or software routines. On the hardware side, various types of adders, subtractors, multipliers, dividers, square-rooters and circuit techniques for function evaluation are considered. Both abstract structures and technology-specific designs are dealt with.

Software aspects of computer arithmetic include complexity, error characteristics, stability and verifiability of computational algorithms.

• Natural numbers
◦ When one thinks of numbers, it is usually the natural numbers that first come to mind: the type of numbers that sequence book or calendar pages, mark clock dials, flash on stadium scoreboards, etc. The set {0, 1, 2, 3, ...} of natural numbers, also known as whole numbers or unsigned integers, forms the basis of arithmetic. Natural numbers are used for counting and for answering questions that ask “how many?”
◦ Four thousand years ago, Babylonians knew about natural numbers and were proficient in arithmetic. Since then, representations of natural numbers have advanced in parallel with the evolution of language.
◦ Ancient civilisations used sticks and pebbles to record inventories or accounts. When the need for larger numbers arose, the idea of grouping sticks or pebbles simplified counting and comparisons. For example, 27 was represented by five groups of five sticks, plus two sticks. Eventually, objects of different shapes or colours were used to denote such groups, leading to more compact representations.
◦ Numbers must be differentiated from their representations, sometimes called numerals. For example, the number “twenty-seven” can be represented in different ways using various numerals or numeration systems. These systems include:

Numeral                               Numeration system
||||| ||||| ||||| ||||| ||||| ||      sticks or unary code
27                                    radix-10 or decimal code
11011                                 radix-2 or binary code
XXVII                                 Roman numerals

Table 4.1 Various numerals or numeration systems

◦ Base (radix): In a number system, the base or radix is the number of symbols used in the system. In earlier days, different civilisations used different radixes. The Egyptians used the radix 2, the Babylonians used the radix 60 and the Mayans used 18 and 20.


The base of a number system is indicated by a subscript, written as a decimal number after the value of the number. For example, $(952)_{10}$, $(456)_{8}$, $(314)_{16}$.

• Number systems which are used by computers are:
◦ Decimal system: The decimal system is the system used in everyday counting. It includes the ten digits from 0 through 9, which are recognised as the symbols of the decimal system. Each digit in a base-ten number represents units ten times the units of the digit to its right. For example, $9542 = 9000 + 500 + 40 + 2 = (9 \times 10^3) + (5 \times 10^2) + (4 \times 10^1) + (2 \times 10^0)$.
◦ Binary system: Computers do not use the decimal system for counting and arithmetic. Their CPU and memory are made up of millions of tiny switches that can be in either the ON or the OFF state; 0 represents OFF and 1 represents ON. The binary system has two digits, 0 and 1, and base 2. Therefore, the weight of the nth bit counted from the right-hand side is (nth bit) $\times\ 2^{n-1}$.
◦ Octal system: The octal system is commonly used with computers. The octal number system, with its 8 digits 0, 1, 2, 3, 4, 5, 6 and 7, has base 8. Each digit position in an octal number represents a power of 8.
◦ Hexadecimal system: Hexadecimal is another number system that works exactly like the decimal, binary and octal number systems, except that the base is 16. Each hexadecimal digit position represents a power of 16. The system uses the digits 0 to 9 and the characters A to F to represent 10 to 15 respectively.
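The positional weights described above can be checked with a short sketch. The Python function below is only an illustration (the names `DIGITS` and `expand` are assumptions, not part of the text): it lists what each digit of a numeral contributes in a given base.

```python
DIGITS = "0123456789ABCDEF"  # symbols for bases up to 16

def expand(numeral: str, base: int) -> list[int]:
    """Return each digit's contribution (digit * base**position), left to right."""
    n = len(numeral)
    return [DIGITS.index(ch) * base ** (n - 1 - i)
            for i, ch in enumerate(numeral.upper())]

print(expand("9542", 10))      # [9000, 500, 40, 2]
print(sum(expand("314", 16)))  # 788
```

Summing the contributions gives the value of the numeral, which is why the same routine works for decimal, binary, octal and hexadecimal.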

4.3 Binary Number System
The binary numeral system or base-2 number system represents numeric values using two symbols, 0 and 1. More specifically, the usual base-2 system is a positional notation with a radix of 2. Owing to its straightforward implementation in digital electronic circuitry using logic gates, the binary system is used internally by all modern computers.

Representation
A binary number can be represented by any sequence of bits (binary digits), which in turn may be represented by any mechanism capable of being in two mutually exclusive states. The following sequences of symbols could all be interpreted as the binary numeric value of 667:

1 0 1 0 0 1 1 0 1 1
| – | – – | | – | |
x o x o o x x o x x
y n y n n y y n y y

Table 4.2 Binary numeric value of 667

The numeric value represented in each case is dependent upon the value assigned to each symbol. In a computer, the numeric values may be represented by two different voltages; on a magnetic disk, magnetic polarities may be used. A “positive”, “yes” or “on” state is not necessarily equivalent to the numerical value of one; it depends on the architecture in use.

Binary numbers are commonly written using the symbols 0 and 1. When written, binary numerals are often subscripted, prefixed or suffixed in order to indicate their base, or radix. When spoken, binary numerals are usually read digit-by-digit, in order to distinguish them from decimal numbers. For example, the binary numeral 100 is pronounced one zero zero, rather than one hundred.


[Figure: binary clock in which columns of LEDs, weighted 8, 4, 2 and 1, show the digits of the time 10:37:49]

Fig. 4.1 Binary clock using LED to express binary values

4.3.1 Counting in Binary
Counting in binary is similar to counting in any other number system. Beginning with a single digit, counting proceeds through each symbol, in increasing order. Decimal counting uses the symbols 0 through 9, while binary only uses the symbols 0 and 1.

When the symbols for the first digit are exhausted, the next-higher digit (to the left) is incremented and counting starts over at 0. In decimal, counting proceeds like:

000, 001, 002, ... 007, 008, 009, (rightmost digit starts over, and next digit is incremented)
010, 011, 012, ...
...
090, 091, 092, ... 097, 098, 099, (rightmost two digits start over, and next digit is incremented)
100, 101, 102, ...

Decimal   Binary
0         0
1         1
2         10
3         11
4         100
5         101
6         110
7         111
8         1000
9         1001
10        1010
11        1011
12        1100
13        1101
14        1110
15        1111
16        10000

Table 4.3 Representation of decimal and binary counting


After a digit reaches 9, an increment resets it to 0 but also causes an increment of the next digit to the left. In binary, counting is the same except that only the two symbols 0 and 1 are used. Thus, after a digit reaches 1 in binary, an increment resets it to 0 but also causes an increment of the next digit to the left:

0000,
0001, (rightmost digit starts over, and next digit is incremented)
0010, 0011, (rightmost two digits start over, and next digit is incremented)
0100, 0101, 0110, 0111, (rightmost three digits start over, and the next digit is incremented)
1000, 1001, ...

Since binary is a base-2 system, each digit represents an increasing power of 2, with the rightmost digit representing $2^0$, the next representing $2^1$, then $2^2$, and so on. To determine the decimal representation of a binary number, simply take the sum of the products of the binary digits and the powers of 2 which they represent. For example, the binary number 100101 is converted to decimal form by:

$[1 \times 2^5] + [0 \times 2^4] + [0 \times 2^3] + [1 \times 2^2] + [0 \times 2^1] + [1 \times 2^0]$
$= [1 \times 32] + [0 \times 16] + [0 \times 8] + [1 \times 4] + [0 \times 2] + [1 \times 1] = 37$

To create higher numbers, additional digits are simply added to the left side of the binary representation.
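The sum of products above can be written as a few lines of Python. This is a minimal sketch; the function name `binary_to_decimal` is an illustrative choice rather than anything defined in the text.

```python
def binary_to_decimal(bits: str) -> int:
    """Sum each binary digit times the power of 2 for its position."""
    total = 0
    # enumerate from the rightmost digit, whose weight is 2**0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total

print(binary_to_decimal("100101"))  # 37
```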

4.3.2 Binary Arithmetic
Arithmetic in binary is much like arithmetic in other numeral systems. Addition, subtraction, multiplication and division can be performed on binary numerals.

Addition
• The simplest arithmetic operation in binary is addition. Adding two single-digit binary numbers is relatively simple, using a form of carrying:
0 + 0 → 0
0 + 1 → 1
1 + 0 → 1
1 + 1 → 0, carry 1 (since 1 + 1 = 0 + 1 × binary 10)
• Adding two “1” digits produces a digit “0”, while 1 will have to be added to the next column. This is similar to what happens in decimal when certain single-digit numbers are added together. If the result equals or exceeds the value of the radix (10), the digit to the left is incremented:
5 + 5 → 0, carry 1 (since 5 + 5 = 0 + 1 × 10)
7 + 9 → 6, carry 1 (since 7 + 9 = 6 + 1 × 10)
• This is known as carrying. When the result of an addition exceeds the largest value of a digit, the procedure is to “carry” the excess amount divided by the radix (that is, 10/10) to the left, adding it to the next positional value. This is correct since the next position has a weight that is higher by a factor equal to the radix. Carrying works the same way in binary:

  1 1 1 1 1      (carried digits)
    0 1 1 0 1
+   1 0 1 1 1
-------------
= 1 0 0 1 0 0

Table 4.4 Addition in binary
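The column-by-column carrying shown in Table 4.4 can be sketched in Python as follows. The function name `add_binary` is an assumption made for illustration.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary numerals column by column, right to left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    result = []
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        result.append(str(total % 2))  # digit written in this column
        carry = total // 2             # carried into the next column
    if carry:
        result.append("1")
    return "".join(reversed(result))

print(add_binary("01101", "10111"))  # 100100
```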


Subtraction
• Subtraction works in much the same way:
0 − 0 → 0
0 − 1 → 1, borrow 1
1 − 0 → 1
1 − 1 → 0
• Subtracting a “1” digit from a “0” digit produces the digit “1”, while 1 will have to be subtracted from the next column. This is known as borrowing. The principle is the same as for carrying. When the result of a subtraction is less than 0, the least possible value of a digit, the procedure is to “borrow” the deficit divided by the radix (that is, 10/10) from the left, subtracting it from the next positional value. Subtracting a positive number is equivalent to adding a negative number of equal absolute value. Computers typically use 2’s complement notation to represent negative values. This notation eliminates the need for a separate “subtract” operation.

    *   * * *      (starred columns are borrowed from)
  1 1 0 1 1 1 0
-     1 0 1 1 1
---------------
= 1 0 1 0 1 1 1

Table 4.5 Subtraction in binary
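Borrowing can be sketched the same way as carrying. The Python function below is an illustrative assumption (the name `subtract_binary` does not come from the text) and only handles the case where the first number is at least as large as the second.

```python
def subtract_binary(a: str, b: str) -> str:
    """Compute a - b for binary numerals with a >= b, borrowing column by column."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    borrow = 0
    result = []
    for x, y in zip(reversed(a), reversed(b)):
        diff = int(x) - int(y) - borrow
        if diff < 0:
            diff += 2   # borrow one unit of the next column, worth 2 here
            borrow = 1
        else:
            borrow = 0
        result.append(str(diff))
    return "".join(reversed(result)).lstrip("0") or "0"

print(subtract_binary("1101110", "10111"))  # 1010111
```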

• The most common way of subtracting binary numbers is to take the second value (the number to be subtracted) and apply what is known as 2’s complement. This is done in two steps:
◦ complement each digit in turn (change 1 for 0 and 0 for 1), and
◦ add 1 (one) to the result.

Note: the first step by itself is known as 1’s complement.

• By applying these steps you are effectively turning the value into a negative number. As with decimal numbers, adding a negative number to a positive number is the same as subtracting the corresponding positive value.
• In other words, 25 + (−8) = 17, which is the same as writing 25 − 8 = 17.

As an example, let us do the following subtraction: 11101011 − 01100110.

Note: when subtracting binary values it is important to maintain the same number of digits for each number, even if it means placing zeroes to the left of the value to make up the digits. For instance, in our example we have added a zero to the left of the value 1100110 to make the number of digits up to 8 (one byte): 01100110.

First we apply 2’s complement to 01100110:

Step 1 (reverse zeroes and ones):     01100110 → 10011001
Step 2 (take result and add 1):       10011001 + 1 = 10011010

Now we need to add 11101011 + 10011010. When the addition is done, always disregard the last carry, so it would be:


  11111 1       (carries)
   11101011
 + 10011010
 ----------
   10000101     (ignore the last carry on the left)

This gives us 10000101.
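The worked example can be reproduced with a short Python sketch. It assumes both operands are written with the same number of bits and that the first is the larger; the name `subtract_via_complement` is hypothetical.

```python
def subtract_via_complement(a: str, b: str) -> str:
    """Compute a - b (a >= b, equal widths) by adding the 2's complement of b."""
    width = len(a)
    mask = (1 << width) - 1                       # e.g. 11111111 for 8 bits
    complement = ((int(b, 2) ^ mask) + 1) & mask  # invert the bits, then add 1
    total = int(a, 2) + complement
    return format(total & mask, f"0{width}b")     # masking discards the last carry

print(subtract_via_complement("11101011", "01100110"))  # 10000101
```

In decimal terms this is 235 − 102 = 133, and 10000101 is the binary form of 133.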

Negative numbers
• The above example subtracts a smaller number from a larger number. If one wants to subtract a larger number from a smaller number (giving a negative result), then the process is slightly different.
• Usually, to indicate a negative number, the most significant bit (left-hand bit) is set to 1 and the remaining 7 digits are used to express the value. In this format the MSB is referred to as the sign bit.
• Here are the steps for subtracting a larger number from a smaller one (negative result):
◦ Apply 2’s complement to the larger number.
◦ Add this value to the smaller number.
◦ Change the sign bit (MSB) to zero.
◦ Apply 2’s complement to the value to get the final result.
◦ The most significant bit (sign bit) now indicates that the value is negative, and the remaining bits give its magnitude.
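The steps above can be traced in Python for 8-bit values. This is a sketch under stated assumptions: the width of 8 bits, the helper names and the example operands (5 − 8) are illustrative and not taken from the text.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1          # 11111111

def twos(value: int) -> int:
    """2's complement of an 8-bit value: invert the bits and add 1."""
    return ((value ^ MASK) + 1) & MASK

def subtract_negative_result(small: int, large: int) -> str:
    """Follow the listed steps for small - large, where small < large."""
    step1 = twos(large)               # 2's complement of the larger number
    step2 = (small + step1) & MASK    # add it to the smaller number
    step3 = step2 & (MASK >> 1)       # change the sign bit (MSB) to zero
    step4 = twos(step3)               # 2's complement gives the final result
    return format(step4, f"0{WIDTH}b")

print(subtract_negative_result(5, 8))  # 10000011
```

The leading 1 is the sign bit and the remaining bits 0000011 are the magnitude 3, so the result reads as −3.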

Multiplication
• Multiplication in binary is similar to its decimal counterpart. Two numbers A and B can be multiplied by partial products. For each digit in B, the product of that digit and A is calculated and written on a new line, shifted leftward so that its rightmost digit lines up with the digit in B that was used. The sum of all these partial products gives the final result.
• Since there are only two digits in binary, there are only two possible outcomes of each partial multiplication:
◦ If the digit in B is 0, the partial product is also 0.
◦ If the digit in B is 1, the partial product is equal to A.

For example, the binary numbers 1011 and 1010 are multiplied as follows:

        1 0 1 1     (A)
      × 1 0 1 0     (B)
      ---------
        0 0 0 0     corresponds to a zero in B
+     1 0 1 1       corresponds to a one in B
+   0 0 0 0
+ 1 0 1 1
---------------
= 1 1 0 1 1 1 0
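The partial-product method can be sketched in Python as below. The function names are assumptions for illustration; each row is A or zero, shifted left by the position of the digit of B that produced it.

```python
def partial_products(a: str, b: str) -> list[str]:
    """One shifted row per digit of B, starting from B's rightmost digit."""
    rows = []
    for shift, digit in enumerate(reversed(b)):
        row = a if digit == "1" else "0" * len(a)
        rows.append(row + "0" * shift)   # shift left by appending zeros
    return rows

def multiply_binary(a: str, b: str) -> str:
    """Sum the partial products and return the result as a binary numeral."""
    return format(sum(int(row, 2) for row in partial_products(a, b)), "b")

print(partial_products("1011", "1010"))  # ['0000', '10110', '000000', '1011000']
print(multiply_binary("1011", "1010"))   # 1101110
```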

Division
• Binary division follows the same procedure as decimal long division. In the example below, the divisor is $101_2$, or 5 decimal, while the dividend is $11011_2$, or 27 decimal. Here, the divisor $101_2$ goes into the first three digits $110_2$ of the dividend one time, so a “1” is written on the top line.
• This result is multiplied by the divisor, and subtracted from the first three digits of the dividend; the next digit (a “1”) is included to obtain a new three-digit sequence:

          1
      ___________
101 ) 1 1 0 1 1
    - 1 0 1
      -----
        0 1 1

• The procedure is then repeated with the new sequence, continuing until the digits in the dividend have been exhausted:

          1 0 1
      ___________
101 ) 1 1 0 1 1
    - 1 0 1
      -----
        0 1 1
      - 0 0 0
        -----
          1 1 1
        - 1 0 1
          -----
            1 0

• Thus, the quotient of $11011_2$ divided by $101_2$ is $101_2$, as shown on the top line, while the remainder, shown on the bottom line, is $10_2$. In decimal, 27 divided by 5 is 5, with a remainder of 2.
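The long-division procedure can be sketched in Python: bring down one dividend digit at a time and subtract the divisor whenever it fits. The name `divide_binary` is an illustrative assumption, and the sketch assumes a non-zero divisor.

```python
def divide_binary(dividend: str, divisor: str) -> tuple[str, str]:
    """Binary long division; returns (quotient, remainder) as binary numerals."""
    d = int(divisor, 2)
    remainder = 0
    quotient = []
    for bit in dividend:
        remainder = remainder * 2 + int(bit)  # bring down the next digit
        if remainder >= d:
            quotient.append("1")              # divisor goes in once
            remainder -= d
        else:
            quotient.append("0")              # divisor does not fit
    return "".join(quotient).lstrip("0") or "0", format(remainder, "b")

print(divide_binary("11011", "101"))  # ('101', '10')
```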

4.3.3 Conversion of Binary, Decimal, Hexadecimal and Octal Number Systems
Any number in one number system can be converted into any other number system. Various methods are used for converting numbers from one base to another. In every number system:

• The first bit from the right is referred to as the LSB (Least Significant Bit).
• The first bit from the left is referred to as the MSB (Most Significant Bit).

Conversion of binary to decimal
• To convert from a base-10 integer numeral to its base-2 (binary) equivalent, the number is divided by two and the remainder is the least-significant bit. The (integer) result is again divided by two; its remainder is the next most significant bit. This process repeats until the result of further division becomes zero.
• Conversion from base-2 to base-10 proceeds by applying the preceding algorithm, so to speak, in reverse. The bits of the binary number are used one by one, starting with the most significant (leftmost) bit. Beginning with the value 0, repeatedly double the prior value and add the next bit to produce the next value. This can be organised in a multi-column table. For example, converting $10010101101_2$ to decimal in this way gives $1197_{10}$.

Prior value × 2 + Next bit = Next value
0   × 2 + 1 = 1
1   × 2 + 0 = 2
2   × 2 + 0 = 4
4   × 2 + 1 = 9
9   × 2 + 0 = 18
18  × 2 + 1 = 37
37  × 2 + 0 = 74
74  × 2 + 1 = 149
149 × 2 + 1 = 299
299 × 2 + 0 = 598
598 × 2 + 1 = 1197

Table 4.6 Conversion of binary to decimal
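The doubling method of Table 4.6 can be sketched in Python. The function name and the shape of the returned rows are assumptions chosen for illustration; each row holds the prior value, the next bit and the next value.

```python
def doubling_steps(bits: str) -> tuple[int, list[tuple[int, int, int]]]:
    """Convert binary to decimal by repeated doubling, recording every step."""
    value = 0
    rows = []
    for bit in bits:                      # start from the most significant bit
        next_value = value * 2 + int(bit)
        rows.append((value, int(bit), next_value))
        value = next_value
    return value, rows

result, rows = doubling_steps("10010101101")
print(result)   # 1197
print(rows[3])  # (4, 1, 9)
```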

Conversion of decimal to binary (base 10 to base 2)
• The method that is used for converting decimal numbers into binary is known as the remainder method. We use the following steps for getting the binary number:
◦ Divide the decimal number by 2.
◦ Write the remainder (which is either 0 or 1) at the rightmost position.
◦ Repeat the process of dividing by 2 until the quotient is 0, and keep writing the remainder after each step of division.
◦ Write the remainders in reverse order.


Example: convert $(68)_{10}$ to binary
68 / 2 = 34, remainder is 0
34 / 2 = 17, remainder is 0
17 / 2 = 8, remainder is 1
8 / 2 = 4, remainder is 0
4 / 2 = 2, remainder is 0
2 / 2 = 1, remainder is 0
1 / 2 = 0, remainder is 1
Answer = 1 0 0 0 1 0 0

Note: the answer is read from bottom (MSB) to top (LSB) as $1000100_2$.
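The remainder method can be sketched in Python with `divmod`, which returns the quotient and remainder in one call. The function name `decimal_to_binary` is an illustrative assumption.

```python
def decimal_to_binary(n: int) -> str:
    """Remainder method: divide by 2 repeatedly, then reverse the remainders."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)        # quotient for the next step, remainder to keep
        remainders.append(str(r))
    return "".join(reversed(remainders))  # read from bottom (MSB) to top (LSB)

print(decimal_to_binary(68))  # 1000100
```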

Conversion of octal to decimal (base 8 to base 10)
Example: convert $(632)_8$ to decimal
$= (6 \times 8^2) + (3 \times 8^1) + (2 \times 8^0)$
$= (6 \times 64) + (3 \times 8) + (2 \times 1)$
$= 384 + 24 + 2$
$= (410)_{10}$

Conversion of decimal to octal (base 10 to base 8)
Example: convert $(177)_{10}$ to octal
177 / 8 = 22, remainder is 1
22 / 8 = 2, remainder is 6
2 / 8 = 0, remainder is 2
Answer = 2 6 1

Note: the answer is read from bottom to top as $(261)_8$, the same as in the binary case.

Decimal   Binary   Octal   Hexadecimal
0         0000     0       0
1         0001     1       1
2         0010     2       2
3         0011     3       3
4         0100     4       4
5         0101     5       5
6         0110     6       6
7         0111     7       7
8         1000     10      8
9         1001     11      9
10        1010     12      A
11        1011     13      B
12        1100     14      C
13        1101     15      D
14        1110     16      E
15        1111     17      F

Table 4.7 Decimal, binary, octal and hexadecimal numbers


Conversion of hexadecimal to decimal (base 16 to base 10)
Example: convert $(F4C)_{16}$ to decimal
$= (F \times 16^2) + (4 \times 16^1) + (C \times 16^0)$
$= (15 \times 256) + (4 \times 16) + (12 \times 1)$
$= 3840 + 64 + 12$
$= (3916)_{10}$

Conversion of decimal to hexadecimal (base 10 to base 16)
Example: convert $(4768)_{10}$ to hexadecimal
4768 / 16 = 298, remainder 0
298 / 16 = 18, remainder 10 (A)
18 / 16 = 1, remainder 2
1 / 16 = 0, remainder 1
Answer: 1 2 A 0

Note: the answer is read from bottom to top, the same as in the binary case.
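The octal and hexadecimal examples follow one pattern, which the Python sketch below generalises to any base up to 16. The names `to_base` and `from_base` are assumptions for illustration.

```python
DIGITS = "0123456789ABCDEF"

def to_base(n: int, base: int) -> str:
    """Repeated division: collect remainders, then read them in reverse."""
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, r = divmod(n, base)
        out.append(DIGITS[r])
    return "".join(reversed(out))

def from_base(numeral: str, base: int) -> int:
    """Multiply the running value by the base and add each digit in turn."""
    value = 0
    for ch in numeral.upper():
        value = value * base + DIGITS.index(ch)
    return value

print(to_base(177, 8), to_base(4768, 16))          # 261 12A0
print(from_base("632", 8), from_base("F4C", 16))   # 410 3916
```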

Conversion of binary to octal and hexadecimal
• Conversion of binary numbers to octal and hexadecimal simply requires grouping the bits of the binary number into groups of three bits for conversion to octal and into groups of four bits for conversion to hexadecimal.
• Groups are formed beginning with the LSB and progressing to the MSB.
• Thus:

$11\ 100\ 111_2 = 347_8$

$11\ 100\ 010\ 101\ 010\ 010\ 001_2 = 3425221_8$

$1110\ 0111_2 = E7_{16}$

$1\ 1000\ 1010\ 1000\ 0111_2 = 18A87_{16}$
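Grouping can be sketched in Python by padding the numeral on the left and converting each group of bits to one digit. The function name `group_convert` is a hypothetical choice; a group size of 3 gives octal and 4 gives hexadecimal.

```python
def group_convert(bits: str, group: int) -> str:
    """Convert binary to octal (group=3) or hexadecimal (group=4) by grouping bits."""
    pad = (-len(bits)) % group          # zeros needed so the MSB group is full
    bits = "0" * pad + bits
    chunks = [bits[i:i + group] for i in range(0, len(bits), group)]
    return "".join("0123456789ABCDEF"[int(chunk, 2)] for chunk in chunks)

print(group_convert("11100111", 3))              # 347
print(group_convert("11100010101010010001", 3))  # 3425221
print(group_convert("11000101010000111", 4))     # 18A87
```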

4.3.4 1’s and 2’s Complement of Binary Number
1’s and 2’s complement are important because they permit the representation of negative numbers. 2’s complement is commonly used in computers to handle negative numbers.

The 1’s complement of a binary number is found by simply changing all 1’s to 0’s and all 0’s to 1’s, as illustrated below:

1 0 1 1 0 0 1 0    binary number
↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓
0 1 0 0 1 1 0 1    1’s complement

2’s complement of a binary number
The 2’s complement of a binary number is found by adding 1 to the LSB of the 1’s complement:

2’s complement = 1’s complement + 1

Example: Find the 2’s complement of the binary number 10110010.

  10110010    binary number
  01001101    1’s complement
+        1    add 1
  --------
  01001110    2’s complement

Example: Determine the 2’s complement of 11001011.

  11001011    binary number
  00110100    1’s complement
+        1    add 1
  --------
  00110101    2’s complement
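Both complements can be sketched as string operations in Python. The function names are illustrative assumptions; the result keeps the same number of bits as the input.

```python
def ones_complement(bits: str) -> str:
    """Change every 1 to 0 and every 0 to 1."""
    return "".join("1" if b == "0" else "0" for b in bits)

def twos_complement(bits: str) -> str:
    """1's complement plus 1, kept to the original width."""
    width = len(bits)
    value = (int(ones_complement(bits), 2) + 1) % (1 << width)
    return format(value, f"0{width}b")

print(ones_complement("10110010"))  # 01001101
print(twos_complement("10110010"))  # 01001110
print(twos_complement("11001011"))  # 00110101
```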


4.4 Floating Point Arithmetic
In computing, floating point describes a system for representing numbers that would be too large or too small to be represented as integers. Numbers are in general represented approximately to a fixed number of significant digits and scaled using an exponent. The base for the scaling is normally 2, 10 or 16.

The typical number that can be represented exactly is of the form:

$\text{significant digits} \times \text{base}^{\text{exponent}}$

The term floating point refers to the fact that the radix point (decimal point or, more commonly in computers, binary point) can “float”; i.e., it can be placed anywhere relative to the significant digits of the number. This position is indicated separately in the internal representation, and floating-point representation can thus be thought of as a computer realisation of scientific notation. Over the years, several different floating-point representations have been used in computers. However, for the last ten years the most commonly encountered representation is that defined by the IEEE 754 Standard.

The advantage of floating-point representation over fixed-point (and integer) representation is that it can support a much wider range of values. For example, a fixed-point representation that has seven decimal digits with two decimal places can represent the numbers 12345.67, 123.45, 1.23 and so on, whereas a floating-point representation (such as the IEEE 754 decimal32 format) with seven decimal digits could in addition represent 1.234567, 123456.7, 0.00001234567, 1234567000000000, and so on.

The floating-point format needs slightly more storage (to encode the position of the radix point), so when stored in the same space, floating-point numbers achieve their greater range at the expense of precision. The speed of floating-point operations is an important measure of performance for computers in many application domains. It is measured in FLOPS.
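The split into significant digits and exponent, and the loss of precision that comes with it, can be observed in Python. This sketch assumes that Python floats are IEEE 754 double-precision values, which holds on virtually all current platforms; the helper name `decompose` is illustrative.

```python
import math

def decompose(x: float) -> tuple[float, int]:
    """Return (m, e) with x == m * 2**e and 0.5 <= |m| < 1, using math.frexp."""
    return math.frexp(x)

print(decompose(6.0))             # (0.75, 3), since 0.75 * 2**3 == 6.0
print(0.1 + 0.2 == 0.3)           # False: 0.1 and 0.2 are stored approximately
print(2.0**53 + 1.0 == 2.0**53)   # True: 53 significant bits cannot hold 2**53 + 1
```

The last line shows range traded for precision: very large values are representable, but not every integer near them.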

4.5 Arithmetic through Stacks
In a stack machine model, arithmetic commands pop their operands from the top of the stack and push their results back onto the top of the stack. Other commands transfer data items from the stack’s top to designated memory locations, and vice versa. As it turns out, these simple stack operations can be used to implement the evaluation of any arithmetic or logical expression. Further, any program, written in any programming language, can be translated into an equivalent stack machine program.

Elementary stack operations
• A stack is an abstract data structure that supports two basic operations: push and pop.
• The push operation adds an element to the top of the stack. The element that was previously on top is pushed below the newly added element.
• The pop operation retrieves and removes the top element. The element just below it moves up to the top position.
• Thus the stack implements a last-in-first-out (LIFO) storage model, illustrated in fig. 4.2.


[Figure: contents of the stack and of memory locations a and b, before and after the operations push b and pop a; SP marks the location just after the top of the stack]

Fig. 4.2 Stack processing

• The stack processing example in fig. 4.2 illustrates the two elementary operations push and pop. Following the convention, the stack is drawn upside down, as if it grows downward. The location just after the top position is always referred to by a special pointer called sp or stack pointer. The labels a and b refer to two arbitrary memory addresses.
• Stack arithmetic: Stack-based arithmetic is a simple matter. The two top elements are popped from the stack, the required operation is performed on them, and the result is pushed back onto the stack. For example, here is how addition is handled. The stack versions of other operations (subtract, multiply, etc.) work in precisely the same way.

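[Figure: a stack holding 17, 4 and 5 (top) becomes 17 and 9 (top) after add; SP marks the location just after the top of the stack]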

Fig. 4.3 Addition using stack arithmetic
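Stack arithmetic can be sketched in Python with a list, where `append` plays the role of push and `pop` removes the top element. The command names in `OPS` and the function `apply` are assumptions made for illustration.

```python
import operator

# Hypothetical command names mapped to the operation each one performs.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def apply(stack: list[int], command: str) -> None:
    """Pop the two top elements, combine them, and push the result back."""
    right = stack.pop()   # top of the stack
    left = stack.pop()    # element just below the top
    stack.append(OPS[command](left, right))

stack = [17, 4, 5]        # 5 is on top
apply(stack, "add")
print(stack)              # [17, 9]
```

Because every command leaves its result on top of the stack, a sequence of such commands can evaluate a larger expression step by step.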

4.6 Computer Language
A language that is acceptable to a computer system is called a computer language or programming language, and the process of writing instructions in such a language for an already planned program is called programming or coding. Over the years, programming languages have progressed from machine-oriented languages, which use strings of 1s and 0s, to problem-oriented languages, which use common mathematical and/or English terms. All computer languages can broadly be classified into the following three categories:

• Machine language
• Assembly language
• High-level language


4.6.1 Machine Language
Machine language is understood by the computer without using a translation program. The machine language of a computer is normally written as strings of binary 1s and 0s. An instruction normally has a two-part format, as shown in fig. 4.4. The first part of an instruction is the operation code, which tells the computer what function to perform, and the second part is the operand, which tells the computer where to find or store the data or other instructions which are to be manipulated. Hence, each instruction tells the computer what operation to perform and the length and locations of the data fields which are involved in the operation. Every computer has a set of operation codes called the instruction set. Each operation code in the instruction set is meant for performing a specific basic operation or function.

Typical operations included in the instruction set of a computer are as follows:
• arithmetic operations
• logical operations
• branch operations (either conditional or unconditional) for transfer of control to the address given in the operand field
• data movement operations for moving data between memory locations and registers
• data movement operations for moving data from or to one of the computer’s input/output devices.

Fig. 4.4 shows a typical single-address machine language instruction. Although some computers are designed to use only single-address instructions, many computers use multiple-address instructions, which include the addresses of two or more operands. For example, the augend and addend may be the two operands of an addition operation.

| OPCODE (operation code) | OPERAND (address/location) |

Fig. 4.4 Instruction format

All computers use binary digits (0s and 1s) for performing internal operations. Hence, most computer machine language instructions consist of strings of binary numbers. For example, a typical program instruction to print out a number on the printer might be

10110011111010011101100

The program to add two numbers in memory and print the result might look something like the following:

001000000000001100111001
001100000000010000100001
011000000000011100101110
101000111111011100101110
000000000000000000000000

This is obviously not an easy-to-use language, because it is difficult to read and understand, and also because it is written in a number system with which we are not familiar. However, some of the first programmers, who worked with the first few computers, actually wrote their programs in binary form like this one. Since human programmers are more familiar with the decimal number system, most of them will prefer to write the computer instructions in decimal, and leave the input device to convert these to binary. In fact, without too much effort, a computer can be wired so that instead of using long strings of 1s and 0s, we can use the more familiar decimal numbers. With this change, the preceding program appears as follows:

10001471
14002041
30003456
50773456
00000000

This set of instructions, whether in binary or decimal, which can be directly understood by a computer without the help of a translating program, is called a machine code or machine language program. Hence, a machine language program need not necessarily be coded as strings of binary digits (1s and 0s). It can also be written using decimal digits, if the circuitry of the computer being used permits this.

Advantages and limitations of machine language
Programs written in machine language can be executed very fast by the computer. This is because machine instructions are directly understood by the computer, and no translation of the program is required. Writing a program in machine language has several disadvantages, which are discussed below:

• Machine dependent: Because the internal design of every type of computer is different from every other type of computer, the machine language also differs from computer to computer. Hence, after becoming proficient in the machine language of a particular computer, if a company decides to change to another computer, its programmers will have to learn a new machine language, and would have to rewrite all the existing programs.
• Difficult to program: Although machine language programs are directly and efficiently executed by the computer, it is difficult to program in machine language. It is necessary for the programmer either to memorise the dozens of operation code numbers for the commands in the machine’s instruction set, or to constantly refer to a reference card. A programmer is also forced to keep track of the storage locations of data and instructions. Moreover, a machine language programmer must be an expert who knows about the hardware structure of the computer.
• Error prone: For writing programs in machine language, a programmer has to remember the opcodes and must keep track of the storage locations of data and instructions. It therefore becomes very difficult for him/her to concentrate fully on the logic of the problem. This frequently results in programming errors.
• Difficult to modify: It is difficult to correct or modify machine language programs. Checking machine instructions to locate errors is very difficult and time consuming. Similarly, modifying a machine language program later is so difficult that many programmers would prefer to code the new logic afresh, instead of incorporating the necessary modifications in the old program. In short, writing a program in machine language is so difficult and time consuming that it is rarely used today.

4.6.2 Assembly Language
Programming in machine language is difficult and error-prone because:

• A programmer needs to write numeric codes for the instructions in the computer’s instruction set.
• A programmer needs to write the storage locations of data and instructions in numeric form.
• A programmer needs to keep track of the storage locations of data and instructions while writing a program.

Assembly language programming, which was introduced in 1952, helped in overcoming the above listed limitations of machine language programming in the following manner:

• By using alphanumeric mnemonic codes, instead of numeric codes, for the instructions in the instruction set. For example, using ADD instead of 1110 (binary) or 14 (decimal) for the instruction to add, using SUB instead of 1111 (binary) or 15 (decimal) for the instruction to subtract, and so on. With this feature, the instructions in the instruction set can be much more easily remembered and used by programmers.
• By allowing storage locations to be represented in the form of alphanumeric addresses, instead of numeric addresses. For example, the memory locations 1000, 1001 and 1002 may be represented as FRST, SCND and ANSR respectively, in an assembly language program. With this feature, a programmer can much more easily remember and use the storage locations of the data and instructions used in an assembly language program.
• By providing additional instructions, called ‘pseudo-instructions’, in the instruction set, which are used for instructing the system how we want the program to be assembled inside the computer’s memory. For example, there may be pseudo-instructions for telling the system things like:

START PROGRAM AT 0000

START DATA AT 1000

SET ASIDE AN ADDRESS FOR FRST

SET ASIDE AN ADDRESS FOR SCND

SET ASIDE AN ADDRESS FOR ANSR

Table 4.8 Pseudo-instructions in the system

• With this feature, a programmer need not keep track of the storage locations of the data and instructions while writing an assembly language program. That is, the programmer need not even tell the computer where to place each data item and where to place each instruction of a program.
• A language which allows instructions and storage locations to be represented by letters and symbols, instead of numbers, is called an ‘assembly language’ or ‘symbolic language’. A program written in an assembly language is called an ‘assembly language program’ or a ‘symbolic program’.

Assembler

  - A computer can directly execute only machine language programs, which use numbers for representing instructions and storage locations. Hence, an assembly language program must be converted (translated) into its equivalent machine language program before it can be executed on the computer. This translation is done with the help of a translator program, which is known as an ‘assembler’.
  - The assembler of a computer system is system software, supplied by the computer manufacturer, which translates an assembly language program into an equivalent machine language program of a computer. It is so called because, in addition to translating an assembly language program into its equivalent machine language program, it also ‘assembles’ the machine language program in the main memory of the computer and makes it ready for execution.
  - The process of translating an assembly language program into its equivalent machine language program with the use of an assembler is illustrated in fig. 4.5. As shown in the figure, the input to the assembler is the assembly language program (often referred to as a ‘source program’) and its output is the machine language program (often referred to as an ‘object program’). Since the assembler translates each assembly language instruction into an equivalent machine language instruction, there is a one-to-one correspondence between the assembly language instructions of the source program and the machine language instructions of its equivalent object program. Note that, during the process of translation of a source program into its equivalent object program by the assembler, the source program is not being executed. It is only being converted into a form which can be executed by the computer’s processor.

[Figure: the assembly language program (source program) is the input to the assembler, which produces the machine language program (object program); the two programs are in one-to-one correspondence.]

Fig. 4.5 Illustrating the translation process of an assembler

  - Notice that, in case of an assembly language program, the computer not only has to run the program to get the answer, but it also must first run the assembler (program) to translate the original assembly language program into its equivalent machine language program.
  - This means that the computer has to spend more time in getting the desired answer from an assembly language program as compared to a machine language program. However, assembly language programming saves so much time and effort of the programmer that the extra time and effort spent by the computer is worth it.

• Assembly language programming, and the translation of an assembly language program into its equivalent machine language program, can be best illustrated with the help of an example. For this, let’s assume that the computer uses the mnemonics given in table 4.9 for the operation codes mentioned against each. For simplicity, we have considered only five operation codes, which will be used in writing the example program. In practice, there can be more than a hundred operation codes available in the instruction set of a particular computer.

Mnemonic Opcode Meaning

HLT 00 Halt, used at the end of program to stop

CLA 10 Clear and add into A register

ADD 14 Add to the contents of A register

SUB 15 Subtract from the contents of A register

STA 30 Store A register

Table 4.9 A subset of the set of instructions supported by a computer

• Let’s write a simple assembly language program for adding two numbers and storing the result. The program is shown in table 4.10. To get an idea of how the assembler will convert this program into an equivalent machine language program, let’s follow its instructions one-by-one. Notice that the first five instructions of the program are pseudo-instructions for telling the assembler what to do. They are not part of the main program to add the two numbers.

START PROGRAM AT 0000
START DATA AT 1000
SET ASIDE AN ADDRESS FOR FRST
SET ASIDE AN ADDRESS FOR SCND
SET ASIDE AN ADDRESS FOR ANSR
CLA FRST
ADD SCND
STA ANSR
HLT

Table 4.10 A sample assembly language program for adding two numbers and storing the result

• The first instruction of the assembly language program tells the assembler that the instructions for the main program (to add two numbers) should start at memory location 0000. Based on this directive, the assembler will load the first instruction of the main program (which happens to be CLA FRST in this example) at memory location 0000, and each following instruction will be loaded in the following address (that is, ADD SCND will be at location 0001, STA ANSR at location 0002, and HLT at location 0003).
• The second instruction of the assembly language program tells the assembler that the data of the program should start at memory location 1000. The next three instructions tell the assembler to set aside addresses for data items FRST, SCND and ANSR. Based on these four directives, the assembler sets up a mapping table somewhere in the computer memory, which looks something like the one shown in table 4.11. That is, the assembler picks up the first free address in the data area, which is at location 1000, and calls it FRST; it picks up the next free address in the data area, which is at location 1001, and calls it SCND; and finally, it picks up the next free address in the data area, which is at location 1002, and calls it ANSR.

Symbolic name Memory location

FRST 1000

SCND 1001

ANSR 1002

Table 4.11 Mapping table set up by the assembler for the data items of the assembly language program of table 4.10

• The next instruction of the assembly language program is CLA FRST, which the assembler translates into 10 1000 by translating CLA into 10 with the help of table 4.9 and FRST into 1000 with the help of table 4.11. Similarly, the assembler will translate the instruction ADD SCND into 14 1001 and the instruction STA ANSR into 30 1002. Finally, it translates the next instruction, HLT, into 00, thus providing the complete machine language program for the given assembly language program. Table 4.12 shows the resulting machine language program.

Memory location   Contents            Comments
                  Opcode   Address

0000              10       1000       Clear and add the number stored at FRST to A register
0001              14       1001       Add the number stored at SCND to the contents of A register
0002              30       1002       Store the contents of A register into ANSR
0003              00                  Halt
---
1000                                  Reserved for FRST
1001                                  Reserved for SCND
1002                                  Reserved for ANSR

Table 4.12 The equivalent machine language program for the assembly language program given in table 4.10
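The translation walked through above can be sketched as a short Python program. This is only an illustrative toy, not the assembler of any real computer: the function name `assemble`, the single-pass design, the treatment of addresses as four-digit decimal numbers and the requirement that every symbol be declared before it is used are all assumptions made for the sketch. The opcodes are those of table 4.9 and the source text is the program of table 4.10.

```python
# Opcode table taken from table 4.9 (mnemonic -> numeric opcode).
OPCODES = {"HLT": "00", "CLA": "10", "ADD": "14", "SUB": "15", "STA": "30"}

# The assembly language program of table 4.10.
SOURCE = """\
START PROGRAM AT 0000
START DATA AT 1000
SET ASIDE AN ADDRESS FOR FRST
SET ASIDE AN ADDRESS FOR SCND
SET ASIDE AN ADDRESS FOR ANSR
CLA FRST
ADD SCND
STA ANSR
HLT
"""


def assemble(source):
    """Translate the toy assembly program into (mapping table, object program)."""
    program_addr = 0      # next free location for an instruction
    data_addr = 0         # next free location in the data area
    symbols = {}          # symbolic name -> memory location (table 4.11)
    object_program = {}   # memory location -> machine instruction (table 4.12)

    for line in source.splitlines():
        words = line.split()
        if not words:
            continue
        if words[:2] == ["START", "PROGRAM"]:      # pseudo-instruction
            program_addr = int(words[-1])
        elif words[:2] == ["START", "DATA"]:       # pseudo-instruction
            data_addr = int(words[-1])
        elif words[:2] == ["SET", "ASIDE"]:        # pseudo-instruction
            symbols[words[-1]] = data_addr
            data_addr += 1
        else:                                      # real instruction
            code = OPCODES[words[0]]
            if len(words) > 1:                     # replace symbol by its address
                code += f" {symbols[words[1]]:04d}"
            object_program[f"{program_addr:04d}"] = code
            program_addr += 1                      # one instruction per location
    return symbols, object_program


symbols, object_program = assemble(SOURCE)
for location, instruction in object_program.items():
    print(location, instruction)
```

Each assembly instruction produces exactly one entry in `object_program`, which is the one-to-one correspondence between source and object program described earlier.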

• Assembly language has the following advantages:
  - easier to understand and use
  - easier to locate and correct errors
  - easier to modify
  - no worry about addresses: An important advantage of assembly language is that programmers need not keep track of the storage locations of the data and instructions while writing an assembly language program.
  - easy to re-locate
  - efficiency of machine language

• The following limitations of machine language are not solved by using assembly language:
  - Machine dependent: Since each instruction of an assembly language program is translated into exactly one machine language instruction, assembly language programs are machine dependent.
  - Knowledge of hardware required: Since assembly languages are machine dependent, an assembly language programmer must have a good knowledge of the characteristics and the logical structure of his/her computer to write good assembly language programs.
  - Machine level coding: In case of an assembly language, instructions are still written at the machine-code level. Hence, writing assembly language programs is still time-consuming and not very easy.

4.6.3 High-Level Language
Machine and assembly languages are often referred to as ‘low-level programming languages’. High-level programming languages were designed to overcome the limitations of low-level programming languages. Accordingly, high-level languages are characterised by the following features:

• They are machine independent. In other words, a program written in a high-level language can be easily ported to and executed on any computer which has the translator software for the high-level language.
• They do not require the programmers to know anything about the internal structure of the computer on which the high-level language programs will be executed. In fact, since high-level languages are machine independent, a programmer writing a program in a high-level language may not even know on which computer his/her program will be executed. This allows the programmers to concentrate mainly on the logic of the problem, rather than be concerned with the details of the internal structure of the computer.
• They do not deal with machine-level coding. Rather, they deal with high-level coding, enabling the programmers to write instructions using English words and familiar mathematical symbols and expressions. Each statement of a high-level language is normally a macro instruction, which is translated into several machine language instructions. This is one-to-many translation and not one-to-one as in case of assembly language.
• The advent of high-level languages has enabled the use of computers to solve problems even by non-expert users. This has allowed many users without any background in computer science and engineering to become computer programmers. This, in turn, has resulted in the creation of a large number of computer applications in diverse areas, leading to the use of computers today in every occupation.

Compiler
• A computer can directly execute only machine language programs. Hence, a high-level language program must be converted (translated) into its equivalent machine language program before it can be executed on the computer. This translation is done with the help of a translator program, which is known as a ‘compiler’.
• Hence, a compiler is a translator program (much more sophisticated than an assembler), which translates a high-level language program into its equivalent machine language program.
• A compiler is so called because it compiles a set of machine language instructions for every program instruction of a high-level language. The process of translating a high-level language program into its equivalent machine language program with the use of a compiler is illustrated in fig. 4.6.
• As shown in the figure, the input to the compiler is the high-level language program (often referred to as a ‘source program’), and its output is the machine language program (often referred to as an ‘object program’). Since high-level language instructions are macro instructions, the compiler translates each high-level language instruction into a set of machine language instructions, rather than a single machine language instruction.
• Hence, there is a one-to-many correspondence between the high-level language instructions of a source program and the machine language instructions of its equivalent object program. Note that, during the process of translation of a source program into its equivalent object program by the compiler, the source program is not being executed. It is only being converted into a form which can be executed by the computer’s processor.

[Figure: the high-level language program (source program) is the input to the compiler, whose output is the machine language program (object program); the two programs are in one-to-many correspondence.]

Fig. 4.6 Illustrating the translation process of a compiler

• A compiler can translate only those source programs which have been written in the language for which the compiler is meant. For example, a FORTRAN compiler is only capable of translating source programs written in FORTRAN.

• Therefore, each computer requires a separate compiler for each high-level language that it supports. That is, to execute both FORTRAN and COBOL programs on a computer, the computer must have a FORTRAN compiler and a COBOL compiler.
• Compilers are large programs, which reside permanently on secondary storage. When a source program is to be translated, the compiler and the source program are copied from secondary storage into the main memory of the computer. The compiler, being a program, is then executed with the source program as its input data. It generates the equivalent object program as its output, which is normally saved in a file on secondary storage. Whenever there is a need to execute the program, the object program is copied from secondary storage into the main memory of the computer and executed.
• Note that there is no need to repeat the compilation process every time you wish to execute the program. This is because the object program stored on secondary storage is already in machine language. You simply have to load the object program from secondary storage into the main memory of the computer and execute it directly. Also note that recompilation is necessary whenever the program is modified. That is, to incorporate changes in the program, you must load the original source program from secondary storage into the main memory of the computer, carry out the necessary changes in the source program, recompile the modified source program, and create and store an updated object program for execution.
• In addition to translating high-level language instructions into machine language instructions, compilers also automatically detect and indicate certain types of errors in source programs. These errors are referred to as ‘syntax errors’ and are typically of the following types:

  - illegal characters
  - illegal combination of characters
  - improper sequencing of instructions in a program
  - use of undefined variable names

• A source program containing one or more errors detected by the compiler will not be compiled into an object program. In this case, the compiler will generate a list of coded error messages indicating the types of errors committed. This error list is an invaluable aid to the programmer in correcting the program errors. The programmer uses this error list to re-edit the source program for removing the errors and creates a modified source program, which is recompiled. The process of editing the source program to make the necessary corrections and recompiling the modified source program is repeated until the source program is free of all syntax errors. The compiler generates the object program only when there are no syntax errors in the source program.
• A compiler, however, cannot detect ‘logic errors’. It can only detect grammatical (syntax) errors in the source program. It cannot know one’s intentions. For example, if one has wrongly entered -25 as the age of a person when +25 was actually intended, the compiler cannot detect this. Programs containing such errors will be successfully compiled and the object code will be obtained without any error message. However, such programs, when executed, will not produce correct results. Hence, logic errors are detected only after the program is executed and the result produced does not tally with the desired result.
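The difference between syntax errors and logic errors can be illustrated with Python’s built-in `compile()` function. This is only an analogy: `compile()` translates Python source into bytecode rather than machine language, and the helper name `check_syntax` and the two sample statements are assumptions made for this sketch. Note also that the exact set of errors reported at translation time differs from language to language.

```python
def check_syntax(source):
    """Return None if the source translates cleanly, else the error message."""
    try:
        compile(source, "<example>", "exec")   # translate only, do not execute
    except SyntaxError as err:
        return err.msg
    return None


bad_syntax = "age = = 25\n"    # illegal combination of characters
logic_error = "age = -25\n"    # grammatically valid, but +25 was intended

syntax_result = check_syntax(bad_syntax)    # an error message is returned
logic_result = check_syntax(logic_error)    # None: nothing is reported

# The logic error only shows up in the result of running the program.
namespace = {}
exec(compile(logic_error, "<example>", "exec"), namespace)
print("syntax check:", syntax_result)
print("logic check:", logic_result, "| age after execution:", namespace["age"])
```

The statement with the wrong sign passes translation without any message, and the incorrect value appears only when the program is executed.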

Linker
• Software often consists of several thousand, or even several million, lines of program code. For software of this size, it is impractical to store all the lines of code in a single source program file, for the following reasons:

  - The large size of the file would make it very difficult, if not impossible, to work with. For example, it might not be possible to load the file for compilation on a computer with limited main memory capacity. Again, while editing the file, it could be very tedious and time-consuming to locate a particular line of code.
  - It would make it difficult to deploy multiple programmers to work concurrently towards the development of the software for completing it within a specified time limit.
  - Any change in the source program, no matter how small, would require the entire source program to be recompiled. Recompilation of large source programs is often a time-consuming process.

• To take care of these problems, a modular approach is generally adopted to develop reasonably sized software. In this approach, the software is divided into functional modules and separate source programs are written for each module of the software.

• Often, there is no need to even write source programs for some of the modules, because there might be programs available in a program library which offer the same functionality. These library programs are maintained in their object code form.
• When the modular approach is used for developing software, the software consists of multiple source program files. Each source program file can be modified and compiled independently of the other source program files to create a corresponding object program file.
• In this case, a program called a ‘linker’ is used to properly combine all the object program files (modules) of the software, and to convert them into the final executable program, which is sometimes called a ‘load module’. Thus, a linker takes object program files (modules) and fits them together to assemble them into the program’s final executable form.

Interpreter
• An ‘interpreter’ is another type of translator, which is used for translating programs written in a high-level language. It takes one statement of a high-level language program, translates it into machine language instructions and then immediately executes the resulting machine language instructions.
• That is, in case of an interpreter, the translation and execution processes alternate for each statement encountered in the high-level language program. This differs from a compiler, which merely translates the entire source program into an object program, and is not involved in its execution.
• As shown in fig. 4.7, the input to an interpreter is the source program, but unlike a compiler, its output is the result of program execution, instead of an object program.

[Figure: the high-level language program (source program) is the input to the interpreter, which translates and executes it statement by statement; the output is the result of program execution.]

Fig. 4.7 Illustrating the role of an interpreter

• After compilation of the source program, the resulting object program is permanently saved for future use and is used every time the program is to be executed. Hence, repeated compilation (translation of the source code) is not necessary for repeated execution of a program.
• However, in case of an interpreter, since no object program is saved for future use, repeated interpretation (translation plus execution) of a program is necessary for its repeated execution.
• Note that, since an interpreter translates and executes a high-level language program statement-by-statement, a program statement must be re-interpreted (translated and executed) every time it is encountered during program execution.
• For example, during the execution of a program, each instruction in a loop will have to be re-interpreted every time the loop is executed.
• As compared to compilers, interpreters are easier to write, because they are less complex programs than compilers. They also require less memory space for execution than compilers.
• The main advantage of interpreters over compilers is that a syntax error in a program statement is detected and brought to the attention of the programmer as soon as the program statement is interpreted. This allows the programmer to make corrections during interactive program development. Therefore, interpreters make it easier and faster to correct programs.

• The main disadvantage of interpreters compared with compilers is that an interpreted program runs slower than a compiled one when running a finished program. This is because each statement is translated from the source program every time it is executed. In case of a compiler, each statement is translated only once and saved in the object program. The saved object program can be executed many times as and when needed, and no translation of any statement is required during the execution of the program. Because the interpreter does not produce an object program, it must perform the translation process each time a program is executed.
• To combine the advantages of both interpreters and compilers, the program development environment of a computer system sometimes provides both a compiler and an interpreter for a high-level language.
• In these cases, the interpreter is used to develop and debug programs. Then, after a bug-free state is reached, the programs are compiled to increase their execution speed. The object program produced by the compiler is subsequently used for routine processing.
• The following are the advantages of high-level languages over assembly and machine languages:

  - Machine independence: A program written in a high-level language can be executed on many different types of computers with very little or practically no effort of porting it to different computers.
  - Easier to learn and use: High-level languages are easier to learn, because they are very similar to the natural languages used by us in our day-to-day life.
  - Fewer errors: While programming in a high-level language, a programmer need not worry about how and where to store the instructions and data of the program, and need not write machine-level instructions for the steps to be carried out by the computer.
  - Lower program preparation cost: Writing programs in high-level languages requires less time and effort, which ultimately leads to lower program preparation cost.
  - Better documentation: The statements of a program written in a high-level language are very similar to the natural language statements used by us in our day-to-day life.
  - Easier to maintain: Programs written in a high-level language are easier to maintain than assembly language or machine language programs.

• The two main limitations of high-level languages are as follows:
  - Lower efficiency: Generally, a program written in a high-level language has lower efficiency than one written in an assembly language or a machine language to do the same job.
  - Less flexibility: Generally, high-level languages are less flexible than assembly languages because they do not normally have instructions or mechanisms to control the computer’s CPU, memory and registers.

• In most cases, the advantages of high-level languages far outweigh the disadvantages. Most computer installations use a high-level language for most programs and use an assembly language for doing special tasks which cannot be easily done otherwise.
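The cost of re-translating loop statements under an interpreter, compared with translating them once under a compiler, can be sketched as follows. The mini-language (statements such as `ADD 5`), the function names and the use of Python functions as a stand-in for machine instructions are all hypothetical choices made for this illustration; the sketch only counts how often translation happens.

```python
translation_count = 0   # how many times a statement has been translated


def translate(statement):
    """Translate one mini-language statement into an executable Python function."""
    global translation_count
    translation_count += 1
    op, value = statement.split()
    amount = int(value)
    if op == "ADD":
        return lambda acc: acc + amount
    if op == "SUB":
        return lambda acc: acc - amount
    raise ValueError(f"unknown statement: {statement}")


def run_interpreted(body, repeat):
    """Translate and immediately execute each statement, every time it is met."""
    acc = 0
    for _ in range(repeat):
        for statement in body:
            acc = translate(statement)(acc)
    return acc


def run_compiled(body, repeat):
    """Translate every statement once, then execute the saved object program."""
    object_program = [translate(statement) for statement in body]
    acc = 0
    for _ in range(repeat):
        for instruction in object_program:
            acc = instruction(acc)
    return acc


LOOP_BODY = ["ADD 5", "SUB 2"]
REPEAT = 10

translation_count = 0
interpreted_result = run_interpreted(LOOP_BODY, REPEAT)
interpreted_translations = translation_count

translation_count = 0
compiled_result = run_compiled(LOOP_BODY, REPEAT)
compiled_translations = translation_count

print(interpreted_result, interpreted_translations)
print(compiled_result, compiled_translations)
```

Both runs compute the same result, but the interpreted run translates each of the two loop statements on every one of the ten passes, while the compiled run translates each statement only once.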

4.7 Operating System (OS)
An operating system is an essential component of a computer system, and its primary objective is to make the computer system convenient to use and to utilise computer hardware in an efficient manner. An OS is a large collection of software which manages the resources of the computer system, such as memory, processor, file system and input/output devices. It keeps track of the status of each resource and decides who will have control over computer resources, for how long and when.

Following are the main functions of an operating system:

Allocating system resources
The operating system directs the traffic inside the computer, deciding what resources will be used and for how long.

• Time: Time in the CPU is divided into time slices, which are measured in milliseconds. Each task which the CPU performs is assigned a certain number of time slices. When time expires, another task gets a turn and the first task must wait until it has another turn. Since time slices are so small, one usually can’t tell that any sharing is going on. Tasks can be assigned priorities so that high priority (foreground) tasks get more time slices than low priority (background) tasks.

• Memory: Memory must also be managed by the operating system. All those rotating turns of CPU use leave data waiting around in buffers. Care must be taken not to lose data. One way to help out the traffic jam is to use virtual memory. This includes disk space as part of main memory. While it is slower to put data on a hard disk, it increases the amount of data that can be held in memory at one time. When the memory chips get full, some of the data is paged out to the hard disk. This is called swapping. Windows uses a swap file for this purpose.
• Input and output: Flow control is also a part of the operating system’s responsibilities. The operating system must manage all requests to read data from disks or tape and all writes to these and to printers. To speed up the output to printers, most operating systems now allow print spooling, where the data to be printed is first put in a file. This frees up the processor for other work in between the times when data is going to the printer. A printer can handle limited input at a time. Without print spooling, one has to wait for a print job to finish before doing anything else. With it, one can request several print jobs and go on working. The print spool will hold all the orders and process them in turn.
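The sharing of CPU time by time slices, described under ‘Time’ above, can be sketched as a simple round-robin schedule. This is a simplified illustration rather than the scheduler of any actual operating system: the function name `round_robin`, the task names and their running times are hypothetical, and priorities are deliberately left out.

```python
from collections import deque


def round_robin(tasks, time_slice):
    """Share the CPU among tasks in fixed time slices.

    tasks maps a task name to the CPU time (in ms) it still needs.
    Returns the order of turns as a list of (task name, ms used) pairs.
    """
    queue = deque(tasks.items())
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        used = min(time_slice, remaining)
        schedule.append((name, used))
        if remaining > used:
            # Time expired before the task finished: it waits for another turn.
            queue.append((name, remaining - used))
    return schedule


schedule = round_robin({"editor": 25, "printer": 10, "backup": 15}, time_slice=10)
for name, used in schedule:
    print(f"{name} runs for {used} ms")
```

With a 10 ms slice, the three tasks take turns until each has received all the CPU time it needs; a task that is not finished when its slice expires goes to the back of the queue.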

Monitoring system activities
• System performance: A user or administrator can check to see whether the computer or network is getting overloaded. Changes could be made to the way tasks are allocated. System performance would include response time and CPU utilisation.
• System security: Some system security is part of the operating system, though additional software can add more security functions. For multiple users who are not all allowed access to everything, there must be a logon or login procedure where the user supplies a user name or ID and a password. An administrator must set up the permissions list of who can have access to what programmes and data.

File and disk management
• Keeping track of what files exist and where they are is a major job. An operating system comes with basic file management commands. A user needs to be able to create directories for storing files.
• A user needs to copy, move, delete and rename files. This is the category of operating system functions that the user actually sees the most.

  - A more technical task is that of disk management. Under some operating systems, a hard disk can be divided or partitioned into several virtual disks.
  - The operating system treats each virtual disk as a physically separate disk. Managing several physical and/or virtual disks can get pretty complex, especially if some of the disks are set up with different operating systems.

Following are the types of operating systems, categorised according to their functions:
• Batch operating system
• Multiprogramming operating system
• Network operating system
• Distributed operating system
• Utility software
• Application software

Application software
• Application software is written to enable the computer to solve a specific data processing task. A number of powerful application software packages, which do not require significant programming knowledge, have been developed.
• These are easy to use and learn, as compared to the programming languages. Although such packages can perform many general and special functions, there are applications where these are not found to be adequate. In such cases, an application programme is written to meet the exact requirements.
• A user application programme may be written using one of these packages or a programming language:

  - Database management software
  - Spreadsheet software
  - Word processing, desktop publishing (DTP), presentation software and graphics software
  - Data communication software
  - Statistical and operational research software

Database management systems
• A database is a collection of data that is to be managed, rearranged and added to later. With a database, one can sort the data by name or city or postal code, or by any individual item of information recorded.
• One can create forms to enter or update or just display the data, or create reports that show just the data of interest, like members who owe dues and so on.
• Both spreadsheets and databases can be used to handle much of the same information, but each is optimised to handle a different type most efficiently. The larger the number of records, the more important the differences are.
• Examples of databases are MS Access, dBase, FoxPro, Paradox, Approach, Oracle and OpenOffice Base.

  - Purpose: managing data
  - Major advantages: can change the way data is sorted and displayed

4.8 Instruction Cycle
An instruction cycle, also known as the fetch-and-execute cycle, the fetch-decode-execute cycle, or FDX, is the basic operation cycle of a computer. It is the process by which a computer retrieves a program instruction from its memory, determines what actions the instruction requires and carries out those actions. This cycle is repeated continuously by the central processing unit (CPU), from boot up till the time the computer is shut down.

It is the time period during which one instruction is fetched from memory and executed when a computer is given an instruction in machine language. There are four stages of an instruction cycle that the CPU carries out:

• Fetch the instruction from memory.
• “Decode” the instruction.
• “Read the effective address” from memory if the instruction has an indirect address.
• “Execute” the instruction.

Each computer’s CPU can have different cycles based on different instruction sets, but they are all similar to the following cycle:

Decode the instruction
The instruction decoder interprets the instruction. If the instruction has an indirect address, the effective address is read from main memory, and any required data is fetched from main memory to be processed and then placed into data registers. During this phase, the instruction inside the IR (instruction register) is decoded.

Execute the instruction
The CU passes the decoded information as a sequence of control signals to the relevant function units of the CPU to perform the actions required by the instruction, such as reading values from registers, passing them to the ALU to perform mathematical or logic functions on them, and writing the result back to a register. If the ALU is involved, it sends a condition signal back to the CU.

Store results
The result generated by the operation is stored in the main memory, or sent to an output device. Based on the condition of any feedback from the ALU, the program counter may be updated to a different address from which the next instruction will be fetched.

The cycle is then repeated.

Different cycles

• Fetch cycle: Step (i) of the instruction cycle is called the Fetch Cycle. This step is the same for each instruction. The fetch cycle processes the instruction from the instruction word, which contains an opcode and an operand.
• Execute cycle: Steps (ii) and (iii) of the instruction cycle are part of the Execute Cycle. These steps change with each instruction.

The first step of the execute cycle is Process-Memory: data is transferred between the CPU and the I/O module. Next is Data-Processing, which applies mathematical as well as logical operations to the data. Control alteration is the next step; it changes the sequence of operations, for example through a jump operation. The last step is a combined operation from all the other steps.

Initiating the cycle
The cycle starts immediately when power is applied to the system, using an initial PC value that is predefined for the system architecture (in Intel IA-32 CPUs, for instance, the predefined PC value is 0xfffffff0). Typically this address points to instructions in a read-only memory (ROM) which begin the process of loading the operating system. (That loading process is called booting.)

The Fetch-Execute cycle in Transfer Notation
Expressed in register transfer notation:
MAR ← PC
MDR ← Memory[MAR]
PC ← PC + 1 (Increment the PC for next cycle)
CIR ← MDR

The registers used above, besides the ones described earlier, are the Memory Address Register (MAR) and the Memory Data Register (MDR), which are used (at least conceptually) in accessing the memory.
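The four register transfers can be traced with a short Python sketch. It is only an illustration: the dictionary of registers, the list used as memory and the instruction strings are hypothetical and do not model any real processor.

```python
# Minimal sketch of the fetch cycle in register transfer notation.
# The memory contents and instruction strings are hypothetical.

def fetch(registers, memory):
    """Perform one fetch cycle and return the updated registers."""
    registers["MAR"] = registers["PC"]           # MAR <- PC
    registers["MDR"] = memory[registers["MAR"]]  # MDR <- Memory[MAR]
    registers["PC"] = registers["PC"] + 1        # PC  <- PC + 1
    registers["CIR"] = registers["MDR"]          # CIR <- MDR
    return registers


memory = ["LOAD 5", "ADD 6", "STORE 7"]  # hypothetical program
registers = {"PC": 0, "MAR": None, "MDR": None, "CIR": None}

fetch(registers, memory)
print(registers["CIR"], registers["PC"])  # prints: LOAD 5 1
```

After one fetch the CIR holds the first instruction and the PC already points at the next one, which is why a jump only needs to overwrite the PC.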


[Figure: flowchart of the fetch-execute cycle with the boxes Start, Load Address To PC, Load PC Contents To MAR, Update PC To Next Address, Load Data Required To MDR, MAR Contents To CIR, Decode CIR Contents, Execute Instruction, Set PC To Value From Jump Instruction and Service Interrupt, and the decisions Jump? and Interrupt? (Yes/No)]

Fig. 4.8 A diagram of the fetch execute cycle

4.9 Program Flow of Control with and without Interrupts
Flow control (also called handshaking or pacing) prevents too fast a flow of bytes from overrunning a terminal, computer, modem or other device. Overrunning occurs when a device cannot process what it is receiving quickly enough and thus loses bytes and/or makes other serious errors.

What flow control does is halt the flow of bytes until the terminal (for example) is ready for some more bytes. Flow control sends its signal to halt the flow in the direction opposite to the flow of bytes it wants to stop. Flow control must be set at both the terminal and the computer. There are 2 types of flow control: hardware and software (Xon/Xoff or DC1/DC3).

Hardware flow control uses dedicated signal wires such as RTS/CTS or DTR/DSR, while software flow control signals by sending DC1 or DC3 control bytes in the normal data wires. For hardware flow control, the cable must be correctly wired. The flow of data bytes in the cable between 2 serial ports is bi-directional, so there are 2 different flows (and wires) to consider:

• byte flow from the computer to the terminal
• byte flow from the terminal keyboard to the computer
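Software flow control can be sketched as a small simulation. The control bytes are the standard ASCII DC1 (0x11, Xon) and DC3 (0x13, Xoff); the `Receiver` class, the buffer watermarks and the `transfer` loop are hypothetical simplifications and not a model of a real serial driver.

```python
from collections import deque

XON = 0x11   # DC1: resume sending
XOFF = 0x13  # DC3: stop sending


class Receiver:
    """Receiver with a small buffer; the watermarks are illustrative values."""

    def __init__(self, high_water=4, low_water=1):
        self.buffer = deque()
        self.high_water = high_water
        self.low_water = low_water
        self.paused = False

    def receive(self, byte):
        """Store a byte; return XOFF if the sender should pause, else None."""
        self.buffer.append(byte)
        if not self.paused and len(self.buffer) >= self.high_water:
            self.paused = True
            return XOFF
        return None

    def process_one(self):
        """Consume one byte; return XON once the buffer has drained enough."""
        if self.buffer:
            self.buffer.popleft()
        if self.paused and len(self.buffer) <= self.low_water:
            self.paused = False
            return XON
        return None


def transfer(data, receiver):
    """Send data byte by byte, obeying XOFF/XON; return the control log."""
    log = []
    sending = True
    pending = deque(data)
    while pending or receiver.buffer:
        if sending and pending:
            reply = receiver.receive(pending.popleft())
            if reply == XOFF:
                sending = False
                log.append("XOFF")
        else:
            reply = receiver.process_one()
            if reply == XON:
                sending = True
                log.append("XON")
    return log


print(transfer(b"hello world", Receiver()))
```

The control signal travels from the receiver back to the sender, opposite to the data it regulates, and no byte is dropped because the sender stops before the buffer overflows.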


Inputs are the signals or data received by the system, and outputs are the signals or data sent from it. The term I/O can also be used as part of an action; to “perform I/O” is to perform an input or output operation.

In computer science, program control flow (or alternatively, program flow of control) refers to the order in which the individual statements, instructions or function calls of an imperative or a declarative program are executed or evaluated. Within an imperative programming language, a control flow statement is a statement whose execution results in a choice being made as to which of two or more paths should be followed.

Interrupts and signals are low-level mechanisms that can alter the flow of control in a way similar to a subroutine, but usually occur as a response to some external stimulus or event (that can occur asynchronously), rather than execution of an ‘in-line’ control flow statement.

Self-modifying code can also be used to affect control flow through its side effects, but usually does not involve an explicit control flow statement. A hardware interrupt causes the processor to save its state of execution and begin execution of an interrupt handler. Software interrupts are usually implemented as instructions in the instruction set, which cause a context switch to an interrupt handler similar to a hardware interrupt.
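The save, handle and resume sequence can be illustrated with a toy interpreter. This is a sketch under simplifying assumptions: the instruction names, the `interrupts` mapping and the `run` function are hypothetical, and the only state saved is the program counter.

```python
def run(program, interrupts):
    """Run a list of instruction names, servicing pending interrupts.

    `interrupts` maps an instruction index to the name of an interrupt that
    becomes pending while that instruction executes (hypothetical setup).
    """
    trace = []
    pc = 0
    while pc < len(program):
        trace.append(f"execute {program[pc]}")
        pending = interrupts.get(pc)
        pc += 1
        if pending is not None:
            saved_pc = pc                        # save the state of execution
            trace.append(f"handler {pending}")   # run the interrupt handler
            pc = saved_pc                        # restore state and resume
    return trace


print(run(["WRITE 1", "COMPUTE", "WRITE 2"], {0: "io-done"}))
```

The interrupt is checked only between instructions, so the user program continues at exactly the point where it was suspended.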

[Figure: three panels, each showing a user program issuing WRITE calls to an I/O program and, where interrupts are used, an interrupt handler: (a) no interrupts; (b) interrupts, short I/O wait; (c) interrupts, long I/O wait]

Fig. 4.9 Program flow of control without and with interrupts

Interrupts are a commonly used technique for computer multitasking, especially in real-time computing. Such a system is said to be interrupt-driven.


Summary
• Software refers to the set of computer programs, computer operations and computer applications. Software guides the computer at every step, including where to start and stop during a particular job.
• Arithmetic is a branch of mathematics that deals with numbers and numerical computation. Computer arithmetic is a branch of computer engineering that deals with methods of representing integers and real values (e.g., fixed- and floating-point numbers) in digital systems and efficient algorithms for manipulating such numbers by means of hardware circuits or software routines.
• In a number system, the base or radix gives the number of symbols used in the system. In earlier days, different civilisations used different radixes. The number system that uses the ten digits 0 through 9 is called the decimal system. The binary system has only two digits, 0 and 1. The octal number system, with its 8 digits 0, 1, 2, 3, 4, 5, 6 and 7, has base 8. Hexadecimal is another number system that works exactly like the decimal, binary and octal number systems, except that the base is 16.
• In computing, floating point describes a system for representing numbers that would be too large or too small to be represented as integers. Numbers are in general represented approximately to a fixed number of significant digits and scaled using an exponent. The base for the scaling is normally 2, 10 or 16.
• In a stack machine model, arithmetic commands pop their operands from the top of the stack and push their results back onto the top of the stack. Stack-based arithmetic is a simple matter: the two top elements are popped from the stack, the required operation is performed on them, and the result is pushed back onto the stack.
• A language that is acceptable to a computer system is called a computer language or programming language. The process of writing instructions in such a language for an already planned program is called programming or coding. The first part of an instruction is the operation code, which tells the computer what function to perform, and the second part is the operand.
• A computer can directly execute only machine language programs. Hence, a high-level language program must be converted (translated) into its equivalent machine language program before it can be executed on the computer. This translation is done with the help of a translator program, which is known as a ‘compiler’.
• An ‘interpreter’ is another type of translator, which is used for translating programs written in a high-level language. It takes one statement of a high-level language program, translates it into machine language instructions, and then immediately executes the resulting machine language instructions.
• An operating system is an essential component of a computer system, and its primary objective is to make the computer system convenient to use and to utilise computer hardware in an efficient manner.
• An instruction cycle is the process by which a computer retrieves a program instruction from its memory, determines what actions the instruction requires, and carries out those actions.



Recommended Reading
• Ram, B., 2000. Computer arithmetic algorithms, A K Peters Ltd Publication.
• Parhami, B., 2009. Computer Fundamentals: Architecture and Organisation, New Age International.
• Larry, L., 2004. Computer Fundamentals, Dreamtech Press.


Self Assessment
1. _____________ is a branch of computer engineering that deals with methods of representing integers and real values in digital systems and efficient algorithms for manipulating such numbers by means of hardware circuits or software routines.
   a. Computer language
   b. Computer arithmetic
   c. Computer architecture
   d. Computer programming

2. Which of the following statements is true?
   a. The decimal numeral system, or base-2 number system, represents numeric values using two symbols, 0 and 1.
   b. The octal numeral system, or base-2 number system, represents numeric values using two symbols, 0 and 1.
   c. The binary numeral system, or base-2 number system, represents numeric values using two symbols, 0 and 1.
   d. The hexadecimal numeral system, or base-2 number system, represents numeric values using two symbols, 0 and 1.

3. The first bit from the right is referred to as ____.
   a. LSB
   b. MSB
   c. CPU
   d. ALU

4. Simply changing all 1s to 0s and all 0s to 1s of a binary number is called ________.
   a. 4’s complement
   b. 3’s complement
   c. 2’s complement
   d. 1’s complement

5. Which of the following statements is false?
   a. The process of writing instructions in such a language for an already planned program is coding.
   b. A language that is acceptable to a computer system is called a programming language.
   c. Machine language is understood by the computer without using a translation program.
   d. Machine and assembly languages are often referred to as ‘high-level programming language’.

6. ______ are large programs, which reside permanently on secondary storage.
   a. Compilers
   b. Assembler
   c. Linker
   d. Interpreter


7. Which of the following statements is false?
   a. A ‘linker’ is used to properly combine all the object program files (modules) of the software, and to convert them into the final executable program.
   b. An ‘interpreter’ is another type of translator, which is used for translating programs written in high-level language.
   c. A compiler can translate only those source programs, which have not been written in the language for which the compiler is meant.
   d. Machine language is understood by the computer without using a translation program.

8. An/A __________ is an essential component of a computer system and its primary objective is to make computer system convenient to use and utilize computer hardware in an efficient manner.
   a. BIOS
   b. operating system
   c. utility software
   d. application software

9. The process by which a computer retrieves a program instruction from its memory, determines what actions the instruction requires, and carries out those actions is known as ________.
   a. Utility software
   b. Application software
   c. data base management
   d. instruction cycle

10. ________ is to prevent too fast of a flow of bytes from overrunning a terminal, computer, modem or other device.
   a. Flow control
   b. Start control
   c. End control
   d. Interrupt control


Chapter V

Communication

Aim

The aim of this chapter is to:

• introduce the concept of analog and digital signals

• explain the concept of signal to noise ratio

• explicate communication channels

Objectives

The objectives of this chapter are to:

• define basic analog and digital signals

• explain the concept of signal to noise ratio

• enlist various communication techniques

Learning outcome

At the end of this chapter, you will be able to:

• define various channel capacity and transmission impairments

• identify and differentiate transmission media

• recognise the causes of impairments


5.1 Introduction
Communication means exchange of ideas from one person or place to another. Individuals mostly communicate by speaking. They also communicate using various forms of recorded information such as written documents, pictures, audios and videos. In all these cases the medium must be physically accessible by the recipient of the information. Thus, either the physical medium containing the information must be transported to the recipient or the recipient must come to the medium.

The widely used system for this type of communication is the postal system with its worldwide network. Much faster and more economical communication is possible using telecommunication systems. Initially, communication of information to be fed to computers or generated by them was also by physical transfer of data storage media such as printed documents, punched cards, tapes and floppies. To some extent this is still used today in a limited way. However, it is more convenient, fast and economical to use telecommunication for the exchange of information between computers.

Telecommunication is the transmission, by electrical means and from one place to another, of data which may originate in alphabetical, numerical or pictorial form. It is used to exchange data between different computer systems and their parts. Networks of communication systems allow them to share peripherals, data and programs. Many technologies have been applied to the office to improve business communication and information processing. These include networks of computing facilities offering services like distributed processing, hardware facilities sharing, data transfer, e-mail, voice mail, and video conferencing.

5.2 Analog and Digital Communication
A signal is a time-varying or spatially varying quantity. In a communication system, a transmitter encodes a message into a signal, which is carried to a receiver by the communications channel. In information theory, a signal is a codified message, that is, the sequence of states in a communication channel that encodes a message.

There are two types of signals that carry information: analog and digital signals. The difference between analog and digital signals is that analog is a continuous electrical signal, whereas digital is a non-continuous electrical signal. Analog signals vary in time, and the variations follow that of the non-electric signal. When compared to analog signals, digital signals change in individual steps and consist of pulses or digits. Digital signals have discrete levels, and the specified value of the pulse remains constant until the change in the next digit.

Communication signals can be either analog signals or digital signals. Accordingly, there are analog communication systems and digital communication systems. For an analog signal, the signal is varied continuously with respect to the information. In a digital signal, the information is encoded as a set of discrete values (for example, a set of ones and zeros). During propagation and reception, the information contained in analog signals will inevitably be degraded by undesirable physical noise. (The output of a transmitter is noise-free for all practical purposes.)

Commonly, the noise in a communication system can be expressed as adding or subtracting from the desirable signal in a completely random way. This form of noise is called “additive noise”, with the understanding that the noise can be negative or positive at different instants of time. Noise that is not additive noise is a much more difficult situation to describe or analyse, and these other kinds of noise will be omitted here.

On the other hand, unless the additive noise disturbance exceeds a certain threshold, the information contained in digital signals will remain intact. Their resistance to noise represents a key advantage of digital signals over analog signals.
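The threshold behaviour can be made concrete with a small Python sketch. The signalling levels (+1 for a one, -1 for a zero), the uniform noise model and the function names are assumptions made for illustration; real channels are usually modelled with Gaussian noise.

```python
import random


def transmit(bits, noise_amplitude, seed=0):
    """Map bits to +1/-1 levels and add uniform noise in [-a, +a]."""
    rng = random.Random(seed)
    return [(1.0 if b else -1.0) + rng.uniform(-noise_amplitude, noise_amplitude)
            for b in bits]


def decode(levels):
    """Decide each bit by comparing the received level with the threshold 0."""
    return [1 if level > 0 else 0 for level in levels]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
received = transmit(bits, noise_amplitude=0.8)
print(decode(received) == bits)  # prints: True
```

Every received level differs from the level that was sent, yet all bits are recovered, because the noise amplitude (0.8) stays below the distance from each level to the decision threshold (1.0). An analog receiver has no such threshold and would keep the disturbed values.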


[Figure panels: Analog; Digital]

Fig. 5.1 Analog and digital signal

5.2.1 Transmission Impairments
Signals travel through transmission media, which are not perfect. The imperfection causes signal impairment. This means that the signal at the beginning of the medium is not the same as the signal at the end of the medium. What is sent is not what is received. Three causes of impairment are:

• attenuation
• distortion
• noise

[Figure: impairment causes, branching into Attenuation, Distortion and Noise]

Fig. 5.2 Causes of impairment

• Attenuation: Attenuation means loss of energy, which results in a weaker signal. When a signal travels through a medium, it loses energy in overcoming the resistance of the medium. Amplifiers are used to compensate for this loss of energy by amplifying the signal.
Measurement of attenuation: to show the loss or gain of energy, the unit “decibel” (dB) is used:

$\mathrm{dB} = 10 \log_{10}(P_2 / P_1)$

where P1 is the power of the input signal and P2 is the power of the output signal.
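As a worked example of the decibel formula, the sketch below evaluates 10 log10(P2/P1) for two cases. The function name `gain_db` and the power values are chosen only for illustration.

```python
import math


def gain_db(p_in, p_out):
    """Return 10*log10(P2/P1); negative values mean attenuation."""
    return 10 * math.log10(p_out / p_in)


print(round(gain_db(10.0, 5.0), 2))   # power halved: -3.01
print(round(gain_db(1.0, 10.0), 2))   # power amplified tenfold: 10.0
```

A signal whose power is halved is attenuated by about 3 dB, while a tenfold amplification corresponds to a gain of 10 dB.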


[Figure: the original signal at Point 1 passes through the transmission medium and arrives attenuated at Point 2; an amplifier restores it, giving the amplified signal at Point 3]

Fig. 5.3 Attenuation

• Distortion: Distortion means that the signal changes its form or shape. It occurs in composite signals. Each frequency component has its own propagation speed when travelling through a medium. The different components therefore arrive with different delays at the receiver, which means that the components have different phases at the receiver than they had at the source.

[Figure: at the sender, the composite signal sent and its components, in phase; at the receiver, the components, out of phase, and the composite signal received]

Fig. 5.4 Distortion

• Noise: There are different types of noise:
  - Thermal: random motion of electrons in the wire creates an extra signal.
  - Induced: comes from motors and appliances; these devices act as a transmitting antenna and the medium acts as a receiving antenna.
  - Crosstalk: same as above, but between two wires.
  - Impulse: spikes that result from power lines, lightning, etc.

[Figure: the transmitted signal at Point 1, the noise added in the transmission medium, and the received signal at Point 2]

Fig. 5.5 Noise


5.2.2 Signal to Noise Ratio
The word noise means any unwanted sound. In both analog and digital electronics, noise is an unwanted perturbation to a wanted signal. It is called noise, as it is a generalisation of the audible noise heard when listening to a weak radio transmission.

Signal noise is heard as acoustic noise if played through a loudspeaker. Noise can block, distort, change or interfere with the meaning of a message in human, animal and electronic communication. In signal processing or computing it can be considered unwanted data without meaning, that is, data that is not being used to transmit a signal but is simply produced as an unwanted by-product of other activities.

In information theory, however, noise is still considered to be information. In a broader sense, film grain or even advertisements encountered while looking for something else can be considered noise.

“Signal-to-noise ratio” is sometimes used informally to refer to the ratio of useful information to false or irrelevant data in a conversation or exchange, such as off-topic posts and spam in online discussion forums and other online communities.

Signal-to-noise ratio (SNR or S/N) is a measure used in science and engineering to quantify how much a signal has been corrupted by noise. It is defined as the ratio of signal power to the noise power corrupting the signal. A ratio higher than 1:1 indicates more signal than noise. While SNR is commonly quoted for electrical signals, it can be applied to any form of signal. Signal-to-noise ratio is the power ratio between a signal (meaningful information) and the background noise (unwanted signal):

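$$\mathrm{SNR} = \frac{P_{\mathrm{signal}}}{P_{\mathrm{noise}}}$$

where P is average power.

• Signal and noise power must be measured at the same or equivalent points in a system, and within the same system bandwidth. If the signal and the noise are measured across the same impedance, then the SNR can be obtained by calculating the square of the amplitude ratio:

$$\mathrm{SNR} = \frac{P_{\mathrm{signal}}}{P_{\mathrm{noise}}} = \left(\frac{A_{\mathrm{signal}}}{A_{\mathrm{noise}}}\right)^2$$

where A is root mean square (RMS) amplitude.

• Because many signals have a very wide dynamic range, SNRs are often expressed using the logarithmic decibel scale. In decibels, the SNR is defined as

$$\mathrm{SNR_{dB}} = 10 \log_{10}\left(\frac{P_{\mathrm{signal}}}{P_{\mathrm{noise}}}\right) = P_{\mathrm{signal,dB}} - P_{\mathrm{noise,dB}},$$

which may equivalently be written using amplitude ratios as:

$$\mathrm{SNR_{dB}} = 10 \log_{10}\left(\frac{A_{\mathrm{signal}}}{A_{\mathrm{noise}}}\right)^2 = 20 \log_{10}\left(\frac{A_{\mathrm{signal}}}{A_{\mathrm{noise}}}\right)$$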

• SNR is usually taken to indicate an average signal-to-noise ratio, as it is possible that (near) instantaneous signal-to-noise ratios will be considerably different. The concept can be understood as normalising the noise level to 1 (0 dB) and measuring how far the signal ‘stands out’.
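The two ways of computing SNR in decibels, from a power ratio (factor 10) and from an RMS amplitude ratio (factor 20), can be checked numerically. The function names and sample values below are illustrative assumptions, and the amplitudes are assumed to be measured across the same impedance.

```python
import math


def snr_db_from_power(p_signal, p_noise):
    """SNR in dB from average signal and noise power."""
    return 10 * math.log10(p_signal / p_noise)


def snr_db_from_amplitude(a_signal, a_noise):
    """SNR in dB from RMS amplitudes measured across the same impedance."""
    return 20 * math.log10(a_signal / a_noise)


# An amplitude ratio of 10 corresponds to a power ratio of 10**2 = 100.
print(snr_db_from_power(100.0, 1.0))      # prints: 20.0
print(snr_db_from_amplitude(10.0, 1.0))   # prints: 20.0
```

Both routes give 20 dB because power is proportional to the square of the amplitude.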


[Figure: two panels, (a) large SNR and (b) small SNR, each showing the signal, the noise, and the signal plus noise]

Fig. 5.6 Two cases of SNR: a high SNR and a low SNR

5.2.3 Hamming Error-Correction Codes
Messages that are transmitted over a communication channel can be damaged; their bits can be masked or inverted by noise. Detecting and correcting these errors is important. Some simple codes can detect but not correct errors; others can detect and correct one or more errors. A Hamming code can:

• correct a single-bit error
• detect a double-bit error

Parity checking
One simple way to detect errors is:

• Count the number of ones in the binary message.
• Append one more bit, called the parity bit, to the message.
• Set the parity bit to either 0 or 1 so that the number of ones in the result is even. For example, if the original message contained 17 ones, the parity bit would be a one; if there had been 16 ones, the parity bit would be a zero.
• Count the number of ones in the received message, including the parity bit. The result will always be even if no errors were encountered. (This approach also works if the parity bit is set to make the count come out odd, as long as the receiver checks for an odd count.)

This simple check does have two limitations: it only detects errors, without being able to correct them; and it can’t detect errors that invert an even number of bits.
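The parity procedure and both of its limitations can be demonstrated in a few lines of Python. Representing a message as a list of 0/1 integers and the function names are choices made for this sketch.

```python
def add_even_parity(bits):
    """Append a parity bit so that the total number of ones is even."""
    return bits + [sum(bits) % 2]


def parity_ok(word):
    """Return True if the received word has an even number of ones."""
    return sum(word) % 2 == 0


message = [1] * 17 + [0] * 3          # a message containing 17 ones
word = add_even_parity(message)
print(word[-1], parity_ok(word))      # prints: 1 True

one_error = word.copy()
one_error[0] ^= 1                     # invert a single bit
print(parity_ok(one_error))           # prints: False (error detected)

two_errors = one_error.copy()
two_errors[1] ^= 1                    # invert a second bit
print(parity_ok(two_errors))          # prints: True (error missed)
```

The single error is detected but cannot be located, and the double error passes the check unnoticed.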

Hamming code approach
Hamming codes are an extension of this simple method that can be used to detect and correct a larger set of errors. The basic idea is to have several parity bits (called check bits in Hamming codes) and assign different bits to several overlapping groups. If some parity bits are correct and others are not, the bit in error can be deduced.

A very simple example may help to understand this approach. Suppose that the original message is four bits long, numbered from 1 to 4. Then add three check bits as follows:

• Check bit 1 establishes even parity over itself and data bits 1 and 2 (the first two).
• Check bit 2 establishes even parity over itself and data bits 1 and 3 (the odd-numbered bits).
• Check bit 3 establishes even parity over itself and data bits 2 and 4 (the even-numbered bits).

Assuming that only one bit can be in error, it can be determined with the following table:

Check bits in error    Data bit in error
None                   None
1 and 2                1 (odd-numbered bit of the first two)
1 and 3                2 (even-numbered bit of the first two)
2 only                 3 (odd-numbered bit of the second two)
3 only                 4 (even-numbered bit of the second two)

Table 5.1 Bit and data bit in error

Although this code works, it is fairly inefficient. There are almost as many check bits as data bits.
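The three check bits and the lookup in Table 5.1 can be sketched in Python as follows. The bit ordering of the code word (data bits first, then check bits) and the function names are assumptions of this sketch. Like the table, it assumes that the single error falls on a data bit; patterns the table does not list, such as check bit 1 failing alone, leave the data unchanged.

```python
def encode(data):
    """Return the four data bits followed by three even-parity check bits."""
    d1, d2, d3, d4 = data
    c1 = d1 ^ d2   # even parity over c1, d1, d2
    c2 = d1 ^ d3   # even parity over c2, d1, d3
    c3 = d2 ^ d4   # even parity over c3, d2, d4
    return [d1, d2, d3, d4, c1, c2, c3]


# Failing check bits -> data bit in error (1-based), as in Table 5.1.
SYNDROME_TO_DATA_BIT = {
    (): None,
    (1, 2): 1,
    (1, 3): 2,
    (2,): 3,
    (3,): 4,
}


def correct(word):
    """Return the data bits of a received word, fixing one data-bit error."""
    d1, d2, d3, d4, c1, c2, c3 = word
    checks = [c1 ^ d1 ^ d2, c2 ^ d1 ^ d3, c3 ^ d2 ^ d4]
    failing = tuple(i for i, bad in enumerate(checks, start=1) if bad)
    data = [d1, d2, d3, d4]
    bit = SYNDROME_TO_DATA_BIT.get(failing)  # None for unlisted patterns
    if bit is not None:
        data[bit - 1] ^= 1
    return data


word = encode([1, 0, 1, 1])
word[2] ^= 1                 # corrupt data bit 3 in transit
print(correct(word))         # prints: [1, 0, 1, 1]
```

Each data bit belongs to a different combination of parity groups, so the pattern of failing checks identifies which data bit to invert.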

5.2.4 Channel Capacity
In information theory, the noisy-channel coding theorem (Shannon’s theorem) establishes that for any given degree of noise contamination of a communication channel, it is possible to communicate discrete data (digital information) nearly error-free up to a computable maximum rate through the channel. This result was presented by Claude Shannon in 1948 and was based in part on earlier work and ideas of Harry Nyquist and Ralph Hartley.

The Shannon limit or Shannon capacity of a communications channel is the theoretical maximum information transfer rate of the channel, for a particular noise level. The Shannon theorem states that given a noisy channel with channel capacity C and information transmitted at a rate R, then if R < C there exist codes that allow the probability of error at the receiver to be made arbitrarily small. This means that, theoretically, it is possible to transmit information nearly without error at any rate below a limiting rate, C.

The converse is also important. If R > C, an arbitrarily small probability of error is not achievable. All codes will have a probability of error greater than a certain positive minimal level, and this level increases as the rate increases. So, information cannot be guaranteed to be transmitted reliably across a channel at rates beyond the channel capacity. The theorem does not address the rare situation in which rate and capacity are equal. The channel capacity C can be calculated from the physical properties of a channel; for a band-limited channel with Gaussian noise, it is given by the Shannon–Hartley theorem.
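The text names the Shannon–Hartley theorem without stating it. In its standard form, the capacity of a band-limited channel with Gaussian noise is C = B log2(1 + S/N), with B the bandwidth in hertz and S/N the linear (not decibel) signal-to-noise ratio. The sketch below evaluates that formula; the function name and the example channel (3000 Hz, 30 dB) are illustrative assumptions.

```python
import math


def shannon_hartley_capacity(bandwidth_hz, snr_linear):
    """Capacity in bits per second: C = B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1 + snr_linear)


snr_db = 30.0
snr_linear = 10 ** (snr_db / 10)   # 30 dB corresponds to a ratio of 1000
capacity = shannon_hartley_capacity(3000.0, snr_linear)
print(round(capacity))             # roughly 29.9 kbit/s
```

No coding scheme can push reliable transmission over this channel beyond that rate, however clever the code.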

Mathematical representation of the theorem

[Figure: Transmitter → Channel (noisy) → Receiver]

Theorem (Shannon, 1948):
• For every discrete memoryless channel, the channel capacity

$$C = \sup_{p_X} I(X;Y)$$

has the following property. For any ε > 0 and R < C, for large enough N, there exists a code of length N and rate ≥ R and a decoding algorithm, such that the maximal probability of block error is ≤ ε.

• If a probability of bit error $p_b$ is acceptable, rates up to $R(p_b)$ are achievable, where

$$R(p_b) = \frac{C}{1 - H_2(p_b)}$$

and $H_2(p_b)$ is the binary entropy function

$$H_2(p_b) = -\left[p_b \log_2 p_b + (1 - p_b) \log_2 (1 - p_b)\right]$$

• For any $p_b$, rates greater than $R(p_b)$ are not achievable.
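The second part of the theorem can be explored numerically. The sketch assumes the standard forms R(p_b) = C / (1 - H2(p_b)) and H2(p) = -[p log2 p + (1 - p) log2(1 - p)]; the function names and the sample capacity are illustrative.

```python
import math


def binary_entropy(p):
    """H2(p) in bits; defined as 0 at p = 0 and p = 1."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def achievable_rate(capacity, p_b):
    """R(p_b) = C / (1 - H2(p_b)), for bit error rates 0 <= p_b < 0.5."""
    if not 0.0 <= p_b < 0.5:
        raise ValueError("p_b must satisfy 0 <= p_b < 0.5")
    return capacity / (1 - binary_entropy(p_b))


print(binary_entropy(0.5))                   # prints: 1.0
print(achievable_rate(1000.0, 0.0))          # prints: 1000.0
print(achievable_rate(1000.0, 0.1) > 1000.0) # prints: True
```

With no tolerated bit errors the achievable rate equals the capacity, and accepting some bit errors allows rates above C, which is what the second bullet of the theorem expresses.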

5.3 Communication Channels
To network computers, a physical or non-physical connection needs to be established. This can be achieved using communication channels. A communication channel is the path (the transmission medium) over which information travels in a communication system from its source to its destination. Channels are also called links, lines, or media.


Communication devices use analog electromagnetic signals representing data to transmit information from one device to another. Electromagnetic signals can travel through vacuum, air or other transmission media like wire, fibre optics and so on.

Communication channels which use a physical medium for transmission (twisted-pair wire, coaxial cable, and fibre-optic cable) are called wired channels. Communication channels which do not require any physical medium for transmission (radio, microwave and communication satellite) are called wireless channels.

The basis for all communication channels, both wired and wireless, is the electromagnetic spectrum. The spectrum covers frequencies for voice, radio waves, infrared light, visible light, ultraviolet light, and X-rays, gamma rays and cosmic rays.

5.3.1 Wired Channels (Twisted-pair Wire, Coaxial Cable and Fibre-optic Cable)
These media are also called guided media since they provide a conduit from one device to another. A signal travelling along any of these media is directed and contained by the physical limits of the medium.

Wired communication channels use the following physical media:

• twisted-pair wire
• coaxial cable
• fibre-optic cable

Optical fibre is a glass or plastic cable that accepts and transports signals in the form of light. Let us look at each of these media more closely.

Twisted-pair wire
Twisted-pair and coaxial cable use metallic (copper) conductors that accept and transport signals in the form of electrical current. Twisted-pair wire is of two types:

• Unshielded Twisted Pair (UTP) is the most common type and is also used in telephone lines. A twisted pair consists of two conductors (copper), each with its own coloured plastic insulation, twisted around each other. The twisted-pair configuration reduces interference from electrical fields as compared to a parallel-pair configuration. Unshielded twisted pair is currently the cable standard for most networks. It is relatively inexpensive, easy to install, very reliable and easy to maintain and expand. UTP supports a maximum data rate of 155 Mbps.
• Shielded Twisted Pair (STP) wire has a metal foil or braided mesh covering which encases each pair of insulated conductors. The metal casing prevents the penetration of electromagnetic noise, and the quality of transmission improves. In all other respects it resembles UTP.

Coaxial cable
Coaxial cable (simply called coax) has a central core conductor of solid or stranded wire (usually copper) enclosed in an insulating sheath which is, in turn, encased in an outer conductor of metal foil, braid or a combination of the two (usually copper). The outer metallic wrapping serves both as a shield against noise and as the second conductor, which completes the circuit. This outer conductor is also enclosed in an insulating sheath and the whole cable is protected by a plastic cover.

Coaxial cable carries signals of higher frequency ranges than twisted-pair wire. Often many coaxial cables are bundled together. As a result of extra insulation, coaxial cable is much better than twisted pair wiring at resisting noise. Also, it is faster than UTP (supports a maximum data rate of 200 Mbps).

Fibre-optic cable
The cable consists of a core made of fine glass or plastic fibre. The core is surrounded by a refractive layer called the cladding that effectively traps the light and keeps it bouncing along the central fibre. In most cases, the cladding is covered by a buffer layer that protects it from moisture.


The entire cable is encased in an outer jacket. Both core and cladding can be made of glass or plastic but must be of different densities. In addition, the core must be ultra pure and completely regular to ensure distortion-free signals. Since light has a higher frequency on the electromagnetic spectrum than other types of radiation such as radio waves, a single fibre-optic channel can carry more information than most other means of information transmission. Hundreds of strands of optical fibres (each as thin as a human hair) can be housed in a single fibre-optic cable. They represent the most promising type of transmission medium and their usage is fast increasing with time.

The major advantages of fibre-optic cable are:
• Noise resistance: since fibre-optic transmission uses light rather than electricity, noise (electromagnetic interference) is not a factor.
• Less signal attenuation: the signal can run for miles without requiring regeneration.
• Higher bandwidth: a fibre-optic cable can support much higher bandwidth than both twisted-pair and coaxial cable. It can support data rates of the order of gigabits per second (Gbps). The data rates are limited not by the medium but by the signal generation and reception technology available.

The main disadvantages of fibre-optic cable are:
• high cost
• difficulties in their installation/maintenance
• fragility

5.3.2 Wireless Channels (Radio Link, Microwave Link, Satellite Communication)
Wireless channels transport electromagnetic waves from one point to another through the atmosphere or space without using a physical conductor. The section of the electromagnetic spectrum designated for wireless channels, called the radio spectrum, ranges from 3 kHz to 300 GHz and is divided into eight bands, each regulated by government authorities. These bands are rated from very low frequency (VLF) to extremely high frequency (EHF). Radio link, microwave link and satellite communication utilise frequencies in the radio spectrum for data communication.

• Radio link
  ◦ Radio link is also called broadcast radio, which deals with transmission of data over long distances. A transmitter is required to send messages and a receiver to receive them.
  ◦ Depending upon the type of service, it uses a range of frequencies (3 kHz to 30 MHz). In the lower frequencies of the radio spectrum, several broadcast radio bands are reserved for conventional AM/FM radio, broadcast television and private radio services.
  ◦ Radio link can support a bandwidth up to 2 Mbps. It is easy to install and involves low recurring costs.

• Microwave link
  ◦ Microwave link is also called microwave radio, which utilises point-to-point radio transmissions at the super-high frequency (SHF) and extremely high frequency (EHF) bands. Microwaves do not follow the curvature of the earth and therefore require line-of-sight transmission and reception equipment.
  ◦ Microwave dishes which contain transceivers (sending and receiving devices) and antennas are set up on towers or buildings to establish the link.
  ◦ Microwave stations need to be placed at some distance (a few kilometres) from each other with no obstruction in between. The size of the dish varies with the distance.
  ◦ A string of microwave relay stations is used, with each station receiving incoming messages, boosting the signal strength, and relaying the signal to the next station.
  ◦ Microwave link supports a bandwidth up to 45 Mbps and is widely used in data communication.

• Satellite communication
  ◦ To overcome the line-of-sight constraint of microwave earth stations, communication satellites (microwave 'sky stations') are used. Communication satellites are microwave relay stations in orbit around the earth.
  ◦ Transmitting a signal from a ground station to a satellite is called uplinking and the reverse is called downlinking.
  ◦ Geosynchronous satellites are most commonly used in data communication. A geosynchronous satellite is placed in geostationary earth orbit (nearly 36,000 km directly above the equator), where it completes one orbit in the same time the earth takes to rotate once and so appears to an observer on the ground to be stationary.
  ◦ Consequently, microwave earth stations are always able to beam signals to a fixed location above. The orbiting satellite has solar-powered transceivers to receive the signals, amplify them and re-transmit them to another earth station.
  ◦ Satellite communication provides transmission capabilities to and from any location on earth, no matter how remote.
  ◦ This advantage makes high quality communication available to less developed regions without requiring huge investment in ground-based infrastructure.
  ◦ Satellite communication supports high bandwidths capable of carrying large amounts of data and ensures low error rates.
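The 36,000 km altitude quoted above implies a noticeable signal delay. The following is a minimal sketch of that arithmetic, assuming the ground station sits directly beneath the satellite and the signal travels at the vacuum speed of light; the function name and constants are illustrative and not taken from the text.

```python
# Approximate propagation delay to a geostationary satellite.
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum, km/s
GEO_ALTITUDE_KM = 36_000            # approximate altitude quoted in the text

def propagation_delay_s(distance_km: float) -> float:
    """Time for a radio signal to cover distance_km, in seconds."""
    return distance_km / SPEED_OF_LIGHT_KM_S

uplink = propagation_delay_s(GEO_ALTITUDE_KM)   # ground station -> satellite
hop = 2 * uplink                                # ground -> satellite -> ground
print(f"uplink: {uplink * 1000:.0f} ms, one hop: {hop * 1000:.0f} ms")
# prints: uplink: 120 ms, one hop: 240 ms
```

A delay of roughly a quarter of a second per hop is small for bulk data transfer but is perceptible in interactive use such as voice calls.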

5.4 Transmission Technology
After computers have been connected using one of the communication channels described above, data needs to be transmitted from one computer to another. A transmission technology is needed to do so. There are two types of transmission technologies:

• broadcast networks
• point-to-point or switched networks

5.4.1 Broadcast Networks
Broadcast networks have a single communication channel that is shared by all the machines on the network. In this type of network, short messages (packets) sent by any machine are received by all the machines on the network. Each packet contains an address field, which specifies for whom the packet is intended. Every machine, upon receiving a packet, checks the address field: if the packet is intended for that machine, it processes the packet; if not, the packet is simply ignored. This mode of operation is known as Broadcasting. Some broadcast networks also support transmission to a subset of machines and this is known as Multicasting.
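The address check described above can be sketched in a few lines. This is only an illustration under simple assumptions: the `Packet` and `Machine` classes, the integer addresses and the `broadcast` helper are hypothetical names introduced here, not part of any real network stack.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    dest: int        # address field: the intended receiver
    payload: str

class Machine:
    def __init__(self, address: int):
        self.address = address
        self.inbox: list[str] = []

    def receive(self, packet: Packet) -> None:
        # Every machine sees every packet; only the addressee processes it.
        if packet.dest == self.address:
            self.inbox.append(packet.payload)

def broadcast(channel: list[Machine], packet: Packet) -> None:
    """Deliver the packet to every machine on the shared channel."""
    for machine in channel:
        machine.receive(packet)

machines = [Machine(a) for a in (1, 2, 3)]
broadcast(machines, Packet(dest=2, payload="hello"))
print([m.inbox for m in machines])   # [[], ['hello'], []]
```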

One possible way to achieve multicasting is to reserve one bit to indicate multicasting, while the remaining (n-1) address bits contain a group number. Each machine can subscribe to any or all of the groups. Broadcast networks are easily configured for geographically localised networks. Broadcast networks may be static or dynamic, depending on how the channel is allocated.
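The reserved-bit scheme can be sketched as follows. The text does not say which bit is reserved or how wide the address is, so this sketch assumes an 8-bit address with the high-order bit as the multicast flag; all names are hypothetical.

```python
ADDRESS_BITS = 8                            # n, chosen only for illustration
MULTICAST_FLAG = 1 << (ADDRESS_BITS - 1)    # assumed: the high-order bit is the reserved one
GROUP_MASK = MULTICAST_FLAG - 1             # the remaining n-1 bits

def make_multicast_address(group: int) -> int:
    if not 0 <= group <= GROUP_MASK:
        raise ValueError("group number does not fit in n-1 bits")
    return MULTICAST_FLAG | group

def is_multicast(address: int) -> bool:
    return bool(address & MULTICAST_FLAG)

def group_of(address: int) -> int:
    return address & GROUP_MASK

def accepts(own_address: int, subscriptions: set[int], dest: int) -> bool:
    """A machine accepts its own unicast address or any group it subscribes to."""
    if is_multicast(dest):
        return group_of(dest) in subscriptions
    return dest == own_address

addr = make_multicast_address(5)
print(f"{addr:08b}")                  # 10000101
print(accepts(3, {5, 9}, addr))       # True: subscribed to group 5
print(accepts(4, {9}, addr))          # False: not subscribed to group 5
```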

The different types of broadcast networks are:
• packet radio networks
• satellite networks
• local area networks

Packet radio broadcasting differs from satellite network broadcasting in several ways. In particular, stations have limited range, introducing the need for radio repeaters, which in turn affects the routing and acknowledgement schemes. Also, the propagation delay is much less than for satellite broadcasting.

LAN (Local Area Network) is a computer network that spans a relatively small area. Most LANs are confined to a single building or group of buildings within a campus. However, one LAN can be connected to other LANs over any distance via telephone lines and radio waves. A system of LANs connected in this way is called a wide-area network (WAN). There are many different types of LANs, Ethernets being the most common for PCs.

5.4.2 Point-to-Point or Switched Networks
Point-to-point or switched networks are those in which there are many connections between individual pairs of machines. In these networks, when a packet travels from source to destination, it may have to first visit one or more intermediate machines.

Routing algorithms play an important role in point-to-point or switched networks because often multiple routes of different lengths are available. An example of a switched network is the international dial-up telephone system. In a switched network, a temporary connection is established from one point to another for either the duration of the session (circuit switching) or for the transmission of one or more packets of data (packet switching).

Two types of point-to-point or switched networks are:

• Circuit switched networks
  ◦ Circuit switched networks use a networking technology that provides a temporary, but dedicated connection between two stations no matter how many switching devices are used in the data transfer route.
  ◦ This was originally developed for the analog based telephone system in order to guarantee steady and consistent service for two people engaged in a phone conversation.
  ◦ Analog circuit switching has given way to digital circuit switching, and the digital counterpart still maintains the connection until broken (one side hangs up). This means bandwidth is continuously reserved and "silence is transmitted" just the same as digital audio in voice conversation.

• Packet switched networks
  ◦ Packet switched networks use a networking technology that breaks up a message into smaller packets for transmission and switches them to their destination.
  ◦ Unlike circuit switching, which requires a constant point-to-point circuit to be established, each packet in a packet-switched network contains a destination address. Thus, all packets in a single message do not have to travel the same path. They can be dynamically routed over the network as lines become available or unavailable.
  ◦ The destination computer reassembles the packets back into their proper sequence. Packet switching efficiently handles messages of different lengths and priorities.
  ◦ Higher-level protocols, such as TCP/IP, IPX/SPX and NetBIOS, are also packet based and are designed to ride over packet-switched topologies. Public packet switching networks may provide value added services, such as protocol conversion and electronic mail.
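The splitting and reassembly steps described for packet switching can be sketched as below. The sequence number carried in each packet is an assumption made for this illustration (the text only says the destination restores the proper sequence), and `packetise`, `reassemble` and the `host-B` address are hypothetical names.

```python
import random

def packetise(message: str, dest: str, size: int) -> list[dict]:
    """Split message into packets, each carrying the destination and a sequence number."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [{"dest": dest, "seq": n, "data": chunk} for n, chunk in enumerate(chunks)]

def reassemble(packets: list[dict]) -> str:
    """Restore the original order using the sequence numbers."""
    return "".join(p["data"] for p in sorted(packets, key=lambda p: p["seq"]))

packets = packetise("packets may take different routes", dest="host-B", size=8)
random.shuffle(packets)          # stand-in for packets arriving out of order
print(reassemble(packets))       # packets may take different routes
```

Shuffling the list stands in for dynamic routing: because every packet carries its own destination and position, the order of arrival does not matter.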

5.5 Modulation
Modulation is the process of conveying a message signal, for example a digital bit stream or an analog audio signal, inside another signal that can be physically transmitted. Modulation of a sine waveform is used to transform a baseband message signal to a passband signal, for example a radio-frequency signal (RF signal). In radio communications, cable TV systems or the public switched telephone network, for instance, electrical signals can only be transferred over a limited passband frequency spectrum, with specific (non-zero) lower and upper cut-off frequencies.

Modulating a sine wave carrier makes it possible to keep the frequency content of the transferred signal as close as possible to the centre frequency (typically the carrier frequency) of the passband. When coupled with demodulation, this technique can be used to, among other things, transmit a signal through a channel which may be opaque to the baseband frequency range (for instance, when sending a telephone signal through a fibre-optic strand).

The aim of digital modulation is to transfer a digital bit stream over an analog bandpass channel, for example over the public switched telephone network (where a bandpass filter limits the frequency range to between 300 and 3400 Hz), or over a limited radio frequency band. The aim of analog modulation is to transfer an analog baseband (or lowpass) signal, for example an audio signal or TV signal, over an analog bandpass channel, for example a limited radio frequency band or a cable TV network channel.

Analog and digital modulation facilitate frequency division multiplexing (FDM), where several low pass information signals are transferred simultaneously over the same shared physical medium, using separate passband channels. The aim of digital baseband modulation methods, also known as line coding, is to transfer a digital bit stream over a baseband channel, typically a non-filtered copper wire such as a serial bus or a wired local area network. The aim of pulse modulation methods is to transfer a narrowband analog signal, for example a phone call, over a wideband baseband channel or, in some of the schemes, as a bit stream over another digital transmission system.

Fig. 5.7 Modulation (panels: carrier; digital modulation wave for the bit sequence 0 1 0 0; modulated result)
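To make the idea of a digital bit stream modulating a sine carrier concrete, here is a minimal sketch using amplitude-shift keying (on-off keying). The choice of ASK is an assumption for illustration; the figure does not state which scheme it depicts. The function names, the four carrier cycles per bit and the energy threshold are likewise illustrative.

```python
import math

def ask_modulate(bits, carrier_cycles_per_bit=4, samples_per_bit=32):
    """Amplitude-shift keying: send the carrier for a 1 bit, silence for a 0 bit."""
    samples = []
    for i, bit in enumerate(bits):
        for k in range(samples_per_bit):
            t = i + k / samples_per_bit                  # time measured in bit periods
            carrier = math.sin(2 * math.pi * carrier_cycles_per_bit * t)
            samples.append(bit * carrier)                # the bit value scales the carrier
    return samples

def ask_demodulate(samples, samples_per_bit=32):
    """Recover bits by comparing the mean energy in each bit period with a threshold."""
    bits = []
    for start in range(0, len(samples), samples_per_bit):
        window = samples[start:start + samples_per_bit]
        energy = sum(s * s for s in window) / len(window)
        bits.append(1 if energy > 0.25 else 0)
    return bits

signal = ask_modulate([0, 1, 0, 0])
print(ask_demodulate(signal))    # [0, 1, 0, 0]
```

A full-amplitude sine has a mean squared value of 0.5, so a threshold of 0.25 separates "carrier present" from "carrier absent" in this noise-free sketch.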

Summary
• Communication means exchange of ideas from one person or place to another. The widely used system for this type of communication is the postal system with its worldwide network. Much faster and more economical communication is possible using telecommunication systems.
• An analog signal is a continuous electrical signal which varies in time, with the variations following those of the non-electric signal. A digital signal, by contrast, is a non-continuous electrical signal which changes in individual steps and consists of pulses or digits. Digital signals have discrete levels, and the specified value of the pulse remains constant until the change in the next digit.
• Signals travel through transmission media, which are not perfect. The imperfection causes signal impairment. This means that the signal at the beginning of the medium is not the same as the signal at the end of the medium.
• The word noise means any unwanted sound. Signal-to-noise ratio (SNR or S/N) is a measure used in science and engineering to quantify how much a signal has been corrupted by noise. It is defined as the ratio of signal power to the noise power corrupting the signal.
• Hamming codes are an extension of the simple parity-bit method that can be used to detect and correct a larger set of errors. The basic idea is to have several parity bits.
• In information theory, the noisy-channel coding theorem (Shannon's theorem) establishes that, for any given degree of noise contamination of a communication channel, it is possible to communicate discrete data (digital information) nearly error-free up to a computable maximum rate through the channel.
• Communication channels which use a physical medium for transmission (twisted pair wire, coaxial cable, and fibre optic cable) are called wired channels. Communication channels which do not require any physical medium for transmission (radio, microwave and communication satellite) are called wireless channels.
• Broadcast networks have a single communication channel which is shared by all the machines on the network.
• Point-to-point or switched networks are those in which there are many connections between individual pairs of machines.

Self Assessment
1. Analog is a continuous electrical signal, whereas ________ is a non-continuous electrical signal.
   a. analog
   b. digital
   c. direct
   d. electric

2. Match the following:
   1. Transmission impairments    A. Loss of energy resulting in a weaker signal
   2. Attenuation                 B. The signal changes its form or shape
   3. Distortion                  C. Any unwanted sound
   4. Noise                       D. The signal at the beginning of the medium is not the same as the signal at the end of the medium
   a. 1-A, 2-B, 3-C, 4-D
   b. 1-B, 2-C, 3-D, 4-A
   c. 1-D, 2-A, 3-B, 4-C
   d. 1-C, 2-D, 3-A, 4-B

3. The power ratio between a signal (meaningful information) and the background noise (unwanted signal) is known as _________.
   a. Signal-to-noise ratio
   b. Signal-to-blast ratio
   c. Signal-to-sound ratio
   d. Signal-to-resonance ratio

4. Twisted-pair and coaxial cable use metallic (copper) conductors that accept and transport signals in the form of _________.
   a. electrical voltage
   b. electrical potential
   c. electrical resistance
   d. electrical current

5. ___________ has a central core conductor of solid or stranded wire (usually copper) enclosed in an insulating sheath.
   a. Fibre-optic cable
   b. Coaxial cable
   c. Axial cable
   d. Radio cable

6. Which of the following statements is false?
   a. Microwave link is also called microwave radio which utilises point-to-point radio transmissions at the super-high frequency and extremely high frequency bands.
   b. Microwave dishes which contain transceivers and antennas are set up on towers or buildings to establish the link.
   c. A string of microwave relay stations is used with each station receiving outgoing messages, decreasing the signal strength, and obstructing the signal to the next station.
   d. Microwave link supports a bandwidth of up to 45 Mbps and is widely used in data communication.

7. Communication _________ are microwave relay stations in orbit around the earth.
   a. satellites
   b. stars
   c. cables
   d. wires

8. __________ use a networking technology that provides a temporary, but dedicated connection between two stations no matter how many switching devices are used in the data transfer route.
   a. Packet Switched networks
   b. Circuit Switched networks
   c. Carton Switched networks
   d. Box Switched networks

9. The process of conveying a message signal is called ____________.
   a. modulation
   b. demodulation
   c. switched networks
   d. networks

10. ____________ is a glass or plastic cable that accepts and transports signals in the form of light.
   a. Metal fibre
   b. Non-metal fibre
   c. Paper fibre
   d. Optical fibre

Chapter VI

Computer Networks

Aim

The aim of this chapter is to:
• introduce the concept of computer network
• define protocols
• explain the concept of internet

Objectives

The objectives of this chapter are to:
• explicate networks
• enlist types of networks
• explain different protocols

Learning outcome

At the end of this chapter, you will be able to:
• understand the term computer networks and their types
• identify the concept of network configuration
• recognise network topology

6.1 Introduction
A collection of computers, display terminals, printers and other devices linked either by physical or wireless means is called a network. A computer network is an interconnection of various computer systems located at different places. In a computer network, two or more computers are linked together with a medium and data communication devices for the purpose of communicating data and sharing resources. The computer that provides resources to other computers on a network is known as a server. The individual computers which access shared network resources are known as workstations or nodes.

Components of network
To form a network, one needs to install networking hardware and software. Every network includes the following components.

• Nodes: The computers and similar devices that are connected together to form a network are called nodes.
• Hardware: The networking hardware is that which connects the computers together. This includes the hardware installed in the computer, the network cables and the devices which connect all the cables together.
• Software: Networking software is that which runs on each computer; it enables all computers to communicate with each other on the same network.

6.2 Types of Networks
The various types of networks are listed below:

• Local Loop Network
• Local Area Network (LAN)
• Metropolitan Area Network (MAN)
• Storage Area Network (SAN)
• Wide Area Network (WAN)
• Control Area Network (CAN)
• Personal Area Network (PAN)

Computer networks may be classified on the basis of geographical area in two broad categories, LAN and WAN.

6.2.1 Local Area Network
Networks used to interconnect computers in a single room, rooms within a building or buildings on one site are called Local Area Networks (LANs). A LAN transmits data at a speed of several megabits per second (1 megabit = 10^6 bits). The transmission medium is normally coaxial cable. A LAN links computers, i.e., software and hardware, in the same area for the purpose of sharing information. Usually a LAN links computers within a limited geographical area because they must be connected by a cable, which is quite expensive.

People working in LAN get more capabilities in data processing, work processing and other information exchange compared to stand-alone computers. Because of this information exchange most of the business and government organisations are using LAN.

Characteristics of LAN are as follows:
• Every computer has the potential to communicate with any other computer of the network.
• High degree of interconnection between computers
• Easy physical connection of computers in a network
• Inexpensive medium of data transmission
• High data transmission rate

Advantages
• The reliability of the network is high because the failure of one computer in the network does not affect the functioning of other computers
• Addition of a new computer to the network is easy
• High rate of data transmission is possible
• Peripheral devices like magnetic disk and printer can be shared by other computers

Disadvantages
• If the communication line fails, the entire network system breaks down

The following are the major areas where LAN is normally used:
• file transfers and access
• word and text processing
• electronic message handling
• remote database access
• personal computing
• digital voice transmission and storage

6.2.2 Wide Area Network
The term Wide Area Network (WAN) is used to describe a computer network spanning a regional, national or global area. For example, a large company might have its headquarters at Delhi and regional branches at Mumbai, Madras, Bangalore and Kolkata. Here, the regional centres are connected to the headquarters through a WAN. The distance between computers connected to a WAN is larger than in a LAN. Therefore the transmission medium used is normally telephone lines, microwaves and satellite links.

Characteristics of WAN:
• Communication facility: For a big company spanning different parts of the country, employees can save on long-distance phone calls, and the WAN overcomes the time lag in overseas communications. Computer conferencing is another use of WAN, where users communicate with each other through their computer systems.
• Remote data entry: Remote data entry is possible in WAN. It means that from any location one can enter data, update data and query other information on any computer attached to the WAN but located in another city. For example, someone sitting at Madras who wants to see data held on a computer located at Delhi can do so through the WAN.
• Centralised information: In a modern computerised environment one finds that big organisations go for centralised data storage. This means that if the organisation is spread over many cities, it keeps its important business data in a single place. As the data are generated at different sites, WAN permits collection of this data from different sites and saving it at a single site.

Examples of WAN
• Ethernet: Ethernet, developed by Xerox Corporation, is a famous example of WAN. This network uses coaxial cables for data transmission. Special integrated circuit chips called controllers are used to connect equipment to the cable.
• Arpanet: The Arpanet is another example of WAN. It was developed at the Advanced Research Projects Agency of the U.S. Department of Defense. This network connects more than 40 universities and institutions throughout the USA and Europe.

6.2.3 Difference Between LAN and WAN
LAN and WAN are differentiated as follows:

• LAN is restricted to a limited geographical area of a few kilometres, whereas WAN covers great distances and operates nationwide or even worldwide.
• In LAN, the computer terminals and peripheral devices are connected with wires and coaxial cables. In WAN there is no direct physical connection between the sites; communication is done through telephone lines and satellite links.
• Cost of data transmission in LAN is less because the transmission medium is owned by a single organisation. In the case of WAN the cost of data transmission is very high because the transmission medium used is either hired telephone lines or satellite links.
• The speed of data transmission is much higher in LAN than in WAN. The transmission speed in LAN varies from 0.1 to 100 megabits per second. In the case of WAN the speed ranges from 1800 to 9600 bits per second (bps).
• Fewer data transmission errors occur in LAN compared to WAN, because in LAN the distance covered is negligible.
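The speed figures quoted above translate into very different waiting times. The following sketch works through the arithmetic for an idealised transfer with no protocol overhead; the 1 MB file size and the function name are assumptions chosen only for illustration.

```python
def transfer_time_s(size_bytes: int, rate_bps: float) -> float:
    """Ideal transfer time: bits to send divided by the line rate (no overhead)."""
    return size_bytes * 8 / rate_bps

FILE_SIZE = 1_000_000                 # a 1 MB file, chosen for illustration
rates = {
    "WAN, 9600 bps": 9_600,           # top of the WAN range quoted in the text
    "LAN, 100 Mbps": 100_000_000,     # top of the LAN range quoted in the text
}
for label, rate in rates.items():
    print(f"{label}: {transfer_time_s(FILE_SIZE, rate):.2f} s")
# WAN, 9600 bps: 833.33 s
# LAN, 100 Mbps: 0.08 s
```

At these rates the same file takes about 14 minutes over the WAN link and under a tenth of a second over the LAN.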

6.2.4 Other Types of Networks
The other types of networks include:

• MAN
  ◦ Metropolitan Area Networks (MANs) are networks that connect LANs together within a city.
  ◦ MANs are used to connect to other LANs.
  ◦ A MAN must have the requirement of using a telecommunication media such as Voice Channels or Data Channels.
  ◦ Branch offices are connected to head offices through MANs.
  ◦ Examples where MANs are used include universities and colleges, grocery chains and banks.
  ◦ The main criterion for a MAN is that the connection between LANs is through a local exchange carrier (the local phone company).
  ◦ The protocols that are used for MANs are quite different from LANs except for ATM, which can be used for both under certain conditions.

Examples of MAN protocols are:
• Frame Relay (up to 45 Mbps), FRADs
• Asynchronous Transfer Mode (ATM)
• ISDN (Integrated Services Digital Network) PRI and BRI

• SAN
  ◦ SAN is also known as System Area Network, Server Area Network or sometimes Small Area Network.
  ◦ A Storage Area Network (SAN) is a separate "network" dedicated to storage devices and at minimum consists of one (or more) large banks of disks mounted in racks that provide 'shared' storage space accessible by many servers/systems.
  ◦ Other devices, such as robotic tape libraries, may be attached to the SAN.
  ◦ SANs create new methods of attaching storage to servers.
  ◦ These new methods promise great improvements in both availability and performance.
  ◦ Today's SANs are used to connect shared storage arrays to multiple servers and are used by clustered servers for failover.
  ◦ They can interconnect mainframe disk or tape to network servers or clients and can create parallel data paths for high bandwidth computing environments.
  ◦ A SAN is another network that differs from traditional networks because it is constructed from storage interfaces.
  ◦ Often it is referred to as the network behind the server.

• CAN
  ◦ CAN is also known as Controller Area Network, Campus Area Network or Cluster Area Network.
  ◦ It is a network spanning multiple LANs but smaller than a MAN, such as on a university or local business campus.

• PAN
  ◦ Personal Area Network (PAN) refers exclusively to wireless communications, both radio and optical.
  ◦ PAN is usually considered to have a range in the tens of feet, with some references citing 10 metres, i.e., approximately 33 feet; this is a typical range of PAN hardware.
  ◦ PAN is then a room-size network covering an individual's work area or a work group.

6.2.5 Network Topology
The term topology in the context of a communication network refers to the way the computers or workstations in the network are linked together. According to the physical arrangements of workstations and nature of work, there are three major types of network topology. They are star topology, bus topology and ring topology.

Star topology
In star topology a number of workstations (or nodes) are directly linked to a central node. Any communication between stations on a star LAN must pass through the central node. There is bi-directional communication between various nodes. The central node controls all the activities of the nodes.

Fig. 6.1 Star topology

The advantages of the star topology are:
• it offers flexibility of adding or deleting workstations from the network
• breakdown of one station does not affect any other device on the network

The major disadvantage of star topology is that failure of the central node disables communication throughout the whole network.

Bus topology
In bus topology all workstations are connected to a single communication line called a bus. In this type of network topology there is no central node as in star topology. Transmission from any station travels the length of the bus in both directions and can be received by all workstations.

Fig. 6.2 Bus topology

The advantages of the bus topology are that:
• it is quite easy to set up
• if one station of the topology fails it does not affect the entire system

The disadvantage of bus topology is that any break in the bus is difficult to identify.

Ring topology
In ring topology each station is attached to nearby stations on a point-to-point basis so that the entire system is in the form of a ring. In this topology data is transmitted in one direction only. Thus, the data packets circulate along the ring in either a clockwise or an anti-clockwise direction.

Fig. 6.3 Ring topology

• The advantage of this topology is that any signal transmitted on the network passes through all the LAN stations.
• The disadvantage of ring network is that the breakdown of any one station on the ring can disable the entire system.
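The failure behaviour of the star and ring topologies described in this section can be checked with a small graph model. This is a sketch under simplifying assumptions: node 0 plays the central node of the star, the ring forwards in one direction only, and the helper names (`star`, `ring`, `reachable`) are hypothetical.

```python
from collections import deque

def star(n):
    """Node 0 is the central node; nodes 1..n-1 link only to it."""
    return {0: set(range(1, n)), **{i: {0} for i in range(1, n)}}

def ring(n):
    """One-directional ring: each node forwards only to its successor."""
    return {i: {(i + 1) % n} for i in range(n)}

def reachable(graph, start, failed):
    """Nodes that start can still reach when the node `failed` is out of service."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in graph[queue.popleft()]:
            if nxt != failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(reachable(star(5), start=1, failed=2)))   # [0, 1, 3, 4]
print(sorted(reachable(star(5), start=1, failed=0)))   # [1]
print(sorted(reachable(ring(5), start=1, failed=3)))   # [1, 2]
```

Losing an ordinary station in the star leaves every other station reachable, losing the central node isolates each station, and losing any one station in the one-directional ring cuts off everything downstream of it.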

6.3 ISO OSI model
The ISO (International Organization for Standardization) has created a layered model called the OSI (Open Systems Interconnection) model to describe defined layers in a network operating system. The OSI Reference Model comprises seven conceptual layers, each assigned a "ranking" number from one to seven.

The layer number represents the position of the layer in the model as a whole and indicates how "close" the layer is to the actual hardware used to implement a network. The first and lowest layer is the physical layer, which is where low-level signalling and hardware are implemented. The seventh and highest layer is the application layer, which deals with high-level applications employed by users: both end users and the operating system software.

The first layer is the most concrete as it deals with the actual hardware of networks and the specific methods of sending bits from one device to another. It is the domain of hardware engineers and signalling experts. The second layer is a bit more abstract but still deals with signalling and hardware. As one proceeds through the third, fourth and subsequent layers, the technologies at those layers become increasingly abstract.

By the seventh layer, one is no longer dealing with hardware or even operating system concepts very much and is in the realm of the user and high-level programs that rely on lower levels to do the “heavy lifting” for them.

Fig. 6.4 OSI model
Layer (data unit): function
Application Layer (data): facilitates communication between software applications like Outlook, IE
Presentation Layer (data): data representation and encryption
Session Layer (data): inter-host communication
Transport Layer (segments): end-to-end connection and reliability
Network Layer (packets): path determination and logical addressing
Data Link Layer (frames): MAC and LLC, physical addressing
Physical Layer (bits): media, signal and binary transmission

6.3.1 Layers of OSI Model
As seen in the figure above, there are 7 different layers in the OSI model. These are as follows:

• Application layer (Layer 7)
  ◦ The application layer is probably the most easily misunderstood layer of the model. This top layer defines the language and syntax that programs use to communicate with other programs.
  ◦ The application layer represents the purpose of communicating. For example, a program in a client workstation uses commands to request data from a program in the server.
  ◦ Common functions at this layer are opening, closing, reading and writing files, transferring files and e-mail messages, executing remote jobs and obtaining directory information about network resources and so on.

• Presentation layer (Layer 6)
  ◦ The presentation layer performs code conversion and data reformatting (syntax translation). Like a translator of the network, it makes sure that the data is in the correct form for the receiving application.
  ◦ When data is transmitted between different types of computer systems, the presentation layer negotiates and manages the way data is represented and encoded.
  ◦ For example, it provides a common denominator between ASCII and EBCDIC machines as well as between different floating point and binary formats. Sun’s XDR and OSI’s ASN.1 are two protocols used for this purpose.
  ◦ This layer also provides security features through encryption and decryption.


• Session layer (Layer 5)
  ◦ The session layer decides when to turn communication on and off between two computers. It provides the mechanism that controls the data-exchange process and coordinates the interaction (communication) between them in an orderly manner.
  ◦ It sets up and clears communication channels between two communicating components. It determines one-way or two-way communications and manages the dialogue between both parties.
  ◦ For example, it makes sure that the previous request has been fulfilled before the next one is sent. It also marks significant parts of the transmitted data with checkpoints to allow fast recovery in the event of a connection failure.

• Transport layer (Layer 4)
  ◦ The transport layer is responsible for overall end-to-end validity and integrity of the transmission, i.e., it ensures that data is successfully sent and received between two computers.
  ◦ The lower data link layer (layer 2) is only responsible for delivering packets from one node to another. Thus, if a packet gets lost in a router somewhere in the enterprise internet, the transport layer will detect that. It ensures that if a 12 MB file is sent, the full 12 MB is received.
  ◦ If data is sent incorrectly, this layer has the responsibility of asking for retransmission of the data. Specifically, it provides a network-independent, reliable message-interchange service to the top three application-oriented layers.
  ◦ This layer acts as an interface between the bottom and top three layers. By providing the session layer (layer 5) with a reliable message transfer service, it hides the detailed operation of the underlying network from the session layer.

• Network layer (Layer 3)
  ◦ The network layer establishes the route between the sending and receiving stations. The unit of data at the network layer is called a packet. It provides network routing and flow and congestion control functions across the computer-network interface.
  ◦ It decides where to route the packet based on information and calculations from other routers, or according to static entries in the routing table.
  ◦ It examines network addresses in the data instead of the physical addresses seen in the data link layer.
  ◦ The network layer establishes, maintains and terminates logical and/or physical connections. It is responsible for translating logical addresses or names into physical addresses.
  ◦ The main device found at the network layer is a router.

• Data link layer (Layer 2)
  ◦ The data link layer groups the bits that we see on the physical layer into frames. It is primarily responsible for error-free delivery of data. The data link layer is split into two sub-layers, namely Logical Link Control (LLC) and Media Access Control (MAC).
  ◦ The data link layer handles the physical transfer, framing (the assembly of data into a single unit or block), flow control and error-control functions (and retransmission in the event of an error) over a single transmission link. It is responsible for getting the data packaged onto the network cable. The data link layer provides the network layer (layer 3) with reliable information-transfer capabilities.
  ◦ The main network device found at the data link layer is a bridge. This device works at a higher layer than the repeater and therefore is a more complex device.
  ◦ It has some understanding of the data it receives and can decide, based on the frames it receives, whether to let the information pass or to remove it from the network. This means that the amount of traffic on the medium can be reduced and therefore the usable bandwidth can be increased.

• Physical layer (Layer 1)
  ◦ The data units on this layer are called bits. This layer defines the mechanical and electrical definition of the network medium (cable) and network hardware. This includes how data is impressed onto the cable and retrieved from it.
  ◦ The physical layer is responsible for passing bits onto and receiving them from the connecting medium. This layer gives the data link layer (layer 2) its ability to transport a stream of serial data bits between two communicating systems. It basically conveys the bits which move along the cable.
  ◦ It is responsible for ensuring that the raw bits move from one place to another, no matter what shape they are in, and deals with the mechanical and electrical characteristics of the cable.

6.3.2 Protocol
The architecture is based on the specification of the standard TCP/IP protocol, designed to connect any two networks which may be very different in internal hardware, software, and technical design. Once two networks are interconnected, communication with TCP/IP is enabled end-to-end, so that any node on the internet has the ability to communicate with any other node irrespective of their location.

6.3.3 IP Address
Every computer on the internet has a unique numerical address called an Internet Protocol (IP) address which is used to route packets across the internet. Just as a postal address enables the postal system to send mail to the desired destination from anywhere around the world, the computer’s IP address gives the internet routing protocols the unique information they need to route packets of information to the computer from anywhere across the internet. If a machine needs to contact another by a domain name, it first looks up the corresponding IP address with the domain name service. The IP address is the geographical descriptor of the virtual world, and the addresses of both source and destination systems are stored in the header of every packet that flows across the internet.
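As a small illustration of what "numerical address" means, the sketch below uses Python's standard `ipaddress` module to show the single 32-bit number behind a dotted IPv4 address. The address itself is only an example value chosen for this sketch.

```python
import ipaddress

# Example address only; any valid IPv4 address would do.
addr = ipaddress.ip_address("202.1.3.10")

as_number = int(addr)                              # the address as one 32-bit integer
octets = [int(p) for p in str(addr).split(".")]    # the four dotted-decimal parts

print(addr.version)   # 4 (an IPv4 address)
print(octets)         # [202, 1, 3, 10]
print(as_number == 202 * 2**24 + 1 * 2**16 + 3 * 2**8 + 10)   # True
```

The dotted form is simply a human-friendly way of writing the four bytes of that number.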

6.3.4 TCP/IP Protocol
TCP/IP stands for Transmission Control Protocol/Internet Protocol. It is a protocol suite used by most communications software. TCP/IP is a robust and proven technology that was first tested in the early 1980s on ARPANet, the U.S. military’s Advanced Research Projects Agency Network, and the world’s first packet-switched network.

TCP/IP was designed as an open protocol that would enable all types of computers to transmit data to each other via a common communications language. TCP/IP is a layered protocol similar to the ones used in all the other major networking architectures, including IBM’s SNA, Windows’ NetBIOS, Apple’s AppleTalk, Novell’s NetWare and Digital’s DECnet.

There are 4 layers. Layering means that after an application initiates the communications, the message (data) to be transmitted is passed through a number of stages or layers until it actually moves out onto the wire. The data are packaged with a different header at each layer. At the receiving end, the corresponding programs at each protocol layer unpack the data, moving it “back up the stack” to the receiving application.
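The idea of layering can be sketched as a toy model in Python. The three layer names below the application and the bracketed header format are assumptions made for illustration; real protocol headers are binary structures, not text tags.

```python
# Toy model of layering: each layer wraps the data in its own header on the
# way down, and the receiver strips the headers in reverse order.
# The layer names and the "[name]" header format are illustrative assumptions.
LAYERS = ["transport", "network", "link"]  # the layers below the application


def send(message: str) -> str:
    """Pass the message down the stack, adding one header per layer."""
    data = message
    for layer in LAYERS:
        data = f"[{layer}]" + data
    return data  # what "moves out onto the wire"


def receive(wire_data: str) -> str:
    """Pass the data back up the stack, removing one header per layer."""
    data = wire_data
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not data.startswith(header):
            raise ValueError(f"missing {layer} header")
        data = data[len(header):]
    return data


on_wire = send("hello")
print(on_wire)            # [link][network][transport]hello
print(receive(on_wire))   # hello
```

The outermost header belongs to the lowest layer, so the receiving side removes it first while moving the data back up the stack.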

TCP/IP is composed of two major parts: TCP (Transmission Control Protocol) at the transport layer and IP (Internet Protocol) at the network layer. TCP is a connection-oriented protocol that passes its data to IP, which is connectionless. TCP sets up a connection at both ends and guarantees reliable delivery of the full message sent. TCP tests for errors and requests retransmission if necessary, because IP does not do that.
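The division of labour described above, an unreliable carrier underneath and error checking with retransmission on top, can be sketched as follows. This is a simplified simulation and not real TCP: the channel function, its loss pattern and the use of a CRC32 checksum are all assumptions chosen to keep the example short.

```python
import zlib


def make_segment(seq: int, payload: bytes) -> dict:
    """Attach a sequence number and a CRC32 checksum to the payload."""
    return {"seq": seq, "payload": payload, "crc": zlib.crc32(payload)}


def unreliable_channel(segment: dict, attempt: int):
    """Hypothetical IP-like channel: loses the first attempt of segment 1
    and corrupts the first attempt of segment 2."""
    if segment["seq"] == 1 and attempt == 0:
        return None                                   # segment lost
    if segment["seq"] == 2 and attempt == 0:
        return {**segment, "payload": b"garbled!"}    # segment damaged
    return segment


def reliable_send(message: bytes, size: int = 4, max_tries: int = 5) -> bytes:
    """Send each segment until it arrives intact, then rebuild the message."""
    received = {}
    for seq, start in enumerate(range(0, len(message), size)):
        segment = make_segment(seq, message[start:start + size])
        for attempt in range(max_tries):
            got = unreliable_channel(segment, attempt)
            if got is not None and zlib.crc32(got["payload"]) == got["crc"]:
                received[got["seq"]] = got["payload"]   # arrived intact
                break                                    # no retransmission needed
        else:
            raise ConnectionError(f"segment {seq} could not be delivered")
    return b"".join(received[s] for s in sorted(received))


print(reliable_send(b"reliable delivery"))   # b'reliable delivery'
```

Although two segments fail on their first attempt, the receiver still ends up with the full message because the sender tries again.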

6.4 The Internet
The Internet is a network of networks. It is a worldwide collection of networks, communication protocols and software applications. Millions of computers all over the world are connected through the internet. Computer users on the internet can contact one another anywhere in the world. It is emerging as a low-cost means of information sharing between almost any two computers that are connected to the public telephone or other telecommunication system. It is very similar to the telephone connection, where one can talk with any person anywhere in the world.

On the internet, a huge resource of information is accessible to people across the world. Information in every field, from education, science, health, medicine, history and geography to business, news, etc., can be retrieved through the Internet. One can also download programs and software packages from anywhere in the world. Due to the tremendous information resources which the internet provides, it has now become necessary to every organisation.


There are two main ways of communicating through the internet. These ways are:
• Any connected computer may send e-mail to and receive e-mail from any other.
• Significant numbers of large computers act as hosts (called servers) for large repositories of information on a wide range of topics. The owners allow these servers to be accessed by the public using the telephone system to transmit requests for data into the server and transmission of the requested data back to the inquirer.

Uses of internet
In spite of all the security issues discussed above, networks, especially the internet, are one of the most essential components of the routine life of individuals. This is not to exaggerate, but to underline the significance of the internet and its services in today’s world. Some of the important applications of the internet are:

• access to remote information
• World Wide Web
• person-to-person communication with electronic mail, video conference, and so on
• interactive entertainment like Video-on-Demand, games and so on
• online shopping, booking, trading, social networking and so on

6.5 World Wide Web (WWW)
People often use the terms internet and WWW interchangeably, which is incorrect. The internet and the web work together, but they are not the same thing. The Internet provides the underlying structure, and the web utilises that structure to offer content, documents, multimedia, etc.

Functioning of the WWW
• In simple terms, the web is a set of documents referring to each other by links. For its likeness to a spider’s construction, this world is called the web.
• This is known as the hypertext paradigm. The reader sees a document on the screen with sensitive parts of text representing the links. A link is followed by mere pointing and clicking.
• Hypertext alone is not practical when dealing with large sets of structured information such as what is contained in databases: adding a search to the hypertext model gives W3 its full power.
• Indexes are special documents which, rather than being read, may be searched. To search an index, a reader gives keywords (or other search criteria). The result of a search is another document containing links to the documents found.
• The core of the internet is a network of supercomputers connected to each other by high-speed links (known as “backbones”). Each node is linked to a number of smaller networks which in turn are linked to even smaller networks and ultimately to the PC of an individual user.
• The internet combines four important elements:
  ◦ network
  ◦ people (who use it)
  ◦ various programmes used in getting the information
  ◦ information itself


Fig. 6.5 Internet (labels: Internet, Modem, Router, Computer #1, Computer #2)

Uses of the internet
The internet can be used for diverse needs and purposes:

• It can be used by an organisation for linking their offices and employees, thus using the internet as a virtual private network.
• It can serve as a communication channel where one can exchange social notes and information, gather the latest news from all over the world, transfer computer files and software, carry out corporate communications, etc.
• It may be used as a virtual market place to advertise products, services, employment and other personal needs.
• It may be used for publishing general information, as done by many government and non-government agencies.
• Many universities and educational and training institutes use the internet for academic communications.

Communication protocols

The internet is a packet switching network and the data to be transmitted is converted into small packets. The software that carries out the function of communication efficiently and accurately is known as TCP/IP, which is actually two communication protocols put together.

• TCP/IP stands for Transmission Control Protocol and Internet Protocol respectively.
• TCP breaks the data into little packets and ensures that the same reaches the destination computer intact.
• IP is a set of conventions ensuring routing of the packets from one host to another with the destination IP address.
• Three kinds of mechanisms are involved in routing, namely bridges, routers and gateways.
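The splitting of data into packets and its reassembly at the destination can be sketched as below. The packet layout (a dictionary with `dest`, `seq` and `payload` fields), the packet size and the destination address are assumptions for illustration and do not reflect the real TCP or IP header formats.

```python
import random


def to_packets(data: bytes, dest_ip: str, size: int = 8) -> list:
    """Split data into small packets, each with a destination and a sequence number."""
    return [
        {"dest": dest_ip, "seq": i, "payload": data[start:start + size]}
        for i, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: list) -> bytes:
    """Put the packets back in sequence order at the destination."""
    return b"".join(p["payload"] for p in sorted(packets, key=lambda p: p["seq"]))


message = b"The internet is a packet switching network."
packets = to_packets(message, dest_ip="202.1.3.11")   # example address only

random.Random(0).shuffle(packets)        # packets may arrive in a different order
print(reassemble(packets) == message)    # True
```

The sequence numbers are what allow the receiver to rebuild the original data even when the packets arrive out of order.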

Access to the internet depends on many factors e.g. the type of interface connection and browser installed, services and connection available from the network and the type of connection selected by the user depending on his requirement to explore the internet.

6.6 Clients and Servers
This is a purely role-based classification of computer systems. With the increased popularity of computer networks, it has become possible to interconnect several computers, which can communicate and interact with each other over the network. In such a computing environment, there are several resources/services which can be shared among multiple users for cost-effective usage and can be best managed/offered centrally. A few examples of such resources/services are:

• File server: It provides a central storage facility to store files of several users on the network.
• Database server: It manages a centralised database and enables several users on the network to have shared access to the same database.
• Print server: It manages one or more printers and accepts and processes print requests from any user in the network.
• Name server: It translates names into network addresses, enabling different computers on the network to communicate with each other.

In these cases, it is usual to have one process which “owns” the resource or service and is in charge of managing it. This process accepts requests from other processes which want to use the resource or service. The process that owns the resource and does this management is called a ‘server process’. The computer on which the server process runs is called a ‘server computer’ because it services requests for use of the resource. Other processes which send service requests to the server are called ‘client processes’. The computers on which the client processes run are called ‘client computers’. Note that there may be multiple client computers which send service requests to the same server computer. A generic client-server computing environment is shown in Fig. 6.6.

Fig. 6.6 Client-server computing environment (labels: corporate computer network, i.e., LAN or intranet; Unix, Apple Macintosh and Windows machines running internet browsers; network resources (printer); door switches with break-in alarm; internet; remote computer (overseas))

In a client-server computing environment, it is common for one server to use the services of another server and hence to be both a client and a server at the same time. For example, let us assume that a client-server computing environment has clients, a file server and a disk block server. Any client can send a file access request to the file server. On receiving such a request, the file server checks the access rights etc. of the user, but does not actually read/write the file blocks itself. Instead, it sends a request to the disk block server for accessing the requested data blocks. The disk block server returns the requested data blocks to the file server. The file server then extracts the actual data from the data blocks and returns it to the client. In this scenario, the file server is both a server and a client. It is a server for the clients, but a client for the disk block server. Hence, the concept of client and server computers is purely role-based and may change dynamically as the role of a computer changes.
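The file server and disk block server scenario above can be sketched with two small Python classes. The class names, method names, the file name and the user name are hypothetical, and ordinary method calls stand in for requests that would really travel over the network.

```python
class DiskBlockServer:
    """Server that owns the raw data blocks."""

    def __init__(self, blocks: dict):
        self.blocks = blocks          # block id -> block contents

    def read_block(self, block_id: int) -> bytes:
        return self.blocks[block_id]


class FileServer:
    """Server for its clients, but itself a client of the disk block server."""

    def __init__(self, disk: DiskBlockServer, file_table: dict, access: dict):
        self.disk = disk              # the server this one sends requests to
        self.file_table = file_table  # file name -> list of block ids
        self.access = access          # file name -> set of permitted users

    def read_file(self, user: str, name: str) -> bytes:
        # Check access rights first, as the file server does in the text.
        if user not in self.access.get(name, set()):
            raise PermissionError(f"{user} may not read {name}")
        # Acting as a client: request each block from the disk block server.
        blocks = [self.disk.read_block(b) for b in self.file_table[name]]
        return b"".join(blocks)


disk = DiskBlockServer({0: b"Hello, ", 1: b"client!"})
files = FileServer(disk, {"greeting.txt": [0, 1]}, {"greeting.txt": {"asha"}})
print(files.read_file("asha", "greeting.txt"))   # b'Hello, client!'
```

The same `FileServer` object plays both roles: it serves `read_file` requests and issues `read_block` requests of its own.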

6.7 Ports
The term ‘port’ is derived from the Latin word “porta”, which means gate, entrance or door. A port is a point at which an external device (peripheral) attaches to the computer system. Ports allow data to be sent to or retrieved from the external device.


In computer hardware, a ‘port’ serves as an interface between the computer and other computers or peripheral devices. Physically, a port is a specialised outlet on a piece of equipment to which a plug or cable connects. Electronically, the several conductors making up the outlet provide a signal transfer between devices.

Fig. 6.7 Ports (labels: PS/2 port (keyboard), PS/2 port (mouse), USB ports, serial port (dial-up modem), VGA port (monitor), speakers, line in, microphone, game port (joystick), LPT1 printer port (printer), Ethernet port (network))

Modern computers carry many types of ports, for example mouse, keyboard, serial, USB, parallel (printer), microphone, telephone and network ports. Computer ports in common use cover a wide variety of shapes such as round (PS/2, etc.), rectangular (FireWire, etc.), square (telephone plug) and trapezoidal (D-Sub; the old printer port was a DB-25). There is some standardisation of physical properties and function. For instance, most computers have a keyboard port (currently a round DIN-like outlet referred to as PS/2), into which the keyboard is connected.

6.7.1 Uses of Computer Ports
Following are the uses of computer ports:

• Ports are used to connect various devices to the computer and hence enable communication between the device and the computer.
• For example, external modems are connected to serial ports and printers are connected to parallel ports.
• Ports help in transmitting the data from the device to the computer and vice versa.
• Each port has a specific way of communicating and this depends upon the speed and size of the port. Examples: serial, parallel and USB.

6.7.2 Types of Ports
The different types of ports are:

• USB (Universal Serial Bus): A new universal connector
• FireWire: Fast camcorder connector
• Ethernet (Network Port): Connects to a network and high-speed internet
• Serial Ports: For external modems and old computer mice
• Parallel Ports: Connector for scanners and printers
• PS/2 Ports: Keyboard and mouse interface
• VGA (Video Graphics Array): A video port for the monitor
• DVI: Used to interface between a PC and display items
• Modem Port: Connects by phone to the internet
• Power Port: For the power plug

A hardware port resembles a plug-in or connection commonly found on the back of a computer. Hardware ports allow computers to have access to external devices such as computer printers, etc.

6.8 Domain Name Service
The Domain Name System (DNS) as a whole consists of a network of servers that map internet domain names to IP addresses. The DNS enables domain names to stay constant while the underlying network topology and IP addresses change. This provides stability at the application level while enabling network applications to find and communicate with each other using IP no matter how the underlying physical network changes.
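The point that names stay constant while addresses change can be sketched with a toy name server. The `ToyNameServer` class is hypothetical and only stands in for the real, distributed DNS; the domain name and the two addresses are documentation-only example values.

```python
class ToyNameServer:
    """Maps domain names to IP addresses (a stand-in for a real DNS server)."""

    def __init__(self):
        self.records = {}

    def register(self, name: str, ip: str) -> None:
        self.records[name.lower()] = ip   # domain names are case-insensitive

    def resolve(self, name: str) -> str:
        try:
            return self.records[name.lower()]
        except KeyError:
            raise LookupError(f"no address known for {name}") from None


dns = ToyNameServer()
dns.register("www.example.com", "192.0.2.10")     # documentation-only address
first = dns.resolve("WWW.Example.com")

# The host moves to a new network; only the record changes, not the name.
dns.register("www.example.com", "198.51.100.7")
second = dns.resolve("www.example.com")

print(first, second)   # 192.0.2.10 198.51.100.7
```

An application that always asks for "www.example.com" keeps working after the move, which is the stability at the application level that the text describes. In a real program the lookup would go through the operating system's resolver, for example with Python's `socket.gethostbyname`.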

6.9 WWW, Browsers Connections
The ‘World Wide Web’ (called WWW or W3 in short) is the most popular and promising method of accessing the internet. The main reason for its popularity is the use of a concept called ‘hypertext’. Hypertext is a new way of information storage and retrieval which enables authors to structure information in novel ways. An effectively designed hypertext document can help users rapidly locate the desired type of information from the vast amount of information on the internet.

Hypertext documents enable this by using a series of ‘links’. A link can be shown on the screen in multiple ways, such as a labelled button, highlighted text, text in a different colour from normal text if the computer has a colour display, or author-defined graphic symbols. A link is a special type of item in a hypertext document which connects the document to another document that provides more information about the linked item. The latter document can be anywhere on the internet. By “connect” we mean that a user simply selects the linked item and almost immediately sees the other document on his computer terminal.

The concept of hypertext can be best illustrated with the help of an example. Let us assume that the screen of the computer terminal has the following hypertext document currently displayed on it:

Pradeep K. Sinha has been involved in the research and development of distributed systems for almost a decade. At present, Dr. Sinha is working at the Centre for Development of Advanced Computing (C-DAC), Pune, India. Before joining C-DAC, Dr. Sinha worked with the Multimedia Systems Research Laboratory (MSRL) of Panasonic in Tokyo, Japan.

The hypertext document has the following two links which are shown on the screen as highlighted (bold and underlined) texts:

• Centre for Development of Advanced Computing (C-DAC). Let us assume that this link connects the current document to another document, which gives detailed information about C-DAC and is located on a computer system at C-DAC in Pune, India.
• Multimedia Systems Research Laboratory (MSRL) of Panasonic. Let us assume that this link connects the current document to another document, which gives detailed information about MSRL of Panasonic and is located on a computer system at MSRL of Panasonic in Tokyo, Japan.


Now, if the mouse is used to click anywhere on the link Multimedia Systems Research Laboratory (MSRL) of Panasonic in the displayed document, within a few seconds the computer gets connected to the computer system at MSRL of Panasonic in Tokyo, and the document which gives detailed information about MSRL of Panasonic is displayed on the computer screen.

Hypertext documents on the internet are known as ‘web pages’. Web pages are created by using a special language called ‘Hypertext Markup Language’ (HTML). HTML is an application of the more generalised language called ‘Standard Generalised Markup Language’ (SGML), which is a powerful language for linking documents for easier electronic access and manipulation. HTML is becoming a de facto industrial standard for creating web pages.

The WWW uses the client-server model and an internet protocol called ‘Hypertext Transfer Protocol’ (HTTP) for interaction between the computers on the internet. Any computer on the internet which serves information using the HTTP protocol is called a ‘Web Server’, and any computer which can access that server is called a ‘Web Client’. The use of the client-server model and HTTP allows different kinds of computers on the internet to interact with each other. For example, a Unix workstation may be the web server and a Windows PC may be the web client, if both of them use the HTTP protocol for transmitting and receiving information.

6.10 WWW Browsers
To be used as a web client, a computer needs to be loaded with a special software tool which is known as a ‘WWW browser’ (or browser in short). Browsers normally provide the following navigation facilities to help users save time when they are jumping from server to server while internet surfing (the process of navigating the internet to search for useful information):

• Unlike FTP and Telnet, browsers do not require a user to remotely log into a server computer, and then to log out again when the user has finished accessing information stored on the server computer.
• Browsers allow a user to specify the URL address of a server computer so that the user can directly visit the server computer’s site and access information stored on it. URL stands for ‘Uniform Resource Locator’. It is an addressing scheme used by WWW browsers to locate sites on the internet.
• Browsers allow a user to create and maintain a personal ‘hotlist’ of favourite URL addresses of server computers which the user is likely to visit frequently in future. A user’s hotlist is stored on his/her local web client computer. Browsers provide hotlist commands to allow the user to add, delete and update URL addresses in the hotlist and to select a URL address of a server computer from the hotlist when the user wants to visit that server computer.
• Many browsers have a “history” feature. These browsers maintain a history of the server computers visited in a surfing session. That is, they save (cache) in the local computer’s memory the URL addresses of the server computers visited during a surfing session, so that if the user wants to go back to an already visited server later on (in the same surfing session), the link is still available in the local computer’s memory.
• Browsers allow a user to download (copy from a server computer to the local computer’s hard disk) information in various formats (i.e., as a text file, as an HTML file or as a PostScript file). The downloaded information can be used later by the user (not necessarily in the same surfing session). For example, downloaded information saved as a PostScript file can later be printed on a PostScript-compatible printer, where even the graphics will be properly reproduced.


Fig. 6.8 Basic hypertext enhanced by searches: a home document with anchors and links, index servers (a phone book and a telephone index), and synthesized hypertext, showing that one can link to the result of a search. (Source: http://www.freehep.org/chep92www.pdf)

The key elements of the web and its functions are mentioned below.

6.10.1 Web Page
From the user’s point of view, the page is the basic unit of the web. Web pages are written in the HTML language and sent to web browsers by a web server using the HTTP protocol. A web page has a page format similar to that of a book or magazine, with text and graphics displayed in a layout, and it is displayed in a normal computer application window. A collection of related web pages, images, videos or other digital assets that are addressed relative to a common Uniform Resource Locator (URL) is called a website. A website is hosted on at least one web server.

6.10.2 URL
Web addresses are recorded in a URL, a logical address of a web page that can always be used to dynamically retrieve the current physical copy over the internet. The key advantage of a URL is its universality, since the address is the same no matter where in the world it is used.
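To make the structure of a URL concrete, the sketch below splits one into its parts with `urlsplit` from Python's standard `urllib.parse` module. The URL itself is made up for illustration.

```python
from urllib.parse import urlsplit

# Example URL only; the host, path and query are made up for illustration.
parts = urlsplit("http://www.example.com:8080/docs/index.html?lang=en#top")

print(parts.scheme)    # http
print(parts.hostname)  # www.example.com
print(parts.port)      # 8080
print(parts.path)      # /docs/index.html
print(parts.query)     # lang=en
print(parts.fragment)  # top
```

The scheme names the protocol to use, the host name identifies the server computer, and the path identifies the page on that server.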

6.10.3 Web Server
Every web site is managed by a web server. The web server handles all the network communication with individual user browsers. The server accepts HTTP requests for web pages and sends the requested pages to browsers over the internet. Different web servers have different scalability, robustness, security, transportability, and related features. A web server may be dedicated to one domain name or maintain web sites for several domains.

6.10.4 HTTP
The Hypertext Transfer Protocol (HTTP) is used by web servers to communicate web pages to web browsers. HTTP is used when a browser connects to a web server, requests a web page from the server and downloads the page. It is the common standard that enables any browser to connect to any server, anywhere in the world. The HTTP protocol is designed to permit intermediate network elements to improve or enable communications between clients and servers.
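The request and response exchange can be sketched at the level of the text that is sent. The helper functions `build_get_request` and `parse_response` are hypothetical names written for this example, the host name is made up, and the response is hand-written rather than fetched from a real server.

```python
def build_get_request(host: str, path: str) -> bytes:
    """Text of a minimal HTTP/1.1 GET request for one page."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes):
    """Split a response into status code, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


request = build_get_request("www.example.com", "/index.html")

# A hand-written response standing in for what a web server might send back.
response = (b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n"
            b"<html><body>Hello</body></html>")
code, headers, body = parse_response(response)

print(request.decode("ascii").splitlines()[0])   # GET /index.html HTTP/1.1
print(code, headers["content-type"])             # 200 text/html
```

A real browser sends such a request over a TCP connection to the web server and then renders the HTML found in the body of the reply.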

6.10.5 HTML
Hypertext Markup Language (HTML) is the publishing language of the World Wide Web. It is a simple and powerful language used to describe web pages and is still used as the main interface language to the web. Each command consists of an opening tag in angle brackets, like <tag>, and a closing tag with an added slash, like </tag>.


To understand how exactly it helps, we can analyse the name step by step.
• We can “move” around the web using “hypertext”, that is, by clicking on special text called hyperlinks which take us to the next page.
• “Hyper” just means it is not linear, i.e., one can go to any place on the internet whenever one wants by clicking on the links; there is no set order in which to do things.
• Markup is what HTML tags do to the text inside them. They mark it as a certain type of text (for example, italicised text).
• HTML is a language and has code-words and syntax like any other language.
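The tag syntax and the idea of hyperlinks can be sketched with Python's standard `html.parser` module. The `LinkCollector` class and the small sample page, including its URLs, are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the target of every <a href="..."> hyperlink in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Called once for each opening tag; attrs is a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


page = """<html><body>
<p>This text is <i>italicised</i> by its tags.</p>
<p>Read about <a href="http://www.example.com/cdac.html">C-DAC</a> or
<a href="http://www.example.com/msrl.html">MSRL</a>.</p>
</body></html>"""

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```

Here the `<i>` tag marks up text as italic, while each `<a>` tag marks text as a hyperlink whose `href` attribute holds the address of the next page.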

6.11 Using the WWW
Using the web requires some kind of interface for convenient access to the resources made available. A web browser is such an interface. Optimum browsing results can be achieved by efficient search methods.

6.11.1 Web Browser
A web browser or internet browser is a software application for retrieving, presenting and traversing information resources on the World Wide Web. An information resource is identified by a Uniform Resource Identifier (URI) and may be a web page, image, video or other piece of content. Hyperlinks present in resources enable users to easily navigate their browsers to related resources.
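As an illustration of how a URI identifies a resource, the sketch below splits an address into its parts with Python's standard `urllib.parse` module. The address itself is a made-up example.

```python
from urllib.parse import urlsplit

# A hypothetical address, used only for illustration
uri = "http://www.example.com:8080/docs/page.html?lang=en#section2"
parts = urlsplit(uri)

scheme = parts.scheme      # protocol used to fetch the resource
host = parts.hostname      # server that holds the resource
port = parts.port          # network port on that server
path = parts.path          # location of the resource on the server
query = parts.query        # extra parameters sent to the server
fragment = parts.fragment  # position inside the returned page
```

A browser uses the scheme and host to decide which server to contact and how, and the path and query to say which resource it wants.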

Although browsers are primarily intended to access the World Wide Web, they can also be used to access information provided by web servers in private networks or files in file systems. Some browsers can also be used to save information resources to file systems. Examples of popular browsers are Internet Explorer, Mozilla Firefox and so on.

6.11.2 Searching for Information
There are two broad ways to obtain effective search results. Using a directory site or a search engine helps in simplifying and filtering search results and saves time and effort.

Directory sites
• Directory sites place each web site in their database in one or more predefined subject categories following review by a moderator.
• A web site is included in a directory site’s database only after it has been judged useful, informative or otherwise worthwhile.
• Typical reasons a site might not be included in the database are that it isn’t unique enough, it isn’t guaranteed to remain around for long or it doesn’t meet some other guideline or criterion.
• Examples of some popular directories are Google directory, Yahoo directory, Internet Public Library and so on.

Search Engines
• In simple words, a search engine enables users to search the internet for information of interest. Search engines work by storing information about many web pages, which they retrieve from the HTML itself. These pages are retrieved by a web crawler (sometimes also known as a spider), an automated web browser which follows every link on the site.
• The contents of each page are then analysed to determine how it should be indexed (for example, words are extracted from the titles, headings or special fields called meta tags).
• Data about web pages are stored in an index database for use in later queries. A query can be a single word. The purpose of an index is to allow information to be found as quickly as possible. Some search engines, such as Google, store all or part of the source page (referred to as a cache) as well as information about the web pages, whereas others, such as AltaVista, store every word of every page they find.
• When a user enters a query into a search engine (typically by using key words), the engine examines its index and provides a listing of best-matching web pages according to its criteria, usually with a short summary containing the document’s title and sometimes parts of the text. The index is built from the information stored with the data and the method by which the information is indexed.
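The indexing and query steps described above can be sketched with a very small "inverted index", a table that maps each word to the pages containing it. This is a simplified illustration only: the page names and contents are invented, and real search engines use far more elaborate ranking criteria.

```python
import re
from collections import defaultdict

# Hypothetical pages that a crawler might have retrieved
pages = {
    "a.html": "Web servers send pages to web browsers",
    "b.html": "Search engines index web pages",
    "c.html": "Browsers display pages",
}


def tokenize(text):
    """Break text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of pages in which it occurs."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word of the query, sorted by name."""
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)


index = build_index(pages)
web_pages_hits = search(index, "web pages")
browser_hits = search(index, "browsers pages")
```

Because the index is built once and then only looked up, a query does not need to re-read every page, which is why an index allows information to be found quickly.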

6.11.3 Search Techniques
Most search engines support the use of the Boolean operators AND, OR and NOT to further specify the search query. Boolean operators are for literal searches that allow the user to refine and extend the terms of the search. The engine looks for the words or phrases exactly as entered. Some search engines provide an advanced feature called proximity search which allows users to define the distance between keywords.
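A minimal sketch of these operators, assuming a tiny made-up collection of documents, can be written with Python sets: AND is set intersection, OR is set union and NOT is set difference. The function `within` is a hypothetical helper that illustrates proximity search.

```python
# Hypothetical documents, identified by number
docs = {
    1: "information theory and coding",
    2: "information security and privacy",
    3: "network security protocols",
}


def matching(term):
    """Set of document ids whose text contains the term as a whole word."""
    return {doc_id for doc_id, text in docs.items() if term in text.split()}


and_result = matching("information") & matching("security")  # information AND security
or_result = matching("information") | matching("security")   # information OR security
not_result = matching("security") - matching("information")  # security NOT information


def within(term_a, term_b, max_gap):
    """Proximity search: ids where the two terms are at most max_gap words apart."""
    hits = set()
    for doc_id, text in docs.items():
        words = text.split()
        pos_a = [i for i, w in enumerate(words) if w == term_a]
        pos_b = [i for i, w in enumerate(words) if w == term_b]
        if any(abs(i - j) <= max_gap for i in pos_a for j in pos_b):
            hits.add(doc_id)
    return hits


near_result = within("information", "security", 1)
```

The matching here is literal, as described above: a document is found only if it contains the word exactly as entered.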

There is also concept-based searching, where the search involves statistical analysis of pages containing the words or phrases searched for. In addition, natural language queries allow the user to type a question in the same form in which one would ask it of a human. An example of such a site is Ask.com.

6.12 Blogs
“Blog” is a short form of “web log”. A blog is an online diary that keeps a running chronology of entries. Readers can comment on posts and can connect to other blogs through blogrolls or trackbacks.

Advantages:
• share ideas easily
• advertise
• obtain feedback
• reverse chronology
• mobilise a community
• comment threads
• persistence
• search ability
• tags
• trackbacks

6.12.1 Wikis
A wiki is a web site that anyone can edit directly within the browser. Wikis are huge information resources on a wide variety of topics.

Advantages:
• collaborate on common tasks
• create a common knowledge base
• a complete revision history is maintained with the ability to roll back changes and revert to earlier versions
• all changes are attributed
• automatic notification of updates
• search ability
• tags
• monitoring
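The idea of an attributed revision history with roll back, listed among the advantages above, can be sketched as follows. The class `WikiPage`, the author names and the page text are all hypothetical; real wiki software stores revisions in a database and is considerably more involved.

```python
class WikiPage:
    """A page that keeps every revision so that changes can be rolled back."""

    def __init__(self, text, author):
        # Each revision records who made the change and the resulting text
        self.revisions = [(author, text)]

    @property
    def text(self):
        """The current text is simply the newest revision."""
        return self.revisions[-1][1]

    def edit(self, new_text, author):
        self.revisions.append((author, new_text))

    def revert_to(self, index, author):
        """Restore an earlier revision by saving it again as the newest one."""
        self.edit(self.revisions[index][1], author)


page = WikiPage("Networks link computers.", "asha")
page.edit("Networks link computers and printers.", "ravi")
page.edit("Unwanted text", "anon")
page.revert_to(1, "asha")   # roll back to ravi's version
```

Note that the revert is itself recorded as a new revision, so nothing is ever lost from the history.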

6.12.2 Electronic Social Network
An electronic social network is an online community that allows users to establish a personal profile, link to others’ profiles (i.e., friends), share content and communicate with members via messaging and posts. Examples of social networking sites are Facebook, LinkedIn, Orkut and so on.

Advantages:
• discover and reinforce affiliations
• identify experts
• message individuals or groups
• virtually share media
• detailed personal profiles using multimedia
• affiliations with groups and individuals
• messaging and public discussions
• media sharing
• “Feeds” of recent activity among members

6.12.3 Micro Blogging
Micro blogging is a short, asynchronous messaging system. Here, users post messages in general or to specific followers, which are displayed on the user’s page. A popular example of a micro blogging site is Twitter.

Advantages:
• distribute time-sensitive information
• share opinions
• virtually spread ideas
• run contests and promotions
• solicit feedback
• provide customer support
• track commentary on firms/products/issues
• organise protests

6.12.4 RSS
RSS is an acronym that stands for both “really simple syndication” and “rich site summary”. It enables busy users to scan the headlines of newly available content and click on an item’s title to view items of interest, thus sparing them from having to continually visit sites to find out what’s new.
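An RSS feed is an XML document that lists items, each with a title and a link. The sketch below reads the headlines from a small feed using Python's standard `xml.etree.ElementTree` module. The feed content and addresses are invented for the example.

```python
import xml.etree.ElementTree as ET

# A made-up feed in the usual RSS 2.0 layout: a channel containing items
feed_xml = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First headline</title>
      <link>http://www.example.com/first</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://www.example.com/second</link>
    </item>
  </channel>
</rss>"""

root = ET.fromstring(feed_xml)

# Collect (headline, address) pairs so a reader can scan them quickly
headlines = [
    (item.findtext("title"), item.findtext("link"))
    for item in root.iter("item")
]
```

A feed reader repeats this kind of step for every subscribed site, which is what spares the user from visiting each site in turn.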

6.12.5 Web 3.0
The difference between web 2.0 and web 3.0 is not very evident, and web 3.0 is considered to be an extension of web 2.0. The web before web 2.0 (web 1.0) was more about finding and reading available data. Web 2.0 is more interactive, with users participating in information generation and exchange. Web 3.0 is more about the meaning of data, the semantic web, personalisation and intelligent search.

To sum up, the World Wide Web and its constant upgrading and evolution provide us with the greatest way of creating, accessing and interacting with information around the world.

6.13 Electronic Mail
The electronic mail service allows an internet user to send a mail to another internet user in any part of the world in a near-real-time manner. The message takes anywhere from a few seconds to several minutes to reach its destination, because it must be passed from one network to another until it reaches its destination. E-mail service has many similarities with the postal mail service with which all of us are familiar. All internet users have an e-mail address, just as all of us have a postal address. Each internet user has a logical mailbox, just as each one of us has a mailbox at home.

While sending a mail to another user, the sender specifies the email address of the receiver, just as we write the postal address of the receiver in the postal mail system. The email service delivers a sent mail into the receiver’s mailbox. The receiver extracts the mail from the mailbox and reads it at their own convenience, just as in a postal mail system. After reading the message, the receiver can save it, delete it, pass it on to someone else or respond by sending another message back.
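The addressing described above can be illustrated with Python's standard `email` package, which builds a message much as one addresses an envelope. The addresses, subject and text are invented for the example, and the message is only constructed, not sent.

```python
from email.message import EmailMessage

# Hypothetical sender and receiver addresses
message = EmailMessage()
message["From"] = "sender@example.com"
message["To"] = "receiver@example.org"
message["Subject"] = "Meeting notes"
message.set_content("Please find the notes from today's meeting below.")

# An email address has two parts: the mailbox name and the mail domain
mailbox, _, domain = str(message["To"]).partition("@")

# The text form of the message, as it would be handed to a mail server
wire_text = message.as_string()
```

The `To` header plays the role of the postal address: the domain tells the mail system which server to deliver to, and the mailbox name tells that server whose mailbox to place the message in.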

Messages in email service can contain not only text documents but also image, audio and video data. The only restriction is that the data must be digitised, that is, converted to a computer-readable format.

With email service, the internet has proved to be a rapid and productive communication tool for millions of users. As compared to paper mail, telephone and fax, email is preferred by many because of the following advantages:

• It is faster than paper mail.
• Unlike the telephone, the persons communicating need not be available at the same time.
• Unlike fax documents, email documents can be stored in a computer and can be easily edited using editing programs.

Summary
• A collection of computers, display terminals, printers and other devices linked either by physical or wireless means is called a network.
• Every network includes components like nodes, hardware and software. Computer networks may be classified on the basis of geographical area in two broad categories, LAN and WAN.
• Networks used to interconnect computers in a single room, rooms within a building or buildings on one site are called Local Area Networks (LAN). A LAN transmits data at a speed of several megabits (10^6 bits) per second.
• The term Wide Area Network (WAN) is used to describe a computer network spanning a regional, national or global area.
• The term topology in the context of a communication network refers to the way the computers or workstations in the network are linked together.
• The ISO (International Organization for Standardization) has created a layered model called the OSI (Open Systems Interconnection) model to describe defined layers in a network operating system. The OSI Reference Model comprises seven conceptual layers, each assigned a “ranking” number from one to seven.
• Every computer on the internet has a unique numerical address called an Internet Protocol (IP) address, used to route packets across the internet.
• Computer users on the internet can contact one another anywhere in the world. It is emerging as a low-cost means of information sharing between almost any two computers that are connected to the public telephone or other telecommunication system.
• The World Wide Web utilises the internet to offer content, documents, multimedia etc. The core of the internet is a network of supercomputers connected to each other by high-speed links. The internet is a packet switching network and the data to be transmitted is converted into small packets.
• A port is a point at which an external device (peripheral) is attached to the computer system.
• The Domain Name System (DNS) as a whole consists of a network of servers that map internet domain names to IP addresses.
• The WWW uses the client-server model and an internet protocol called ‘Hypertext Transfer Protocol’ (HTTP) for interaction between the computers on the internet. Any computer on the internet which serves web pages using the HTTP protocol is called a ‘Web Server’. Any computer which can access that server is called a ‘Web Client’.
• To be used as a web client, a computer needs to be loaded with a special software tool which is known as a ‘WWW browser’.
• A search engine enables users to search the internet for information of interest.
• The electronic mail service allows an internet user to send a mail to another internet user in any part of the world in a near-real-time manner.


Recommended Reading
• Andrew, S. and David, J., 2010. Computer Networks, Prentice Hall Publication.
• James, F. and Keith, W., 2009. Computer Networking: A Top-Down Approach, Addison Wesley Publication.
• Kumar, A.S., 2005. Computer Networks, Firewall Media Publications.

Self Assessment

1. Computers and similar devices within a connection are called _______.
a. hardware
b. nodes
c. software
d. antinodes

2. Match the following:
1. LAN
2. WAN
3. MAN
4. SAN
A. A computer network spanning a regional, national or global area.
B. Networks that connect LANs together within a city.
C. A separate “network” dedicated to storage devices, which at minimum consists of one (or more) large banks of disks mounted in racks that provide ‘shared’ storage space accessible by many servers/systems.
D. Networks used to interconnect computers in a single room, rooms within a building or buildings on one site.
a. 1-A, 2-B, 3-C, 4-D
b. 1-B, 2-C, 3-D, 4-A
c. 1-D, 2-A, 3-B, 4-C
d. 1-C, 2-D, 3-A, 4-B

3. The ___________ is comprised of seven conceptual layers, each assigned a “ranking” number from one to seven.
a. MTD Reference Model
b. CPU Reference Model
c. ALU Reference Model
d. OSI Reference Model

4. _______ was designed as an open protocol that would enable all types of computers to transmit data to each other via a common communications language.
a. ISO OSI model
b. TCP/IP
c. CPU/ALU
d. ISO/IP model

5. The _______ is a packet switching network and the data to be transmitted is converted into small packets.
a. monitor
b. keyboard
c. internet
d. radio

6. Which of the following statements is false?
a. A port is a point at which an external device (peripheral) is attached to the computer system.
b. Ports allow data to be sent/retrieved from the internal device.
c. Modern computers carry many types of ports, for example, mouse, keyboard, serial, USB, parallel (printer), microphone, telephone, network etc.
d. Ports help in transmitting the data from the device to the computer and vice versa.

7. A _____ is a special type of item in a hypertext document which connects the document to another document that provides more information about it.
a. link
b. star
c. cable
d. wire

8. _____ is the publishing language of the World Wide Web.
a. HTML
b. MTNL
c. BSNL
d. VTML

9. An information resource is identified by a ______.
a. modulation
b. demodulation
c. Uniform Resource Identifier
d. networks

10. _______ sites place each web site in their database in one or more predefined subject categories following review by a moderator.
a. Search engine
b. Directory
c. Blogs
d. E-mail

Chapter VII

Introduction to Information Security

Aim

The aim of this chapter is to:
• introduce cryptography
• explain information security
• elucidate the phases of the information security and privacy lifecycle

Objectives

The objectives of this chapter are to:
• explain the information security and privacy lifecycle
• explain basic elements of cryptography
• elucidate aspects of security and software

Learning outcome

At the end of this chapter, you will be able to:
• distinguish between sets of controls for information security and privacy
• understand high-level information security and privacy requirements
• define protocols

7.1 Introduction
Digital information is fundamental to life today. Digital information access devices and websites are everywhere, such as mobile phones, tablet PCs, notebooks, DVD viewers, personal data assistants, digital cameras, flash drives, camcorders, e-commerce sites, blogs, micro-blogs, social networking sites and so on. Digital information permeates organisations as well, with almost all corporate data now stored electronically. The majority of organisations’ asset valuations are no longer in tangible assets like plant and equipment but are embodied in intangible assets like intellectual property that may be stored digitally and therefore more easily appropriated.

Consumers want their personal information kept private, while organisations have competitive and reputational interests in protecting their corporate and client data. With business and social interactions increasingly happening over the Internet, individual personal and sensitive data, corporate confidential and secret data and government diplomatic and infrastructure data flow over open networks and are stored in locations only indirectly under the control of their owners.

And as the numbers and types of websites and devices that access this information proliferate, so do the risks and challenges of staying safe and secure. What is needed among this chaotic interplay of ever-expanding data flows and changing technologies, geographies, and business needs is a stable methodology that can be used at all times to understand and manage the dynamic set of risks to personal, corporate, and customer information.

7.2 Information Security and Privacy Lifecycle
This high-level lifecycle methodology requires the design and implementation of underlying processes specific to each organisation, country and industry to address current as well as future information security and privacy risks.

This Information Security and Privacy Lifecycle methodology comprises the following five phases:
• Synthesis of all legal obligations from applicable information security and privacy laws and regulations
• Analysis of all information security and privacy legal liability exposures
• Creation of information security and privacy policies and assessment of information security and privacy risks
• Selection, design and implementation of information security and privacy controls
• Compliance, audit and certification of the information security and privacy programs

After completion of the final phase, the lifecycle starts again with the first phase in a repetitive loop, as depicted in Fig. 7.1. Before discussing this lifecycle, everyone involved in this area must fully understand an important concept: information security and privacy is not a separate discipline within each organisation, like accounting or sales, but instead needs to permeate the organisation and be owned and executed by every executive, employee, contractor, vendor, and customer of the organisation. More than ever, top executives need to fully understand the obligations, liabilities, risks, and treatments involving information security and privacy. Leadership buy-in and follow-through are essential, as was most recently reemphasised from the top of the U.S. government:

It is not enough for the information technology workforce to understand the importance of cybersecurity; leaders at all levels of government and industry need to be able to make business and investment decisions based on knowledge of risks and potential impacts.

[Figure: a circular diagram of five phases that repeat in order: 1. Statutes and Regulations, 2. Sources of Potential Liability, 3. Policies and Risk Assessment, 4. Security and Privacy Controls, 5. Compliance, Audit and Certification]

Fig. 7.1 The information security and privacy lifecycle
(Source: http://apps.americanbar.org/abastore/products/books/abstracts/5450058%20excerpt_abs.pdf)

Phase 1: Synthesis of statutory and regulatory requirements
Organisations must understand their information security and privacy obligations from statutes and regulations in each country where they do business, including any industry sector-specific rules. To craft a single set of information security and privacy rules usable worldwide, a synthesised global legal view should be created from all applicable current and prospective (where possible) laws. This can be done on a regional instead of global level but may lead to disparate and potentially conflicting policies.

The global legal view includes the laws in each region, country and state/province in which a company operates, hosts (or outsources) data, or collects data. Once all the laws and their information security and privacy provisions have been identified, they must be synthesised in a manner that encompasses all these requirements into a single legal view. A company, for example, operating in Europe, the United States and Japan would have to comply with the information security and privacy rules in the European Union privacy directive, the myriad U.S. state and federal laws on information security and privacy and Japan’s law on the protection of personal information, plus other regulations for the specific industries, association rules and laws of countries where it does business electronically or stores data in the Internet cloud.

Synthesis example
Illustrating this process of a synthesised global legal view is the very simple example of a Japanese corporation doing business only in the United States and Japan. In general, Japanese law requires corporations holding customers’ or employees’ personal information to take the “necessary and proper measures” to exercise control over that data, including ensuring that third parties who process this corporate data implement similarly adequate levels of security protection. The information security requirements are not specific but fall under the general banners of reasonableness and practicality.

In the United States, an organisation may be required to adhere to sector-specific information security and privacy rules. Companies in the U.S. consumer financial sector are subject to the Gramm-Leach-Bliley Act and its Safeguards Rule requiring a comprehensive security program, including physical, technical, and administrative controls. Those involved with the health care industry are subject to the Health Insurance Portability and Accountability Act, amended by the Health Information Technology for Economic and Clinical Health Act. Under its Security Rule, there are technical, physical and administrative safeguards to ensure the confidentiality, integrity and availability of electronically stored personal health information. The act also requires providing protection against reasonably anticipated threats, designating responsible individuals, providing employee training, performing risk assessments, providing breach notification, and establishing contractual compliance provisions for third-party data processors.

More generally, the company’s U.S. operations will be subject to the Federal Trade Commission (FTC) Act’s Section 5 for unfair or deceptive trade practices regarding information security. The lack of a reasonable information security program may be considered an unfair trade practice or even a deceptive trade practice if actual practices differ from the stated security policy. Consent decrees often require respondents to implement and maintain a comprehensive information security program to protect the security, confidentiality, and integrity of personal information, including administrative, technical and physical safeguards considering a respondent’s size, complexity, and activities and the sensitivity of the personal information.

The FTC’s “Red Flags Rule” for identity theft requires corporations selling and billing for goods or services to regularly assess the risk of identity theft and to develop “reasonable and appropriate” protections. At the U.S. state level, some statutes go further than current federal law in prescribing certain security requirements, so their provisions should be added to the synthesised information security and privacy legal view (the effect of potential preemption is ignored here, as a best practices synthesised statute is preferable even if not strictly required currently).

For example, Nevada’s encryption requirement mandates the use of encryption for any non-fax electronic transmission sent outside the sender’s secure system and requires the use of the Payment Card Industry Data Security Standard (PCI DSS) for card payments. Rhode Island’s data destruction statute requires a business to take reasonable steps to destroy or make unreadable customers’ personal information it no longer needs to retain. California’s breach notification statute requires businesses that have customers’ electronic personal information to notify persons whose unencrypted data is subject to unauthorised access. Massachusetts’s information security law requires a comprehensive, written information security program with administrative, physical, and technical security controls as well as the following:

• Developing information security policies and designating a leader for the program.
• Creating an inventory of personal information and maintaining oversight of third-party service providers.
• Monitoring the program and performing annual reviews.
• Monitoring for unauthorised use and incident management procedures.
• Establishing user authentication and access control procedures.
• Encrypting transmitted records and stored data on mobile devices.
• Maintaining up-to-date network and system protection software.
• Providing employee security training.

Synthesised Legal View
Consolidating the provisions from these various statutes results in a synthesised global legal view that encompasses at least the following high-level information security and privacy requirements:

Information Security/Privacy Policy | Inventory of Personal Information
Risk Assessment Process | Internal Reviews and Monitoring
Reactive and Preventive Controls | Incident Management and Monitoring
Data Destruction/De-identification | User Authentication and Access Controls
Breach Notification | Encryption (stored data)
Responsible Person | Encryption (transmission)
Competence of Personnel | Up-to-Date System/Network Software
Special Rules for Information Brokers | Employee Security Training
PCI DSS Use for Electronic Payments | Oversight of Third Parties/Provisions
Identity Theft Assessment | Special Protections for Sensitive Data
Administrative and HR Security | Physical and Environmental Security
Personal Info Collection/Use Limits | Personal Info Integrity and Correction
Third-Party Transfer Restrictions | Choice and Accountability

Table 7.1 High-level information security and privacy requirements
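The consolidation step can be pictured as taking the union of the requirements imposed by each law. The sketch below is a simplified illustration only: the assignment of requirement labels to the four U.S. state statutes is our own rough reading of the examples given earlier, not a legal analysis.

```python
# Requirement labels per statute: an illustrative, simplified assumption
state_requirements = {
    "Nevada": {
        "Encryption (transmission)",
        "PCI DSS Use for Electronic Payments",
    },
    "Rhode Island": {"Data Destruction/De-identification"},
    "California": {"Breach Notification"},
    "Massachusetts": {
        "Information Security/Privacy Policy",
        "Responsible Person",
        "Inventory of Personal Information",
        "Oversight of Third Parties/Provisions",
        "Internal Reviews and Monitoring",
        "Incident Management and Monitoring",
        "User Authentication and Access Controls",
        "Encryption (transmission)",
        "Encryption (stored data)",
        "Up-to-Date System/Network Software",
        "Employee Security Training",
    },
}

# The synthesised view is the union: every requirement appears exactly once
synthesised_view = set().union(*state_requirements.values())

# For each requirement, record which statutes give rise to it
sources = {
    requirement: sorted(
        state for state, reqs in state_requirements.items() if requirement in reqs
    )
    for requirement in sorted(synthesised_view)
}
```

A requirement imposed by more than one law, such as encryption of transmissions in this example, appears only once in the synthesised view, while `sources` still shows where it comes from.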

Phase 2: Analysis of potential exposures to legal liability
Analysis of potential exposures to legal liability covers the following areas:

Contractual
Beyond what is mandated by law, what is unique to each organisation is the particular set of contractual and other commitments to implement certain information security and privacy controls and the possible tortious claims based on failure or absence of those controls. This phase may take significant time to complete, as it requires gathering data about the organisation’s relationships and its use of information security and privacy across the world. The easiest place to begin is to inventory the organisation’s information assets. This inventory includes the hardware, system and application software, development and testing environments, and the networks and facilities that the organisation uses to transmit and host data and the owners and custodians thereof.

How an organisation acquired each asset will lead to a list of vendor agreements and any contractual information security and privacy requirements and restrictions. In addition, all outsourcing agreements and service level agreements (SLAs) must be obtained and examined for these provisions. A complete inventory of customer agreements is also necessary to find the information security and privacy provisions therein and to determine potential liability exposures.

Torts and other
Possible areas of liability for tort claims should be identified proactively, so that exposures can be determined and the proper controls and legal defenses can be built in advance. Also, a complete understanding of non-regulatory information security and privacy requirements, such as industry association rules, must be obtained. A starting point is a complete understanding of the business model of the organisation.

How does the company use information in delivering its products and services? Whose information is it using (that is, its customers’, its employees’, or its own)? What classification levels are applied to the data? How is the data accessed and by whom? Who owns each type of data? What data and processing are outsourced, to whom, and where? What other laws and regulations is the organisation subject to? What insurance coverage is in place? What countries does this organisation do business in?

Phase 3: Information security/privacy policies and risk assessment frameworks
We will look at information security/privacy policies and risk assessment frameworks separately and in brief in the following sections:

Information security/Privacy policies
Each organisation must be guided by its own policies in information security and privacy. These policies are in response to the statutory, regulatory, and contractual commitments and business needs and risks an organisation faces. The information security/privacy policy guides the corporation and its employees, customers, and vendors in the use of information security and privacy. The first step is to set the scope for the information security/privacy policy (although there can be and usually are separate information security and privacy policies, for simplicity they will be discussed as a single unit) as it can apply to all systems, organisations, technology, assets, and countries or any subset thereof. The commitment of corporate management to information security and privacy must be documented (and if not sufficient, remedied).

The roles of all the stakeholders (for example, users, custodians, managers, owners) are then documented (and assigned, if not already done). This process continues until all the legal, business, and technological directions for information security and privacy are addressed. The creation of a complete information security/privacy policy is quite involved, and whole books are devoted to the topic. Its creation and dissemination typically require many iterations, revisions, and approvals, but a typical information security and privacy policy will have at least the following high-level components:

Management’s Commitment | Roles of All Stakeholders
Data Classification | Acceptable Use
Physical Security | Change Management
Malware | Media Handling
Backup/Business Continuity | E-mail/Messaging Systems
Data and Media Destruction | Encryption
Software Patching | Authentication
Monitoring and Logging | Access Control
Password Management | Network Access
Systems Development | Third-Party Compliance
Incident Management | Statutory Compliance
Use of Mobile/Wireless Devices | Human Resources
Limits on Collection, Use and Disclosure of Personal Data | Destruction/De-identification of Unused Personal and Sensitive Data
Right of Access and Correction | Choice/Right to Object
Notice to Data Subject | Supervision of Third-Party Processors
Data Integrity | Limits on Cross-Border Transfers
Limits on Retention Periods | Limits on Direct Marketing/Opt-out
Data Subject Consent | Business Transfer Notification
Limits on Sensitive Data/Security | Data Breach Notification

Table 7.2 High-level components of a typical information security and privacy policy

Risk assessment/Management
To realise the aspirations of the information security and privacy policy, the proper controls must be put into place to manage the various legal, business and technical risks. To know which information security and privacy controls are needed, a risk assessment process must be undertaken. Risk assessment requires understanding the external and internal threats to an organisation’s information assets and the vulnerabilities of its current systems and processes.

The risks are typically assessed on either a qualitative high-medium-low scale or a quantitative numerical scale, taking account of the likelihood that threats will materialise and the impact of loss based on the sensitivity and criticality of the in-scope information assets. Many risk assessment processes are available, all of which should lead to essentially the same results. Three of the most prominent risk assessment and management standards and guidelines are those from the International Organisation for Standardisation (ISO), the U.S. National Institute of Standards and Technology (NIST), and ISACA (formerly the Information Systems Audit and Control Association), specifically ISO 27005, NIST’s Risk Management Framework (RMF), and ISACA’s Risk IT, respectively.
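The idea of combining likelihood and impact can be illustrated with a small sketch. This is not the method of any of the standards named above; the three-point scale, the multiplication rule, the thresholds, and the asset names are all assumptions chosen only for illustration.

```python
# Illustrative qualitative risk scoring. The scale, the multiplication rule
# and the thresholds below are assumptions, not taken from any standard.
LEVELS = {"low": 1, "medium": 2, "high": 3}


def risk_score(likelihood: str, impact: str) -> int:
    """Combine likelihood and impact ratings into a numeric score (1 to 9)."""
    return LEVELS[likelihood] * LEVELS[impact]


def risk_rating(score: int) -> str:
    """Map a numeric score back onto a high-medium-low scale."""
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


# Hypothetical asset register: (asset, likelihood of threat, impact of loss)
assets = [
    ("customer database", "medium", "high"),
    ("public web site", "high", "low"),
    ("internal wiki", "low", "low"),
]

for name, likelihood, impact in assets:
    score = risk_score(likelihood, impact)
    print(f"{name}: score={score}, rating={risk_rating(score)}")
```

In practice, an organisation would choose its own scales and thresholds; the point of the sketch is only that the rating depends on both factors, so a likely but low-impact threat can rank below an unlikely but severe one.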

ISO 27005 describes a six-step risk management model: context establishment, risk assessment, risk treatment, risk acceptance, risk communication, and risk monitoring and review. NIST’s Risk Management Framework (RMF) uses a six-phase model for new system implementations: categorise the system and information processed by it, select security controls for the system, implement the security controls, assess the security controls, authorise information system operation, and monitor the security controls. Risk IT has three domains with three processes in each: Risk Governance (establish and maintain a common risk view, integrate with enterprise risk management, and make risk-aware business decisions); Risk Evaluation (collect data, analyse risk, and maintain risk profile (inventory)); and Risk Response (articulate risk, manage risk, and react to events). Risk IT views information security risk as just one of the risks of IT.

Before implementing the risk methodologies, an inventory and valuation of all the organisation’s information assets must be undertaken. Regardless of which risk methodology is used, it requires a significant time and resource commitment from the organisation to design, implement, and maintain. Risk assessment processes must be repeated regularly to address new threats and vulnerabilities that arise as well as changes to the business or the systems used and any new information assets that are introduced, including those that incorporate technologies new to the organisation.

Phase 4: Information security and privacy controls
Once the risks have been assessed and the potential impacts understood, the risks must be prioritised and decisions made on how to respond. Risks can be retained, transferred/shared, or avoided, or controls can be used to mitigate the risks. For the latter, the same three organisations previously mentioned have suggested lists of controls: NIST’s security control families, ISO 27002, and ISACA’s COBIT.

Control groupings
The NIST controls are divided into three classes: management, operations, and technical. In addition to program management, controls are grouped into seventeen families: Access Control, Awareness and Training, Audit and Accountability, Security Assessment and Authorisation, Configuration Management, Contingency Planning, Identification and Authentication, Incident Response, Maintenance, Media Protection, Physical and Environmental Protection, Planning, Personnel Security, Risk Assessment, System and Services Acquisition, System and Communications Protection, and System and Information Integrity.

The ISO 27002 controls are divided into administrative, technical, and physical, covering organisation, asset management, human resources security, physical and environmental security, communications and operations management, access control, systems development and maintenance, incident management, business continuity management, and compliance. COBIT is divided into four domains, namely Plan and Organise, Acquire and Implement, Deliver and Support, and Monitor and Evaluate, and thirty-four (34) processes (COBIT, unlike the other two methodologies, is not solely for information security). ISACA has published several mapping documents that explain in detail the differences among these control methodologies.

In addition to procedural controls, contractual controls must be implemented to ensure that all external entities, including customers, vendors, and agents who interact with the organisation, are processing its data with at least the same level of information security and privacy controls. Standardised provisions addressing how to protect data and later destroy data at contract termination are needed. As the controls used for information security and privacy will overlap those of other disciplines (e.g., finance, information technology, human resources, compliance), a great deal of coordination is required in selecting controls based on the information security and privacy objectives.

The following is a high-level description of the minimum administrative, physical, and technical information security controls that any organisation should design, implement, and continually practice and review. Detailed controls require an analysis of each organisation’s business.

Security Measures | Description

Administrative
1. Separation of Duties and Environments | Critical functions of users and administrators are split among different members of the organisation, and the production environment is separated.
2. Employee Training and Awareness | All employees and contractors must be trained for their controls and regularly be made aware of new security issues and procedure changes.
3. Human Resources | Security roles are defined, and training is conducted regularly.
4. Independent External Testing | Tests to ensure that networks and systems cannot be externally (or internally) penetrated and external audits of security controls are conducted.
5. Third-Party Access and Oversight | The service levels of third parties are contractually committed to, their access controlled, and their activities supervised.
6. Internal Audits | Tests are conducted to ensure that controls are designed properly and are working as designed and that all laws are being complied with.
7. Management Reviews | Reviews of the security policy, risk assessments, and security controls are conducted on a regular basis, including legal compliance and implementation of follow-up actions.

Physical
8. Physical Access Controls | Controls are implemented for all entrances to secure areas and access.
9. Environmental Controls | Fires, earthquakes, floods, riots, etc., are appropriately addressed.

Technical
10. Authentication Controls | Controls are implemented to ensure that users are who they claim to be through appropriate use of multifactor identification techniques, including password standards.
11. User Access Controls | Controls are implemented to ensure that only authorised users can access data and programs and that those who are no longer authorised (e.g., terminated employees) cannot.
12. Malware Protection | Controls are implemented to limit the impact of software viruses and other malware.
13. System Monitoring and Capacity Controls | Network and system events and operator and system administrator actions are set, monitored, and recorded into logs, which are then reviewed.
14. Encryption-Storage and Transmission | Controls are implemented for the proper use of encryption technology for data storage and transmission and the proper encryption key management controls.
15. Mobile Device and Media Controls | Controls are implemented over the use of mobile computing and storage devices and all removable media, disabling the use of such devices to the extent possible.
16. E-mail and e-Commerce System Controls | Controls are implemented over all applications that interact externally, including e-mail, e-Commerce, EDI, SaaS, FTP servers, websites, blogs, etc., and the use of attachments.
17. Wireless Access Points | Controls are implemented over the use of wireless access points into a corporate network.
18. Regular Backup and Business Continuity | Periodic backups are performed and sent off-site, and facilities and plans are built and tested to ensure availability during device outages or disasters.
19. Application Controls | Checks are made during data input, processing and output.
20. Operational Procedures | Controls are implemented to ensure that ops processes are documented and available to all.
21. Change Management | Controls are implemented to ensure that application, system, and data changes are managed appropriately to minimise impact on availability, integrity, and confidentiality.
22. Incident Management | Incidents are identified, isolated, responded to, resolved, documented, and followed up.
23. Information Ownership and Classification | Controls are implemented to ensure that all information is inventoried and owned by someone who assumes responsibility for its integrity and its classification.
24. Physical and Logical Segregation of Data | Customer data is logically and physically segregated from all other customer data and from corporate data, and secret and sensitive data are separated.
25. Asset Management | Information assets are inventoried, tracked, maintained, and disposed of properly.

Table 7.3 Minimum set of controls

Phase 5: Compliance, audit, and certification
Phase 5 includes compliance, audit and certification; we will look at these in detail.

Compliance and audit
After the controls are implemented, their use in the daily operations of the organisation must be monitored for compliance with the information security/privacy policies and control objectives and ultimately the applicable laws and regulations. The regular monitoring and evaluation for effectiveness comes from two directions: internal and external. Internal monitoring involves the response to, review of, and follow-up of all security incidents that arise and periodic reviews by management of the effectiveness of the information security/privacy program. This monitoring should be part of the ongoing information security/privacy policy, risk assessment, and control review processes. External reviews include reviews by customers and by independent auditors employed by the organisation itself.

An example is the American Institute of CPAs’ (AICPA) SysTrust methodology. This procedure requires an assurance audit on the five SysTrust principles (security, availability, processing integrity, confidentiality, and privacy). The security principle includes documenting, communicating, and monitoring the AICPA’s security policies and procedures. Other types of external audits include those that are part of Sarbanes-Oxley internal controls reviews, AICPA SAS 70/ISAE 3402 service provider audits, and NIST’s security control assessments. External vulnerability reviews for potential system and network penetration attacks should be based on an overall testing and assessment methodology such as NIST’s SP 800-115 guidelines.

Certification
To ensure they have implemented best practices, organisations may seek independent certification of their information security/privacy program. This is most typically done under the ISO 27001 certification standard. This standard describes all the components that an adequate information security management system must have. In conjunction with the other ISO 27000 standards, an independent ISO-designated certifier will examine both the design and ongoing operation of the information security management program to determine if it meets the described standard, including the security policies and controls described previously. There are other audits specifically targeting privacy.

Before moving on to the detailed chapters, four additional concepts will help to create the foundation needed for an organisation’s total commitment to information security and privacy. First is an understanding of the reasons to protect data, which include both the direct costs associated with data breaches of customer, employee, or corporate information (responding to the incidents and settling claims) and the indirect costs (harm to reputation and resultant loss of business). Statutes and regulations also create general and specific requirements for information security and privacy. And corporate boards of directors and officers have a duty of care to safeguard data. Second is an explanation of just what information security is. Third is a presentation of examples of data breaches that occur when information security fails. And fourth is a discussion of the dynamic relationship between privacy and information security.

7.3 Costs of Data Loss and Disclosure
The increase of available information on technical media contributes to fiscal motivations for the growth of “cybercrime.” In the past, demonstrating technical prowess by breaking into seemingly secure sites was an adequate reason for cybercrime, but today a multi-billion-dollar industry has grown around data theft. This industry has taken hold because of the ways in which people and companies use technology resources. For example, in the United States, 8 out of 10 households now bank online. The rise of online banking and the prevalence of malware on consumers’ computers contribute to an annualised rate of $480 million in online banking fraud. A black market exists for both industry and consumer information where stolen data is readily traded. From 2002 through 2009, the overall amount of card fraud more than doubled, from $3 billion in losses to about $7 billion. In 2010, for the first time ever, theft of information replaced theft of physical assets and stock as the leading type of fraudulent activity globally.

The cost to individual businesses from information security breaches is staggering. In 2009, the average cost per incident of a data breach in the United States was over $6 million with the cost of a single breached record estimated at $204. Costs are incurred in completing security repairs, performing investigations, complying with laws, and covering litigation-related expenses. If a small business has 50,000 customers in its database and has to pay $3 to mail a record to each customer in the event of a breach, it will spend $150,000 just mailing the notification to satisfy data breach laws. Loss of reputation, goodwill, and increased customer churn can often be more costly for an organisation than demonstrable financial expenditures.
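The mailing figure above is simple arithmetic, and it can be reproduced with a short sketch. The function names are our own, and applying the $204 per-record estimate to the same 50,000-record database is a hypothetical extrapolation for comparison, not a figure from the studies cited.

```python
def notification_cost(customers: int, cost_per_notice: float) -> float:
    """Cost of mailing one breach notification to every customer."""
    return customers * cost_per_notice


def record_based_cost(records: int, cost_per_record: float) -> float:
    """Rough total cost if every breached record costs the same amount."""
    return records * cost_per_record


# Figures from the text: 50,000 customers at $3 per mailed notification.
mailing = notification_cost(50_000, 3.00)
print(f"Mailing alone: ${mailing:,.0f}")

# Hypothetical extrapolation: the same database at $204 per breached record.
estimate = record_based_cost(50_000, 204.00)
print(f"Per-record estimate: ${estimate:,.0f}")
```

Under these assumptions the mailing cost is $150,000, while the per-record estimate comes to $10,200,000, which suggests how small the notification expense is relative to the other costs of a breach.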

Loss due to data breach is a global problem. A study by a protection software vendor estimates that over $1 trillion was lost by organisations worldwide in 2009 due to loss of intellectual property and the costs of repairing data breaches. Reported data breaches range from the theft of personal information from customers of Japanese online supermarkets via an SQL injection attack to the loss in the United Kingdom of two password-protected CDs containing the names, birth dates, and National Insurance numbers of 25 million children, parents, guardians, and caregivers involved with the HM Revenue and Customs child benefit. The costs of data breaches in several countries around the world are now being analysed.

7.3.1 Cryptography
Cryptography, or “secret codes”, is a fundamental information security tool. Cryptography has many uses, including the protection of confidentiality and integrity, among many other vital information security functions. Cryptography is the essential background for much of the remainder of the book. The discussion of cryptography starts with a look at a handful of classic cipher systems. These classic systems illustrate fundamental principles that are employed in modern digital cipher systems, but in a more user-friendly format.

This background helps in studying modern cryptography. Symmetric key cryptography and public key cryptography both play major roles in information security. Modern cryptography also includes hash functions, which are another fundamental security tool. Hash functions are used in many different contexts in information security. Some of these uses are quite surprising and not always intuitive. Applications of hash functions include online bidding and spam reduction.
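To give a feel for how a hash function might be used in online bidding, the following sketch lets a bidder publish a hash of a bid first and reveal the bid later. The scheme, the function names `commit` and `verify`, and the choice of SHA-256 with a random nonce are our own assumptions for illustration; the text does not specify a particular construction.

```python
import hashlib
import secrets
from typing import Tuple


def commit(bid: str) -> Tuple[str, bytes]:
    """Return (commitment, nonce). The commitment is published at once;
    the bid and the nonce are kept secret until bidding closes."""
    # The random nonce stops others from guessing the bid by hashing
    # every plausible amount and comparing with the commitment.
    nonce = secrets.token_bytes(16)
    digest = hashlib.sha256(nonce + bid.encode("utf-8")).hexdigest()
    return digest, nonce


def verify(commitment: str, bid: str, nonce: bytes) -> bool:
    """Check that a revealed bid and nonce match the earlier commitment."""
    digest = hashlib.sha256(nonce + bid.encode("utf-8")).hexdigest()
    return secrets.compare_digest(digest, commitment)


# Hypothetical usage: Alice commits to a bid of 100, then reveals it.
commitment, nonce = commit("100")
print(verify(commitment, "100", nonce))  # the honest reveal is accepted
print(verify(commitment, "101", nonce))  # a changed bid is rejected
```

The design relies on two properties of a cryptographic hash: the commitment reveals nothing useful about the bid, and it is infeasible to find a different bid that produces the same commitment.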

To learn special topics that are related to cryptography, we will consider the example of some fictitious characters throughout the book. For example, we’ll discuss information hiding, where the goal is for Alice and Bob to communicate information without Trudy even knowing that any information has been passed. This is closely related to the concept of digital watermarking, which we also cover briefly. The final chapter on cryptography deals with modern cryptanalysis, that is, the methods used to break modern cipher systems. Although this is relatively technical and specialised information, it’s necessary to appreciate the attack methods in order to understand the design principles behind modern cryptographic systems.

7.3.2 Access Control
Access control deals with authentication and authorisation. In the area of authentication, there are many issues related to passwords. Passwords are the most often used form of authentication today, but this is primarily because passwords are free and definitely not because they are secure.

Access control also considers how to store passwords securely, along with the issues surrounding secure password selection. Although it is possible to select strong passwords that are relatively easy to remember, it’s difficult to enforce such policies on users. In fact, weak passwords present a major security weakness in most systems. The alternatives to passwords include biometrics and smartcards. Some of the security benefits of these forms of authentication are considered, along with the details of several biometric authentication methods. Authorisation deals with restrictions placed on authenticated users. Once Alice’s Bank is convinced that Bob is really Bob, it must enforce restrictions on Bob’s actions.
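Returning to the question of how passwords can be stored securely, one common approach is to store a salted, deliberately slow hash instead of the password itself. The sketch below uses PBKDF2 from the Python standard library; the iteration count, salt length, and function names are assumptions for illustration, not recommendations from the text.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor; tune for the target hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). Only these two values are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for every user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# Hypothetical usage
salt, stored = hash_password("correct horse battery staple")
print(check_password("correct horse battery staple", salt, stored))
print(check_password("password123", salt, stored))
```

The salt means that two users with the same password get different stored values, and the iteration count makes each guess by an attacker who steals the password file more expensive.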

The two classic methods for enforcing such restrictions are access control lists and capabilities. Each of these authorisation methods has its pluses and minuses. Authorisation leads naturally to a few relatively specialised topics. We’ll discuss multilevel security (and the related topic of multilateral security). For example, the military has TOP SECRET and SECRET information. Some users can see both types of information, while other users can only see the SECRET information. If both types of information are on a single system, how can we enforce such restrictions? This is an authorisation issue that has potential implications far beyond classified military and government systems.
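The difference between access control lists and capabilities can be seen by viewing both as slices of one access control matrix: an access control list is a column (stored with the object), while a capability list is a row (stored with the subject). The subjects, objects, and rights in the following sketch are hypothetical.

```python
# Hypothetical access control matrix: keys are (subject, object) pairs,
# values are the set of rights the subject holds on the object.
matrix = {
    ("alice", "payroll.db"): {"read", "write"},
    ("bob", "payroll.db"): {"read"},
    ("bob", "audit.log"): {"read"},
}


def acl(obj: str) -> dict:
    """Access control list: one column of the matrix, kept with the object."""
    return {s: rights for (s, o), rights in matrix.items() if o == obj}


def capabilities(subject: str) -> dict:
    """Capability list: one row of the matrix, kept with the subject."""
    return {o: rights for (s, o), rights in matrix.items() if s == subject}


def allowed(subject: str, obj: str, right: str) -> bool:
    """Authorisation check against the full matrix."""
    return right in matrix.get((subject, obj), set())


print(acl("payroll.db"))
print(capabilities("bob"))
print(allowed("bob", "payroll.db", "write"))
```

With access control lists it is easy to see who can use a given object; with capabilities it is easy to see everything a given user can do. That trade-off is one of the pluses and minuses referred to above.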

Multilevel security leads naturally into the rarefied air of security modelling. The idea behind such modelling is to lay out the essential security requirements of a system. Ideally, by verifying a few simple properties, we would know that a particular system satisfies a particular security model. If so, the system would automatically inherit all of the security properties that are known to hold for such a model. The two simplest security models are considered, both of which arise in the context of multilevel security. Multilevel security also provides an opportunity to discuss covert channels and inference control. Covert channels are unintended channels of communication. Such channels are common and create potential security problems. Inference control attempts to limit the information that can unintentionally leak out of a database due to legitimate user queries. Both covert channels and inference control are difficult problems to deal with effectively in real-world systems.
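The TOP SECRET and SECRET example can be sketched as a comparison of levels. The rule shown here, often described as “no read up”, is our assumption about the kind of property a multilevel security model would state; the level names come from the text, while the numeric ordering and the function name are ours.

```python
from enum import IntEnum


class Level(IntEnum):
    """Security levels, ordered so that a higher value is more sensitive."""
    SECRET = 1
    TOP_SECRET = 2


def can_read(clearance: Level, classification: Level) -> bool:
    """A user may read information only at or below their own clearance."""
    return clearance >= classification


# A TOP SECRET user sees both types of information; a SECRET user does not.
print(can_read(Level.TOP_SECRET, Level.SECRET))
print(can_read(Level.SECRET, Level.TOP_SECRET))
```

A single comparison like this is exactly the sort of simple property that, once verified everywhere data is read, would let us conclude that the whole system satisfies the model.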

Since firewalls act as a form of access control for the network, we stretch the usual definition of access control to include firewalls. Regardless of the type of access control employed, attacks are bound to occur. An intrusion detection system (IDS) is designed to detect attacks in progress. So we include a discussion of IDS techniques after our discussion of firewalls.

7.3.3 Protocols
We’ll then cover security protocols. First, we’ll consider the general problem of authentication over a network. Many examples will be provided, each of which illustrates a particular security pitfall. For example, replay is a critical problem, and we’ll consider ways to prevent such an attack.

Cryptography will prove useful in authentication protocols. We’ll give examples of protocols that use symmetric cryptography, as well as examples that rely on public key cryptography. Hash functions also have an important role to play in security protocols. Our study of simple authentication protocols will illustrate some of the subtleties that can arise in the field of security protocols. A seemingly insignificant change to a protocol can completely change its security. We’ll also highlight several specific techniques that are commonly used in real-world security protocols.
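One way to prevent replay is a challenge-response exchange, in which the verifier sends a fresh random challenge and accepts each challenge only once. The sketch below is an illustration under our own assumptions: it uses an HMAC over the challenge with a key shared in advance, and the names `Verifier`, `issue`, `check`, and `respond` are hypothetical, not a protocol defined in the text.

```python
import hashlib
import hmac
import secrets

# Assumed to have been shared securely by Alice and Bob in advance.
SHARED_KEY = secrets.token_bytes(32)


def respond(key: bytes, challenge: bytes) -> bytes:
    """Prover's side: prove knowledge of the key for this challenge only."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


class Verifier:
    def __init__(self, key: bytes) -> None:
        self.key = key
        self.outstanding = set()  # challenges issued but not yet answered

    def issue(self) -> bytes:
        """Send a fresh random challenge (a nonce) to the prover."""
        challenge = secrets.token_bytes(16)
        self.outstanding.add(challenge)
        return challenge

    def check(self, challenge: bytes, response: bytes) -> bool:
        """Accept a response at most once per issued challenge."""
        if challenge not in self.outstanding:
            return False  # unknown or already used: possible replay
        self.outstanding.discard(challenge)
        expected = respond(self.key, challenge)
        return hmac.compare_digest(expected, response)


bob = Verifier(SHARED_KEY)
challenge = bob.issue()
answer = respond(SHARED_KEY, challenge)
print(bob.check(challenge, answer))  # first use is accepted
print(bob.check(challenge, answer))  # Trudy replaying the same pair fails
```

Note how small the safeguard is: removing the single line that discards a used challenge would leave the cryptography untouched but make replay possible again, which illustrates how a seemingly insignificant change can completely change a protocol's security.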

Then we move on to study four specific security protocols. The first of these is the Secure Sockets Layer, or SSL, which is used extensively to secure e-commerce on the Internet today. SSL is an elegant and efficient protocol. We’ll then discuss IPSec, which is another Internet security protocol. Conceptually, SSL and IPSec share many similarities, but the implementations differ greatly. In contrast to SSL, IPSec is complex and “over-engineered.” Apparently due to its complexity, several security flaws are present in IPSec despite a lengthy and open development process. This nicely illustrates the challenges inherent in developing security protocols. The third real-world protocol that we’ll consider is Kerberos, which is an authentication system based on symmetric cryptography. Kerberos follows an approach much different from either SSL or IPSec.

We’ll also discuss the security mechanisms employed in GSM, a cellular phone system. Although the GSM security protocol is fairly simple, it’s an interesting case study due to the large number of known attacks. These attacks include various combinations of attacks on the protocol itself, as well as the underlying cryptography.

7.3.4 Software
Aspects of security and software are a huge topic. Some of the security flaws and malware are already mentioned above. Software reverse engineering is considered in order to illustrate how a dedicated attacker can deconstruct software, even without access to the source code. We then apply our newfound hacker’s knowledge to the problem of digital rights management, which provides an excellent example of the limits of security in software, particularly when that software must execute in a hostile environment.

Our final software-related topic is operating systems (OSs). The OS is the arbiter of most security operations, so it’s important to understand how the OS enforces security. We then consider the requirements of a so-called trusted OS. A trusted OS provides strong assurances that the OS is performing properly. After this background, we consider a recent attempt by Microsoft to implement a trusted OS for the PC platform. This discussion further illustrates the challenges inherent in implementing security in software.

7.4 The People Problem
Clever users have the ability to destroy the best laid security plans. For example, suppose that Bob wants to purchase an item from Amazon.com. Bob can use his Web browser to securely contact Amazon using the SSL protocol, which relies on cryptographic techniques. Various access control issues arise in such a transaction, and all of these security mechanisms are enforced in software. Now suppose that an attack on this transaction causes Bob’s Web browser to issue a warning. Unfortunately, if Bob is a typical user, he will simply ignore the warning, which has the effect of defeating the security regardless of how secure the cryptography, how well-designed the protocols and access control mechanisms, and how flawless the software.

To take just one more example, a great deal of security today rests on passwords. Users want to choose easy to remember passwords, but this makes it easier for Trudy to guess passwords. An obvious solution is to assign strong passwords to users. However, this is almost certain to result in passwords written on post-it notes and posted in prominent locations, making the system less secure than if users were allowed to choose their own (relatively weak) passwords.

The primary focus of this book is on understanding security mechanisms, the nuts and bolts of information security. In a few places, the “people problem” is discussed. Much more could be said about the role that humans play in information security; the literature is filled with case studies of security failures, most of which have their roots firmly in human nature.

Summary
• Digital information permeates organisations as well, with almost all corporate data now stored electronically.
• Organisations must understand their information security and privacy obligations from statutes and regulations in each country where they do business, including any industry sector-specific rules.
• The global legal view includes the laws in each region, country and state/province in which a company operates, hosts (or outsources) data, or collects data.
• The information security requirements are not specific but fall under the general banners of reasonableness and practicality.
• Companies in the U.S. consumer financial sector are subject to the Gramm-Leach-Bliley Act and its Safeguards Rule requiring a comprehensive security program, including physical, technical, and administrative controls.
• Possible areas of liability for tort claims should be identified proactively, so that exposures can be determined and the proper controls and legal defenses can be built in advance.
• Before implementing the risk methodologies, an inventory and valuation of all the organisation’s information assets must be undertaken.
• Once the risks have been assessed and the potential impacts understood, the risks must be prioritised and decisions made on how to respond.
• The NIST controls are divided into the three classes: management, operations, and technical.
• After the controls are implemented, their use in the daily operations of the organisation must be monitored for compliance with the information security/privacy policies and control objectives and ultimately the applicable laws and regulations.
• The increase of available information on technical media contributes to fiscal motivations for the growth of “cybercrime.”
• Cryptography or “secret codes” are a fundamental information security tool.
• Access control deals with authentication and authorisation.
• Passwords are the most often used form of authentication today, but this is primarily because passwords are free and definitely not because they are secure.

Recommended Reading
• Whitman, E. M. & Mattord, J. H., 2011. Principles of Information Security, 4th ed., Cengage Learning.
• Rainer, K. R. & Cegielski, G. C., 2010. Introduction to Information Systems: Enabling and Transforming Business, 3rd ed., John Wiley & Sons.
• Niit, 2008. Introduction to Information Security Risk Management, Prentice-Hall of India Pvt. Ltd.

Self Assessment
1. Which of the following is not a COBIT domain?
a. Plan and Organise
b. Acquire and Implement
c. Deliver and Support
d. Awareness and Training

2. Which of the following is a class of NIST controls?
a. Management
b. Identification
c. Security
d. Audit

3. In 2009, the average cost per incident of a data breach in the United States was over ___________.
a. $8 million
b. $10 million
c. $6 million
d. $16 million

4. Which of the following is a fundamental information security tool?
a. Hash function
b. Cryptography
c. Classic system
d. Spam reduction

5. _________ are the most often used form of authentication today.
a. Passwords
b. Thumb impression
c. Patterns
d. Diagrams

6. ____________ deals with authentication and authorisation.
a. Protocol
b. Cryptography
c. Cryptanalysis
d. Access control

7. Match the following
1. Human resources | A. Controls are implemented for all entrances to secure areas and access.
2. Physical access control | B. Controls are implemented to limit the impact of software viruses and other malware.
3. Environmental control | C. Security roles are defined, and training is conducted regularly.
4. Malware protection | D. Fires, earthquakes, floods, riots, etc., are appropriately addressed.

a. 1-C, 2-A, 3-D, 4-B
b. 1-A, 2-B, 3-C, 4-D
c. 1-C, 2-B, 3-D, 4-A
d. 1-D, 2-C, 3-B, 4-A

8. A seemingly insignificant change to a ________ can completely change its security.
a. protocol
b. cryptography
c. cryptanalysis
d. access control

9. __________ are unintended channels of communication.
a. Inference control
b. Covert channels
c. Multilevel security
d. Firewalls

10. ____________ attempts to limit the information that can unintentionally leak out of a database due to legitimate user queries.
a. Covert channels
b. Multilevel security
c. Inference control
d. Firewalls

Chapter VIII

Crypto Basics

Aim

The aim of this chapter is to:

• introduce the basic elements of cryptography

• explain the taxonomy of cryptography

• define crypto history

Objectives

The objectives of this chapter are to:

• explain ciphers

• classify the taxonomy of cryptography and cryptanalysis

• elucidate frequency counts

Learning outcome

At the end of this chapter, you will be able to:

• distinguish between encryption and decryption

• distinguish between cryptography and cryptanalysis

• understand the ciphers of the election of 1876

8.1 Introduction

Cryptography is the science of using mathematics to encrypt and decrypt data. Cryptography enables us to store sensitive information or transmit it across insecure networks (like the Internet) so that it cannot be read by anyone except the intended recipient.

While cryptography is the science of securing data, cryptanalysis is the science of analysing and breaking secure communication. Classical cryptanalysis involves an interesting combination of analytical reasoning, application of mathematical tools, pattern finding, patience, determination and luck. Cryptanalysts are also called attackers. Cryptology embraces both cryptography and cryptanalysis.

8.1.1 Encryption and Decryption

Data that can be read and understood without any special measures is called plaintext or cleartext. The method of disguising plaintext in such a way as to hide its substance is called encryption. Encrypting plaintext results in unreadable gibberish called ciphertext. We use encryption to ensure that information is hidden from anyone for whom it is not intended, even those who can see the encrypted data. The process of reverting ciphertext to its original plaintext is called decryption. Fig. 8.1 illustrates this process.

[Figure: a plaintext memo ("Memo: Confidential, Re: Fiscal Review ...") is encrypted into unreadable ciphertext, which is then decrypted back into the original plaintext memo]

Fig. 8.1 Encryption and decryption (Source: http://www.mavi1.org/web_security/cryptography/pgp/pgp_pdf_files/IntrotoCrypto.pdf)

[Figure: plaintext and a key enter the encryption box, which outputs ciphertext; the ciphertext and the key enter the decryption box, which outputs the plaintext]

Fig. 8.2 Crypto as a black box (Source: Stamp, M., Information security, A John Wiley & Sons)

8.2 How to Speak Crypto?

The basic terminology of crypto includes the following:

• Cryptology is the art and science of making and breaking “secret codes.”
• Cryptography is the making of “secret codes.”
• Cryptanalysis is the breaking of “secret codes.”
• Crypto is a synonym for any or all of the above (and more). The precise meaning should be clear from context.

A cipher or cryptosystem is used to encrypt data. The original data is known as plaintext and the result of encryption is ciphertext. We decrypt the ciphertext to recover the original plaintext. A key is used to configure a cryptosystem for encryption and decryption. In a symmetric cipher, the same key is used to encrypt and to decrypt, as illustrated in the “black box” cryptosystem in Fig. 8.2.

There is also a concept of public key cryptography where the encryption and decryption keys are different. Since different keys are used, it’s possible to make the encryption key public. In public key crypto, the encryption key is appropriately known as the public key, whereas the decryption key, which must remain secret, is the private key. In symmetric key crypto, the key is known as a symmetric key. We’ll avoid the ambiguous term “secret key.”

With any cipher, the goal is to have a system where the key is necessary in order to recover the plaintext from the ciphertext. That is, even if the attacker, Trudy, has complete knowledge of the algorithms used and lots of other information (to be made more precise later), she can’t recover the plaintext without the key. That’s the goal, although reality sometimes differs significantly.

A fundamental tenet of cryptography is that the inner workings of the cryptosystem are completely known to the attacker, Trudy, and the only secret is the key. This is known as Kerckhoffs Principle, named after its originator, who laid out six principles of cipher design and use. The principle that now bears Kerckhoffs’ name states that a cipher “must not be required to be secret, and it must be able to fall into the hands of the enemy without inconvenience”, that is, the design of the cipher is not secret.

What is the point of Kerckhoffs Principle? After all, life must certainly be more difficult for Trudy if she doesn’t know how a cipher works. While this may be true, it’s also true that the details of cryptosystems seldom remain secret for long. Reverse engineering efforts can easily recover algorithms from software, and algorithms embedded in tamper-resistant hardware are susceptible to similar attacks. Even more to the point, secret crypto-algorithms have a long history of failing to be secure once the algorithm has been exposed to public scrutiny. For these reasons, the cryptographic community will not accept an algorithm as secure until it has withstood extensive analyses by many cryptographers over an extended period of time. The bottom line is that any cryptosystem that does not satisfy Kerckhoffs Principle must be assumed flawed. That is, a cipher is “guilty until proven innocent.”

Kerckhoffs Principle can be extended to cover aspects of security other than cryptography. In other contexts, Kerckhoffs Principle is taken to mean that the security design itself is open. The belief is that “more eyeballs” are more likely to expose security flaws. Although Kerckhoffs Principle (in both forms) is widely accepted in principle, there are many real-world temptations to violate this fundamental tenet, almost invariably with disastrous consequences for security.

8.3 Classic Crypto

We’ll examine four classic cryptosystems, each of which illustrates some particularly relevant feature. First on our agenda is the simple substitution, which is one of the oldest cipher systems, dating back at least 2,000 years, and one that is ideal for illustrating basic attacks. We then turn our attention to a double transposition cipher, which includes important concepts that are used in modern ciphers. We also discuss classic codebooks, since many modern ciphers can be viewed as the “electronic” equivalent of codebooks. Finally, we consider the only practical cryptosystem that is provably secure, the one-time pad.

8.3.1 Simple Substitution Cipher

In a particularly simple implementation of a simple substitution cipher, the message is encrypted by substituting the letter of the alphabet n places ahead of the current letter.

For example, with n = 3, the substitution, which acts as the key, is

plaintext:  a b c d e f g h i j k l m n o p q r s t u v w x y z
ciphertext: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

Here we’ve followed the convention that the plaintext is lowercase and the ciphertext is uppercase. In this example, the key could be stated more succinctly as “3” since the amount of the shift is the key.

Using the key of 3, we can encrypt the plaintext message: “fourscoreandsevenyearsago”

by looking up each letter in the plaintext row and substituting the corresponding letter in the ciphertext row or by simply replacing each letter by the letter that is three positions ahead of it in the alphabet. In this particular example, the resulting ciphertext is:

“IRXUVFRUHDQGVHYHQBHDUVDJR”

It should be clear why this cipher is known as a simple substitution. To decrypt, we simply look up the ciphertext letter in the ciphertext row and replace it with the corresponding letter in the plaintext row, or simply shift each ciphertext letter backward by three. The simple substitution with a shift of three is known as Caesar’s cipher because it was reputedly used with success by Julius Caesar.
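The shift cipher just described can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names shift_encrypt and shift_decrypt are hypothetical, and the message is assumed to contain only the lowercase letters a to z, following the convention that plaintext is lowercase and ciphertext is uppercase.

```python
import string

LOWER = string.ascii_lowercase  # plaintext alphabet (assumed: a-z only)
UPPER = string.ascii_uppercase  # ciphertext alphabet


def shift_encrypt(plaintext: str, n: int) -> str:
    """Replace each plaintext letter by the letter n places ahead of it."""
    return "".join(UPPER[(LOWER.index(ch) + n) % 26] for ch in plaintext)


def shift_decrypt(ciphertext: str, n: int) -> str:
    """Shift each ciphertext letter backward by n places."""
    return "".join(LOWER[(UPPER.index(ch) - n) % 26] for ch in ciphertext)


# Encrypt the example message with the key n = 3, then reverse the process.
ciphertext = shift_encrypt("fourscoreandsevenyearsago", 3)
print(ciphertext)
print(shift_decrypt(ciphertext, 3))
```

The modulo 26 arithmetic makes the alphabet wrap around, so that x, y and z map to A, B and C when n = 3.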

If we limit the simple substitution to shifts, then the possible keys are n ∈ {0, 1, 2, ..., 25}. Suppose Trudy intercepts the ciphertext message:

“CSYEVIXIVQMREXIH”

and she suspects that it was encrypted with a simple substitution cipher of the “shift by n” variety. Then she can try each of the 26 possible keys, decrypting the message with each putative key and checking whether the resulting putative plaintext looks like sensible plaintext. If the message really was encrypted via a shift by n, Trudy can expect to find the true plaintext and thereby recover the key after 13 tries, on average. The brute force approach of trying all possible keys until we stumble across the correct one is known as an exhaustive key search. Since this attack is always an option, it’s necessary (although far from sufficient) that the number of possible keys be too large for Trudy to simply try them all in any reasonable amount of time.
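An exhaustive key search against the shift cipher can be sketched as follows. The helper names shift_decrypt and exhaustive_search are hypothetical, and the sketch leaves the final step, recognising sensible plaintext, to the human reader rather than automating it.

```python
import string

LOWER = string.ascii_lowercase
UPPER = string.ascii_uppercase


def shift_decrypt(ciphertext: str, n: int) -> str:
    """Shift each ciphertext letter backward by n places."""
    return "".join(LOWER[(UPPER.index(ch) - n) % 26] for ch in ciphertext)


def exhaustive_search(ciphertext: str) -> dict:
    """Return the putative plaintext for each of the 26 possible keys."""
    return {n: shift_decrypt(ciphertext, n) for n in range(26)}


# Print every candidate so that the sensible one can be picked out by eye.
for key, candidate in exhaustive_search("CSYEVIXIVQMREXIH").items():
    print(f"{key:2d}  {candidate}")
```

Of the 26 candidates printed for Trudy's intercepted message, only the one for n = 4 reads as English.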

How large of a key space is large enough? Suppose Trudy has an incredibly fast computer that’s able to test $2^{40}$ keys each second. Then a key space of size $2^{56}$ can be exhausted in $2^{16}$ seconds or about 18 hours, whereas a key space of size $2^{64}$ would take more than half a year to exhaust.

The simple substitution cipher need not be limited to shifting by n. Any permutation of the 26 letters will suffice as a key. For example, the following key, which is not a shift of the alphabet, defines a simple substitution cipher:

plaintext:  a b c d e f g h i j k l m n o p q r s t u v w x y z
ciphertext: Z P B Y J R G K F L X Q N W V D H M S U T O I A E C

If a simple substitution cipher can employ any permutation as a key, then there are $26! \approx 2^{88}$ possible keys. With our superfast computer that tests $2^{40}$ keys per second, a key space of size $2^{88}$ would take more than 8900 millennia to exhaust. Of course, we’d expect to find the correct key in half that time, or “just” 4450 millennia! Since $2^{88}$ keys is far more than Trudy can try in any reasonable amount of time, this cipher passes our first requirement, namely, that the key space is big enough to make an exhaustive key search infeasible. Does this mean that a simple substitution cipher is secure? The answer is no, as the attack described in the next section illustrates.
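The key space estimates above are easy to check numerically. The sketch below assumes the same testing rate of 2 to the power 40 keys per second; the constant and function names are hypothetical, and a year is taken as 365 days.

```python
import math

KEYS_PER_SECOND = 2 ** 40           # Trudy's assumed testing rate
SECONDS_PER_YEAR = 365 * 24 * 3600  # ignoring leap years


def seconds_to_exhaust(key_bits: int) -> float:
    """Seconds needed to try every key in a key space of size 2**key_bits."""
    return 2 ** key_bits / KEYS_PER_SECOND


# Size of the key space when any permutation of 26 letters may be the key.
permutation_bits = math.log2(math.factorial(26))

print(f"2^56 keys: {seconds_to_exhaust(56) / 3600:.1f} hours")
print(f"2^64 keys: {seconds_to_exhaust(64) / SECONDS_PER_YEAR:.2f} years")
print(f"26! is about 2^{permutation_bits:.1f}")
print(f"2^88 keys: {seconds_to_exhaust(88) / SECONDS_PER_YEAR / 1000:.0f} millennia")
```

On average the correct key turns up after half the key space has been searched, so the expected times are half of the figures printed.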

8.3.2 Cryptanalysis of a Simple Substitution

Suppose Trudy intercepts the following ciphertext, which she suspects was produced by a simple substitution cipher, though not necessarily a shift by n.

PBFPVYFBQXZTYFPBFEQJHDXXQVAPTPQJKTOYQWIPBVWLXTOXBTFXQWAXBVCXQWAXFQJVWLEQNTOZQGGQLFXQWAKVWLXQWAEBIPBFXFQVXGTVJVWLBTPQWAEBPBFHCVLXBQUFEVWLXGDPEQVPQGVPPBFTIXPFHXZHVFAGFOTHFEFBQUFTDHZBQPOTHXTYFTODXQHFTDPTOGHFQPBQWAQJJTODXQHFOQPWTBDHHIXQVAPBFZQHCFWPFHPBFIPBQWKFABVYYDZBOTHPBQ

PQJTQOTOGHFQAPBFEQJHDXXQVAVXEBQPEFZBVFOJIWFFACFCCFHQWAUVWFLQHGFXVAFXQHFUFHILTTAVWAFFAWTEVOITDHFHFQAITIXPFHXAFQHEFZQWGFLVWPTOFFA (8.1)

Since it’s too much work for Trudy to try all $2^{88}$ possible keys, can she be cleverer? Assuming the underlying message is English, Trudy can make use of the English letter frequency counts in Fig. 8.3 together with the frequency counts for the ciphertext 8.1, which appear in Fig. 8.4.

From the ciphertext frequency counts, Trudy can see that “F” is the most common letter in the ciphertext message, whereas, according to Fig. 8.3, “E” is the most common letter in the English language. Trudy therefore surmises that it’s likely that “F” has been substituted for “E.” Continuing in this manner, Trudy can try likely substitutions until she recognises words, at which point she can be confident in her assumptions.

Initially, the easiest word to determine might be the first word, since Trudy doesn’t know where the spaces belong in the text. Since the third letter is “e,” and given the high frequency counts of the first two letters, Trudy might reasonably guess (correctly, as it turns out) that the first word of the plaintext is “the.” Making these substitutions into the remaining ciphertext, she will be able to guess more letters and the puzzle will quickly unravel. Trudy will likely make some missteps along the way, but with sensible use of the statistical information available, she will find the plaintext in far less than 4450 millennia!
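The first steps of Trudy's attack can be sketched in Python. The function names letter_counts and apply_partial_key are hypothetical, and the sample below is only the opening stretch of ciphertext 8.1, so its counts are not the full counts plotted for the whole message.

```python
from collections import Counter


def letter_counts(ciphertext: str) -> Counter:
    """Count how often each letter occurs in the ciphertext."""
    return Counter(ch for ch in ciphertext if ch.isalpha())


def apply_partial_key(ciphertext: str, guesses: dict) -> str:
    """Substitute the guessed letters and show '.' for letters not yet guessed."""
    return "".join(guesses.get(ch, ".") for ch in ciphertext)


# Opening letters of ciphertext 8.1, used here only as a short sample.
sample = "PBFPVYFBQXZTYFPBFEQJHDXXQVAPTPQJKTOYQWIPBVWLXTOXBTFXQWA"

# Step 1: list the most frequent ciphertext letters.
for letter, count in letter_counts(sample).most_common(5):
    print(letter, count)

# Step 2: apply Trudy's guess that the first word "PBF" is "the".
print(apply_partial_key(sample, {"P": "t", "B": "h", "F": "e"}))
```

Each new guess is added to the dictionary and the partially decrypted text is inspected again; a guess that produces implausible letter combinations is discarded.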

[Figure: bar chart of the relative frequency (0.00 to 0.14) of each letter A to Z in English text]

Fig. 8.3 English letter frequency counts (Source: Stamp, M., Information security, A John Wiley & Sons)

This attack on the simple substitution cipher shows that a large key space is not sufficient to ensure security. This attack also shows that cipher designers must guard against clever attacks. But how can we protect against all such attacks, since clever new attacks are developed all the time? The answer is that we can’t. As a result, a cipher can only be considered secure as long as no attack against it has yet been found. And the more skilled cryptographers who have tried to break a cipher and failed, the more confidence we can have in the system.

8.3.3 Definition of Secure

There are several reasonable definitions of a secure cipher. Ideally, we would like to have mathematical proof that there is no feasible attack on the system. However, there is only one cipher system that comes with such a proof and it’s impractical for most uses. Lacking a proof of the strength of a cipher, we could require that the best-known attack on the system is impractical. While this would seem to be the most desirable property, we’ll choose a slightly different definition. We’ll say that a cryptosystem is secure if the best-known attack requires as much work as an exhaustive key search, that is, there is no short-cut attack. By this definition, a secure cryptosystem with a small number of keys could be easier to break than an insecure cryptosystem with a large number of keys.

[Figure: bar chart of the number of occurrences (0 to 60) of each letter A to Z in ciphertext 8.1]

Fig. 8.4 Ciphertext frequency counts (Source: Stamp, M., Information security, A John Wiley & Sons)

The rationale for our definition is that, if a shortcut attack is known, the algorithm fails to provide its “advertised” level of security, as indicated by the key length. Such a shortcut attack indicates that the cipher has a design flaw. In practice, we must select a cipher that is secure (in the sense of our definition) and has a large enough key space so that an exhaustive key search is impractical. Both factors are necessary.

8.3.4 Double Transposition Cipher

To encrypt with a double transposition cipher, we first write the plaintext into an array of a given size and then permute the rows and columns according to specified permutations. For example, suppose we write the plaintext attackatdawn into a 3 × 4 array

    a t t a
    c k a t
    d a w n

Now if we transpose (or permute) the rows according to (1, 2, 3) → (3, 2, 1) and then transpose the columns according to (1, 2, 3, 4) → (4, 2, 1, 3), we obtain

    a t t a        d a w n        n a d w
    c k a t   →    c k a t   →    t k c a
    d a w n        a t t a        a t a t

The ciphertext is then read from the final array:

NADWTKCAATAT (8.2)

For the double transposition, the key consists of the size of the matrix and the row and column permutations. The recipient who knows the key can simply put the ciphertext into the appropriate sized matrix and undo the permutations to recover the plaintext. For example, to decrypt ciphertext 8.2, the ciphertext is first put into a 3 × 4 array. Then the columns are numbered as (4, 2, 1, 3) and rearranged to (1, 2, 3, 4). Then the rows are numbered (3, 2, 1) and rearranged into (1, 2, 3), as illustrated below, and we have recovered the plaintext, attackatdawn.

    N A D W        D A W N        A T T A
    T K C A   →    C K A T   →    C K A T
    A T A T        A T T A        D A W N
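The double transposition can be sketched in Python as follows. The function names encrypt and decrypt are hypothetical, and the permutations are written with 0-based indices, so the row permutation (3, 2, 1) of the text becomes (2, 1, 0) and the column permutation (4, 2, 1, 3) becomes (3, 1, 0, 2). The sketch assumes that the message fills the array exactly.

```python
def encrypt(plaintext, rows, cols, row_perm, col_perm):
    """row_perm[i] is the original (0-based) row that moves to position i;
    col_perm works the same way for columns."""
    letters = [ch for ch in plaintext if ch.isalpha()]
    if len(letters) != rows * cols:
        raise ValueError("message must fill the array exactly")
    # Write the plaintext into the array row by row.
    grid = [letters[r * cols:(r + 1) * cols] for r in range(rows)]
    # Permute the rows, then the columns.
    grid = [grid[r] for r in row_perm]
    grid = [[row[c] for c in col_perm] for row in grid]
    return "".join(ch for row in grid for ch in row).upper()


def decrypt(ciphertext, rows, cols, row_perm, col_perm):
    grid = [list(ciphertext[r * cols:(r + 1) * cols]) for r in range(rows)]
    # Undo the columns: the letter now in position i belongs in column col_perm[i].
    unmixed = []
    for row in grid:
        restored = [""] * cols
        for i, c in enumerate(col_perm):
            restored[c] = row[i]
        unmixed.append(restored)
    # Undo the rows in the same way.
    original = [None] * rows
    for i, r in enumerate(row_perm):
        original[r] = unmixed[i]
    return "".join(ch for row in original for ch in row).lower()


ciphertext = encrypt("attack at dawn", 3, 4, (2, 1, 0), (3, 1, 0, 2))
print(ciphertext)
print(decrypt(ciphertext, 3, 4, (2, 1, 0), (3, 1, 0, 2)))
```

Note that decryption undoes the column permutation first and the row permutation second, the reverse of the order used for encryption.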

Unlike a simple substitution, the double transposition does nothing to disguise the letters that appear in the message. But it does appear to thwart an attack that relies on the statistical information contained in the plaintext, since the plaintext statistics are dispersed throughout the ciphertext.

letter   e    h    i    k    l    r    s    t
binary  000  001  010  011  100  101  110  111

Table 8.1 Abbreviated alphabet

The double transposition is not a trivial cipher to break. The idea of “smearing” plaintext information through the ciphertext is so useful that it is employed by modern block ciphers.

8.3.5 One-Time Pad

The Vernam cipher, or one-time pad, is a provably secure cryptosystem. Historically it has been used at various times, but it’s not very practical for most situations. However, it does nicely illustrate some important concepts that we’ll see again later. For simplicity, let’s consider an alphabet with only eight letters. Our alphabet and the corresponding binary representation of letters are given in table 8.1. It is important to note that the mapping between letters and bits is not secret. This mapping serves a similar purpose as the ASCII code, which is not secret either.

Suppose a spy named Alice wants to encrypt the plaintext message “heilhitler” using a one-time pad. She first consults table 8.1 to convert the letters to the bit string

001 000 010 100 001 010 111 100 000 101

The one-time pad requires a key consisting of a randomly selected string of bits that is the same length as the message. The key is then XORed with the plaintext to yield the ciphertext. A fancier way to say this is that we add the plaintext and key bits modulo 2. We denote the XOR of bit x with bit y as x ⊕ y. Since x ⊕ y ⊕ y = x, decryption is accomplished by XORing the same key with the ciphertext.

Suppose the spy Alice has the key

111 101 110 101 111 100 000 101 110 000

which is of the proper length to encrypt the message above. Then to encrypt, Alice computes

             h   e   i   l   h   i   t   l   e   r
plaintext:  001 000 010 100 001 010 111 100 000 101
key:        111 101 110 101 111 100 000 101 110 000
            ---------------------------------------
ciphertext: 110 101 100 001 110 110 111 001 110 101
             s   r   l   h   s   s   t   h   s   r

Converting the ciphertext bits back into letters, the ciphertext message to be transmitted is: “srlhssthsr”.

When fellow spy Bob receives Alice’s message, he decrypts it using the same key and thereby recovers the original plaintext.

             s   r   l   h   s   s   t   h   s   r
ciphertext: 110 101 100 001 110 110 111 001 110 101
key:        111 101 110 101 111 100 000 101 110 000
            ---------------------------------------
plaintext:  001 000 010 100 001 010 111 100 000 101
             h   e   i   l   h   i   t   l   e   r
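The encryption and decryption above can be reproduced with a short Python sketch. The names ALPHABET, to_values and otp are hypothetical; the eight-letter alphabet and the key are taken from the example, with each letter stored as the integer value of its 3-bit code.

```python
# Table 8.1: position 0..7 in this string corresponds to the codes 000..111.
ALPHABET = "ehiklrst"


def to_values(text: str) -> list:
    """Convert each letter to the integer value of its 3-bit code."""
    return [ALPHABET.index(ch) for ch in text]


def otp(text: str, key: list) -> str:
    """XOR each 3-bit letter with the matching 3-bit group of the pad."""
    if len(key) != len(text):
        raise ValueError("the pad must be exactly as long as the message")
    return "".join(ALPHABET[v ^ k] for v, k in zip(to_values(text), key))


# Alice's key from the example, one 3-bit group per letter.
key = [0b111, 0b101, 0b110, 0b101, 0b111, 0b100, 0b000, 0b101, 0b110, 0b000]

ciphertext = otp("heilhitler", key)
print(ciphertext)
print(otp(ciphertext, key))  # XORing with the same key again decrypts
```

Because XORing twice with the same key bits cancels out, a single function serves for both encryption and decryption.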

Let’s consider a couple of scenarios. First, suppose that Alice has an enemy, Charlie, within her spy organisation. Charlie claims that the actual key used to encrypt Alice’s message is

101 111 000 101 111 100 000 101 110 000

When Bob decrypts the ciphertext using this key, he finds

              s   r   l   h   s   s   t   h   s   r
ciphertext:  110 101 100 001 110 110 111 001 110 101
“key”:       101 111 000 101 111 100 000 101 110 000
             ---------------------------------------
“plaintext”: 011 010 100 100 001 010 111 100 000 101
              k   i   l   l   h   i   t   l   e   r

Bob, who doesn’t really understand crypto, orders that Alice be brought in for questioning.

Now let’s consider a different scenario. Suppose that Alice is captured by her enemies, who have also intercepted the ciphertext. The captors are eager to read the message and Alice is encouraged to provide the key for this super-secret message. Alice claims that she is actually a double-agent and to prove it she claims that the key is

111 101 000 011 101 110 001 011 101 101

Decrypting the intercepted ciphertext with this “key” yields the harmless “plaintext” helikesike, and nothing in the ciphertext allows the captors to tell this key from the real one.
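Both scenarios rest on the same observation: for any ciphertext and any claimed plaintext of the same length, XORing the two gives a “key” that connects them. The sketch below illustrates this with the hypothetical helpers key_for and show, using the eight-letter alphabet of table 8.1.

```python
# Table 8.1: position 0..7 in this string corresponds to the codes 000..111.
ALPHABET = "ehiklrst"


def key_for(ciphertext: str, claimed_plaintext: str) -> list:
    """Return the pad that 'decrypts' ciphertext to claimed_plaintext."""
    return [ALPHABET.index(c) ^ ALPHABET.index(p)
            for c, p in zip(ciphertext, claimed_plaintext)]


def show(key: list) -> str:
    """Format a key as space-separated 3-bit groups."""
    return " ".join(format(k, "03b") for k in key)


# Charlie's forged key and the key Alice offers her captors.
print(show(key_for("srlhssthsr", "killhitler")))
print(show(key_for("srlhssthsr", "helikesike")))
```

The first line printed is the key that Charlie claims was used, and the second is the key that Alice gives to her captors, under which the ciphertext reads “helikesike”.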

These examples indicate why the one-time pad is provably secure. If the key is chosen at random, then an attacker who sees the ciphertext has no information about the message other than its length. That is, given the ciphertext, any “plaintext” of the same length can be generated by a suitable choice of “key,” and all possible plaintexts are equally likely. And since we could pad the message with any number of random letters before encryption, the length is of no use either. So the ciphertext provides no information at all about the plaintext. This is the sense in which the one-time pad is provably secure. Of course, this assumes that the cipher is used correctly. The pad, or key, must be chosen at random, used only once, and must be known only by the sender and receiver.

We can’t do better than a provably secure cipher, so perhaps we should always use the one-time pad. However, there is one serious drawback to the one-time pad: the pad is the same length as the message, and the pad, which is the key, must be securely transmitted to the recipient before the ciphertext can be decrypted. If we can securely transmit the pad, why not simply transmit the plaintext by the same means and do away with the encryption? Below, we’ll see an historical example where it actually did make sense to use a one-time pad, in spite of this serious limitation. However, for modern high data-rate systems, a one-time pad cipher is totally impractical.

Why is it that the one-time pad can only be used once? Suppose we have two plaintext messages $P_1$ and $P_2$, encrypted as $C_1 = P_1 \oplus K$ and $C_2 = P_2 \oplus K$; that is, we have two messages encrypted with the same “one-time” pad $K$. In the cryptanalysis business, this is known as a depth. In the case of a one-time pad in depth,

$$C_1 \oplus C_2 = P_1 \oplus K \oplus P_2 \oplus K = P_1 \oplus P_2$$

and the key has disappeared from the problem. This cannot be good for anyone except for Trudy, the cryptanalyst.
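The cancellation of the key can be checked with a small Python sketch. The pad and the two four-letter messages below are hypothetical, chosen only because they can be spelled with the eight-letter alphabet of table 8.1; the helper names encode and xor are likewise assumptions.

```python
# Table 8.1: position 0..7 in this string corresponds to the codes 000..111.
ALPHABET = "ehiklrst"


def encode(text: str) -> list:
    """Convert each letter to the integer value of its 3-bit code."""
    return [ALPHABET.index(ch) for ch in text]


def xor(a: list, b: list) -> list:
    """XOR two equal-length sequences of 3-bit values."""
    return [x ^ y for x, y in zip(a, b)]


pad = [0b111, 0b101, 0b110, 0b101]  # hypothetical pad, wrongly used twice
p1 = encode("kite")                 # hypothetical first message
p2 = encode("like")                 # hypothetical second message

c1 = xor(p1, pad)
c2 = xor(p2, pad)

# The pad drops out: XORing the ciphertexts equals XORing the plaintexts.
print(xor(c1, c2) == xor(p1, p2))
```

Trudy never sees the pad, yet she obtains the XOR of the two plaintexts, and a zero in that result tells her exactly where the two messages contain the same letter.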

8.3.6 Project VENONA

The VENONA project is an interesting example of a real-world use of a one-time pad. In the 1930s and 1940s, Soviet spies entering the United States brought one-time pad keys with them. The spies used these keys to encrypt important messages, which were then sent back to Moscow. These messages dealt with the most sensitive spy operations of the time. In particular, the secret development of the first atomic bomb was a focus of much of the spying. The Rosenbergs, Alger Hiss and many other identified spies, and many never identified spies, figure prominently in VENONA.

[C% Ruth] learned that her husband [v] was called up
by the army but he was not sent to the front. He is a
mechanical engineer and is now working at the ENORMOUS
[ENORMOZ] [vi] plant in SANTA FE, New Mexico.
[45 groups unrecoverable]
detain VOLOK [vii] who is working in a plant on ENORMOUS.
He is a FELLOWCOUNTRYMAN [ZEMLYaK] [viii]. Yesterday he
learned that they had dismissed him from his work. His
active work in progressive organizations in the past was
cause of his dismissal.
In the FELLOWCOUNTRYMAN line LIBERAL is in touch with
CHESTER [ix]. They meet once a month for the payment of
dues. CHESTER is interested in whether we are satisfied
with the collaboration and whether there are not any
misunderstandings. He does not inquire about specific
items of work [KONKRETNAYa RABOTA]. In as much as CHESTER
knows about the role of LIBERAL’s group we beg consent to
ask C through LIBERAL about leads from among people who
are working on ENOURMOUS and in other technical fields.

Table 8.2 VENONA Decrypt of message of September 21, 1944

The Soviet spies were well trained and never reused the key, yet many of the intercepted ciphertext messages were eventually decrypted by American cryptanalysts. How can that be, given that the one-time pad is provably secure? In fact, there was a flaw in the method used to generate the pads, so that there were repeats. As a result, many messages were in depth, which enabled the cryptanalysis of these messages. Part of a VENONA decrypt is given in table 8.2. This message refers to David Greenglass and his wife Ruth. LIBERAL is Julius Rosenberg who, along with his wife Ethel, was eventually executed for his role in nuclear espionage. The Soviet codename for the atomic bomb was, appropriately, ENORMOUS. The VENONA decrypts make for interesting reading.

8.3.7 Codebook Cipher

A classic codebook cipher is, literally, a dictionary-like book containing words and their corresponding codewords. Table 8.3 contains an excerpt from a famous codebook used by Germany during World War I.

For example, to encrypt the German word Februar, the entire word was replaced with the 5-digit “codeword” 13605. The codebook in table 8.3 was used for encryption, while a corresponding codebook, arranged with the 5-digit codewords in numerical order, was used for decryption. A codebook is a substitution cipher, but the substitutions are far from simple, since substitutions are for entire words or even phrases. The codebook illustrated in table 8.3 was used to encrypt the famous Zimmermann telegram. In 1917, German Foreign Minister Arthur Zimmermann sent an encrypted telegram to the German ambassador in Mexico City.

Plaintext        Ciphertext
Februar          13605
fest             13732
finanzielle      13850
folgender        13918
Frieden          17142
Friedenschluss   17149

Table 8.3 Excerpt from a German codebook
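A codebook maps naturally onto a pair of lookup tables, one for each direction. The sketch below uses only the six entries of table 8.3; the names ENCRYPT_BOOK, DECRYPT_BOOK, encode and decode are hypothetical, and a real codebook would of course hold many thousands of entries.

```python
# Excerpt from table 8.3: plaintext word -> 5-digit codeword.
ENCRYPT_BOOK = {
    "Februar": "13605",
    "fest": "13732",
    "finanzielle": "13850",
    "folgender": "13918",
    "Frieden": "17142",
    "Friedenschluss": "17149",
}

# The decryption codebook is the same table, looked up by codeword.
DECRYPT_BOOK = {code: word for word, code in ENCRYPT_BOOK.items()}


def encode(words: list) -> list:
    """Replace each plaintext word with its codeword."""
    return [ENCRYPT_BOOK[word] for word in words]


def decode(codes: list) -> list:
    """Replace each codeword with its plaintext word."""
    return [DECRYPT_BOOK[code] for code in codes]


codes = encode(["Frieden", "Februar"])
print(codes)
print(decode(codes))
```

A word that is missing from the codebook raises a KeyError in this sketch, which mirrors the practical limitation that a codebook can only encrypt what its compilers anticipated.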

The ciphertext message, as shown in Fig. 8.5, was intercepted by the British. At the time, the British and French were at war with Germany and its allies, but the United States was neutral. The Russians had recovered a damaged version of the German codebook, and the partial codebook had been passed on to the British. Through painstaking analyses, the British were able to recover enough of the codebook to decrypt the Zimmermann telegram. The telegram stated that the German government was planning to begin “unrestricted submarine warfare” and had concluded that this would likely lead to war with the United States.

Fig. 8.5 The Zimmermann telegram (Source: Stamp, M., Information security, A John Wiley & Sons)

As a result, Zimmermann had decided that Germany should try to recruit Mexico as an ally to fight against the United States. The incentive for Mexico was that it would “reconquer the lost territory in Texas, New Mexico, and Arizona.” When the decrypted Zimmermann telegram was released in the United States, public opinion turned against Germany and, after the sinking of the passenger liner Lusitania, the United States declared war on Germany.

The British were initially hesitant to release the Zimmermann telegram since they feared that the Germans would realise that their cipher was broken and, presumably, stop using it. However, in sifting through other cabled messages that had been sent at about the same time as the Zimmermann telegram, British analysts found that a variant of the telegram had been sent unencrypted. The version of the Zimmermann telegram that the British subsequently released closely matched the unencrypted version of the telegram. The Germans concluded that their codebook had not been compromised and continued to use it for sensitive messages throughout the war.

Modern block ciphers use complex algorithms to generate ciphertext from plaintext (and vice versa) but at a higher level, a block cipher can be viewed as a codebook, where each key determines a distinct codebook.

8.3.8 Ciphers of the Election of 1876

The U.S. presidential election of 1876 was a virtual dead heat. At the time, the Civil War was still fresh in people’s minds, “radical” Reconstruction was ongoing in the former Confederacy, and the nation was, in many ways, still bitterly divided. The contestants in the election were Republican Rutherford B. Hayes and Democrat Samuel J. Tilden. Tilden had obtained a slight plurality of the popular vote, but it is the Electoral College that determines the presidency. In the Electoral College, each state sends a delegation and the entire delegation is supposed to vote for the candidate who received the largest number of votes in that particular state (though there is no legal requirement for a delegate to vote for a particular candidate and on rare occasion a delegate will vote for another candidate).

In 1876, the Electoral College delegations of four states were in dispute, and these held the balance. A commission of 15 members was appointed to determine which state delegations were legitimate and thus determine the presidency. The commission decided that all four states should go to Hayes and he became president of the United States. Tilden’s supporters immediately charged that Hayes’ people had bribed officials to turn the vote in his favour, but no evidence was forthcoming.

Some months after the election, reporters discovered a large number of encrypted messages that had been sent from Tilden’s supporters to officials in the disputed states. One of the ciphers used was a partial codebook together with a transposition on the words. The codebook was only applied to “important” words and the transposition was a fixed permutation for a message of a given length. The allowed message lengths were 10, 15, 20, 25, and 30 words, with all messages padded to one of these lengths.

The cryptanalysis of this weak cipher was relatively easy to accomplish. Since a permutation of a given length was used repeatedly, many messages of a particular length were in depth with respect to the permutation as well as the codebook. A cryptanalyst could therefore compare all messages of the same length, making it relatively easy to discover the fixed permutation, even without knowledge of the partial codebook. The analyst had to be clever enough to consider the possibility that all messages of a given length were using the same permutation, but, with this insight, the permutations were easily recovered. The codebook was then deduced from context and also with the aid of some unencrypted messages that provided clues as to the substance of the ciphertext messages.

And what did these decrypted messages reveal? The reporters were amused to discover that Tilden’s supporters had tried to bribe officials in the disputed states. The irony was that Tilden’s people were guilty of precisely what they had accused Hayes’ people of doing! By any measure, this cipher was poorly designed and weak. One lesson here is that the reuse (or overuse) of keys can be an exploitable flaw. In this case, each time a permutation was reused, it gave the cryptanalyst more information that could be collated to recover the permutation. In modern cipher systems, we try to limit the use of a single key so that we do not allow a cryptanalyst to accumulate too much information about a particular key and to limit the damage if a key is discovered.

8.4 Modern Crypto HistoryThroughout the 20th century, cryptography played an important role in major world events. Late in the 20th century, cryptography became a critical technology for commercial and business communications as well. The Zimmermann telegramisoneofthefirstexamplesfromthelastcenturyoftherolethatcryptanalysishashadinpoliticalandmilitary affairs. In this section, we mention a few other historical highlights from the past century.

In1929,SecretaryofStateHenryL.StimsonendedtheU.S.government’sofficialcryptanalyticactivity,justifyinghis actions with the immortal line, “Gentlemen do not read each other’s mail”. This would prove to be a costly mistake in the run up to the Japanese attack on Pearl Harbor. Shortly after the attack of December 7, 1941, the United States restarted its cryptanalytic program in earnest. The successes of allied cryptanalysts during the WorldWar II era were remarkable, and this period is often seen as the “golden age” of cryptanalysis. Virtually all significantaxiscryptosystemswerebrokenandthevalueoftheintelligenceobtainedfromthesesystemsisdifficultto overestimate.

In the Pacific theatre, the so-called Purple cipher was used for high level Japanese government communication. This cipher was broken by American cryptanalysts before the attack on Pearl Harbor, but the intelligence gained (code named Magic) provided no clear indication of the impending attack. The Japanese Imperial Navy used a cipher known as JN-25, which was also broken by the Americans. The intelligence from JN-25 was almost certainly decisive in the extended battle of Coral Sea and Midway, where an inferior American force was able to halt the advance of the Japanese in the Pacific for the first time. The Japanese Navy was never able to recover from the losses inflicted during this battle.

In Europe, the breaking of the Enigma cipher (code named ULTRA) was also a crucial aid to the Allies during the war. It is often claimed that the ULTRA intelligence was so valuable that in November of 1940, Churchill decided not to inform the British city of Coventry of an impending attack by the German Luftwaffe, since the primary source of information on the attack came from Enigma decrypts. Churchill was supposedly concerned that a warning might tip off the Germans that their cipher had been broken.

The Enigma was initially broken by the Poles. After the fall of Poland, the Polish cryptanalysts escaped to France. Shortly thereafter, France fell to the Nazis and the Polish cryptanalysts escaped to England, where they provided their knowledge to British cryptanalysts. Remarkably, the Polish cryptanalysts were not allowed to continue their work on the Enigma. However, the British team, which included the computing pioneer Alan Turing, developed an improved attack.

In the post World War II era, cryptography finally moved from a “black art” into the realm of science. The publication of Claude Shannon’s seminal 1949 paper Communication Theory of Secrecy Systems marks the turning point. Shannon’s paper proved that the one-time pad is secure and also offered two fundamental cipher design principles:

• confusion
• diffusion

Confusion is designed to obscure the relationship between the plaintext and ciphertext, while diffusion is supposed to spread the plaintext statistics through the ciphertext. A simple substitution cipher and a one-time pad employ only confusion, whereas a double transposition is a diffusion-only cipher. Since the one-time pad is provably secure, evidently confusion alone is “enough,” while, apparently, diffusion alone is not.
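The distinction can be seen on a small example. The sketch below is only an illustration, not Shannon's formal definition: it uses a Caesar shift as a stand-in for a simple substitution and a single columnar read-out as a stand-in for a transposition, and the function names are our own. The substitution changes every symbol but leaves it in place, while the transposition keeps every symbol and only moves it, so the letter counts of the plaintext survive unchanged.

```python
from collections import Counter


def caesar_substitute(text: str, shift: int) -> str:
    """Simple substitution: every letter is replaced, positions are kept."""
    return "".join(chr((ord(c) - 65 + shift) % 26 + 65) for c in text)


def columnar_transpose(text: str, columns: int) -> str:
    """Transposition: letters are kept, only their positions change."""
    return "".join(text[i::columns] for i in range(columns))


plaintext = "ATTACKATDAWN"
substituted = caesar_substitute(plaintext, 3)
transposed = columnar_transpose(plaintext, 4)

print(substituted)  # DWWDFNDWGDZQ
print(transposed)   # ACDTKATAWATN
# The transposition preserves the exact letter counts of the plaintext.
print(Counter(transposed) == Counter(plaintext))  # True
```

Because the transposed text contains exactly the plaintext letters, an analyst can recognise a transposition from letter frequencies alone.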

These two concepts, confusion and diffusion, are still the guiding principles in cipher design today. In subsequent chapters, it will become clear how crucial these concepts are to modern block cipher design. Until recently, cryptography remained primarily the domain of governments. That changed dramatically in the 1970s, primarily due to the computer revolution, which led to the need to protect large amounts of electronic data. By the mid-1970s, even the U.S. government realised that there was a legitimate commercial need for secure cryptography and it was clear that the commercial products of the day were lacking.

The National Bureau of Standards, or NBS, issued a request for cryptographic algorithms. The ultimate result of this process was a cipher known as the Data Encryption Standard or DES, which became an official U.S. government standard. It’s impossible to overemphasise the role that DES has played in the modern history of cryptography. After DES, academic interest in cryptography grew rapidly. Public key cryptography was discovered (or, more precisely, rediscovered) shortly after the arrival of DES. By the 1980s, there were annual CRYPTO conferences, which have consistently displayed high-quality work in the field. In the 1990s, the Clipper Chip and the development of a replacement for the aging DES were two of the many crypto highlights.

8.5 A Taxonomy of Cryptography
In public key cryptography, the encryption keys can be made public. If, for example, you post your public key on the Internet, anyone with an Internet connection can encrypt a message for you, without any prior arrangement regarding the key. This is in stark contrast to a symmetric cipher, where the participants must agree on a key in advance. Prior to the adoption of public key crypto, secure delivery of symmetric keys was the Achilles heel of modern cryptography. A spectacular case of a failed symmetric key distribution system can be seen in the exploits of the Walker family spy ring. The Walker family sold cryptographic keys used by the U.S. military to the Soviet Union for nearly two decades before being discovered.

Public key cryptography has another somewhat surprising and extremely useful feature, for which there is no parallel in the symmetric key world. Suppose a message is “encrypted” with the private key instead of the public key. Since the public key is public, anyone can decrypt this message. At first glance such encryption might seem pointless. However, it can be used as a digital form of a handwritten signature: anyone can read the signature, but only the signer could have created the signature.
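As a rough illustration of this idea, the sketch below uses textbook RSA as the public key system. That choice is an assumption made for the example, since no particular algorithm has been introduced here. The modulus is built from two small, well-known Mersenne primes, and the names `sign` and `read_signature` are hypothetical. Such a toy offers no real security and omits the padding and hashing that practical signature schemes require.

```python
# Toy key pair (assumed values): both factors are Mersenne primes.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)


def sign(message: bytes) -> int:
    """'Encrypt' the message with the private key."""
    m = int.from_bytes(message, "big")
    if m >= N:
        raise ValueError("message too long for this toy modulus")
    return pow(m, D, N)


def read_signature(signature: int) -> bytes:
    """Anyone can 'decrypt' with the public key and read the message."""
    m = pow(signature, E, N)
    return m.to_bytes((m.bit_length() + 7) // 8, "big")


signature = sign(b"pay Bob 5")
recovered = read_signature(signature)
print(recovered)  # b'pay Bob 5'
```

Only the holder of `D` can produce a value that the public key maps back to the intended message, which is what lets the operation serve as a signature.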

Anything we can do with a symmetric cipher we can also accomplish with a public key cryptosystem. Public key crypto also enables us to do things that cannot be accomplished with a symmetric cipher. So why not use public key crypto for everything? The primary reason is speed. Symmetric key crypto is orders of magnitude faster than public key crypto. As a result, symmetric key crypto is used to encrypt the vast majority of data today. Yet public key crypto has a critical role to play in modern information security.

Each of the classic ciphers discussed above is a symmetric cipher. Modern symmetric ciphers can be subdivided into stream ciphers and block ciphers. Stream ciphers generalise the one-time pad approach, sacrificing provable security for a key that is of a reasonable length. A block cipher is, in a sense, the generalisation of a codebook. In a block cipher, the key determines the codebook and as long as the key remains fixed, the same codebook is used. Conversely, when the key changes, a different codebook is selected.
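The stream cipher idea can be sketched in a few lines. In the example below, a short key is stretched into a long keystream, which is then XORed with the data exactly as a one-time pad would be. The keystream generator (SHA-256 applied to the key and a counter) is our own assumption, chosen only because it is available in the standard library; it is not a cipher described in this chapter and should not be used to protect real data.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Stretch a short key into `length` pseudo-random bytes (toy generator)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_stream(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


key = b"short key"
plaintext = b"a message that is much longer than the key itself"
ciphertext = xor_stream(key, plaintext)
decrypted = xor_stream(key, ciphertext)
print(decrypted == plaintext)  # True
```

Unlike the one-time pad, the key here is far shorter than the message, which is precisely why the provable security is lost.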

While stream ciphers dominated in the post-World War II era, today block ciphers are the kings of symmetric key crypto with a few notable exceptions. Generally speaking, block ciphers are easier to optimise for software implementations, while stream ciphers are usually most efficient in hardware.

The third major crypto category we’ll consider is hash functions. These functions take an input of any size and produce an output of a fixed size that satisfies some very special properties. For example, if the input changes in one or more bits, the output should change in about half of its bits. For another, it must be infeasible to find any two inputs that produce the same output. It may not be obvious that such a function is useful or that such functions actually exist, but we’ll see that they do exist and that they turn out to be extremely useful for a surprisingly wide array of problems.
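The first property is easy to observe experimentally. The sketch below uses SHA-256 from Python's standard `hashlib` module as an example hash function (our choice, not one named in the text), flips a single input bit and counts how many of the 256 output bits change.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


message = bytearray(b"The quick brown fox")
digest_before = hashlib.sha256(message).digest()

message[0] ^= 0x01  # flip one bit of the input
digest_after = hashlib.sha256(message).digest()

changed_bits = bit_difference(digest_before, digest_after)
print(len(digest_before) * 8)  # 256: the output size is fixed
print(changed_bits)            # expected to be in the region of 128
```

The output is 256 bits regardless of the input size, and a one-bit change in the input should alter roughly half of those bits.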

8.6 A Taxonomy of Cryptanalysis
The goal of cryptanalysis is to recover the plaintext, the key, or both. By Kerckhoffs’ Principle, we assume that Trudy the cryptanalyst has complete knowledge of the inner workings of the algorithm. Another basic assumption is that Trudy has access to the ciphertext (otherwise, why bother to encrypt?). If Trudy only knows the algorithms and the ciphertext, then she must conduct a ciphertext only attack. This is the most disadvantageous possible scenario from Trudy’s perspective.

Trudy’s chances of success might improve if she has access to known plaintext. That is, Trudy might know some of the plaintext and observe the corresponding ciphertext. These matched plaintext-ciphertext pairs might provide information about the key. If all of the plaintext were known, there would be little point in recovering the key. But it’s often the case that Trudy has access to (or can guess) some of the plaintext. For example, many kinds of data include stereotypical headers, e-mail being a good example. If such data is encrypted, the attacker can likely guess some of the plaintext and view the corresponding ciphertext.
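A minimal sketch of a known plaintext attack follows. The cipher here is a deliberately weak repeating-key XOR, which is our own assumption for the purpose of illustration, and Trudy is also assumed to know the key length. Guessing the stereotypical e-mail header `From: ` is then enough to recover the whole key.

```python
from itertools import cycle


def xor_repeating(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR the data with the key repeated as often as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


def recover_key(known_plaintext: bytes, ciphertext: bytes, key_length: int) -> bytes:
    """XOR known plaintext with ciphertext to expose the first key_length key bytes."""
    pairs = zip(known_plaintext[:key_length], ciphertext[:key_length])
    return bytes(p ^ c for p, c in pairs)


secret_key = b"K3Y!"  # unknown to Trudy
message = b"From: alice@example.com\nSubject: meeting moved to noon"
ciphertext = xor_repeating(secret_key, message)

guessed_header = b"From: "  # plaintext Trudy can guess
recovered_key = recover_key(guessed_header, ciphertext, key_length=4)
decrypted = xor_repeating(recovered_key, ciphertext)
print(recovered_key == secret_key)  # True
print(decrypted == message)         # True
```

Against a well-designed cipher, a handful of matched pairs reveals far less, but the example shows why predictable headers are useful to an attacker.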

Often, Trudy can actually choose the plaintext to be encrypted and see the corresponding ciphertext. Not surprisingly, this goes by the name of chosen plaintext attack. How is it possible for Trudy to choose the plaintext? We’ll see that some protocols encrypt anything that is sent and return the corresponding ciphertext. It’s also possible that Trudy could have limited access to a cryptosystem, allowing her to encrypt plaintext of her choice. For example, Alice might forget to log out of her computer when she takes her lunch break. Trudy could then encrypt some selected messages before Alice returns. This type of “lunchtime attack” takes many forms.

Potentially more advantageous for the attacker is an adaptively chosen plaintext attack. In this scenario, Trudy chooses the plaintext, views the resulting ciphertext and chooses the next plaintext based on the observed ciphertext. In some cases, this can make Trudy’s job significantly easier. Related key attacks are also significant in some applications. The idea here is to look for a weakness in the system when the keys are related in some special way. There are other types of attacks that cryptographers occasionally worry about, mostly when they feel the need to publish another academic paper. In any case, a cipher can only be considered secure if no successful attack is known.

Finally, there is one particular attack scenario that only applies to public key cryptography. Suppose Trudy intercepts a ciphertext that was encrypted with Alice’s public key. If Trudy suspects that the plaintext message was either “yes” or “no,” then she can encrypt both of these putative plaintexts with Alice’s public key. If either matches the ciphertext, then the message has been broken. This is known as a forward search. Although a forward search will not succeed against a symmetric cipher, we’ll see that this approach can be used to attack hash functions in some applications. We’ve previously seen that the size of the key space must be large enough to prevent an attacker from trying all possible keys. The forward search attack implies that in public key crypto, we must also ensure that the size of the plaintext message space is large enough that the attacker cannot simply encrypt all possible plaintext messages.
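The forward search can be sketched as follows. The example assumes deterministic textbook RSA with a toy public key built from two small Mersenne primes; the key values and the function names are illustrative only. Note that Trudy needs nothing but the public key and a list of candidate plaintexts.

```python
from typing import Iterable, Optional

# Alice's public key (toy values, assumed for illustration).
N = (2**31 - 1) * (2**61 - 1)
E = 65537


def encrypt(message: bytes) -> int:
    """Deterministic public key encryption: the same input gives the same output."""
    m = int.from_bytes(message, "big")
    if m >= N:
        raise ValueError("message too long for this toy modulus")
    return pow(m, E, N)


def forward_search(ciphertext: int, candidates: Iterable[bytes]) -> Optional[bytes]:
    """Encrypt each putative plaintext and compare with the intercepted ciphertext."""
    for candidate in candidates:
        if encrypt(candidate) == ciphertext:
            return candidate
    return None


intercepted = encrypt(b"yes")  # sent to Alice, observed by Trudy
guess = forward_search(intercepted, [b"no", b"yes"])
print(guess)  # b'yes'
```

If the true plaintext is not among the candidates, the search returns nothing, which is why a sufficiently large plaintext message space defeats the attack.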

Summary
• Cryptography is the science of using mathematics to encrypt and decrypt data.
• Cryptography enables you to store sensitive information or transmit it across insecure networks (like the Internet) so that it cannot be read by anyone except the intended recipient.
• Data that can be read and understood without any special measures is called plaintext or cleartext.
• A key is used to configure a cryptosystem for encryption and decryption.
• Kerckhoffs’ Principle can be extended to cover aspects of security other than cryptography.
• To encrypt with a double transposition cipher, we first write the plaintext into an array of a given size and then permute the rows and columns according to specified permutations.
• Modern block ciphers use complex algorithms to generate ciphertext from plaintext (and vice versa) but at a higher level, a block cipher can be viewed as a codebook, where each key determines a distinct codebook.
• In the post World War II era, cryptography finally moved from a “black art” into the realm of science.
• Confusion is designed to obscure the relationship between the plaintext and ciphertext, while diffusion is supposed to spread the plaintext statistics through the ciphertext.
• A simple substitution cipher and a one-time pad employ only confusion, whereas a double transposition is a diffusion-only cipher.
• The National Bureau of Standards, or NBS, issued a request for cryptographic algorithms.
• Public key cryptography has another somewhat surprising and extremely useful feature, for which there is no parallel in the symmetric key world.
• Modern symmetric ciphers can be subdivided into stream ciphers and block ciphers.
• The goal of cryptanalysis is to recover the plaintext, the key, or both.

Self Assessment
1. __________ is the science of using mathematics to encrypt and decrypt data.
a. Cryptography
b. Authentication
c. Authorization
d. Firewall

2. ___________ is the science of analysing and breaking secure communication.
a. Cryptography
b. Cryptanalysis
c. Encryption
d. Decryption

3. Which of the following statements is false?
a. Cryptography enables you to store sensitive information or transmit it across insecure networks.
b. Cryptology embraces only cryptography and not cryptanalysis.
c. Cryptanalysis is the breaking of secret codes.
d. A cipher or cryptosystem is used to encrypt data.

4. Data that can be read and understood without any special measures is called ________.
a. ciphertext
b. secret code
c. cryptosystem
d. plaintext

5. The process of reverting ciphertext to its original plaintext is called ___________.
a. decryption
b. encryption
c. cryptography
d. cryptanalysis

6. A __________ is used to configure a cryptosystem for encryption and decryption.
a. protocol
b. data
c. key
d. code

7. __________ plaintext results in unreadable gibberish called ciphertext.
a. Encrypting
b. Decrypting
c. Locking
d. Reverting

8. ____________ is designed to obscure the relationship between the plaintext and ciphertext.
a. Confusion
b. Diffusion
c. Transposition
d. Crypto

9. ___________ efforts can easily recover algorithms from software, and algorithms embedded in tamper-resistant hardware are susceptible to similar attacks.
a. Classic crypto
b. Reverse engineering
c. Engineering
d. Kerckhoffs’ Principle

10. The ___________ requires a key consisting of a randomly selected string of bits that is the same length as the message.
a. substitution cipher
b. codebook cipher
c. Vernam cipher
d. one-time pad

Case Study I

Application of Information Technology in Airline Reservations
A computer reservation system or CRS is a computerised system that is used for reservation purposes in airlines, railways and buses. This computerised system is used to store and retrieve information and to conduct transactions related to the transport. This was originally designed and operated by airlines.

Thus, information technology has made the airline reservation system quite effective and less time consuming. It is observed that the reservation system in airlines is almost error free and provides various facilities.

A typical airline system includes the following information:
• Flight details: This includes information like the starting destination and end destination, along with the stops in between and the number of seats booked/available.
• Customer description: This section includes the customer’s name, code, address, phone number, and seat number allotted. This information is useful for keeping the records of customers for any emergency.
• Reservation description: This includes information like the customer’s code number, flight number, date of booking, and date of travelling.
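The three groups of information listed above can be pictured as three record types. The sketch below is one possible way to model them and is not taken from any real reservation system; every class, field and function name is hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Flight:
    """Flight details: route, stops and seat counts."""
    flight_number: str
    origin: str
    destination: str
    stops: List[str] = field(default_factory=list)
    seats_total: int = 0
    seats_booked: int = 0

    @property
    def seats_available(self) -> int:
        return self.seats_total - self.seats_booked


@dataclass
class Customer:
    """Customer description, including the seat number allotted."""
    code: str
    name: str
    address: str
    phone: str
    seat_number: Optional[str] = None


@dataclass
class Reservation:
    """Reservation description linking a customer code to a flight number."""
    customer_code: str
    flight_number: str
    booking_date: date
    travel_date: date


def book(flight: Flight, customer: Customer, seat: str,
         booking_date: date, travel_date: date) -> Reservation:
    """Allot a seat if one is free and return the reservation record."""
    if flight.seats_available <= 0:
        raise ValueError("no seats available on this flight")
    flight.seats_booked += 1
    customer.seat_number = seat
    return Reservation(customer.code, flight.flight_number, booking_date, travel_date)


flight = Flight("XY101", "Pune", "Delhi", stops=["Jaipur"], seats_total=2)
customer = Customer("C001", "A. Passenger", "Pune", "000-0000")
reservation = book(flight, customer, "12A", date(2013, 5, 1), date(2013, 5, 20))
print(flight.seats_available)  # 1
```

Linking the reservation to the customer and the flight by their codes, rather than copying their details, mirrors the way the reservation description is stated in the list above.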

The computerised reservation system has made the reservation procedure of airlines easier and less time-consuming. The airlines are able to meet their ticket sales target and maintain their tickets selling ratio. It also reduces confusion from the point of view of customers. Thus, information technology has transformed the reservation system in the airline industry and relieved the customers from a tedious and prolonged reservation procedure.

Questions
1. How is information technology useful in airline reservation?
Answer
The computerised reservation system or CRS is used in airline reservation. This form of information technology is used to store and retrieve information and to conduct transactions related to the transport.

2. What are the advantages of the computerised reservation system for the airline industry?
Answer
The advantages of the computerised reservation system for the airline industry are as follows:
• The computerised reservation system has made the airline reservation system quite effective and less time-consuming.
• It is observed that the reservation system in airlines is almost error free and provides various facilities.
• Due to CRS, the airlines are able to meet their ticket sales target and maintain their tickets selling ratio.
• It has also reduced confusion from the point of view of customers and has made travelling a lot easier.

3. Which information is available on a typical airline system?
Answer
A typical airline system includes the following information:
• Flight details: It includes information like the starting destination and end destination, along with the stops in between, and the number of seats booked/available.
• Customer description: This section includes the customer’s name, code, address, phone number, and seat number allotted. This information is useful for keeping the records of customers for any emergency.
• Reservation description: It includes information like the customer’s code number, flight number, date of booking, and date of travelling.

Case Study II

Introduction
XYZ Company has more than one computer; chances are that they could benefit from networking them. A local area network (LAN) connects a company’s computers, allowing them to share and exchange a variety of information. While one computer can be useful on its own, several networked computers can be much more useful. Here are some of the ways a computer network can help their business:

File sharing
A network makes it easy for everyone to access the same file and prevents people from accidentally creating different versions.

Printer sharing
If XYZ Company uses a computer, chances are they also use a printer. With a network, several computers can share the same printer. Although they might need a more expensive printer to handle the added workload, it’s still cheaper to use a network printer than to connect a separate printer to every computer in their office.

Communication and collaboration
It is hard for people to work together if no one knows what anyone else is doing. A network allows employees to share files, view other people’s work and exchange ideas more efficiently. In a larger office, you can use e-mail and instant messaging tools to communicate quickly and to store messages for future reference.

Organisation
A variety of scheduling software is available that makes it possible to arrange meetings without constantly checking everyone’s schedules. This software usually includes other helpful features, such as shared address books and to-do lists.

Remote access
Having one’s own network allows greater mobility while maintaining the same level of productivity. With remote access in place, users are able to access the same files, data, and messages even when they’re not in the office. This access can even be given to mobile handheld devices.

Data protection
It is vital to back up computer data regularly. A network makes it easier to back up all of the company’s data on an offsite server, a set of tapes, CDs, or other backup systems.

Conclusion
The bottom line is that if XYZ Company uses a LAN in their organisation, it will be easy for them to share data, communicate with each other, protect data and share the resources connected to the same LAN.

(Source: http://www.allbusiness.com/technology/computer-networking/994-1.html)

Questions
1. What is a LAN?
2. Why does file sharing become easy on a LAN?
3. How will you communicate with each other using a LAN?

Case Study III

Introduction
Tata Communications is a leading global provider of a new world of communications. Obtaining a leading position in emerging markets, Tata Communications leverages its advanced solutions capabilities and domain expertise across its global and pan-India network to deliver managed solutions to multi-national enterprises, service providers and Indian consumers.

The Tata Global Network includes one of the most advanced and largest submarine cable networks, a Tier-1 IP network, with connectivity to more than 200 countries across 400 PoPs, and nearly 1 million square feet of data centre and collocation space worldwide. Reach of Tata Communications in emerging markets includes leadership in Indian enterprise data services, leadership in global international voice, and strategic investments in operators in South Africa (Neotel), Sri Lanka (Tata Communications Lanka Limited), and Nepal (United Telecom Limited). Tata Communications Limited is listed on the Bombay Stock Exchange and the National Stock Exchange of India and its ADRs are listed on the New York Stock Exchange.

Let us see how Tata Communications uses advanced networking in their company.

WAN
Tata Communications uses a WAN, i.e., a Wide Area Network. Wide Area Networks (WANs) connect LANs together between cities. A WAN is a geographically-dispersed collection of LANs. A network device called a “router” connects LANs to a WAN. The main difference between a MAN and a WAN is that the WAN uses long distance carriers. Otherwise, the same protocols and equipment are used as in a MAN.

Transmission mode
The transmission mode used in this organisation is full-duplex. Full-duplex Ethernet hubs are hubs which allow two-way communication between hubs, thus doubling the available bandwidth from 10 Mbps to 20 Mbps. Full-duplex hubs are proprietary products and normally only work within their own manufacturer’s line.

Communication media
In Tata Communications, devices are connected to a switch using cat-5 UTP (unshielded twisted pair) straight Cu-cable. The end connectors used for the cat-5 cables are called RJ-45 (Registered Jack 45) connectors.

Guided transmission media
Guided transmission media use a “cabling” system that guides the data signals along a specific path. The data signals are bound by the “cabling” system. Guided media are also known as bound media.

• Cabling is meant in a generic sense in the previous sentences and is not meant to be interpreted as copper wire cabling only.
• Unguided transmission media consist of a means for the data signals to travel but nothing to guide them along a specific path.
• The data signals are not bound to a cabling medium and as such are often called unbound media.
• There are four basic types of guided media. They are as follows:
  - Open wire
  - Twisted pair
  - Coaxial cable
  - Optical fibre

Network topology
In Tata Communications, the tree topology comprises multiple star topologies on a bus. Only hub devices are connected directly to the tree bus. Each hub functions as the root of a tree of network devices.
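A tree topology of this kind can be pictured as a bus to which hubs are attached, with each hub at the centre of its own star. The sketch below is a simplified model with made-up hub and device names; it is not a description of the actual Tata Communications network. It traces the hops a message takes between two devices.

```python
from typing import Dict, List

# Each hub sits on the shared bus and is the root of a star of devices.
# All hub and device names are hypothetical.
tree_bus: Dict[str, List[str]] = {
    "hub-A": ["pc-1", "pc-2"],
    "hub-B": ["pc-3", "printer-1"],
}


def find_hub(device: str) -> str:
    """Return the hub whose star contains the given device."""
    for hub, devices in tree_bus.items():
        if device in devices:
            return hub
    raise KeyError(f"unknown device: {device}")


def path(source: str, target: str) -> List[str]:
    """List the hops between two devices in the tree topology."""
    source_hub, target_hub = find_hub(source), find_hub(target)
    if source_hub == target_hub:
        # Both devices hang off the same hub, so the bus is not used.
        return [source, source_hub, target]
    return [source, source_hub, "bus", target_hub, target]


print(path("pc-1", "pc-2"))       # ['pc-1', 'hub-A', 'pc-2']
print(path("pc-1", "printer-1"))  # ['pc-1', 'hub-A', 'bus', 'hub-B', 'printer-1']
```

Traffic between devices on the same hub stays within that star, while traffic between stars must cross the bus, which is why the bus is the shared resource in this topology.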

Questions
1. Which topology does Tata Communications use?
2. Why is a WAN used? Give reasons.
3. What are the types of guided media?

Bibliography

ReferencesAdvances in classical communication for network quantum information theory • [Video online] Available at: <http://www.youtube.com/watch?v=Vx3Nr2zuFbI> [Accessed 27 May 2013].An Information-Theoretic Cryptanalysis• [Pdf] Available at: <http://www.mit.edu/~medard/mpapers/isita08paper.pdf>[Accessed 27 May 2013].An Introduction to Cryptography, • [Pdf] Available at: <http://www.mavi1.org/web_security/cryptography/pgp/pgp_pdf_files/IntrotoCrypto.pdf>[Accessed28September2012].Bosworth, B., 1982. • Codes, Ciphers, and Computers: An Introduction to Information Security, Hayden Book Co.Cobb, C., 2004. • Cryptography For Dummies, John Wiley & Sons.Computer Communication - Network media• [Video online] Available at: <https://www.youtube.com/watch?v=_6SSiNIzfGc> [Accessed 26 May 2013].Computer Fundamentals• [Pdf] Available at: <http://www.cl.cam.ac.uk/teaching/1011/CompFunds/CompFunds.pdf> [Accessed 26 May 2013].ComputerFundamentals -1Uses ofComputers.flv• [Video online] Available at:<https://www.youtube.com/watch?v=OTFnkulI45c> [Accessed 26 May 2013].Computer Operations• [Pdf] Available at: <http://research.microsoft.com/en-us/um/cambridge/events/needhambook/cap.pdf> [Accessed 26 May 2013].Computer Peripherals• [Pdf] Available at: <http://cbse.gov.in/Chapter%201%20computer%20for%20fmml.pdf> > [Accessed 26 May 2013].Computer Peripherals: How to Replace Corrupt RAM on Your Computer• [Video online] Available at: < https://www.youtube.com/watch?v=uwHUbF3KaWE> [Accessed 26 May 2013].Curt, W., 2010. • Data Communications and Computer Networks A Business User’s Approach. Course Technology publications. Data Communication and its Networks • [Pdf]Availableat:<http://memberfiles.freewebs.com/00/88/103568800/documents/Data.And.Computer.Communications.8e.WilliamStallings.pdf> [Accessed 26 May 2013].Data Communications and Computer Network• s [Pdf] Available at: <http://www.csi.ucd.ie/staff/jcarthy/home/CourseNotes/Networks1.pdf> [Accessed 26 May 2013].David, A. 
and John, L., 2008. • Computer organisation and design: The hardware/software interface, Morgan Kaufmann publications, 4th ed.Entropy and Information Theory• [Pdf] Available at: <http://ee.stanford.edu/~gray/it.pdf> [Accessed 27 May 2013].Fazlollah, M., 2010. • An Introduction to Information Theory, Dover Publications.Frank, J., Les, F., 2004. • How Networks Work, Que publications. Fundamentals of Computer• [Pdf] Available at: <http://uotechnology.edu.iq/dep-production/textbook_computer/Fundamentals_of_Computers.pdf> [Accessed 26 May 2013].Fundamentals of Computer Programming• [Video online] Available at:<https://www.youtube.com/watch?v=RsmBNl8U0J4> [Accessed 26 May 2013].Futuristic Peripherals to Transform your Computer!• [Video online] Available at: https://www.youtube.com/watch?v=UTYTD27x-m0> [Accessed 26 May 2013].Hamacher, C., Vranesic, Z. and Zaky, S., 2001. • Computer Organization, 5th ed., McGraw-Hill, Science/Engineering/Math Publications.

Page 164: Introduction to Information Theory and Applicationsjnujprdistance.com/assets/lms/LMS JNU/B.Sc.(IT)/Sem I/Introduction...Summary ... 8.3.6 Project VENONA ... Table 8.2 VENONA Decrypt

Introduction to Information Theory and Applications

152

Information theory and Coding• [Video online] Available at:< http://www.youtube.com/watch?v=UrefKMSEuAI&list=PL05BAE4D2A7018795> [Accessed 27 May 2013].Introduction to Computer Networks• [Pdf] Available at:< http://vfu.bg/en/e-Learning/Computer-Networks--Introduction_Computer_Networking.pdf> [Accessed 26 May 2013].Introduction to Computer Peripherals• [Pdf] Available at: <http://faculty.ivytech.edu/~smilline/downloads/hardware.pdf> [Accessed 26 May 2013].I• ntroduction to Information Security [Online] Available at: <http://apps.americanbar.org/abastore/products/books/abstracts/5450058%20excerpt_abs.pdf> [Accessed 22 October 2012].Introduction to Information Security• [Online] Available at: <www.csudh.edu/.../Introduction%20to%20Information%20Security.p> [Accessed 22 October 2012].Introduction to Information Security and Risk Management,• [Video online] Available at: <http://www.youtube.com/watch?v=n81w0zCkRR4> [Accessed 22 October 2012].Introduction to Information Security,• [Video online] Available at: <http://www.youtube.com/watch?v=yFRc-wpQc9c> [Accessed 22 October 2012].Introduction, Computer Operations, Data, and Program Development • [Pdf] Available at: <http://www.meteor.iastate.edu/classes/mt227/lectures/Intro_to_Fortran.pdf> [Accessed 26 May 2013].Crypto Basics --- crypto history, ciphers of election of 1876,• [Video online] Available at: <http://www.youtube.com/watch?v=ZwIfquvaDoE> [Accessed 28 September 2012].Jeremy, 2011, part 1, • Information Security: Principles and Practice, [Video online] Available at: <http://www.youtube.com/watch?v=vdr74e7D9IU> [Accessed 28 September 2012].John, R. P., 1980. • An Introduction to Information Theory. 2nd ed, Dover publications.Larry, L., 2004. • Computer Fundamentals, Dreamtech Press.Lean Value Stream Mapping - Computer Operations• [Video online] Available at: <https://www.youtube.com/watch?v=GqxAPjrx-7s> > [Accessed 26 May 2013].Long, L. and Nancy, L., 2004. • Computers. 
12th ed., Prentice Hall Publications.Natalia, O. and Victor, O., 2010. • Computer Networks: Principles, Technologies and Protocols for Network Design, Wiley Publication.Networking Basics, Free Tutorial • [Video online] Available at: <https://www.youtube.com/watch?v=fMxO_8F9ADg> [Accessed 26 May 2013].Numerical Problems in Computer Networks • [Video online] Available at: <https://www.youtube.com/watch?v=crYVUyVD-zw> [Accessed 26 May 2013].Overview of Computer Network • [Pdf] Available at:< http://heather.cs.ucdavis.edu/~matloff/Networks/Intro/NetIntro.pdf> [Accessed 26 May 2013].Rajaraman, V., 1996. • Fundamentals of Computers. 2nd ed., Prentice-hall of India.RedPower Control/Computer - Strings/Input/IO/Variables & Stack• [Video online] Available at: <https://www.youtube.com/watch?v=_v_Jma1xX5s> [Accessed 26 May 2013].Stallings, W., 2010. • Data and Computer Communications. Prentice Hall Publication. Stamp, M., 2011. • Information Security: Principles and Practice, 2nd ed., John Wiley & Sons.The Basics of Cryptography• , [Online] Available at: <http://www.pgpi.org/doc/pgpintro/> [Accessed 28 Sept 2012].Watkins, G. S., 2008. • An Introduction to Information Security and ISO27001: A Pocket Guide, IT Governance.


Recommended Reading

• Andrew, S. and David, J., 2010. Computer Networks, Prentice Hall Publication.
• Arthur, H., 1968. Computer peripherals and typesetting: A study of the man-machine interface incorporating a survey of computer peripherals and typographic composing equipment, H.M.S.O.
• Attiya, H. and Welch, J., 2004. Distributed Computing: Fundamentals, Simulations, and Advanced Topics, John Wiley & Sons.
• Bell, D. A., 1968. Information theory and its engineering applications, Pitman.
• Coope, R. and Mukai, K., 1993. Theory and Its Applications, Center for the Study of Language.
• Dr Crispin, T. and Lengel, L., 2004. Computer Mediated Communication, Sage Publications Ltd.
• Hershey, J., 2002. Cryptography Demystified, McGraw-Hill Prof Med/Tech.
• James, F. and Keith, W., 2009. Computer Networking: A Top-Down Approach, Addison Wesley Publication.
• James, F. and Kurose, W., 2009. Computer Networking: A Top-Down Approach, Addison Wesley Publication.
• Kamra, A. and Bhambri, P., 2008. Computer Peripherals And Interfaces, Technical Publications.
• Kumar, A. S., 2005. Computer Networks, Firewall Media Publications.
• Larry, L., 2004. Computer Fundamentals, Dreamtech Press.
• Mano, M., Charles, R. and Kime, 2004. Logic and computer design fundamentals, Volume 1, Pearson/Prentice Hall.
• Niit, 2008. Introduction to Information Security Risk Management, Prentice-Hall of India Pvt. Ltd.
• Parhami, B., 2009. Computer Fundamentals: Architecture and Organisation, New Age International.
• Paul, C., 2004. Information Theory And Statistics, Now Publishers Inc.
• Rainer, K. R. & Cegielski, G. C., 2010. Introduction to Information Systems: Enabling and Transforming Business, 3rd ed. John Wiley & Sons.
• Ram, B., 2000. Computer arithmetic algorithms, A K Peters Ltd Publication.
• Ram, B., 2000. Computer Fundamentals: Architecture and Organization, New Age International.
• Ryabko, B. & Fionov, A., 2005. Basics of Contemporary Cryptography for IT Practitioners, World Scientific.
• Smith, 1997. Internet Cryptography, Pearson Education India.
• Snehi, J., 2006. Computer Peripherals and Interfacing, Firewall Media.
• Whitman, E. M. & Mattord, J. H., 2011. Principles of Information Security, 4th ed., Cengage Learning.


Self Assessment Answers

Chapter I
1. c  2. b  3. a  4. d  5. a  6. b  7. c  8. a  9. d  10. b

Chapter II
1. b  2. d  3. a  4. c  5. a  6. d  7. c  8. b  9. b  10. c

Chapter III
1. b  2. d  3. a  4. c  5. a  6. d  7. c  8. b  9. b  10. c

Chapter IV
1. b  2. c  3. a  4. d  5. d  6. a  7. c  8. b  9. d  10. a


Chapter V
1. b  2. c  3. a  4. d  5. b  6. c  7. a  8. b  9. a  10. d

Chapter VI
1. b  2. c  3. d  4. b  5. c  6. b  7. a  8. a  9. c  10. b

Chapter VII
1. a  2. a  3. c  4. b  5. a  6. d  7. a  8. a  9. b  10. c

Chapter VIII
1. a  2. b  3. b  4. d  5. a  6. c  7. a  8. a  9. b  10. d