1 intro crypto

Upload: janhavi-vishwanath

Post on 08-Aug-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/22/2019 1 Intro Crypto

    1/26

    Chapter 1 2301373: Introduction 1

    Introduction

  • 8/22/2019 1 Intro Crypto

    2/26

    Chapter 1 2301373: Introduction 2

    What is a Compiler?

    A compileris a computer

    program that translates aprogram in a source language

    into an equivalent program in a

    target language.

    A source program/code is a

    program/code written in the

    source language, which is

    usually a high-level language.

    A target program/code is a

    program/code written in thetarget language, which often is a

    machine language or an

    intermediate code.

    compilerSource

    programTarget

    program

    Error

    message

  • 8/22/2019 1 Intro Crypto

    3/26

    Chapter 1 2301373: Introduction 3

    Process of Compiling

    scannerparser

    Semantic analyzerIntermediate code generatorCode optimization

    Code generatorCode optimization

    Stream of characters

    Stream of tokens

    Parse/syntax tree

    Annotated tree

    Intermediate code

    Intermediate codeTarget code

    Target code

  • 8/22/2019 1 Intro Crypto

    4/26

    Chapter 1 2301373: Introduction 4

    Some Data Structures

    Symbol table

    Literal table

    Parse tree

  • 8/22/2019 1 Intro Crypto

    5/26

    Chapter 1 2301373: Introduction 5

    Symbol Table

    Identifiers are names of variables,

    constants, functions, data types, etc.

    Store information associated with identifiers

    Information associated with different types of

    identifiers can be different

    Information associated with variables are name, type,

    address,size (for array), etc.

    Information associated with functions are name,typeof return value, parameters, address, etc.

  • 8/22/2019 1 Intro Crypto

    6/26

    Chapter 1 2301373: Introduction 6

    Symbol Table (contd)

    Accessed in every phase of compilers

    The scanner, parser, and semantic analyzer put

    names of identifiers in symbol table.

    The semantic analyzer stores more information

    (e.g. data types) in the table.

    The intermediate code generator, code

    optimizer and code generator use information in

    symbol table to generate appropriate code. Mostly use hash table for efficiency.

  • 8/22/2019 1 Intro Crypto

    7/26

    Chapter 1 2301373: Introduction 7

    Literal table

    Store constants and strings used in program

    reduce the memory size by reusing constants

    and strings

    Can be combined with symbol table

  • 8/22/2019 1 Intro Crypto

    8/26

    Chapter 1 2301373: Introduction 8

    Parse tree

    Dynamically-allocated, pointer-based

    structure

    Information for different data types

    related to parse trees need to be stored

    somewhere.

    Nodes are variant records, storing

    information for different types of data

    Nodes store pointers to information storedin other data structure, e.g. symbol table

  • 8/22/2019 1 Intro Crypto

    9/26

    Chapter 1 2301373: Introduction 9

    Scanning

    A scanner reads a stream of characters and

    puts them together into some meaningful

    (with respect to the source language) units

    called tokens .

    It produces a stream of tokens for the next

    phase of compiler.

  • 8/22/2019 1 Intro Crypto

    10/26

    Chapter 1 2301373: Introduction 10

    Parsing

    A parser gets a stream of tokens from the

    scanner, and determines if the syntax

    (structure) of the program is correct

    according to the (context-free) grammar of

    the source language.

    Then, it produces a data structure, called a

    parse tree or an abstract syntax tree, which

    describes the syntactic structure of theprogram.

  • 8/22/2019 1 Intro Crypto

    11/26

    Chapter 1 2301373: Introduction 11

    Semantic analysis

    It gets the parse tree from the parser together with

    information about some syntactic elements It determines if the semantics or meaning of the

    program is correct.

    This part deals with static semantic. semantic of programs that can be checked by readingoff from the program only.

    syntax of the language which cannot be described incontext-free grammar.

    Mostly, a semantic analyzer does type checking.

    It modifies the parse tree in order to get that(static) semantically correct code.

  • 8/22/2019 1 Intro Crypto

    12/26

    Chapter 1 2301373: Introduction 12

    Intermediate code generation

    An intermediate code generator

    takes a parse tree from the semantic analyzer

    generates a program in the intermediate

    language.

    In some compilers, a source program is

    translated into an intermediate code first and

    then the intermediate code is translated into

    the target language. In other compilers, a source program is

    translated directly into the target language.

  • 8/22/2019 1 Intro Crypto

    13/26

    Chapter 1 2301373: Introduction 13

    Intermediate code generation (contd)

    Using intermediate code is beneficial whencompilers which translates a single sourcelanguage to many target languages arerequired.

    The front-end of a compilerscanner tointermediate code generator can be used forevery compilers.

    Different back-endscode optimizer and code

    generator is required for each target language. One of the popular intermediate code is

    three-address code. A three-address codeinstruction is in the form of x= y op z.

  • 8/22/2019 1 Intro Crypto

    14/26

    Chapter 1 2301373: Introduction 14

    Code optimization

    Replacing an inefficient sequence of

    instructions with a better sequence of

    instructions.

    Sometimes called code improvement.

    Code optimization can be done:

    after semantic analyzing performed on a parse tree

    after intermediate code generation performed on a intermediate code

    after code generation performed on a target code

  • 8/22/2019 1 Intro Crypto

    15/26

    Chapter 1 2301373: Introduction 15

    Code generation

    A code generator

    takes either an intermediate code or a parse

    tree

    produces a target program.

  • 8/22/2019 1 Intro Crypto

    16/26

    Chapter 1 2301373: Introduction 16

    Error Handling

    Error can be found in every phase of

    compilation.

    Errors found during compilation are called static

    (orcompile-time) errors.

    Errors found during execution are calleddynamic(orrun-time) errors

    Compilers need to detect, report, and

    recover from error found in source programs Error handlers are different in different

    phases of compiler.

  • 8/22/2019 1 Intro Crypto

    17/26

    Chapter 1 2301373: Introduction 17

    a compiler which generates target code for a

    different machine from one on which the

    compiler runs.

    A host language is a language in which the

    compiler is written.

    T-diagram

    Cross compilers are used very often in

    practice.

    Cross Compiler

    S

    H

    T

  • 8/22/2019 1 Intro Crypto

    18/26

    Chapter 1 2301373: Introduction 18

    Cross Compilers (contd)

    If we want a compiler from

    languageA to language B on amachine with language E,

    write one with E

    write one with D if you have acompiler from D to Eon some

    machine

    It is better than the former approach

    ifD is a high-level language but Eis

    a machine language

    write one from G to B with Eif we

    have a compiler from A to G

    written in E

    A

    E

    B

    D

    ?

    EA

    DB

    G

    E

    BA

    E

    G

  • 8/22/2019 1 Intro Crypto

    19/26

    Chapter 1 2301373: Introduction 19

    Porting

    Porting: construct a compiler between a

    source and a target language using onehost language from another host language

    AA

    KA

    H

    H A

    H

    K

    AA

    K

    A

    H

    K A

    K

    K

  • 8/22/2019 1 Intro Crypto

    20/26

    Chapter 1 2301373: Introduction 20

    Bootstrapping

    If we have to implement, from

    scratch, a compiler from ahigh-level language A to a

    machine, which is also a host,

    language,

    direct method

    bootstrapping

    AH

    H

    A

    A1

    H

    A1

    A2

    H

    A2A3

    HA

    3

    H

    H

  • 8/22/2019 1 Intro Crypto

    21/26

    Chapter 1 2301373: Introduction 21

    Cousins of Compilers

    Linkers

    Loaders

    Interpreters

    Assemblers

  • 8/22/2019 1 Intro Crypto

    22/26

    Chapter 1 2301373: Introduction 22

    History (1930s -40s)

    1930s

    John von Neumann invented the concept of

    stored-program computer.

    Alan Turing defined Turing machine and

    computability.

    1940s

    Many electro-mechanic, stored-program

    computers were constructed. ABC (Atanasoff Berry Computer) at Iowa

    Z1-4 (by Zuse) in Germany

    ENIAC (programmed by a plug board)

  • 8/22/2019 1 Intro Crypto

    23/26

    Chapter 1 2301373: Introduction 23

    History : 1950

    Many electronic, stored-program computers were

    designed. EDVAC (by von Neumann)

    ACE (by Turing)

    Programs were written in machine languages.

    Later, programs are written in assembly languagesinstead. Assemblers translate symbolic code and memory

    address to machine code.

    John Backus developed FORTRAN (no recursivecall) and FORTRAN compiler.

    Noam Chomsky studied structure of languagesand classified them into classes called Chomskyhierarchy.

    0A 1F 83 90 4B

    op code, address,..

    LDI B, 4

    LDI C, 3

    LDI A, 0

    ST: ADI A, C

    DEC BJNZ B, ST

    STO 0XF0, A

    Grammar

  • 8/22/2019 1 Intro Crypto

    24/26

    Chapter 1 2301373: Introduction 24

    History (1960s)

    Recursive-descent parsing was introduced.

    Nuar designed Algol60, Pascals ancestor,

    which allows recursive call.

    Backus-Nuar form (BNF) was used to

    described Algol60.

    LL(1) parsing was proposed by Lewis and

    Stearns.

    General LR parsing was invented by Knuth.

    SLR parsing was developed by DeRemer.

  • 8/22/2019 1 Intro Crypto

    25/26

    Chapter 1 2301373: Introduction 25

    History (1970s)

    LALR was develpoed by DeRemer.

    Aho and Ullman founded the theory of LR

    parsing techniques.

    Yacc (Yet Another Compiler Compiler) was

    developed by Johnson.

    Type inference was studied by Milner.

  • 8/22/2019 1 Intro Crypto

    26/26

    Chapter 1 2301373: Introduction 26

    Reading Assignment

    Louden, K.C., Compiler Construction:

    Principles and Practice, PWS Publishing,1997. ->Chapter 1