8/3/2019 Sisteme Integrate Ver4

2009-2010

Cosmin Ionete

Dragos Surlea

Nicolae Neagu

FACULTY OF AUTOMATION

EMBEDDED SYSTEMS


Contents

1. Embedded Systems Architecture

1.1 What is an embedded system?

1.2 Microprocessor and Microcontroller Architectures

1.3 Microprocessor/Microcontroller Basics

1.3.1 What is a microprocessor?

1.3.1.2 Microprocessor Fundamentals

1.3.2 What is a microcontroller?

1.3.3 Some differences between microprocessors and microcontrollers

1.4 Compiling, Linking, and Locating

1.4.1 The Build Process

1.4.2 Compiling

1.4.3 Linking

1.4.4 Locating

1.4.5 Downloading and Debugging

1.4.6 Emulators

1.5 Fixed points vs. Floating point numbers. Fundamentals

1.5.1 About Fixed-Point Numbers

1.5.2 Scaling

1.5.3 Quantization, Range and Precision

1.5.4 Recommendations for Arithmetic and Scaling

1.6 Microcontroller CPU, Interrupts, Memory, and I/O

1.6.1 CPU - Central Processing Unit

1.6.2 Interrupts

1.6.2.1 Vectored Interrupts & Non-Vectored Interrupts

1.6.2.2 Interrupt Priority

1.6.3 On-Chip Memory

1.6.3.1 Read-Only Memory (ROM)

1.6.3.2 Random-Access Memory (RAM)

1.6.3.3 Hybrid Types

1.6.4 I/O

1.6.4.1 Study of External Peripherals

1.6.4.2 Initialize the Hardware

1.7 Peripheral devices

1.7.1 Control and Status Registers

1.7.2 The Device Driver Philosophy

1.7.3 Timers/Counters

1.7.3.1 Reloading a timer

1.7.3.2 Input Capture Timer

1.7.3.3 Watchdog Timer

1.7.3.4 Using Timers

1.7.4 PWM

1.7.4.1 PWM Output

1.7.4.2 PWM Study

1.7.5 Digital-to-Analog Converters (DAC)

1.7.6 Analog-to-Digital Converters (ADC)

1.7.6.1 Reference Voltage

1.7.6.2 Resolution

Communication

1.7.6.3 UART

1.7.6.4 RS232

1.7.6.5 Serial Peripheral Interface

1.7.6.6 Local Interconnect Network (LIN)

1.7.6.7 Controller Area Network

2. IDE - Integrated Development Environment

2.1 Source Code Editor

2.2 Compiler

2.2.1 Front end

2.2.2 Back end

2.3 Linker

2.4 Debugger

3. Real-Time Operating Systems

3.1 Introduction

3.2 Defining an RTOS

3.3 The Scheduler

3.3.1 Schedulable Entities

3.3.2 Multitasking

3.3.3 The Context Switch

3.3.4 The Dispatcher

3.3.5 Scheduling Algorithms

3.4 Objects

3.4.1 Tasks

3.4.1.1 Introduction

3.4.1.2 Defining a Task

3.4.1.3 Task States and Scheduling

3.4.1.4 Typical Task Operations

3.4.2 Semaphores

3.4.2.1 Introduction

3.4.2.2 Defining Semaphores

3.4.2.3 Typical Semaphore Operations

3.5 Services

1. Embedded Systems Architecture

1.1 What is an embedded system?

An embedded system is a special-purpose computer system designed to perform one or a few dedicated functions. It is usually embedded as part of a complete device including hardware and mechanical parts. In contrast, a general-purpose computer, such as a personal computer, can do many different tasks depending on programming. Since the embedded system is dedicated to specific tasks, design engineers can optimize it, reducing the size and cost of the product, or increasing the reliability and performance. Complexity varies from low, with a single microcontroller chip, to very high, with multiple units, peripherals and networks mounted inside a large chassis or enclosure.

In general, "embedded system" is not an exactly defined term, as many systems have some element of programmability. For example, handheld computers share some elements with embedded systems, such as the operating systems and microprocessors which power them, but are not truly embedded systems, because they allow different applications to be loaded and peripherals to be connected.

Some current commercial applications of embedded systems include:

Market                  Embedded Device
Automotive              Ignition System; Engine Control; Brake System (Antilock Braking System); Interior/Exterior Lights
Consumer Electronics    Set-Top Boxes (DVDs, VCRs, Cable Boxes, etc.); Kitchen Appliances (Refrigerators, Toasters, Microwave Ovens); Cameras; Handheld tools; Remote control devices; Security systems; Global Positioning Systems (GPS); Cordless and cellular phones
Industrial Control      Robotics and Control Systems (Manufacturing); Electronic measurement instruments (e.g., digital multimeters, frequency synthesisers, and oscilloscopes)
Medical                 Infusion Pumps; Dialysis Machines; Prosthetic Devices; Hearing aids; Cardiac Monitors
Networking              Routers; Hubs; Gateways
Office Automation       Fax Machine; Monitors; Scanners; Photocopier; Printers


Selecting a particular processor for a given application is usually a function of the designer's familiarity with a particular architecture. While there are many variations in the details and specific features, there are two general categories of devices: microprocessors and microcontrollers. The key difference between a microprocessor and a microcontroller is that a microprocessor contains only a central processing unit (CPU) while a microcontroller has memory and I/O on the chip in addition to a CPU. Microcontrollers are generally used for dedicated tasks. Microcomputer is a general term that applies to complete computer systems implemented with either a microprocessor or microcontroller.

1.2 Microprocessor and Microcontroller Architectures

Microprocessors are generally utilized for relatively high performance applications where cost and size are not critical selection criteria. Because microprocessor chips have their entire function dedicated to the CPU and thus have room for more circuitry to increase execution speed, they can achieve very high levels of processing power. However, microprocessors require external memory and I/O hardware. Microprocessor chips are used in desktop PCs and workstations where software compatibility, performance, generality, and flexibility are important.

By contrast, microcontroller chips are usually designed to minimize the total chip count and cost by incorporating memory and I/O on the chip. They are often application specialized at the expense of flexibility. In some cases, the microcontroller has enough resources on-chip that it is the only IC required for a product. Examples of a single-chip application include the key fob used to arm a security system, a toaster, or hand-held games. The hardware interfaces of both devices have much in common, and those of the microcontrollers are generally a simplified subset of the microprocessor. The primary design goals for each type of chip can be summarized this way:

- microprocessors are most flexible
- microcontrollers are most compact

Microcontroller Architectures


A. Princeton (Von Neumann) vs. Harvard

Princeton (von Neumann):
- All memory space on same bus
- Every location has unique address
- So instructions and data treated the same way
- Possible bottleneck between instruction and data fetches
  - Overcome with instruction prefetching (overlapping, pipelining) and/or instruction/data caches
- Simplifies processor design: one memory interface
- More reliable: fewer things can fail
- Also RAM can be used for both data and instruction storage
  - Greater flexibility in design of software (esp. real-time OS)

There are also differences in the basic CPU architectures used, and these tend to reflect the application. Microprocessor based machines usually have a von Neumann architecture with a single memory for both programs and data to allow maximum flexibility in allocation of memory. Microcontroller chips, on the other hand, frequently embody the Harvard architecture, which has separate memories for programs and data. Figure 1.1 illustrates this difference.


[Figure: block diagrams. Left: a CPU connected to a single Program and Data Memory. Right: a CPU connected to a separate Data Memory and Program Memory.]

Figure 1.1 - At left is the von Neumann architecture; at right is the Harvard architecture

One advantage the Harvard architecture has for embedded applications is due to the two types of memory used in embedded systems. A fixed program and constants can be stored in non-volatile ROM memory while working variable data storage can reside in volatile RAM. Volatile memory loses its contents when power is removed, but non-volatile ROM memory always maintains its contents even after power is removed.
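In C, this split between fixed constants and working data is commonly expressed with the const qualifier. The sketch below is only an illustration: many embedded toolchains place const objects in ROM or flash and ordinary variables in RAM, but the actual placement depends on the compiler and linker configuration. The table contents and the names SEGMENT_CODE, digit_to_segments, and get_display_updates are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_LEN 4u

/* Fixed constants: because the table is const, a toolchain would typically
   place it in non-volatile ROM/flash (this is linker-dependent). */
static const uint8_t SEGMENT_CODE[TABLE_LEN] = { 0x3F, 0x06, 0x5B, 0x4F };

/* Working variable: lives in volatile RAM and is lost at power-off. */
static uint32_t display_updates = 0;

/* Look up a constant from the table and update the RAM counter. */
uint8_t digit_to_segments(uint8_t digit)
{
    assert(digit < TABLE_LEN);   /* guard against reading past the table */
    display_updates++;
    return SEGMENT_CODE[digit];
}

/* Read back the RAM counter. */
uint32_t get_display_updates(void)
{
    return display_updates;
}
```

The table never changes at run time, so it is a natural candidate for ROM, while the counter must be writable and therefore needs RAM.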

The Harvard architecture also has the potential advantage of a separate interface allowing twice the memory transfer rate by allowing instruction fetches to occur in parallel with data transfers. Unfortunately, in most Harvard architecture machines, the memory is connected to the CPU using a single bus, which limits this parallelism.

A typical embedded computer consists of the CPU, memory, and I/O. They are most often connected by means of a shared bus for communication. The peripherals on a microcontroller chip are typically timers, counters, serial or parallel data ports, and analog-to-digital and digital-to-analog converters that are integrated directly on the chip. The performance of these peripherals is generally less than that of dedicated peripheral chips, which are frequently used with microprocessor chips. However, having the bus connections, CPU, memory, and I/O functions on one chip has several advantages:

- Fewer chips are required since most functions are already present on the processor chip.
- Lower cost and smaller size result from a simpler design.
- Lower power requirements because on-chip power requirements are much smaller than external loads.
- Fewer external connections are required because most are made on-chip, and most of the chip connections can be used for I/O.
- More pins on the chip are available for user I/O since they aren't needed for the bus.
- Overall reliability is higher since there are fewer components and interconnections.

Of course there are disadvantages too, including:
- Reduced flexibility since you can't easily change the functions designed into the chip.
- Expansion of memory or I/O is limited or impossible.
- Limited data transfer rates due to practical size and speed limits for a single chip.
- Lower performance I/O because of design compromises to fit everything on one chip.

The von Neumann machine, with only one memory, requires all instruction and data transfers to occur on the same interface. This is sometimes referred to as the von Neumann bottleneck. In common computer architectures, this is the primary upper limit to processor throughput. The Harvard architecture has the potential advantage of a separate interface allowing twice the memory transfer rate by allowing instruction fetches to occur in parallel with data transfers. Unfortunately, in most Harvard architecture machines, the memory is connected to the CPU using a single bus, which limits this parallelism. The memory separation is still used to advantage in microcontrollers, as the program is usually stored in non-volatile memory (the program is not lost when power is removed), and the temporary data storage is in volatile memory.

Non-volatile memories, such as read-only memory (ROM), are used in both types of systems to store permanent programs. In a desktop PC, ROMs are used to store just the start-up or bootstrap programs and hardware specific programs. Volatile random access memory (RAM) can be read and written easily, but it loses its contents when power is removed. RAM is used to store both application programs and data in PCs that need to be able to run many different programs. In a dedicated embedded computer, however, the programs are stored permanently in ROM where they will always be available. Microcontroller chips that are used in dedicated applications generally use ROM for program storage and RAM for data storage.

B. CISC vs. RISC

CISC (Complex Instruction Set Computer)
- Tend to have many instructions in the instruction set
- Can carry out complex operations (many used very infrequently)
- Many are very long (many bits)
- And require many clock cycles

RISC (Reduced Instruction Set Computer)
- Few instructions
- Simple instructions
- Short (few bits) and fast
- Often orthogonal instruction sets
  - Can read/write/use all registers in same way
  - Allows for great power and flexibility
- Example: PICmicro
- Many other microcontrollers use RISC

Some microprocessors offer both CISC and RISC features

C. Microcoded versus Hardwired processors

The underlying design of a processor

Microcoded
- Processor within a processor
- Signals required to execute instructions are "fetched" from internal "Control ROM" memory
- Allows for great flexibility in instruction set
- Easier to design
- Slower than hardwired

Hardwired
- Signals required to execute an instruction are generated by logic gates (combinational circuitry)
- The "control matrix" is:
  - Faster
  - Less flexible

1.3 Microprocessor/Microcontroller Basics

Microprocessors vs. Microcontrollers

Microprocessors
- high end of the market, where performance matters
- high power dissipation
- high cost
- need peripheral devices to work
- mostly used in microcomputers

Microcontrollers
- targeted at the low end of the market, where performance does not matter
- low power dissipation
- low cost
- memory plus I/O devices, all integrated into one chip
- mostly used in embedded systems


1.3.1 What is a microprocessor?

A device that integrates a number of useful functions into a single IC package. Some functions are:

- Ability to execute a stored set of instructions to carry out user defined tasks.
- Ability to access external memory chips to read/write data from/to memory.
- Ability to interface with I/O devices.

There are three groups of signals, or buses, that connect the CPU to the other major components. The buses are:

- Data bus
- Address bus
- Control bus

- The concepts of address and data are fundamental to the operation of the microprocessor.
- Memory consists of locations uniquely identified by the CPU through their addresses.
- The CPU communicates with those addresses to read and write the data.
- The communications go via buses.
- The CPU is responsible for control of the address, data and control buses.
- All devices are attached to the data bus, so there is a potential clash.
- Devices connected to data buses can be driven to high-impedance states.

The ability of a device to set its output to logic 1, logic 0, or a high-impedance state is an essential feature of common bus systems; such a device is termed a tristate device.
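The way tristate outputs avoid a clash on a shared line can be sketched in a few lines of C. This is a simplified logical model, not an electrical one: the Level type and the resolve_bus_line function are names invented for this example, and the model treats any two simultaneously active drivers as a clash, even if they happen to drive the same level.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { LEVEL_0, LEVEL_1, LEVEL_Z, LEVEL_CLASH } Level;

/* Resolve one bus line from the outputs of all attached devices.
   LEVEL_Z means the device is in its high-impedance state. */
Level resolve_bus_line(const Level *outputs, size_t count)
{
    assert(outputs != NULL || count == 0);
    Level line = LEVEL_Z;                 /* nobody driving: line floats */
    for (size_t i = 0; i < count; i++) {
        if (outputs[i] == LEVEL_Z)
            continue;                     /* a released device is ignored */
        if (line != LEVEL_Z)
            return LEVEL_CLASH;           /* a second active driver */
        line = outputs[i];
    }
    return line;
}
```

As long as the CPU ensures that at most one device leaves the high-impedance state at a time, the line always carries a well-defined level.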

A. Data bus - transfers the data associated with the processing function of the microprocessor (8 lines, typically).

The data bus width is defined as the number of bits that can be transferred on the bus at one time. This defines the processor's word size. Many chip vendors define the word size based on the width of an internal data bus. A processor with eight data bus pins is an 8-bit CPU. Both instructions and data are transferred on the data bus one word at a time. This allows the re-use of the same connections for many different types of information. Due to packaging limitations, the number of connections or pins on a chip is limited. By sharing the pins in this way, the number of pins required is reduced at the expense of increased complexity in the external circuits. Many processors also take this a step further and share some or all of the data bus pins to carry address information as well. This is referred to as a multiplexed address/data bus. Processors that have multiplexed address/data buses require an external address latch to separate and hold the address information stable for the duration of a data transfer. The processor controls the direction of data transfer on the data bus (read/write).
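The role of the external address latch on a multiplexed address/data bus can be illustrated with a small simulation. This is a sketch under simplifying assumptions: an 8-bit shared bus, a 256-byte memory, and two software "phases" standing in for the hardware timing. The type MuxedBusSystem and the functions address_phase and data_phase are hypothetical names; the latch-enable strobe is called ALE on some processors, such as the 8051.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_SIZE 256u

typedef struct {
    uint8_t latched_addr;      /* output of the external address latch */
    uint8_t memory[MEM_SIZE];  /* memory addressed by the latch output */
} MuxedBusSystem;

/* Phase 1: the CPU drives the address onto the shared pins and pulses the
   latch-enable strobe; the latch captures and holds the address. */
void address_phase(MuxedBusSystem *sys, uint8_t ad_pins)
{
    assert(sys != NULL);
    sys->latched_addr = ad_pins;
}

/* Phase 2: the same pins now carry data, while the latch keeps the address
   stable. Returns the byte stored at the latched address after the phase. */
uint8_t data_phase(MuxedBusSystem *sys, bool write, uint8_t ad_pins)
{
    assert(sys != NULL);
    if (write)
        sys->memory[sys->latched_addr] = ad_pins;
    return sys->memory[sys->latched_addr];
}
```

Without the latch, the address would disappear from the pins as soon as the data phase began, and the memory would no longer know which location is being accessed.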

B. Address bus - contains the address of a specific memory location for accessing (reading/writing) stored data (16 lines, typically).

The address bus is a set of wires that are used to point to the memory or I/O location that is to be read from or written to. The address signals must generally be held at a constant value for some period of time before, during, and after the data is transferred. In most cases, the processor actively drives the address bus with either instruction or data addresses.
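The number of address lines fixes how many locations can be selected: each additional line doubles the range, so n lines give 2^n distinct addresses. The helper below is a minimal sketch of that arithmetic; the function name addressable_locations is chosen for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct locations selectable with the given number of
   address lines: each line doubles the addressable range. */
uint64_t addressable_locations(unsigned address_lines)
{
    assert(address_lines < 64);          /* keep the shift well defined */
    return (uint64_t)1 << address_lines;
}
```

With the typical 16 address lines this gives 65,536 locations, which is 64 Kbytes if each location holds one byte; 32 lines give 4,294,967,296 locations (4 Gbytes).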


Memory Read and Write Cycles

Hardware control lines are used by the CPU to control reads and writes to memory:

- The active-low signal RD is asserted for a read cycle.
- The active-low signal WR indicates a write.
- The RD and WR signals supply timing information to the memory device.


Read cycle

It lasts 2 cycles of the clock signal:
1. The address of the required memory location is put on the address bus (by the CPU) at the rising edge.
2. While the device is held at the tristate level, the control bus issues the read signal (active low) to the device (the 2nd cycle begins).
3. After a delay, valid data is placed on the data bus.
4. The levels on the data bus are sampled by the CPU at the falling edge of the 2nd cycle.

Write cycle
1. The CPU places the address at the rising edge.
2. The decoding logic selects the correct device.
3. In the 2nd cycle, at the rising edge, the CPU outputs data onto the data bus and sets the WRITE control bus signal active (LOW).

Note: memory devices and other I/O components have static logic and do not depend on the clock signal. They read data from the data bus when the write signal goes high (inactive), so the data must be valid at that transition.
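The behaviour of a memory device during these cycles can be modelled in software. The sketch below is an assumption-laden simplification: it ignores propagation delays and chip-select decoding, and the names MemoryDevice, memory_init, and memory_step are invented for illustration. It does capture the two points made above: RD and WR are active low, and the device stores the data when WR returns high.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 16u

typedef struct {
    uint8_t cells[MEM_SIZE];
    bool    prev_wr_n;     /* previous level of the active-low WR line */
} MemoryDevice;

void memory_init(MemoryDevice *m)
{
    assert(m != NULL);
    memset(m->cells, 0, sizeof m->cells);
    m->prev_wr_n = true;           /* WR idles high (inactive) */
}

/* Present the bus signals to the memory. rd_n and wr_n are active low.
   Returns true and fills *data_out when the device drives the data bus. */
bool memory_step(MemoryDevice *m, uint8_t addr, uint8_t data_in,
                 bool rd_n, bool wr_n, uint8_t *data_out)
{
    assert(m != NULL && data_out != NULL && addr < MEM_SIZE);

    /* WR going from low (active) to high (inactive) stores the data. */
    if (!m->prev_wr_n && wr_n)
        m->cells[addr] = data_in;
    m->prev_wr_n = wr_n;

    if (!rd_n) {                   /* RD asserted: drive the data bus */
        *data_out = m->cells[addr];
        return true;
    }
    return false;                  /* otherwise stay in high impedance */
}
```

A write therefore takes two calls in this model: one with wr_n low while the CPU holds the data stable, and one with wr_n high, at which point the value is actually stored.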

C. Control bus - carries the control signals to the memory and the I/O devices. Arbitrary number of lines, often 15.

The control bus is an assortment of signals that determine what kind of information is on the data bus and, in conjunction with the address bus, where the data will go. Most of the design process is concerned with the logic and timing of the control signals. The timing analysis is primarily involved with the relative timing between these control signals and the appearance and disappearance of data and addresses on their respective buses.


1.3.1.2 Microprocessor Fundamentals

- MPU register set and internal architecture
- MPU buses
- Memory considerations
- MPU interfacing

The CPU

- processes the data by executing a program stored in the memory
- performs a sequence of fetch-and-execute operations
- consists of: Control Unit + ALU + Registers
- is responsible for the control of the address, data and control buses (a master)
- all actions within the µP are synchronised to the CPU via a clock signal
- clock signal = a logic square wave that drives all the circuitry in the µP, typically 1 to 30 MHz or higher


The Control Unit
- determines timing and sequence of operations
- generates timing signals which are used to fetch program instructions from memory and to execute them
- is also responsible for decoding instructions
- supplies control signals to read and write data into registers, and controls the ALU and external control signals

The ALU
- The arithmetic and logic unit (ALU) is responsible for data manipulation:
- arithmetic operations, logic operations (AND, OR, XOR, etc.)
- bit shifting, rotating, incrementing, decrementing, negating, complementing, addition, etc.


Registers
- Registers hold the data/addresses that the CPU currently uses, stored in special (small and fast) memory locations on the CPU.
- accumulator register: temporarily stores the input to the ALU and is sometimes used for I/O operations. It may be 8, 16, or 32 bits wide.
- flags register or status register: individual bits in the register are called flags. They reflect the conditions of the latest ALU operation and are used by subsequent jump and branch instructions.
- general purpose register: temporary storage for data or addresses. Not assigned any specific task.
- program counter: tracks the CPU's position in the program. The width of the program counter is the same as that of the address bus.
- instruction register: stores the instruction where it can be decoded; not accessible by the programmer.
- index registers: hold the address of an operand when the indexed address mode is used.
- stack pointer register: holds the address of the next memory location in the stack in RAM.
- Stack: a special area of RAM with last-in first-out (LIFO or FILO) organisation. It is used during subroutine calls and interrupts.

Types of registers:

Stack
- Part of memory where program data can be stored by a simple PUSH operation
- Restore data by a POP
- The stack is in main memory and is defined by the program
- The Stack Pointer (SP) keeps track of the next location available on the stack
- Organised as a FILO buffer
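The PUSH and POP operations and the role of the stack pointer can be sketched as follows. This is an illustrative model rather than the behaviour of any particular processor: the Stack type and the function names are invented here, the stack is only 8 bytes deep, and it grows towards higher indices, whereas many real CPUs grow the stack towards lower addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_SIZE 8u

typedef struct {
    uint8_t  mem[STACK_SIZE];  /* region of RAM reserved for the stack */
    unsigned sp;               /* index of the next free location */
} Stack;

/* PUSH: store the value at SP, then advance SP. Fails on overflow. */
bool stack_push(Stack *s, uint8_t value)
{
    assert(s != NULL);
    if (s->sp == STACK_SIZE)
        return false;
    s->mem[s->sp++] = value;
    return true;
}

/* POP: step SP back, then read the value. Fails on underflow. */
bool stack_pop(Stack *s, uint8_t *value)
{
    assert(s != NULL && value != NULL);
    if (s->sp == 0)
        return false;
    *value = s->mem[--s->sp];
    return true;
}
```

Pushing 1, 2, 3 and then popping three times returns 3, 2, 1, which is the first-in last-out order described above.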

General Registers
- Small set of internal registers for temporary data storage
- The CU ensures that data from the correct register is presented to the CPU
- The CU ensures that data is written back to the correct register


Accumulator
- usually holds the ALU result

Status or Flags Register

- CF - Carry Flag: 1 - there is a carry out from the most significant bit (msb); 0 - no carry out from the msb
- PF - Parity Flag: 1 - low byte has an even number of 1 bits; 0 - low byte has odd parity
- AF - Auxiliary carry Flag: 1 - carry out from bit 3 on addition, or borrow into bit 3 on subtraction; 0 - otherwise
- ZF - Zero Flag: 1 - zero result; 0 - non-zero result
- SF - Sign Flag: 1 - msb is 1 (negative); 0 - msb is 0 (positive)
- TF - Trap Flag: used by debuggers for single-step operation; 1 - trap on; 0 - trap off
- IF - Interrupt Flag: 1 - enabled; 0 - disabled
- OF - Overflow Flag: 1 - signed overflow occurred; 0 - no overflow

Flag bits are set by instructions.

Flag bits are the basis of conditional jump instructions.
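To make the flag definitions concrete, the following sketch computes the arithmetic flags for an 8-bit addition. It is a software model written for this text, not the circuitry of a specific CPU; the Flags structure and the add8 function are hypothetical names, and the control flags TF and IF are left out because they are not produced by the ALU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { bool cf, pf, af, zf, sf, of; } Flags;

/* Add two 8-bit values and derive the status flags from the result. */
uint8_t add8(uint8_t a, uint8_t b, Flags *f)
{
    assert(f != NULL);
    uint16_t wide = (uint16_t)(a + b);   /* keep the ninth bit */
    uint8_t  r    = (uint8_t)wide;

    f->cf = wide > 0xFF;                          /* carry out of bit 7 */
    f->af = ((a & 0x0F) + (b & 0x0F)) > 0x0F;     /* carry out of bit 3 */
    f->zf = (r == 0);
    f->sf = (r & 0x80) != 0;                      /* msb of the result */
    /* Signed overflow: both operands have a sign the result lacks. */
    f->of = ((a ^ r) & (b ^ r) & 0x80) != 0;

    uint8_t p = r;                       /* fold the bits to get parity */
    p ^= p >> 4;
    p ^= p >> 2;
    p ^= p >> 1;
    f->pf = (p & 1) == 0;                /* set when the count of 1s is even */
    return r;
}
```

For example, 0x7F + 0x01 gives 0x80 with OF, SF and AF set and CF clear, because the result no longer fits in a signed byte; 0xFF + 0x01 gives 0x00 with CF, ZF, AF and PF set and OF clear.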

The program status word (PSW) is an area of memory or a hardware register which contains information about program state used by the operating system and the underlying hardware. It will


Most microcontrollers will also combine other devices such as:

- A timer module to allow the microcontroller to perform tasks for certain time periods.
- Serial I/O (UART) for data flow between the microcontroller and devices such as a PC or another microcontroller.
- Analog input and output (e.g., to receive data from sensors or control motors)
- Interrupt capability (from a variety of sources)
- Bus/external memory interfaces (for RAM or ROM)
- Built-in monitor/debugger program
- Support for external peripherals (e.g., I/O and bus extenders)

A typical microcontroller; the different sub units integrated onto the microcontroller chip.

The heart of the microcontroller is the CPU core

1.3.3 Some differences between microprocessors and microcontrollers

MP: suited to processing information in computer systems
MC: suited to control of I/O devices requiring a minimum component count

Instruction sets:

MP: processing intensive
- powerful addressing modes
- instructions to perform complex operations and manipulate large volumes of data
- processing capability of MCs never approaches that of MPs
- large instructions, e.g., 80X86 7-byte long instructions

MC: cater to control of inputs and outputs
- instructions to set/clear bits
- boolean operations (AND, OR, XOR, NOT, jump if a bit is set/cleared), etc.
- extremely compact instructions, many implemented in one byte (the control program must often fit in the small, on-chip ROM)
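The bit-oriented operations that microcontroller instruction sets favour map directly onto C bitwise operators. The sketch below uses an ordinary variable as a stand-in for an I/O port register; on real hardware the register would sit at a device-specific address. The function names bit_set, bit_clear, and bit_is_set are chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Set one bit of an 8-bit register, leaving the others unchanged. */
void bit_set(volatile uint8_t *reg, unsigned bit)
{
    assert(bit < 8);
    *reg |= (uint8_t)(1u << bit);
}

/* Clear one bit of an 8-bit register, leaving the others unchanged. */
void bit_clear(volatile uint8_t *reg, unsigned bit)
{
    assert(bit < 8);
    *reg &= (uint8_t)~(1u << bit);
}

/* Test one bit; the result would drive a "jump if bit set" decision. */
bool bit_is_set(const volatile uint8_t *reg, unsigned bit)
{
    assert(bit < 8);
    return ((*reg >> bit) & 1u) != 0;
}
```

On many microcontrollers a compiler can translate each of these operations into a single bit-manipulation instruction, which is one reason control programs stay compact.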

Instruction sets:

- The set of instructions that a µP can execute is called its instruction set.
- Generally, instructions can be classified into the following categories: data transfer, arithmetic, logical, and program control.
- Instruction sets differ depending on the manufacturer, but some instructions are reasonably common to most µPs.

A. Data transfer
1. Load: reads the content of a specified memory location and copies it to the specified register location in the CPU.
2. Store: copies the current contents of a specified register into a specified memory location.

B. Arithmetic
3. Add: adds the contents of a specified memory location to the data in some register.
4. Decrement: subtracts 1 from the content of a specified location.
5. Compare: indicates whether the contents of a register are greater than, less than or the same as the contents of a specified memory location. The result appears as a flag in the status register.

C. Logical
6. AND: carries out the logical AND operation with the contents of a specified memory location and the data in some register.
7. OR: carries out the logical OR operation with the contents of a specified memory location and the data in some register.
8. EXCLUSIVE OR: similar to 6, but for exclusive OR.
9. Logical shift: moves the pattern of bits in the register one place to the left or right, moving a zero (0) into the vacated end of the number.
10. Arithmetic shift: moves the pattern of bits one place left/right, but copies the end bit into the vacancy created by the shift.
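The difference between the two shift types is easiest to see for a right shift of an 8-bit value. The functions below are a sketch with names invented for this example. The arithmetic shift is written out with explicit masking because the result of >> on a negative signed value is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Logical shift right: a 0 enters at the most significant bit. */
uint8_t lsr8(uint8_t v)
{
    return (uint8_t)(v >> 1);
}

/* Arithmetic shift right: the sign bit (bit 7) is copied into the
   vacancy, so a negative two's-complement value stays negative. */
uint8_t asr8(uint8_t v)
{
    return (uint8_t)((v >> 1) | (v & 0x80));
}

/* Usage example: 0x90 is 1001 0000, i.e. -112 as a signed byte. */
void shift_example(void)
{
    assert(lsr8(0x90) == 0x48);   /* 0100 1000 */
    assert(asr8(0x90) == 0xC8);   /* 1100 1000: sign preserved, -56 */
}
```

The arithmetic shift therefore halves a signed number (-112 becomes -56), while the logical shift treats the same bit pattern as the unsigned value 144 and produces 72.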

D. Program control

11. Jump changes the sequence in which the program is executed. So the program counter jumps tosome specified location (other than sequential)12. Branch a conditional instruction which might be 'branch if zero'or 'branch if plus'. It is followedif the right conditions are met.13. Halt stops all further microprocessor activities
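The difference between the logical and the arithmetic shift (items 9 and 10) is easiest to see on a concrete bit pattern. The sketch below assumes an 8-bit register and two's-complement interpretation; the function names are illustrative only.

```python
# Sketch: logical vs. arithmetic shifts on an assumed 8-bit register.
WIDTH = 8
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)

def logical_shift_left(value: int) -> int:
    """Shift left by one place; a 0 enters at the right end."""
    return (value << 1) & MASK

def logical_shift_right(value: int) -> int:
    """Shift right by one place; a 0 enters at the left end."""
    return (value & MASK) >> 1

def arithmetic_shift_right(value: int) -> int:
    """Shift right by one place; the sign bit is copied into the vacancy."""
    return ((value & MASK) >> 1) | (value & SIGN)

def to_signed(value: int) -> int:
    """Interpret an 8-bit pattern as a two's-complement number."""
    return value - (1 << WIDTH) if value & SIGN else value

pattern = 0b10010110                      # 150 unsigned, -106 signed
lsr = logical_shift_right(pattern)        # 0b01001011
asr = arithmetic_shift_right(pattern)     # 0b11001011
```

Read as a signed number, the arithmetic shift halves the value (-106 becomes -53), which the logical shift does not do for negative numbers.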



Hardware and instruction set support:
* MC: built-in I/O operations, event timing, enabling and setting up priority levels for interrupts caused by external stimuli
* MP: usually require external circuitry to do similar things (e.g., 8255 PPI, 8254 PIT, 8259 PIC)

Bus widths:
* MP: very wide
o large memory address spaces (> 4 Gbytes)
o lots of data (data bus: 32, 64, 128 bits wide)
* MC: narrow
o relatively small memory address spaces (typically kBytes)
o less data (data bus typically 4, 8, 16 bits wide)
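The link between address bus width and addressable memory is a power of two. A small sketch, assuming byte-addressable memory (an assumption the notes do not state):

```python
# Sketch: addressable memory as a function of address bus width,
# assuming one byte per address.

def address_space_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address bus width."""
    return 2 ** address_bits

mc_space = address_space_bytes(16)   # 65536 bytes = 64 KiB
mp_space = address_space_bytes(32)   # 4294967296 bytes = 4 GiB
```

A 16-bit address bus therefore reaches 64 KiB, in line with the "typically kBytes" figure for microcontrollers, while 32 address bits reach 4 GiB.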

Clock rates:
* MP: very fast (> 1 GHz)
* MC: relatively slow (typically 10-20 MHz), since most I/O devices being controlled are relatively slow

Cost:
* MPs: expensive (often > $100)
* MCs: cheap (often $1 - $10)
o 4-bit: < $1.00
o 8-bit: $1.00 - $8.00
o 16-32-bit: $6.00 - $20.00



1.4 Compiling, Linking, and Locating

1.4.1 The Build Process

There are a lot of things that software development tools can do automatically when the target platform is well defined. This automation is possible because the tools can exploit features of the hardware and operating system on which your program will execute. For example, if all of your programs will be executed on IBM-compatible PCs running DOS, your compiler can automate (and, therefore, hide from your view) certain aspects of the software build process.

Embedded software development tools, on the other hand, can rarely make assumptions about the target platform. Instead, the user must provide some of his own knowledge of the system to the tools by giving them more explicit instructions.

The term "target platform" is best understood to include not only the hardware but also the operating system that forms the basic runtime environment for your software. If no operating system is present, as is sometimes the case in an embedded system, the target platform is simply the processor on which your program will be run.

The process of converting the source code representation of your embedded software into an executable binary image involves three distinct steps. First, each of the source files must be compiled or assembled into an object file. Second, all of the object files that result from the first step must be linked together to produce a single object file, called the relocatable program. Finally, physical memory addresses must be assigned to the relative offsets within the relocatable program in a process called relocation. The result of this third step is a file that contains an executable binary image that is ready to be run on the embedded system.

The embedded software development process just described is illustrated in the figure below. In this figure, the three steps are shown from top to bottom, with the tools that perform them shown in boxes that have rounded corners. Each of these development tools takes one or more files as input and produces a single output file. More specific information about these tools and the files they produce is provided in the sections that follow.

Each of the steps of the embedded software build process is a transformation performed by software running on a general-purpose computer. To distinguish this development computer (usually a PC or Unix workstation) from the target embedded system, it is referred to as the host computer. In other words, the compiler, assembler, linker, and locator are all pieces of software that run on a host computer, rather than on the embedded system itself. Yet, despite the fact that they run on some other computer platform, these tools combine their efforts to produce an executable binary image that will execute properly only on the target embedded system. This split of responsibilities is shown in the figure below.



1.4.2 Compiling

The job of a compiler is mainly to translate programs written in some human-readable language into an equivalent set of opcodes for a particular processor. In that sense, an assembler is also a compiler (you might call it an "assembly language compiler") but one that performs a much simpler one-to-one translation from one line of human-readable mnemonics to the equivalent opcode. Everything in this section applies equally to compilers and assemblers. Together these tools make up the first step of the embedded software build process.

Of course, each processor has its own unique machine language, so you need to choose a compiler that is capable of producing programs for your specific target processor. In the embedded systems case, this compiler almost always runs on the host computer. It simply doesn't make sense to execute the compiler on the embedded system itself. A compiler such as this, one that runs on one computer platform and produces code for another, is called a cross-compiler. The use of a cross-compiler is one of the defining features of embedded software development.

Regardless of the input language (C/C++, assembly, or any other), the output of the cross-compiler will be an object file. This is a specially formatted binary file that contains the set of instructions and data resulting from the language translation process. Although parts of this file contain executable code, the object file is not intended to be executed directly. In fact, the internal structure of an object file emphasizes the incompleteness of the larger program.

The contents of an object file can be thought of as a very large, flexible data structure. The structure of the file is usually defined by a standard format like the Common Object File Format (COFF) or the Executable and Linkable Format (ELF). If you'll be using more than one compiler (i.e., you'll be writing parts of your program in different source languages), you need to make sure that each is capable of producing object files in the same format. Although many compilers (particularly those that run on Unix platforms) support standard object file formats like COFF and ELF (gcc supports both), there are also some others that produce object files only in proprietary formats. If you're using one of the compilers in the latter group, you might find that you need to buy all of your other development tools from the same vendor.

Most object files begin with a header that describes the sections that follow. Each of these sections contains one or more blocks of code or data that originated within the original source file. However, these blocks have been regrouped by the compiler into related sections. For example, all of the code blocks are collected into a section called text, initialized global variables (and their initial values) into a section called data, and uninitialized global variables into a section called bss.

There is also usually a symbol table somewhere in the object file that contains the names and locations of all the variables and functions referenced within the source file. Parts of this table may be incomplete, however, because not all of the variables and functions are always defined in the same file. These are the symbols that refer to variables and functions defined in other source files, and it is up to the linker to resolve such unresolved references.

1.4.3 Linking

All of the object files resulting from step one (compiling) must be combined in a special way before the program can be executed. The object files themselves are individually incomplete, most notably in that some of the internal variable and function references have not yet been resolved. The job of the linker is to combine these object files and, in the process, to resolve all of the unresolved symbols.

The output of the linker is a new object file that contains all of the code and data from the input object files and is in the same object file format. It does this by merging the text, data, and bss sections of the input files. So, when the linker is finished executing, all of the machine language code from all of the input object files will be in the text section of the new file, and all of the initialized and uninitialized variables will reside in the new data and bss sections, respectively.

While the linker is in the process of merging the section contents, it is also on the lookout for unresolved symbols. For example, if one object file contains an unresolved reference to a variable named foo and a variable with that same name is declared in one of the other object files, the linker will match them up. The unresolved reference will be replaced with a reference to the actual variable. In other words, if foo is located at offset 14 of the output data section, its entry in the symbol table will now contain that address.

The GNU linker (ld) runs on all of the same host platforms as the GNU compiler. It is essentially a command-line tool that takes the names of all the object files to be linked together as arguments. For embedded development, a special object file that contains the compiled startup code must also be included within this list.

Startup Code

One of the things that traditional software development tools do automatically is to insert startup code. Startup code is a small block of assembly language code that prepares the way for the execution of software written in a high-level language.

Each high-level language has its own set of expectations about the runtime environment. For example, C and C++ both utilize an implicit stack. Space for the stack has to be allocated and initialized before software written in either language can be properly executed. That is just one of the responsibilities assigned to startup code for C/C++ programs.

Most cross-compilers for embedded systems include an assembly language file called startup.asm, crt0.s (short for C runtime), or something similar. The location and contents of this file are usually described in the documentation supplied with the compiler.

Startup code for C/C++ programs usually consists of the following actions, performed in the order described:

1. Disable all interrupts.
2. Copy any initialized data from ROM to RAM.
3. Zero the uninitialized data area.
4. Allocate space for and initialize the stack.
5. Initialize the processor's stack pointer.
6. Create and initialize the heap.
7. Execute the constructors and initializers for all global variables (C++ only).
8. Enable interrupts.
9. Call main.
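The sequence above can be walked through on a toy memory model. The sketch below is a host-side simulation only: real startup code is written in assembly for the target, and the names run_startup, cpu and ram are hypothetical. Steps 6 and 7 (heap, C++ constructors) are left out for brevity.

```python
# Toy simulation of the startup sequence on a bytearray "RAM".
# All names here are illustrative assumptions, not a real runtime API.

def run_startup(rom_data, ram, data_start, bss_start, bss_size, stack_top, main):
    cpu = {"interrupts_enabled": False}                      # 1. disable interrupts
    ram[data_start:data_start + len(rom_data)] = rom_data    # 2. copy .data from ROM to RAM
    ram[bss_start:bss_start + bss_size] = bytes(bss_size)    # 3. zero the .bss area
    cpu["sp"] = stack_top                                    # 4-5. stack space and stack pointer
    # 6-7. heap setup and C++ global constructors are omitted in this sketch
    cpu["interrupts_enabled"] = True                         # 8. enable interrupts
    cpu["exit_code"] = main(ram)                             # 9. call main
    return cpu

ram = bytearray(b"\xAA" * 64)            # RAM holds garbage after power-up
cpu = run_startup(
    rom_data=bytes([1, 2, 3, 4]),        # initial values of the global variables
    ram=ram,
    data_start=0,
    bss_start=4,
    bss_size=8,
    stack_top=64,
    main=lambda mem: sum(mem[0:12]),     # stand-in for the C main function
)
```

After the call, the first four bytes of RAM hold the initial values, the next eight are zero, and the rest of RAM is still untouched.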

Typically, the startup code will also include a few instructions after the call to main. These instructions will be executed only in the event that the high-level language program exits (i.e., the call to main returns). Depending on the nature of the embedded system, you might want to use these instructions to halt the processor, reset the entire system, or transfer control to a debugging tool.

Because the startup code is not inserted automatically, the programmer must usually assemble it himself and include the resulting object file among the list of input files to the linker. He might even need to give the linker a special command-line option to prevent it from inserting the usual startup code.

If the same symbol is declared in more than one object file, the linker is unable to proceed. It will likely appeal to the programmer, by displaying an error message, and exit. However, if a symbol reference instead remains unresolved after all of the object files have been merged, the linker will try to resolve the reference on its own. The reference might be to a function that is part of the standard library, so the linker will open each of the libraries described to it on the command line (in the order provided) and examine their symbol tables. If it finds a function with that name, the reference will be resolved by including the associated code and data sections within the output object file.

After merging all of the code and data sections and resolving all of the symbol references, the linker produces a special "relocatable" copy of the program. In other words, the program is complete except for one thing: no memory addresses have yet been assigned to the code and data sections within. If you weren't working on an embedded system, you'd be finished building your software now.

But embedded programmers aren't generally finished with the build process at this point. Even if your embedded system includes an operating system, you'll probably still need an absolutely located binary image. In fact, if there is an operating system, the code and data of which it consists are most likely within the relocatable program too. The entire embedded application, including the operating system, is almost always statically linked together and executed as a single binary image.
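The merging and symbol resolution performed by a linker can be sketched with plain dictionaries standing in for object files. This is a simplified model, not how ld is implemented; the dictionary layout and the names link, main_o and vars_o are assumptions made for illustration.

```python
# Toy linker: merges text/data sections and builds a combined symbol table.
# Object files are modelled as dictionaries (an illustrative assumption).

def link(objects):
    output = {"text": b"", "data": b"", "symbols": {}}
    wanted = set()
    for obj in objects:
        # Offsets in this object are relative to where its sections land.
        base = {"text": len(output["text"]), "data": len(output["data"])}
        for name, (section, offset) in obj["defined"].items():
            if name in output["symbols"]:
                raise ValueError(f"symbol defined more than once: {name}")
            output["symbols"][name] = (section, base[section] + offset)
        output["text"] += obj["text"]
        output["data"] += obj["data"]
        wanted |= set(obj["undefined"])
    # Whatever is still missing would be looked up in the libraries next.
    output["unresolved"] = sorted(wanted - set(output["symbols"]))
    return output

main_o = {"text": bytes(20), "data": bytes(12),
          "defined": {"main": ("text", 0)}, "undefined": ["foo", "printf"]}
vars_o = {"text": b"", "data": bytes(4),
          "defined": {"foo": ("data", 2)}, "undefined": []}

image = link([main_o, vars_o])
```

Here foo ends up at offset 14 of the merged data section (12 bytes from the first file plus its own offset of 2), while printf is left for the library search.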

1.4.4 Locating

The tool that performs the conversion from relocatable program to executable binary image is called a locator. It takes responsibility for the easiest step of the three. In fact, you will have to do most of the work in this step yourself, by providing information about the memory on the target board as input to the locator. The locator will use this information to assign physical memory addresses to each of the code and data sections within the relocatable program. It will then produce an output file that contains a binary memory image that can be loaded into the target ROM.

In many cases, the locator is a separate development tool. However, in the case of the GNU tools, this functionality is built right into the linker. Try not to be confused by this one particular implementation. Whether you are writing software for a general-purpose computer or an embedded system, at some point the sections of your relocatable program must have actual addresses assigned to them. In the first case, the operating system does it for you at load time. In the second, you must perform the step with a special tool. This is true even if the locator is a part of the linker.

The memory information required by the GNU linker can be passed to it in the form of a linker script. Such scripts are sometimes used to control the exact order of the code and data sections within the relocatable program.
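What the locator does with the memory description can be sketched as a simple address assignment. The region names, sizes and the locate function below are hypothetical, and the model ignores the fact that initialized data normally has both a load address in ROM and a run address in RAM.

```python
# Toy locator: assigns start addresses to sections from a memory map.
# The memory map and section sizes are made-up example values.

def locate(section_sizes, regions, placement):
    next_free = {name: start for name, (start, _size) in regions.items()}
    addresses = {}
    for section, region in placement:
        start, size = regions[region]
        address = next_free[region]
        end = address + section_sizes[section]
        if end > start + size:
            raise ValueError(f"section {section} does not fit in {region}")
        addresses[section] = address
        next_free[region] = end
    return addresses

regions = {"ROM": (0x0000, 0x8000), "RAM": (0x8000, 0x2000)}   # (start, size)
placement = [("text", "ROM"), ("data", "RAM"), ("bss", "RAM")]
sizes = {"text": 0x1200, "data": 0x40, "bss": 0x100}

addresses = locate(sizes, regions, placement)
```

A linker script plays the role of regions and placement here: it tells the tool which memory exists and which section goes where.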

1.4.5 Downloading and Debugging

Once you have an executable binary image stored as a file on the host computer, you will need a way to download that image to the embedded system and execute it. The executable binary image is usually loaded into a memory device on the target board and executed from there. And if you have the right tools at your disposal, it will be possible to set breakpoints in the program or to observe its execution in less intrusive ways. This section describes various techniques for downloading, executing, and debugging embedded software.



One of the most obvious ways to download your embedded software is to load the binary image into a read-only memory device and insert that chip into a socket on the target board. Obviously, the contents of a truly read-only memory device could not be overwritten. However, embedded systems commonly employ special read-only memory devices that can be programmed (or reprogrammed) with the help of a special piece of equipment called a device programmer. A device programmer is a computer system that has several memory sockets on the top, of varying shapes and sizes, and is capable of programming memory devices of all sorts.

In an ideal development scenario, the device programmer would be connected to the same network as the host computer. That way, files that contain executable binary images could be easily transferred to it for ROM programming. After the binary image has been transferred to the device programmer, the memory chip is placed into the appropriately sized and shaped socket and the device type is selected from an on-screen menu. The actual device programming process can take anywhere from a few seconds to several minutes, depending on the size of the binary image and the type of memory device you are using.

After you program the ROM, it is ready to be inserted into its socket on the board. Of course, this shouldn't be done while the embedded system is still powered on. The power should be turned off and then reapplied only after the chip has been carefully inserted.

As soon as power is applied to it, the processor will begin to fetch and execute the code that is stored inside the ROM. However, beware that each type of processor has its own rules about the location of its first instruction. If your program doesn't appear to be working, it could be that there is something wrong with your reset code. You must always ensure that the binary image you've loaded into the ROM satisfies the target processor's reset rules.

A development board includes a special in-circuit programmable memory, called Flash memory, that does not have to be removed from the board to be reprogrammed. In fact, software that can perform the device programming function, the monitor, is already installed in another memory device on the board. The board actually has two read-only memory devices: one (a true ROM) contains a simple program that allows the user to in-circuit program the other (a Flash memory device). All the host computer needs in order to talk to the monitor program is a serial port and a terminal program.

The biggest disadvantage of this download technique is that there is no easy way to debug software that is executing out of ROM. The processor fetches and executes the instructions at a high rate of speed and provides no way for you to view the internal state of the program. This might be fine once you know that your software works and you're ready to deploy the system, but it's not very helpful during software development.

Remote Debuggers

If available, a remote debugger can be used to download, execute, and debug embedded software over a serial port or network connection between the host and target. The frontend of a remote debugger looks just like any other debugger that you might have used. It usually has a text or GUI-based main window and several smaller windows for the source code, register contents, and other relevant information about the executing program. However, in the case of embedded systems, the debugger and the software being debugged are executing on two different computer systems.

A remote debugger actually consists of two pieces of software. The frontend runs on the host computer and provides the human interface just described. But there is also a hidden backend that runs on the target processor and communicates with the frontend over a communications link of some sort. The backend provides for low-level control of the target processor and is usually called the debug monitor. The figure below shows how these two components work together.



The debug monitor resides in ROM, having been placed there in the manner described earlier (either by you or at the factory), and is automatically started whenever the target processor is reset. It monitors the communications link to the host computer and responds to requests from the remote debugger running there. Of course, these requests and the monitor's responses must conform to some predefined communications protocol and are typically of a very low-level nature. Examples of requests the remote debugger can make are "read register x," "modify register y," "read n bytes of memory starting at address," and "modify the data at address." The remote debugger combines sequences of these low-level commands to accomplish high-level debugging tasks like downloading a program, single-stepping through it, and setting breakpoints.

Communication between the frontend and the debug monitor is byte-oriented and designed for transmission over a serial connection such as RS-232 or USB.

Remote debuggers are one of the most commonly used downloading and testing tools during development of embedded software. This is mainly because of their low cost. Embedded software developers already have the requisite host computer. In addition, the price of a remote debugger frontend does not add significantly to the cost of a suite of cross-development tools (compiler, linker, locator, etc.). Finally, the suppliers of remote debuggers often desire to give away the source code for their debug monitors, in order to increase the size of their installed user base.

As shipped, the Keil board includes a free debug monitor in Flash memory. Together with host software provided by Arcom, this debug monitor can be used to download programs directly into target RAM and execute them.
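How a frontend builds a high-level task such as "download a program" out of low-level monitor requests can be sketched as follows. The text protocol used here (rr, wr, rm, wm) is invented for illustration and is not the protocol of any real debug monitor; the DebugMonitor class simply stands in for the backend on the target.

```python
# Sketch of a debug monitor backend and a download routine in the frontend.
# The command set (rr/wr/rm/wm) is a made-up protocol for illustration.

class DebugMonitor:
    def __init__(self, ram_size=256):
        self.registers = {"pc": 0, "sp": 0, "a": 0}
        self.memory = bytearray(ram_size)

    def handle(self, request: str) -> str:
        cmd, *args = request.split()
        if cmd == "rr":                                   # read register x
            return f"{self.registers[args[0]]:04x}"
        if cmd == "wr":                                   # modify register y
            self.registers[args[0]] = int(args[1], 16)
            return "ok"
        if cmd == "rm":                                   # read n bytes at address
            addr, count = int(args[0], 16), int(args[1])
            return self.memory[addr:addr + count].hex()
        if cmd == "wm":                                   # modify data at address
            addr, data = int(args[0], 16), bytes.fromhex(args[1])
            self.memory[addr:addr + len(data)] = data
            return "ok"
        return "error"

def download(monitor, image: bytes, load_address: int, chunk: int = 4):
    """High-level task built from many low-level 'wm' requests."""
    for offset in range(0, len(image), chunk):
        part = image[offset:offset + chunk]
        reply = monitor.handle(f"wm {load_address + offset:x} {part.hex()}")
        if reply != "ok":
            raise RuntimeError("monitor rejected write")
    monitor.handle(f"wr pc {load_address:x}")             # point PC at the program

monitor = DebugMonitor()
download(monitor, bytes(range(10)), load_address=0x10)
```

In a real setup the calls to handle would be replaced by bytes sent over the serial link.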

1.4.6 Emulators

Remote debuggers are helpful for monitoring and controlling the state of embedded software, but only an in-circuit emulator (ICE) allows you to examine the state of the processor on which that program is running. In fact, an ICE actually takes the place of (or emulates) the processor on your target board. It is itself an embedded system, with its own copy of the target processor, RAM, ROM, and its own embedded software. As a result, in-circuit emulators are usually pretty expensive, often more expensive than the target hardware. But they are a powerful tool, and in a tight debugging spot nothing else will help you get the job done better.

Like a debug monitor, an emulator uses a remote debugger for its human interface. In some cases, it is even possible to use the same debugger frontend for both. But because the emulator has its own copy of the target processor, it is possible to monitor and control the state of the processor in real time. This allows the emulator to support such powerful debugging features as hardware breakpoints and real-time tracing, in addition to the features provided by any debug monitor.

With a debug monitor, you can set breakpoints in your program. However, these software breakpoints are restricted to instruction fetches, the equivalent of the command "stop execution if this instruction is about to be fetched." Emulators, by contrast, also support hardware breakpoints. Hardware breakpoints allow you to stop execution in response to a wide variety of events. These events include not only instruction fetches, but also memory and I/O reads and writes, and interrupts. For example, you might set a hardware breakpoint on the event "variable foo contains 15 and register AX becomes 0."

Another useful feature of an in-circuit emulator is real-time tracing. Typically, an emulator incorporates a large block of special-purpose RAM that is dedicated to storing information about each of the processor cycles that are executed. This feature allows you to see in exactly what order things happened, so it can help you answer questions such as: did the timer interrupt occur before or after the variable bar became 94? In addition, it is usually possible to either restrict the information that is stored or post-process the data prior to viewing it in order to cut down on the amount of trace data to be examined.

ROM Emulators

One other type of emulator is worth mentioning at this point. A ROM emulator is a device that emulates a read-only memory device. Like an ICE, it is an embedded system that connects to the target and communicates with the host. However, this time the target connection is via a ROM socket. To the embedded processor, it looks like any other read-only memory device. But to the remote debugger, it looks like a debug monitor.

ROM emulators have several advantages over debug monitors. First, no one has to port the debug monitor code to your particular target hardware. Second, the ROM emulator supplies its own serial or network connection to the host, so it is not necessary to use the target's own, usually limited, resources. And finally, the ROM emulator is a true replacement for the original ROM, so none of the target's memory is used up by the debug monitor code.

Simulators and Other Tools

Of course, many other debugging tools are available to you, including simulators, logic analyzers, and oscilloscopes. A simulator is a completely host-based program that simulates the functionality and instruction set of the target processor. The human interface is usually the same as or similar to that of the remote debugger. In fact, it might be possible to use one debugger frontend for the simulator backend as well, as shown in the figure below. Although simulators have many disadvantages, they are quite valuable in the earlier stages of a project when there is not yet any actual hardware for the programmers to experiment with.

By far, the biggest disadvantage of a simulator is that it only simulates the processor. And embedded systems frequently contain one or more other important peripherals. Interaction with these devices can sometimes be imitated with simulator scripts or other workarounds, but such workarounds are often more trouble to create than the simulation is valuable. So you probably won't do too much with the simulator once you have the actual embedded hardware available to you.
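To make the idea of a host-based simulator concrete, here is a minimal sketch of one for an invented accumulator machine. The instruction names (LOAD, STORE, ADD, DEC, BNZ, JMP, HALT) and their encoding as tuples are assumptions for illustration and do not model any real processor; note that, as described above, only the processor and its memory are simulated, not any peripherals.

```python
# Minimal instruction-set simulator for a made-up 8-bit accumulator machine.

def simulate(program, memory, max_steps=1000):
    acc, pc, zero = 0, 0, False
    for _ in range(max_steps):
        op, *arg = program[pc]
        pc += 1
        if op == "LOAD":                      # memory -> accumulator
            acc = memory[arg[0]]
        elif op == "STORE":                   # accumulator -> memory
            memory[arg[0]] = acc
        elif op == "ADD":
            acc = (acc + memory[arg[0]]) & 0xFF
            zero = acc == 0
        elif op == "DEC":                     # decrement a memory location
            memory[arg[0]] = (memory[arg[0]] - 1) & 0xFF
            zero = memory[arg[0]] == 0
        elif op == "BNZ":                     # branch if the zero flag is clear
            if not zero:
                pc = arg[0]
        elif op == "JMP":
            pc = arg[0]
        elif op == "HALT":
            return acc, memory
        else:
            raise ValueError(f"unknown instruction {op}")
    raise RuntimeError("step limit reached")

# memory[0] is a loop counter, memory[1] accumulates counter + (counter-1) + ...
program = [
    ("LOAD", 1), ("ADD", 0), ("STORE", 1),
    ("DEC", 0), ("BNZ", 0), ("HALT",),
]
acc, memory = simulate(program, [3, 0])
```

The program adds 3 + 2 + 1 into memory[1] and halts once the counter reaches zero.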



Once you have access to your target hardware, and especially during the hardware debugging, logic analyzers and oscilloscopes can be indispensable debugging tools. They are most useful for debugging the interactions between the processor and other chips on the board. Because they can only view signals that lie outside the processor, however, they cannot control the flow of execution of your software like a debugger or an emulator can. This makes these tools significantly less useful by themselves. But coupled with a software debugging tool like a remote debugger or an emulator, they can be extremely valuable.

An oscilloscope is another piece of laboratory equipment for hardware debugging. But this one is used to examine any electrical signal, analog or digital, on any piece of hardware. Oscilloscopes are sometimes useful for quickly observing the voltage on a particular pin or, in the absence of a logic analyzer, for something slightly more complex. However, the number of inputs is much smaller (there are usually about four) and advanced triggering logic is not often available. As a result, it'll be useful to you only rarely as a software debugging tool.

Most of the debugging tools described in this chapter will be used at some point or another in every embedded project. Oscilloscopes and logic analyzers are most often used to debug hardware problems, simulators during early stages of the software development, and debug monitors and emulators during the actual software debugging. To be most effective, you should understand what each tool is for and when and where to apply it for the greatest impact.

Programming
* Generally done in either the core's native assembly language or C
* Sometimes HLL support (often BASIC) is available
* Assemblers/linkers often supplied free by the micro's manufacturer
* C compilers vary from free and very buggy to very expensive and only moderately buggy
* Environments generally not friendly or reliable

Downloading
* Program development usually done on a PC
* Software tools must produce a file to download to the MC's EPROM
* Several standard formats (e.g., binary, hex)
* EPROM burner often necessary
o Can download program to an EPROM emulator
o But to reprogram, must use a UV eraser first
* Flash memory programmers make this easier
* Very easy to reprogram with an inexpensive "in-circuit debugger"
o Interacts with MC via 3 pins + power + ground
* Or can be programmed/debugged with a resident monitor program
o On-chip UART for communications with PC
o No burner or UV eraser needed
o No expensive quartz window required
o Expedites program-test-erase-reprogram code development cycle
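As an illustration of a "hex" download format, the sketch below writes a binary image as Intel HEX records. That the notes mean Intel HEX is an assumption (other hex formats exist), and the sketch only covers 16-bit addresses with data and end-of-file records.

```python
# Sketch: converting a binary image to Intel HEX records (16-bit addresses only).
# Record layout: ':' + byte count + address (2 bytes) + type + data + checksum.

def hex_record(address: int, data: bytes, record_type: int = 0x00) -> str:
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type]) + data
    checksum = (-sum(body)) & 0xFF          # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()

def to_intel_hex(image: bytes, base_address: int = 0, per_line: int = 16) -> list:
    lines = [
        hex_record(base_address + offset, image[offset:offset + per_line])
        for offset in range(0, len(image), per_line)
    ]
    lines.append(hex_record(0, b"", record_type=0x01))   # end-of-file record
    return lines

records = to_intel_hex(bytes([0x02, 0x33, 0x7A]), base_address=0x0030)
```

The checksum lets the programmer or monitor on the receiving side detect a corrupted line before anything is written to memory.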

Monitor
* A program module that communicates with PC software
* Typically uses a serial port to talk to a PC's terminal program
* Capabilities vary widely
* Usually can send/receive text and ASCII-converted numbers
* Often has commands to examine/change registers, memory locations, I/O ports



essence, the logic circuits have no knowledge of a scale factor. They are performing signed or unsigned fixed-point binary algebra as if the binary point is to the right of b0.

Within the Simulink Fixed Point software, the main difference between fixed-point data types is the default binary point. For integers and fractionals, the binary point is fixed at the default value. For generalized fixed-point data types, you must either explicitly specify the scaling by configuring dialog box parameters, or inherit the scaling from another block. The sections that follow describe the supported fixed-point data types.

Integers

The default binary point for signed and unsigned integer data types is assumed to be just to the right of the LSB. You specify unsigned and signed integers with the uint and sint functions, respectively.

Fractionals

The default binary point for unsigned fractional data types is just to the left of the MSB, while for signed fractionals the binary point is just to the right of the MSB. If you specify guard bits, then they lie to the left of the binary point. You specify unsigned and signed fractional numbers with the ufrac and sfrac functions, respectively.

Generalized Fixed-Point Numbers

For signed and unsigned generalized fixed-point numbers, there is no default binary point. You specify unsigned and signed generalized fixed-point numbers with the ufix and sfix functions, respectively.

Note: You can also use the fixdt function to create integer, fractional, and generalized fixed-point objects.

1.5.2 Scaling

The dynamic range of fixed-point numbers is much less than that of floating-point numbers with equivalent word sizes. To avoid overflow conditions and minimize quantization errors, fixed-point numbers must be scaled.

With the Simulink Fixed Point software, you can select a fixed-point data type whose scaling is defined by its default binary point, or you can select a generalized fixed-point data type and choose an arbitrary linear scaling that suits your needs. This section presents the scaling choices available for generalized fixed-point data types.

A fixed-point number can be represented by a general [Slope Bias] encoding scheme

$$V \approx \tilde{V} = SQ + B$$

where

* $V$ is an arbitrarily precise real-world value.

* $\tilde{V}$ is the approximate real-world value.

* $Q$ is an integer that encodes $V$.

* $S = F \times 2^{E}$ is the slope.

* $B$ is the bias.

The slope is partitioned into two components:

* $2^{E}$ specifies the binary point. $E$ is the fixed power-of-two exponent.

* $F$ is the fractional slope. It is normalized such that $1 \le F < 2$.


The number x is converted to a signed, 10-bit generalized fixed-point data type with binary-point-only scaling of $2^{-7}$ (that is, the binary point is located seven places to the left of the rightmost bit).

0.033333 0.

0.066666 0.0

0.133332 0.00

0.266664 0.000

0.533328 0.0000

1.066656 ->0.066656 0.00001

0.133312 0.000010

0.266624 0.0000100

0.533248 0.00001000

1.066496 -> 0.066496 0.000010001

0.132992 0.0000100010

etc

We use a 10-bit generalized fixed-point data type with binary-point-only scaling of $2^{-7}$, so we use 7 bits for the fractional part and 3 bits for the signed integer part:

x = 000.0000100010... ~= 000.0000100 /(010...) = $2^{-5}$ = 0.03125

2. Let's represent x = 3.3333e-001

0.33333 0.

0.66666 0.0

1.33332 ->0.33332 0.01

0.66664 0.010

1.33328 -> 0.33328 0.0101

0.66656 0.01010

1.33312 -> 0.33312 0.010101

0.66624 0.0101010

1.33248 -> 0.33248 0.01010101

0.66496 0.010101010

1.32992 -> 0.32992 0.0101010101

etc

x = 000.0101010 /(101...) ~= 000.0101010 = 0.328125


3. Let's represent x = 3.3333e-003

0.0033333 0.

0.0066666 0.0

0.0133332 0.00

0.0266664 0.000

0.0533328 0.0000

0.1066656 0.00000

0.2133312 0.000000

0.4266624 0.0000000

0.8533248 0.00000000

1.7066496 -> 0.7066496 0.000000001

1.4132992 -> 0.4132992 0.0000000011

etc

x = 000.0000000 /(011...) ~= 0
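The repeated-doubling procedure used in the three examples can be written as a short routine. This is a sketch in plain Python for values in the range 0 <= x < 1; the function names are illustrative.

```python
# Sketch of the doubling method: each doubling produces the next fraction bit.

def fraction_bits(x: float, n_bits: int) -> str:
    """First n_bits binary digits after the binary point of 0 <= x < 1."""
    bits = []
    for _ in range(n_bits):
        x *= 2
        if x >= 1:           # the integer part is the next bit
            bits.append("1")
            x -= 1
        else:
            bits.append("0")
    return "".join(bits)

def truncate_to(x: float, frac_bits: int) -> float:
    """Value obtained by keeping only frac_bits fraction bits (no rounding)."""
    return int(fraction_bits(x, frac_bits), 2) / 2**frac_bits

bits_1 = fraction_bits(0.033333, 10)     # example 1
bits_2 = fraction_bits(0.33333, 10)      # example 2
bits_3 = fraction_bits(0.0033333, 10)    # example 3
```

Keeping only the first 7 bits reproduces the truncated values 0.03125, 0.328125 and 0 obtained by hand.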

fi(v,s,w,f) returns a fixed-point object with value v, signedness s, word length w, and fraction length f.

fi(0.33333,1,10,7, 'RoundMode','floor')

ans =

0.328125000000000

DataTypeMode: Fixed-point: binary point scaling

Signed: true

WordLength: 10

FractionLength: 7

RoundMode: floor

OverflowMode: saturate

ProductMode: FullPrecision

MaxProductWordLength: 128

SumMode: FullPrecision

MaxSumWordLength: 128

CastBeforeSum: true



>> 1/4+1/16+1/64

ans =

0.328125000000000

By default, the RoundMode is nearest.

fi(0.33333,1,10,7)

ans =

0.335937500000000

>> (1/4+1/16+1/64)+1/128

ans =

0.335937500000000

Another example:

m= [3.3333e-005 3.3333e-006 3.3333e-007 3.3333e-008

3.3333e-004 3.3333e-005 3.3333e-006 3.3333e-007

3.3333e-003 3.3333e-004 3.3333e-005 3.3333e-006

3.3333e-002 3.3333e-003 3.3333e-004 3.3333e-005

3.3333e-001 3.3333e-002 3.3333e-003 3.3333e-004]

We use a 10-bit word length with a 7-bit fraction length:

>>x=2^-7

x =

0.007812500000000

>>round(m/x)*x

ans =

0 0 0 0

0 0 0 0

0 0 0 0

0.031250000000000 0 0 0

0.335937500000000 0.031250000000000 0 0

The same result can be obtained with:



>>fi(m,0,10,7)

>> M=m/m(5,1)

M =

0.0001 0.0000 0.0000 0.0000

0.0010 0.0001 0.0000 0.0000

0.0100 0.0010 0.0001 0.0000

0.1000 0.0100 0.0010 0.0001

1.0000 0.1000 0.0100 0.0010

>> fi(M,0,10,7)

ans =

0 0 0 0

0 0 0 0

0.0078 0 0 0

0.1016 0.0078 0 0

1.0000 0.1016 0.0078 0

>>fi(M,0,10,7)*round(m(5,1)/x)*x

ans =

0 0 0 0

0 0 0 0

0.0026 0 0 0

0.0341 0.0026 0 0

0.3359 0.0341 0.0026 0
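The effect of the two rounding modes in the sessions above can be reproduced outside MATLAB. The following is a minimal sketch, not the `fi` implementation: the function name `quantize` is an assumption, overflow and saturation are ignored, and "nearest" is taken to round ties toward positive infinity.

```python
import math


def quantize(value, fraction_bits, mode="nearest"):
    """Quantize value to a multiple of 2**-fraction_bits (no saturation)."""
    step = 2.0 ** -fraction_bits
    scaled = value / step
    if mode == "floor":
        stored = math.floor(scaled)          # round toward minus infinity
    elif mode == "nearest":
        stored = math.floor(scaled + 0.5)    # ties go toward plus infinity
    else:
        raise ValueError("unknown rounding mode: " + mode)
    return stored * step


print(quantize(0.33333, 7, "floor"))
print(quantize(0.33333, 7, "nearest"))
print([quantize(v, 7) for v in (3.3333e-1, 3.3333e-2, 3.3333e-3)])
```

With a step of 2^-7 = 0.0078125, any value below half a step (about 0.0039) is quantized to zero, which is why most entries of the matrix m vanish.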



1.5.3 Quantization, Range and Precision

Introduction

The sections that follow describe the relationship between arithmetic operations and fixed-point scaling, and offer some basic recommendations that may be appropriate for your fixed-point design. For each arithmetic operation,

* The general [Slope Bias] encoding scheme described in Scaling is used.

* The scaling of the result is automatically selected based on the scaling of the two inputs. In other words, the scaling is inherited.

* Scaling choices are based on

o Minimizing the number of arithmetic operations of the result

o Maximizing the precision of the result

Additionally, binary-point-only scaling is presented as a special case of the general encoding scheme.

In embedded systems, the scaling of variables at the hardware interface (the ADC or DAC) is fixed. However, for most other variables, the scaling is something you can choose to give the best design. When scaling fixed-point variables, it is important to remember that

* Your scaling choices depend on the particular design you are simulating.

* There is no best scaling approach. All choices have associated advantages and disadvantages. It is the goal of this section to expose these advantages and disadvantages to you.

From the previous analysis of fixed-point variables scaled within the general [Slope Bias] encoding scheme, you can conclude

* Addition, subtraction, multiplication, and division can be very involved unless certain choices are made for the biases and slopes.

* Binary-point-only scaling guarantees simpler math, but generally sacrifices some precision.

Note that the previous formulas don't show the following:

* Constants and variables are represented with a finite number of bits.

* Variables are either signed or unsigned.

* Rounding and overflow handling schemes. You must make these decisions before an actual fixed-point realization is achieved.

A. Quantization

The quantization Q of a real-world value V is represented by a weighted sum of bits.



while the two's complement value is
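The two cases can be sketched numerically. The sketch assumes the standard weighted-sum definitions: for a word of ws bits, bit i of an unsigned word has weight 2^i, while in two's complement the most significant bit has weight -2^(ws-1) and the remaining bits keep their positive weights. The real-world value is then approximated by S*Q + B. The function names are illustrative only.

```python
def stored_integer(bits, signed):
    """Weighted sum of bits (most significant bit first)."""
    ws = len(bits)
    total = 0
    for i, bit in enumerate(reversed(bits)):   # i = 0 is the least significant bit
        weight = 2 ** i
        if signed and i == ws - 1:
            weight = -weight                   # sign bit carries a negative weight
        total += bit * weight
    return total


def real_world_value(bits, signed, slope=1.0, bias=0.0):
    """Approximate real-world value V = S*Q + B for the given bit pattern."""
    return slope * stored_integer(bits, signed) + bias


pattern = [1, 0, 0, 0, 0, 0, 0, 0]
print(stored_integer(pattern, signed=False))
print(stored_integer(pattern, signed=True))
print(real_world_value([0, 0, 0, 0, 1, 0, 1, 0, 1, 0], True, slope=2.0 ** -7))
```

The same 8-bit pattern 10000000 reads as 128 when unsigned and as -128 in two's complement; the 10-bit pattern 0000101010 with slope 2^-7 gives 0.328125.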

B. Range and Precision

The range of a number gives the limits of the representation, while the precision gives the distance between successive numbers in the representation. The range and precision of a fixed-point number depend on the length of the word and the scaling.

Range

The range of representable numbers for an unsigned and two's complement fixed-point number of size ws, scaling S, and bias B is illustrated in the following figure.

For both the signed and unsigned fixed-point numbers of any data type, the number of different bit patterns is $2^{ws}$.

For example, if the fixed-point data type is an integer with scaling defined as S = 1 and B = 0, then the maximum unsigned value is $2^{ws} - 1$, because zero must be represented. In two's complement, negative numbers must be represented as well as zero, so the maximum value is $2^{ws-1} - 1$. Additionally, since there is only one representation for zero, there must be an unequal number of positive and negative numbers. This means there is a representation for $-2^{ws-1}$ but not for $2^{ws-1}$.
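These limits are easy to compute for any word size. A minimal sketch, assuming a positive slope and the approximation V = S*Q + B; the function name is an assumption made for illustration:

```python
def fixed_point_range(ws, slope=1.0, bias=0.0, signed=True):
    """Return (lowest, highest) representable real-world value."""
    if signed:
        q_min, q_max = -(2 ** (ws - 1)), 2 ** (ws - 1) - 1   # two's complement
    else:
        q_min, q_max = 0, 2 ** ws - 1                         # unsigned
    return slope * q_min + bias, slope * q_max + bias


print(fixed_point_range(8, signed=False))
print(fixed_point_range(8, signed=True))
print(fixed_point_range(10, slope=2.0 ** -7, signed=True))
```

For the signed 10-bit type with scaling 2^-7 used in the earlier examples, the range is -4 to 3.9921875 in steps of 0.0078125.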

Precision

The precision (scaling) of integer and fractional data types is specified by the default binary point. For generalized fixed-point data types, the scaling must be explicitly defined as either [Slope Bias] or binary-point-only. In either case, the precision is given by the slope.



Fixed-Point Data Type Parameters

The low limit, high limit, and default binary-point-only scaling for the supported fixed-point data types discussed in Binary Point Interpretation are given in the following table. See Limitations on Precision and Limitations on Range for more information.

Fixed-Point Data Type Range and Default Scaling

Range of an 8-Bit Fixed-Point Data Type Binary-Point-Only Scaling

The precision, range of signed values, and range of unsigned values for an 8-bit generalized fixed-point data type with binary-point-only scaling follow. Note that the first scaling value ($2^{1}$) represents a binary point that is not contiguous with the word.

Range of an 8-Bit Fixed-Point Data Type [Slope Bias] Scaling

The precision and range of signed and unsigned values for an 8-bit fixed-point data type using [Slope Bias] scaling follow. The slope starts at a value of 1.25 and the bias is 1.0 for all slopes. Note that the slope is the same as the precision.



Fixed-Point Data Type and Scaling Notation

The following table provides a key for various symbols that may appear in Simulink products to indicate the data type and scaling of a fixed-point value.

1.5.4 Recommendations for Arithmetic and Scaling

Introduction

The sections that follow describe the relationship between arithmetic operations and fixed-point scaling, and offer some basic recommendations that may be appropriate for your fixed-point design. For each arithmetic operation,

* The general [Slope Bias] encoding scheme described in Scaling is used.

* The scaling of the result is automatically selected based on the scaling of the two inputs. In other words, the scaling is inherited.

* Scaling choices are based on

o Minimizing the number of arithmetic operations of the result

o Maximizing the precision of the result

Additionally, binary-point-only scaling is presented as a special case of the general encoding scheme.

In embedded systems, the scaling of variables at the hardware interface (the ADC or DAC) is fixed. However, for most other variables, the scaling is something you can choose to give the best design. When scaling fixed-point variables, it is important to remember that



* Your scaling choices depend on the particular design you are simulating.

* There is no best scaling approach. All choices have associated advantages and disadvantages. It is the goal of this section to expose these advantages and disadvantages to you.

Addition

Consider the addition of two real-world values:

These values are represented by the general [Slope Bias] encoding scheme described in Scaling:

In a fixed-point system, the addition of values results in finding the variable Qa:

This formula shows

* In general, Qa is not computed through a simple addition of Qb and Qc.

* In general, there are two multiplications of a constant and a variable, two additions, and some additional bit shifting.
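The expression for Qa follows directly from the encoding. A sketch of the derivation, assuming each value is encoded as $V_i = F_i 2^{E_i} Q_i + B_i$ for i = a, b, c, as in the general [Slope Bias] scheme:

```latex
\begin{align*}
V_a &= V_b + V_c \\
F_a 2^{E_a} Q_a + B_a &= F_b 2^{E_b} Q_b + B_b + F_c 2^{E_c} Q_c + B_c \\
Q_a &= \frac{F_b}{F_a} 2^{E_b - E_a} Q_b
     + \frac{F_c}{F_a} 2^{E_c - E_a} Q_c
     + \frac{B_b + B_c - B_a}{F_a} 2^{-E_a}
\end{align*}
```

The two ratios $F_b/F_a$ and $F_c/F_a$ are the constant-times-variable multiplications, the powers of two are the bit shifts, and the last term is a constant that can be precomputed.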

Inherited Scaling for Speed

In the process of finding the scaling of the sum, one reasonable goal is to simplify the calculations. Simplifying the calculations should reduce the number of operations, thereby increasing execution speed. The following choices can help to minimize the number of arithmetic operations:

* Set Ba = Bb + Bc. This eliminates one addition.

* Set Fa = Fb or Fa = Fc. Either choice eliminates one of the two constant-times-variable multiplications.

The resulting formula is

These equations appear to be equivalent. However, your choice of rounding and precision may make one choice stand out over the other. To further simplify matters, you could choose Ea = Ec or Ea = Eb. This will eliminate some bit shifting.
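Under the same encoding assumption, $V_i = F_i 2^{E_i} Q_i + B_i$, the simplified forms can be written out as a sketch. With Ba = Bb + Bc the constant term vanishes, and each slope choice removes one multiplication:

```latex
\begin{align*}
F_a = F_b: \quad Q_a &= 2^{E_b - E_a} Q_b + \frac{F_c}{F_a} 2^{E_c - E_a} Q_c \\
F_a = F_c: \quad Q_a &= \frac{F_b}{F_a} 2^{E_b - E_a} Q_b + 2^{E_c - E_a} Q_c \\
F_a = F_b,\ E_a = E_b: \quad Q_a &= Q_b + \frac{F_c}{F_a} 2^{E_c - E_a} Q_c
\end{align*}
```

The last line shows the effect of also matching the exponents: one operand needs no shift at all.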



Inherited Scaling for Maximum Precision

In the process of finding the scaling of the sum, one reasonable goal is maximum precision. You can determine the maximum-precision scaling if the range of the variable is known. Example: Maximizing Precision shows that you can determine the range of a fixed-point operation from $\max(V_a)$ and $\min(V_a)$. For a summation, you can determine the range from

You can now derive the maximum-precision slope:

In most cases the input and output word sizes are much greater than one, and the slope becomes

which depends only on the size of the input and output words. The corresponding bias is

The value of the bias depends on whether the inputs and output are signed or unsigned numbers.

If the inputs and output are all unsigned, then the minimum values for these variables are all zero and the bias reduces to a particularly simple form:

If the inputs and the output are all signed, then the bias becomes
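The maximum-precision choice can be explored numerically. The sketch below is an assumption-based illustration, not a transcription of the formulas of this section: it takes the range of the sum to be the sum of the input ranges, picks the slope so that the output word spans exactly that range, and picks the bias so that the smallest stored integer maps to the smallest sum. All names are hypothetical.

```python
def stored_limits(ws, signed):
    """Smallest and largest stored integer for a word of ws bits."""
    if signed:
        return -(2 ** (ws - 1)), 2 ** (ws - 1) - 1
    return 0, 2 ** ws - 1


def max_precision_sum_scaling(b, c, ws_a, signed_a):
    """b and c are (slope, bias, word_size, signed) tuples for the inputs.

    Returns (slope_a, bias_a) for the output word.
    """
    v_min = 0.0
    v_max = 0.0
    for slope, bias, ws, signed in (b, c):
        q_min, q_max = stored_limits(ws, signed)
        v_min += slope * q_min + bias        # smallest possible sum
        v_max += slope * q_max + bias        # largest possible sum
    qa_min, qa_max = stored_limits(ws_a, signed_a)
    slope_a = (v_max - v_min) / (qa_max - qa_min)
    bias_a = v_min - slope_a * qa_min
    return slope_a, bias_a


print(max_precision_sum_scaling((1.0, 1.0, 8, False), (1.0, 2.0, 8, False), 8, False))
print(max_precision_sum_scaling((1.0, 0.0, 8, True), (1.0, 0.0, 8, True), 8, True))
```

In the unsigned case the computed bias is simply the sum of the input biases (here 3.0), and adding two 8-bit inputs into an 8-bit output doubles the slope.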

Binary-Point-Only Scaling

For binary-point-only scaling, finding Qa results in this simple expression:

This scaling choice results in only one addition and some bit shifting. The avoidance of any multiplications is a big advantage of binary-point-only scaling.
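A minimal sketch of such an addition, assuming F = 1 and B = 0 for all three values, so that each value is $2^{E} Q$ and the sum reduces to $Q_a = 2^{E_b - E_a} Q_b + 2^{E_c - E_a} Q_c$. The function names are illustrative, overflow is not checked, and right shifts simply discard the low-order bits.

```python
def shift(q, amount):
    """Multiply the integer q by 2**amount using shifts only."""
    return q << amount if amount >= 0 else q >> -amount


def add_binary_point_only(q_b, e_b, q_c, e_c, e_a):
    """Stored integer of the sum, expressed with exponent e_a."""
    return shift(q_b, e_b - e_a) + shift(q_c, e_c - e_a)


# 0.3359375 stored as 43 with E = -7, and 0.75 stored as 3 with E = -2
q_fine = add_binary_point_only(43, -7, 3, -2, e_a=-7)
q_coarse = add_binary_point_only(43, -7, 3, -2, e_a=-2)
print(q_fine, q_fine * 2.0 ** -7)
print(q_coarse, q_coarse * 2.0 ** -2)
```

Keeping the finer exponent for the result preserves the exact sum 1.0859375, while the coarser exponent shifts the small operand right and loses part of it (the result is 1.0).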



1.6 Microcontroller CPU, Interrupts, Memory, and I/O

The interconnection of the address and data buses between the CPU, memory, and I/O is generally a one-to-one connection. The hard part is designing the appropriate circuitry to adapt the control signals present on each device to be compatible with those of the other devices. The most basic control signals are generated by the CPU to control the data transfers between the CPU and memory, and between the CPU and I/O devices. The four most common types of CPU-controlled data transfers are:

- CPU reads data/instructions from memory (memory read)

- CPU writes data to memory (memory write)

- CPU reads data from an input device (I/O read)

- CPU writes data to an output device (I/O write)

1.6.1 CPU (Central Processing Unit)

The four major CPU components are:

- the arithmetic logic unit (ALU): the ALU contains the circuitry to perform simple arithmetic and logical operations on the inputs

- registers: a type of fast memory

- the control unit (CU): the control unit is the circuitry that controls the flow of data through the processor, and coordinates the activities of the other units within it. In a way, it is the "brain within the brain".

- the internal CPU buses: they interconnect the ALU, registers, and the CU

Figure 1.2 presents the internal block diagram of the V850 CPU.

Figure 1.2 Internal block diagram of V850ES CPU

- The general-purpose registers can be used to store a data variable or an address variable.

- The program counter holds the instruction address during program execution.

- The system registers control the status of the CPU and hold interrupt information.

- The program status word (PSW) is an area of memory or a hardware register which contains information about program state used by the operating system and the underlying hardware. It will normally include a pointer (address) to the next instruction to be executed. The program status word typically contains an error status field and condition codes such as the interrupt enable/disable bit and a supervisor/user mode bit.

Registers

Registers are simply a combination of various flip-flops that can be used to temporarily store data or to delay signals. A storage register is a form of fast programmable internal processor memory usually used to temporarily store, copy, and modify operands that are immediately or frequently used by the system. Shift registers delay signals by passing the signals between the various internal flip-flops with every clock pulse.

Registers are made up of a set of flip-flops that can be activated either individually or as a set. In fact, it is the number of flip-flops in each register that is actually used to describe a processor (for example, a 32-bit processor has working registers that are 32 bits wide containing 32 flip-flops, a 16-bit processor has working registers that are 16 bits wide containing 16 flip-flops, and so on). The number of flip-flops within these registers also determines the width of the data buses used in the system.

While ISA designs do not all use registers in the same way to process the data, storage typically falls under one of two categories, either general purpose or special purpose. General purpose registers can be used to store and manipulate any type of data determined by the programmer, whereas special purpose registers can only be used in a manner specified by the ISA, including holding results for specific types of computations, having predetermined flags (single bits within a register that can act and be controlled independently), acting as counters (registers that can be programmed to change states, that is, increment, asynchronously or synchronously after a specified length of time), and controlling I/O ports (registers managing the external I/O pins connected to the body of the processor and to board I/O). Shift registers are inherently special purpose, because of their limited functionality.

The number of registers, the types of registers, and the size of the data that these registers can store (8-bit, 16-bit, 32-bit, and so forth) vary depending on the CPU, according to the ISA definitions. In the cycle of fetching and executing instructions, the CPU's registers have to be fast, so as to quickly feed data to the ALU, for example, and to receive data from the CPU's internal data bus. Registers are also multi-ported so as to be able to both receive and transmit data to these CPU components.

1.6.2 Interrupts

Now that you know the names and addresses of the memory and peripherals attached to the processor, it is time to learn how to communicate with the latter. There are two basic communication techniques: polling and interrupts. In either case, the processor usually issues some sort of commands to the device, by way of the memory or I/O space, and waits for the device to complete the assigned task. For example, the processor might ask a timer to count down from 1000 to 0. Once the countdown begins, the processor is interested in just one thing: is the timer finished counting yet?

If polling is used, then the processor repeatedly checks to see if the task has been completed. This is analogous to the small child who repeatedly asks "are we there yet?" throughout a long trip. Like the child, the processor spends a large amount of otherwise useful time asking the question and getting a negative response. To implement polling in software, you need only create a loop that reads the status register of the device in question.

The second communication technique uses interrupts. An interrupt is an asynchronous electrical signal from a peripheral to the processor. When interrupts are used, the processor issues commands to the peripheral exactly as before, but then waits for an interrupt to signal completion of the assigned work. While the processor is waiting for the interrupt to arrive, it is free to continue working on other things. When the interrupt signal is finally asserted, the processor temporarily sets aside its current work and executes a small piece of software called the interrupt service routine (ISR). When the ISR completes, the processor returns to the work that was interrupted.

Of course, this isn't all automatic. The programmer must write the ISR himself and "install" and enable it so that it will be executed when the relevant interrupt occurs. The first few times you do this, it will be a significant challenge. But, even so, the use of interrupts generally decreases the complexity of one's overall code by giving it a better structure. Rather than device polling being embedded within an unrelated part of the program, the two pieces of code remain appropriately separate.

On the whole, interrupts are a much more efficient use of the processor than polling. The processor is able to use a larger percentage of its waiting time to perform useful work. However, there is some overhead associated with each interrupt. It takes a good bit of time, relative to the length of time it takes to execute an opcode, to put aside the processor's current work and transfer control to the interrupt service routine. Many of the processor's registers must be saved in memory, and lower-priority interrupts must be disabled. So in practice both methods are used frequently. Interrupts are used when efficiency is paramount or multiple devices must be monitored simultaneously. Polling is used when the processor must respond to some event more quickly than is possible using interrupts.
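The difference between the two techniques can be illustrated with a small simulation of the countdown-timer example. This is only a sketch on a desktop interpreter, not code for a real microcontroller: the `CountdownTimer` class, its one-bit status register and the callback that stands in for the ISR are all hypothetical.

```python
class CountdownTimer:
    """Hypothetical timer peripheral with a one-bit 'done' status register."""
    DONE = 0x01

    def __init__(self, isr=None):
        self.count = 0
        self.status = 0
        self.isr = isr                 # handler "installed" by the programmer

    def start(self, count):
        self.count = count
        self.status = 0
        if count == 0:
            self._finish()

    def tick(self):
        """Advance the timer by one clock period."""
        if self.count > 0:
            self.count -= 1
            if self.count == 0:
                self._finish()

    def _finish(self):
        self.status |= self.DONE       # visible to a polling loop
        if self.isr is not None:
            self.isr()                 # the "interrupt" runs the handler


def run_with_polling(count):
    """Busy-wait on the status register; return the number of 'not yet' answers."""
    timer = CountdownTimer()
    timer.start(count)
    wasted_polls = 0
    while not (timer.status & CountdownTimer.DONE):
        wasted_polls += 1              # nothing useful happens here
        timer.tick()                   # stand-in for time passing
    return wasted_polls


def run_with_interrupt(count):
    """Do other work until the handler reports completion."""
    done = []
    timer = CountdownTimer(isr=lambda: done.append(True))
    timer.start(count)
    useful_work = 0
    while not done:
        useful_work += 1               # the processor is free for other tasks
        timer.tick()
    return useful_work


print(run_with_polling(1000), run_with_interrupt(1000))
```

Both runs last the same 1000 ticks; the difference is what the processor does meanwhile: 1000 status reads with a negative answer in one case, 1000 units of other work in the other.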

An interrupt is an asynchronous signal from hardware indicating the need for attention or a synchronous event in software indicating the need for a change in execution.

Hardware interrupts are triggered by a physical event, such as the closure of a switch, that causes a specific subroutine to be called. They can be thought of as a sort of hardware-initiated subroutine call. They can and do occur at any time in the program, depending on when the event occurs. These are referred to as asynchronous events because they may occur during the execution of any part of the program. Interrupts allow the programs to respond to an event when it occurs.

A software interrupt is a special subroutine call. It is synchronous, meaning that it always occurs at the same time and place in the program that is interrupted. It is frequently used as a quick and simple way to do a subroutine call for accessing programs such as the operating system and I/O programs. Software interrupts are usually implemented as instructions in the instruction set, which cause a context switch to an interrupt handler similar to a hardware interrupt.

Interrupts can be categorized into: maskable interrupt (IRQ), non-maskable interrupt (NMI), interprocessor interrupt (IPI), software interrupt, and spurious interrupt.

- A maskable interrupt (IRQ) is a hardware interrupt that may be ignored by setting a bit in an interrupt mask register's (IMR) bit-mask.

- Likewise, a non-maskable interrupt (NMI) is a hardware interrupt that does not have a bit-mask associated with it, meaning that it can never be ignored. NMIs are often used for timers, especially watchdog timers.

- An interprocessor interrupt is a special case of interrupt that is generated by one processor to interrupt another processor in a multiprocessor system.

- A software interrupt is an interrupt generated within a processor by executing an instruction. Software interrupts are often used to implement system calls because they implement a subroutine call with a CPU ring level change.

- A spurious interrupt is a hardware interrupt that is unwanted. They are typically generated by system conditions such as electrical interference on an interrupt line or through incorrectly designed hardware.

An interrupt can notify the processor when an analog-to-digital converter (ADC) has new data, when a timer rolls over, when a direct memory access (DMA) transfer is complete, when another processor wants to communicate, or when almost any asynchronous event happens. The interrupt hardware is initialized and programmed by the system software. When an interrupt is acknowledged, that process is performed by hardware internal to the processor and the interrupt controller integrated circuit (IC) (if any).
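The distinction between maskable and non-maskable interrupts drawn in the list above comes down to a bitwise operation. A minimal sketch, assuming one bit per interrupt line in both a pending register and the IMR, and assuming that a set IMR bit means "ignore this line" (the polarity differs between real devices); the function name is hypothetical:

```python
def lines_to_service(pending, imr, nmi=False):
    """Return the interrupt sources that reach the CPU.

    pending: bit field of asserted maskable lines
    imr:     interrupt mask register, a set bit masks (ignores) that line
    nmi:     True if the non-maskable interrupt is asserted
    """
    active = pending & ~imr                       # masked lines are dropped
    lines = [n for n in range(active.bit_length()) if (active >> n) & 1]
    return (["NMI"] if nmi else []) + lines       # the NMI bypasses the mask


print(lines_to_service(0b1011, 0b0010))
print(lines_to_service(0b1011, 0b1111, nmi=True))
```

With lines 0, 1 and 3 pending and line 1 masked, only lines 0 and 3 are serviced; with every line masked, the NMI still gets through.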



When an interrupt occurs, the on-chip hardware performs the following functions:

- It saves the program counter (the address the processor was executing when the interrupt occurred) on the stack. Some processors save other information as well, such as register contents.

- It executes an interrupt acknowledge cycle to get a vector from the interrupting peripheral, depending on the processor and the specific type of interrupt.

- It branches to a predetermined address specific to that particular interrupt.

The destination address is th
