compeng 701_2
TRANSCRIPT
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 1/33
Computer components
• CPU - Central Processor Unit(e.g., G3, Pentium III, RISC)
• RAM - Random Access Memory(generally lost when power cycled)
• VRAM - Video RAM (amount setsscreen size and color depth)
• ROM -Read Only Memory (usedfor boot)• IO Input/Output: IO through
interfaces such as SCSI (SmallComputer System Interface), USB(Universal Serial Bus), Firewire(video standard), Ethernet
• HD - Hard Disk Drive (permanentstorage); CDROM - Compact DiskRead-only-memory; CD-RW; DVDRead/Write
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 2/33
Computer Architecture
• Computer Organization
– The von Neumann architecture
– Same storage device for both instructions and data
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 3/33
Computer Architecture
• Device Controllers
– Memory mapped I/O – Direct Memory Access (DMA)
• Instruction Set
– Data transfer operations
– Arithmetic / logic operations
– Control flow instructions
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 4/33
The architecture of a hypothetical
machine
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 5/33
Computer basics
• Smallest thing a computer knows is a bit 0 or 1
(false/true)• CPU knows how to perform and, or, xor (exclusive or) operations
– And returns true if both same – Or returns true if either true
– Xor returns true if different
• CPU is a massive collection of and and or gates• A specific CPU has a set of instructions it can
execute (usually 50-100, machine language)
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 6/33
Basics
• The number of instructions per seconds is set by
the “clock speed”, e.g., 500 MHz Pentium III• One clock tick is called a cycle, modern CPUscan often execute more than one instruction per
cycle• All programs end up as set of instructions to beexecuted by the CPU (compilers/linkers or
interpreters do this for you)• Floating point speed is measured in floating
point operations per seconds, or flops
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 7/33
Bits/Bytes and words
• Bits are grouped into larger units
– 8 bits = 1 byte (still common) – 2/4/8 bytes = word (varies between CPU)
– Most desktop machines are 32-bit words but64 bits machines are becoming more common(set by data bus)
– Why important? Sets minimum size unit youcan access in program, and often precision
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 8/33
Grouping words and bytes
• Number of unique values that can berepresented depends on number of bits
• With n bits one can have 2n unique values• For n=8 have (byte) = 256
• Grouped into larger units to represent differentdata• ASCII (American Standard Code for Information
Interchange) – Basic version is 7 bit (127 characters) – A-Z, a-z, 0-9 and special characters – Values <32 are “control characters”
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 9/33
Numbers and bases
• Numbers represented in different base systems
– Binary base 2 (0-1) – Octal base 8 (0-7)
– Hexadecimal base 16 (0-15, with A-F representing
10-15) – E.g, 5410=3616=668=1101102
• Prefixes: kilo = 1024; mega=1048576;
giga=10737741824 (approximately 103,106,109)
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 10/33
Program Execution
“The machine cycle”
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 11/33
The composition of an instruction
for a machine
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 12/33
Stored Program
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 13/33
Performing the fetch step of the
machine cycle I
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 14/33
Performing the fetch step of the
machine cycle II
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 15/33
Decoding the instruction 35A7
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 16/33
Mnemonics
• It is hard to remember commands as numbers
• Use words associated with the numbers
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 17/33
Some Assembly language
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 18/33
Components: OS
• The Operating System (OS)
– Controls everything in the way the computer works. – Not Specific to a CPU type but often some OSs are
associated with specific CPUs
• G3/4/5 68x series MacOS• Pentium, x86 DOS (Windows)
• SPARC Solaris (Unix)
– OS controls IO and memory management
• Program implementations are dependent on OS
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 19/33
Programming interface to OS
• Depending on language used, OS
interface may or may not be important• For Fortran, C, C++ when program is
linked OS routines are needed – How to read from keyboard or file?
– How to write to screen or disk?
• In your program you do not need to go intothe low-level (OS) details
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 20/33
Storage in memory
• Memory may be treated as a linear array ofbytes from 1 to <size of memory>
• The OS keeps track of which parts of memoryare being used and which parts are still free foruse by programs and data
• Some computers do “byte-swapping”, i.e., thebytes are not counted linearly but rather areswitched. The main (but not only) styles are Big
Endian (HP, Sun, Macs) and Little Endian (PC).• Affects ability to transfer binary data (TCP knowsthis and will accommodate this up to a certaindegree)
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 21/33
Hard disks
• Hard disks contain thecomputers “file system” (allows
access through file names)• Directory structure points towhere files are located (reasonfor having less space available
than the size of disk + somecalibration tracks)• Actual content of HD and
directories depend on OS used
(different standards exist, e.g.,FAT16, FAT32, NTFS forWindows, EXT2 for Linux
• In general, OS can only use
their own file-system
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 22/33
Accessing RAM vs. HDD
• The highest possiblebandwidth (peak
bandwidth) for the varioustypes of RAM
– However, RAM also has tomatch the motherboard,chipset and the CPUsystem bus
• HDD ~ only 80MB/s
• In MATLAB: try save,load, pack, clear
Module type Max.Transfer, MB/s
SD RAM, PC100 800
SD RAM, PC133 1064
Rambus, PC800 1600
Rambus, Dual PC800 3200
DDR 266 (PC2100) 2128
DDR 333 (PC2700) 2664
DDR 400 (PC3200) 3200
DUAL DDR PC3200 6400
DUAL DDR2-400 8600
DUAL DDR2-533 10600
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 23/33
RAM and “fast” RAM/cache
• A CPU cache is a cache usedby the central processing unit of
a computer to reduce theaverage time to access memory
– Access time: roughly speaking“CPU speed against the bus speed”
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 24/33
Integers
• Integer numbers can be represented exactly (up
to the range allowed by the number of bytes)• A 2-byte integer, unsigned 0-65535, signed
±32767 (sometimes called short)
• A 4-byte integer, unsigned 0-4294967295,signed ±2147483827
– (With a 32-bit address bus, can have 4Gbytes ofmemory—reason max memory is limited incomputers)
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 25/33
Floating point
• Representations vary between machines (often
reason binary files can not be shared)
– Precise layout of bits depends on machine andformat; all formats are (mantissa)*2(exponent)
Th IEEE t d d f fl ti
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 26/33
The IEEE standard for floating
point arithmetic• Single precision (32 bits=4 bytes)
S EEEEEEEE FFFFFFFFFFFFFFFFFFFFFF0 1 8 9 31
The value V represented by the word may be determined as follows:
• If E=255 and F is nonzero, then V=NaN ("Not a number")• If E=255 and F is zero and S is 1, then V=-Infinity• If E=255 and F is zero and S is 0, then V=Infinity• If 0<E<255 then V=(-1)S * 2(E-127) * (1.F) where "1.F" is intended to represent
the binary number created by prefixing F with an implicit leading 1 and a
binary point• If E=0 and F is nonzero, then V=(-1)S * 2 (-126) * (0.F). These are
"unnormalized" values.• If E=0 and F is zero and S is 1, then V=-0• If E=0 and F is zero and S is 0, then V=0
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 27/33
Single precision floating point• In particular
0 00000000 00000000000000000000000 = 0
1 00000000 00000000000000000000000 = -0
0 11111111 00000000000000000000000 = Infinity1 11111111 00000000000000000000000 = -Infinity
0 11111111 00000100000000000000000 = NaN1 11111111 00100010001001010101010 = NaN
0 10000000 00000000000000000000000 = +1 * 2(128-127) * 1.0 = 20 10000001 10100000000000000000000 = +1 * 2(129-127) * 1.101 = 22*(23+22+1)/23 = 6.5
1 10000001 10100000000000000000000 = -1 * 2(129-127) * 1.101 = -6.5
0 00000001 00000000000000000000000 = +1 * 2(1-127) * 1.0 = 2(-126)
0 00000000 10000000000000000000000 = +1 * 2(-126) * 0.1 = 2(-127)
0 00000000 00000000000000000000001 = +1 * 2(-126) *
0.00000000000000000000001 =2(-149) (Smallest positive value)
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 28/33
Double precision floating point
S EEEEEEEEEEE FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
0 1 11 12 63
The value V represented by the word may be determined as follows:
• If E=2047 and F is nonzero, then V=NaN ("Not a number")
• If E=2047 and F is zero and S is 1, then V=-Infinity
• If E=2047 and F is zero and S is 0, then V=Infinity• If 0<E<2047 then V=(-1)S * 2(E-1023) * (1.F) where "1.F" is intended to represent
the binary number created by prefixing F with an implicit leading 1 and a binarypoint.
• If E=0 and F is nonzero, then V=(-1)S
* 2(-1022)
* (0.F) These are "unnormalized"values.
• If E=0 and F is zero and S is 1, then V=-0
• If E=0 and F is zero and S is 0, then V=0
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 29/33
Is the finite precision an issue?
• An extended example: condition number of a
symmetric matrix.
Consider a system of linear equations
and the perturbed system
⎟⎟ ⎠
⎞⎜⎜⎝
⎛ =⎟⎟
⎠
⎞⎜⎜⎝
⎛ ⎟⎟ ⎠
⎞⎜⎜⎝
⎛
1997
1999
998999
9991000
y
x
⎟⎟ ⎠
⎞⎜⎜⎝
⎛ =⎟⎟
⎠
⎞⎜⎜⎝
⎛ ⎟⎟ ⎠
⎞⎜⎜⎝
⎛ 01.1997
99.1998
ˆ
ˆ
998999
9991000
y
x
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 30/33
Example
Note
What went wrong?
Recallan n-by-n matrix A is symmetric iff A=A
T
.Fact (“spectral decomposition”):
If A is symmetric, it may be written as
A=UDUT,where D is the diagonal matrix,
U is unitary (i.e., UUT=I, the identity matrix).
99.18
97.20
ˆ
ˆ but
1
1
⎟⎟ ⎠
⎞⎜⎜⎝
⎛ −
+=⎟⎟
⎠
⎞⎜⎜⎝
⎛ ⎟⎟ ⎠
⎞⎜⎜⎝
⎛ =⎟⎟
⎠
⎞⎜⎜⎝
⎛ y
x
y
x
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 31/33
Example
Fact: U is unitary is “almost the same” as
U is a rotation matrix(“almost the same” because U might includereflections).
Fact: For A=UDUT as before, the diagonalelements of D are the eigenvalues of A andcolumns of U are the right eigenvectors of A.
Recall t is an eigenvalue of A iff det(A-tI)=0,x is the corresponding right eigenvector iff
Ax=tx.
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 32/33
Example
• How does A act on x, step-by-step: Ax=UDUTx=UD(UTx)=U(D(UTx)),
that is, “rotate, scale, rotate back”.
Define the condition number of A as
χ(A)=|t|max(A) / |t|min(A)where |t|max and |t|min are the largest and the smallest inabsolute value eigenvalues of A.
χ(A) shows how far off the solution to Ax=b may be fromthe solution to Ax=b+Δb (a measure of “relativesingularity”).
7/23/2019 COMPENG 701_2
http://slidepdf.com/reader/full/compeng-7012 33/33
Assignment 2MATLAB and C code are posted on the web. The MATLAB script is slightly modified in an
attempt to produce a more accurate graph: the C call is referred to several timesusing reps counter.
1. Using the MATLAB script from the class, try to identify the cache size on your
machine. Note that usually a number in MATLAB occupies 8 bytes (double precisionfloating point), and that the function requires roughly 2-times the memory needed tostore the x vector (x in the input, x_new in the output). Explain your results. As asanity check, you might want to use a CPU info tool.
2. Condition number I: for the class example, answer the following:a) find the matrix spectral decomposition and compute the matrix condition number,b) find perturbations Δb of the right-hand side of the equation with (the Euclidean norm) ||Δb||=1
that give the largest and the smallest errors in the solution vector; plot each of the twoperturbations, together with their image under UT, under D-1UT and, finally, under A-1,
c) given the equation for the curve ||Δb||=1, find its image under A-1 and plot it,d) find the explicit expression for the condition number χ(A-γ I) for γ > 0.
3. Condition number II: given a 2x2 matrix with double-precision entries, what is theworst condition number this matrix might have? Explain your answer.
4. Condition number III: for the question 2 of the first assignment, is there any ill-conditioning (ill-conditioning refers to a matrix having large condition number, henceoften resulting in numerical instability; if you happen to encounter complexeigenvalues, you are on the wrong track; for the definition in class, matrix must be
symmetric)? Explain your answer.5. Bonus question: find all the distinct spectral decompositions of the identity matrix.