advanced micro devices - athlon buddy guest mike lewitt bill mccorkle november 28, 2001
![Page 1: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/1.jpg)
Advanced Micro Devices - Athlon
Buddy Guest Mike Lewitt Bill McCorkle
November 28, 2001
![Page 2: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/2.jpg)
RISC
IA-64
IA-32
What Have We Seen So Far?
Where Is the Competition?
![Page 3: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/3.jpg)
Overview of Today’s Events
- Company History
- Differences in AMD Athlon Architecture
  - System Bus
  - Macro vs. Micro Operations
  - Floating Point Operations
  - Branch Prediction
  - Memory Management
- Comparing Processor Performance
![Page 4: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/4.jpg)
AMD
- May 1, 1969: founded as a semiconductor company
- 1975: 8080A and Am2900
- 1976: signs cross-licensing agreement with Intel
- 1987: AMD and Intel go to court
- 1991: Am386 (breaks Intel's monopoly)
- 1992: court awards AMD full rights to produce the Am386 processor
- 1993: Am486
- 1997: AMD-K6
- 1999: Athlon, the first 7th-generation processor

Intel
- July 18, 1968: founded as a semiconductor memory company
- 1971: 4004 introduced
- 1972: 8008 introduced
- 1976: signs cross-licensing agreement with AMD
- 1978: 16-bit 8086
- 1982: 286 (on-board memory)
- 1985: 32-bit 386
- 1989: 486
- 1993: Pentium
- 1997: Pentium II
- 1998: Celeron
![Page 5: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/5.jpg)
Architecture Summary
AMD Approach
- Balanced approach: optimize processor performance (IPC) and improve the operating frequency at the same time.
Intel Approach
- Increased pipeline depth to handle more instructions, which cost processor performance (IPC).
- Solution: compensated with a much higher frequency to stay competitive. (=IPC)
![Page 6: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/6.jpg)
Architecture Summary
Overall Improvement to Performance
- Frequency Improvements
  - Smaller geometries, faster transistors (“process shrinks”)
  - Deeper pipelines, fewer gates per clock cycle
- Work Per Clock Improvements
  - Superscalar architectures
  - Dynamic instruction schedulers
  - Larger on-chip caches
  - Advanced branch prediction
![Page 7: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/7.jpg)
Architecture Summary
Clock Speed / EV6 Bus
- Designed with very high clock speeds in mind
- K7 has very deep buffers to enable those high clock speeds, allowing up to 72 x86 instructions in flight.
- The bus transfers data on both the rising and the falling clock edge
  - 100 MHz clock: 200 MHz effective processor bus
  - 133 MHz clock: 266 MHz effective processor bus
- AMD vs. Intel, comparing the same clock
![Page 8: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/8.jpg)
Architecture Summary
EV6 Bus on AMD Athlon
- Scalable up to 200 MHz, yielding an effective frequency of 400 MHz
- Multiprocessor support
- Highest bus bandwidth (1.60 GB/s)
  - Intel uses 133 MHz (1.01 GB/s)
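The bandwidth figures on this slide can be roughly reproduced with a short calculation. The sketch below assumes a 64-bit (8-byte) data bus on both processors and decimal gigabytes; the function name and its parameters are illustrative, not taken from the slides.

```python
def bus_bandwidth_gb_s(clock_mhz, transfers_per_clock, bus_width_bytes=8):
    """Peak bus bandwidth in GB/s (1 GB = 10**9 bytes).

    bus_width_bytes=8 is an assumption (64-bit data bus).
    """
    bytes_per_second = clock_mhz * 1e6 * transfers_per_clock * bus_width_bytes
    return bytes_per_second / 1e9


# EV6 bus: 100 MHz clock, data moved on both clock edges (2 transfers per clock)
athlon_bw = bus_bandwidth_gb_s(100, 2)

# Pentium III bus: 133 MHz clock, 1 transfer per clock
intel_bw = bus_bandwidth_gb_s(133, 1)

print(f"Athlon EV6 bus:  {athlon_bw:.2f} GB/s")
print(f"Pentium III bus: {intel_bw:.2f} GB/s")
```

Under these assumptions the Athlon figure matches the 1.60 GB/s on the slide. The Intel estimate comes out slightly above the slide's 1.01 GB/s, which may reflect rounding or a different unit convention in the original source.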
![Page 9: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/9.jpg)
AMD Athlon
PIII
![Page 10: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/10.jpg)
Architecture Summary
Instruction Control Unit
- Holds 72 MOps before assignment (1 MOp = 1 x86 instruction, so the Athlon can have 72 “in-flight” instructions)
- P6 holds only 13 in-flight MOps
![Page 11: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/11.jpg)
Architecture Summary
Execution Ports
- AMD has no fewer than 9
- Intel has 5
  - 2 dedicated to memory stores
- Enhanced parallelism inside Athlon
![Page 12: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/12.jpg)
Micro-OPs / Macro-OPs
- Athlon has 3 parallel x86 instruction decoders that translate instructions into Macro-Ops for the 72-entry ICU
- Uses 2 decoding pipelines (Intel uses 1)
  - Decoding common instructions (direct path)
  - Decoding complex x86 instructions (vector path)
- The integer scheduler is fed Macro-Ops and holds a maximum of 15, representing up to 30 operations at a time
- Leads to 3 parallel integer execution units
![Page 13: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/13.jpg)
Micro-OPs / Macro-OPs
Athlon Decoders (3-Way Instruction)
- Has 3 parallel decoding units
- Can handle any combination of instructions with any of its decoders, all of which are “fully capable” decoders
- Handles complex and simple instructions
Intel Decoders
- Has 3 parallel decoding units
  - 1 complex
  - 2 simple
- Handles complex / simple / simple
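The effect of the two decoder layouts can be illustrated with a toy model. The sketch below is a deliberate simplification: it assumes strictly in-order decoding, treats every instruction as either "simple" or "complex", and ignores everything else real decoders do. All names (`decode_cycles`, `ATHLON_SLOTS`, `P6_SLOTS`) and the sample instruction stream are hypothetical.

```python
def decode_cycles(instrs, slot_rules):
    """Count cycles needed to decode `instrs` in program order.

    instrs:     list of "simple" / "complex" labels
    slot_rules: one set per decoder slot, naming the kinds that slot accepts
    """
    cycles = 0
    i = 0
    while i < len(instrs):
        if instrs[i] not in slot_rules[0]:
            raise ValueError("first decoder slot must accept every kind")
        cycles += 1
        for accepted in slot_rules:
            if i < len(instrs) and instrs[i] in accepted:
                i += 1
            else:
                break  # in-order: a slot that cannot decode blocks later slots
    return cycles


ANY = {"simple", "complex"}
ATHLON_SLOTS = [ANY, ANY, ANY]               # three fully capable decoders
P6_SLOTS = [ANY, {"simple"}, {"simple"}]     # complex / simple / simple

# Hypothetical instruction mix with several complex instructions
stream = ["complex", "complex", "simple", "complex", "simple", "simple"]

athlon_cycles = decode_cycles(stream, ATHLON_SLOTS)
p6_cycles = decode_cycles(stream, P6_SLOTS)
print(athlon_cycles, p6_cycles)
```

In this model the fully capable layout decodes the sample stream in 2 cycles, while the complex / simple / simple layout needs 3, because a complex instruction that lands on a simple-only slot has to wait for the next cycle.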
![Page 14: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/14.jpg)
3DNOW!

| | 3DNOW! (Athlon) | SSE (Intel) |
|---|---|---|
| Pipelines (parallel) | 2 | 2 |
| Instructions (how wide) | 2 | 4 |
| Effective Instructions per Cycle | 4* | 4 |
| Registers Used | 3DNOW! / FPU | No FPU |

Every 4-wide Intel SSE instruction is actually 2 Athlon micro-ops
*AMD takes advantage of rising edge as well as falling edge
**SSE cannot be used with MMX registers
MMX was developed when FPUs were not as important
![Page 15: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/15.jpg)
3DNOW!
Each pipeline can do any instruction above.
The second pipeline can do any instruction in any group except the group the first pipeline has chosen.
![Page 16: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/16.jpg)
3DNOW!
Conclusion of 3DNOW! vs. SSE
- Both have pairing restrictions
- SSE is a separate unit: implementation is more difficult, but it can be programmed with more freedom
- MMX-add and prefetch instructions are slightly better for SSE
- Final conclusion: DRAW
![Page 17: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/17.jpg)
Full Architecture Views
AMD Athlon
PIII
![Page 18: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/18.jpg)
Looking at the ALUs
![Page 19: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/19.jpg)
Floating Point Operations
- Fully pipelined FPU
- 3 ported, parallel floating point execution units
  - Pentium has 3 also, but they are behind only one port
- FPU can execute two 80-bit extended Ops
  - Intel can currently execute only one
![Page 20: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/20.jpg)
Pipelining Differences
Determining the length
- Execution rate of pipeline (ALU)
- Degree of parallelism

| | AMD Athlon | Intel Pentium III |
|---|---|---|
| Integer pipeline length | 10 | 12-17 |
| Floating point pipeline length | 15 | 25 |
(AMD-Athlon)
![Page 21: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/21.jpg)
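Branch Prediction
Example:

    if (x > 0) {
        a = 0;
        b = 1;
        c = 2;
    }
    d = 3;

When x>0

| Cycle | Fetch | Decode | Execute | Save |
|---|---|---|---|---|
| 1 | if (x>0) | | | |
| 2 | a=0 | if (x>0) | | |
| 3 | b=1 | a=0 | if (x>0) | |
| 4 | c=2 | b=1 | a=0 | if (x>0) |
| 5 | | c=2 | b=1 | a=0 |
| 6 | | | c=2 | b=1 |
| 7 | | | | c=2 |

When x<0

| Cycle | Fetch | Decode | Execute | Save |
|---|---|---|---|---|
| 1 | if (x>0) | | | |
| 2 | a=0 | if (x>0) | | |
| 3 | b=1 | a=0 | if (x>0) | |
| 4 | d=3 | b=1 (squash) | a=0 (squash) | if (x>0) |
| 5 | | d=3 | b=1 (squash) | a=0 (squash) |
| 6 | | | d=3 | b=1 (squash) |
| 7 | | | | d=3 |

Predicting x<0

| Cycle | Fetch | Decode | Execute | Save |
|---|---|---|---|---|
| 1 | if (x>0) | | | |
| 2 | d=3 | if (x>0) | | |
| 3 | | d=3 | if (x>0) | |
| 4 | | | d=3 | if (x>0) |
| 5 | | | | d=3 |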
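The cycle counts in the three pipeline traces follow a simple pattern, sketched below under the assumption of the idealized 4-stage pipeline (Fetch, Decode, Execute, Save) used in the example, with one instruction entering per cycle. The function name `total_cycles` and its parameters are illustrative only.

```python
def total_cycles(useful_instructions, squashed_instructions=0, stages=4):
    """Cycles until the last useful instruction leaves the pipeline.

    The first instruction needs `stages` cycles; every further instruction
    that enters the pipeline (useful or later squashed) adds one cycle.
    """
    entered = useful_instructions + squashed_instructions
    return entered + stages - 1


# x > 0, body predicted and executed: if, a=0, b=1, c=2
taken_correct = total_cycles(4)

# x < 0 but body was fetched: if and d=3 are useful, a=0 and b=1 are squashed
mispredicted = total_cycles(2, squashed_instructions=2)

# x < 0 and predicted correctly: only if and d=3 enter the pipeline
skipped_correct = total_cycles(2)

print(taken_correct, mispredicted, skipped_correct)
```

The mispredicted case finishes in cycle 7 instead of cycle 5, so in this small model the wrong guess costs two cycles, one for each squashed instruction. A deeper pipeline would resolve the branch later and squash more work.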
![Page 22: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/22.jpg)
Branch Prediction
AMD Athlon
- Branch Target Buffer size of 2048 entries
- Branch History Table can store 4096 entries
Intel Pentium III
- Dynamic Branch Predictor can store 512 entries
Approximate Correct Branch Predictions
- AMD Athlon: 95%
- Intel Pentium III: 90-92%
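A few percentage points of prediction accuracy matter more than they may appear to. The sketch below turns the approximate accuracies above into mispredictions per 1000 branches and, given a flush penalty, into wasted cycles. The penalty value of 10 cycles is purely an assumption for illustration, and the function names are hypothetical.

```python
def mispredictions_per_1000(accuracy_percent):
    """Expected number of wrong predictions in 1000 branches."""
    return 1000 * (100 - accuracy_percent) / 100


def wasted_cycles_per_1000(accuracy_percent, flush_penalty_cycles):
    """Cycles lost to pipeline flushes per 1000 branches.

    flush_penalty_cycles is an assumed, illustrative value.
    """
    return mispredictions_per_1000(accuracy_percent) * flush_penalty_cycles


ASSUMED_PENALTY = 10  # hypothetical flush cost in cycles

for name, accuracy in [("Athlon", 95), ("Pentium III (best)", 92), ("Pentium III (worst)", 90)]:
    misses = mispredictions_per_1000(accuracy)
    wasted = wasted_cycles_per_1000(accuracy, ASSUMED_PENALTY)
    print(f"{name}: {misses:.0f} mispredictions, {wasted:.0f} wasted cycles per 1000 branches")
```

With these figures, 95% accuracy means 50 mispredictions per 1000 branches, while 90% means 100, so a five-point accuracy gap doubles the number of pipeline flushes.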
![Page 23: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/23.jpg)
Memory Management
Level 2 Cache
- 512 kB to 8 MB
- Runs at 1/3, 1/2, 2/3, or 1/1 of the clock frequency
- External to the CPU (weakness of Athlon)
- Intel L2: 256 kB ‘on-die’
- Intel moving away from Slot 1 and back to socket
- AMD will need to move to ‘on-die’ and socket connections to stay competitive
- Main push towards 0.18 µm process
Level 1 Cache
- 64 kB each for the data and instruction caches (4x Pentium III)
- Scalability
![Page 24: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/24.jpg)
Which One Is Better?
- In the past (286, 386, 486)
  - Performance = Frequency
- In today’s world
  - Performance = IPC * Frequency
- How else do we compare?
  - Benchmarking
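The relation Performance = IPC * Frequency explains why clock speed alone is a poor comparison. The sketch below uses two made-up processors; the IPC and frequency values are hypothetical and are not measurements of any Athlon or Pentium part.

```python
def performance_mips(ipc, frequency_mhz):
    """Millions of instructions per second = IPC * clock frequency in MHz."""
    return ipc * frequency_mhz


# Hypothetical processor A: higher IPC, lower clock
cpu_a = performance_mips(ipc=1.5, frequency_mhz=800)

# Hypothetical processor B: lower IPC, higher clock
cpu_b = performance_mips(ipc=1.0, frequency_mhz=1200)

print(cpu_a, cpu_b)
```

Both hypothetical processors deliver the same throughput even though B runs at a 50% higher clock, which is the trade-off between the two design approaches described earlier in the deck.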
![Page 25: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/25.jpg)
Benchmarking
- Software that performs different tasks to obtain comparisons between processors.
- Problems:
  - Processor frequencies.
  - Other processes already running.
  - Types of programs
    - Some programs are written to take advantage of a certain architecture.
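A minimal benchmark harness might look like the sketch below. It is only an illustration of the idea, not the software used for the results in this deck: the two workloads are toy examples, and taking the best of several runs is one common way to reduce the influence of other processes that happen to be running.

```python
import time


def benchmark(task, repeats=5):
    """Return the best wall-clock time in seconds over several runs of `task`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)  # keep the run least disturbed by other work
    return best


def integer_task():
    # Toy integer-heavy workload
    return sum(i * i for i in range(100_000))


def float_task():
    # Toy floating-point-heavy workload
    return sum(i ** 0.5 for i in range(100_000))


results = {
    "integer": benchmark(integer_task),
    "float": benchmark(float_task),
}

for name, seconds in results.items():
    print(f"{name}: {seconds * 1000:.2f} ms")
```

Running different workload types separately matters because, as noted above, a program tuned for one architecture can favor one processor over the other.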
![Page 26: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/26.jpg)
Photo Editing Software
![Page 27: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/27.jpg)
Animation Software
![Page 28: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/28.jpg)
3D Graphics Editor
![Page 29: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/29.jpg)
3D Gaming
![Page 30: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/30.jpg)
Various Benchmarks
![Page 31: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/31.jpg)
Summary
- Over the past couple of years, AMD and Intel have taken different approaches.
- We have gone over the main architectural differences.
- We have shown how they compare.
- It will be very interesting to see how the market plays out.
![Page 32: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/32.jpg)
Questions?
![Page 33: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001](https://reader035.vdocuments.net/reader035/viewer/2022062407/56649d535503460f94a2fa4e/html5/thumbnails/33.jpg)