Computer Architecture Lec 2 - Introduction. 01/19/10

Post on 13-Jan-2016

TRANSCRIPT

<ul>
<li><p>Computer Architecture</p><p>Lec 2 - Introduction</p></li>
<li><p>Review from last lecture</p>
<ul>
<li>Computer Architecture &gt;&gt; instruction sets</li>
<li>Computer Architecture skill sets are different
<ul>
<li>5 Quantitative principles of design</li>
<li>Quantitative approach to design</li>
<li>Solid interfaces that really work</li>
<li>Technology tracking and anticipation</li>
</ul></li>
<li>Computer Science at the crossroads from sequential to parallel computing</li>
<li>Salvation requires innovation in many fields, including computer architecture</li>
</ul>
<p>Lec 02-intro</p></li>
<li><p>Review: Computer Architecture brings</p>
<ul>
<li>Other fields often borrow ideas from architecture</li>
<li>Quantitative Principles of Design
<ul>
<li>Take Advantage of Parallelism</li>
<li>Principle of Locality</li>
<li>Focus on the Common Case</li>
<li>Amdahl's Law</li>
<li>The Processor Performance Equation</li>
</ul></li>
<li>Careful, quantitative comparisons
<ul>
<li>Define, quantify, and summarize relative performance</li>
<li>Define and quantify relative cost</li>
<li>Define and quantify dependability</li>
<li>Define and quantify power</li>
</ul></li>
<li>Culture of anticipating and exploiting advances in technology</li>
<li>Culture of well-defined interfaces that are carefully implemented and thoroughly checked</li>
</ul></li>
<li><p>Outline</p>
<ul>
<li>Review</li>
<li>Technology Trends: Culture of tracking, anticipating and exploiting advances in technology</li>
<li>Careful, quantitative comparisons:
<ul>
<li>Define, quantify, and summarize relative performance</li>
<li>Define and quantify relative cost</li>
<li>Define and quantify dependability</li>
<li>Define and quantify power</li>
</ul></li>
</ul></li>
<li><p>Moore's Law: 2X transistors / year</p>
<ul>
<li>"Cramming More Components onto Integrated Circuits", Gordon Moore, Electronics, 1965</li>
<li># of transistors per cost-effective integrated circuit doubles every N months (12 &le; N &le; 24)</li>
</ul></li>
<li><p>Tracking Technology Performance Trends</p>
<ul>
<li>Drill down into 4 technologies: Disks, Memory, Network, Processors</li>
<li>Compare ~1980 Archaic (Nostalgic) vs. ~2000 Modern (Newfangled)
<ul>
<li>Performance Milestones in each technology</li>
</ul></li>
<li>Compare for Bandwidth vs. Latency improvements in performance over time</li>
<li>Bandwidth: number of events per unit time
<ul>
<li>E.g., M bits / second over network, M bytes / second from disk</li>
</ul></li>
<li>Latency: elapsed time for a single event
<ul>
<li>E.g., one-way network delay in microseconds, average disk access time in milliseconds</li>
</ul></li>
</ul></li>
<li><p>Disks: Archaic (Nostalgic) v. Modern (Newfangled)</p>
<table>
<tr><th></th><th>CDC Wren I, 1983</th><th>Seagate 373453, 2003</th><th>Improvement</th></tr>
<tr><td>Rotation speed</td><td>3600 RPM</td><td>15000 RPM</td><td>4X</td></tr>
<tr><td>Capacity</td><td>0.03 GBytes</td><td>73.4 GBytes</td><td>2500X</td></tr>
<tr><td>Tracks/Inch</td><td>800</td><td>64000</td><td>80X</td></tr>
<tr><td>Bits/Inch</td><td>9550</td><td>533,000</td><td>60X</td></tr>
<tr><td>Platters</td><td>Three 5.25"</td><td>Four 2.5" (in 3.5" form factor)</td><td></td></tr>
<tr><td>Bandwidth</td><td>0.6 MBytes/sec</td><td>86 MBytes/sec</td><td>140X</td></tr>
<tr><td>Latency</td><td>48.3 ms</td><td>5.7 ms</td><td>8X</td></tr>
<tr><td>Cache</td><td>none</td><td>8 MBytes</td><td></td></tr>
</table></li>
<li><p>Latency Lags Bandwidth (for last ~20 years)</p>
<p>Performance Milestones</p>
<p>Disk: 3600, 5400, 7200, 10000, 15000 RPM (8x, 143x)</p>
<p>(latency = simple operation w/o contention; BW = best-case)</p>
<p>[Chart: Relative BW Improvement vs. Relative Latency Improvement for Processor, Memory, Network, and Disk, with a reference line where latency improvement = bandwidth improvement]</p>
<p>Milestone data behind the chart:</p>
<table>
<tr><th>Microprocessor (Intel)</th><th>80286</th><th>80386</th><th>80486</th><th>Pentium</th><th>Pentium Pro</th><th>Pentium 4</th></tr>
<tr><td>Year</td><td>1982</td><td>1985</td><td>1989</td><td>1993</td><td>1997</td><td>2001</td></tr>
<tr><td>Latency (clocks per add instruction)</td><td>6</td><td>5</td><td>5</td><td>5</td><td>10</td><td>22</td></tr>
<tr><td>Bus width (bits)</td><td>16</td><td>32</td><td>32</td><td>64</td><td>64</td><td>64</td></tr>
<tr><td>Clock rate (MHz)</td><td>12.5</td><td>16</td><td>25</td><td>66</td><td>200</td><td>1500</td></tr>
<tr><td>Bandwidth (peak MIPS)</td><td>2</td><td>6</td><td>25</td><td>132</td><td>600</td><td>4500</td></tr>
<tr><td>Latency (ns)</td><td>320</td><td>313</td><td>200</td><td>76</td><td>50</td><td>15</td></tr>
</table>
<table>
<tr><th>Memory Module</th><th>DRAM</th><th>Page Mode DRAM</th><th>Fast Page Mode DRAM</th><th>Fast Page Mode DRAM</th><th>Synchronous DRAM</th><th>Double Data Rate DRAM</th></tr>
<tr><td>Year</td><td>1980</td><td>1983</td><td>1986</td><td>1993</td><td>1997</td><td>2000</td></tr>
<tr><td>Module width (bits)</td><td>16</td><td>16</td><td>32</td><td>64</td><td>64</td><td>64</td></tr>
<tr><td>Mbits/DRAM chip</td><td>0.06</td><td>0.25</td><td>1</td><td>16</td><td>64</td><td>256</td></tr>
<tr><td>Bandwidth (MBytes/sec)</td><td>13</td><td>40</td><td>160</td><td>267</td><td>640</td><td>1,600</td></tr>
<tr><td>Latency (ns)</td><td>225</td><td>170</td><td>125</td><td>75</td><td>62</td><td>52</td></tr>
</table>
<table>
<tr><th>Local Area Network</th><th>Ethernet</th><th>Fast Ethernet</th><th>Gigabit Ethernet</th><th>10 Gigabit Ethernet</th></tr>
<tr><td>IEEE Standard</td><td>802.3</td><td>802.3u</td><td>802.3ab</td><td>802.3ae</td></tr>
<tr><td>Year</td><td>1978</td><td>1995</td><td>1999</td><td>2003</td></tr>
<tr><td>Bandwidth (Mbits/sec)</td><td>10</td><td>100</td><td>1000</td><td>10000</td></tr>
<tr><td>Latency (microseconds)</td><td>3000</td><td>500</td><td>340</td><td>190</td></tr>
</table>
<table>
<tr><th>Hard Disk</th><th>3600 RPM</th><th>5400 RPM</th><th>7200 RPM</th><th>10000 RPM</th><th>15000 RPM</th></tr>
<tr><td>Year</td><td>1983</td><td>1990</td><td>1994</td><td>1998</td><td>2003</td></tr>
<tr><td>Capacity (GBytes)</td><td>0.03</td><td>1.4</td><td>4.3</td><td>9.1</td><td>73.4</td></tr>
<tr><td>Disk diameter (inches)</td><td>5.25</td><td>5.25</td><td>3.5</td><td>3.5</td><td>3.5</td></tr>
<tr><td>Bandwidth (MBytes/sec)</td><td>0.6</td><td>4</td><td>9</td><td>24</td><td>86</td></tr>
<tr><td>Latency (ms)</td><td>48.3</td><td>17.1</td><td>12.7</td><td>8.8</td><td>5.7</td></tr>
</table>
<p>Improvement per year:</p>
<table>
<tr><th>Technology</th><th>Bandwidth</th><th>Latency</th><th>Capacity</th></tr>
<tr><td>Networks</td><td>80%</td><td>15%</td><td>n.a.</td></tr>
<tr><td>Disks</td><td>30%</td><td>10%</td><td>100%</td></tr>
<tr><td>DRAM</td><td>30%</td><td>5%</td><td>40%</td></tr>
<tr><td>Processors (MIPS)</td><td>70%</td><td>20%</td><td>n.a.</td></tr>
</table>
<p>Years to 2X bandwidth, and latency improvement in that time (this table uses 7% per year for DRAM latency):</p>
<table>
<tr><th>Technology</th><th>Years to 2X BW</th><th>Latency improvement</th><th>Bandwidth improvement</th><th>BW per year</th><th>Latency per year</th></tr>
<tr><td>Networks</td><td>1.18</td><td>1.18</td><td>2.00</td><td>80%</td><td>15%</td></tr>
<tr><td>Disks</td><td>2.65</td><td>1.29</td><td>2.00</td><td>30%</td><td>10%</td></tr>
<tr><td>DRAM</td><td>2.65</td><td>1.20</td><td>2.00</td><td>30%</td><td>7%</td></tr>
<tr><td>Processors</td><td>1.31</td><td>1.27</td><td>2.00</td><td>70%</td><td>20%</td></tr>
<tr><td>Average</td><td></td><td>1.23</td><td>2.00</td><td></td><td></td></tr>
</table>
<p>In the time that bandwidth doubles, latency improves by only about 20% to 30% (a factor of roughly 1.2 to 1.3).</p>
<p>...</p></li>
</ul>
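The Moore's Law slide says the transistor count per cost-effective chip doubles every N months, with N between 12 and 24. A minimal sketch of what that range means over a decade follows; the function name `growth_factor` and the 10-year horizon are illustrative choices, not part of the lecture.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Factor by which the transistor count grows after `years`,
    assuming one doubling every `doubling_months` months."""
    doublings = 12.0 * years / doubling_months
    return 2.0 ** doublings


# Compare the two ends of the slide's range (and a midpoint) over 10 years.
for n in (12, 18, 24):
    print(f"N = {n:2d} months: x{growth_factor(10, n):,.0f} per decade")
```

The gap between N = 12 and N = 24 is large (roughly a thousandfold versus thirtyfold per decade), which is why the doubling period matters when anticipating technology.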
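The "Latency Lags Bandwidth" comparison can be reproduced from the first and last milestone of each technology. The sketch below uses the first and last bandwidth and latency figures from the transcript's milestone data; the dictionary layout and the helper name `relative_improvements` are assumptions made for illustration.

```python
# (first milestone, last milestone) for each technology, from the transcript.
# Bandwidth units: MIPS, MBytes/sec, Mbits/sec, MBytes/sec.
# Latency units: ns, ns, microseconds, ms.
MILESTONES = {
    "Processor": {"bandwidth": (2, 4500), "latency": (320, 15)},
    "Memory": {"bandwidth": (13, 1600), "latency": (225, 52)},
    "Network": {"bandwidth": (10, 10000), "latency": (3000, 190)},
    "Disk": {"bandwidth": (0.6, 86), "latency": (48.3, 5.7)},
}


def relative_improvements(entry):
    """Return (bandwidth improvement, latency improvement) as ratios.

    Bandwidth improves when it goes up, latency when it goes down,
    so the two ratios are taken in opposite directions.
    """
    bw_first, bw_last = entry["bandwidth"]
    lat_first, lat_last = entry["latency"]
    return bw_last / bw_first, lat_first / lat_last


for name, entry in MILESTONES.items():
    bw, lat = relative_improvements(entry)
    print(f"{name:9s} bandwidth x{bw:7.1f}   latency x{lat:5.1f}")
```

In every row the bandwidth ratio is far larger than the latency ratio, which is the point the chart makes.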
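The rule of thumb that latency improves only 20% to 30% while bandwidth doubles follows from the per-year improvement rates. A rough sketch under the assumption of steady compound annual improvement is shown below; the rates are the ones in the transcript's "Years to 2X BW" data (which uses 7% for DRAM latency), and the function names are hypothetical.

```python
import math

# Annual improvement rates (bandwidth, latency) from the transcript.
RATES = {
    "Networks": (0.80, 0.15),
    "Disks": (0.30, 0.10),
    "DRAM": (0.30, 0.07),
    "Processors": (0.70, 0.20),
}


def years_to_double(annual_rate: float) -> float:
    """Years for a quantity to double at a compound annual rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


def latency_gain(annual_rate: float, years: float) -> float:
    """Latency improvement factor accumulated over `years`."""
    return (1.0 + annual_rate) ** years


for name, (bw_rate, lat_rate) in RATES.items():
    years = years_to_double(bw_rate)
    gain = latency_gain(lat_rate, years)
    print(f"{name:10s} BW doubles in {years:.2f} years; latency x{gain:.2f}")
```

The computed values agree with the spreadsheet figures to within rounding (for example, about 2.64 years here versus 2.65 in the transcript for disks).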
