UPC at CRD/LBNL
UPC at CRD/LBNL
Kathy Yelick
Dan Bonachea, Jason Duell, Paul Hargrove, Parry Husbands, Costin Iancu, Mike Welcome, Christian Bell
-
What is UPC?
  UPC is an explicitly parallel language
    Global address space; can read/write remote memory
    Programmer control over layout and scheduling
    From Split-C, AC, PCP
Why a new language?
  Easier to use than MPI, especially for programs with complicated data structures
  Possibly faster on some machines, but current goal is comparable performance
  [Figure labels: p0, p1, p2]
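The slide shows no code. As a rough illustration of "programmer control over layout", the Python sketch below models the block-cyclic rule that assigns elements of a shared array to threads such as p0, p1, p2. The function name `owner_thread` and its parameters are hypothetical, chosen for this sketch; it is a model of the layout rule, not UPC code.

```python
def owner_thread(index, num_threads, block_size=1):
    """Thread that owns element `index` when blocks of `block_size`
    consecutive elements are dealt out to threads round-robin."""
    return (index // block_size) % num_threads


# A 9-element array over three threads (p0, p1, p2).
cyclic = [owner_thread(i, 3) for i in range(9)]                 # block size 1
blocked = [owner_thread(i, 3, block_size=3) for i in range(9)]  # block size 3

print("cyclic :", cyclic)
print("blocked:", blocked)
```

With block size 1 neighbouring elements live on different threads; with block size 3 each thread holds one contiguous chunk, so the same loop generates very different remote traffic depending on the layout chosen.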
-
Background
  UPC efforts elsewhere
    IDA: Bill Carlson, UPC promoter
    GMU (documentation) and UMC (benchmarking)
    HP (Alpha cluster and C+MPI compiler (with MTU))
    Cray (implementations)
    Intrepid (SGI and T3E compiler)
  UPC Book: T. El-Ghazawi, B. Carlson, T. Sterling, K. Yelick
    3 chapters in draft form; goal is to have proofs by SC03
  Three components of NERSC effort
    Compilers for DOE machines (SP and PC clusters)
    Runtime systems for ours and other compilers
    Applications and benchmarks
-
UPC Funding
  Base program funding K52004
    Compiler/translator work
    Applications
    Runtime for DOE machines
  Part of Pmodels Center K52018
    Runtime support common to Titanium (and hopefully CoArray Fortran, at some point)
    Collaboration with ARMCI group
  NSA funding
    UPC for clusters
-
Compiler Status
  NERSC compiler/translator
    Costin Iancu and Wei Chen
    Translates UPC to C + Berkeley UPC Runtime
    Based on Open64 compiler for C
  Status
    Complete in prototype form
    Debugging, tuning, extensions ongoing
    Release planned for next month: Quadrics, Myrinet, IBM/SP, and MPI
    Shared memory/process implementation is next
  Investigating optimization opportunities
    Communication optimizations
    UPC language optimizations
-
UPC Compiler
  Compiler based on Open64
    Multiple front-ends, including gcc
    Intermediate form called WHIRL
  Leverage standard optimizations and analyses
    Pointer analysis
    Loop optimizations
  Current focus on C backend
    IA64 possible in future
  UPC Runtime built on GASNet
    Portable
    Language-independent
  [Diagram labels: UPC; Higher WHIRL; Optimizing transformations; Lower WHIRL; C + Runtime; Assembly: IA64, MIPS, + Runtime]
-
Portable Runtime Support
  Developing a runtime layer that can be easily ported and tuned to multiple architectures.
  Runtime: Global pointers (opaque type with rich set of pointer operations), memory management, job startup, etc.
  GASNet Extended API: Supports put, get, locks, barrier, bulk, scatter/gather
  GASNet Core API: Small interface based on Active Messages
  Generic support for UPC, CAF, Titanium
  Core sufficient for functional implementation
  Direct implementations of parts of full GASNet
  GASNet released 1/03
-
Communication Optimizations
  Characterizing performance of current machines
    Latency, overlap (communication & computation)
  Plan to optimize automatically using a communication performance model
  Preliminary results: 10x improvement on Matmul
[Chart: Rec Overhead (Alone), Send & Rec Overhead, Send Overhead (Alone), Added Latency per machine; axis in usec]
Overview
Each worksheet should contain all the data for one machine, with different network layers (MPI vs. LAPI, for example) treated as separate machines
The data is:
1) The ping pong test for 1 Byte through 131072 (128K) in powers of two
2) The loop-back ping pong test for 1 Byte through 131072 (128K) in powers of two
This is not actually in the spreadsheet yet, because I don't have much data, but I'll add it later.
3) The flood test for a set of reasonable queue depths. Different machines may explore a different set of queue depths,
but for now let's at least report queue depths of 1,2,4,8, and 16. This uses the same sizes as the ping pong test.
The table on the right side of the ping-pong test is a link to the flood test data with the best overall queue depth.
4) Overlap data for 8-byte messages. This data has the following form:
Queue depth | CPU Loops | CPU Time (usec) | Inverse Bandwidth (usec) | Time to solution (sec)
CPU Time can be computed from CPU Loops, as long as you give me the scaling factor (usec/loop)
Use the same queue depths as in 3 (the flood test)
Please include a 0-CPU loop example for sanity, which should match the flood test for 8 bytes
For a canonical example of a sheet with no data, check out the "Template" worksheet.
The overlap numbers don't have a complete table to fill in, because what are reasonable cycle counts
depends on your machine.
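As a small sketch of the sweep described above (not the actual test harness), the Python below enumerates the message sizes and queue depths and converts CPU loops to CPU time. The names `message_sizes`, `QUEUE_DEPTHS` and `cpu_time_usec` are made up for this illustration, and the scaling factor in the example is an arbitrary placeholder, since the real usec/loop value is machine specific.

```python
def message_sizes(max_bytes=131072):
    """Message sizes from 1 byte up to max_bytes (128K) in powers of two."""
    sizes = []
    n = 1
    while n <= max_bytes:
        sizes.append(n)
        n *= 2
    return sizes


# Queue depths every machine is asked to report for the flood test.
QUEUE_DEPTHS = [1, 2, 4, 8, 16]


def cpu_time_usec(cpu_loops, usec_per_loop):
    """CPU time computed from CPU loops and a per-machine scaling factor."""
    return cpu_loops * usec_per_loop


sizes = message_sizes()
print(len(sizes), "sizes from", sizes[0], "to", sizes[-1], "bytes")
print("flood test points:", len(sizes) * len(QUEUE_DEPTHS))
# Hypothetical scaling factor of 0.5 usec/loop, for illustration only.
print("CPU time for 100 loops:", cpu_time_usec(100, 0.5), "usec")
```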
Who is responsible for which numbers (in each case MPI and anything else that exists):
  T3E           Costin
  SP            Mike
  Compaq        Jason
  LLNL ??       Jason
  Myrinet 2KL   Christian and Yannick
  Giganet       Paul and Mike
  SysKonnect    Paul and Mike
Update notes
  7-Apr    updated quadrics shmem (put and get) overlap 4/7
  7-Apr    updated t3e shmem overlap 4/7
  7-Apr    updated ibm lapi
  8-Apr    updated millennium gm results
  9-Apr    updated ibm mpi data
  4/9/02   added dolphin data
Checking flood data (check ones in flood that are not E32)
LatencySummary
Column groups: RTT | EEL | LogP Parameters | Lat/BW Model | For graphing logP | Cost per Byte | Speedup Over 1-Word Blocking
Columns: machine | Blocking (roundtrip) | End-to-end latency (1-way) | Gap (g) | Send Overhead | Receive Overhead | LogP L | g-os | a | b | a/b | beta (usec per KB) | Rec Overhead (Alone) | Send & Rec Overhead | Added Latency | Send Overhead (Alone) | Cross-check | Blocking | Overlapped Computation | Overlapped Communication | Large Message | Blocking | Overlapped Computation | Overlapped Communication | Large Message | Old Gap
T3E/Shm2.061.031.2060.850.000.180.371.20.0034113.0396562500.000.180.850.000.25750.110.1523750.0031x2x2x87x1.22
T3E/E-Reg2.491.2450.1730.0501.200.1200.001.200.050.000.311250.010.0216250.0031x50x14x104x0.173
T3E/MPI13.056.5266.2486.054.96-4.490.686.70.00321923.14557812504.490.001.570.001.63151.380.8416250.0031x1x2x531x6.73
IBM/LAPI42.9521.47259.4279.932.149.41-0.449.40.00335522.71782812520.009.419.930.005.3681251.511.1860.0031x4x5x2023x9.49
IBM/MPI39.0019.49857.5217.745.346.420.477.60.00419224.02904687550.006.427.740.004.8746251.641.0256250.0041x3x5x1239x8.21
Quadrics/Get5.632.8153.2670.620.002.205.883.30.0056555.1037500.002.200.620.00141.198138623615.55163.06754837130.00528330x3120x32717x1x6.50
Quadrics/Shm3.401.71.33650.430.011.261.071.30.0052665.188793945300.001.260.430.000.4250.050.187750.0051x8x2x84x1.50
Quadrics/MPI18.959.4737.2562.266.890.325.007.30.00514255.214773437570.000.322.260.002.368251.140.9070.0051x2x3x465x7.26
Myrinet/GM17.008.4997.6580.740.007.7612.917.70.00516184.847179687500.007.760.740.002.124750.091.706250.0051x23x1x449x13.65
Myrinet/MPI20.0810.04057.1738.182.58-0.72-1.017.20.00611906.173414062520.720.007.460.002.5101251.340.8966250.0061x2x3x416x7.17
GigE/VIPL23.3311.6634.555.894.980.7917.114.60.0085428.60062550.000.795.892.915751.362.8748750.0081x2x1x347x23.00
GigE/MPI36.2218.115.8548.047.172.9017.935.90.0096718.9332812570.002.908.040.004.52751.903.2463750.0091x2x1x519x25.97
blue boxes are not links -- entered directly
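The relations between the measured columns and the LogP columns are not spelled out in the sheet. The sketch below assumes the ones that reproduce the T3E/Shm row: end-to-end latency is half the blocking round trip, LogP L is that latency minus both overheads, and g-os uses the "Old Gap" value. The function name `logp_from_measurements` is hypothetical, and the input numbers are read from the flattened T3E/Shm row.

```python
def logp_from_measurements(round_trip, send_overhead, recv_overhead, gap):
    """Derive LogP-style parameters (all in usec) from measured values.

    Assumed relations, inferred from the T3E/Shm row:
      EEL  = round_trip / 2
      L    = EEL - send_overhead - recv_overhead
      g-os = gap - send_overhead
    """
    eel = round_trip / 2.0
    return {
        "EEL": eel,
        "L": eel - send_overhead - recv_overhead,
        "g_minus_os": gap - send_overhead,
    }


# T3E/Shm: round trip 2.06, send overhead 0.85, receive overhead 0.00, old gap 1.22
t3e_shm = logp_from_measurements(2.06, 0.85, 0.00, 1.22)
for name, value in t3e_shm.items():
    print(f"{name}: {value:.2f} usec")
```

For rows such as T3E/MPI the same arithmetic gives a negative L, which the sheet also reports; that happens when the measured overheads together exceed the one-way latency.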
Graph 1
Graph 1: Latency only
Graph 2: Latency and overheads
Graph 3: Gap (qd=1) and overheads
Graph 4: Gap (qd=1) and overhead (2nd option)
Graph 5: Cost per message (overlapping large msgs)
Graph 6: Cost per byte (large message)
Graph 7: Alpha and Beta on 1 graph (alternate to Graph 5 & 6)
Graph 8: Message size at the cross-over point between latency-dominated and bandwidth-dominated cost (alpha/beta)
Graph 9: Cost per byte in various modes
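To make the alpha/beta cross-over of Graph 8 concrete: if a message of n bytes costs roughly alpha + beta*n, the fixed cost and the per-byte cost are equal at n = alpha/beta. The sketch below computes that size for T3E/Shm using the a and b values that appear in the Bandwidth worksheet; `crossover_bytes` is a name invented for this example.

```python
def crossover_bytes(alpha_usec, beta_usec_per_byte):
    """Message size at which the per-message cost (alpha) equals the
    per-byte cost (beta * n), i.e. n = alpha / beta."""
    return alpha_usec / beta_usec_per_byte


# T3E/Shm fit parameters from the Bandwidth worksheet: a = 1.219, b = 0.0029684143
n_star = crossover_bytes(1.219, 0.0029684143)
print(f"T3E/Shm cross-over: about {n_star:.0f} bytes")
```

Below this size latency dominates, so aggregating messages pays off; above it the cost is mostly bandwidth.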
Table
Machine        Blocking (2*EEL)   Overlap (os+or)   Pipelined (g)   Large Msg (G)
T3E/Shm        1x                 2x                2x              87x
T3E/E-Reg      1x                 50x               14x             104x
T3E/MPI        1x                 1x                2x              531x
IBM/LAPI       1x                 4x                5x              2023x
IBM/MPI        1x                 3x                5x              1239x
Quadrics/Shm   1x                 8x                2x              84x
Quadrics/MPI   1x                 2x                3x              465x
Myrinet/GM     1x                 23x               1x              449x
Myrinet/MPI    1x                 2x                3x              416x
GigE/VIPL      1x                 2x                1x              347x
GigE/MPI       1x                 2x                1x              519x
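The speedup columns can be reproduced, at least for T3E/Shm, by comparing per-byte costs against blocking on a single word. The sketch below assumes an 8-byte word and the formulas in the column headers (2*EEL, os+or, g, G); both the word size and the helper name `speedups` are assumptions made for this illustration, and the inputs are the T3E/Shm values from the worksheets.

```python
WORD_BYTES = 8  # assumed word size; it reproduces the T3E/Shm row


def speedups(round_trip, send_overhead, recv_overhead, gap, per_byte_cost):
    """Speedup of each mode over one-word blocking, using per-byte costs.

    Note: an overhead sum of exactly zero would divide by zero here.
    """
    blocking = round_trip / WORD_BYTES
    costs = {
        "blocking": blocking,                                   # 2*EEL
        "overlap": (send_overhead + recv_overhead) / WORD_BYTES,  # os+or
        "pipelined": gap / WORD_BYTES,                          # g
        "large_msg": per_byte_cost,                             # G
    }
    return {mode: blocking / cost for mode, cost in costs.items()}


# T3E/Shm: round trip 2.06, os 0.85, or 0.00, gap 1.219, G 0.0029684143 usec/byte
t3e = speedups(2.06, 0.85, 0.00, 1.219, 0.0029684143)
print({mode: f"{s:.0f}x" for mode, s in t3e.items()})
```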
[Chart: Rec Overhead (Alone), Send & Rec Overhead, Send Overhead (Alone), Added Latency; axis in usec]
[Chart: Gap (g), Send Overhead, Receive Overhead; axis in usec]
[Chart: Send Overhead, g-os; axis in usec]
[Chart: End-to-end latency (1-way); axis in usec]
[Chart: Per message cost (g); axis in usec]
[Chart: Per Byte cost (G); axis in usec]
[Chart: Per Message Cost (g), Per KByte Cost (G*1024); axis in usec]
[Chart: Cross-over between g and G; axis in Bytes]
[Chart: measured gap (usec) versus message size, 8 bytes to 128 KB, one series per machine: T3E/Shm, T3E/MPI, IBM/LAPI, IBM/MPI, Quadrics/Shm, Quadrics/MPI, Myrinet/GM, Myrinet/MPI, GigE/VIPL, GigE/MPI]
[Chart: modeled gap (usec) versus message size, 8 bytes to 128 KB, same machines]
Bandwidth
machine | size | queue depth | cpu loops | cpu time | gap | bandwidth | usec/Byte | a | b | model | Error | RelErr
T3E/Shm11.2190.00296841431.2219684143
T3E/Shm21.2190.00296841431.2249368286
T3E/Shm41.2190.00296841431.2308736572
T3E/Shm81001.2196.25873218310.1521.2190.00296841431.24274731450.02374731452%Sum of gapmachine
T3E/Shm161001.27112.00534151260.0791.2190.00296841431.26649462890.00450537110%sizeT3E/ShmT3E/MPIIBM/LAPIIBM/MPIQuadrics/ShmQuadrics/MPIMyrinet/GMMyrinet/MPIGigE/VIPLGigE/MPI
T3E/Shm321001.37122.25935676510.0431.2190.00296841431.31398925780.05701074224%8.001.226.739.437.561.357.267.667.174.555.85
T3E/Shm641001.50940.44741964880.0241.2190.00296841431.40897851560.10002148447%16.001.276.949.437.561.367.287.677.344.595.87
T3E/Shm1281001.42285.84410161740.0111.2190.00296841431.59895703120.176957031212%32.001.376.969.998.091.367.287.717.144.625.96
T3E/Shm2561001.67146.1919910180.0071.2190.00296841431.97891406250.308914062518%64.001.5111.9511.079.141.318.017.817.184.736.04
T3E/Shm5121002.109231.52264106210.0041.2190.00296841432.7388281250.62982812530%128.001.4212.1912.7710.921.868.638.096.684.877.35
T3E/Shm10241003.481280.54079287560.0031.2190.00296841434.258656250.7776562522%256.001.6713.548.2212.802.959.727.857.015.1910.52
T3E/Shm20481007.584257.53230485230.0041.2190.00296841437.29831250.28568754%512.002.1113.988.688.526.897.867.617.996.1011.75
T3E/Shm409610012.531311.72691724520.0031.2190.002968414313.3776250.8466257%1024.003.4815.5212.0113.119.4910.049.519.348.8320.85
T3E/Shm819210024.719316.05242930540.0031.2190.002968414325.536250.817253%2048.007.5818.5213.5916.6514.3915.0913.7113.1917.4827.99
T3E/Shm1638410048.932319.32068993710.0031.2190.002968414349.85350.92152%4096.0012.5325.4215.5124.7124.2625.9522.2429.1234.4046.81
T3E/Shm3276810097.269321.27399274180.0031.2190.002968414398.4881.2191%8192.0024.7237.1624.2249.0044.6246.6841.7450.5668.8171.48
T3E/Shm65536100195.377319.89435808720.0031.2190.0029684143195.7570.380%16384.0048.9361.5245.1881.6185.4791.3881.21150.08137.68147.51
T3E/Shm131072100390.392320.19098752020.0031.2190.0029684143390.2950.0970%32768.0097.27110.1593.34143.17167.85176.45158.90241.60285.87
T3E/MPI11006.7150.1426.7156.7330.00307185366.73607185360.02107185360%65536.00195.38206.95183.14268.78332.63342.60312.68424.56639.59
T3E/MPI21006.7260.2843.3636.7330.00307185366.73914370730.01314370730%131072.00390.39402.63347.88515.72664.17667.49620.44790.201293.76
T3E/MPI41006.7290.5671.6826.7330.00307185366.74528741460.01628741460%
T3E/MPI81006.7331.1330.8426.7330.00307185366.75757482910.02457482910%
T3E/MPI161006.9362.20.4346.7330.00307185366.78214965820.15385034182%
T3E/MPI321006.964.3850.2186.7330.00307185366.83129931640.12870068362%
T3E/MPI6410011.9455.110.1876.7330.00307185366.92959863285.015401367242%
T3E/MPI12810012.19210.0120.0956.7330.00307185367.12619726565.065802734442%Sum of modelmachine
T3E/MPI25610013.54418.0250.0536.7330.00307185367.51939453126.024605468844%sizeT3E/ShmT3E/MPIIBM/LAPIIBM/MPIQuadrics/ShmQuadrics/MPIMyrinet/GMMyrinet/MPIGigE/VIPLGigE/MPI
T3E/MPI51210013.98234.9220.0276.7330.00307185368.30578906255.676210937541%8.001.246.769.457.591.397.307.707.224.625.92
T3E/MPI102410015.52262.9160.0156.7330.00307185369.8785781255.64342187536%16.001.276.789.477.621.437.347.737.274.685.99
T3E/MPI204810018.52105.4580.0096.7330.003071853613.024156255.4958437530%32.001.316.839.517.691.517.427.817.374.826.13
T3E/MPI409610025.422153.6540.0066.7330.003071853619.31531256.106687524%64.001.416.939.607.811.677.587.967.565.096.41
T3E/MPI819210037.155210.2690.0056.7330.003071853631.8976255.25737514%128.001.607.139.778.062.007.918.267.945.636.97
T3E/MPI1638410061.521253.9770.0046.7330.003071853657.062254.458757%256.001.987.5210.118.572.648.568.878.726.708.09
T3E/MPI32768100110.15283.7040.0036.7330.0030718536107.39152.75853%512.002.748.3110.799.583.949.8610.0810.268.8510.32
T3E/MPI65536100206.951302.0030.0036.7330.0030718536208.051.0991%1024.004.269.8812.1411.596.5412.4712.5113.3513.1514.79
T3E/MPI131072100402.634310.4560.0036.7330.0030718536409.3676.7332%2048.007.3013.0214.8615.6211.7217.6917.3519.5221.7523.72
IBM/LAPI18009.410.1019.4109.4270.0026541299.4296541290.0196541290%4096.0013.3819.3220.3023.6822.1028.1227.0531.8738.9541.59
IBM/LAPI28009.4250.2024.7139.4270.0026541299.43230825810.00730825810%8192.0025.5431.9031.1739.7942.8648.9746.4456.5673.3677.32
IBM/LAPI48009.4080.4052.3529.4270.0026541299.43761651610.02961651610%16384.0049.8557.0652.9172.0384.3790.6985.21105.95142.16148.79
IBM/LAPI88009.4270.8091.1789.4270.0026541299.44823303220.02123303220%32768.0098.49107.3996.40136.49167.39174.13162.77204.72279.77291.72
IBM/LAPI168009.4341.6170.5909.4270.0026541299.46946606450.03546606450%65536.00195.76208.05183.37265.42333.43341.00317.88402.27554.99577.58
IBM/LAPI328009.9933.0540.3129.4270.0026541299.51193212890.48106787115%131072.00390.30409.37357.31523.28665.51674.75628.10797.371105.431149.31
IBM/LAPI6480011.075.5130.1739.4270.0026541299.59686425781.473135742213%
IBM/LAPI12880012.7659.5630.1009.4270.0026541299.76672851562.998271484423%
IBM/LAPI2568008.22329.6910.0329.4270.00265412910.10645703121.883457031223%
IBM/LAPI5128008.67956.260.0179.4270.00265412910.78591406252.106914062524%
IBM/LAPI102480012.01281.30.0129.4270.00265412912.1448281250.1328281251%machineabMaxErrMaxRelErr
IBM/LAPI204880013.586143.7570.0079.4270.00265412914.862656251.276656259%T3E/Shm1.20.0031.230%
IBM/LAPI409680015.505251.930.0049.4270.00265412920.29831254.793312531%
IBM/LAPI819280024.223322.520.0039.4270.00265412931.1696256.94662529%T3E/MPI6.70.0036.744%
IBM/LAPI1638480045.179345.8470.0039.4270.00265412952.912257.7332517%IBM/LAPI9.40.0039.431%
IBM/LAPI3276880093.335334.8150.0039.4270.00265412996.39753.06253%IBM/MPI7.60.0049.633%
IBM/LAPI65536800183.144341.2610.0039.4270.002654129183.3680.2240%Quadrics/Get3.30.0053.30.1
IBM/LAPI131072800347.882359.3170.0039.4270.002654129357.3099.4273%Quadrics/Shm1.30.0053.00.4
IBM/MPI18007.5530.1267.5537.5610.00393461617.56493461610.01193461610%Quadrics/MPI7.30.0057.326%
IBM/MPI28007.5590.2523.7807.5610.00393461617.56886923220.00986923220%Myrinet/GM7.70.0057.733%
IBM/MPI48007.5480.5051.8877.5610.00393461617.57673846440.02873846440%Myrinet/MPI7.20.00644.148%
IBM/MPI88007.5611.0090.9457.5610.00393461617.59247692870.03147692870%Dolphin/MPI7.80.005157.529%
IBM/MPI168007.5622.0180.4737.5610.00393461617.62395385740.06195385741%Giganet/VIPL3.00.0103.017%
IBM/MPI328008.0893.7720.2537.5610.00393461617.68690771480.40209228525%GigE/VIPL4.60.0084.649%
IBM/MPI648009.146.6780.1437.5610.00393461617.81281542971.327184570315%GigE/MPI5.8540.0087239075144.429%
IBM/MPI12880010.91711.1810.0857.5610.00393461618.06463085942.852369140626%
IBM/MPI25680012.80319.0690.0507.5610.00393461618.56826171884.234738281233%Sum of usec/Bytemachine
IBM/MPI5128008.52157.3050.0177.5610.00393461619.57552343751.054523437512%sizeT3E/ShmT3E/MPIIBM/LAPIIBM/MPIQuadrics/ShmQuadrics/MPIMyrinet/MPIMyrinet/GMGigE/VIPLGigE/MPIGrand Total
IBM/MPI102480013.10874.5010.0137.5610.003934616111.5900468751.51795312512%16.7159.417.5537.2837.1667.6464.4165.76455.953
IBM/MPI204880016.654117.2780.0087.5610.003934616115.619093751.034906256%23.3634.71253.77953.61553.5893.82352.19952.90327.9855
IBM/MPI409680024.708158.0950.0067.5610.003934616123.67718751.03081254%41.682252.3521.8871.815251.790251.91251.102251.4567513.99825
IBM/MPI819280049.002159.4310.0067.5610.003934616139.7933759.20862519%80.1523750.8416251.1783750.9451250.1683281250.9070.8966250.957250.568750.731757.347203125
IBM/MPI1638480081.609191.4630.0057.5610.003934616172.025759.5832512%160.07943750.43350.5896250.4726250.084968750.45506250.4586250.479250.28693750.3668753.70690625
IBM/MPI32768800143.167218.2770.0047.5610.0039346161136.49056.67655%320.042843750.21750.312281250.252781250.04239843750.22743750.2231250.240906250.14443750.18631251.8900234375
IBM/MPI65536800268.781232.5320.0047.5610.0039346161265.423.3611%640.0235781250.1866406250.172968750.14281250.02050.1251093750.11218750.1219843750.0739218750.094406251.074109375
IBM/MPI131072800515.718242.3810.0047.5610.0039346161523.2797.5611%1280.0111093750.095250.09972656250.08528906250.01450292970.06739843750.052156250.0632031250.0380156250.05743750.5840888672
Quadrics/Shm18001.3466250.00506718161.35169218162560.00652343750.052906250.03212109380.05001171880.01151953120.03797656250.02739843750.03067968750.0202656250.04108593750.3104882812
Quadrics/Shm28001.3466250.00506718161.35675936325120.00411914060.02730859380.01695117190.01664257810.01345336910.01534179690.01560351560.01485351560.01191406250.02294335940.1591311035
Quadrics/Shm48001.3466250.00506718161.366893726310240.00339941410.01515820310.01173046880.01280078120.00926440430.00980761720.00911914060.00928222660.00862304690.02036132810.1095466309
Quadrics/Shm88001.3466255.66556727470.1681.3466250.00506718161.38716245270.04053745273%20480.0037031250.00904296880.00663378910.00813183590.00702691650.00736621090.00644238280.00669482420.00853369140.01366894530.0772446899
Quadrics/Shm168001.359511.2238242460.0851.3466250.00506718161.42769990540.06819990545%40960.00305932620.0062065430.00378540040.00603222660.00592221070.00633544920.00710961910.00542968750.00839941410.01142749020.0637073669
Quadrics/Shm328001.3567522.49314768750.0421.3466250.00506718161.50877481080.152024810811%81920.00301745610.00453552250.00295690920.00598168950.00544703670.00569812010.00617199710.00509533690.00839904790.00872534180.0560284576
Quadrics/Shm648001.31246.52069836130.0211.3466250.00506718161.67092462160.358924621627%163840.00298657230.00375494380.00275750730.00498101810.00521669770.00557739260.00916033940.00495635990.00840332030.00900299070.056797142
Quadrics/Shm1288001.85637565.7573564070.0151.3466250.00506718161.99522424320.13884924327%327680.00296841430.00336151120.00284835820.00436911010.00512229920.00538476560.00737298580.00484909060.00872390750.0450004425
Quadrics/Shm2568002.94982.78759749070.0121.3466250.00506718162.64382348630.305176513710%655360.00298121640.00315782170.00279455570.00410127260.00507548330.00522767640.00647825620.00477107240.00975936890.0443467236
Quadrics/Shm5128006.88812570.8873967880.0131.3466250.00506718163.94102197272.947103027343%1310720.00297845460.00307185360.0026541290.00393461610.00506718160.00509255220.00602872470.00473357390.00987059780.0434316835
Quadrics/Shm10248009.48675102.93962632090.0091.3466250.00506718166.53541894532.951331054731%Grand Total0.34508030713.663269836418.912709945715.23511965940.403813372614.599565956114.388854148915.33593912518.90435070811.7161005173113.5048035765
Quadrics/Shm204880014.391125135.71732578240.0071.3466250.005067181611.72421289062.666912109419%
Quadrics/Shm409680024.257375161.03350012110.0061.3466250.005067181622.10180078122.15557421889%
Quadrics/Shm819280044.622125175.08130775930.0051.3466250.005067181642.85697656251.76514843754%
Quadrics/Shm1638480085.470375182.81188072480.0051.3466250.005067181684.3673281251.1030468751%
Quadrics/Shm32768800167.8475186.18090826490.0051.3466250.0050671816167.388031250.459468750%
Quadrics/Shm65536800332.626875187.89822680440.0051.3466250.0050671816333.42943750.80256250%
Quadrics/Shm131072800664.165625188.20606682260.0051.3466250.0050671816665.512251.3466250%
Quadrics/Get18003.2670.00498413093.2719841309
Quadrics/Get28003.2670.00498413093.2769682617
Quadrics/Get48003.2670.00498413093.2869365234
Quadrics/Get88003.2672.33529064320.4083.2670.00498413093.30687304690.03987304691%
Quadrics/Get168003.2664.6720113480.2043.2670.00498413093.34674609380.08074609372%
Quadrics/Get328003.2639.35261358410.1023.2670.00498413093.42649218750.16349218755%
Quadrics/Get648003.2718.66518539760.0513.2670.00498413093.5859843750.31598437510%
Quadrics/Get1288003.75132.5434050920.0293.2670.00498413093.904968750.153968754%
Quadrics/Get2568004.3456.25360023040.0173.2670.00498413094.54293750.20293755%
Quadrics/Get5128005.52888.32873552820.0113.2670.00498413095.8188750.2908755%
Quadrics/Get10248007.855124.32367918520.0083.2670.00498413098.370750.515757%
Quadrics/Get204880013.197147.99765098130.0063.2670.004984130913.47450.27752%
Quadrics/Get409680022.863170.85465599440.0063.2670.004984130923.6820.8194%
Quadrics/Get819280042.931181.97805781370.0053.2670.004984130944.0971.1663%
Quadrics/Get1638480083.859186.32466401940.0053.2670.004984130984.9271.0681%
Quadrics/Get32768800164.928189.47662010090.0053.2670.0049841309166.5871.6591%
Quadrics/Get65536800326.64191.34215037960.0053.2670.0049841309329.9073.2671%
Quadrics/Get131072800654.201191.07277426970.0053.2670.0049841309656.5472.3460%
Quadrics/MPI18007.2830.1317.2837.2560.00509255227.26109255220.02190744780%
Quadrics/MPI28007.2310.2643.6167.2560.00509255227.26618510440.03518510440%
Quadrics/MPI48007.2610.5251.8157.2560.00509255227.27637020870.01537020870%
Quadrics/MPI88007.2561.0510.9077.2560.00509255227.29674041750.04074041751%
Quadrics/MPI168007.2812.0960.4557.2560.00509255227.3374808350.0564808351%
Quadrics/MPI328007.2784.1930.2277.2560.00509255227.41896166990.14096166992%
Quadrics/MPI648008.0077.6230.1257.2560.00509255227.58192333980.42507666025%
Quadrics/MPI1288008.62714.1510.0677.2560.00509255227.90784667970.71915332038%
Quadrics/MPI2568009.72225.1130.0387.2560.00509255228.55969335941.162306640612%
Quadrics/MPI5128007.85562.1590.0157.2560.00509255229.86338671882.008386718826%
Quadrics/MPI102480010.04397.2350.0107.2560.005092552212.47077343752.427773437524%
Quadrics/MPI204880015.086129.4630.0077.2560.005092552217.6855468752.59954687517%
Quadrics/MPI409680025.95150.5310.0067.2560.005092552228.115093752.165093758%
Quadrics/MPI819280046.679167.3680.0067.2560.005092552248.97418752.29518755%
Quadrics/MPI1638480091.38170.9890.0067.2560.005092552290.6923750.6876251%
Quadrics/MPI32768800176.448177.1060.0057.2560.0050925522174.128752.319251%
Quadrics/MPI65536800342.601182.4280.0057.2560.0050925522341.00151.59950%
Quadrics/MPI131072800667.491187.2680.0057.2560.0050925522674.7477.2561%
Dolphin/MPI11007.1310.1347.1317.7670.00529222117.77229222110.64129222119%
Dolphin/MPI21007.1660.2663.5837.7670.00529222117.77758444210.61158444219%
Dolphin/MPI41007.1480.5341.7877.7670.00529222117.78816888430.64016888439%
Dolphin/MPI81007.7670.9820.9717.7670.00529222117.80933776860.04233776861%
Dolphin/MPI161007.7571.9670.4857.7670.00529222117.85167553710.09467553711%
Dolphin/MPI321007.9873.8210.2507.7670.00529222117.93635107420.05064892581%
Dolphin/MPI641009.8556.1930.1547.7670.00529222118.10570214841.749297851618%
Dolphin/MPI12810010.43811.6940.0827.7670.00529222118.44440429691.993595703119%
Dolphin/MPI25610011.57821.0870.0457.7670.00529222119.12180859382.456191406221%
Dolphin/MPI51210013.79935.3850.0277.7670.005292221110.47661718753.322382812524%
Dolphin/MPI102410018.54152.670.0187.7670.005292221113.1862343755.35476562529%
Dolphin/MPI204810023.54882.9420.0117.7670.005292221118.605468754.9425312521%
Dolphin/MPI409610034.177114.2950.0087.7670.005292221129.44393754.733062514%
Dolphin/MPI819210054.767142.6510.0077.7670.005292221151.1208753.6461257%
Dolphin/MPI1638410096.044162.6860.0067.7670.005292221194.474751.569252%
Dolphin/MPI32768100178.029175.5330.0057.7670.0052922211181.18253.15352%
Dolphin/MPI65536100346.831180.2030.0057.7670.0052922211354.5987.7672%
Dolphin/MPI131072100858.924145.5310.0077.7670.0052922211701.429157.49518%
Myrinet/GM116007.6460.1317.6467.6580.00473357397.66273357390.01673357390%
Myrinet/GM216007.6470.2623.8247.6580.00473357397.66746714780.02046714780%
Myrinet/GM416007.650.5231.9137.6580.00473357397.67693429570.02693429570%
Myrinet/GM816007.6581.0450.9577.6580.00473357397.69586859130.03786859130%
Myrinet/GM1616007.6682.0870.4797.6580.00473357397.73373718260.06573718261%
Myrinet/GM3216007.7094.1510.2417.6580.00473357397.80947436520.10047436521%
Myrinet/GM6416007.8078.1980.1227.6580.00473357397.96094873050.15394873052%
Myrinet/GM12816008.0915.8230.0637.6580.00473357398.26389746090.17389746092%
Myrinet/GM25616007.85432.5940.0317.6580.00473357398.86979492191.015794921913%
Myrinet/GM51216007.60567.3260.0157.6580.004733573910.08158984382.476589843833%
Myrinet/GM102416009.505107.7270.0097.6580.004733573912.50517968753.000179687532%
Myrinet/GM2048160013.711149.3660.0077.6580.004733573917.3523593753.64135937527%
Myrinet/GM4096160022.24184.1760.0057.6580.004733573927.046718754.8067187522%
Myrinet/GM8192160041.741196.2590.0057.6580.004733573946.43543754.694437511%
Myrinet/GM16384160081.205201.7610.0057.6580.004733573985.2128754.0078755%
Myrinet/GM327681600158.895206.2250.0057.6580.0047335739162.767753.872752%
Myrinet/GM655361600312.677209.5960.0057.6580.0047335739317.87755.20052%
Myrinet/GM1310721600620.439211.2570.0057.6580.0047335739628.0977.6581%
Myrinet/MPI11007.1660.1337.1667.1730.00602872477.17902872470.01302872470%
Myrinet/MPI21007.1780.2663.5897.1730.00602872477.18505744930.00705744930%
Myrinet/MPI41007.1610.5331.7907.1730.00602872477.19711489870.03611489871%
Myrinet/MPI81007.1731.0640.8977.1730.00602872477.22122979740.04822979741%
Myrinet/MPI161007.3382.0790.4597.1730.00602872477.26945959470.06854040531%
Myrinet/MPI321007.144.2740.2237.1730.00602872477.36591918950.22591918953%
Myrinet/MPI641007.188.5010.1127.1730.00602872477.55883837890.37883837895%
Myrinet/MPI1281006.67618.2860.0527.1730.00602872477.94467675781.268676757819%
Myrinet/MPI2561007.01434.8080.0277.1730.00602872478.71635351561.702353515624%
Myrinet/MPI5121007.98961.1220.0167.1730.006028724710.25970703122.270707031228%
Myrinet/MPI10241009.338104.5780.0097.1730.006028724713.34641406254.008414062543%
Myrinet/MPI204810013.194148.0260.0067.1730.006028724719.5198281256.32582812548%
Myrinet/MPI409610029.121134.1370.0077.1730.006028724731.866656252.745656259%
Myrinet/MPI819210050.561154.5160.0067.1730.006028724756.56031255.999312512%
Myrinet/MPI16384100150.083104.1090.0097.1730.0060287247105.94762544.13537529%
Myrinet/MPI32768100241.598129.3470.0077.1730.0060287247204.7222536.8757515%
Myrinet/MPI65536100424.559147.2120.0067.1730.0060287247402.271522.28755%
Myrinet/MPI131072100790.197158.1880.0067.1730.0060287247797.377.1731%
Giganet/VIPL12002.9670.3212.9672.9670.00997344972.97697344970.00997344970%
Giganet/VIPL22002.9680.6421.4842.9670.00997344972.98694689940.01894689941%
Giganet/VIPL42002.9671.2850.7422.9670.00997344973.00689379880.03989379881%
Giganet/VIPL82002.9672.570.3712.9670.00997344973.04678759770.07978759773%
Giganet/VIPL162003.015.0680.1882.9670.00997344973.12657519530.11657519534%
Giganet/VIPL322003.199.5650.1002.9670.00997344973.28615039060.09615039063%
Giganet/VIPL642003.36318.1450.0532.9670.00997344973.60530078120.24230078127%
Giganet/VIPL1282004.21628.9540.0332.9670.00997344974.24360156250.02760156251%
Giganet/VIPL2562004.76951.190.0192.9670.00997344975.5202031250.75120312516%
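Giganet/VIPL | 512 | 2 | 0 | 0 | 6.884 | 70.923 | 0.013 | 2.967 | 0.0099734497 | 8.07340625 | 1.18940625 | 17%
Giganet/VIPL | 1024 | 2 | 0 | 0 | 12.127 | 80.528 | 0.012 | 2.967 | 0.0099734497 | 13.1798125 | 1.0528125 | 9%
Giganet/VIPL | 2048 | 2 | 0 | 0 | 21.543 | 90.662 | 0.011 | 2.967 | 0.0099734497 | 23.392625 | 1.849625 | 9%
Giganet/VIPL | 4096 | 2 | 0 | 0 | 41.631 | 93.829 | 0.010 | 2.967 | 0.0099734497 | 43.81825 | 2.18725 | 5%
Giganet/VIPL | 8192 | 2 | 0 | 0 | 82.455 | 94.748 | 0.010 | 2.967 | 0.0099734497 | 84.6695 | 2.2145 | 3%
Giganet/VIPL | 16384 | 2 | 0 | 0 | 163.8 | 95.39 | 0.010 | 2.967 | 0.0099734497 | 166.372 | 2.572 | 2%
Giganet/VIPL | 32768 | 2 | 0 | 0 | 326.81 | 95.621 | 0.010 | 2.967 | 0.0099734497 | 329.777 | 2.967 | 1%
Giganet/VIPL | 65536 | 2 | 0 | 0 | | | | 2.967 | 0.0099734497 | 656.587 | |
Giganet/VIPL | 131072 | 2 | 0 | 0 | | | | 2.967 | 0.0099734497 | 1310.207 | |
GigE/VIPL | 1 | 8 | 0 | 0 | 4.416 | 0.216 | 4.416 | 4.55 | 0.0083990479 | 4.5583990479 | 0.1423990479 | 3%
GigE/VIPL | 2 | 8 | 0 | 0 | 4.399 | 0.434 | 2.200 | 4.55 | 0.0083990479 | 4.5667980957 | 0.1677980957 | 4%
GigE/VIPL | 4 | 8 | 0 | 0 | 4.409 | 0.865 | 1.102 | 4.55 | 0.0083990479 | 4.5835961914 | 0.1745961914 | 4%
GigE/VIPL | 8 | 8 | 0 | 0 | 4.55 | 1.677 | 0.569 | 4.55 | 0.0083990479 | 4.6171923828 | 0.0671923828 | 1%
GigE/VIPL | 16 | 8 | 0 | 0 | 4.591 | 3.324 | 0.287 | 4.55 | 0.0083990479 | 4.6843847656 | 0.0933847656 | 2%
GigE/VIPL | 32 | 8 | 0 | 0 | 4.622 | 6.602 | 0.144 | 4.55 | 0.0083990479 | 4.8187695312 | 0.1967695312 | 4%
GigE/VIPL | 64 | 8 | 0 | 0 | 4.731 | 12.901 | 0.074 | 4.55 | 0.0083990479 | 5.0875390625 | 0.3565390625 | 8%
GigE/VIPL | 128 | 8 | 0 | 0 | 4.866 | 25.084 | 0.038 | 4.55 | 0.0083990479 | 5.625078125 | 0.759078125 | 16%
GigE/VIPL | 256 | 8 | 0 | 0 | 5.188 | 47.058 | 0.020 | 4.55 | 0.0083990479 | 6.70015625 | 1.51215625 | 29%
GigE/VIPL | 512 | 8 | 0 | 0 | 6.1 | 80.041 | 0.012 | 4.55 | 0.0083990479 | 8.8503125 | 2.7503125 | 45%
GigE/VIPL | 1024 | 8 | 0 | 0 | 8.831 | 110.59 | 0.009 | 4.55 | 0.0083990479 | 13.150625 | 4.320625 | 49%
GigE/VIPL | 2048 | 8 | 0 | 0 | 17.477 | 111.752 | 0.009 | 4.55 | 0.0083990479 | 21.75125 | 4.27425 | 24%
GigE/VIPL | 4096 | 8 | 0 | 0 | 34.404 | 113.54 | 0.008 | 4.55 | 0.0083990479 | 38.9525 | 4.5485 | 13%
GigE/VIPL | 8192 | 8 | 0 | 0 | 68.805 | 113.546 | 0.008 | 4.55 | 0.0083990479 | 73.355 | 4.55 | 7%
GigE/VIPL | 16384 | 8 | 0 | 0 | 137.68 | 113.487 | 0.008 | 4.55 | 0.0083990479 | 142.16 | 4.48 | 3%
GigE/VIPL | 32768 | 8 | 0 | 0 | | | | 4.55 | 0.0083990479 | 279.77 | |
GigE/VIPL | 65536 | 8 | 0 | 0 | | | | 4.55 | 0.0083990479 | 554.99 | |
GigE/VIPL | 131072 | 8 | 0 | 0 | | | | 4.55 | 0.0083990479 | 1105.43 | |
GigE/MPI | 1 | 16 | 0 | 0 | 5.764 | 0.165 | 5.764 | 5.854 | 0.0087239075 | 5.8627239075 | 0.0987239075 | 2%
GigE/MPI | 2 | 16 | 0 | 0 | 5.806 | 0.329 | 2.903 | 5.854 | 0.0087239075 | 5.8714478149 | 0.0654478149 | 1%
GigE/MPI | 4 | 16 | 0 | 0 | 5.827 | 0.655 | 1.457 | 5.854 | 0.0087239075 | 5.8888956299 | 0.0618956299 | 1%
GigE/MPI | 8 | 16 | 0 | 0 | 5.854 | 1.303 | 0.732 | 5.854 | 0.0087239075 | 5.9237912598 | 0.0697912598 | 1%
GigE/MPI | 16 | 16 | 0 | 0 | 5.87 | 2.6 | 0.367 | 5.854 | 0.0087239075 | 5.9935825195 | 0.1235825195 | 2%
GigE/MPI | 32 | 16 | 0 | 0 | 5.962 | 5.119 | 0.186 | 5.854 | 0.0087239075 | 6.1331650391 | 0.1711650391 | 3%
GigE/MPI | 64 | 16 | 0 | 0 | 6.042 | 10.102 | 0.094 | 5.854 | 0.0087239075 | 6.4123300781 | 0.3703300781 | 6%
GigE/MPI | 128 | 16 | 0 | 0 | 7.352 | 16.604 | 0.057 | 5.854 | 0.0087239075 | 6.9706601562 | 0.3813398438 | 5%
GigE/MPI | 256 | 16 | 0 | 0 | 10.518 | 23.212 | 0.041 | 5.854 | 0.0087239075 | 8.0873203125 | 2.4306796875 | 23%
GigE/MPI | 512 | 16 | 0 | 0 | 11.747 | 41.566 | 0.023 | 5.854 | 0.0087239075 | 10.320640625 | 1.426359375 | 12%
GigE/MPI | 1024 | 16 | 0 | 0 | 20.85 | 46.837 | 0.020 | 5.854 | 0.0087239075 | 14.78728125 | 6.06271875 | 29%
GigE/MPI | 2048 | 16 | 0 | 0 | 27.994 | 69.77 | 0.014 | 5.854 | 0.0087239075 | 23.7205625 | 4.2734375 | 15%
GigE/MPI | 4096 | 16 | 0 | 0 | 46.807 | 83.455 | 0.011 | 5.854 | 0.0087239075 | 41.587125 | 5.219875 | 11%
GigE/MPI | 8192 | 16 | 0 | 0 | 71.478 | 109.299 | 0.009 | 5.854 | 0.0087239075 | 77.32025 | 5.84225 | 8%
GigE/MPI | 16384 | 16 | 0 | 0 | 147.505 | 105.929 | 0.009 | 5.854 | 0.0087239075 | 148.7865 | 1.2815 | 1%
GigE/MPI | 32768 | 16 | 0 | 0 | 285.865 | 109.317 | 0.009 | 5.854 | 0.0087239075 | 291.719 | 5.854 | 2%
GigE/MPI | 65536 | 16 | 0 | 0 | 639.59 | 97.719 | 0.010 | 5.854 | 0.0087239075 | 577.584 | 62.006 | 10%
GigE/MPI | 131072 | 16 | 0 | 0 | 1293.759 | 96.618 | 0.010 | 5.854 | 0.0087239075 | 1149.314 | 144.445 | 11%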
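The last five columns of the rows above look like a two-parameter linear cost model: a fixed per-message cost plus a per-byte cost, then the predicted time, its absolute difference from the measurement, and that difference as a percentage. The sketch below reproduces one GigE/VIPL row under that reading. The function and parameter names are made up for illustration, and the interpretation of the columns is an inference from the numbers, not something the spreadsheet states.

```python
# Minimal sketch, assuming predicted = fixed cost + per-byte cost * message size.
# The constants are the GigE/VIPL values that repeat on every GigE/VIPL row.

GIGE_VIPL_FIXED_USEC = 4.55
GIGE_VIPL_PER_BYTE_USEC = 0.0083990479


def predict_time_usec(size_bytes, fixed_usec, per_byte_usec):
    """Linear model: time grows with message size from a fixed base cost."""
    return fixed_usec + per_byte_usec * size_bytes


def relative_error(measured_usec, predicted_usec):
    """Absolute difference between model and measurement, relative to the measurement."""
    return abs(predicted_usec - measured_usec) / measured_usec


if __name__ == "__main__":
    size, measured = 1024, 8.831  # GigE/VIPL row for 1024-byte messages
    predicted = predict_time_usec(size, GIGE_VIPL_FIXED_USEC, GIGE_VIPL_PER_BYTE_USEC)
    print(f"predicted {predicted:.6f} usec, error {relative_error(measured, predicted):.0%}")
```

For the 1024-byte row the model overshoots the measurement by roughly half, which matches the large percentages in the middle of the GigE/VIPL block: the fit is good for very small and very large messages and poor in between.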
T3E-MPI
Machine/Layer | blocking | qd=1 | qd=2 | qd=4 | qd=8 | qd=16 | Min
T3E/Shm | 2.06 | 1.219 | 1.22 | 1.212 | 1.209 | 1.206 | 1.206
T3E/MPI | 13.052 | 6.733 | 6.474 | 6.375 | 6.279 | 6.248 | 6.248
IBM/LAPI | 42.945 | 9.488 | 9.427 | 9.436 | 9.427 | 9.45 | 9.427
IBM/MPI | 38.997 | 8.205 | 7.906 | 7.673 | 7.562 | 7.521 | 7.521
Quadrics/Shm | 3.4 | 1.502 | 1.413 | 1.36525 | 1.346625 | 1.3365 | 1.3365
Quadrics/Get | 13.052 | 6.502 | 4.66 | 3.731 | 3.267 | 6.156 | 3.267
Quadrics/MPI | 18.946 | 10.521 | 7.345 | 7.256 | 7.458 | 10.677 | 7.256
Myrinet/GM | 16.998 | 13.65 | 8.727 | 8.694 | 8.007 | 7.658 | 7.658
Myrinet/MPI | 20.081 | 7.173 | 7.195 | 7.207 | 7.535 | 7.865 | 7.173
Dolphin/MPI | 13.949 | 7.767 | 6.222 | 5.801 | 5.555 | 5.866 |
Giganet/VIPL | 37.832 | 4.628 | 2.967 | 3.16 | 3.17 | 3.192 |
GigE/VIPL | 23.326 | 22.999 | 12.111 | 6.699 | 4.55 | 4.56 | 4.55
GigE/MPI | 36.22 | 25.971 | 13.044 | 7.5 | 6.154 | 5.854 | 5.854
This graph shows the cost of communication for small messages that may be pipelined (message vectorization).
Blocking is the full round trip.
The others use pipelining.
Observations:
1) On most machines, overlapping with qd=1 gives about a 2x speedup
2) Larger queue depths give significant further gains on some machines, especially M2K/GM
3) This is old data for Dolphin, M2K, LAPI, and others
4) Compaq/MPI data is missing
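The first two observations can be checked directly against the summary numbers. The sketch below divides the blocking cost by the qd=1 cost and by the best cost over all queue depths. The values are copied from the summary rows; Dolphin/MPI and Giganet/VIPL are left out because their rows are missing a value, and the names `COSTS_USEC`, `speedup`, and `speedup_table` are illustrative only.

```python
# Per-message cost in usec, copied from the summary rows:
# (blocking round trip, pipelined with qd=1, minimum over all queue depths).
COSTS_USEC = {
    "T3E/Shm": (2.06, 1.219, 1.206),
    "T3E/MPI": (13.052, 6.733, 6.248),
    "IBM/LAPI": (42.945, 9.488, 9.427),
    "IBM/MPI": (38.997, 8.205, 7.521),
    "Quadrics/Shm": (3.4, 1.502, 1.3365),
    "Quadrics/Get": (13.052, 6.502, 3.267),
    "Quadrics/MPI": (18.946, 10.521, 7.256),
    "Myrinet/GM": (16.998, 13.65, 7.658),
    "Myrinet/MPI": (20.081, 7.173, 7.173),
    "GigE/VIPL": (23.326, 22.999, 4.55),
    "GigE/MPI": (36.22, 25.971, 5.854),
}


def speedup(blocking_usec, pipelined_usec):
    """How many times cheaper a pipelined message is than a blocking one."""
    return blocking_usec / pipelined_usec


def speedup_table(costs):
    """Map each machine/layer to (speedup at qd=1, speedup at the best queue depth)."""
    return {
        name: (speedup(blocking, qd1), speedup(blocking, best))
        for name, (blocking, qd1, best) in costs.items()
    }


if __name__ == "__main__":
    for name, (at_qd1, at_best) in speedup_table(COSTS_USEC).items():
        print(f"{name:13s} qd=1: {at_qd1:5.2f}x   best qd: {at_best:5.2f}x")
```

The gap between the two columns is what observation 2 is about: where the qd=1 speedup is small but the best-depth speedup is large, deeper queues matter.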
T3E-MPI
[Chart: series qd=1, qd=2, qd=4, qd=8, qd=16; axis in usec]
T3E-Shmem
Machine Name/Message Layer (blank template sheet, updated 1/1/02; no measurements recorded)
Ping Pong: message sizes 1 to 131072 bytes; summary fields PingPong, Gap, O_Send, O_Rec, Latency
Flood: message sizes 1 to 131072 bytes at queue depths 1, 2, 4, 8, 16
Overlap: 8-byte messages at queue depths 1 to 1024 (powers of two), CPU loops 0 to 370, loop increment 10
Columns: Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
Millennium-MPI
IBM/MPI
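Ping Pong
PingPong | Gap | O_Send | O_Rec | Latency | Gap-pipe | Over-pipe
39.00 | 8.21 | 7.74 | 5.343 | 6.42 | 7.56 | 7.94
Updated 1/29/02: ping, flood, overlap (send and receive)
Questions:
Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
1 | | | | 39.138 | 0.049 |
2 | | | | 39.171 | 0.097 |
4 | | | | 39.186 | 0.195 |
8 | | | | 38.997 | 0.391 |
16 | | | | 39.445 | 0.774 |
32 | | | | 41.378 | 1.475 |
64 | | | | 43.627 | 2.798 |
128 | | | | 49.985 | 4.884 |
256 | | | | 59.772 | 8.169 |
512 | | | | 62.111 | 15.723 |
1024 | | | | 75.04 | 26.028 |
2048 | | | | 95.186 | 41.038 |
4096 | | | | 132.146 | 59.12 |
8192 | | | | 258.306 | 60.49 |
16384 | | | | 338.895 | 92.212 |
32768 | | | | 498.324 | 125.42 |
65536 | | | | 808.599 | 154.588 |
131072 | | | | 1236.753 | 202.142 |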
IBM/MPI
Flood
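Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
1 | 8 | | | 7.553 | 0.126 |
2 | 8 | | | 7.559 | 0.252 |
4 | 8 | | | 7.548 | 0.505 |
8 | 8 | | | 7.561 | 1.009 |
16 | 8 | | | 7.562 | 2.018 |
32 | 8 | | | 8.089 | 3.772 |
64 | 8 | | | 9.14 | 6.678 |
128 | 8 | | | 10.917 | 11.181 |
256 | 8 | | | 12.803 | 19.069 |
512 | 8 | | | 8.521 | 57.305 |
1024 | 8 | | | 13.108 | 74.501 |
2048 | 8 | | | 16.654 | 117.278 |
4096 | 8 | | | 24.708 | 158.095 |
8192 | 8 | | | 49.002 | 159.431 |
16384 | 8 | | | 81.609 | 191.463 |
32768 | 8 | | | 143.167 | 218.277 |
65536 | 8 | | | 268.781 | 232.532 |
131072 | 8 | | | 515.718 | 242.381 |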
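The BW column of this flood table appears to be the message size divided by the per-message time, expressed with 2**20 bytes per MB. That reading is an inference from the numbers rather than something the sheet states, so the sketch below recomputes three IBM/MPI flood rows and compares them with the reported values. The function name and the `BYTES_PER_MB` constant are illustrative.

```python
# Assumption: the BW column counts 2**20 bytes per MB.
BYTES_PER_MB = 2 ** 20


def flood_bandwidth_mb_per_sec(msg_size_bytes, time_per_msg_usec):
    """Bandwidth implied by one message of the given size every time_per_msg_usec."""
    bytes_per_sec = msg_size_bytes / (time_per_msg_usec * 1e-6)
    return bytes_per_sec / BYTES_PER_MB


# (message size in bytes, time per message in usec, reported BW) for queue depth 8.
IBM_MPI_FLOOD_ROWS = [
    (1, 7.553, 0.126),
    (512, 8.521, 57.305),
    (131072, 515.718, 242.381),
]

if __name__ == "__main__":
    for size, usec, reported in IBM_MPI_FLOOD_ROWS:
        computed = flood_bandwidth_mb_per_sec(size, usec)
        print(f"{size:7d} B  computed {computed:8.3f}  reported {reported:8.3f}")
```

Small differences in the last digit are expected because the times in the sheet are already rounded to three decimals.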
All Flood
Msg SizeQueueCPUCPU TimeInv BWBWTime to
(Bytes)depthLoops(usec)(usec)MB/secsoln (sec)
118.2060.116
218.220.232
418.2050.465
818.3280.916
1618.2051.86
3218.733.496
6419.7936.233
128111.57910.542
256111.85620.592
51219.15553.333
1024114.04269.548
2048117.545111.322
4096125.73151.818
81921112.62269.369
163841153.004102.121
327681232.436134.446
655361382.822163.261
1310721600.011208.329
127.8960.121
227.8930.242
427.9610.479
827.890.967
1627.9061.93
3228.4313.62
6429.5326.403
128211.2410.86
256212.29519.858
51228.81455.398
1024213.64371.577
2048217.073114.4
4096225.35154.095
8192270.995110.043
163842102.798151.997
327682186.724167.359
655362318.784196.058
1310722550.067227.245
147.6670.124
247.6710.249
447.6570.498
847.6620.996
1647.6731.989
3248.2083.718
6449.2636.589
128411.02711.07
256412.63419.324
51248.60256.764
1024413.14674.286
2048416.774116.436
4096425.056155.901
8192458.79132.888
16384497.639160.028
327684162.107192.774
655364291.179214.645
1310724530.12235.796
187.5530.126
287.5590.252
487.5480.505
887.5611.009
1687.5622.018
3288.0893.772
6489.146.678
128810.91711.181
256812.80319.069
51288.52157.305
1024813.10874.501
2048816.654117.278
4096824.708158.095
8192849.002159.431
16384881.609191.463
327688143.167218.277
655368268.781232.532
1310728515.718242.381
1167.5090.127
2167.6010.251
4167.5210.507
8167.6730.994
16167.5212.029
32168.0683.783
64169.1056.704
1281610.86111.239
2561612.54919.456
512168.70756.079
10241613.00175.113
20481616.576117.828
40961624.526159.269
81921644.196176.769
163841674.716209.125
3276816136.523228.899
6553616271.866229.892
13107216510.183245.01
IBM/MPI
Send Overlap
Msg SizeQueueCPUCPU TimeInv BWBWTime toOverlap
BytesdepthLoops(usec)(usec)MB/ssoln (sec)usec
8110.1527.9087.756
81110.7338.4957.762
81211.2949.0367.742
81311.8559.67.745
81412.41610.167.744
81512.97910.7817.802
81613.54411.2877.743
81714.10411.8867.782
81814.66112.427.759
81915.22512.9867.761
811015.78313.6437.86
811116.34414.0997.755
811216.9114.6577.747
811317.46715.2347.767
811418.03716.0868.049
811518.58916.3457.756
811619.15716.9037.746
811719.71217.4817.769
8118110.27918.057.771
8119110.85518.6087.753
8120111.40619.1917.785
8121111.96819.8787.91
8122112.51820.2917.773
8123113.08620.857.764
8124113.64121.4037.762
8125114.2121.9487.738
8126114.76422.5487.784
8127115.32523.1747.849
8128115.88823.6657.777
8129116.46624.237.764
8130117.01724.8117.794
8131117.5725.3497.779
8132118.14225.9247.782
8133118.72326.4817.758
8134119.27827.0387.76
8135119.81627.7357.919
8136120.40428.1597.755
8137120.95828.747.782
8138121.51229.2847.772
8139122.05729.8397.782
82008.2521.471688.252
822007.01115.3161.473048.305
8240013.94422.31.474038.356
8260020.89829.2441.473558.346
8280027.84636.1991.473418.353
82100034.79943.1511.473718.352
82120041.73250.1451.473988.413
82140048.74457.0581.474588.314
82160055.68364.0151.473898.332
82180062.56270.9621.473718.4
82200069.57277.9921.474718.42
82220076.51884.8851.47418.367
82240083.39491.8271.474078.433
82260090.34898.781.474038.432
82280097.365105.7471.474288.382
823000104.321112.7961.475348.475
823200111.302119.7341.475228.432
823400118.196126.5861.474228.39
823600125.147133.5571.474488.41
823800132.087140.4961.474418.409
824000139.033147.4591.474598.426
84008.0891.470268.089
842007.01515.1381.471918.123
8440013.99122.0871.472238.096
8460020.89529.0231.471278.128
8480027.84735.9831.471268.136
84100034.77842.9381.471478.16
84120041.72949.8751.471238.146
84140048.74556.8521.471528.107
84160055.61663.8011.471968.185
84180062.56370.7771.471918.214
84200069.51577.7161.472818.201
84220076.50284.6731.471968.171
84240083.41391.6161.471968.203
84260090.34498.591.472128.246
84280097.365105.5441.472368.179
843000104.445112.5261.472568.081
843200111.25119.4571.472448.207
843400118.291126.3751.472188.084
843600125.143133.4591.47358.316
843800132.082140.2911.472438.209
844000139.041147.3321.473328.291
88007.9681.469377.968
882006.99114.9591.469527.968
8840013.96321.9031.469397.94
8860020.89428.8771.469727.983
8880027.84635.8221.469697.976
88100034.77842.791.471048.012
88120041.72849.7371.469958.009
88140048.76256.7441.470547.982
88160055.62963.6561.47048.027
88180062.56170.6471.470778.086
88200069.50777.6291.471058.122
88220076.46384.5691.471118.106
88240083.47291.5131.470818.041
88260090.40898.5111.472298.103
88280097.356105.3761.470498.02
883000104.286112.3351.470618.049
883200111.17119.2781.470618.108
883400118.101126.251.470828.149
883600125.073133.1921.470858.119
883800132.003140.2141.471668.211
884000138.95147.1141.471158.164
816007.8061.468367.806
8162007.01114.7721.467417.761
81640013.94421.7191.467487.775
81660020.89428.6921.467777.798
81680027.84735.6471.467767.8
816100034.79342.5981.468377.805
816120041.79449.551.467967.756
816140048.67656.511.468097.834
816160055.62563.4631.469217.838
816180062.62370.411.468327.787
816200069.50677.3751.468287.869
816220076.58484.3191.468367.735
816240083.73391.3071.468667.574
816260090.33698.2441.468637.908
816280097.353105.1981.468717.845
8163000104.222112.1411.468757.919
8163200111.168119.11.469827.932
8163400118.123126.0931.469347.97
8163600125.071133.0611.469497.99
8163800132.004139.9511.468967.947
8164000138.952146.9951.469958.043
IBM/MPI
Receive Overlap
Msg SizeQueueCPUCPU TimeInv BWBWTime toOverlap
BytesdepthLoops(usec)(usec)MB/ssoln (sec)usec
8110.1527.6977.545
81110.7337.6976.964
81211.2947.7186.424
81311.8567.6985.842
81412.4187.7825.364
81512.9798.3225.343
81613.5389.0875.549
81714.1029.8195.717
81814.66710.0875.42
81915.22310.5865.363
811015.78811.2185.43
811116.34511.8345.489
811216.90812.755.842
811317.47313.2935.82
811418.03614.0175.981
811518.59814.4275.829
811619.15114.9715.82
811719.71715.5525.835
8118110.27416.0965.822
8119110.83516.6635.828
8120111.40217.255.848
8121111.95617.7915.835
8122112.52718.3415.814
8123113.09618.9085.812
8124113.64919.4785.829
8125114.20220.0515.849
8126114.77420.6655.891
8127115.32421.1575.833
8128115.88521.7125.827
8129116.45922.2975.838
8130117.00722.9045.897
8131117.58723.5675.98
8132118.14923.9465.797
8133118.70524.5085.803
8134119.25425.0895.835
8135119.8325.6665.836
8136120.37626.2275.851
8137120.94926.9475.998
8138121.50727.3455.838
8139122.05727.9065.849
IBM/MPI
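Ping Ack (old data: ping with simple ack)
Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
1 | | | | 45.599 | 0.021 |
2 | | | | 45.619 | 0.042 |
4 | | | | 45.395 | 0.084 |
8 | | | | 45.511 | 0.168 |
16 | | | | 45.553 | 0.335 |
32 | | | | 46.333 | 0.659 |
64 | | | | 48.223 | 1.266 |
128 | | | | 51.173 | 2.385 |
256 | | | | 56.759 | 4.301 |
512 | | | | 59.373 | 8.224 |
1024 | | | | 62.527 | 15.618 |
2048 | | | | 73.706 | 26.499 |
4096 | | | | 92.311 | 42.316 |
8192 | | | | 156.331 | 49.974 |
16384 | | | | 196.635 | 79.462 |
32768 | | | | 272.527 | 114.668 |
65536 | | | | 423.851 | 147.457 |
131072 | | | | 634.744 | 196.93 |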
Millennium-GM
IBM/LAPI
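Ping Pong
PingPong | Gap | O_Send | O_Rec | Latency | Gap-pipe | Over-pipe
42.95 | 9.49 | 9.93 | 2.14 | 9.41 | 9.43 | 9.337
Updated 5/2/02: ping, flood, overlap
Questions: Why is the overhead larger than the gap?
Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
1 | 0 | 0 | 0 | 43.146 | 0.044 |
2 | 0 | 0 | 0 | 42.92 | 0.088 |
4 | 0 | 0 | 0 | 42.961 | 0.178 |
8 | 0 | 0 | 0 | 42.945 | 0.356 |
16 | 0 | 0 | 0 | 43.183 | 0.706 |
32 | 0 | 0 | 0 | 44.131 | 1.384 |
64 | 0 | 0 | 0 | 45.658 | 2.674 |
128 | 0 | 0 | 0 | 51.645 | 4.728 |
256 | 0 | 0 | 0 | 61.829 | 7.898 |
512 | 0 | 0 | 0 | 63.448 | 15.392 |
1024 | 0 | 0 | 0 | 67.212 | 29.06 |
2048 | 0 | 0 | 0 | 89.766 | 43.516 |
4096 | 0 | 0 | 0 | 113.831 | 68.632 |
8192 | 0 | 0 | 0 | 144.875 | 107.852 |
16384 | 0 | 0 | 0 | 189.998 | 164.476 |
32768 | 0 | 0 | 0 | 282.041 | 221.6 |
65536 | 0 | 0 | 0 | 456.638 | 273.74 |
131072 | 0 | 0 | 0 | 794.482 | 314.67 |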
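The Latency figure in the Ping Pong summaries appears to be derived rather than measured: half the ping-pong round trip minus the send and receive overheads. This is an inference from the numbers, checked below against the IBM/LAPI summary (PingPong 42.95, O_Send 9.93, O_Rec 2.14, Latency 9.41) and the Quadrics/Shm summary (PingPong 3.40, O_Send 0.43, O_Rec 0.012, Latency 1.26). The function name is illustrative.

```python
def wire_latency_usec(ping_pong_usec, o_send_usec, o_recv_usec=0.0):
    """One-way time left over after removing the CPU overheads on both ends.

    Assumes ping_pong_usec is a full round trip, so half of it is one way.
    """
    return ping_pong_usec / 2 - o_send_usec - o_recv_usec


# (layer, PingPong, O_Send, O_Rec, reported Latency), all in usec.
SUMMARIES = [
    ("IBM/LAPI", 42.95, 9.93, 2.14, 9.41),
    ("Quadrics/Shm", 3.40, 0.43, 0.012, 1.26),
]

if __name__ == "__main__":
    for layer, rtt, o_s, o_r, reported in SUMMARIES:
        derived = wire_latency_usec(rtt, o_s, o_r)
        print(f"{layer:13s} derived {derived:.3f}  reported {reported:.2f}")
```

Under this decomposition nothing ties the send overhead to the gap, so the two can come out in either order, as in the LAPI question above. Gap comes from the flood test and overhead from the overlap test.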
IBM/LAPI
Flood
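Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
1 | 8 | | | 9.41 | 0.101 |
2 | 8 | | | 9.425 | 0.202 |
4 | 8 | | | 9.408 | 0.405 |
8 | 8 | | | 9.427 | 0.809 |
16 | 8 | | | 9.434 | 1.617 |
32 | 8 | | | 9.993 | 3.054 |
64 | 8 | | | 11.07 | 5.513 |
128 | 8 | | | 12.765 | 9.563 |
256 | 8 | | | 8.223 | 29.691 |
512 | 8 | | | 8.679 | 56.26 |
1024 | 8 | | | 12.012 | 81.3 |
2048 | 8 | | | 13.586 | 143.757 |
4096 | 8 | | | 15.505 | 251.93 |
8192 | 8 | | | 24.223 | 322.52 |
16384 | 8 | | | 45.179 | 345.847 |
32768 | 8 | | | 93.335 | 334.815 |
65536 | 8 | | | 183.144 | 341.261 |
131072 | 8 | | | 347.882 | 359.317 |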
IBM/LAPI
Flood
Msg SizeQueueCPUCPU TimeInv BWBWTime to
(Bytes)depthLoops(usec)(usec)MB/secsoln (sec)
119.3620.102
219.3760.203
419.3560.408
819.4880.804
1619.4521.614
3219.9623.063
64111.0575.52
128112.7629.565
25618.23129.663
51218.58356.889
1024148.64220.077
2048157.4833.979
4096172.35153.99
8192188.45688.321
163841110.063141.964
327681155.508200.954
655361244.722255.391
1310721411.635303.667
129.420.101
229.4320.202
429.4210.405
829.4270.809
1629.4751.61
3229.9933.054
64211.0715.513
128212.7869.547
25628.22429.687
51228.64856.461
1024228.44934.326
2048230.24864.57
4096239.12999.831
8192249.067159.22
16384259.193263.966
327682122.624254.843
655362182.239342.957
1310722341.304366.242
149.4090.101
249.4350.202
449.4190.405
849.4360.809
1649.4861.609
3249.9743.06
64411.035.533
128412.7599.567
25648.21929.706
51248.60756.728
1024415.07464.785
2048418.029108.331
4096423.762164.393
8192430.264258.148
16384445.516343.284
32768493.746333.349
655364183.325340.925
1310724347.85359.351
189.410.101
289.4250.202
489.4080.405
889.4270.809
1689.4341.617
3289.9933.054
64811.075.513
128812.7659.563
25688.22329.691
51288.67956.26
1024812.01281.3
2048813.586143.757
4096815.505251.93
8192824.223322.52
16384845.179345.847
32768893.335334.815
655368183.144341.261
1310728347.882359.317
1169.4220.101
2169.4060.203
4169.4160.405
8169.450.807
16169.4421.616
32169.983.058
641611.1385.48
1281612.7659.563
256168.22529.683
512168.66456.359
10241612.06780.929
20481613.519144.468
40961614.675266.186
81921624.257322.077
163841645.158346.004
327681693.593333.892
6553616184.073339.539
13107216350.721356.409
IBM/LAPI
Send Overhead
Msg SizeQueueCPUCPU TimeInv BWBWTime toOverhead
BytesdepthLoops(usec)(usec)MB/ssoln (sec)usec
8110.09610.0780.303739.982
81110.61210.6130.3035510.001
81211.11911.1170.303579.998
81311.62811.6180.304519.99
81412.13512.1120.303449.977
81512.64312.6260.303449.983
81613.1513.1320.303359.982
81713.65913.6410.303459.982
81814.16514.140.303299.975
81914.67314.6930.3038210.02
811015.1815.1290.303029.949
811115.68815.6690.303729.981
811216.19816.1230.303829.925
811316.70316.6570.303159.954
811417.21317.160.30319.947
811517.71817.660.302959.942
811618.22918.1690.302989.94
811718.73418.6730.302949.939
811819.24119.2520.3037210.011
811919.7519.7140.303199.964
8120110.26420.2360.303429.972
8121110.76820.7620.303539.994
8122111.28221.2920.3037810.01
8123111.78321.8010.3037910.018
8124112.28722.3190.3038910.032
8125112.822.8460.3040810.046
8126113.30223.3630.3042310.061
8127113.8123.8960.3044310.086
8128114.31624.3780.3041810.062
8129114.82424.8760.3040710.052
8130115.33725.3790.3040410.042
8131115.85625.8780.3039610.022
8132116.35426.370.3038710.016
8133116.85626.8890.3039610.033
8134117.37127.3880.3038310.017
8135117.8727.9070.3039510.037
8136118.37628.4140.3042610.038
8137118.88428.9080.3037910.024
8138119.39229.4110.3037610.019
8139119.9129.9110.3036910.001
8200.0499.4310.258059.382old send overhead data below
8250.2419.6990.259949.458
82100.4149.8970.260479.483
82150.58710.0780.260599.491
82200.76110.2580.260739.497
82250.93310.4490.261079.516
82301.10710.6310.26139.524
82351.310.8090.261389.509
82401.45310.9870.261419.534
82451.62711.1840.261889.557
82501.8211.3580.261889.538
82551.97311.5260.261779.553
82602.14711.7010.261819.554
82652.3211.8930.262199.573
82702.49312.0660.262169.573
82752.66712.250.262379.583
82802.8412.4310.262529.591
82853.01312.6010.262449.588
82903.20712.7860.262679.579
82953.35912.9490.262479.59
821003.55813.140.26289.582
8400.0489.3990.257439.351
8450.2419.6440.258849.403
84100.4149.8330.259179.419
84150.58710.0230.259479.436
84200.77910.20.259549.421
84250.93410.40.260169.466
84301.10710.5730.260089.466
84351.2810.7580.260319.478
84401.45410.9450.260569.491
84451.62711.1140.260489.487
84501.81911.2820.260369.463
84551.97411.440.260089.466
84602.14611.6470.260729.501
84652.3211.820.26079.5
84702.51311.9960.260759.483
84752.66712.1650.260679.498
84802.86112.3740.26149.513
84853.01412.5780.261999.564
84903.18712.6790.260539.492
84953.38112.8410.260299.46
841003.53313.0170.260349.484
8800.0499.3860.257189.337
8850.2419.60.2589.359
88100.4339.790.25839.357
88150.5879.9840.258769.397
88200.7610.1590.258729.399
88250.93310.3370.258839.404
88301.10710.5130.258889.406
88351.2810.7010.259179.421
88401.47210.8710.259089.399
88451.64611.0570.259339.411
88501.80211.2150.259049.413
88551.97311.380.258859.407
88602.14711.5460.258729.399
88652.3211.7410.259149.421
88702.49411.9070.258979.413
88752.66612.0880.259139.422
88802.8612.2530.258959.393
88853.03312.4110.258639.378
88903.18712.5910.258799.404
88953.3612.740.258289.38
881003.53312.8970.257959.364
81600.0489.3380.256229.29
81650.2419.5340.256669.293
816100.4149.6840.256179.27
816150.5879.8330.25579.246
816200.7619.9820.255219.221
816250.93410.1480.255049.214
816301.10610.3080.254789.202
816351.2810.4340.253819.154
816401.45310.6040.253749.151
816451.62710.7440.253089.117
816501.80110.8940.252619.093
816551.97411.0850.252959.111
816602.14811.2650.253119.117
816652.3211.4590.253489.139
816702.49411.590.252749.096
816752.66611.6910.25129.025
816802.8411.8790.251489.039
816853.01312.0550.251539.042
816903.20712.2380.25179.031
816953.3612.4190.251919.059
8161003.55512.5660.251329.011
IBM/LAPI
Receive Overlap
Msg SizeQueueCPUCPU TimeInv BWBWTime toOverhead
BytesdepthLoops(usec)(usec)MB/ssoln (sec)usec
8110.09610.0040.303229.908
81110.6139.9420.302429.329
81211.1199.9040.30218.785
81311.6279.8760.302278.249
81412.1349.8410.30157.707
81512.6429.7020.300017.06
81613.1519.1620.294676.011
81713.6576.1220.264282.465
81814.1686.4040.267092.236
81914.6736.8650.271652.192
811015.1857.3610.276672.176
811115.6877.8790.281782.192
811216.1958.3730.286792.178
811316.7058.860.291652.155
811417.219.3650.29672.155
811517.729.880.301882.16
811618.22810.3920.307112.164
811718.75110.9010.3122.15
811819.24111.3990.317062.158
811919.75811.8990.322412.141
8120110.25612.4160.327172.16
8121110.76412.9190.332262.155
8122111.27613.4290.337282.153
8123111.78113.9440.342482.163
8124112.2914.450.347562.16
8125112.79314.9490.352482.156
8126113.30615.4580.357642.152
8127113.80915.9520.362572.143
8128114.33516.4780.367792.143
8129114.82516.9830.372872.158
8130115.33317.4970.378042.164
8131115.83918.0030.383042.164
8132116.34618.510.388522.164
8133116.85619.0110.393182.155
8134117.36219.510.398132.148
8135117.87720.0550.403622.178
8136118.39420.5340.408332.14
8137118.89421.0350.413342.141
8138119.39221.5460.418512.154
8139119.90822.0570.423562.149
IBM/LAPI
Ping Ack
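Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
1 | 0 | 0 | 0 | 41.647 | 0.023 |
2 | 0 | 0 | 0 | 41.666 | 0.046 |
4 | 0 | 0 | 0 | 41.757 | 0.091 |
8 | 0 | 0 | 0 | 41.55 | 0.184 |
16 | 0 | 0 | 0 | 41.677 | 0.366 |
32 | 0 | 0 | 0 | 42.227 | 0.723 |
64 | 0 | 0 | 0 | 43.829 | 1.393 |
128 | 0 | 0 | 0 | 47.898 | 2.549 |
256 | 0 | 0 | 0 | 50.512 | 4.833 |
512 | 0 | 0 | 0 | 51.943 | 9.4 |
1024 | 0 | 0 | 0 | 54.857 | 17.802 |
2048 | 0 | 0 | 0 | 62.632 | 31.184 |
4096 | 0 | 0 | 0 | 75.677 | 51.618 |
8192 | 0 | 0 | 0 | 93.411 | 83.636 |
16384 | 0 | 0 | 0 | 117.065 | 133.473 |
32768 | 0 | 0 | 0 | 161.548 | 193.442 |
65536 | 0 | 0 | 0 | 249.608 | 250.393 |
131072 | 0 | 0 | 0 | 416.875 | 299.85 |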
Alvarez-MPI
Quadrics/MPI
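Ping Pong
PingPong | Gap | O_Send | O_Rec | Latency
18.95 | 7.26 | 2.26 | 6.889 | 0.32
Updated 9/24/02
Questions:
Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
1 | 0 | 0 | 0 | 19.239 | 0.099 |
2 | 0 | 0 | 0 | 18.946 | 0.201 |
4 | 0 | 0 | 0 | 19.044 | 0.401 |
8 | 0 | 0 | 0 | 18.946 | 0.805 |
16 | 0 | 0 | 0 | 18.946 | 1.611 |
32 | 0 | 0 | 0 | 18.946 | 3.221 |
64 | 0 | 0 | 0 | 21.974 | 5.555 |
128 | 0 | 0 | 0 | 24.513 | 9.96 |
256 | 0 | 0 | 0 | 29.298 | 16.666 |
512 | 0 | 0 | 0 | 34.377 | 28.408 |
1024 | 0 | 0 | 0 | 38.967 | 50.123 |
2048 | 0 | 0 | 0 | 49.026 | 79.678 |
4096 | 0 | 0 | 0 | 69.339 | 112.671 |
8192 | 0 | 0 | 0 | 110.65 | 141.211 |
16384 | 0 | 0 | 0 | 193.369 | 161.609 |
32768 | 0 | 0 | 0 | 363.201 | 172.081 |
65536 | 0 | 0 | 0 | 698.861 | 178.862 |
131072 | 0 | 0 | 0 | 1377.116 | 181.539 |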
Quadrics/MPI
Flood
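Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
1 | 8 | 0 | 0 | 7.283 | 0.131 |
2 | 8 | 0 | 0 | 7.231 | 0.264 |
4 | 8 | 0 | 0 | 7.261 | 0.525 |
8 | 8 | 0 | 0 | 7.256 | 1.051 |
16 | 8 | 0 | 0 | 7.281 | 2.096 |
32 | 8 | 0 | 0 | 7.278 | 4.193 |
64 | 8 | 0 | 0 | 8.007 | 7.623 |
128 | 8 | 0 | 0 | 8.627 | 14.151 |
256 | 8 | 0 | 0 | 9.722 | 25.113 |
512 | 8 | 0 | 0 | 7.855 | 62.159 |
1024 | 8 | 0 | 0 | 10.043 | 97.235 |
2048 | 8 | 0 | 0 | 15.086 | 129.463 |
4096 | 8 | 0 | 0 | 25.95 | 150.531 |
8192 | 8 | 0 | 0 | 46.679 | 167.368 |
16384 | 8 | 0 | 0 | 91.38 | 170.989 |
32768 | 8 | 0 | 0 | 176.448 | 177.106 |
65536 | 8 | 0 | 0 | 342.601 | 182.428 |
131072 | 8 | 0 | 0 | 667.491 | 187.268 |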
Quadrics/MPI
All Flood
Msg SizeQueueCPUCPU TimeInv BWBWTime to
(Bytes)depthLoops(usec)(usec)MB/secsoln (sec)
1100100.095
21009.5870.199
410010.0930.378
810010.5210.725
161009.7561.564
3210010.1623.003
6410010.555.785
12810010.83511.266
25610012.59319.386
51210015.37531.757
102410019.2950.624
204810024.56979.497
409610036.103108.198
819210060.153129.877
16384100107.284145.641
32768100198.308157.584
65536100392.705159.153
131072100788.525158.524
12007.3880.129
22007.4340.257
42007.4550.512
82007.3451.039
162007.2672.1
322007.2654.201
642007.9437.684
1282008.61714.167
2562009.74925.043
5122008.89854.876
102420010.77690.626
204820015.614125.09
409620025.718151.888
819220046.348168.561
1638420087.21179.164
32768200170.382183.411
65536200334.606186.787
131072200667.203187.349
14007.2830.131
24007.2310.264
44007.2610.525
84007.2561.051
164007.2812.096
324007.2784.193
644008.0077.623
1284008.62714.151
2564009.72225.113
5124007.85562.159
102440010.04397.235
204840015.086129.463
409640025.95150.531
819240046.679167.368
1638440091.38170.989
32768400176.448177.106
65536400342.601182.428
131072400667.491187.268
18008.080.118
28008.1190.235
48007.7610.492
88007.4581.023
168008.4141.814
328008.6313.536
648008.8836.871
1288008.63914.131
2568009.75725.022
5128008.19759.566
10248009.86798.972
204880014.79132.054
409680025.049155.947
819280045.791170.613
1638480087.118179.354
32768800177.172176.383
65536800349.938178.603
131072800669.53186.698
1160010.7940.088
2160010.9370.174
4160010.910.35
8160010.6770.715
16160010.7161.424
32160010.6422.868
64160010.1965.986
12816009.21113.253
256160010.11724.133
51216007.64563.869
102416009.90498.607
2048160014.805131.923
4096160025.583152.688
8192160045.981169.908
16384160087.276179.029
327681600177.442176.114
655361600348.474179.354
1310721600691.085180.875
Quadrics/MPI
Send Overhead
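Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/s) | Time to soln (sec) | Overhead (usec)
8 | 1 | 1 | 0.02 | 8.985 | | | 8.965
8 | 1 | 101 | 0.621 | 9.18 | | | 8.559
8 | 1 | 201 | 1.227 | 9.082 | | | 7.855
8 | 1 | 301 | 1.837 | 9.18 | | | 7.343
8 | 1 | 401 | 2.428 | 9.277 | | | 6.849
8 | 1 | 501 | 3.033 | 9.18 | | | 6.147
8 | 1 | 601 | 3.639 | 9.18 | | | 5.541
8 | 1 | 701 | 4.244 | 9.082 | | | 4.838
8 | 1 | 801 | 4.811 | 8.985 | | | 4.174
8 | 1 | 901 | 5.455 | 9.082 | | | 3.627
8 | 1 | 1001 | 6.061 | 9.082 | | | 3.021
8 | 1 | 1101 | 6.666 | 8.985 | | | 2.319
8 | 1 | 1201 | 7.272 | 9.57 | | | 2.298
8 | 1 | 1301 | 7.877 | 10.156 | | | 2.279
8 | 1 | 1401 | 8.482 | 10.742 | | | 2.26
8 | 1 | 1501 | 9.088 | 11.426 | | | 2.338
8 | 1 | 1601 | 9.693 | 12.012 | | | 2.319
8 | 1 | 1701 | 10.299 | 12.598 | | | 2.299
8 | 1 | 1801 | 10.905 | 13.184 | | | 2.279
8 | 1 | 1901 | 11.51 | 13.77 | | | 2.26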
Quadrics/MPI
Receive Overlap
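Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/s) | Time to soln (sec) | Overhead (usec)
8 | 2 | 1 | 0.02 | 9.082 | | | 9.062
8 | 2 | 101 | 0.616 | 8.496 | | | 7.88
8 | 2 | 201 | 1.217 | 8.887 | | | 7.67
8 | 2 | 301 | 1.837 | 9.082 | | | 7.245
8 | 2 | 401 | 2.428 | 9.473 | | | 7.045
8 | 2 | 501 | 3.009 | 10.156 | | | 7.147
8 | 2 | 601 | 3.639 | 11.524 | | | 7.885
8 | 2 | 701 | 4.21 | 12.696 | | | 8.486
8 | 2 | 801 | 4.85 | 13.282 | | | 8.432
8 | 2 | 901 | 5.455 | 13.867 | | | 8.412
8 | 2 | 1001 | 6.061 | 14.356 | | | 8.295
8 | 2 | 1101 | 6.666 | 14.746 | | | 8.08
8 | 2 | 1201 | 7.272 | 15.039 | | | 7.767
8 | 2 | 1301 | 7.877 | 15.332 | | | 7.455
8 | 2 | 1401 | 8.482 | 15.625 | | | 7.143
8 | 2 | 1501 | 9.088 | 16.016 | | | 6.928
8 | 2 | 1601 | 9.615 | 16.504 | | | 6.889
8 | 2 | 1701 | 10.299 | 17.285 | | | 6.986
8 | 2 | 1801 | 10.905 | 18.262 | | | 7.357
8 | 2 | 1901 | 11.51 | 19.043 | | | 7.533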
Alvarez-GM
Quadrics/Shm
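Ping Pong
PingPong | Gap | O_Send | O_Rec | Latency
3.40 | 1.50 | 0.43 | 0.012 | 1.26
Updated 4/23/02: flood
Updated 5/2/02: ping-pong (symmetric)
Questions: These are old numbers for ping/pong, which are higher than the newest ones (appear in get spreadsheet)
Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
1 | 0 | 0 | 0 | | |
2 | 0 | 0 | 0 | | |
4 | 0 | 0 | 0 | | |
8 | 0 | 0 | 0 | 3.4 | 2.24 |
16 | 0 | 0 | 0 | 3.33 | 4.59 |
32 | 0 | 0 | 0 | 3.41 | 8.94 |
64 | 0 | 0 | 0 | 3.69 | 16.53 |
128 | 0 | 0 | 0 | 4.47 | 27.31 |
256 | 0 | 0 | 0 | 6.03 | 40.45 |
512 | 0 | 0 | 0 | 6.9 | 70.77 |
1024 | 0 | 0 | 0 | 9.65 | 101.16 |
2048 | 0 | 0 | 0 | 15.16 | 128.84 |
4096 | 0 | 0 | 0 | 25.95 | 150.5 |
8192 | 0 | 0 | 0 | 48.32 | 161.68 |
16384 | 0 | 0 | 0 | 93.63 | 166.87 |
32768 | 0 | 0 | 0 | 184.66 | 169.23 |
65536 | 0 | 0 | 0 | 364.02 | 171.7 |
131072 | 0 | 0 | 0 | 728.49 | 171.59 |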
Quadrics/Shm
Flood
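Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
1 | 8 | | | | |
2 | 8 | | | | |
4 | 8 | | | | |
8 | 8 | | | 1.346625 | 5.6655672747 |
16 | 8 | | | 1.3595 | 11.223824246 |
32 | 8 | | | 1.35675 | 22.4931476875 |
64 | 8 | | | 1.312 | 46.5206983613 |
128 | 8 | | | 1.856375 | 65.757356407 |
256 | 8 | | | 2.949 | 82.7875974907 |
512 | 8 | | | 6.888125 | 70.887396788 |
1024 | 8 | | | 9.48675 | 102.9396263209 |
2048 | 8 | | | 14.391125 | 135.7173257824 |
4096 | 8 | | | 24.257375 | 161.0335001211 |
8192 | 8 | | | 44.622125 | 175.0813077593 |
16384 | 8 | | | 85.470375 | 182.8118807248 |
32768 | 8 | | | 167.8475 | 186.1809082649 |
65536 | 8 | | | 332.626875 | 187.8982268044 |
131072 | 8 | | | 664.165625 | 188.2060668226 |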
Quadrics/Shm
All Flood
Msg SizeQueueCPUCPU TimeInv BWBWTime to
(Bytes)depthLoops(usec)(usec)MB/secsoln (sec)
11
21
41
81001.5025.079490367
161001.5269.999206463
321001.5419.8166091721
641001.50240.6359229361
1281001.964.2475328947
2561002.95182.7314893257
5121007.10568.7236101337
10241009.81399.5172220524
204810014.574134.0143406066
409610024.449159.7713607919
819210044.841174.2267121607
1638410085.695182.3326915223
32768100168.527185.4302277973
65536100333.35187.4906254687
131072100666.609187.516220153
12
22
42
82001.4135.3994299584
162001.42910.6779489591
322001.438521.2148614008
642001.4143.2873448582
1282001.859565.6468472708
2562002.951582.7174741657
5122006.9270.5608742775
10242009.6735100.9523440327
204820014.4095135.5442589958
409620024.3995160.0954937601
819220044.735174.6395439812
1638420085.5605182.6193161564
32768200168.16185.8349191246
65536200333.3445187.4937189604
131072200665.2045187.9121382973
14
24
44
84001.365255.5882765290.04
164001.3822511.03909499910.04
324001.39621.86072931590.04
644001.3272545.98617912980.04
1284001.85765.7352248250.05
2564002.9497582.76654801250.074
5124006.92370.53029755890.18
10244009.61725101.54280069670.238
204840014.4155135.48784294680.362
409640024.27825160.8950397990.61
819240044.6715174.88779199271.106
1638440085.46525182.82284320242.123
32768400167.9745186.04014299794.165
65536400332.89675187.74590019288.201
131072400664.55875188.094732030816.351
18
28
48
88001.3466255.66556727470.04
168001.359511.2238242460.04
328001.3567522.49314768750.04
648001.31246.52069836130.04
1288001.85637565.7573564070.05
2568002.94982.78759749070.074
5128006.88812570.8873967880.181
10248009.48675102.93962632090.238
204880014.391125135.71732578240.364
409680024.257375161.03350012110.61
819280044.622125175.08130775931.107
1638480085.470375182.81188072482.123
32768800167.8475186.18090826494.165
65536800332.626875187.89822680448.201
131072800664.165625188.206066822616.351
116
216
416
816001.33655.70848823890.04
1616001.35112511.293395550.04
3216001.34287522.7255538490.04
6416001.3122546.51183558770.04
12816001.855562565.78614975240.05
25616002.946437582.85959739520.074
51216006.86037571.1741340670.18
102416009.5060625102.73049435560.238
2048160014.346875136.13591810060.36
4096160024.2374375161.16596484260.611
8192160044.6245175.07198960211.106
16384160085.4235182.91219629262.123
327681600167.8015186.23194667514.166
655361600332.4945625187.97299880668.201
1310721600663.9455188.268464806216.351
Quadrics/Shm
Send Overhead
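Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/s) | Time to soln (sec) | overlap (usec)
8 | 1 | 1 | 0.023 | 1.367 | | | 1.344
8 | 1 | 51 | 0.329 | 1.367 | | | 1.038
8 | 1 | 101 | 0.629 | 1.345 | | | 0.716
8 | 1 | 151 | 0.93 | 1.369 | | | 0.439
8 | 1 | 201 | 1.231 | 1.667 | | | 0.436
8 | 1 | 251 | 1.532 | 1.96 | | | 0.428
8 | 1 | 301 | 1.833 | 2.29 | | | 0.457
8 | 1 | 351 | 2.137 | 2.568 | | | 0.431
8 | 1 | 401 | 2.435 | 2.868 | | | 0.433
8 | 1 | 601 | 3.64 | 4.083 | | | 0.443
8 | 1 | 801 | 4.845 | 5.279 | | | 0.434
8 | 1 | 1001 | 6.05 | 6.493 | | | 0.443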
Quadrics/Shm
Receive Overhead
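Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/s) | Time to soln (sec) | overlap (usec)
8 | 1 | 1 | 0.023 | 1.367 | | | 1.344
8 | 1 | 11 | 0.071 | 1.367 | | | 1.296
8 | 1 | 21 | 0.149 | 1.367 | | | 1.218
8 | 1 | 31 | 0.207 | 1.367 | | | 1.16
8 | 1 | 41 | 0.267 | 1.367 | | | 1.1
8 | 1 | 51 | 0.329 | 1.367 | | | 1.038
8 | 1 | 61 | 0.388 | 1.367 | | | 0.979
8 | 1 | 71 | 0.449 | 1.367 | | | 0.918
8 | 1 | 81 | 0.509 | 1.367 | | | 0.858
8 | 1 | 101 | 0.629 | 1.345 | | | 0.716
8 | 1 | 201 | 1.231 | 1.368 | | | 0.137
8 | 1 | 301 | 1.833 | 1.848 | | | 0.015
8 | 1 | 401 | 2.435 | 2.45 | | | 0.015
8 | 1 | 501 | 3.037 | 3.051 | | | 0.014
8 | 1 | 601 | 3.64 | 3.653 | | | 0.013
8 | 1 | 701 | 4.242 | 4.256 | | | 0.014
8 | 1 | 801 | 4.845 | 4.859 | | | 0.014
8 | 1 | 901 | 5.448 | 5.46 | | | 0.012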
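The last column of the two Quadrics/Shm overhead tables is the per-message time minus the CPU time spent in the compute loop. As the loop grows, the communication is hidden behind computation and the difference settles at a floor, which seems to be what the Ping Pong summary reports as O_Rec (0.012). The sketch below recomputes that column for a subset of the receive rows and takes the minimum. Reading the floor as the overhead is an assumption, and the names are illustrative.

```python
# (CPU time in usec, total time per message in usec), a subset of the
# Quadrics/Shm receive rows for 8-byte messages at queue depth 1.
RECEIVE_SAMPLES = [
    (0.023, 1.367),
    (0.329, 1.367),
    (0.629, 1.345),
    (1.231, 1.368),
    (1.833, 1.848),
    (2.435, 2.45),
    (3.037, 3.051),
    (3.64, 3.653),
    (4.242, 4.256),
    (4.845, 4.859),
    (5.448, 5.46),
]


def unhidden_time_usec(samples):
    """Time per message that the compute loop did not cover, per sample."""
    return [round(total - cpu, 3) for cpu, total in samples]


def estimate_overhead_usec(samples):
    """Assumed estimator: the smallest uncovered time across all loop lengths."""
    return min(unhidden_time_usec(samples))


if __name__ == "__main__":
    print("uncovered time per row:", unhidden_time_usec(RECEIVE_SAMPLES))
    print("estimated overhead (usec):", estimate_overhead_usec(RECEIVE_SAMPLES))
```

With little computation the uncovered time is close to the full gap of about 1.37 usec. Once the loop takes longer than the gap, only the CPU cost of the call itself remains.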
Compaq/Put
Ping Ack
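Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
1 | 0 | 0 | 0 | 7.72 | 1.04 |
2 | 0 | 0 | 0 | 7.55 | 2.12 |
4 | 0 | 0 | 0 | 8.32 | 3.84 |
8 | 0 | 0 | 0 | 9.61 | 6.66 |
16 | 0 | 0 | 0 | 10.58 | 12.1 |
32 | 0 | 0 | 0 | 13.87 | 18.45 |
64 | 0 | 0 | 0 | 17.35 | 29.52 |
128 | 0 | 0 | 0 | 31.28 | 32.73 |
256 | 0 | 0 | 0 | 43.48 | 47.1 |
512 | 0 | 0 | 0 | 72.05 | 56.85 |
1024 | 0 | 0 | 0 | 125.52 | 65.26 |
2048 | 0 | 0 | 0 | 233.26 | 70.24 |
4096 | 0 | 0 | 0 | 452.78 | 72.37 |
8192 | 0 | 0 | 0 | 807.23 | 81.19 |
16384 | 0 | 0 | 0 | 1609.47 | 81.44 |
32768 | 0 | 0 | 0 | 3213.44 | 81.58 |
65536 | 0 | 0 | 0 | 6382.87 | 82.14 |
131072 | 0 | 0 | 0 | 12814.63 | 81.83 |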
Dolphin
Quadrics/Get
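Ping Pong
PingPong | Gap | O_Send | O_Rec | Latency
5.63 | 6.50 | 0.62 | | 2.20
Updated 4/24/02: flood
Questions
1) Should we include custom gets?
2) Need data for Flood (if this makes sense)
Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
1 | 0 | 0 | 0 | | |
2 | 0 | 0 | 0 | 5.84 | 0.34 |
4 | 0 | 0 | 0 | 5.71 | 0.7 |
8 | 0 | 0 | 0 | 5.63 | 1.42 |
16 | 0 | 0 | 0 | 5.57 | 2.87 |
32 | 0 | 0 | 0 | 5.66 | 5.66 |
64 | 0 | 0 | 0 | 6.71 | 9.54 |
128 | 0 | 0 | 0 | 7.59 | 16.86 |
256 | 0 | 0 | 0 | 8.14 | 31.45 |
512 | 0 | 0 | 0 | 9.31 | 54.98 |
1024 | 0 | 0 | 0 | 12.04 | 85.07 |
2048 | 0 | 0 | 0 | 17.85 | 114.7 |
4096 | 0 | 0 | 0 | 29.09 | 140.79 |
8192 | 0 | 0 | 0 | 53.11 | 154.23 |
16384 | 0 | 0 | 0 | 101.84 | 160.88 |
32768 | 0 | 0 | 0 | 198.61 | 164.99 |
65536 | 0 | 0 | 0 | 471.46 | 139.01 |
131072 | 0 | 0 | 0 | 1180.09 | 111.07 |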
Quadrics/Get
Flood
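Msg Size (Bytes) | Queue depth | CPU Loops | CPU Time (usec) | Inv BW (usec) | BW (MB/sec) | Time to soln (sec)
1 | 8 | | | | |
2 | 8 | | | | |
4 | 8 | | | | |
8 | 8 | | | 3.267 | 2.3352906432 |
16 | 8 | | | 3.266 | 4.672011348 |
32 | 8 | | | 3.263 | 9.3526135841 |
64 | 8 | | | 3.27 | 18.6651853976 |
128 | 8 | | | 3.751 | 32.543405092 |
256 | 8 | | | 4.34 | 56.2536002304 |
512 | 8 | | | 5.528 | 88.3287355282 |
1024 | 8 | | | 7.855 | 124.3236791852 |
2048 | 8 | | | 13.197 | 147.9976509813 |
4096 | 8 | | | 22.863 | 170.8546559944 |
8192 | 8 | | | 42.931 | 181.9780578137 |
16384 | 8 | | | 83.859 | 186.3246640194 |
32768 | 8 | | | 164.928 | 189.4766201009 |
65536 | 8 | | | 326.64 | 191.3421503796 |
131072 | 8 | | | 654.201 | 191.0727742697 |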
Quadrics/Get
All Flood
Msg Size  Queue  CPU    CPU Time  Inv BW    BW              Time to
(Bytes)   depth  Loops  (usec)    (usec)    MB/sec          soln (sec)
1         1
2         1
4         1
8         1      0      0         6.502     1.1733919611    0.163
16        1      0      0         6.499     2.34786722      0.162
32        1      0      0         6.505     4.6914032475    0.163
64        1      0      0         6.529     9.3483161663    0.163
128       1      0      0         7.003     17.4311455805   0.175
256       1      0      0         7.584     32.1915381065   0.19
512       1      0      0         8.838     55.2479350532   0.221
1024      1      0      0         11.14     87.6627019749   0.279
2048      1      0      0         16.335    119.5668809305  0.408
4096      1      0      0         26.243    148.849216934   0.656
8192      1      0      0         46.079    169.5457800734  1.152
16384     1      0      0         87.171    179.2453912425  2.179
32768     1      0      0         168.69    185.251052226   4.217
65536     1      0      0         330.433   189.1457572337  8.261
131072    1      0      0         657.085   190.2341401797  16.427
1         2      0      0
2         2      0      0
4         2      0      0
8         2      0      0         4.66      1.6372091269    0.116
16        2      0      0         4.657     3.2765276063    0.116
32        2      0      0         4.658     6.5516483738    0.116
64        2      0      0         4.668     13.0752262746   0.117
128       2      0      0         5.145     23.7260082604   0.129
256       2      0      0         5.726     42.6372031086   0.143
512       2      0      0         6.964     70.1150560023   0.174
1024      2      0      0         9.267     105.3806517751  0.232
2048      2      0      0         14.562    134.1247768164  0.364
4096      2      0      0         24.347    160.4407113813  0.609
8192      2      0      0         44.187    176.805395252   1.105
16384     2      0      0         85.047    183.721941985   2.126
32768     2      0      0         166.748   187.4085446302  4.169
65536     2      0      0         328.253   190.4019155956  8.206
131072    2      0      0         655.219   190.775908513   16.38
1         4      0      0
2         4      0      0
4         4      0      0
8         4      0      0         3.731     2.044865862     0.093
16        4      0      0         3.73      4.0908281669    0.093
32        4      0      0         3.728     8.1860456344    0.093
64        4      0      0         3.74      16.3195604947   0.093
128       4      0      0         4.22      28.9266143365   0.105
256       4      0      0         4.804     50.820279975    0.12
512       4      0      0         6.009     81.2583208521   0.15
1024      4      0      0         8.322     117.3470920452  0.208
2048      4      0      0         13.65     143.0860805861  0.341
4096      4      0      0         23.351    167.2840563573  0.584
8192      4      0      0         43.399    180.0156685638  1.085
16384     4      0      0         84.129    185.7266816437  2.103
32768     4      0      0         166.132   188.1034358221  4.153
65536     4      0      0         327.33    190.9388079308  8.183
131072    4      0      0         653.903   191.1598509259  16.348
1         8      0      0
2         8      0      0
4         8      0      0
8         8      0      0         3.267     2.3352906432    0.082
16        8      0      0         3.266     4.672011348     0.082
32        8      0      0         3.263     9.3526135841    0.082
64        8      0      0         3.27      18.6651853976   0.082
128       8      0      0         3.751     32.543405092    0.094
256       8      0      0         4.34      56.2536002304   0.109
512       8      0      0         5.528     88.3287355282   0.138
1024      8      0      0         7.855     124.3236791852  0.196
2048      8      0      0         13.197    147.9976509813  0.33
4096      8      0      0         22.863    170.8546559944  0.572
8192      8      0      0         42.931    181.9780578137  1.073
16384     8      0      0         83.859    186.3246640194  2.096
32768     8      0      0         164.928   189.4766201009  4.123
65536     8      0      0         326.64    191.3421503796  8.166
131072    8      0      0         654.201   191.0727742697  16.355
1         16     0      0
2         16     0      0
4         16     0      0
8         16     0      0         6.156     1.2393428413    0.154
16        16     0      0         6.156     2.4786856827    0.154
32        16     0      0         6.144     4.9670537313    0.154
64        16     0      0         6.15      9.9244156504    0.154
128       16     0      0         6.636     18.3951646323   0.166
256       16     0      0         7.224     33.7957675803   0.181
512       16     0      0         8.422     57.9768760389   0.211
1024      16     0      0         10.74     90.9276070764   0.269
2048      16     0      0         16.073    121.5158962235  0.402
4096      16     0      0         25.765    151.6107122065  0.644
8192      16     0      0         45.809    170.5450893929  1.145
16384     16     0      0         86.808    179.9949313427  2.17
32768     16     0      0         168.288   185.6935729226  4.207
65536     16     0      0         329.354   189.7654195789  8.234
131072    16     0      0         654.312   191.0403599506  16.358
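The BW column in the Quadrics/Get flood tables appears to follow from the message size and the Inv BW column: bytes divided by microseconds per message, expressed in units of 2^20 bytes per second. The sketch below reproduces that conversion. The relation is inferred from the tabulated values rather than stated in the slides, and the function name `bandwidth_mib_per_s` is hypothetical.

```python
MIB = 1024 * 1024  # assumption: "MB" in these tables means 2**20 bytes


def bandwidth_mib_per_s(msg_bytes, inv_bw_usec):
    """Convert message size and per-message time (usec) to bandwidth."""
    bytes_per_sec = msg_bytes / (inv_bw_usec * 1e-6)
    return bytes_per_sec / MIB


# Two rows from the All Flood table: (Msg Size, Inv BW in usec).
for size, inv_bw in [(8, 6.502), (131072, 654.201)]:
    print(size, inv_bw, round(bandwidth_mib_per_s(size, inv_bw), 4))
```

The computed values can be compared against the BW column for the same rows (queue depth 1 at 8 bytes, queue depth 8 at 131072 bytes).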
Quadrics/Get
Overlap
Msg Size  Queue  CPU    CPU Time  Inv BW    Time to     Overhead
Bytes     depth  Loops  (usec)    (usec)    soln (sec)  usec
8         1      0      0         6.329     0.063       6.329
8         1      1      0.006     6.352     0.136       6.346
8         1      2      0.012     6.317     0.135       6.305
8         1      4      0.024     6.31      0.135       6.286
8         1      8      0.048     6.313     0.135       6.265
8         1      16     0.097     6.31      0.135       6.213
8         1      32     0.193     6.305     0.134       6.112
8         1      64     0.386     6.307     0.132       5.921
8         1      128    0.772     6.215     0.127       5.443
8         1      256    1.544     6.203     0.119       4.659
8         1      512    3.088     6.197     0.104       3.109
8         1      1024   6.177     6.797     0.079       0.62
8         2      0      0         4.491     0.045       4.491
8         2      1      0.006     4.39      0.116       4.384
8         2      2      0.012     4.4       0.116       4.388
8         2      4      0.024     4.388     0.116       4.364
8         2      8      0.048     4.395     0.116       4.347
8         2      16     0.097     4.39      0.115       4.293
8         2      32     0.193     4.389     0.114       4.196
8         2      64     0.386     4.39      0.112       4.004
8         2      128    0.772     4.388     0.109       3.616
8         2      256    1.544     4.39      0.101       2.846
8         2      512    3.088     4.557     0.087       1.469
8         2      1024   6.177     7.447     0.085       1.27
8         4      0      0         3.567     0.036       3.567
8         4      1      0.006     3.465     0.107       3.459
8         4      2      0.012     3.464     0.107       3.452
8         4      4      0.024     3.466     0.107       3.442
8         4      8      0.048     3.467     0.107       3.419
8         4      16     0.097     3.466     0.106       3.369
8         4      32     0.193     3.464     0.105       3.271
8         4      64     0.386     3.466     0.103       3.08
8         4      128    0.772     3.466     0.099       2.694
8         4      256    1.544     3.487     0.092       1.943
8         4      512    3.088     5.006     0.092       1.918
8         4      1024   6.177     8.029     0.091       1.852
8         8      0      0         3.101     0.031       3.101
8         8      1      0.006     3.01      0.102       3.004
8         8      2      0.012     2.997     0.102       2.985
8         8      4      0.024     3         0.102       2.976
8         8      8      0.048     3         0.102       2.952
8         8      16     0.097     3.002     0.101       2.905
8         8      32     0.193     3.002     0.101       2.809
8         8      64     0.386     3.001     0.099       2.615
8         8      128    0.772     3.004     0.095       2.232
8         8      256    1.544     3.715     0.094       2.171
8         8      512    3.088     5.262     0.094       2.174
8         8      1024   6.177     8.336     0.094       2.159
8         16     0      0         2.882     0.029       2.882
8         16     1      0.006     2.764     0.1         2.758
8         16     2      0.012     2.776     0.1         2.764
8         16     4      0.024     2.766     0.1         2.742
8         16     8      0.048     2.764     0.1         2.716
8         16     16     0.097     2.776     0.099       2.679
8         16     32     0.193     2.765     0.098       2.572
8         16     64     0.386     2.764     0.096       2.378
8         16     128    0.772     3.092     0.096       2.32
8         16     256    1.544     3.86      0.096       2.316
8         16     512    3.088     5.401     0.096       2.313
8         16     1024   6.177     8.476     0.095       2.299
Old data
Queue  CPU    CPU Time  Inv BW    Time to
depth  Loops  (usec)    (usec)    soln (sec)
1      0      0         6.074     0.134
1      50     0.304     6.042     0.13
1      100    0.609     6.013     0.127
1      150    0.913     6.01      0.124
1      200    1.217     6.001     0.121
1      250    1.522     6.016     0.118
1      300    1.826     6.022     0.115
1      350    2.13      6.043     0.113
1      400    2.435     6.015     0.109
1      450    2.739     6.031     0.106
1      500    3.044     6.042     0.103
1      550    3.348     6.067     0.101
1      600    3.652     6.005     0.097
1      650    3.957     6.014     0.094
1      700    4.261     6.015     0.09
1      750    4.565     6.022     0.088
1      800    4.87      6.02      0.084
1      850    5.174     6.022     0.081
1      900    5.478     6.177     0.08
1      950    5.783     6.46      0.08
1      1000   6.087     6.781     0.08
1      1050   6.391     7.098     0.08
1      1100   6.696     7.421     0.08
1      1150   7         7.688     0.08
1      1200   7.304     8.001     0.08
2      0      0         4.359     0.117
2      50     0.304     4.364     0.113
2      100    0.609     4.365     0.11
2      150    0.913     4.361     0.107
2      200    1.217     4.358     0.104
2      250    1.522     4.376     0.101
2      300    1.826     4.37      0.098
2      350    2.13      4.358     0.095
2      400    2.435     4.364     0.092
2      450    2.739     4.364     0.089
2      500    3.044     4.572     0.088
2      550    3.348     4.853     0.092
2      600    3.652     5.122     0.088
2      650    3.957     5.444     0.088
2      700    4.261     5.75      0.088
2      750    4.565     6.059     0.088
2      800    4.87      6.338     0.088
2      850    5.174     6.624     0.087
2      900    5.478     6.929     0.087
2      950    5.783     7.229     0.087
2      1000   6.087     7.534     0.087
2      1050   6.391     7.84      0.088
2      1100   6.696     8.15      0.088
2      1150   7         8.449     0.088
2      1200   7.304     8.746     0.088
3      0      0         5.702     0.129
3      50     0.304     5.702     0.126
3      100    0.609     5.698     0.123
3      150    0.913     5.699     0.12
3      200    1.217     5.698     0.117
3      250    1.522     5.702     0.114
3      300    1.826     5.7       0.111
3      350    2.13      5.916     0.11
3      400    2.435     6.185     0.11
3      450    2.739     6.485     0.11
3      500    3.044     6.796     0.11
3      550    3.348     7.099     0.11
3      600    3.652     7.401     0.11
3      650    3.957     7.706     0.11
3      700    4.261     8.018     0.11
3      750    4.565     8.304     0.11
3      800    4.87      8.611     0.11
3      850    5.174     8.914     0.11
3      900    5.478     9.212     0.11
3      950    5.783     9.511     0.11
3      1000   6.087     9.806     0.11
3      1050   6.391     10.12     0.11
3      1100   6.696     10.41     0.11
3      1150   7         10.71     0.11
3      1200   7.304     11.01     0.11
4      0      0         3.4       0.106
4      50     0.304     3.399     0.104
4      100    0.609     3.393     0.1
4      150    0.913     3.398     0.097
4      200    1.217     3.404     0.094
4      250    1.522     3.498     0.092
4      300    1.826     3.779     0.092
4      350    2.13      4.073     0.092
4      400    2.435     4.376     0.092
4      450    2.739     4.673     0.092
4      500    3.044     4.977     0.092
4      550    3.348     5.276     0.092
4      600    3.652     5.583     0.092
4      650    3.957     5.88      0.092
4      700    4.261     6.181     0.092
4      750    4.565     6.484     0.092
4      800    4.87      6.784     0.092
4      850    5.174     7.09      0.092
4      900    5.478     7.389     0.092
4      950    5.783     7.696     0.092
4      1000   6.087     7.995     0.092
4      1050   6.391     8.298     0.092
4      1100   6.696     8.599     0.092
4      1150   7         8.897     0.092
4      1200   7.304     9.198     0.092
5      0      0         3.222     0.105
5      50     0.304     3.215     0.102
5      100    0.609     3.231     0.1
5      150    0.913     3.225     0.096
5      200    1.217     3.298     0.093
5      250    1.522     3.581     0.093
5      300    1.826     3.882     0.093
5      350    2.13      4.183     0.093
5      400    2.435     4.482     0.093
5      450    2.739     4.782     0.093
5      500    3.044     5.086     0.093
5      550    3.348     5.387     0.093
5      600    3.652     5.687     0.093
5      650    3.957     5.993     0.093
5      700    4.261     6.296     0.093
5      750    4.565     6.595     0.093
5      800    4.87      6.899     0.093
5      850    5.174     7.191     0.093
5      900    5.478     7.499     0.093
5      950    5.783     7.806     0.093
5      1000   6.087     8.104     0.093
5      1050   6.391     8.405     0.093
5      1100   6.696     8.711     0.093
5      1150   7         9.009     0.093
5      1200   7.304     9.309     0.093
6      0      0         5.103     0.123
6      50     0.304     5.096     0.12
6      100    0.609     5.106     0.117
6      150    0.913     5.104     0.114
6      200    1.217     5.354     0.114
6      250    1.522     5.655     0.114
6      300    1.826     5.965     0.114
6      350    2.13      6.268     0.114
6      400    2.435     6.557     0.114
6      450    2.739     6.858     0.114
6      500    3.044     7.163     0.114
6      550    3.348     7.456     0.114
6      600    3.652     7.763     0.114
6      650    3.957     8.07      0.114
6      700    4.261     8.372     0.114
6      750    4.565     8.67      0.114
6      800    4.87      8.968     0.114
6      850    5.174     9.273     0.116
6      900    5.478     9.574     0.114
6      950    5.783     9.872     0.114
6      1000   6.087     10.17     0.114
6      1050   6.391     10.47     0.114
6      1100   6.696     10.77     0.114
6      1150   7         11.08     0.114
6      1200   7.304     11.38     0.114
7      0      0         6.013     0.133
7      50     0.304     6.015     0.13
7      100    0.609     6.014     0.127
7      150    0.913     6.109     0.125
7      200    1.217     6.406     0.124
7      250    1.522     6.703     0.124
7      300    1.826     7.015     0.124
7      350    2.13      7.305     0.124
7      400    2.435     7.61      0.124
7      450    2.739     7.908     0.124
7      500    3.044     8.209     0.124
7      550    3.348     8.508     0.124
7      600    3.652     8.813     0.124
7      650    3.957     9.122     0.125
7      700    4.261     9.424     0.124
7      750    4.565     9.741     0.125
7      800    4.87      10.02     0.124
7      850    5.174     10.32     0.124
7      900    5.478     10.62     0.124
7      950    5.783     10.92     0.124
7      1000   6.087     11.22     0.124
7      1050   6.391     11.52     0.124
7      1100   6.696     11.83     0.124
7      1150   7         12.13     0.124
7      1200   7.304     12.43     0.124
8      0      0         2.95      0.102
8      50     0.304     2.951     0.099
8      100    0.609     2.952     0.096
8      150    0.913     3.139     0.095
8      200    1.217     3.44      0.095
8      250    1.522     3.739     0.095
8      300    1.826     4.045     0.095
8      350    2.13      4.341     0.095
8      400    2.435     4.642     0.095
8      450    2.739     4.943     0.095
8      500    3.044     5.248     0.095
8      550    3.348     5.547     0.095
8      600    3.652     5.849     0.095
8      650    3.957     6.149     0.095
8      700    4.261     6.455     0.095
8      750    4.565     6.758     0.095
8      800    4.87      7.057     0.095
8      850    5.174     7.362     0.095
8      900    5.478     7.657     0.095
8      950    5.783     7.963     0.095
8      1000   6.087     8.267     0.095
8      1050   6.391     8.568     0.095
8      1100   6.696     8.867     0.095
8      1150   7         9.169     0.095
8      1200   7.304     9.472     0.095
9      0      0         10.9      0.182
9      50     0.304     10.89     0.179
9      100    0.609     10.9      0.175
9      150    0.913     11.17     0.175
9      200    1.217     11.46     0.175
9      250    1.522     11.76     0.175
9      300    1.826     12.06     0.175
9      350    2.13      12.35     0.175
9      400    2.435     12.65     0.175
9      450    2.739     12.95     0.175
9      500    3.044     13.3      0.176
9      550    3.348     13.66     0.176
9      600    3.652     13.93     0.176
9      650    3.957     14.14     0.175
9      700    4.261     14.45     0.175
9      750    4.565     14.75     0.175
9      800    4.87      15.05     0.175
9      850    5.174     15.35     0.175
9      900    5.478     15.65     0.175
9      950    5.783     15.95     0.175
9      1000   6.087     16.25     0.175
9      1050   6.391     16.57     0.175
9      1100   6.696     16.86     0.175
9      1150   7         17.16     0.175
9      1200   7.304     17.48     0.175
10     0      0         2.818     0.101
10     50     0.304     2.826     0.098
10     100    0.609     2.866     0.095
10     150    0.913     3.155     0.095
10     200    1.217     3.452     0.095
10     250    1.522     3.761     0.095
10     300    1.826     4.061     0.095
10     350    2.13      4.359     0.095
10     400    2.435     4.662     0.095
10     450    2.739     4.961     0.095
10     500    3.044     5.263     0.095
10     550    3.348     5.565     0.095
10     600    3.652     5.87      0.095
10     650    3.957     6.174     0.095
10     700    4.261     6.483     0.095
10     750    4.565     6.782     0.095
10     800    4.87      7.081     0.095
10     850    5.174     7.378     0.095
10     900    5.478     7.681     0.095
10     950    5.783     7.989     0.095
10     1000   6.087     8.289     0.095
10     1050   6.391     8.596     0.095
10     1100   6.696     8.891     0.095
10     1150   7         9.191     0.095
10     1200   7.304     9.492     0.095
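In the Quadrics/Get overlap measurements for 8-byte messages, the Overhead figures appear to be the per-message time (Inv BW) with the inserted CPU time subtracted, i.e. the part of each operation that the added computation did not hide. The sketch below shows that subtraction. It is inferred from the tabulated values rather than from the benchmark source, and the name `overhead_usec` is hypothetical.

```python
def overhead_usec(inv_bw_usec, cpu_time_usec):
    """Per-message time not covered by the inserted CPU work (usec).

    Assumption: Overhead = Inv BW - CPU Time, as the table values suggest.
    """
    return inv_bw_usec - cpu_time_usec


# (CPU Loops, CPU Time, Inv BW) for queue depth 1, taken from the overlap data.
samples = [(1, 0.006, 6.352), (512, 3.088, 6.197), (1024, 6.177, 6.797)]
for loops, cpu_time, inv_bw in samples:
    print(loops, round(overhead_usec(inv_bw, cpu_time), 3))
```

As the CPU loop count grows, the remaining overhead shrinks, which is the behavior the overlap test is meant to expose.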
Giganet
T3E/MPI
Ping Pong
PingPong  Gap   O_Send  O_Rec  Latency
13.05     6.73  6.05    4.961  -4.49
Updated 2/30/2002  overlap (qd=1)
Updated 2/23/02  ping (still nonsymmetric version)
Updated 2/30/2002  flood
Msg Size  Inv BW    BW
(Bytes)   (usec)    MB/sec
1         13.046    0.146
2         13.06     0.292
4         13.056    0.584
8         13.052    1.169
16        14.354    2.126
32        14.416    4.234
64        24.226    5.039
128       25.206    9.686
256       461.04    1.059
512       27.852    35.064
1024      31.432    62.139
2048      37.338    104.617
4096      50.226    155.546
8192      76.392    204.537
16384     130.132   240.142
32768     232.734   268.547
65536     439.454   284.443