amd epyc, with industry-leading core count and memory€¦ · 3. based on amd internal testing of...

2
© Copyright 2017-2019 AMD. All Rights Reserved. 1 AMD EPYC DELIVERS EXCELLENT PERFORMANCE FOR HPC WORKLOADS AMD EPYC, with industry-leading core count and memory bandwidth enables HPC performance enhancements. ACCELERATION IS EVERYTHING Once the domain of sciensts, high-performance compung (HPC) workloads are used by many organizaons, from oil and gas companies and financial instuons to weather and climate modeling services, genome sequencing companies, and universies. These innovave applicaons require the capability to process very large data sets and quickly run compute-intensive models and analysis techniques. IT INFRASTRUCTURE CHALLENGES FOR HPC Although processor and system technology improved incrementally over the last decade, there haven’t been core architectural advancements to efficiently support HPC workloads. Even with modern systems, HPC workloads connue to be challenged by: { Insufficient memory bandwidth to keep CPU compute engines occupied { Inadequate core density, requiring massive scale-out soluons to complete HPC tasks { Growing need for GPU acceleraon for highly parallel workloads { Poorly opmized I/O { Lack of data security during computaon WHY AMD EPYC FOR HPC? The AMD EPYC™ processor family balances the raos of cores, memory, I/O bandwidth, and deploys security features embedded in silicon to achieve opmized performance for today’s HPC applicaons. ENHANCED CORE DENSITY { Supports 8-32 cores per socket to deliver massively parallel performance { Offers more cores in the same server rack space as other 1RU and 2RU servers UP TO 33% MORE MEMORY BANDWIDTH 1 { Uses 8 memory channels to speed the flow of data into and out of the CPU { Virtually eliminates memory bolenecks and unlocks applicaon performance HIGHLY SCALABLE I/O { Offers up to 128 lanes of PCIe® bandwidth without the need for a switch { Supports high-bandwidth network interfaces, giving HPC workloads quick access to data { Directly aaches up to 32 NVMe or SATA devices to opmize I/O and efficiently handle storage needs. EMBEDDED SECURITY PROCESSOR { Full memory encrypon with no changes needed to your applicaons { Secure root-of-trust technology to help securely boot soſtware 2 TB OF MEMORY 8–32 CORES 128 LANES OF PCIe BANDWIDTH x16 x16 x16 x16 x16 x16 x16 x16 AMD EPYC Processor 8 MEMORY CHANNELS { SYSTEM-ON-CHIP (SOC) DESIGN { UP TO 32 CORES { UP TO 16 DIMMS (2 TB) OF MEMORY PER SOCKET { 8 MEMORY CHANNELS FOR UP TO 33% MORE BANDWIDTH 2 { 128 LANES OF PCIE® BANDWIDTH { SERVER CONTROLLER HUB { DEDICATED, EMBEDDED SECURITY PROCESSOR WITH SECURE BOOT AND FULL MEMORY ENCRYPTION THROUGH ON-CHIP MEMORY CONTROLLERS

Upload: others

Post on 07-Aug-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: AMD EPYC, with industry-leading core count and memory€¦ · 3. Based on AMD internal testing of ANSYS FLUENT v19.1 as of January 20, 2019 and Intel results on ANSYS FLUENT v19.0

copy Copyright 2017-2019 AMD All Rights Reserved 1

AMD EPYC DELIVERS EXCELLENT PERFORMANCE FOR HPC WORKLOADS

AMD EPYC with industry-leading core count and memory bandwidth enables HPC performance enhancementsACCELERATION IS EVERYTHINGOnce the domain of scientists high-performance computing (HPC) workloads are used by many organizations from oil and gas companies and financial institutions to weather and climate modeling services genome sequencing companies and universities These innovative applications require the capability to process very large data sets and quickly run compute-intensive models and analysis techniques

IT INFRASTRUCTURE CHALLENGES FOR HPCAlthough processor and system technology improved incrementally over the last decade there havenrsquot been core architectural advancements to efficiently support HPC workloads Even with modern systems HPC workloads continue to be challenged by

Insufficient memory bandwidth to keep CPU compute engines occupied

Inadequate core density requiring massive scale-out solutions to complete HPC tasks

Growing need for GPU acceleration for highly parallel workloads

Poorly optimized IO

Lack of data security during computation

WHY AMD EPYC FOR HPCThe AMD EPYCtrade processor family balances the ratios of cores memory IO bandwidth and deploys security features embedded in silicon to achieve optimized performance for todayrsquos HPC applications

ENHANCED CORE DENSITY Supports 8-32 cores per socket to deliver massively parallel performance

Offers more cores in the same server rack space as other 1RU and 2RU servers

UP TO 33 MORE MEMORY BANDWIDTH1 Uses 8 memory channels to speed the flow of data into and out of the CPU

Virtually eliminates memory bottlenecks and unlocks application performance

HIGHLY SCALABLE IO Offers up to 128 lanes of PCIereg bandwidth without the need for a switch

Supports high-bandwidth network interfaces giving HPC workloads quick access to data

Directly attaches up to 32 NVMe or SATA devices to optimize IO and efficiently handle storage needs

EMBEDDED SECURITY PROCESSOR Full memory encryption with no changes needed to your applications

Secure root-of-trust technology to help securely boot software

2 TB

OF

MEM

ORY

8ndash32 CORES

128 LANES OF PCIe BANDWIDTH

x16 x16 x16 x16

x16 x16 x16 x16

AMD EPYC Processor

8 M

EMOR

Y CH

ANN

ELS

SYSTEM-ON-CHIP (SOC) DESIGN

UP TO 32 CORES

UP TO 16 DIMMS (2 TB) OF MEMORY PER SOCKET

8 MEMORY CHANNELS FOR UP TO 33 MORE BANDWIDTH2

128 LANES OF PCIEreg BANDWIDTH

SERVER CONTROLLER HUB

DEDICATED EMBEDDED SECURITY PROCESSOR WITH SECURE BOOT AND FULL MEMORY ENCRYPTION THROUGH ON-CHIP MEMORY CONTROLLERS

AMD EPYC DELIVERS EXCELLENT PERFORMANCE FOR HPC WORKLOADS

FOOTNOTES1 AMD EPYC 7601 processor supports up to 8 channels of DDR4-2667 versus

the Xeon Platinum 8180 processor at 6 channels of DDR4-2667 NAP-42

2 In AMD internal testing on STREAM Triad on an AMD ldquoEthanolrdquo reference system with 2 x EPYC 7551 CPU in 256 GB (16 x 16GB) DDR2666 memory using the GCC v72 compiler Ubuntu 1604 1002E BIOS which achieved 293081 MBs NAP-94

3 Based on AMD internal testing of ANSYS FLUENT v191 as of January 20 2019 and Intel results on ANSYS FLUENT v190 published by ANSYS as of January 20 2019 2068 Core Solver Rating on combustor_71m benchmark using 16 x AMD EPYC Processor 7451 (24-core 23GHz) in 8 servers (2 processors per server) 256GB DDR4-2666 memory per server Mellanox ConnectX-5 EDR 100Gb InfiniBand x16 PCIe per server 1 x 256GB NVMe (OS storage) per server 1 x 1TB NVMe (Data storage) per server Red Hatreg Enterprise Linux 75 MLNX_OFED_LINUS-43-3031 OFED Driver Mellanox EDR 100Gbs Managed Switch (MSB7800-ES2F) ANSYS FLUENT v191 SMT=OFF Boost=ON Determinism Slider = Power Transparent Huge Pages=ON Swappiness=0 Governor = Performance 1392650 Core Solver Rating on combustor_71m benchmark using 16 x Intel Xeon Gold Processor

model 6148 (20-core 24 GHz) in 8 Cray XC50 servers (2 processors per server) Cray Linux Enterprise 60 update 07 based on SUSE 12 SP3 Cray Aries network FLUENT AVX2 binary httpswwwansyscomsolutionssolutions-by-roleit-professionalsplatform-supportbenchmarks-overviewansys-fluent-benchmarksansys-fluent-benchmarks-release-19flow-through-combustor-71m Testing with other EPYC or Intel parts may result in different performance results NAP-138

4 Based on SPEC CPUreg 2017 scores published on wwwspecorg as of January 20 2019 AMD based system scored 268 results found at httpwww specorgcpu2017resultsres2018q4cpu2017-20180917-08862html Xeon based system scored 216 results found at httpwwwspecorgcpu2017resultsres2018q3cpu2017-20180809-08277html See www specorg for more information NAP-126

CONSISTENT FEATURE SET Simultaneous multithreading (SMT) 8 memory channels and 128 PCIereg lanes across SKUs

Balance of compute capabilities and economics without sacrificing features

SCALABILITY FOR HPC WORKLOADS

PARTCORES THREADS

BASE FREQ (GHz)

MAX BOOST (GHz)

MAX DDR FREQ (1DPC)

PCIE GEN3 LANES

TDP (W)

7601 32 64 220 320 2666 128 180

7551 32 64 200 300 2666 128 180

7501 32 64 200 300 24002666 128 155170

7451 24 48 230 320 2666 128 180

7401 24 48 200 300 24002666 128 155170

7371 16 32 310 380 2666 128 200

7351 16 32 240 290 24002666 128 155170

7301 16 32 220 270 24002666 128 155170

7281 16 32 210 270 24002666 128 155170

7261 8 16 250 290 24002666 128 155170

FOR MORE INFORMATIONFor more information visit amdcomEPYCserver

copy2017ndash2019 Advanced Micro Devices Inc All rights reserved AMD the AMD Arrow logo EPYC and combinations thereof are trademarks of Advanced Micro Devices Inc PCIe is a registered trademark of PCI-SIG Corporation SPECreg and the benchmarks SPECratereg and SPECfpreg are registered trademarks of the Standard Performance Evaluation Corporation For more information go to specorg Other names are for informational purposes only and may be trademarks of their respective owners LE-62008-01 0219

OUTSTANDING PERFORMANCE

DRAMATIC IMPROVEMENT FOR MEMORY-BOUND HPC APPLICATIONS

WORLD-RECORD FLOATING-POINT PERFORMANCE

EXCEPTIONAL BANDWIDTH AS MEASURED BY THE STREAM BENCHMARK

48 higher computational fluid dynamics performance than the Intel Xeon Gold processor6148 on the Ansys Fluent benchmark3

AMD EPYC 7551 delivers high memory throughput of over 290 GBs as measured by STREAM Triad2

24 higher SPECratereg2017_fp_base performanceon AMD EPYC 7601 compared to Intel Xeon Gold 6148 processors4

Page 2: AMD EPYC, with industry-leading core count and memory€¦ · 3. Based on AMD internal testing of ANSYS FLUENT v19.1 as of January 20, 2019 and Intel results on ANSYS FLUENT v19.0

AMD EPYC DELIVERS EXCELLENT PERFORMANCE FOR HPC WORKLOADS

FOOTNOTES1 AMD EPYC 7601 processor supports up to 8 channels of DDR4-2667 versus

the Xeon Platinum 8180 processor at 6 channels of DDR4-2667 NAP-42

2 In AMD internal testing on STREAM Triad on an AMD ldquoEthanolrdquo reference system with 2 x EPYC 7551 CPU in 256 GB (16 x 16GB) DDR2666 memory using the GCC v72 compiler Ubuntu 1604 1002E BIOS which achieved 293081 MBs NAP-94

3 Based on AMD internal testing of ANSYS FLUENT v191 as of January 20 2019 and Intel results on ANSYS FLUENT v190 published by ANSYS as of January 20 2019 2068 Core Solver Rating on combustor_71m benchmark using 16 x AMD EPYC Processor 7451 (24-core 23GHz) in 8 servers (2 processors per server) 256GB DDR4-2666 memory per server Mellanox ConnectX-5 EDR 100Gb InfiniBand x16 PCIe per server 1 x 256GB NVMe (OS storage) per server 1 x 1TB NVMe (Data storage) per server Red Hatreg Enterprise Linux 75 MLNX_OFED_LINUS-43-3031 OFED Driver Mellanox EDR 100Gbs Managed Switch (MSB7800-ES2F) ANSYS FLUENT v191 SMT=OFF Boost=ON Determinism Slider = Power Transparent Huge Pages=ON Swappiness=0 Governor = Performance 1392650 Core Solver Rating on combustor_71m benchmark using 16 x Intel Xeon Gold Processor

model 6148 (20-core 24 GHz) in 8 Cray XC50 servers (2 processors per server) Cray Linux Enterprise 60 update 07 based on SUSE 12 SP3 Cray Aries network FLUENT AVX2 binary httpswwwansyscomsolutionssolutions-by-roleit-professionalsplatform-supportbenchmarks-overviewansys-fluent-benchmarksansys-fluent-benchmarks-release-19flow-through-combustor-71m Testing with other EPYC or Intel parts may result in different performance results NAP-138

4 Based on SPEC CPUreg 2017 scores published on wwwspecorg as of January 20 2019 AMD based system scored 268 results found at httpwww specorgcpu2017resultsres2018q4cpu2017-20180917-08862html Xeon based system scored 216 results found at httpwwwspecorgcpu2017resultsres2018q3cpu2017-20180809-08277html See www specorg for more information NAP-126

CONSISTENT FEATURE SET Simultaneous multithreading (SMT) 8 memory channels and 128 PCIereg lanes across SKUs

Balance of compute capabilities and economics without sacrificing features

SCALABILITY FOR HPC WORKLOADS

PARTCORES THREADS

BASE FREQ (GHz)

MAX BOOST (GHz)

MAX DDR FREQ (1DPC)

PCIE GEN3 LANES

TDP (W)

7601 32 64 220 320 2666 128 180

7551 32 64 200 300 2666 128 180

7501 32 64 200 300 24002666 128 155170

7451 24 48 230 320 2666 128 180

7401 24 48 200 300 24002666 128 155170

7371 16 32 310 380 2666 128 200

7351 16 32 240 290 24002666 128 155170

7301 16 32 220 270 24002666 128 155170

7281 16 32 210 270 24002666 128 155170

7261 8 16 250 290 24002666 128 155170

FOR MORE INFORMATIONFor more information visit amdcomEPYCserver

copy2017ndash2019 Advanced Micro Devices Inc All rights reserved AMD the AMD Arrow logo EPYC and combinations thereof are trademarks of Advanced Micro Devices Inc PCIe is a registered trademark of PCI-SIG Corporation SPECreg and the benchmarks SPECratereg and SPECfpreg are registered trademarks of the Standard Performance Evaluation Corporation For more information go to specorg Other names are for informational purposes only and may be trademarks of their respective owners LE-62008-01 0219

OUTSTANDING PERFORMANCE

DRAMATIC IMPROVEMENT FOR MEMORY-BOUND HPC APPLICATIONS

WORLD-RECORD FLOATING-POINT PERFORMANCE

EXCEPTIONAL BANDWIDTH AS MEASURED BY THE STREAM BENCHMARK

48 higher computational fluid dynamics performance than the Intel Xeon Gold processor6148 on the Ansys Fluent benchmark3

AMD EPYC 7551 delivers high memory throughput of over 290 GBs as measured by STREAM Triad2

24 higher SPECratereg2017_fp_base performanceon AMD EPYC 7601 compared to Intel Xeon Gold 6148 processors4