Ansys® Fluent® on 3rd Generation Intel® Xeon® Scalable Processors



Addressing computational fluid-dynamics challenges

Computational fluid dynamics (CFD) is the practice of modeling steady or unsteady fluid flows in a wide variety of engineering disciplines. Design engineers use CFD software to simulate and analyze how products will perform as gas and liquid flow around them. These workloads involve complex, unstructured meshes with tens of millions of cells, and they demand high memory bandwidth for solvers to perform efficiently.

Intel’s outstanding hardware, software, and ecosystem help manufacturers design better products faster and on budget. With increases in memory bandwidth, system memory, and instructions per clock (IPC), 3rd Generation Intel Xeon Scalable processors deliver outstanding Ansys Fluent performance compared to previous-generation processors. 3rd Generation Intel Xeon Scalable processors also offer built-in high-performance computing (HPC) and artificial intelligence (AI) acceleration, in addition to built-in configuration flexibility with Intel® Speed Select Technology.

Ansys Fluent users have seen measured performance improvements of up to 19 percent from optimizations with the popular Intel Math Kernel Library (Intel MKL),¹ which ensures new instructions like Intel Advanced Vector Extensions 512 (Intel AVX-512) work seamlessly for developers. In addition, Ansys collaborates closely with Intel to ensure Fluent is optimized to perform at scale on Intel architecture.

Intel HPC leadership

Intel’s unmatched portfolio and broad ecosystem help users:

• Solve complex problems faster

• Expand design space to gain new insights

• Meet deadlines without compromising quality

Outstanding performance

With eight DDR4 memory channels, up to 40 cores per socket, increased cache sizes, and a 20 percent increase in instructions per clock compared to previous-generation processors, 3rd Generation Intel Xeon Scalable processors deliver exceptional performance for a range of applications. The latest generation is configured to support up to 6 TB of system memory per processor with support for Intel® Optane™ persistent memory (PMem) 200 series. Built-in Intel Speed Select Technology, plus a special SKU engineered for liquid-cooled systems, provides unparalleled flexibility.

Built-in acceleration

Only Intel Xeon processors support Intel AVX-512 instructions, for 2x the instructions completed per cycle, versus Intel AVX2. In one study, Ansys Fluent saw a 19 percent speedup based on this feature, which is enabled by Intel MKL.¹

Unmatched ecosystem

Intel has engaged for decades with software providers like Ansys, who optimize their applications for Intel architecture. As a result, users achieve greater return on investment (ROI) from software licenses, while improving performance and scalability, and developers get a better out-of-box experience.

Ansys Fluent on Intel

• Up to 19% speedup based on Intel MKL and Intel AVX-512¹

• Up to 54% better performance than previous-generation processors²

Key features:

• 20% higher instructions per clock

• Up to 40 cores per socket

• Built-in acceleration with Intel AVX-512

Application Brief | Ansys® Fluent® on 3rd Generation Intel® Xeon® Scalable Processors

High-Performance Computing (HPC): Manufacturing

About Ansys Fluent

Ansys Fluent software contains the diverse physical modeling capabilities needed to model flow, turbulence, heat transfer, and reactions for industrial applications. These range from airflow over an aircraft wing to combustion in a furnace, from bubble columns to oil platforms, from blood flow to semiconductor manufacturing, and from cleanroom design to wastewater treatment plants. Ansys Fluent also includes capabilities to model in-cylinder combustion, aero-acoustics, turbomachinery, and multiphase systems.

Ansys Fluent helps HPC users solve complex, large-model CFD simulation problems quickly and cost-effectively. Fluent is highly scalable, setting a world supercomputing record by scaling to 172,000 cores on an Intel-based system.³ With the Ansys Fluent experience, novices and expert users alike can run fluids simulations in less time and with less training than ever before.

Optimizing Ansys Fluent performance with Intel

CFD workloads tend to be memory bandwidth–bound, where increased memory channels matter more than core counts or clock speeds. With an increase from 6 to 8 memory channels, in addition to 6 TB of total system memory, 3rd Generation Intel Xeon Scalable processors help memory-hungry Ansys Fluent workloads perform optimally.
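As a rough illustration of why the extra memory channels matter, the theoretical peak DRAM bandwidth per socket can be estimated as channels × transfer rate × 8 bytes per transfer (a DDR4 channel is 64 bits wide). The sketch below is a back-of-the-envelope calculation, not a measurement. The function name is hypothetical, and the DDR4 speeds are assumptions taken from the test configurations in the footnotes (2,933 MT/s on the 6-channel previous generation, 3,200 MT/s on the 8-channel 3rd Generation parts). Sustained bandwidth in a real solver run is lower than these peaks.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int,
                       bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth per socket in GB/s (1 GB = 1e9 bytes).

    Assumes every channel is populated and runs at the stated transfer rate.
    """
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# Assumed DIMM speeds, taken from the footnoted test configurations.
prev_gen = peak_bandwidth_gbs(channels=6, mega_transfers_per_s=2933)
third_gen = peak_bandwidth_gbs(channels=8, mega_transfers_per_s=3200)

print(f"6 channels @ 2,933 MT/s: {prev_gen:.1f} GB/s")
print(f"8 channels @ 3,200 MT/s: {third_gen:.1f} GB/s")
print(f"Peak ratio: {third_gen / prev_gen:.2f}x")
```

Under these assumptions, the peak per-socket figure rises from about 141 GB/s to about 205 GB/s, which is the kind of headroom a bandwidth-bound solver can use.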

In addition, Ansys Fluent benefits from Intel AVX-512—available only with Intel—which doubles the amount of work completed per instruction and is seamlessly available out of the box for engineers using the Ansys Fluent 2020 R2 release.
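The doubling comes from register width: an Intel AVX-512 register is 512 bits wide, while an Intel AVX2 register is 256 bits wide, so one vector instruction can operate on twice as many floating-point values. A minimal sketch of that arithmetic follows; the function name is hypothetical and used only for illustration.

```python
def lanes(register_bits: int, element_bits: int) -> int:
    """Number of elements of a given size that fit in one vector register."""
    return register_bits // element_bits


for name, bits in (("Intel AVX2", 256), ("Intel AVX-512", 512)):
    doubles = lanes(bits, 64)   # 64-bit double-precision values
    floats = lanes(bits, 32)    # 32-bit single-precision values
    print(f"{name}: {doubles} doubles or {floats} floats per register")
```

This is a per-instruction upper bound. Whole-application gains depend on how much of the run is vectorized and on memory bandwidth, which is consistent with the measured Fluent improvement of up to 19 percent cited in this brief.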

Because commercial software is often licensed per core, some users choose lower-core-count CPUs with higher frequencies from the Intel Xeon Gold 6300 processor family, which offers 8–32 cores and up to 3.7 GHz turbo frequencies. For customers who require the highest performance levels, Intel Xeon Platinum 8300 processors offer 32–40 cores, with up to 3.7 GHz frequencies. The new core microarchitecture delivers outstanding performance per core, a critical factor for users investing in top applications like Fluent.

Cluster scaling can reduce Ansys Fluent simulation time from days to hours or minutes. As shown in Figure 2, in combination with High Data Rate (HDR) InfiniBand fabric and Intel MPI Library, Intel Xeon Platinum 8360Y processors provide nearly ideal scalability out to more than 2,000 cores for the largest Fluent workloads. Super-linear scaling is also possible, as shown with aircraft_wing_14m achieving more than 32 times the single-node performance on 32 cluster nodes.
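The scaling terms used here can be made concrete with a short sketch. Speedup is the solver rate on N nodes divided by the single-node rate, and parallel efficiency is speedup divided by N. Ideal scaling gives an efficiency of 1.0, and super-linear scaling gives a value above 1.0. The function names and the solver rates below are hypothetical placeholders, not values read from Figure 2; only the node count and the 72 cores per node come from this brief.

```python
CORES_PER_NODE = 72  # Intel Xeon Platinum 8360Y node: 2 sockets x 36 cores


def speedup(rate_n_nodes: float, rate_one_node: float) -> float:
    """Solver-rate speedup relative to a single node."""
    return rate_n_nodes / rate_one_node


def parallel_efficiency(rate_n_nodes: float, rate_one_node: float,
                        nodes: int) -> float:
    """Speedup per node: 1.0 is ideal, above 1.0 is super-linear."""
    return speedup(rate_n_nodes, rate_one_node) / nodes


# Hypothetical solver rates for illustration only.
one_node_rate = 100.0
rate_on_32_nodes = 3300.0
nodes = 32

s = speedup(rate_on_32_nodes, one_node_rate)
e = parallel_efficiency(rate_on_32_nodes, one_node_rate, nodes)

print(f"Total cores: {nodes * CORES_PER_NODE}")
print(f"Speedup on {nodes} nodes: {s:.1f}x")
print(f"Parallel efficiency: {e:.2f} ({'super-linear' if e > 1.0 else 'sub-linear or ideal'})")
```

Super-linear results of this kind typically appear when the per-node share of the mesh shrinks enough to fit better in cache as nodes are added.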

Image courtesy of ANSYS, Inc.


Figure 1. Normalized Ansys Fluent 2021 R1 per-core performance.² Chart title: Generational performance improvements. Y-axis: relative performance per core (normalized; higher is better), scale 0.00 to 2.50. Processors compared: Intel Xeon E5-2697 v4 (18 cores), Intel Xeon Gold 6148 (20 cores), Intel Xeon Platinum 8268 (24 cores), Intel Xeon Platinum 8358 (32 cores). Values shown: 1.00, 1.29, 1.56, 2.33.

Figure 2. Node scaling for Ansys Fluent 2021 R1.⁴ Chart title: Node scaling (near ideal scalability at 2,000+ cores). X-axis: number of nodes, 0 to 32, Intel Xeon Platinum 8360Y (72 cores per node). Y-axis: normalized solver rate (higher is better), scale 0.00 to 35.00. Benchmarks plotted: Aircraft Wing 14m, F1 Racecar 140m, Exhaust system 33m, Combustor 12m, Sedan 4m.


Learn more

For more information about Intel Xeon Scalable processors for HPC, visit intel.com/content/www/us/en/high-performance-computing/processors.html.

For details on Intel software tools and libraries, visit intel.com/content/www/us/en/software/software-overview.html.

For more information about Ansys Fluent, visit ansys.com/products/fluids/ansys-fluent.

Conclusion

Intel architecture offers an outstanding combination of increased memory bandwidth and instructions per clock, in addition to out-of-the-box optimizations, resulting in optimized performance for Ansys Fluent users. 3rd Generation Intel Xeon Scalable processors offer increased core counts over previous-generation processors, in addition to support for Intel Optane PMem 200 series.

Only Intel Xeon processors support the Intel AVX-512 instruction set for double the FLOPS per cycle of traditional Intel AVX2.¹ In addition, Ansys Fluent is optimized by using Intel tools to run optimally on Intel architecture. Fluent sees exceptional scaling on Intel Xeon Scalable processors, and Intel MKL, the fastest and most widely used math library for Intel-based systems, simplifies development and helps ensure that new instruction-set architectures (ISAs) just work.⁵

Together, Intel and Ansys help Fluent users reduce project timelines while ensuring high-fidelity modeling for faster delivery of better products.

¹ Intel. “Run Your Ansys Fluent Simulations at Top Speed.” June 2020. ansys.com/content/dam/product/fluids/fluent/run-your-ansys-fluent-simulations-at-top-speed.pdf.

² All Ansys Fluent 2021 R1 runs were conducted with turbo (or turbo boost) enabled, hyper-threading (or core multi-threading) enabled, all physical cores utilized, and one rank per physical core.

Intel Xeon processor E5-2697 v4 configuration tested as of January 18, 2021: Intel Xeon processor E5-2697 v4 (18 cores, 2.3 GHz base, 3.6 GHz max, 145 W); RAM: 128 GB (8 x 16 GB 2,400 MHz DDR4); BIOS: SE5C610.86B.01.01.0028.121720182203; microcode: 0xb000030; operating system: CentOS Linux 8.3.2011; kernel: 4.18.0-240.1.1.el8_3.crt1.x86_64.

Intel Xeon Gold 6148 processor configuration tested as of January 18, 2021: Intel Xeon Gold 6148 processor (20 cores, 2.4 GHz base, 3.7 GHz max, 150 W); RAM: 192 GB (12 x 16 GB 2,666 MHz DDR4); BIOS: SE5C620.86B.02.01.0008.031920191559; microcode: 0x2000065; operating system: CentOS Linux 8.3.2011; kernel: 4.18.0-240.1.1.el8_3.crt1.x86_64.

Intel Xeon Platinum 8268 processor configuration tested as of January 18, 2021: Intel Xeon Platinum 8268 processor (24 cores, 2.9 GHz base, 3.9 GHz max, 205 W); RAM: 192 GB (12 x 16 GB 2,933 MHz DDR4); BIOS: SE5C620.86B.02.01.0012.070720200218; microcode: 0x5002f01; operating system: CentOS Linux 8.3.2011; kernel: 4.18.0-240.1.1.el8_3.crt1.x86_64.

Intel Xeon Platinum 8358 processor configuration tested as of March 21, 2021: Intel Xeon Platinum 8358 processor (32 cores, 2.6 GHz base, 2.6 GHz max, 250 W); RAM: 256 GB (16 x 16 GB 3,200 megatransfers per second [MT/s] DDR4); BIOS: SE5C6200.86B.2021.D40.2103100308; microcode: 0x8d055260; operating system: CentOS Linux release 8.3.2011; kernel: 4.18.0-240.1.1.el8_3.crt1.x86_64.

³ Ansys. “Ansys, HLRS And Cray Set New Supercomputing Record.” November 2016. ansys.com/about-ansys/news-center/11-15-16-ansys-hlrs-cray-set-new-supercomputing-record.

⁴ Based on Intel testing as of May 16, 2021. All Ansys Fluent 2021 R1 runs were conducted with turbo (or turbo boost) enabled, hyper-threading (or core multi-threading) enabled, all physical cores utilized, and one rank per physical core. Configuration: Intel Xeon Platinum 8360Y processor (36 cores per socket, 1.8 GHz base, 2.4 GHz max); RAM: 256 GB (16 x 16 GB 3,200 MT/s DDR4); BIOS: SE5C6200.86B.0021.D40.2101090208; microcode: 0xd0001e0; operating system: CentOS Linux release 8.3.2011; kernel: 4.18.0-240.22.1.el8_3.crt1.x86_64, running Intel MKL 2020 and Intel MPI 2019u8.

⁵ Data from Evans Data Software Developer survey, 2020.

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.

Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure.

Your costs and results may vary.

Intel technologies may require enabled hardware, software or service activation.

Intel Advanced Vector Extensions (Intel AVX) provides higher throughput to certain processor operations. Due to varying processor power characteristics, utilizing AVX instructions may cause a) some parts to operate at less than the rated frequency and b) some parts with Intel Turbo Boost Technology 2.0 to not achieve any or maximum turbo frequencies. Performance varies depending on hardware, software, and system configuration. Learn more at http://www.intel.com/go/turbo.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

Printed in USA 0821/MB/PRW/PDF Please Recycle 344926-002US
