Digital manufacturing technology and convenient access to ... · about the benefits and best...


Post on 08-Aug-2020


Digital manufacturing technology and convenient access to High Performance Computing (HPC) in industry R&D are essential to increasing the quality of our products and the competitiveness of our companies. Progress can only be achieved by educating our engineers, especially those in the “missing middle,” and by making HPC easier to access and use for everyone who can benefit from this advanced technology.

The UberCloud HPC Experiment actively promotes the wider adoption of digital manufacturing technology. It is an example of a grassroots effort to foster collaboration among engineers, HPC experts, and service providers to address challenges at scale. The experiment started in mid-2012 with the aim of exploring the end-to-end process that digital manufacturing engineers follow to access and use remote computing resources in HPC centers and in the cloud. Since then, it has attracted 500 organizations and individuals from 48 countries, and over 80 teams have been involved so far.

Each team consists of an industry end-user and a software provider; the organizers match them with a well-suited resource provider and an HPC expert. Together, the team members work on the end-user’s application: defining the requirements, implementing the application on the remote HPC system, running and monitoring the job, getting the results back to the end-user, and writing a case study.

Intel decided to sponsor a Compendium of 25 case studies, including the one you are reading, to raise awareness in the digital manufacturing community about the benefits and best practices of using remote HPC capabilities. This document is an invaluable resource for engineers, managers and executives who believe in the strategic importance of this technology for their organizations. You can download it at: http://tci.taborcommunications.com/UberCloud_HPC_Experiment

Very special thanks to Wolfgang Gentzsch and Burak Yenier for making the UberCloud HPC Experiment possible. This HPC UberCloud Compendium of Case Studies has been sponsored by Intel and produced in conjunction with Tabor Communications Custom Publishing, which includes HPCwire, HPC in the Cloud, and Digital Manufacturing Report.

If you are interested in participating in this experiment, either actively as a team member or passively as an observer, please register at http://www.hpcexperiment.com


Development of Stents for a Narrowed Artery

“For an Abaqus user using SGI Cyclone this is a viable solution for both compute and visualization.”

MEET THE TEAM
End User – Anonymous
Software Provider – Matt Dunbar. Dunbar is Chief Architect at SIMULIA.
Resource Providers – Tony DeVarco and Eugene Kremenetsky. DeVarco is Senior Manager for Strategic Partners and Cloud Computing at SGI. Kremenetsky is Systems Engineering Technical Lead at SGI.
HPC/CAE Experts – Scott Shaw and Gregory Shirin. Shaw is a Senior Applications Engineer at SGI. Shirin, the HPCExperiment team mentor, is a senior consultant with Grid Dynamics.

USE CASE
This project focused on simulating stent deployment using SIMULIA’s Abaqus/Standard, with Remote Visualization Software from NICE used to run Abaqus/CAE on SGI Cyclone. The intent was to determine the viability of shifting similar work to the cloud during periods of full utilization of in-house compute resources.

Information on Software and Resource Providers
Abaqus from SIMULIA, the Dassault Systèmes brand for realistic simulation, is an industry-leading product family that provides a comprehensive and scalable set of Finite Element Analysis (FEA) and multiphysics solvers and modeling tools for simulating a wide range of linear and nonlinear model types. It is used for stress, heat transfer, crack initiation, failure and other types of analysis in mechanical, structural, aerospace, automotive, bio-medical, civil, energy, and related engineering and research applications. Abaqus includes four core products: Abaqus/CAE, Abaqus/Standard, Abaqus/Explicit, and Abaqus/CFD. Abaqus/CAE provides users with a modeling and visualization environment for Abaqus analysis.

NICE Desktop Cloud Visualization (DCV) is an advanced technology that enables technical computing users to remotely access 2D/3D interactive applications over a standard network. Engineers and scientists can immediately take full advantage of high-end graphics cards, fast I/O performance and large memory nodes hosted in a "Public or Private 3D Cloud", rather than waiting for the next upgrade of their workstations.

SGI Cyclone is the world's first large-scale on-demand cloud computing service specifically dedicated to technical applications. Cyclone capitalizes on over twenty years of SGI HPC expertise to address the growing science and engineering technical markets that rely on extremely high-end computational hardware, software and networking equipment to achieve rapid results.

Current State
The end user currently has two 8-core PC workstations for pre- and post-processing with Abaqus/CAE, and a Linux-based compute server with 40 cores and 128GB of available memory. They do not use any batch job scheduling software. A typical stent design model that they run has 2-6 million degrees of freedom (DOF). A typical job uses 20 cores and takes six hours. After the job is run, the data is transferred to the workstation for post-processing.

For the experiment it was agreed that SIMULIA and SGI would provide the end user with Abaqus licenses for up to 128 cores, in order to see whether running a job on more cores could reduce the time to finish the job. They would also provide access to NICE DCV remote graphics software, so that the results could be viewed in Northern California before being downloaded to the end user's office in New Hampshire.

End-To-End Process

1. Set up Cyclone account for End User.
2. SGI License Server info sent to Software Provider.
3. Issuance of a 128 core temporary license of Abaqus by Software Provider.
4. End user uploads model to his home directory on Cyclone login node and sends email to CAE Expert.
5. Benchmark scaling exercise to find core count sweet spot is done by CAE Expert.
6. Results of benchmark scaling exercise sent to End User by CAE Expert.
7. Remote Viz session to view data using Abaqus CAE is set up by CAE Expert.
8. Remote Viz demo via WebEx with End User.
9. PBS submission script written by CAE Expert and shared with End User.
10. End user uploads, runs, views and downloads test case.
11. 10 days of free access is given to End User.
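The PBS submission script from step 9 is not reproduced in this case study. As a rough illustration only, the Python sketch below generates a script of that general kind. The job name, input file, walltime and the helper make_pbs_script are hypothetical; the directives follow common PBS Professional usage, the Abaqus command line uses the standard job, input, cpus, mp_mode and interactive options, and the figure of 8 cores per node is taken from the benchmark hardware (2 sockets x 4 cores). Site-specific setup, such as passing the node list to Abaqus, is left out.

```python
import math

# Assumption taken from the benchmark header: X5570, 2 sockets x 4 cores per node.
CORES_PER_NODE = 8


def make_pbs_script(job_name: str, input_file: str, cores: int,
                    walltime: str = "08:00:00") -> str:
    """Return the text of a PBS job script that runs one Abaqus/Standard job.

    Hypothetical helper: not the script written by the CAE Expert.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    # Whole nodes are requested, so round the node count up.
    nodes = math.ceil(cores / CORES_PER_NODE)
    per_node = min(cores, CORES_PER_NODE)
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",
        f"#PBS -l select={nodes}:ncpus={per_node}:mpiprocs={per_node}",
        f"#PBS -l walltime={walltime}",
        "#PBS -j oe",             # merge stdout and stderr into one file
        "cd $PBS_O_WORKDIR",      # run from the directory where qsub was called
        f"abaqus job={job_name} input={input_file} "
        f"cpus={cores} mp_mode=mpi interactive",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job name and input file.
    print(make_pbs_script("stent", "stent.inp", cores=32))
```

Generating the script from a core count makes it easy to repeat the same job at 16, 32, 48 and 64 cores, as was done in the benchmark scaling exercise.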

CHALLENGES
The team met via a conference call and agreed upon the list of steps that made up the end-to-end process. Setting up the end user account and issuing the software licenses was done quickly. To upload their model via SSH, the End User needed permission from their internal IT group, which took some time. Once the model was uploaded, the CAE Expert ran the model at various core counts and produced a routine benchmark report for the End User to review (see results in table below).

The remote viz demo went smoothly, but when the End User tried to run the software themselves, it took both the Resource Provider and End User IT network teams to open the necessary ports, which took much longer than anticipated. Once the ports were open, the remote viz post-processing experience was better than expected.

Analysis output files still needed to be shipped back to the End User for future reuse, additional post-processing, etc. Data transfer via the network was found to be slow. Final results might be better transferred on an external USB hard drive sent via FedEx.
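The case study does not report output file sizes or link speeds, so the trade-off between network transfer and shipping a drive can only be illustrated with made-up numbers. The Python sketch below estimates how long a transfer would take and compares it with a courier delivery time. The functions transfer_hours and faster_by_courier, and every value passed to them, are hypothetical.

```python
def transfer_hours(size_gb: float, link_mbit_per_s: float,
                   efficiency: float = 1.0) -> float:
    """Estimate hours needed to move size_gb (decimal GB) over a network link.

    efficiency is the fraction of the nominal link rate actually achieved.
    """
    if size_gb < 0 or link_mbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid size, link speed or efficiency")
    bits = size_gb * 1e9 * 8
    seconds = bits / (link_mbit_per_s * 1e6 * efficiency)
    return seconds / 3600


def faster_by_courier(size_gb: float, link_mbit_per_s: float,
                      courier_hours: float = 24.0,
                      efficiency: float = 1.0) -> bool:
    """Return True if shipping a drive beats the estimated network transfer."""
    return transfer_hours(size_gb, link_mbit_per_s, efficiency) > courier_hours


if __name__ == "__main__":
    # Hypothetical example: 100 GB of output over a 100 Mbit/s link.
    print(f"{transfer_hours(100, 100):.1f} hours")
```

An estimate like this, made before the job is run, indicates whether results should be post-processed remotely and only selectively downloaded.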

BENEFITS
Here are the top 3 benefits of participating in the experiment for each of the team members:

End User

1. Gained an increased understanding of what is involved in turning on and using a cloud-based solution for computational work with the Abaqus suite of finite element software.
2. Determined that shifting computational work to the cloud during periods of full utilization of in-house compute resources is a viable approach to ensuring analysis throughput.
3. Participation in the experiment allowed direct assessment of the speed and integrity of remote visualization of computational models (both pre- and post-processing) for a variety of model and output database sizes. SGI/NICE DCV provided a robust solution, which permitted fast and accurate manipulation of the computational models used in the study.

Software Provider

1. I was able to hear from an experienced Abaqus user that doing remote post-processing using a client machine in New Hampshire to an SGI Cyclone server in California provided a good user experience.
2. I was able to hear from an end user that managing the networking requirements (opening ports in firewalls) took some work but was manageable.
3. I have a reference point for an Abaqus user who views executing his Abaqus workflow on SGI Cyclone to be a viable solution.

CAE Expert

1. Expanded my knowledge of analytical methods used in medical stent engineering with Abaqus/Standard.
2. Increased awareness of user interactions with a cloud-based solution and its networking requirements.
3. The geographic distance of ~3100 miles between the customer and the SGI Cyclone Cloud resources confirms that distance is no longer a barrier in high performance computing and remote visualization. The Abaqus engineer commented that the SGI Remote Visualization for cloud computing was “faster and smoother than I expected”.

Resource Provider

1. The ability to walk a new customer through our HPC cloud usage process.
2. Testing our remote visualization solution, which is in beta.
3. Working with a long-time CAE ISV partner to offer a joint cloud-based solution to run and view Abaqus jobs.

CONCLUSION
For an Abaqus user using SGI Cyclone this is a viable solution for both compute and visualization. The Viz side was impressive.

Test Model - Abaqus 6.12 600K Elements, 1M Nodes, 2M DOF, 12 Steps, 563 Iterations

ICE 8200EX X5570 2x4 2.93GHz 24GB/node, SUSE 11SP1, IB QDR 4x Fabric

Cores  # Nodes  Memory  Scr Storage  host_split  MP_MODE  Runtime (s)  hh:mm:ss  Speed up
16     2        24GB    NAS          1           MPI      20636        5:43:56   1.0
32     4        24GB    NAS          1           MPI      14659        4:04:19   1.4
48     6        24GB    NAS          1           MPI      12084        3:21:24   1.7
64     8        24GB    NAS          1           MPI      11367        3:09:27   1.8

Cores  # Nodes  Memory  Scr Storage  host_split  MP_MODE  Runtime (s)  hh:mm:ss  Speed up
16     2        24GB    NAS          2           MPI      16776        4:39:36   1.0
32     4        24GB    NAS          2           MPI      12022        3:20:22   1.4
48     6        24GB    NAS          2           MPI      11152        3:05:52   1.5
64     8        24GB    NAS          2           MPI      9536         2:38:56   1.8
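The speed-up figures in the two benchmark tables can be reproduced from the runtimes in seconds. The Python sketch below does this relative to the 16-core run, and also derives a parallel efficiency figure that the original report does not state. The runtimes are copied from the tables; the function names scaling and hms are illustrative.

```python
# Runtimes in seconds, keyed by core count, copied from the benchmark tables.
RUNS_HS1 = {16: 20636, 32: 14659, 48: 12084, 64: 11367}  # host_split = 1
RUNS_HS2 = {16: 16776, 32: 12022, 48: 11152, 64: 9536}   # host_split = 2


def scaling(runtimes: dict[int, int]) -> list[tuple[int, float, float]]:
    """Return (cores, speedup, efficiency) rows relative to the smallest run."""
    base_cores = min(runtimes)
    base_time = runtimes[base_cores]
    rows = []
    for cores in sorted(runtimes):
        speedup = base_time / runtimes[cores]
        # Efficiency: achieved speed-up divided by the increase in core count.
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, speedup, efficiency))
    return rows


def hms(seconds: int) -> str:
    """Format a runtime in seconds as h:mm:ss, as in the tables."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    for label, runs in (("host_split=1", RUNS_HS1), ("host_split=2", RUNS_HS2)):
        print(label)
        for cores, speedup, eff in scaling(runs):
            print(f"  {cores:3d} cores  {hms(runs[cores])}  "
                  f"speedup {speedup:.1f}  efficiency {eff:.0%}")
```

By this measure, quadrupling the core count from 16 to 64 gives a speed-up of about 1.8 in both configurations, a parallel efficiency of roughly 45% relative to the 16-core run. This is one way to quantify the "core count sweet spot" the benchmark exercise was looking for.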


Host Split Perf Improvement

Cores  HS2/HS1
16     18.7%
32     18.0%
48     7.7%
64     16.1%

The host_split option set in the Abaqus_v6.env file allows multiple MPI ranks per compute node to improve Abaqus/Standard message-passing performance on multi-socket compute nodes. This setting typically applies to simulations with a high amount of contact and a short solver walltime per iteration. The host_split default is 1.
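Although the column is labelled HS2/HS1, the percentages in the host split table appear to be the reduction in runtime when moving from host_split=1 to host_split=2, that is 1 - HS2/HS1. The Python sketch below checks this reading against the runtimes from the benchmark tables. The function name improvement_pct is illustrative.

```python
# Runtimes in seconds, keyed by core count, copied from the benchmark tables.
RUNTIME_HS1 = {16: 20636, 32: 14659, 48: 12084, 64: 11367}  # host_split = 1
RUNTIME_HS2 = {16: 16776, 32: 12022, 48: 11152, 64: 9536}   # host_split = 2


def improvement_pct(t_hs1: float, t_hs2: float) -> float:
    """Runtime reduction, in percent, of host_split=2 relative to host_split=1."""
    return (1 - t_hs2 / t_hs1) * 100


if __name__ == "__main__":
    for cores in sorted(RUNTIME_HS1):
        pct = improvement_pct(RUNTIME_HS1[cores], RUNTIME_HS2[cores])
        print(f"{cores:3d} cores: {pct:.1f}%")
```

Rounded to one decimal place, this gives the four percentages listed in the table, which supports reading them as runtime reductions rather than as a ratio of runtimes.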


Thank you for your interest in the free and voluntary UberCloud HPC Experiment.

To download similar case studies go to: http://tci.taborcommunications.com/UberCloud_HPC_Experiment

If you or your organization would like to participate in this Experiment to explore hands-on the end-to-end process of HPC as a Service for your business, then please register at: http://www.hpcexperiment.com/why-participate

If you are interested in promoting your service/product at the UberCloud Exhibit, then please register at http://www.exhibit.hpcexperiment.com/