Lattice Boltzmann Methods on the way to exascale (LBM on the way to ExaScale)

Ulrich Rüde (LSS Erlangen, [email protected])
Lehrstuhl für Simulation, Universität Erlangen-Nürnberg
www10.informatik.uni-erlangen.de

HIGH PERFORMANCE COMPUTING: From Clouds and Big Data to Exascale and Beyond
An International Advanced Workshop, Cetraro – Italy, June 27 – July 1, 2016



Outline

Goals:
    drive algorithms towards their performance limits (scalability is necessary but not sufficient)
    sustainable software: reproducibility & flexibility
    coupled multi-physics

Three software packages:
1. Many-body problems: rigid body dynamics
    2.8 × 10^10 non-spherical particles
2. Kinetic methods: Lattice Boltzmann, fluid flow
    >10^12 cells, adaptive, load balancing
3. Continuum methods: finite element multigrid
    fully implicit solves with >10^13 DoF

Real-life applications

The work horses

                    JUQUEEN                  SuperMUC
Architecture        Blue Gene/Q              Intel Xeon
Cores               458,752 (PowerPC A2)     147,456
Cores per node      16 (1.6 GHz)             16 (2.7 GHz)
RAM per node        16 GiB                   32 GiB
Interconnect        5D torus                 Pruned tree
Peak performance    5.8 PFlops               3.2 PFlops
TOP 500 rank        #13                      #27
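The node counts and per-core peak rates implied by these machine figures follow from simple arithmetic. The sketch below uses only the numbers stated on the slide; the dictionary layout and the function name `derived_figures` are illustrative choices, not part of the talk.

```python
# Machine figures as stated on the slide (peak performance in PFlops).
MACHINES = {
    "JUQUEEN": {"cores": 458_752, "cores_per_node": 16, "ghz": 1.6, "peak_pflops": 5.8},
    "SuperMUC": {"cores": 147_456, "cores_per_node": 16, "ghz": 2.7, "peak_pflops": 3.2},
}


def derived_figures(machine):
    """Return (node count, peak GFlops per core, peak flops per cycle per core)."""
    nodes = machine["cores"] // machine["cores_per_node"]
    gflops_per_core = machine["peak_pflops"] * 1e6 / machine["cores"]  # 1 PFlops = 1e6 GFlops
    flops_per_cycle = gflops_per_core / machine["ghz"]
    return nodes, gflops_per_core, flops_per_cycle


for name, machine in MACHINES.items():
    nodes, gflops, per_cycle = derived_figures(machine)
    print(f"{name}: {nodes} nodes, {gflops:.1f} GFlops/core, {per_cycle:.1f} flops/cycle/core")
```

Both machines come out at roughly 8 floating-point operations per cycle per core, so the difference in per-core peak is essentially the difference in clock rate.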

Building block I:

The Lagrangian View: Granular media simulations with the physics engine

1,250,000 spherical particles
256 processors
300,300 time steps
runtime: 48 h (including data output)
texture mapping, ray tracing

Pöschel, T., & Schwager, T. (2005). Computational granular dynamics: models and algorithms. Springer Science & Business Media.

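Nonlinear Complementarity and Time Stepping

Non-penetration conditions (Signorini condition, impact law) and Coulomb friction conditions (friction cone condition, frictional reaction opposes slip)

Continuous, forces:
\[ \xi \ge 0 \;\perp\; \lambda_n \ge 0, \qquad \|\lambda_{to}\|_2 \le \mu\lambda_n \]
\[ \dot{\xi}^+ \ge 0 \;\perp\; \lambda_n \ge 0 \quad (\text{if } \xi = 0), \qquad \|\delta v^+_{to}\|_2\,\lambda_{to} = -\mu\lambda_n\,\delta v^+_{to} \]
\[ \ddot{\xi}^+ \ge 0 \;\perp\; \lambda_n \ge 0 \quad (\text{if } \dot{\xi}^+ = 0), \qquad \|\dot{\delta v}^+_{to}\|_2\,\lambda_{to} = -\mu\lambda_n\,\dot{\delta v}^+_{to} \quad (\text{if } \|\delta v^+_{to}\|_2 = 0) \]

Continuous, impulses:
\[ \xi \ge 0 \;\perp\; \Lambda_n \ge 0, \qquad \|\Lambda_{to}\|_2 \le \mu\Lambda_n \]
\[ \dot{\xi}^+ \ge 0 \;\perp\; \Lambda_n \ge 0 \quad (\text{if } \xi = 0), \qquad \|\delta v^+_{to}\|_2\,\Lambda_{to} = -\mu\Lambda_n\,\delta v^+_{to} \]

Discrete:
\[ \frac{\xi}{\delta t} + \delta v'_n(\lambda) \ge 0 \;\perp\; \lambda_n \ge 0, \qquad \|\lambda_{to}\|_2 \le \mu\lambda_n, \qquad \|\delta v'_{to}(\lambda)\|_2\,\lambda_{to} = -\mu\lambda_n\,\delta v'_{to}(\lambda) \]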

Moreau, J., Panagiotopoulos P. (1988): Nonsmooth mechanics and applications, vol 302. Springer, Wien-New York

Popa, C., Preclik, T., & UR (2014). Regularized solution of LCP problems with application to rigid body dynamics. Numerical Algorithms, 1-12.

Preclik, T. & UR (2015). Ultrascale simulations of non-smooth granular dynamics; Computational Particle Mechanics, DOI: 10.1007/s40571-015-0047-6
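The discrete contact conditions on this slide (non-penetration complementarity, friction cone, frictional reaction opposing slip) can be checked numerically for a single contact. The following is a minimal sketch, not the solver from the cited papers: the function name `check_discrete_contact`, the two-component tangential vectors, and the tolerance are assumptions made for illustration.

```python
import math


def check_discrete_contact(xi, dt, dv_n, dv_to, lam_n, lam_to, mu, tol=1e-9):
    """Check the three discrete contact conditions for one contact.

    xi     : gap between the two bodies
    dt     : time step size
    dv_n   : post-step relative normal velocity, delta v'_n(lambda)
    dv_to  : post-step relative tangential velocity (2 components)
    lam_n  : normal contact reaction
    lam_to : tangential contact reaction (2 components)
    mu     : coefficient of friction
    """
    # Non-penetration: xi/dt + dv_n >= 0, lam_n >= 0, and their product vanishes.
    gap_rate = xi / dt + dv_n
    non_penetration = (
        gap_rate >= -tol and lam_n >= -tol and abs(gap_rate * lam_n) <= tol
    )
    # Friction cone: ||lam_to||_2 <= mu * lam_n.
    friction_cone = math.hypot(lam_to[0], lam_to[1]) <= mu * lam_n + tol
    # Opposes slip: ||dv_to||_2 * lam_to = -mu * lam_n * dv_to, per component.
    slip = math.hypot(dv_to[0], dv_to[1])
    opposes_slip = all(
        abs(slip * l + mu * lam_n * v) <= tol for l, v in zip(lam_to, dv_to)
    )
    return {
        "non_penetration": non_penetration,
        "friction_cone": friction_cone,
        "opposes_slip": opposes_slip,
    }


# Closed, sliding contact: the reaction sits on the cone boundary and opposes the slip.
print(check_discrete_contact(0.0, 0.01, 0.0, (1.0, 0.0), 2.0, (-1.0, 0.0), 0.5))
```

An open contact (positive gap, zero reaction) also satisfies all three conditions, which is the complementarity structure the symbol ⊥ expresses.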


Dense granular channel flow with crystallization

Scaling Results

Solver algorithmically not optimal for dense systems, hence cannot scale unconditionally, but is highly efficient in many cases of practical importance
Strong and weak scaling results for a constant number of iterations performed on SuperMUC and Juqueen
Largest ensembles computed:
    2.8 × 10^10 non-spherical particles
    1.1 × 10^10 contacts

Granular gas: scaling results

[Figure: breakdown of compute times on the Erlangen RRZE cluster Emmy. (a) Time-step profile of the granular gas executed with 5 × 2 × 2 = 20 processes on a single node. (b) Time-step profile of the granular gas executed with 8 × 8 × 5 = 320 processes on 16 nodes.]

[Figure: (b) Weak-scaling graph on the Juqueen supercomputer.]
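Strong and weak scaling results are usually summarized as parallel efficiencies. The sketch below shows the two standard definitions; the function names and the timings in the usage lines are hypothetical and are not measurements from the talk.

```python
def weak_scaling_efficiency(t_base, t_n):
    """Weak scaling: work per process is fixed, so the ideal runtime stays constant."""
    return t_base / t_n


def strong_scaling_efficiency(t_base, p_base, t_n, p_n):
    """Strong scaling: total work is fixed, so the ideal runtime shrinks by p_base / p_n."""
    return (t_base * p_base) / (t_n * p_n)


# Hypothetical timings in seconds, for illustration only.
print(weak_scaling_efficiency(t_base=10.0, t_n=12.5))
print(strong_scaling_efficiency(t_base=100.0, p_base=16, t_n=8.0, p_n=256))
```

An efficiency of 1.0 means ideal scaling in both definitions; values below 1.0 quantify the overhead added by communication and load imbalance.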

Building Block III:

Scalable Flow Simulations with the Lattice Boltzmann Method

Succi, S. (2001). The lattice Boltzmann equation: for fluid dynamics and beyond. Oxford university press.

Feichtinger, C., Donath, S., Köstler, H., Götz, J., & Rüde, U. (2011). WaLBerla: HPC software design for computational engineering simulations. Journal of Computational Science, 2(2), 105-112.
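For readers unfamiliar with the method, the following is a minimal sketch of one lattice Boltzmann time step, assuming the standard D2Q9 lattice, a single-relaxation-time (BGK) collision, and periodic boundaries. It is a plain-Python illustration of the stream-collide idea only; it is not waLBerla code, and the names `equilibrium` and `stream_collide` are chosen here for illustration.

```python
# D2Q9 lattice: discrete velocities and their standard weights.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def equilibrium(rho, ux, uy):
    """Second-order equilibrium distribution for density rho and velocity (ux, uy)."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def stream_collide(f, nx, ny, omega):
    """One BGK collide-and-stream step on an nx x ny periodic grid.

    f[x][y] is the list of the 9 particle distribution functions of cell (x, y);
    omega is the relaxation rate (inverse relaxation time).
    """
    new = [[[0.0] * 9 for _ in range(ny)] for _ in range(nx)]
    for x in range(nx):
        for y in range(ny):
            cell = f[x][y]
            # Macroscopic moments: density and velocity.
            rho = sum(cell)
            ux = sum(c[0] * fi for c, fi in zip(C, cell)) / rho
            uy = sum(c[1] * fi for c, fi in zip(C, cell)) / rho
            feq = equilibrium(rho, ux, uy)
            for i, (cx, cy) in enumerate(C):
                # Collide: relax towards equilibrium.
                post = cell[i] - omega * (cell[i] - feq[i])
                # Stream: push the value to the neighbor in direction i.
                new[(x + cx) % nx][(y + cy) % ny][i] = post
    return new


nx, ny = 4, 3
f = [[equilibrium(1.0 + 0.01 * x, 0.02, -0.01) for _ in range(ny)] for x in range(nx)]
mass_before = sum(sum(cell) for column in f for cell in column)
for _ in range(5):
    f = stream_collide(f, nx, ny, omega=1.2)
mass_after = sum(sum(cell) for column in f for cell in column)
print(abs(mass_after - mass_before) < 1e-12)
```

Each cell update touches only its own values and its direct neighbors, which is the locality that makes the method attractive for large-scale parallel machines.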

Partitioning and Parallelization

static load balancing
allocation of block data (→ grids)
static block-level refinement (→ forest of octrees)
separation of domain partitioning from simulation (optional)
compact (KiB/MiB) binary MPI IO

Parallel AMR load balancing

different views on domain partitioning:
    forest of octrees: octrees are not explicitly stored, but implicitly defined via block IDs
    2:1 balanced grid (used for the LBM)
    distributed graph: nodes = blocks, edges explicitly stored as <block ID, process rank> pairs
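One way an octree can be defined implicitly through block IDs is to encode the path from the root in the bits of an integer. The sketch below assumes a hypothetical encoding (a leading marker bit, a fixed-width tree index, then three bits per refinement level); the slide does not specify the actual ID layout, so all names and the bit layout here are illustrative assumptions.

```python
def root_id(tree_index, tree_bits):
    """ID of the root block of one octree: a marker bit followed by the tree index."""
    assert 0 <= tree_index < (1 << tree_bits)
    return (1 << tree_bits) | tree_index


def child_id(block_id, octant):
    """Append three bits selecting one of the eight children."""
    assert 0 <= octant < 8
    return (block_id << 3) | octant


def parent_id(block_id):
    """Drop the last three bits (only valid for non-root blocks)."""
    return block_id >> 3


def level(block_id, tree_bits):
    """Refinement level, recovered from the ID length alone."""
    return (block_id.bit_length() - 1 - tree_bits) // 3


def octant_path(block_id, tree_bits):
    """Sequence of octants leading from the root block to this block."""
    path = []
    while level(block_id, tree_bits) > 0:
        path.append(block_id & 7)
        block_id >>= 3
    return path[::-1]


root = root_id(tree_index=2, tree_bits=4)
grandchild = child_id(child_id(root, 5), 3)
print(level(grandchild, 4), octant_path(grandchild, 4), parent_id(grandchild) == child_id(root, 5))
```

With such an encoding, parent, level, and position follow from integer arithmetic on the ID, so no tree data structure has to be stored or communicated.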

AMR and Load Balancing with waLBerla

Isaac, T., Burstedde, C., Wilcox, L. C., & Ghattas, O. (2015). Recursive algorithms for distributed forests of octrees. SIAM Journal on Scientific Computing, 37(5), C497-C531.

Meyerhenke, H., Monien, B., & Sauerwald, T. (2009). A new diffusion-based multilevel algorithm for computing graph partitions. Journal of Parallel and Distributed Computing, 69(9), 750-761.

Schornbaum, F., & Rüde, U. (2016). Massively Parallel Algorithms for the Lattice Boltzmann Method on NonUniform Grids. SIAM Journal on Scientific Computing, 38(2), C96-C126.
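The idea behind diffusion-based load balancing can be illustrated with a generic first-order diffusion scheme on a process graph: each process exchanges load only with its neighbors, in proportion to the load difference. This is a textbook sketch with continuous loads, not the multilevel algorithm of the cited paper and not the waLBerla implementation, where whole blocks are moved; the function name and the parameter `alpha` are assumptions.

```python
def diffusion_step(load, edges, alpha):
    """One first-order diffusion sweep.

    load  : list with the current load of each process
    edges : list of (i, j) pairs, each neighbor relation listed once
    alpha : diffusion coefficient; must be small enough for stability
    """
    flow = [0.0] * len(load)
    for i, j in edges:
        transfer = alpha * (load[i] - load[j])  # positive: i sends to j
        flow[i] -= transfer
        flow[j] += transfer
    return [l + f for l, f in zip(load, flow)]


# Four processes in a ring; all the work starts on process 0.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
load = [8.0, 0.0, 0.0, 0.0]
for _ in range(60):
    load = diffusion_step(load, ring, alpha=0.25)
print([round(l, 6) for l in load])
```

The scheme needs only nearest-neighbor communication and conserves the total load in every sweep, which is why diffusion-type methods are attractive when global repartitioning would be too expensive.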

AMR Performance

[Slides 12 to 15: AMR performance figures; their content is not recoverable from this transcript. Surviving text fragment: "[D]uring this refresh process …… all"]

Performance on Coronary Arteries Geometry

Godenschwager, C., Schornbaum, F., Bauer, M., Köstler, H., & UR (2013). A framework for hybrid parallel flow simulations with a trillion cells in complex geometries. In Proceedings of SC13: International Conference for High Performance Computing, Networking, Storage and Analysis (p. 35). ACM.

Weak scaling
    458,752 cores of JUQUEEN
    over a trillion (10^12) fluid lattice cells
    cell sizes 1.27 μm (diameter of red blood cells: 7 μm)
    2.1 × 10^12 cell updates per second
    0.41 PFlops

Strong scaling
    32,768 cores of SuperMUC
    cell sizes of 0.1 mm
    2.1 million fluid cells
    6000+ time steps per second

[Figure: color-coded process assignment]
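A few derived quantities help put these figures in context. The sketch below uses only numbers stated in the talk (the JUQUEEN peak of 5.8 PFlops comes from the machine overview slide); the variable names are illustrative, and "6000+" is treated as a lower bound of 6000.

```python
# Weak scaling run on JUQUEEN.
cores_juqueen = 458_752
cell_updates_per_s = 2.1e12
sustained_pflops = 0.41
peak_pflops = 5.8  # JUQUEEN peak performance

mlups_per_core = cell_updates_per_s / cores_juqueen / 1e6
flops_per_cell_update = sustained_pflops * 1e15 / cell_updates_per_s
fraction_of_peak = sustained_pflops / peak_pflops

# Strong scaling run on SuperMUC.
cores_supermuc = 32_768
fluid_cells = 2.1e6
time_steps_per_s = 6000  # lower bound

cells_per_core = fluid_cells / cores_supermuc
strong_cell_updates_per_s = fluid_cells * time_steps_per_s

print(f"weak:   {mlups_per_core:.2f} million cell updates/s per core")
print(f"weak:   {flops_per_cell_update:.0f} flops per cell update")
print(f"weak:   {fraction_of_peak:.1%} of peak")
print(f"strong: {cells_per_core:.0f} fluid cells per core")
print(f"strong: {strong_cell_updates_per_s:.2e} cell updates/s")
```

The strong scaling case leaves only about 64 fluid cells per core, so at that point the run is dominated by communication and latency rather than by the update kernel.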

Single Node Performance

[Figure: single-node performance on SuperMUC and JUQUEEN for the "vectorized", "optimized", and "standard" variants]

Pohl, T., Deserno, F., Thürey, N., UR, Lammers, P., Wellein, G., & Zeiser, T. (2004). Performance evaluation of parallel large-scale lattice Boltzmann applications on three supercomputing architectures. Proceedings of the 2004 ACM/IEEE conference on Supercomputing (p. 21). IEEE Computer Society.

Donath, S., Iglberger, K., Wellein, G., Zeiser, T., Nitsure, A., & UR (2008). Performance comparison of different parallel lattice Boltzmann implementations on multi-core multi-socket systems. International Journal of Computational Science and Engineering, 4(1), 3-11.

Flow through structure of thin crystals (filter)

work with Jose Pedro Galache and Antonio Gil, CMT-Motores Termicos, Universitat Politecnica de Valencia

Building Block IV (electrostatics)

Direct numerical simulation of charged particles in flow

Positively and negatively charged particles in flow subjected to a transversal electric field

Masilamani, K., Ganguly, S., Feichtinger, C., & UR (2011). Hybrid lattice-boltzmann and finite-difference simulation of electroosmotic flow in a microchannel. Fluid Dynamics Research, 43(2), 025501.

Bartuschat, D., Ritter, D., & UR (2012). Parallel multigrid for electrokinetic simulation in particle-fluid flows. In High Performance Computing and Simulation (HPCS), 2012 International Conference on (pp. 374-380). IEEE.

Bartuschat, D. & UR (2015). Parallel Multiphysics Simulations of Charged Particles in Microfluidic Flows, Journal of Computational Science, Volume 8, May 2015, Pages 1-19

6-way coupling

[Diagram: coupling between four components and the quantities exchanged between them]

Components:
    LBM: treat BCs, stream-collide step
    Finite volumes, MG: iterate (treat BCs, V-cycle)
    Newtonian mechanics: collision response
    Lubrication correction

Exchanged quantities: hydrodynamic force, electrostatic force, correction force, object motion, velocity BCs, object distance, charge distribution
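The structure of such a coupled time step can be sketched with stand-in models: each particle accumulates a hydrodynamic force, a lubrication correction force, and an electrostatic force, and is then advanced by Newtonian mechanics. Everything below is a toy assumption for illustration: the linear drag replaces the LBM, the uniform field replaces the finite-volume multigrid solve, the lubrication model is invented, and the ordering of the steps is a guess rather than the scheme used in the talk.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    x: float       # position along the transversal direction
    v: float       # velocity along the transversal direction
    charge: float
    mass: float = 1.0


def hydrodynamic_force(p, fluid_velocity, drag=1.0):
    # Stand-in for the force returned by the LBM: linear drag towards the fluid velocity.
    return drag * (fluid_velocity - p.v)


def lubrication_correction(p, wall_x, cutoff=0.1, strength=0.01):
    # Stand-in correction force: extra damping once the gap to a wall is below a cutoff.
    gap = wall_x - p.x
    if 0.0 < gap < cutoff:
        return -strength * p.v / gap
    return 0.0


def electrostatic_force(p, field):
    # Stand-in for the potential solve: a uniform field E gives the force q * E.
    return p.charge * field


def coupled_step(particles, fluid_velocity, field, wall_x, dt):
    """Accumulate all three forces per particle, then integrate with explicit Euler."""
    for p in particles:
        force = (
            hydrodynamic_force(p, fluid_velocity)
            + lubrication_correction(p, wall_x)
            + electrostatic_force(p, field)
        )
        p.v += dt * force / p.mass
        p.x += dt * p.v


particles = [Particle(0.0, 0.0, +1.0), Particle(0.0, 0.0, -1.0)]
for _ in range(100):
    coupled_step(particles, fluid_velocity=0.0, field=2.0, wall_x=10.0, dt=0.01)
print(particles[0].x > 0.0 > particles[1].x)
```

Even in this toy setting the oppositely charged particles drift apart in the transversal field, which is the separation effect the coupled simulation is designed to resolve.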

Separation experiment

240 time steps of the fully 6-way coupled simulation
400 sec on SuperMUC
weak scaling up to 32,768 cores
7.1 million particles

[Figure: LBM performance (10^3 MFLUPS) and MG performance (10^3 MLUPS) versus number of nodes (0 to 2000)]

[Figure: total runtimes versus number of nodes (1 to 2048), broken down into LBM, Map, Lubr, HydrF, pe, MG, SetRHS, PtCm, ElectF]

Building Block V

Volume of Fluids Method for Free Surface Flows

joint work with Regina Ammer, Simon Bogner, Martin Bauer, Daniela Anderl, Nils Thürey, Stefan Donath, Thomas Pohl, C. Körner, A. Delgado

Körner, C., Thies, M., Hofmann, T., Thürey, N., & UR. (2005). Lattice Boltzmann model for free surface flow for modeling foaming. Journal of Statistical Physics, 121(1-2), 179-196.

Donath, S., Feichtinger, C., Pohl, T., Götz, J., & UR. (2010). A Parallel Free Surface Lattice Boltzmann Method for Large-Scale Applications. Parallel Computational Fluid Dynamics: Recent Advances and Future Directions, 318.

Anderl, D., Bauer, M., Rauh, C., UR, & Delgado, A. (2014). Numerical simulation of adsorption and bubble interaction in protein foams using a lattice Boltzmann method. Food & function, 5(4), 755-763.

Free Surface Flows

Volume-of-Fluids-like approach
Flag field: compute only in fluid
Special "free surface" conditions in interface cells
Reconstruction of curvature for surface tension

Free Surface Bubble Model

Data of a bubble:
    Initial volume (density = 1)
    Current volume
    Density/pressure = initial volume / current volume

Update management:
    Each process logs the change of volume due to cell conversions (interface to gas / gas to interface) and mass variations in interface cells
    All volume changes are added to the volume of the bubble at the end of the time step (which also has to be communicated)
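The bubble bookkeeping described on this slide can be sketched as follows: each process accumulates its local volume changes, and at the end of the time step the changes are summed and applied to the bubble. This is a serial sketch under stated assumptions; the class names `Bubble` and `ProcessLog` are hypothetical, and the Python `sum` stands in for the global reduction (for example an MPI allreduce) that a parallel code would need.

```python
class Bubble:
    """Bubble state: initial volume (density 1) and current volume."""

    def __init__(self, initial_volume):
        self.initial_volume = initial_volume
        self.current_volume = initial_volume

    @property
    def pressure(self):
        # Density/pressure = initial volume / current volume.
        return self.initial_volume / self.current_volume


class ProcessLog:
    """Volume changes one process observed for a bubble during a time step."""

    def __init__(self):
        self.delta = 0.0

    def log(self, volume_change):
        # Called for cell conversions and for mass variations in interface cells.
        self.delta += volume_change

    def take(self):
        # Return the accumulated change and reset the log for the next time step.
        delta, self.delta = self.delta, 0.0
        return delta


def end_of_time_step(bubble, logs):
    # Stand-in for the communication step: sum the changes of all processes.
    total_change = sum(log.take() for log in logs)
    bubble.current_volume += total_change


bubble = Bubble(initial_volume=100.0)
logs = [ProcessLog() for _ in range(3)]
logs[0].log(-5.0)
logs[1].log(+2.0)
logs[2].log(-7.0)
end_of_time_step(bubble, logs)
print(bubble.current_volume, round(bubble.pressure, 4))
```

Deferring the update to the end of the time step means every process sees the same bubble pressure during the step, at the cost of one reduction per bubble per step.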

Simulation for hygiene products (for Procter & Gamble)

capillary pressure
inclination
surface tension
contact angle

Additive Manufacturing: Fast Electron Beam Melting

Bikas, H., Stavropoulos, P., & Chryssolouris, G. (2015). Additive manufacturing methods and modelling approaches: a critical review. The International Journal of Advanced Manufacturing Technology, 1-17.

Klassen, A., Scharowsky, T., & Körner, C. (2014). Evaporation model for beam based additive manufacturing using free surface lattice Boltzmann methods. Journal of Physics D: Applied Physics, 47(27), 275303.

Electron Beam Melting Process: 3D printing

EU project Fast-EBM
    ARCAM (Gothenburg)
    TWI (Cambridge)
    FAU Erlangen

Generation of powder bed
Energy transfer by electron beam
    penetration depth
    heat transfer
Flow dynamics
    melting
    melt flow
    surface tension
    wetting
    capillary forces
    contact angles
    solidification

Ammer, R., Markl, M., Ljungblad, U., Körner, C., & UR (2014). Simulating fast electron beam melting with a parallel thermal free surface lattice Boltzmann method. Computers & Mathematics with Applications, 67(2), 318-330.

Ammer, R., UR, Markl, M., Jüchter V., & Körner, C. (2014). Validation experiments for LBM simulations of electron beam melting. International Journal of Modern Physics C.

Simulation of Electron Beam Melting

Simulating powder bed generation using the PE framework
High-speed camera shows melting step for manufacturing a hollow cylinder
WaLBerla simulation

Conclusions

CSE research is done by teams

Harald Köstler
Christian Godenschwager
Kristina Pickl
Regina Ammer
Simon Bogner
Florian Schornbaum
Sebastian Kuckuk
Christoph Rettinger
Dominik Bartuschat
Martin Bauer

Thank you for your attention!

Videos, preprints, slides at https://www10.informatik.uni-erlangen.de