mamico: parallel noise reduction for multi-instance ...2.1 molecular dynamics (md) for the sake of...

14
MaMiCo: Parallel Noise Reduction for Multi-Instance Molecular-Continuum Flow Simulation Piet Jarmatz 1,3[0000-0002-5463-0740] and Philipp Neumann 2[0000-0001-8604-8846] 1 Department of Informatics, Technical University of Munich, Garching, Germany 2 Department of Informatics, University of Hamburg, Hamburg, Germany 3 Faculty of Mechanical Engineering, Helmut Schmidt University, Hamburg, Germany [email protected] Abstract. Transient molecular-continuum coupled flow simulations of- ten suffer from high thermal noise, created by fluctuating hydrodynam- ics within the molecular dynamics (MD) simulation. Multi-instance MD computations are an approach to extract smooth flow field quantities on rather short time scales, but they require a huge amount of compu- tational resources. Filtering particle data using signal processing meth- ods to reduce numerical noise can significantly reduce the number of instances necessary. This leads to improved stability and reduced com- putational cost in the molecular-continuum setting. We extend the Macro-Micro-Coupling tool (MaMiCo) – a software to couple arbitrary continuum and MD solvers – by a new parallel interface for universal MD data analytics and post-processing, especially for noise reduction. It is designed modularly and compatible with multi-instance sampling. We present a Proper Orthogonal Decomposition (POD) imple- mentation of the interface, capable of massively parallel noise filtering. The resulting coupled simulation is validated using a three-dimensional Couette flow scenario. We quantify the denoising, conduct performance benchmarks and scaling tests on a supercomputing platform. We thus demonstrate that the new interface enables massively parallel data an- alytics and post-processing in conjunction with any MD solver coupled to MaMiCo. Keywords: Multiscale · Fluid Dynamics · HPC · Noise Filter · Parallel · Transient · Coupling · Software Design · Molecular Dynamics · Simu- lation Data Analytics · Multi-instance · POD · Molecular-continuum 1 Introduction Multiscale methods in computational fluid dynamics, in particular coupled mo- lecular-continuum simulations [6, 7, 12, 22], allow to go beyond the limitations imposed by modelling accuracy or computational feasibility of a particular single- scale method. They are frequently applied for instance for nanostructure inves- tigation in chemistry, especially for research in lithium-ion batteries [15]. ICCS Camera Ready Version 2019 To cite this paper please use the final published version: DOI: 10.1007/978-3-030-22747-0_34

Upload: others

Post on 17-Sep-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

MaMiCo: Parallel Noise Reduction forMulti-Instance Molecular-Continuum Flow

Simulation

Piet Jarmatz1,3[0000−0002−5463−0740] and Philipp Neumann2[0000−0001−8604−8846]

1 Department of Informatics, Technical University of Munich, Garching, Germany2 Department of Informatics, University of Hamburg, Hamburg, Germany

3 Faculty of Mechanical Engineering, Helmut Schmidt University, Hamburg, [email protected]

Abstract. Transient molecular-continuum coupled flow simulations of-ten suffer from high thermal noise, created by fluctuating hydrodynam-ics within the molecular dynamics (MD) simulation. Multi-instance MDcomputations are an approach to extract smooth flow field quantitieson rather short time scales, but they require a huge amount of compu-tational resources. Filtering particle data using signal processing meth-ods to reduce numerical noise can significantly reduce the number ofinstances necessary. This leads to improved stability and reduced com-putational cost in the molecular-continuum setting.We extend the Macro-Micro-Coupling tool (MaMiCo) – a software tocouple arbitrary continuum and MD solvers – by a new parallel interfacefor universal MD data analytics and post-processing, especially for noisereduction. It is designed modularly and compatible with multi-instancesampling. We present a Proper Orthogonal Decomposition (POD) imple-mentation of the interface, capable of massively parallel noise filtering.The resulting coupled simulation is validated using a three-dimensionalCouette flow scenario. We quantify the denoising, conduct performancebenchmarks and scaling tests on a supercomputing platform. We thusdemonstrate that the new interface enables massively parallel data an-alytics and post-processing in conjunction with any MD solver coupledto MaMiCo.

Keywords: Multiscale · Fluid Dynamics · HPC · Noise Filter · Parallel· Transient · Coupling · Software Design · Molecular Dynamics · Simu-lation Data Analytics · Multi-instance · POD · Molecular-continuum

1 Introduction

Multiscale methods in computational fluid dynamics, in particular coupled mo-lecular-continuum simulations [6, 7, 12, 22], allow to go beyond the limitationsimposed by modelling accuracy or computational feasibility of a particular single-scale method. They are frequently applied for instance for nanostructure inves-tigation in chemistry, especially for research in lithium-ion batteries [15].

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34

Page 2: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

2 Piet Jarmatz and Philipp Neumann

In the molecular-continuum context, MD regions and continuum flow re-gions are coupled to extend the simulation capability over multiple temporaland spatial scales. This is useful for many challenging applications involving e.g.nanomembranes [20] or polymer physics [1, 18].

Due to applications that often require large domains, long time spans, or fineresolution, parallelism and scalability constitute important aspects of molecular-continuum methods and software. Several codes for massively parallel executionof molecular-continuum simulations on high performance computing (HPC) sys-tems exist, for instance the CPL library [22], MaMiCo[14] or HACPar [19].

Many coupling schemes for molecular-continuum simulations have been in-vestigated [2, 8, 12, 16, 24], based amongst others on the internal-flow multiscalemethod [2] or the heterogeneous multiscale method [8]. In the steady state case,time-averaging of hydrodynamic quantities sampled from MD is sufficient forthe coupling to a continuum solver [7]. But a transient simulation with shortcoupling time intervals can easily become unstable due to fluctuating MD flowfield quantities.

One approach to tackle this is multi-instance sampling [13], where averagedinformation comes from an ensemble of MD systems. This approach, however,is computationally very expensive.

It has been shown that noise removal techniques, i.e. filters, are anothereffective approach to reduce thermal fluctuations in the molecular-continuumsetting. Grinberg used proper orthogonal decomposition (POD) of MD flow fielddata, demonstrating also HPC applicability of their method by running a coupledatomistic-continuum simulation on up to 294,912 compute cores [11]. Zimonet al. [26] have investigated and compared different kinds of noise filters forseveral particle based flow simulations, such as harmonically pulsating flow orwater flow through a carbon nanotube. They pointed out that the combinationof POD with one of various other noise filtering algorithms yields significantimprovements in denoising quality.

One of the main challenges in the field of coupled multiscale flow simulationis the software design and implementation, since goals such as flexibility, inter-changeability of solvers and modularity on an algorithmic level often contradictto hardware and high performance requirements. Some very generic solutionsexist, such as MUI [23] or the MUSCLE 2 [3] software, which can be used for var-ious multiscale settings. We recently presented the MaMiCo coupling framework[13, 14]. It is system-specific for molecular-continuum coupling, but independentof actual MD and CFD solvers. MaMiCo provides interface definitions for arbi-trary flow simulation software, hides coupling algorithmics from the solvers andsupports 2D as well as 3D simulations.

In this paper, we present extensions of the MaMiCo tool for massively par-allel particle simulation data analytics and noise reduction. We introduce a newinterface that is intended primarily for MD flow quantity noise filtering, but inthe same way usable for any kind of MD data post-processing for the molecular-continuum coupling. We also present an implementation of a HPC compatibleand scalable POD noise filter for MaMiCo. Our interface is compatible with

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34

Page 3: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

Parallel Noise Reduction for Molecular-Continuum Flow Simulation 3

multi-instance MD computations in a natural way, thus it enables a novel com-bination of noise filtering and multi-instance sampling: A small ensemble ofseparate MD simulations delivers information about the behaviour of the simu-lated system on a molecular level, while the filtering efficiently extracts a smoothsignal for the continuum solver. The goal of this paper is to investigate this newcombination and to discuss the related software design issues.

In Section 2 we introduce the relevant theoretical background on MD (2.1),the considered molecular-continuum coupling (2.2) and POD (2.3). Section 3focuses on the implementation of the MaMiCo data analytics and filtering ex-tension. In Section 4, we quantify the denoising quality of the filtering. Wefurther analyse signal-to-noise ratios (SNR) for a three-dimensional oscillatingCouette flow with synthetic noisy data (4.1). To validate the resulting transientcoupled molecular-continuum simulation, we use a Couette flow setup and ex-amine its startup (4.2). We also conduct performance measurements and scalingtests (4.3) of our new coupled flow simulation on the LRZ Linux Cluster4 plat-forms CoolMUC-2 and CoolMUC-3. Finally, we give a summary and provide anoutlook to future work in Section 5.

2 Theoretical Background

2.1 Molecular Dynamics (MD)

For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here, without loosing generality since the coupling and noisereduction methodology and software is compatible with other particle systemssuch as dissipative particle dynamics [9] systems in the same way.

For a number N of molecules with positions xi, velocities vi, mass m andinteracting forces F i, the behaviour of the system is determined by a Verlet-typenumerical integration, using a particle system time step width dtP, of Newton’sequation of motion

d

dtvi =

1

mF i,

d

dtxi = vi. (1)

F i is defined by Lennard-Jones parameters ε, σ and a cut-off-radius rc, see [13]for details.

The method is implemented using linked cells [10]. The implementation isMPI-parallelized in the simulation software SimpleMD [14] which is part ofMaMiCo; note that this solver choice is for simplicity only as it has been shownthat MaMiCo interfaces various MD packages, such as LAMMPS, ls1 mardyn,and ESPResSo/ESPResSo++.

2.2 Coupling and Quantity Transfer

The method used here to couple continuum and MD solver is based on [13, 16].

4 https://www.lrz.de/services/compute/linux-cluster/overview/

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34

Page 4: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

4 Piet Jarmatz and Philipp Neumann

The simulation setup and overlapping domain decomposition is shown inFigure 1. Nested time stepping is used, i.e. the particle system is advanced overn time steps during every time step of the continuum solver, dtC := n · dtP.

Fig. 1. Coupled Couette simula-tion transfer region setup. 2D slicethrough 3D domain, H is the walldistance. Here, only the four outercell layers close to the MD bound-ary are shown. At the outer MDdomain boundary, an additionalboundary force Fb(r) is applied tothe molecules. In the green outerMD cells, mass fluxes are trans-ferred from continuum to MD. Onthe two blue cell layers, velocityvalues are imposed to MD. Fi-nally, inside the red cells (arbi-trarily large region), velocity anddensity values are sampled, post-processed, and sent back to thecontinuum solver.

Particle → continuum data transfer: Continuum quantities for the coupling,such as average velocity u, are sampled cell-wise over a time interval dtC:

u =dtPNdtC

dtCdtP−1∑

k=0

N∑i=1

vi(t0 + kdtP) (2)

Afterwards, these quantities are optionally post-processed (separately for ev-ery MD instance), accumulated from all instances together, and represent themolecular flow field at continuum scale.

Continuum→ particle data transfer: Velocities coming from overlapping cellsof the continuum flow solver are used to accelerate the molecules in the corre-sponding grid cells, they are applied via an additional forcing term with shiftedtime intervals, see [13] for details. Mass flux from the continuum into the particlesimulation is realised by particle deletion and insertion with the USHER algo-rithm [5]. As boundary conditions for the MD-continuum boundary reflectingboundaries are used. We add the boundary force Fb(r) proposed by Zhou et al.[25]. It is given as an analytic fitting formula and estimates missing intermolec-ular force contributions over a wide range of densities and temperatures.

2.3 Noise Reduction: POD

The proper orthogonal decomposition (POD), also known as principal compo-nent analysis, is a general statistical method for high-dimensional data analysis

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34

Page 5: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

Parallel Noise Reduction for Molecular-Continuum Flow Simulation 5

of dynamic processes and a main technique for data dimensionality reduction. Itwas already used and investigated for analysis of velocity fields from atomisticsimulations [11, 26]. Here we employ POD as a fast standard method and basicexample to demonstrate the capabilities of our data analytics and noise reductioninterface. However, note that various algorithms are known to yield significantlystronger denoising, such as the POD+ methods proposed by Zimon et al. [26],e.g. POD together with wavelet transform or Wiener filtering. Also many multi-dimensional image processing filters like non-local means, anisotropic filteringor total variation denoising [4, 17] are promising for CFD data. They can beapplied with the noise reduction interface in the same way as POD.

Based on the method of snapshots [21] to analyse a space-time window of fluc-tuating data, consider a window of N discrete time snapshots t ∈ t1, ..., tN ⊂R. A finite set Ω of discrete sampling points x ∈ Ω ⊂ R3 defines a set of signalsources u(x, t). POD describes the function u(x, t) as a finite sum:

u(x, t) ≈kmax∑k=1

φk(x)ak(t) (3)

where the orthonormal basis functions (modes) φk(x) and ak(t) represent thetemporal and spatial components of the data, such that the first kmax modesare the best approximation of u(x, t). Choosing a sufficiently small kmax willapproximate only the dominating components of the signal u and exclude noise.The temporal modes ak(t) can be obtained by an eigenvalue decomposition ofthe temporal auto-correlation covariance matrix C,

Cij =∑x∈Ω

u(x, ti)u(x, tj) i, j = 1, 2, ..., N. (4)

They in turn can be used to compute the spatial modes φk(x) with orthogonalityrelations.

In case of parallel execution, i.e. when subsets of Ω are distributed over sev-eral processors, Eq. (4) can be evaluated by communicating local versions ofC in a global reduction operation. Although this is computationally expensivecompared to purely local, independent POD executions, it is helpful to pre-vent inconsistencies between subdomains and enforce smoothness, especially ina coupled molecular-continuum setting. Every other step of the algorithm canbe performed locally.

3 Implementation: MaMiCo Design and Extension

The MaMiCo tool [13, 14] is a C++ framework designed to couple arbitrary mas-sively parallel (i.e., MPI-parallel) continuum and particle solvers in a modularand flexible way. It also provides interfaces to coupling algorithms and data ex-change routines. In this paper we employ only the built-in MD simulation, calledSimpleMD on the microscopic side. On the macroscopic side, a simple Lattice

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34

Page 6: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

6 Piet Jarmatz and Philipp Neumann

Boltzmann (LB) implementation, the LBCouetteSolver and an analytical Cou-etteSolver are used. The necessary communication between SimpleMD ranksand the ranks of the LBCouetteSolver, where the respective cells are located, isperformed by MaMiCo.

The extended system design of MaMiCo is shown in Figure 2, where the lat-est developments have been marked in the central red box: A newly introducedinterface NoiseReduction (Listing 1) is part of the quantity transfer and cou-pling algorithmics bundle. It is primarily intended for noise filtering, but ableto manage arbitrary particle data analytics or post-processing tasks during thecoupling, independently of the coupling scheme, actual particle and continuumsolver in use. It is designed to be compatible with multi-instance MD compu-tations in a natural way, as separate noise filters are instantiated for each MDsystem automatically. This yields a strong subsystem separation rather thanalgorithmic interdependency. However, explicit cross-instance communication isstill possible if necessary.

Fig. 2. Extended MaMiCo system design.

We provide a dummy implementation, IdentityTransform, that performs nofiltering at all, of the NoiseReduction interface, as well as our own massivelyparallel implementation of a particle data noise filter using POD. The POD im-plementation employs Eigen5 to perform linear algebra tasks such as the eigen-value decomposition. It involves a single invocation of a global (per MD instance)

5 http://eigen.tuxfamily.org

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34

Page 7: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

Parallel Noise Reduction for Molecular-Continuum Flow Simulation 7

MPI reduction operation to enable detection of supra-process flow data correla-tion and its separation from thermal noise. It is fully configurable using XMLconfiguration files, where you can specify kmax and N .

1 template<unsigned int dim> class

coupling::noisereduction::NoiseReduction →

2 public: [...]

3 /** Is called for every macroscopic cell right before sending the

macroscopicMass and -Momentum data to the macroscopic solver,

but after the call to TransferStrategy::[...] */

4 virtual void processInnerMacroscopicCell(

5 coupling::datastructures::MacroscopicCell<dim> &cell, const

unsigned int &index )→

6 virtual void beginProcessInnerMacroscopicCells()

7 virtual void endProcessInnerMacroscopicCells()

8 [...]

9 ; // Excerpts from main coupling loop:

10 for (int cycles = 0; cycles < couplingCycles; cycles++)

11 couetteSolver->advance(dt_C); // Run one continuum step.

12 // Extract data from couette solver and send them to MD.

13 fillSendBuffer(*couetteSolver,sendBuffer,...);

14 multiMDCellService.sendFromMacro2MD(sendBuffer,...);

15 for (int i = 0; i < mdInstances; i++) // Run MD instances.

16 simpleMD[i]->simulateTimesteps(dt_P,...);

17 // Send back data from MD instances and merge it into recvBuffer of

this rank. This automatically calls the noise filter.→

18 multiMDCellService.sendFromMD2Macro(recvBuffer,...);

Listing 1: Code snippets from C++ noise reduction interface, outlining its call-back methodology, structure and basic usage.

4 Analysis of Simulation Results

Our test scenario, the Couette flow, consists of flow between two infinite parallelplates. A cubic simulation domain with periodic boundaries in x- and y-directionis used. The upper wall at a distance of z = H is at rest, while the lower wall atz = 0 moves in x-direction with constant speed uwall = 0.5. The analytical flowsolution for the start-up from unit density and zero velocity everywhere can bederived from the Navier–Stokes equations and is given by:

ux(z, t) = uwall

(1− z

H

)− 2uwall

π

∞∑k=1

1

ksin(kπ

z

H

)e−k

2π2νt/H2

(5)

where ν is the kinematic viscosity of the fluid and uy = 0, uz = 0.We refer to the smallest scenario that we frequently use as MD-30, because it

has a MD domain size of 30x30x30, embedded in a continuum simulation domain

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34

Page 8: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

8 Piet Jarmatz and Philipp Neumann

with H = 50. We always use a continuum (and MD) cell size of 2.5 and time stepsof dtC = 0.5, dtP = 0.005. The Lennard-Jones parameters and the molecule massare set to m = σ = ε = 1.0. The MD domain is filled with 28x28x28 particles,this yields a density value ρ ≈ 0.813 and a kinematic viscosity ν ≈ 2.63. MD-60and MD-120 scenarios are defined by doubling or quadrupling domain sizes andparticle numbers in all spatial directions, keeping everything else constant.

For a fluctuating x-velocity signal u(x, t) we define the signal-to-noise ratio

SNR = 10 log10

( ∑∀x,t u(x, t)2∑

∀x,t(u(x, t)− u(x, t))2

)dB (6)

with the analytical noiseless x-velocity u. As SNR is expressed using a loga-rithmic scale (and can take values equal to or less than zero - if noise level isequal to or greater than signal), absolute differences in SNR correspond to rela-tive changes of squared signal amplitude ratios, so we define the gain of a noisereduction method as

gain = SNROUT − SNRIN, (7)

with SNROUT and SNRIN denoting the signal-to-noise ratios of denoised dataand original fluctuating signal, respectively.

4.1 Denoising Harmonically Oscillating Flow

To point out and quantify the filtering quality of MaMiCo’s new POD noise filtercomponent, we investigate a harmonically oscillating 3D Couette flow. We usethe MD-30 scenario and set uwall to a time dependent sine signal. Since we wantto investigate only the noise filter here but not the coupled simulation, we usesynthetic (analytical) MD data with additive Gaussian noise, without runninga real MD simulation. This is not a physically valid flow profile as it disregardsviscous shear forces caused by the time-dependent oscillating acceleration, butit is eligible to examine and demonstrate noise filtering performance. We showthe influence of varying the POD parameters kmax and N in Figure 3, where thex-component of velocity, for clarity in only one of the cells in the MD domain,is plotted over time; the SNR value is computed over all cells.

The number kmax of POD modes used for filtering is considered to be a fixedsimulation parameter in this paper, however note that the optimal value of kmaxdepends on the flow features and several methods to choose kmax adaptively atruntime by analysing the eigenspectra have been proposed [11, 26].

Figure 3f shows the best reduction of fluctuations with a SNR gain of 17.05dB compared to the input signal 3a. Only one POD mode is sufficient here,since the sine frequency is low and the flow close to a steady-state, so thathigher eigenvalues already describe noise components of the input.

We validate the synthetic MD data test series by repeating the same experi-ment with a real coupled molecular-continuum simulation. Figure 4a shows thenoisy x-velocity signal in one of the cells. In Figure 4b we additionally enablemulti-instance sampling using 8 MD instances and observe a smooth sine outputwith a SNR gain of 19.84 dB.

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34

Page 9: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

Parallel Noise Reduction for Molecular-Continuum Flow Simulation 9

(a) Original noisy, Gaussian signal, withzero mean and standard deviation 0.16,SNR = 2.88 dB

(b) kmax = 3, N = 40, SNR = 11.68 dB

(c) kmax = 2, N = 10, SNR = 8.95 dB (d) kmax = 2, N = 80, SNR = 15.82 dB

(e) kmax = 1, N = 10, SNR = 12.28 dB (f) kmax = 1, N = 80, SNR = 19.93 dB

Fig. 3. SNR gain of our POD implementation in MaMiCo, for oscillating 3D Couetteflow, using synthetic MD data. Maximum uwall = 0.5. Red: x-component of velocity;Black: True noiseless signal for this cell

4.2 Start-Up of Coupled Couette Flow

We employ a one-way coupled LBCouetteSolver → SimpleMD simulation run-ning on 64 cores in a MD-30 scenario. The MD quantities that would be re-turned to the CouetteSolver are collected on the corresponding CouetteSolverrank. They are compared to the analytical solution in Figure 5.

Figure 5a shows a very high fluctuation level, because uwall = 0.5 is rela-tively low compared to thermal fluctuations of MD particles. Figures 5b and5c compare multi-instance sampling and noise filtering. The 32 instance simu-lation is computationally more expensive, but yields a strong gain of 14.73 dB.

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34

Page 10: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

10 Piet Jarmatz and Philipp Neumann

(a) Original noisy signal, SNR = 1.26 dB (b) 8 instances, kmax = 1, N = 80, SNR= 21.10 dB

Fig. 4. Oscillating 3D Couette, coupled one-way into a real MD simulation instead ofusing synthetic data.

(a) 1 MD instance, no noise filter, SNR =0.29 dB

(b) 32 MD instances, no noise filter, SNR= 15.02 dB

(c) 1 MD instance, POD(kmax = 2, N =40), SNR = 11.44 dB

(d) 32 MD instances, POD(kmax = 2, N =40), SNR = 22.92 dB

Fig. 5. Couette startup flow profiles, one-way LB → SimpleMD coupling, multi-instance MD versus noise filtering

Theoretically one would expect from multi-instance MD sampling with I = 32

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34

Page 11: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

Parallel Noise Reduction for Molecular-Continuum Flow Simulation 11

instances a reduction of the thermal noise standard deviation by factor√I,

and a reduction of squared noise amplitude by factor I, so the expected gain is10 log10(I) dB = 15.05 dB, which is in good compliance with our experimentalresult. The simulation with POD is using two modes here, which is necessaryas data in a singe mode does not capture the fast flow start-up. A smallercomparable gain of 11.15 dB is obtained here, using much less computationalresources (see Section 4.3). The best result is achieved using a combination ofmulti-instance sampling and noise filtering shown in Figure 5d. This novel ap-proach yields a signal-to-noise gain of 22.63 dB for this test scenario, so that theexperimentally produced velocity values closely match the analytical solution.

Thus the new combined coupling approach features benefits for both per-formance and precision. Its ability to extract considerably smoother flow fieldquantities permits coupling on shorter time scales.

4.3 Performance and Scaling Tests

Table 1. Impact of noise filter on overall coupled simulation performance. Ratios in

the form[

MD LBPOD MaMiCo

]– given as percentages of total runtime spent in the respective

simulation component.

domain size # cores NLB MD LB MD 20 40 80

100x100x100 30x30x30 1 1

[98.90 0.680.13 0.29

] [98.69 0.670.35 0.29

] [97.86 0.661.19 0.29

]

100x100x100 60x60x60 1 8

[98.00 0.460.17 1.36

] [97.84 0.460.33 1.36

] [97.23 0.460.96 1.35

]

200x200x200 120x120x120 8 64

[69.23 0.170.37 30.24

] [69.15 0.170.46 30.22

] [68.95 0.170.81 30.07

]

400x400x400 240x240x240 512 512

[95.67 2.621.01 0.70

] [95.30 2.701.32 0.68

] [94.81 2.781.71 0.70

]

In Table 1 we investigate the performance of the noise filtering subsystemcompared to the other components of the coupled simulation. POD always runswith same number of cores as MD. The MaMiCo time includes only efforts forcoupling communication – particle insertion and velocity imposition is counted asMD time. The table entries are sampled over 100 coupling cycles each, excludinginitializations, using a Couette flow simulation on the LRZ Linux Cluster.

The filter parameter N (time window size) strongly influences the POD run-time, as C ∈ RN×N – and communication and eigenvalue decomposition isperformed on C. This leads in practice to a complexity in the order of O(N3).

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34

Page 12: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

12 Piet Jarmatz and Philipp Neumann

Thus, choosing a sufficiently small N is important to limit the computationaldemand of the method.

Fig. 6. Strong scaling of coupled simula-tion, MD-120, including noise filtering. Allcores are used for LB, MD and POD, re-spectively.

However, the noise reduction run-time is always very low comparedto the other simulation components.Particularly with regard to the highgain in signal-to-noise ratio that isreached with relatively little compu-tational effort here, this is a very goodresult.

The scalability of the coupled sim-ulation, including the noise filter, isevaluated on LRZ CoolMUC-3. Thestrong scaling tests in Figure 6 areperformed in a fixed MD-120 domain with up to 512 cores used. It is foundthat our new POD implementation does not significantly impede the scaling,even though it employs a MPI reduction operation. This operation is executedonly once per coupling cycle, i.e. every 100 MD time steps, and is restrictedto a single MD instance. These results demonstrate the eligibility of our MDdata post-processing approach and POD implementation for high-performancecomputing applications.

5 Conclusions

We have introduced a new noise filtering subsystem into MaMiCo that en-ables massively parallel particle data post-processing in the context of transientmolecular-continuum coupling. Thanks to MaMiCo’s modular design, it is com-patible with any particle and continuum solver and can be utilized in conjunctionwith MD multi-instance sampling. Experiments with flow start-up profiles vali-date our coupled simulation. SNR considerations for oscillating flow demonstratethe filtering quality of our POD implementation. The noise filtering interface andimplementation are very scalable and have only a minimal impact on the overallcoupled simulation performance.

We observed SNR gains from the POD filter ranging roughly from 11 dBto 17 dB. This corresponds to a potential simulation performance increase byMD instance reduction by a factor of 10 to 50. However, we point out that acombination of both, POD and multi-instance sampling, yields even more smoothquantities and thus enables coupling on shorter time scales, while yielding ahigher level of parallelism than a POD-only approach.

As our interface is very generic and flexible, a possible future work suggestingitself would be to conduct experiments with more noise filtering algorithms.Especially the area of image processing offers many promising methods whichcould be applied to CFD instead of image data. Besides, further data analyticsand particle analysis tasks may be tackled in the future, such as flow profileextraction, pressure wave and density gradient detection, or specific molecular

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34

Page 13: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

Parallel Noise Reduction for Molecular-Continuum Flow Simulation 13

data collection modules, e.g. for determination of radial distribution functions.Resilient MD computations by a fault-tolerant multi-instance system with theability to recover from errors are also conceivable, as well as a machine learningbased extension extracting more smooth quantities by learning and detection ofprevalent flow features.

Acknowledgements

The provision of compute resources through the Leibniz Supercomputing Cen-tre is acknowledged. The authors further acknowledge financial support by theFederal Ministry of Education and Research, Germany, project ”Task-basedload balancing and auto-tuning in particle simulations” (TaLPas), grant number01IH16008B.

Bibliography

[1] Barsky, S., Delgado-Buscalioni, R., Coveney, P.V.: Comparison of moleculardynamics with hybrid continuum–molecular dynamics for a single tetheredpolymer in a solvent. The Journal of Chemical Physics 121(5) (2004)

[2] Borg, M.K., Lockerby, D.A., Reese, J.M.: A hybrid molecular–continuummethod for unsteady compressible multiscale flows. Journal of Fluid Me-chanics 768, 388–414 (2015), https://doi.org/10.1017/jfm.2015.83

[3] Borgdorff, J., Mamonski, M., Bosak, B., Kurowski, K., Belgacem, M.B.,Chopard, B., Groen, D., Coveney, P.V., Hoekstra, A.G.: Distributed mul-tiscale computing with muscle 2, the multiscale coupling library and envi-ronment. Journal of Computational Science 5(5), 719–731 (2014)

[4] Buades, A., Coll, B., Morel, J.M.: A non-local algorithm for image denoising.In: Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEEComputer Society Conference on, vol. 2, pp. 60–65 (2005)

[5] Delgado-Buscalioni, R., Coveney, P.: USHER: an algorithm for particle in-sertion in dense fluids. The Journal of chemical physics 119(2) (2003)

[6] Delgado-Buscalioni, R., Coveney, P.V., Riley, G.D., Ford, R.W.: Hybridmolecular-continuum fluid models: implementation within a general cou-pling framework. Philosophical Transactions of the Royal Society of LondonA: Mathematical, Physical and Engineering Sciences 363(1833) (2005)

[7] Dupuis, A., Kotsalis, E., Koumoutsakos, P.: Coupling lattice boltzmannand molecular dynamics models for dense fluids. Physical Review E 75(4),046704 (2007)

[8] E, W., Engquist, B., Li, X., Ren, W., Vanden-Eijnden, E.: Heterogeneousmultiscale methods: A review. Communications in Computational Physics2(3), 367–450 (2007)

[9] Espanol, P.: Hydrodynamics from dissipative particle dynamics. Phys. Rev.E 52, 1734–1742 (Aug 1995)

[10] Gonnet, P.: A simple algorithm to accelerate the computation of non-bondedinteractions in cell-based molecular dynamics simulations. Journal of Com-putational Chemistry 28(2), 570–573 (2007)

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34

Page 14: MaMiCo: Parallel Noise Reduction for Multi-Instance ...2.1 Molecular Dynamics (MD) For the sake of simplicity we restrict our considerations to a set of Lennard-Jones molecules here,

14 Piet Jarmatz and Philipp Neumann

[11] Grinberg, L.: Proper orthogonal decomposition of atomistic flow simula-tions. Journal of Computational Physics 231(16), 5542–5556 (2012)

[12] Koumoutsakos, P.: Multiscale flow simulations using particles. Annu. Rev.Fluid Mech. 37, 457–487 (2005)

[13] Neumann, P., Bian, X.: Mamico: Transient multi-instance molecular-continuum flow simulation on supercomputers. Computer Physics Commu-nications 220, 390–402 (2017)

[14] Neumann, P., Tchipev, N.: A coupling tool for parallel molecular dynamics-continuum simulations. In: Parallel and Distributed Computing (ISPDC),2012 11th International Symposium on, pp. 111–118 (2012)

[15] Ngandjong, A.C., Rucci, A., Maiza, M., Shukla, G., Vazquez-Arenas, J.,Franco, A.A.: Multiscale simulation platform linking lithium ion batteryelectrode fabrication process with performance at the cell level. The Journalof Physical Chemistry Letters 8(23), 5966–5972 (2017)

[16] Nie, X., Chen, S., Robbins, M.O., et al.: A continuum and molecular dynam-ics hybrid method for micro-and nano-fluid flow. Journal of Fluid Mechanics500, 55–64 (2004)

[17] Perona, P., Malik, J.: Scale-space and edge detection using anisotropic diffu-sion. IEEE Transactions on pattern analysis and machine intelligence 12(7),629–639 (1990)

[18] Praprotnik, M., Site, L.D., Kremer, K.: Multiscale simulation of soft mat-ter: From scale bridging to adaptive resolution. Annual Review of PhysicalChemistry 59(1), 545–571 (2008)

[19] Ren, X.G., Wang, Q., Xu, L.Y., Yang, W.J., Xu, X.H.: Hacpar: An efficientparallel multiscale framework for hybrid atomistic–continuum simulation atthe micro-and nanoscale. Advances in Mechanical Engineering 9(8) (2017)

[20] Ritos, K., Borg, M.K., Lockerby, D.A., Emerson, D.R., R., J.M.: Hybridmolecular-continuum simulations of water flow through carbon nanotubemembranes of realistic thickness. Microfluidics and Nanofluidics (2015)

[21] Sirovich, L.: Turbulence and the dynamics of coherent structures. i. coherentstructures. Quarterly of applied mathematics 45(3), 561–571 (1987)

[22] Smith, E.: On the coupling of molecular dynamics to continuum computa-tional fluid dynamics (2013)

[23] Tang, Y.H., Kudo, S., Bian, X., Li, Z., Karniadakis, G.E.: Multiscale uni-versal interface: a concurrent framework for coupling heterogeneous solvers.Journal of Computational Physics 297, 13–31 (2015)

[24] Werder, T., Walther, J.H., Koumoutsakos, P.: Hybrid atomistic–continuummethod for the simulation of dense fluid flows. Journal of ComputationalPhysics 205(1), 373–390 (2005)

[25] Zhou, W., Luan, H., He, Y., Sun, J., Tao, W.: A study on boundary forcemodel used in multiscale simulations with non-periodic boundary condition.Microfluidics and nanofluidics 16(3), 1 (2014)

[26] Zimon, M., Prosser, R., Emerson, D., Borg, M.K., Bray, D., Grinberg,L., Reese, J.M.: An evaluation of noise reduction algorithms for particle-based fluid simulations in multi-scale applications. Journal of Computa-tional Physics 325, 380–394 (2016)

ICCS Camera Ready Version 2019To cite this paper please use the final published version:

DOI: 10.1007/978-3-030-22747-0_34