
38th National Conference on Fluid Mechanics and Fluid Power (FMFP 2011)

* Address all correspondence to this author

Proceedings of the 38th National Conference on Fluid Mechanics and Fluid Power

December 15-17, 2011, MANIT, Bhopal

CFD-17

PERFORMANCE STUDY OF A PARALLELIZED LEVEL-SET METHOD BASED 3D TRANSIENT SOLVER ON VARIOUS TWO-PHASE FLOW PROBLEMS

Vishesh Aggarwal
Department of Mechanical Engineering, Indian Institute of Technology Bombay, Mumbai 400 076
Email: [email protected]

Vinesh H. Gada
Department of Mechanical

Atul Sharma*
Department of Mechanical



ABSTRACT

A level-set method based two-phase flow solver is parallelized using a unidirectional domain decomposition approach. It employs a finite volume formulation for discretizing the conservation equations and a finite difference formulation for discretizing the level-set advection equation, over a staggered grid in Cartesian/cylindrical coordinates. The domain is mapped onto a distributed memory parallel architecture using domain decomposition, with overlapping boundary cells that exchange data using MPI. The parallel code is validated against a strategic set of test cases (ranging from laminar pipe flow to film boiling), which are also used to quantify the parallel performance of the code across a range of problems. The parallel code is run on a 64-bit Xeon cluster on up to 16 processors. Numerical predictions from the parallelized code are in excellent agreement with those from the serial code, with parallel efficiencies ranging up to 99%.

Keywords: Level-Set Method, Two-Phase Flow, MPI, Parallel Speedup, Domain Decomposition

INTRODUCTION

The need to keep computational time within practical time-frames (particularly for multiphase flows), coupled with easy access to parallel computing hardware, has given impetus to the parallel implementation of CFD solvers. This work aims to parallelize an existing serial two-phase flow solver for a distributed memory parallel architecture.

From a literature survey, it is found that the majority of parallel solvers are implemented and tested on single-phase flow problems. Fewer studies have applied these techniques to simulate multiphase flows, as shown in Table 1. In all of these studies, the parallel speedup has been assessed by varying the grid size on a particular problem. However, this may not be sufficient to demonstrate the complete capability of a parallelization method. The parallel speedup, particularly in multiphase problems, depends not only on the phenomenon under consideration but also on the physical properties of the interacting fluids. For example, a higher density ratio of the two fluids results in a stiffer coefficient matrix of the pressure Poisson equation. This increases the overall computation time to convergence, which in turn may affect the parallel performance, either adversely or favorably. Moreover, few studies have explicitly evaluated the magnitude of the communication and idle times spent by each processor and their effect on parallel speedup.

The present study employs a novel technique for the level-set method, as discussed by Gada and Sharma, 2011. The parallelization is implemented using a single-direction domain decomposition, which requires minimal modification of the corresponding serial code. Pseudo boundary cells are created on each partitioned sub-domain and exchange data across processors using MPI (Message Passing Interface). We evaluate the scalability of the proposed method over various two-phase flow problems, each tested on 1, 2, 4, 8 and 16 processors. A preliminary single-phase flow problem is also tested, which forms the basis for comparing performance across the different two-phase problems. Each test case is chosen to employ a different combination of solvers and/or fluid properties. The range of test cases serves two purposes. First, it aids in investigating the effect of problem stiffness on scalability. Second, it helps to trace the limitation on parallel performance for a particular problem to a bottleneck in the scheme of solvers, which include the Navier-Stokes (velocity prediction and pressure projection), level-set and energy equation solvers. This limitation can be further related to the percentage of inter-processor communication time for each of the solvers. Thus, besides evaluating scalability, such a study highlights the potential areas for improvement.
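The role of the pseudo boundary cells can be illustrated without MPI. The sketch below, in Python, is a serial emulation under stated assumptions: a one-dimensional field is split into slabs along one direction, each slab is padded with one ghost cell per side, and a plain list copy stands in for the MPI send/receive. All function names are hypothetical and the single ghost layer is chosen for brevity; this is not the solver's actual implementation.

```python
# Serial emulation of a one-directional domain decomposition with ghost cells.
# No MPI is used: the "exchange" is a plain copy between Python lists.

def split_with_ghosts(field, n_parts):
    """Split a 1D list into n_parts slabs, each padded with one ghost cell per side."""
    size = len(field) // n_parts  # assumes len(field) is divisible by n_parts
    return [[0.0] + field[p * size:(p + 1) * size] + [0.0] for p in range(n_parts)]

def exchange_ghosts(slabs, left_bc, right_bc):
    """Fill ghost cells from neighbouring slabs (stand-in for MPI send/receive)."""
    for p, slab in enumerate(slabs):
        slab[0] = slabs[p - 1][-2] if p > 0 else left_bc
        slab[-1] = slabs[p + 1][1] if p < len(slabs) - 1 else right_bc

def smooth(slab):
    """One averaging sweep over the interior cells; end cells are left unchanged."""
    interior = [0.5 * (slab[i - 1] + slab[i + 1]) for i in range(1, len(slab) - 1)]
    return [slab[0]] + interior + [slab[-1]]

def parallel_sweep(field, n_parts, left_bc=0.0, right_bc=0.0):
    """Decompose, exchange ghost values, sweep each slab, then reassemble."""
    slabs = split_with_ghosts(field, n_parts)
    exchange_ghosts(slabs, left_bc, right_bc)
    slabs = [smooth(s) for s in slabs]
    return [v for s in slabs for v in s[1:-1]]  # strip ghost cells

def serial_sweep(field, left_bc=0.0, right_bc=0.0):
    """Reference sweep over the undivided field."""
    return smooth([left_bc] + field + [right_bc])[1:-1]

if __name__ == "__main__":
    field = [float(i * i) for i in range(16)]
    for n in (1, 2, 4, 8, 16):
        print(n, parallel_sweep(field, n) == serial_sweep(field))
```

Because each slab only needs its neighbours' edge values, the per-slab sweep is the same routine as the serial one, which is the sense in which such a decomposition leaves the serial code largely unchanged.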

Table 1. Summary of literature review on distributed memory MPI based parallel two-phase flow solvers

Authors | (Np)max | Problems tested for parallel speedup | Time criteria used in evaluating parallel speedup | 2D/3D | Numerical method (a)
George and Warren, 2002 | 24 | Dendritic growth | Total run times | 3D | Phase-field
Sussman, 2005 | 16 | Wobbly bubble | Average run time per time step | 3D | CLSVOF
Wang et al., 2006 | 64 | Dendritic growth | Run time for 500 time steps | 2D | LS
Fortmeier and Bucker, 2010 | 256 | Bubble rise in quiescent fluid | Run time for a single time step | 3D | LS
Hajihashemi and El-Shenawee, 2010 | 400 | Reconstruction of star, ellipse, cylinder shapes | Total run times | 2D | LS
Agbaglah et al., 2011 | 512 | Lid driven cavity | Run time for 100 time steps | 3D | VOF
Fortmeier and Bucker, 2011 | 128 | Re-initialization of cube slices, sphere | Total run times | 3D | LS
Zuzio and Estivalezes, 2011 | 256 | Damped surface wave oscillation | Average time per iteration | 2D | LS

(a) LS: level set; VOF: volume-of-fluid; CLSVOF: combined LS and VOF

PHYSICAL DESCRIPTION OF TEST PROBLEMS AND CODE VALIDATION

The test problems considered in this study are listed in Table 2. The grid sizes are selected such that the ratio of cells involved in MPI communication to the total number of cells per sub-domain is nearly the same across the test cases. This normalizes the effect of communication overheads on parallel performance across the set of problems, which would otherwise have biased the comparison. Within each test case, critical numerical parameters (such as the grid size, time step, final time and user-specified error tolerances) are kept identical for the serial and parallel codes.
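The ratio described above can be estimated with a few lines of arithmetic. The sketch below is an inference rather than something stated in the text: it assumes that the domain is split into equal slabs along the last (axial) grid direction and that four cell layers per sub-domain take part in the exchange (for instance two layers at each of the two interfaces). The axial cell counts 306, 352 and 288 are likewise assumptions, chosen because they reproduce the percentages reported in Table 2 for Np = 16; the last two correspond to grids ending in 354 and 290 with two boundary cells removed.

```python
def mpi_cell_fraction(n_cells_axial, n_procs, layers_per_subdomain=4):
    """Percentage of a sub-domain's cell layers that take part in the exchange.

    layers_per_subdomain=4 is an assumed value, not one given in the paper.
    """
    layers_per_slab = n_cells_axial / n_procs
    return 100.0 * layers_per_subdomain / layers_per_slab

if __name__ == "__main__":
    # Assumed axial cell counts for the pipe, bubble/jet and film boiling grids.
    for n_axial in (306, 352, 288):
        print(n_axial, round(mpi_cell_fraction(n_axial, 16), 1))
```

The same function shows why the ratio grows with the processor count: doubling Np halves the slab thickness while the number of exchanged layers stays fixed.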

Single-phase Flow in a Pipe

This problem is executed as two sub-cases: 1A, hydrodynamically developing isothermal flow; and 1B, hydrodynamically and thermally developing flow in a pipe maintained at a constant wall temperature.



Table 2. Test cases used for validating 3D transient parallel solver and comparing parallel performance

Case | Description | Solvers invoked (b) | Grid size | % grid using MPI (c)
1A | Single-phase isothermal flow in a pipe | NS | 30×92×306 | 20.9
1B | Single-phase non-isothermal flow in a pipe | NS+EE | 30×92×306 | 20.9
2 | Two-phase stratified flow in a pipe | NS+LS | 30×92×306 | 20.9
3A | Rise of n-butanol bubble in quiescent water | NS+LS (with ST) | 154×42×354 | 18.2
3B | Rise of an air bubble in a quiescent liquid | NS+LS (with ST) | 68×42×354 | 18.2
4 | Jet formation in quiescent water | NS+LS (with ST) | 112×32×354 | 18.2
5 | Film boiling over a flat surface | NS+LS+EE+PC (with ST) | 66×66×290 | 22.2

(b) NS: Navier-Stokes solver; LS: level-set advection and re-initialization; EE: energy equation solver; PC: phase change related modules; ST: surface tension source term
(c) Evaluated for Np = 16

Uniform flow and temperature conditions are applied at the inlet, with Re = 50 and Pr = 0.7. A pipe length-to-diameter ratio L/D = 5 is taken, which allows the flow to reach a fully developed condition near the exit. The domain is discretized using a cylindrical grid of 30×92×306.

In the fully developed region, the friction factor is obtained as 1.286 and 1.279, while the Nusselt number is 3.677 and 3.678, for Np = 1 and 16, respectively. These values are within 0.6% of the analytical values f = 1.28 and Nu = 3.657.

Two-phase Stratified Flow in a Pipe

Instead of the single-phase flow considered in case 1, a two-phase stratified flow in a pipe is simulated here. A uniform velocity condition, with an ideally flat interface, is assumed to exist at the inlet. Further, a hold-up ratio (the ratio of the flow area occupied by the lighter fluid to the total flow area) of 0.5 is taken at the inlet. The two fluids are assumed to be immiscible, with a density ratio of 1 and a viscosity ratio of 5.326. As in case 1, we take L/D = 5 and Re1 = 50.

The analytical solution and the numerically predicted iso-contours of w-velocity at the pipe exit are compared in Fig. 1. The numerical value of wmax (obtained along the vertical line of symmetry) is within 2% of its analytical value. Further, the variation of wmax is within 0.5% across 1 to 16 processors.

Figure 1. Comparison of analytical and numerical w-velocity iso-contours in fully developed two-phase stratified flow (case 2, Np = 16)

Bubble Rise in a Quiescent Liquid Column

Here, we consider two sub-cases of bubble rise in a stagnant liquid, each with a different fluid combination. A higher density ratio demands a smaller time step and also increases the stiffness of the pressure Poisson equation. It is conjectured that this may affect the proportion of communication overheads in a parallel run. Therefore, such a comparison can narrow down the target areas for improvement in parallel performance.



Case 3A represents the rise of an n-butanol (fluid 2) droplet in water (fluid 1), while case 3B deals with the rise of an air (fluid 2) bubble in a liquid (fluid 1). Fluid properties are taken as ρ1 = 986.51, ρ2 = 845.44, μ1 = 1.39×10^-3, μ2 = 3.28×10^-3, σ = 1.63×10^-3 for case 3A; and ρ1 = 875.5, ρ2 = 1, μ1 = 0.118, μ2 = 1×10^-3, σ = 32.2×10^-3 for case 3B. Initially, the bubble is assumed to be perfectly spherical and at rest inside a cylindrical domain. Drop diameters (Db) of 0.002 m and 0.0122 m, with length scales of Db and 0.5Db, are taken for cases 3A and 3B, respectively. The velocity scales are taken as 0.058 m/s and 0.215 m/s for the respective cases. The non-dimensional domain size for both cases is taken as L = 15 and D = 6. A free-slip boundary condition is applied on the side and bottom walls, while an outflow condition is applied at the top wall of the domain.
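To make the contrast between the two sub-cases concrete, the density ratios can be computed directly from the quoted properties. The sketch below takes the first two property values listed for each case as the densities of fluid 1 and fluid 2, which is an assumption about the notation; the function name is hypothetical.

```python
def density_ratio(rho_light, rho_heavy):
    """Ratio of the lighter-fluid density to the heavier-fluid density."""
    return rho_light / rho_heavy

# (rho1, rho2) as quoted for each sub-case; interpreted here as densities.
cases = {"3A": (986.51, 845.44), "3B": (875.5, 1.0)}

if __name__ == "__main__":
    for name, (rho1, rho2) in cases.items():
        ratio = density_ratio(rho2, rho1)
        print(f"case {name}: rho2/rho1 = {ratio:.4f}, rho1/rho2 = {1.0 / ratio:.1f}")
```

On this reading the two fluids in case 3A differ in density by less than a factor of 1.2, whereas in case 3B they differ by a factor of several hundred, which is why case 3B is the stiffer of the two.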

Figure 2 shows excellent agreement between the present and published results for the instantaneous bubble shapes. For case 3A, the terminal velocity is obtained as u/uc = 0.991, which is within 0.8% of that reported by Bertakis et al., 2010. For case 3B, the terminal velocity reaches a steady value of u/uc = 0.933. The results published by Sussman and Smereka, 1997 were obtained with a far-field boundary condition on the side walls, which is reported to give a terminal velocity about 9% higher than the free-slip boundary condition used here.

Jet Formation in Quiescent Water

Unlike the previous test case, jet formation ensures a more uniform interface presence throughout the domain, thereby distributing the computational burden more uniformly across the partitioned sub-domains. This is conjectured to affect the parallel performance favorably. Here, we simulate the breakup of a paraffin-kerosene (fluid 2) jet injected vertically upwards into stagnant water (fluid 1), which is similar to system 3-2 of Kitamura et al., 1982. Fluid properties are taken as ρ1 = 998, μ1 = 1.03×10^-3, ρ2 = 848, μ2 = 1.88×10^-2, σ = 40.4×10^-3. The nozzle injection diameter (Db) is taken as 0.122 m.

Figure 2. Evolution of rising bubble shapes (case 3, Np = 16): (a) rise of n-butanol bubble in water; (b) rise of air bubble in liquid (the dotted pattern represents the results from the present study, superimposed on those reported by Sussman and Smereka, 1997)

Taking Lc = Db and uc = 0.35 m/s (average jet injection velocity), we get Re1 = 414, We = 3.7 and Fr = 3.2. Further, the non-dimensional domain size is taken as L = 40 and D = 13. The injection velocity mimics a fully developed velocity profile. No-slip boundary conditions are applied on the side and bottom walls.
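The quoted non-dimensional groups can be cross-checked from the fluid properties. The sketch below rests on several assumptions that are not stated in the text: SI units throughout, water (fluid 1) properties in Re and We, a Froude number defined as uc/(g Lc)^(1/2), and g = 9.81 m/s^2. Under these assumptions the quoted Re1, We and Fr are reproduced for a length scale of about 1.22×10^-3 m, so that value is used here as an assumed diameter; a length scale of 0.122 m would give a Reynolds number one hundred times larger.

```python
import math

def reynolds(rho, u, length, mu):
    """Reynolds number rho*u*L/mu."""
    return rho * u * length / mu

def weber(rho, u, length, sigma):
    """Weber number rho*u^2*L/sigma."""
    return rho * u ** 2 * length / sigma

def froude(u, length, g=9.81):
    """Froude number taken as u/sqrt(g*L) (assumed definition)."""
    return u / math.sqrt(g * length)

# Water properties and velocity scale as quoted; d_b is an assumed value in metres.
rho1, mu1, sigma, u_c = 998.0, 1.03e-3, 40.4e-3, 0.35
d_b = 1.22e-3

if __name__ == "__main__":
    print("Re1 =", round(reynolds(rho1, u_c, d_b, mu1)))
    print("We  =", round(weber(rho1, u_c, d_b, sigma), 1))
    print("Fr  =", round(froude(u_c, d_b), 1))
```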

Figure 3 shows the predicted flow pattern. The diameter of the droplets, averaged over those between 20D and 35D, is 3.32, whereas the jet breakup length is 5.3. These results are within 4% and 21%, respectively, of the published results (Kitamura et al., 1982).

Film Boiling over a Flat Surface

A film boiling problem exercises all the solvers employed in the present code and allows a complete evaluation of the parallel performance. The problem consists of a liquid pool with a thin vapour film present over the bottom surface.



Figure 3. Jet pattern evolution (case 4, Np = 16)

A constant wall superheat is imposed on the bottom surface. The minimum domain size required to capture the phenomenon is dictated by the most dangerous Taylor wavelength for a three-dimensional simulation, λd3 (Esmaeeli and Tryggvason, 2004), evaluated using Eq. (1).

$$\lambda_{d3} = 2\sqrt{2}\,\pi \left[ \frac{3\sigma}{g(\rho_1 - \rho_2)} \right]^{1/2} \tag{1}$$

A domain size of 0.5λd3 × 0.5λd3 × λd3 with a grid size of 66×66×130 is selected to benchmark the numerical results, whereas a domain of 0.5λd3 × 0.5λd3 × 2.25λd3 with a grid size of 66×66×290 is employed to compare the parallel performance. While the former is sufficient to capture the phenomenon, the latter keeps the ratio of MPI cells to interior cells similar to that of the other cases. The computational domain considered here is a quarter of the complete domain (λd3 × λd3 × λd3) shown in Fig. 4. Thus, this quarter domain captures a quarter of the bubbles released in the node and anti-node modes at the pair of diagonally opposite corners. The characteristic length scale is taken equal to the capillary length, Lc = [σ/(g(ρ1 − ρ2))]^(1/2), and the characteristic velocity scale is uc = (gLc)^(1/2). The property ratios are taken as 0.603, 0.693, 0.987 and 1.615. The governing parameters are obtained as Re1 = 18.81, Pr1 = 2.79, Fr = 1, We = 1.06 and Ja = 0.57. Following the initialization method adopted by Esmaeeli and Tryggvason, 2004, the vapour film is initially perturbed using Eq. (2).

3

3 3

1 2 2

1 cos cos8 5

d

d d

x y

z (2)
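As a rough cross-check on the scales used in this test case, the most dangerous wavelength can be related to the capillary length. The short derivation below assumes that Eq. (1) has the form given by Esmaeeli and Tryggvason, 2004, and uses the definitions of Lc and uc quoted above; it is a sketch under that assumption rather than part of the original analysis.

```latex
\begin{align*}
L_c &= \left[\frac{\sigma}{g(\rho_1-\rho_2)}\right]^{1/2}, \qquad
\lambda_{d3} = 2\sqrt{2}\,\pi\left[\frac{3\sigma}{g(\rho_1-\rho_2)}\right]^{1/2}
             = 2\sqrt{6}\,\pi\,L_c \approx 15.4\,L_c, \\
u_c &= (g L_c)^{1/2} \quad\Longrightarrow\quad \frac{u_c}{(g L_c)^{1/2}} = 1 .
\end{align*}
```

Under this assumption the quarter-domain width 0.5λd3 is about 7.7 Lc, and the choice of uc is what makes the Froude number equal to 1 in the list of governing parameters.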

Figure 4 shows the interface shapes at non-dimensional times of 50 and 100. The temporal variation of the surface-averaged Nu on the superheated surface is shown in Fig. 5. The time-averaged value of Nu over non-dimensional times from 110 to 300 is 1.459. This deviates by 13.6% and 16.5% from the average Nu calculated from the correlations given by Berenson, 1961 and Son and Dhir, 1998, respectively. Similar deviations in numerically predicted values of Nu have been reported in the literature.

Figure 4. Interface evolution in film boiling, with bubble formation at node and anti-node locations

Figure 5. Temporal variation of surface-averaged Nusselt number (case 5, Np = 16)

PARALLEL PERFORMANCE

Each test case is run on a 20-node cluster, with each node having 2 GB of memory and 8 dual-core Intel Xeon 2.4 GHz processors. The code is compiled using the C++ library of MPICH2.



rise test case 3A (even though the density ratios of the two phases are nearly the same in cases 3A and 4). This erratic performance for test case 4 is due to an increased number of iterations taken by Np = 4 and 16 compared to those taken by Np = 1, 2 and 8. Therefore, the denominator of Eq. (3) has peaks for Np = 4 and 16, which results in very low values of S4 and S16.
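The effect described here can be illustrated with the usual definitions of speedup and efficiency. Eq. (3) is not reproduced in this text, so the sketch below assumes the standard forms Sn = T1/Tn and En = Sn/n, where Tn is the run time on n processors. The timings are hypothetical numbers chosen only to mimic a run in which Np = 4 and 16 need extra iterations; they are not results from the paper.

```python
def speedup(t_serial, t_parallel):
    """Parallel speedup S_n = T_1 / T_n (assumed standard definition)."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency E_n = S_n / n (assumed standard definition)."""
    return speedup(t_serial, t_parallel) / n_procs

# Hypothetical wall-clock times in seconds; Np = 4 and 16 are inflated
# to imitate runs that take more iterations to converge.
t1 = 1600.0
timings = {2: 820.0, 4: 640.0, 8: 230.0, 16: 310.0}

if __name__ == "__main__":
    for n, tn in timings.items():
        print(f"Np = {n:2d}: S = {speedup(t1, tn):.2f}, E = {efficiency(t1, tn, n):.2f}")
```

A larger run time in the denominator lowers Sn directly, so a run that needs more iterations shows a dip in speedup even if its communication cost is unchanged.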

While parallel efficiency generally decreases with an increasing number of processors, cases 1A, 2 and 3A show a marginal improvement at E16. This improvement can be attributed to better memory performance and cache availability for Np = 16. However, for cases 1B and 5, this advantage is offset by the larger data storage required by the heat transfer module variables, so that E16 drops substantially compared to the other cases.

Effect of the Dominant Solvers on En

Figure 7 shows the relative computational time taken by the different solvers. For each case, the time is evaluated as the average over the runs for Np = 2 to Np = 16. Among cases 1A, 1B and 2, case 1B has a higher contribution from the pressure Poisson equation (PPE) solver and also gives a better performance. Similarly, case 3B has a higher contribution from the PPE solver and also a better performance compared to case 3A. This is due to a stiffer PPE in the former. Comparing cases 3A and 4, although case 4 has a lower contribution from the PPE solver, it gives a better performance for Np = 2 and 8 due to the reduced idle times associated with MPI, which result from a better computational load distribution in case 4.

Figure 7. Relative computational effort per solver for all the test problems

CONCLUDING REMARKS

A level-set based two-phase solver is parallelized using a domain decomposition procedure, based on the data-parallel model, which requires minimal modification of the corresponding serial code. The parallel processes are coupled via data exchange/update on the boundary cells of each sub-domain using MPI. The parallelized code has been validated with various test problems against analytical/published results. The parallel performance is evaluated on 2 to 16 processors.

For a fixed number of processors, the performance varies considerably across the set of test cases. This suggests that a single problem with varying grid sizes may not be sufficient to fully exhibit the parallel scalability of an algorithm, especially for two-phase flow problems. Factors such as the property ratios of the fluids, the relative distribution of the light and heavy fluids over the domain, the uniformity of the fluid activity handled by the various processors and the nature of the iterative solvers invoked all contribute to the variation observed in the parallel speedup.

In the present scheme of solvers, the PPE solver is found to give the best improvement in scaling, while the Gauss-Seidel velocity predictor can be improved. On the other hand, the Gauss-Seidel energy equation solver has a favorable effect on the parallel performance. The phase change solver is conjectured to increase the communication time at a faster rate with an increasing number of processors than in the other problems. Further, a problem with uniform fluid activity over the domain is found to have a better load balance.



NOMENCLATURE

D Diameter [m]
En Parallel efficiency with n processors
f Friction factor
Fr Froude number [uc/(gLc)^(1/2)]
g Acceleration due to gravity [m/s^2]
Ja Jacob number [cp2(Tw-Tsat)/h12]
k Thermal conductivity [W/m.K]
L Length [m]
Np Number of processors in parallel
Nu Nusselt number [hLc/k]
Prf Prandtl number [μf cpf/kf]
Ref Reynolds number [ρf uc Lc/μf]
Sn Parallel speedup with n processors
t Time [s]
T Temperature [K]
u Velocity [m/s]
w Axial velocity [m/s]
We Weber number [ρ uc^2 Lc/σ]

Greek
Ratio of specific heat [cp2/cp1]
Ratio of thermal conductivity [k2/k1]
Viscosity ratio [μ2/μ1]
λd3 Critical wavelength for 3D boiling [m]
μ Viscosity [Pa.s]
ρ Density [kg/m^3]
Non-dimensional time
Density ratio [ρ2/ρ1]

Subscripts
b Bubble/Injection Nozzle
c Characteristic scale
f Phase (f = 1 for heavier fluid; f = 2 for lighter fluid)
w Wall

REFERENCES

Agbaglah G, Delaux S, Fuster D, Hoepffner J, Josserand C, Popinet S, Ray P, Scardovelli R, Zaleski S, 2011. Parallel simulation of multiphase flows using octree adaptivity and VOF method, Comptes Rendus Mécanique 339, 194-207.

Berenson PJ, 1961. Film boiling heat transfer from a horizontal surface, J Heat Transfer 83, 351-362.

Bertakis E, Gross S, Grande J, Fortmeier O, Reusken A, Pfennig A, 2010. Validated simulation of droplet sedimentation with finite-element and level-set methods, Chemical Engineering Science 65, 2037-2051.

Esmaeeli A, Tryggvason G, 2004. Computations of film boiling. Part I: numerical method, Int J Heat and Mass Transfer 47, 5451-5461.

Fortmeier O, Bucker HM, 2010. A parallel strategy for a level set simulation of droplets moving in a liquid medium, Proceedings of VECPAR, 200-209.

Fortmeier O, Bucker HM, 2011. Parallel re-initialization of level set functions on distributed unstructured tetrahedral grids, J Computational Physics 230, 4437-4453.

Gada VH, Sharma A, 2011. On a novel dual-grid level-set method for two-phase flow simulation, Numerical Heat Transfer Part B: Fundamentals 59, 26-57.

George WL, Warren JA, 2002. A parallel 3D dendritic growth simulator using the Phase-Field Method, J Computational Physics 177, 264-283.

Hajihashemi MR, El-Shenawee M, 2010. High performance computing for the level-set reconstruction algorithm, J Parallel and Distributed Computing 70, 671-679.

Johnson SP, Cross M, 1991. Mapping structured grid three-dimensional CFD codes onto parallel architectures, Applied Mathematical Modelling 15, 394-405.

Kitamura Y, Mishima H, Takahashi T, 1982. Stability of jets in liquid-liquid systems, The Canadian J Chemical Engineering 60, 723-731.

Son G, Dhir VK, 1998. Numerical simulation of film boiling near critical pressures with a level-set method, J Heat Transfer 120, 183-192.

Sussman M, 2005. A parallelized, adaptive algorithm for multiphase flows in general geometries, Computers and Structures 83, 435-444.

Sussman M, Smereka P, 1997. Axisymmetric free boundary problems, Journal of Fluid Mechanics 341, 269-294.

Wang K, Chang A, Kale LV, Dantzig JA, 2006. Parallelization of a level set method for simulating dendritic growth, J Parallel and Distributed Computing 66, 1379-1386.

Zuzio D, Estivalezes JL, 2011. An efficient block parallel AMR method for two phase interfacial flow simulations, Computers and Fluids 44, 339-357.
