38th National Conference on Fluid Mechanics and Fluid Power (FMFP 2011)
* Address all correspondence to this author
Proceedings of the 38th National Conference on Fluid Mechanics and Fluid Power
December 15-17, 2011, MANIT, Bhopal
CFD-17
PERFORMANCE STUDY OF A PARALLELIZED LEVEL-SET METHOD
BASED 3D TRANSIENT SOLVER ON VARIOUS TWO-PHASE FLOW
PROBLEMS
Vishesh Aggarwal
Department of Mechanical Engineering, Indian Institute of Technology Bombay, Mumbai 400 076
Email: [email protected]
Vinesh H. Gada
Department of Mechanical Engineering, Indian Institute of Technology Bombay, Mumbai 400 076
Email: [email protected]
Atul Sharma*
Department of Mechanical Engineering, Indian Institute of Technology Bombay, Mumbai 400 076
Email: [email protected]
ABSTRACT
A level-set method based two-phase flow solver is parallelized using a unidirectional domain
decomposition approach. It employs a finite volume formulation for discretizing the conservation
equations and a finite difference formulation for discretizing the level-set advection equation, over
a staggered grid in Cartesian/cylindrical co-ordinates. The domain is mapped over a distributed
memory parallel architecture using domain decomposition, with overlapping boundary cells which
exchange data using MPI. The parallel code is validated against a strategic set of test cases
(ranging from laminar pipe flow to film boiling) which are also used to quantify the parallel
performance of the code across a range of problems. The parallel code is run on a 64-bit Xeon
cluster for up to 16 processors. Numerical predictions from the parallelized code bear an excellent
agreement with those from the serial code, with parallel efficiencies ranging up to 99%.
Keywords: Level-Set Method, Two-Phase Flow, MPI, Parallel Speedup, Domain Decomposition
INTRODUCTION
The need to keep computational time within
practical time-frames (particularly for multi-
phase flows), coupled with an easy access to
parallel computing hardware, has given impetus
to parallel implementation of CFD solvers. This
work is motivated towards parallelizing an
existing serial two-phase flow solver for a
distributed memory parallel architecture.
From a literature survey, it is found that
the majority of parallel solvers are implemented and
tested on single-phase flow problems. Fewer
studies have delved into applying these
techniques to simulate multiphase flows, as
shown in Table 1. In all of these studies, the
parallel speedup has been addressed by varying
grid sizes on a particular problem. However,
this may not be sufficient to demonstrate the
complete capability of a parallelization method.
The parallel speedup, particularly in multiphase
problems, has a bearing not only on the
phenomenon under consideration but also on the
physical properties of the interacting fluids. For
example, a higher density ratio of the two fluids
results in a stiffer coefficient matrix of the
pressure Poisson equation. This increases the
overall computation time to convergence, which
in turn may affect the parallel performance,
either adversely or favorably. Moreover, few
studies have explicitly evaluated the order of
communication and idle times spent by each
processor and its effect on parallel speedup.
The present study employs a novel
technique for the level-set method, as discussed
by Gada and Sharma, 2011. The parallelization
is implemented using a single directional
domain decomposition, which incurs minimum
modification in the corresponding serial code.
Pseudo boundary cells are created on each
partitioned sub-domain which exchange data
across processors using MPI (Message Passing Interface). We evaluate the scalability of the
proposed method over various two-phase flow
problems, each being tested on 1, 2, 4, 8 and 16
processors. A preliminary single phase flow
problem is also tested, which forms the basis for
comparing performance across the different
two-phase problems. Each test case is chosen to
employ a different combination of solvers
and/or fluid properties. The range of test cases
serves two motives. First, it aids in investigating
the effect of problem stiffness on scalability.
Second, it helps to trace the limitation on
parallel performance for a particular problem to a bottleneck in the scheme of solvers, which
include Navier-Stokes (velocity prediction and
pressure projection), level-set and energy
equation. It can be further related to the
percentage of inter-processor communication
time for each of the solvers. Thus, besides
evaluating scalability, such a study illuminates
the potential areas for improvement.
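The single-direction decomposition with pseudo boundary cells described above can be illustrated with a minimal serial sketch. It is not the authors' code and it uses no MPI: the data exchange that MPI would perform is imitated by copying values between Python lists. One overlapping cell per side is assumed, and the names `split_domain`, `exchange_halos` and `smooth` are hypothetical. A 1D field is split into sub-domains, the pseudo boundary cells are filled from the neighbours, and a three-point average computed per sub-domain is compared with the same operation on the undivided field.

```python
def split_domain(field, n_sub):
    """Split a 1D list into n_sub contiguous chunks, each padded with one
    pseudo boundary (ghost) cell on both sides."""
    base, extra = divmod(len(field), n_sub)
    subs, start = [], 0
    for rank in range(n_sub):
        size = base + (1 if rank < extra else 0)
        subs.append([0.0] + field[start:start + size] + [0.0])
        start += size
    return subs


def exchange_halos(subs, left_bc, right_bc):
    """Fill ghost cells from the neighbouring sub-domains.
    This list copy stands in for the MPI send/receive of a real code."""
    for rank, sub in enumerate(subs):
        # Index -2 is the last interior cell of the left neighbour.
        sub[0] = subs[rank - 1][-2] if rank > 0 else left_bc
        # Index 1 is the first interior cell of the right neighbour.
        sub[-1] = subs[rank + 1][1] if rank < len(subs) - 1 else right_bc


def smooth(padded):
    """Three-point average over the interior cells of a padded array."""
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
            for i in range(1, len(padded) - 1)]


def serial_step(field, left_bc, right_bc):
    """Reference result on the undivided domain."""
    return smooth([left_bc] + field + [right_bc])


def decomposed_step(field, n_sub, left_bc, right_bc):
    """Same operation, performed sub-domain by sub-domain."""
    subs = split_domain(field, n_sub)
    exchange_halos(subs, left_bc, right_bc)
    result = []
    for sub in subs:
        result.extend(smooth(sub))
    return result


field = [float(i * i) for i in range(32)]
serial = serial_step(field, 0.0, 0.0)
parallel = decomposed_step(field, 4, 0.0, 0.0)
print(serial == parallel)
```

Because only the ghost cells are written during the exchange, the interior update of each sub-domain is the same loop as in the serial code, which is consistent with the small amount of modification the decomposition is said to require.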
Table 1. Summary of literature review on distributed memory MPI based parallel two-phase flow solvers
Authors                            | (Np)max | Problems tested for parallel speedup             | Time criteria used in evaluating parallel speedup | 2D/3D | Numerical method a
George and Warren, 2002            | 24      | Dendritic growth                                 | Total run times                                   | 3D    | Phase-field
Sussman, 2005                      | 16      | Wobbly bubble                                    | Average run time per time step                    | 3D    | CLSVOF
Wang et al., 2006                  | 64      | Dendritic growth                                 | Run time for 500 time steps                       | 2D    | LS
Fortmeier and Bucker, 2010         | 256     | Bubble rise in quiescent fluid                   | Run time for a single time step                   | 3D    | LS
Hajihashemi and El-Shenawee, 2010  | 400     | Reconstruction of star, ellipse, cylinder shapes | Total run times                                   | 2D    | LS
Agbaglah et al., 2011              | 512     | Lid driven cavity                                | Run time for 100 time steps                       | 3D    | VOF
Fortmeier and Bucker, 2011         | 128     | Re-initialization of cube slices, sphere         | Total run times                                   | 3D    | LS
Zuzio and Estivalezes, 2011        | 256     | Damped surface wave oscillation                  | Average time per iteration                        | 2D    | LS
a LS: Level set; VOF: volume-of-fluid; CLSVOF: Combined LS and VOF
PHYSICAL DESCRIPTION OF TEST PROBLEMS AND CODE VALIDATION
The test problems considered in this study
are listed in Table 2. The grid sizes are
selected such that the ratio of cells involved in
MPI to the total number of cells per sub-domain
is nearly the same across different test cases.
This normalizes the effect of communication
overheads on parallel performance across the set
of problems, which would have otherwise
induced a bias in the comparison. Within each
test case, critical numerical parameters (such as
the grid size, time step, final time and user-specified error tolerances) are kept identical for
both serial and parallel codes.
Single-phase Flow in a Pipe
This problem is executed considering two
sub-cases, 1A: Hydrodynamically developing
isothermal flow and 1B: Hydrodynamically and
thermally developing flow in a pipe maintained
at a constant wall temperature.
Table 2. Test cases used for validating 3D transient parallel solver and comparing parallel performance
Case | Description                                 | Solvers Invoked b     | Grid Size  | % grid using MPI c
1A   | Single-phase isothermal flow in a pipe      | NS                    | 30×92×306  | 20.9
1B   | Single-phase non-isothermal flow in a pipe  | NS+EE                 | 30×92×306  | 20.9
2    | Two-phase stratified flow in a pipe         | NS+LS                 | 30×92×306  | 20.9
3A   | Rise of n-butanol bubble in quiescent water | NS+LS (with ST)       | 154×42×354 | 18.2
3B   | Rise of an air bubble in a quiescent liquid | NS+LS (with ST)       | 68×42×354  | 18.2
4    | Jet formation in quiescent water            | NS+LS (with ST)       | 112×32×354 | 18.2
5    | Film boiling over a flat surface            | NS+LS+EE+PC (with ST) | 66×66×290  | 22.2
b NS: Navier-Stokes solver; LS: level-set advection and re-initialization; EE: energy equation solver; PC: phase change related modules; ST: surface tension source term
c Evaluated for Np = 16
Uniform flow and temperature conditions are
applied at the inlet, with Re = 50 and Pr = 0.7.
A pipe length to diameter ratio L/D = 5 is taken,
which allows the flow to reach a fully
developed condition near the exit. The domain
is discretized using a cylindrical grid of
30×92×306.
In the fully developed region, the friction factor
is obtained as 1.286 and 1.279, while the
Nusselt number is 3.677 and 3.678 for Np = 1
and 16, respectively. They are in excellent
agreement with the analytical values of f = 1.28
and Nu = 3.657.
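The quoted analytical friction factor is consistent with the standard result for fully developed laminar pipe flow, f = 64/Re, which gives 1.28 at Re = 50. The short check below is an illustrative sketch rather than part of the solver: it recomputes that value and the percentage deviation of the predictions quoted above. The helper names are hypothetical, and the analytical Nusselt number is simply the value given in the text.

```python
def darcy_friction_factor_laminar(reynolds):
    """Friction factor of fully developed laminar pipe flow, f = 64 / Re."""
    return 64.0 / reynolds


def percent_deviation(value, reference):
    """Absolute deviation of value from reference, in percent."""
    return abs(value - reference) / reference * 100.0


RE = 50.0
F_ANALYTICAL = darcy_friction_factor_laminar(RE)
NU_ANALYTICAL = 3.657  # constant wall temperature, as quoted in the text

# Predictions quoted in the text for the serial (Np = 1) and parallel (Np = 16) runs.
predictions = {1: {"f": 1.286, "Nu": 3.677}, 16: {"f": 1.279, "Nu": 3.678}}

deviations = {
    n_proc: {
        "f": percent_deviation(values["f"], F_ANALYTICAL),
        "Nu": percent_deviation(values["Nu"], NU_ANALYTICAL),
    }
    for n_proc, values in predictions.items()
}

for n_proc, dev in deviations.items():
    print(f"Np = {n_proc}: f deviates by {dev['f']:.2f}%, Nu by {dev['Nu']:.2f}%")
```

All four deviations come out below 0.6%, and the serial and parallel predictions differ from each other by less than they differ from the analytical values.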
Two-phase Stratified Flow in a Pipe
Instead of the single-phase flow considered in
case 1, a two-phase stratified flow in a
pipe is simulated here. A uniform velocity condition,
with an ideally flat interface, is assumed to exist
at the inlet. Further, a hold-up ratio (the ratio of
flow area occupied by the lighter fluid to the total flow area) equal to 0.5 is taken at the inlet.
The two fluids are assumed immiscible, with
a density ratio ρ2/ρ1 = 1 and a viscosity ratio
μ2/μ1 = 5.326. Similar to case 1, we take L/D = 5
and Re1 = 50.
The analytical solution and numerically
predicted iso-contours of w-velocity at the pipe
exit are compared in Fig. 1. The numerical
value of wmax (obtained along the vertical line of
symmetry) agrees within 2% of its analytical
value. Further, the variation of wmax is within
0.5% across 1 to 16 processors.
Figure 1. Comparison of analytical and numerical
w-velocity iso-contours in fully developed two-
phase stratified flow (case 2, Np = 16)
Bubble Rise in a Quiescent Liquid Column
Here, we consider two sub-cases of bubble
rise in a stagnant liquid, each with a different
fluid combination. A higher density ratio
demands a lower time step and also increases
the stiffness of pressure Poisson equation. It is
conjectured that this may affect the proportion
of communication overheads in a parallel run.
Therefore, such a comparison can narrow down
the target areas for improvement in parallel
performance.
Case 3A represents the rise of an n-butanol
(fluid 2) droplet in water (fluid 1), while case
3B deals with the rise of an air (fluid 2) bubble
in a liquid (fluid 1). Fluid properties are taken as
ρ1 = 986.51, ρ2 = 845.44, μ1 = 1.39×10^-3, μ2 = 3.28×10^-3, σ = 1.63×10^-3 for case 3A; and
ρ1 = 875.5, ρ2 = 1, μ1 = 0.118, μ2 = 1×10^-3, σ = 32.2×10^-3 for case 3B.
Initially, the bubble is
assumed to be perfectly spherical and at rest
inside a cylindrical domain. Drop diameters
(Db) of 0.002 m and 0.0122 m, with length scales
of Db and 0.5Db, are taken for cases 3A and 3B,
respectively. The velocity scales are taken as
0.058 m/s and 0.215 m/s for the respective cases.
The non-dimensional domain size for both the
cases is taken as L = 15 and D = 6. A free-slip boundary condition is applied on the side and
bottom walls, while an outflow condition is applied
at the top wall of the domain.
Figure 2 shows an excellent agreement
between the present and published results for
the instantaneous bubble shapes. For case 3A, the
terminal velocity is obtained as u/uc = 0.991,
which is within 0.8% of that reported by
Bertakis et al., 2010. For case 3B, the terminal
velocity reaches a steady value of u/uc = 0.933.
The results published by Sussman and Smereka,
1997 were obtained with the far-field boundary condition
on the side walls, which is reported to give a
terminal velocity higher by about 9% when
compared to the free-slip boundary condition
used here.
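The difference between the two fluid combinations can be made concrete with a short calculation. The sketch below is illustrative only: it reads the listed properties as densities and viscosities of fluid 1 and fluid 2 (the symbols are lost in the source text, so this reading is an assumption) and forms the ratios in the sense of the nomenclature, fluid 2 over fluid 1. The dictionary keys and variable names are hypothetical.

```python
# Properties as listed for the two bubble-rise sub-cases (assumed SI units).
cases = {
    "3A": {"rho_1": 986.51, "rho_2": 845.44, "mu_1": 1.39e-3, "mu_2": 3.28e-3},
    "3B": {"rho_1": 875.5, "rho_2": 1.0, "mu_1": 0.118, "mu_2": 1.0e-3},
}

ratios = {}
for name, p in cases.items():
    ratios[name] = {
        "density_ratio": p["rho_2"] / p["rho_1"],      # fluid 2 over fluid 1
        "viscosity_ratio": p["mu_2"] / p["mu_1"],      # fluid 2 over fluid 1
        "density_contrast": p["rho_1"] / p["rho_2"],   # heavier over lighter
    }

for name, r in ratios.items():
    print(f"case {name}: rho2/rho1 = {r['density_ratio']:.4f}, "
          f"mu2/mu1 = {r['viscosity_ratio']:.4f}, "
          f"rho1/rho2 = {r['density_contrast']:.1f}")
```

The two liquids of case 3A differ in density by less than 20%, whereas in case 3B the heavier fluid is several hundred times denser than the lighter one, which is the situation the text associates with a smaller time step and a stiffer pressure Poisson equation.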
Jet Formation in Quiescent Water
Unlike the previous test case, jet formation
ensures a more uniform interface presence
throughout the domain, thereby distributing the computational burden more uniformly across
the partitioned sub-domains. This is conjectured
to affect the parallel performance favorably.
Here, we simulate the breakup of a paraffin-
kerosene (fluid 2) jet injected vertically upwards
into stagnant water (fluid 1), which is similar to
the system 3-2 of Kitamura et al., 1982. Fluid
properties are taken as ρ1 = 998, μ1 = 1.03×10^-3,
ρ2 = 848, μ2 = 1.88×10^-2, σ = 40.4×10^-3. The nozzle
injection diameter (Db) is taken as 0.00122 m, the value consistent with the Re1, We and Fr quoted for this case.
Figure 2. Evolution of rising bubble shapes (case 3,
Np = 16): (a) rise of n-butanol bubble in water; (b) rise
of air bubble in liquid (dotted pattern represents the
results from present study, superimposed on those
reported by Sussman and Smereka, 1997)
Taking Lc = Db and uc = 0.35 m/s (average jet
injection velocity), we get Re1 = 414, We = 3.7,
Fr = 3.2. Further, the non-dimensional domain
size is taken as L = 40 and D = 13. The injection
velocity mimics a fully developed velocity
profile. No-slip boundary conditions are applied
on the side and the bottom walls.
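The governing parameters of the jet case can be cross-checked from the listed properties. The sketch below is a worked example under stated assumptions, not the authors' procedure: it takes Re1 = ρ1 uc Lc/μ1, We = ρ1 uc^2 Lc/σ and Fr = uc/(g Lc)^(1/2) as the definitions, g = 9.81 m/s^2, and a nozzle diameter of 1.22 mm, which is the value that reproduces the quoted numbers. Function and constant names are hypothetical.

```python
import math

G = 9.81  # m/s^2, assumed value of gravitational acceleration


def reynolds(rho, u, length, mu):
    """Reynolds number rho * u * L / mu."""
    return rho * u * length / mu


def weber(rho, u, length, sigma):
    """Weber number rho * u^2 * L / sigma."""
    return rho * u ** 2 * length / sigma


def froude(u, length, g=G):
    """Froude number u / sqrt(g * L)."""
    return u / math.sqrt(g * length)


RHO_1 = 998.0      # density of water (fluid 1)
MU_1 = 1.03e-3     # viscosity of water (fluid 1)
SIGMA = 40.4e-3    # interfacial tension
U_C = 0.35         # average jet injection velocity, m/s
D_B = 1.22e-3      # nozzle diameter in m, assumed (see lead-in)

re_1 = reynolds(RHO_1, U_C, D_B, MU_1)
we = weber(RHO_1, U_C, D_B, SIGMA)
fr = froude(U_C, D_B)

print(f"Re1 = {re_1:.0f}, We = {we:.1f}, Fr = {fr:.1f}")
```

With these inputs the three numbers round to the values quoted in the text, which suggests the Weber number is based on the density of fluid 1.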
Figure 3 shows the predicted flow pattern.
The diameter of droplets, averaged over those
located between 20D and 35D, is 3.32, whereas the jet breakup length is 5.3. These results compare
within 4% and 21% of the published results
(Kitamura et al., 1982), respectively.
Film Boiling over a Flat Surface
A film boiling problem applies all the
solvers employed in the present code and aids in
complete evaluation of the parallel performance.
The problem consists of a liquid pool with a thin
vapour film present over the bottom surface.
Figure 3. Jet pattern evolution (case 4, Np = 16)
A constant wall superheat is imposed on the
bottom surface. The minimum domain size
required to capture the phenomenon is dictated by the most dangerous Taylor wavelength for
a three-dimensional simulation, λd3 (Esmaeeli
and Tryggvason, 2004), evaluated using Eq. (1).
\[ \lambda_{d3} = 2\sqrt{2}\,\pi \left[ \frac{3\sigma}{g(\rho_1 - \rho_2)} \right]^{1/2} \qquad (1) \]
A domain size of 0.5λd3 × 0.5λd3 × λd3 with a grid size of 66×66×130 is selected to benchmark
the numerical results, whereas a domain of
0.5λd3 × 0.5λd3 × 2.25λd3 with a grid size of
66×66×290 is employed to compare the parallel
performance. While the former is sufficient to
capture the phenomena, the latter maintains the
ratio of MPI cells to interior cells similar to the
other cases. The computational domain
considered here is a quarter of the complete
domain (λd3 × λd3 × λd3) shown in Fig. 4. Thus, this
quarter domain captures a quarter of the bubbles released in the node and anti-node modes on the
pair of diagonally opposite corners. The
characteristic length scale is taken equal to the
capillary length, Lc = [σ/g(ρ1 − ρ2)]^(1/2), and the
characteristic velocity scale is uc = (gLc)^(1/2). The
property ratios are taken as ρ2/ρ1 = 0.603, μ2/μ1 = 0.693,
k2/k1 = 0.987 and cp2/cp1 = 1.615. The governing parameters
are obtained as Re1 = 18.81, Pr1 = 2.79, Fr = 1,
We = 1.06 and Ja = 0.57. Similar to the
initialization method adopted by Esmaeeli and
Tryggvason, 2004, the vapour film is initially
perturbed using Eq. (2).
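\[ z = \frac{\lambda_{d3}}{8} \left[ 1 + \frac{1}{5} \left( \cos\frac{2\pi x}{\lambda_{d3}} + \cos\frac{2\pi y}{\lambda_{d3}} \right) \right] \qquad (2) \]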
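The relation between the domain size and the characteristic length can be made explicit. Assuming the standard expression for the most dangerous three-dimensional Taylor wavelength, 2√2 π [3σ/g(ρ1 − ρ2)]^(1/2), and the capillary length Lc = [σ/g(ρ1 − ρ2)]^(1/2) used as the length scale, the wavelength equals 2π√6 Lc, roughly 15.4 capillary lengths. The sketch below is illustrative, with hypothetical names; it uses this factor to estimate the non-dimensional cell size of the two film boiling grids, read here as 66×66×130 and 66×66×290.

```python
import math

# Most dangerous 3D wavelength in units of the capillary length,
# under the assumed standard expression: 2*sqrt(2)*pi*sqrt(3) = 2*pi*sqrt(6).
LAMBDA_OVER_LC = 2.0 * math.pi * math.sqrt(6.0)


def cell_sizes(domain_in_lambda, grid):
    """Cell size per direction, in capillary lengths, for a domain given in
    multiples of the wavelength and a grid given as cell counts."""
    return tuple(extent * LAMBDA_OVER_LC / cells
                 for extent, cells in zip(domain_in_lambda, grid))


# Benchmark domain and the taller domain used for the parallel comparison.
benchmark = cell_sizes((0.5, 0.5, 1.0), (66, 66, 130))
scaling = cell_sizes((0.5, 0.5, 2.25), (66, 66, 290))

print(f"lambda_d3 / Lc = {LAMBDA_OVER_LC:.2f}")
print("benchmark cell sizes:", ", ".join(f"{h:.4f}" for h in benchmark))
print("scaling cell sizes:  ", ", ".join(f"{h:.4f}" for h in scaling))
```

Under these assumptions both grids have a nearly uniform cell size of about 0.12 capillary lengths in every direction, so the taller domain changes the number of cells along the stretched direction rather than the resolution.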
Figure 4 shows the interface shapes at non-dimensional times τ = 50
and 100. The temporal variation of surface-
averaged Nu on the superheated surface is
shown in Fig. 5. The time-averaged value of Nu
from τ = 110 to 300 is 1.459. The deviation is
13.6% and 16.5% compared to the average Nu
calculated from correlations given by Berenson,
1961 and Son and Dhir, 1998, respectively. Similar
deviations in the numerically predicted values of Nu have been reported in literature.
Figure 4. Interface evolution in film boiling, with
bubble formation at node and anti-node locations
Figure 5. Temporal variation of surface averaged
Nusselt number (case 5, Np = 16)
PARALLEL PERFORMANCE
Each test case is run on a 20-node cluster,
with each node having 2 GB of memory and 8 dual-core
Intel Xeon 2.4 GHz processors. The code is
compiled using the C++ library of MPICH2.
rise test case 3A (even though the density ratios
of the two phases are nearly the same in cases
3A and 4). This erratic performance
for test case 4 is due to an increased number of
iterations taken by Np = 4 and 16 compared to those taken by Np = 1, 2 and 8. Therefore, the
denominator of Eq. (3) has peaks for Np = 4 and
16, which results in very low values of S4 and
S16.
While parallel efficiency shows a generally
decreasing trend with increasing number of
processors, cases 1A, 2 and 3A
show a marginal improvement at E16. This
improvement can be attributed to a better
memory performance and cache availability for
Np = 16. However, for cases 1B and 5, such an advantage is offset by the increased
data storage required by the heat transfer
module variables, giving a substantial
reduction at E16 compared to the other
cases.
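The behaviour discussed above follows directly from how speedup and efficiency are formed. The sketch below assumes the usual definitions, speedup as serial run time divided by parallel run time and efficiency as speedup divided by the number of processors; the paper's own Eq. (3) is not reproduced here. The run times are hypothetical numbers chosen only to show how a longer run at Np = 4 and 16 lowers S4 and S16; they are not measurements from this study.

```python
def speedup(t_serial, t_parallel):
    """Parallel speedup, assumed definition: serial time over parallel time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_proc):
    """Parallel efficiency, assumed definition: speedup per processor."""
    return speedup(t_serial, t_parallel) / n_proc


# Hypothetical wall-clock times in seconds, NOT data from the paper.
# The runs on 4 and 16 processors are made artificially slow, as would
# happen if they needed more solver iterations to converge.
run_times = {1: 1600.0, 2: 820.0, 4: 640.0, 8: 230.0, 16: 200.0}

t_serial = run_times[1]
results = {
    n: (speedup(t_serial, t), efficiency(t_serial, t, n))
    for n, t in run_times.items() if n > 1
}

for n, (s, e) in results.items():
    print(f"Np = {n:2d}: S = {s:5.2f}, E = {e:5.1%}")
```

In this made-up data set the efficiency at Np = 8 is higher than at Np = 4, which mirrors the non-monotonic trend described for test case 4.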
Effect of the Dominant Solvers on En
Figure 7 shows the relative computational
time taken by different solvers. For each case,
the time is evaluated as the average value from
the runs for Np = 2 to Np = 16. Among cases 1A, 1B and 2, test case 1B has a higher
contribution from the pressure Poisson equation
(PPE) solver and also gives a better
performance. Similarly, case 3B has a higher
contribution from the PPE solver and also a better
performance compared to case 3A. This is due
to a stiffer PPE in the former. Comparing cases
3A and 4, although case 4 has a lower
contribution from the PPE solver, it gives a
better performance for Np = 2 and 8 due to the
reduced idle times associated with MPI. This is
due to a better computational load distribution
in case 4.
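The link between load distribution and idle time can be sketched with a simple model. It assumes that all processes synchronize once per step, so every process waits for the slowest one; the idle fraction is then one minus the ratio of mean to maximum work. The function name and the work values are hypothetical and only illustrate the argument; they are not taken from the measured data.

```python
def idle_fraction(work_per_process):
    """Fraction of the total processor time spent waiting, assuming every
    process waits for the slowest one at each synchronization point."""
    slowest = max(work_per_process)
    mean = sum(work_per_process) / len(work_per_process)
    return 1.0 - mean / slowest


# Hypothetical work units per sub-domain (same total work in both cases).
bubble_like = [10, 10, 40, 40]   # interface concentrated in two sub-domains
jet_like = [24, 26, 25, 25]      # interface spread along the whole domain

print(f"concentrated load: {idle_fraction(bubble_like):.1%} idle")
print(f"uniform load:      {idle_fraction(jet_like):.1%} idle")
```

With the same total work, the concentrated distribution leaves the processors idle more than a third of the time, while the uniform one keeps the idle share to a few percent.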
Figure 7. Relative computational effort per solver
for all the test problems
CONCLUDING REMARKS
A level-set based two-phase solver is
parallelized using a domain decomposition
procedure, based on the data-parallel model,
which incurs minimum modifications in the
corresponding serial code.
The parallel processes are coupled via data
exchange/update on the boundary cells of each
sub-domain using MPI. The parallelized code
has been validated with various test problems
against analytical/published results. The parallel
performance is evaluated on 2 to 16 processors.
For a fixed number of processors, the
performance shows a considerable variation
across the set of test cases. This suggests that a
single problem with varying grid sizes may not
be sufficient to fully exhibit the parallel scalability of an algorithm, especially in two-phase
flow problems. Factors such as the
property ratio of the fluid, relative distribution
of the light and heavy fluid over the domain, the
uniformity in fluid action being handled by
various processors and the nature of iterative
solvers being invoked contribute to the variation
observed in the parallel speedup.
In the present scheme of solvers, the PPE
solver is found to give the best improvement in
scaling, while the Gauss-Seidel velocity
predictor can be improved. On the other hand,
the Gauss-Seidel energy equation solver has a
favorable effect on the parallel performance.
The phase change solver is conjectured to
increase the communication time at a faster rate
with increasing number of processors compared
to other problems. Further, a problem with
uniform fluid action over the domain is found to
have a better load balance.
NOMENCLATURE
D Diameter [m]
En Parallel efficiency with n processors
f Friction factor
Fr Froude number [uc/(gLc)^(1/2)]
g Acceleration due to gravity [m/s^2]
Ja Jacob number [cp2(Tw − Tsat)/h12]
k Thermal conductivity [W/m.K]
L Length [m]
Np Number of processors in parallel
Nu Nusselt number [hLc/k]
Prf Prandtl number [μf cpf/kf]
Ref Reynolds number [ρf uc Lc/μf]
Sn Parallel speedup with n processors
t Time [s]
T Temperature [K]
u Velocity [m/s]
w Axial velocity [m/s]
We Weber number [ρ1 uc^2 Lc/σ]
Greek
cp2/cp1 Ratio of specific heat
k2/k1 Ratio of thermal conductivity
μ2/μ1 Viscosity ratio
λd3 Critical wavelength for 3D boiling [m]
μ Viscosity [Pa.s]
ρ Density [kg/m^3]
τ Non-dimensional time
ρ2/ρ1 Density ratio
Subscripts
b Bubble/Injection Nozzle
c Characteristic scale
f Phase (f = 1 for heavier fluid; f = 2 for
lighter fluid)
w Wall
REFERENCES
Agbaglah G, Delaux S, Fuster D, Hoepffner J,
Josserand C, Popinet S, Ray P, Scardovelli R,
Zaleski S, 2011. Parallel simulation of multiphase
flows using octree adaptivity and VOF method,
Comptes Rendus Mécanique 339, 194-207.
Berenson PJ, 1961. Film boiling heat transfer from a
horizontal surface, J Heat Transfer 83, 351-362.
Bertakis E, Gross S, Grande J, Fortmeier O,
Reusken A, Pfennig A, 2010. Validated simulation
of droplet sedimentation with finite-element and
level-set methods, Chemical Engineering Science
65, 2037-2051.
Esmaeeli A, Tryggvason G, 2004. Computations of
film boiling. Part I: numerical method, Int J Heat
and Mass Transfer 47, 5451-5461.
Fortmeier O, Bucker HM, 2010. A parallel strategy
for a level set simulation of droplets moving in a
liquid medium, Proceedings of VECPAR, 200-209.
Fortmeier O, Bucker HM, 2011. Parallel re-
initialization of level set functions on distributed
unstructured tetrahedral grids, J Computational
Physics 230, 4437-4453.
Gada VH, Sharma A, 2011. On a novel dual-grid
level-set method for two-phase flow simulation,
Numerical Heat Transfer Part B: Fundamentals 59, 26-57.
George WL, Warren JA, 2002. A parallel 3D
dendritic growth simulator using the Phase-Field
Method, J Computational Physics 177, 264-283.
Hajihashemi MR, El-Shenawee M, 2010. High
performance computing for the level-set
reconstruction algorithm, J Parallel and Distributed
Computing 70, 671-679.
Johnson SP, Cross M, 1991. Mapping structured
grid three-dimensional CFD codes onto parallel
architectures, Applied Mathematical Modelling 15, 394-405.
Kitamura Y, Mishima H, Takahashi T, 1982.
Stability of jets in liquid-liquid systems, The
Canadian J Chemical Engineering 60, 723-731.
Son G, Dhir VK, 1998. Numerical simulation of film
boiling near critical pressures with a level-set
method, J Heat Transfer 120, 183-192.
Sussman M, 2005. A parallelized, adaptive
algorithm for multiphase flows in general
geometries, Computers and Structures 83, 435-444.
Sussman M, Smereka P, 1997. Axisymmetric free
boundary problems, Journal of Fluid Mechanics 341,
269-294.
Wang K, Chang A, Kale LV, Dantzig JA, 2006.
Parallelization of a level set method for simulating
dendritic growth, J Parallel and Distributed
Computing 66, 1379-1386.
Zuzio D, Estivalezes JL, 2011. An efficient block
parallel AMR method for two phase interfacial flow
simulations, Computers and Fluids 44, 339-357.