fast multipole methods for incompressible flow...
TRANSCRIPT
© Gumerov & Duraiswami, 2003
Fast Multipole Methods for Incompressible Flow Simulation
Nail A. Gumerov & Ramani DuraiswamiInstitute for Advanced Computer Studies
University of Maryland, College Park
Support of NSF awards 0086075 and 0219681 is gratefully acknowledged
© Gumerov & Duraiswami, 2003
Fast Multipole Methods• Complex geometric features (e.g., aircraft, submarine, or turbine
geometries), physics (high <, wake-structure interactions etc.) all tend to increase problem sizes Many simulations involve several million variables
• Most large problems boil down to solution of linear systems or performing several matrix-vector products
• Regular product requires O(N2) time and O(N2) memory• The FMM is a way to
accelerate the products of particular dense matrices with vectors Do this using O(N) memory
• FMM achieves product in O(N) or O(N log N) time and memory• Combined with iterative solution methods, can allow solution of
problems hitherto unsolvable
© Gumerov & Duraiswami, 2003
Fast Multipole Methods• To speed up a matrix vector product ∑i Φij ui = vj
• Key idea: Let Φij=φ( xi, yj)“Translate” elements corresponding to different xi to common
location, x*
φ (xi,yj)=∑ l=1p βl ψl(xi-x*) Ψl (yj-x*)
Achieves a separation of variables by translation The number of terms retained, p, is only related to accuracy
vj=∑j=1N ui ∑ l=1
p βl ψl(xi) Ψl (yj) = ∑ l=1p βl Ψl (yj)∑i=1
N uiψl(xi)
Can evaluate p sums over i
Al=∑j=1N ui ψl(xi)
Requires Np operations
• Then evaluate an expression of the typeS(xi)=∑ l=1
p Alβl ψl(xi) i=1,…,M
© Gumerov & Duraiswami, 2003
Outline• Incompressible Fluid problems where FMM can be used
Boundary integral and particle formulationsPotential flowStokes FlowVortex Element MethodsComponent of Navier-Stokes Solvers via Generalized Helmholtz
decomposition
Multi-sphere simulations of multiphase flowLaplaceStokesHelmholtz
• Numerical Analysis issues in developing efficient FMM• Towards black-box FMM solvers
© Gumerov & Duraiswami, 2003
FMM & Fluid Mechanics• Basic Equations
12
n
© Gumerov & Duraiswami, 2003
Helmholtz Decomposition• Key to integral equation and particle methods
© Gumerov & Duraiswami, 2003
Potential Flow
• Knowledge of the potential is sufficient to compute velocity and pressure
• Need a fast solver for the Laplace equation
• Applications – panel methods for subsonic flow, water waves, bubble dynamics, …
Crum, 1979
Boschitsch et al, 1999
© Gumerov & Duraiswami, 2003
BEM/FMM Solution Laplace’s Equation
• Jaswon/Symm (60s) Hess & Smith (70s),
• Korsmeyer et al 1993, Epton& Dembart 1998,Boschitsch & Epstein 1999
Lohse, 2002
© Gumerov & Duraiswami, 2003
Stokes Flow
• Green’s function (Ladyzhenskaya1969, Pozrikidis 1992)
• Integral equation formulation
• Stokes flow simulations remain a very important area of research
• MEMS, bio-fluids, emulsions, etc.• BEM formulations (Tran-Cong &
Phan-Thien 1989, Pozrikidis 1992)• FMM (Kropinski 2000 (2D), Power
2000 (3D))
Motion of spermatozoaCummins et al 1988
Cherax quadricarinatus.
MEMS force calculations (Aluru & White, 1998
© Gumerov & Duraiswami, 2003
Rotational Flows and VEM• For rotational flows
Vorticity released at boundary layer or trailing edge and advected with the flow Simulated with vortex particles
Especially useful where flow is mostly irrotationalFast calculation of Biot-Savart integrals
(x1,y1,z1)
(x2,y2,z2)
ΓΓΓΓnl
Evaluation point
x
(where y is the mid point of the filament)
(circulation strength)
(Far field)
© Gumerov & Duraiswami, 2003
Vorticity formulations of NSE
• Problems with boundary conditions for this equation (see e.g., Gresho, 1991) Divergence free and curl-free components are linked only by boundary
conditions Splitting is invalid unless potentials are consistent on boundary
• Recently resolved by using the generalized Helmholtz decomposition (Kempka et al, 1997; Ingber & Kempka, 2001)
• This formulation uses a kinematically consistent Helmholtz decomposition in terms of boundary integrals
• When widely adopted will need use of boundary integrals, and hence the FMM Preliminary results in Ingber & Kempka, 2001
© Gumerov & Duraiswami, 2003
Generalized Helmholtz Decomposition• Helmholtz decomposition leaves too many
degrees of freedom• Way to achieve decomposition valid on boundary and in domain,
with consistent values is to use the GHD
• D is the domain dilatation (zero for incompressible flow)• Requires solution of a boundary integral equation as part of the
solution => role for the FMM in such formulations
© Gumerov & Duraiswami, 2003
Multi Sphere Problems• So far we have seen FMM for boundary integral problems• Can also use the FMM in problems involving many
spheresSimulations of multiphase flow, effective media (porous media),
dusty gases, slurries, and the likeKey is a way to enforce proper
boundary/continuity conditions onthe spheres
Translation theorems providean ideal way to do this
With FMM one can easily simulate 104-105 particles on desktops and 106 -107 particles on supercomputers
Sangani & Bo (1996), Gumerov & Duraiswami (2002,2003)Stokesian Dynamics (Brady/Bossis 1988; Kim/Karilla 1991)
© Gumerov & Duraiswami, 2003
Fast Multipole Methods
• Matrix-Vector Multiplication• Middleman and Single Level Methods• Multilevel FMM (MLFMM)• Adaptive MLFMM• Fast Translations• Some computational results• Conclusions
© Gumerov & Duraiswami, 2003
Iterative Methods• To solve linear systems of equations;• Simple iteration methods;• Conjugate gradient or similar methods;• We use Krylov subspace methods (GMRES):
Preconditioners;Research is ongoing.
• Efficiency of these methods depends on efficiency of the matrix-vector multiplication.
© Gumerov & Duraiswami, 2003
Matrix-Vector Multiplication
© Gumerov & Duraiswami, 2003
FMM Works with Influence Matrices
© Gumerov & Duraiswami, 2003
Examples of Influence Matrices
exp
© Gumerov & Duraiswami, 2003
Complexity of Standard Method
EvaluationPoints
Sources
Standard algorithm
N M
Total number of operations: O(NM)
© Gumerov & Duraiswami, 2003
Goal of FMM
•Reduce complexity of matrix-vector multiplication (or field evaluation) by trading exactness for speed•Evaluate to arbitrary accuracy
© Gumerov & Duraiswami, 2003
Five Key Stones of FMM
• Factorization• Error Bound• Translation• Space Partitioning• Data Structure
© Gumerov & Duraiswami, 2003
Factorization
Degenerate Kernel:
O(pN) operations:
O(pM) operations:
Total Complexity: O(p(N+M))
© Gumerov & Duraiswami, 2003
Factorization (Example)
O(dN) operationsO(dM) operations:
Total Complexity: O(d(N+M))
Scalar Product in d-dimensional space:
© Gumerov & Duraiswami, 2003
Middleman Algorithm
SourcesSources
EvaluationPoints
EvaluationPoints
Standard algorithm Middleman algorithm
N M N M
Total number of operations: O(NM) Total number of operations: O(N+M)
© Gumerov & Duraiswami, 2003
FactorizationNon-Degenerate Kernel:
Error Bound:
Middleman Algorithm Applicability:
Truncation Number
© Gumerov & Duraiswami, 2003
Factorization Problem:
Usually there is no factorization available that provides a uniform approximation of the kernel in the entire computational domain.
© Gumerov & Duraiswami, 2003
Far and Near Field Expansions
xi
Ω
Ω
x*
R Sy
y
rc|xi - x*|
Rc|xi - x*|
xi
Ω
ΩΩ
x*
R Sy
y
rc|xi - x*|
Rc|xi - x*|
Far Field
Near Field
Far Field:
Near Field:
S: “Singular”(also “Multipole”,
“Outer”“Far Field”),
R: “Regular”(also “Local”,
“Inner”“Near Field”)
© Gumerov & Duraiswami, 2003
Example of S and R expansions (3D Laplace)
Spherical Harmonics:x
y
z
O
x
y
z
O
x
y
z
Oθ
φ
r
r
Spherical Coordinates:
© Gumerov & Duraiswami, 2003
S and R Expansions (3D Helmholtz)
x
y
z
O
x
y
z
O
x
y
z
Oθ
φ
r
r
Spherical Coordinates:
Spherical HankelFunctions
Spherical Bessel Functions
© Gumerov & Duraiswami, 2003
Idea of a Single Level FMM
Sources SourcesEvaluation
PointsEvaluation
Points
Standard algorithm SLFMM
N M N M
Total number of operations: O(NM) Total number of operations: O(N+M+KL)
K groupsL groups
Needs Translation!
© Gumerov & Duraiswami, 2003
Space Partitioning
ΕΕ1 Ε3Ε21 Ε3Ε2
n n n
Φ1(n)(y) Φ2
(n)(y) Φ3(n)(y)
Potentials due to sources in these spatial domains
© Gumerov & Duraiswami, 2003
Single Level Algorithm1. For each box build Far Field representation of the
potential due to all sources in the box.2. For each box build Near Field representation of the
potential due to all sources outside the neighborhood.3. Total potential for evaluation points belonging to this
box is a direct sum of potentials due to sources in its neighborhood and the Near Field expansion of other sources near the box center.
© Gumerov & Duraiswami, 2003
The SLFMM requires S|R-translation:
© Gumerov & Duraiswami, 2003
S|R-translationAlso “Far-to-Local”, “Outer-to-Inner”, “Multipole-to-Local”
xi
Ω1i
Ω1i
Ω2i
x*1
x*2
S
(S|R)
y
R
Rc|xi - x*1|
R2 = min|x*2 - x*1|-Rc |xi - x*1|,rc|xi - x*2|
R2
xi
Ω1i
Ω1iΩ1i
Ω2i
x*1
x*2
S
(S|R)
y
R
Rc|xi - x*1|
R2 = min|x*2 - x*1|-Rc |xi - x*1|,rc|xi - x*2|
R2
© Gumerov & Duraiswami, 2003
S|R-translation Operator
S|R-Translation Matrix
S|R-Translation Coefficients
© Gumerov & Duraiswami, 2003
S|R-translation Operatorsfor 3D Laplace and Helmholtz equations
© Gumerov & Duraiswami, 2003
Complexity of SLFMM
Κ0
F(K)
Kopt
“Middleman” complexityAdditional
We have p2 terms and P(p) translation cost:
For M~N:
© Gumerov & Duraiswami, 2003
Idea of Multilevel FMMSource Data Hierarchy Evaluation Data Hierarchy
N M
Level 2Level 3
Level 4Level 5
Level 2Level 3
Level 4 Level 5
S|S
S|R
R|R
© Gumerov & Duraiswami, 2003
Complexity of MLFMM
Upward Pass: Going Up on SOURCE HierarchyDownward Pass: Going Down on EVALUATION Hierarchy
Definitions:
Not factorial!
© Gumerov & Duraiswami, 2003
Hierarchical Spatial Domains
ΕΕΕΕ1111
ΕΕΕΕ3333 ΕΕΕΕ4444
ΕΕΕΕ2222ΕΕΕΕ1111
ΕΕΕΕ3333 ΕΕΕΕ4444
ΕΕΕΕ2222
© Gumerov & Duraiswami, 2003
Upward Pass. Step 1.
xi
Ω
Ω
x*
R
xi
Ω
Ω
x*
y
S-expansion valid in ΩΕΕΕΕΕΕΕΕ3333
xi
xc(n,L)
y
S-expansion valid in E3(n,L)
S-expansion
© Gumerov & Duraiswami, 2003
Upward Pass. Step 2.S|S-translation. Build potential for the parent box (find its S-expansion).
© Gumerov & Duraiswami, 2003
Downward Pass. Step 1.
Level 2: Level 3:S|R-translation
© Gumerov & Duraiswami, 2003
Downward Pass. Step 2.R|R-translation
© Gumerov & Duraiswami, 2003
Final Summation
yj
Contribution of near sources(calculated directly)
Contribution of far sources(represented by R-expansion)
© Gumerov & Duraiswami, 2003
Adaptive MLFMM
Goal of computations:
© Gumerov & Duraiswami, 2003
2 2 2
2 2
2 2
2
3
3
3
33
4
2 2 2
2 2
2 2
2
3
3
3
33
4
EvaluationPoint
SourcePoint
Box Level
Each evaluation box in this picture contains not more than 3 sources in the neighborhood.Very important for particle methods with concentrations of particles
Idea of Adaptation
© Gumerov & Duraiswami, 2003
We have implemented MLFMM• General “MLFMM shell” software:
Arbitrary dimensionality;Variable sizes of neighborhoods;Variable clustering parameter;Regular and Adaptive versions;User Specified basis functions, and translation
operators;Efficient data structures using bit interleaving;Technical Report #1 is available online (visit the
authors’ home pages).http://www.umiacs.umd.edu/~ramani/pubs/umiacs-tr-2003-28.pdf
© Gumerov & Duraiswami, 2003
Optimal choice of the clustering parameter is important
0.1
1
10
100
1 10 100 1000Number of Points in the Smallest Box
CPU
Tim
e (s
)
N=1024
4096
16384
65536
Regular Mesh, d=2
5
4
5
5
5
3
4
4
3
3
2
2
1
6
7 6
6
Max Level=8
7
© Gumerov & Duraiswami, 2003
Speed of computation
0.01
0.1
1
10
100
1000
1000 10000 100000 1000000Number of Points
CPU
Tim
e (s
)
Straightforward
FMM (s=4)FMM/(a*log(N))
Setting FMM
Middleman
y=bx
y=cx2
Regular Mesh, d=2, k=1, Reduced S|R
© Gumerov & Duraiswami, 2003
Error analysis
© Gumerov & Duraiswami, 2003
Complexity of Translation• For 3D Laplace and Helmholtz series have p2 terms;• Translation matrices have p4 elements;• Translation performed by direct matrix-vector
multiplication has complexity O(p4);• Can be reduced to O(p3);• Can be reduced to O(p2log2 p);• Can be reduced to O(p2) (?).
© Gumerov & Duraiswami, 2003
Rotation-Coaxial Translation Decomposition Yields O(p3) Method
z
yy
xx
yx
z
yx
z
zp4
p3 p3
p3
Coaxial Translation
Rotation
© Gumerov & Duraiswami, 2003
FFT-based Translation for 3D Laplace Equation Yields O(p2log2p) Method
Translation Matrices are Toeplitz and Hankel
Multiplication can be performed using FFT(Elliot & Board 1996; Tang et al 2003)
© Gumerov & Duraiswami, 2003
Sparse Matrix Decompositions Can Result in O(p2) Methods
Laplace and Helmholtz:
Helmholtz 3D: D is a sparse matrix
© Gumerov & Duraiswami, 2003
Interaction of Multiple Spherical ParticlesHelmholtz (acoustical scattering) Laplace (potential incompressible flow)
k 0
© Gumerov & Duraiswami, 2003
Multipole Solution
1) Reexpand solution near the center of each sphere, and satisfy boundary conditions2) Solve linear system to determine the expansion coefficients
© Gumerov & Duraiswami, 2003
Computational Methods UsedM
etho
d
Number of Spheres
101 102 103100 104 105
BEM
Multipole Straightforward
Multipole Iterative
MLFMM
106
Computable on 1 GHz, 1 GB RAM Desktop PC
© Gumerov & Duraiswami, 2003
Comparisons with BEM
-12
-9
-6
-3
0
3
6
9
12
-180 -90 0 90 180
Angle φ1 (deg)
HR
TF (d
B)
BEMMultisphereHelmholtz
θ1 = 0o
30o
60o
90o
60o
30o
90o
120o
120o
150o
150o
180o
Three Spheres, ka1 =3.0255.
BEM discretization with 5400 elements
© Gumerov & Duraiswami, 2003
Convergence of the Iterative Procedure
1.E-04
1.E-03
1.E-02
1.E-01
1.E+00
1.E+01
1.E+02
0 5 10 15 20 25 30 35Iteration #
Max
Abs
olut
e Er
ror
Iterations with Reflection Method
3D Helmholtz Equation,MLFMM100 Spheres
ka = 4.8
2.8 1.6
For Laplace equation convergence is much faster
© Gumerov & Duraiswami, 2003
Multiple Scattering from 100 spheres
ka=1.6 ka=2.8 ka=4.8
FMM also used here for visualization
© Gumerov & Duraiswami, 2003
Various Configurations
343 spheres in a regular grid 1000 randomly placed spheres
© Gumerov & Duraiswami, 2003
Truncation Errors
© Gumerov & Duraiswami, 2003
Some Observations for Laplace Equation(limit at small k)
• Relative errors below 1% can be achieved even for p~1;• Reflection method converges very fast; Number of
iterations can be 2-10 for very high accuracy.• CPU time on 1GHz, 1GB RAM PC for 1000 spheres ~ 1
min.
© Gumerov & Duraiswami, 2003
Some projects of students in our course related to incompressible flow
1). Jun Shen (Mechanical Engineering, UMD): 2D simulation of vortex flow (particle method + MLFMM)
© Gumerov & Duraiswami, 2003
Project 2- Calculate vortex induced forces2). Jayanarayanan Sitaraman (Alfred Gessow Rotorcraft Center, UMD): Fast Multipole Methods For 3-D Biot-Savart Law Calculations(free vortex methods)
Error,M=3200,N=3200
p7
10-6
10-5
6Break-even point:MLFMM is faster then straightforwardfor N=M>1300
© Gumerov & Duraiswami, 2003
Conclusions• We developed a shell and framework for MLFMM which can be
used for many problems, including simulations of potential and vortex flows and multiparticle motion.
• MLFMM shows itself as very efficient method for solution of fluid dynamics problems;
• MLFMM enables computation of large problems (sometimes even on desktop PC);
• Research is ongoing.• Problems:
Efficient choice of parameters for MLFMM; Efficient translation algorithms; Efficient iterative procedures; More work on error bounds is needed; Etc.
• CSCAMM Meeting on FMM in November 2003
© Gumerov & Duraiswami, 2003
Thank You!