myers_siamcse15

Controlling Numerical Error in Particle-in-Cell Simulations of Collisionless Dark Matter Andrew Myers [email protected] Applied Numerical Algorithms Group, Computational Research Division with Phillip Colella, Brian Van Straalen SIAM-CSE Meeting March 17 th , 2015 Extreme Resilient Discretizations Submitted to ApJ Tuesday, March 17, 15

Upload: karen-pao

Post on 17-Jul-2015


TRANSCRIPT

Page 1: Myers_SIAMCSE15

Controlling Numerical Error in Particle-in-Cell Simulations of Collisionless Dark Matter

Andrew Myers ([email protected])
Applied Numerical Algorithms Group, Computational Research Division
with Phillip Colella, Brian Van Straalen

SIAM-CSE Meeting, March 17th, 2015

Extreme Resilient Discretizations

Submitted to ApJ

Tuesday, March 17, 15

Page 2: Myers_SIAMCSE15

Talk outline

• Motivation - why is understanding these errors relevant for us here?
• Standard PIC methods don’t converge for Cosmology applications
• Two modifications:
  – Regularization
  – Adaptive Remapping
• Summary and Future Research

Controlling Numerical Error in Particle-in-Cell Simulations of Collisionless Dark Matter, Andrew Myers, LBNL

Page 4: Myers_SIAMCSE15

Roofline Model - High Arithmetic Intensity needed for maximum performance

[Figure: Roofline Performance Model, Cori Node]

Page 5: Myers_SIAMCSE15

This is a problem for 2nd Order PIC Methods

• Field solve: force is computed on the mesh by e.g. solving Poisson’s Equation w/ 2nd order finite differences.
• Interpolation: Force is interpolated back to particle positions using same kernel.
• Particle Push: Particle positions and velocities are updated. 2nd-order leapfrog.
• Deposition: Particle masses are deposited onto mesh:

$$ \rho_i^{n+1} = \sum_p \left(\frac{m_p}{V_i}\right) W\!\left(\frac{x_i - x_p^{n+1}}{\Delta x}\right) $$

2nd order: Piecewise linear, Cloud-in-Cell interpolation

[Diagram: Start → Deposition → Field Solve → Interpolation → Particle Push]
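The deposition step can be illustrated with a minimal 1D Cloud-in-Cell sketch. It is not taken from the talk's code: the function name `deposit_cic_1d`, the periodic boundary treatment, and the cell-centered grid (centers at `(i + 0.5) * dx`) are assumptions made for illustration only.

```python
import numpy as np

def deposit_cic_1d(x_p, m_p, n_cells, dx):
    """Deposit particle masses onto a periodic 1D mesh with the CIC (hat) kernel.

    Assumes cell centers at (i + 0.5) * dx and cell volume V_i = dx.
    """
    rho = np.zeros(n_cells)
    x_p = np.asarray(x_p, dtype=float)
    m_p = np.asarray(m_p, dtype=float)
    # Particle position measured in cells, relative to the center of cell 0.
    s = x_p / dx - 0.5
    i_left = np.floor(s).astype(int)
    w_right = s - i_left          # linear weight for the cell to the right
    w_left = 1.0 - w_right        # linear weight for the cell to the left
    # np.add.at accumulates correctly when several particles hit the same cell.
    np.add.at(rho, i_left % n_cells, m_p * w_left / dx)
    np.add.at(rho, (i_left + 1) % n_cells, m_p * w_right / dx)
    return rho
```

Because the two weights sum to one for every particle, the total deposited mass `rho.sum() * dx` equals the total particle mass.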

Page 6: Myers_SIAMCSE15

Performance problems: global bottleneck, poor AI

• The Poisson solve is a global bottleneck, and the theoretical peak AI is poor.
• This holds even if we read in a chunk of particles and do all the work we possibly can before moving on to the next chunk, by:
  • Reading in a batch of particles
  • Subtracting off their contribution to the density
  • Interpolating the field to the particle positions
  • Pushing the particles
  • Depositing the particles at their new positions

$$ \mathrm{AI} \lesssim \frac{1}{1 + 1/n_{\mathrm{ppc}}} $$

In 1D, perfect cache: 24 flops, 3 doubles per particle, 3 doubles per cell. The bound is 1 for high ppc (convergence) and 1/2 for 1 particle per cell.
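The bound can be reproduced from the quoted counts. The sketch below assumes that AI is measured in flops per byte and that a double occupies 8 bytes; neither is stated on the slide, and the function name is hypothetical.

```python
def arithmetic_intensity_bound(nppc, flops_per_particle=24.0,
                               doubles_per_particle=3.0,
                               doubles_per_cell=3.0,
                               bytes_per_double=8.0):
    """Flops per byte for the 1D, perfect-cache estimate quoted on the slide.

    Each particle moves its own data plus a 1/nppc share of the data of one cell.
    """
    bytes_moved = bytes_per_double * (doubles_per_particle + doubles_per_cell / nppc)
    return flops_per_particle / bytes_moved

# With the default counts this reduces to 1 / (1 + 1/nppc).
for nppc in (1, 10, 100):
    print(nppc, arithmetic_intensity_bound(nppc))
```

Under these assumptions the 24 flops and the 24 bytes per particle cancel, which is why the bound depends only on the number of particles per cell.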

Page 7: Myers_SIAMCSE15

Improving AI - Higher Order in Space

• Replace CIC with higher-order interpolation kernels
• Discrete delta approximations of any order we want (Lo, Minden, Colella 2015, in prep)

$$ W_q(x) = \sum_{p=0}^{q/2-1} a_{2p}\,(-1)^p\, M_q^{(2p)}(x), \qquad a_0 = 1, $$

on $[-q/2, q/2]$, shown for $q = 4, 6$:

$$ W_4(x) = \begin{cases}
\frac{|x|^3}{2} - |x|^2 - \frac{|x|}{2} + 1, & |x| \in [0, 1], \\
-\frac{|x|^3}{6} + |x|^2 - \frac{11|x|}{6} + 1, & |x| \in [1, 2], \\
0, & \text{otherwise}
\end{cases} $$

$$ W_6(x) = \begin{cases}
-\frac{|x|^5}{12} + \frac{|x|^4}{4} + \frac{5|x|^3}{12} - \frac{5|x|^2}{4} - \frac{|x|}{3} + 1, & |x| \in [0, 1], \\
\frac{|x|^5}{24} - \frac{3|x|^4}{8} + \frac{25|x|^3}{24} - \frac{5|x|^2}{8} - \frac{13|x|}{12} + 1, & |x| \in [1, 2], \\
-\frac{|x|^5}{120} + \frac{|x|^4}{8} - \frac{17|x|^3}{24} + \frac{15|x|^2}{8} - \frac{137|x|}{60} + 1, & |x| \in [2, 3], \\
0, & \text{otherwise}
\end{cases} $$

(Other symbols on the slide: $B$, $\tilde{W}_q(k)$, $k = 4$.)

4 x AI    8 x AI
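As a rough check of the $q = 4$ kernel, the sketch below evaluates $W_4$ and uses it to interpolate samples given at integer nodes. The helper names `w4` and `interpolate`, and the unit grid spacing, are assumptions for illustration; this is not the implementation from Lo, Minden, and Colella.

```python
import numpy as np

def w4(x):
    """Piecewise-cubic kernel W_4, nonzero only for |x| < 2."""
    r = np.abs(np.asarray(x, dtype=float))
    inner = r**3 / 2 - r**2 - r / 2 + 1           # |x| in [0, 1]
    outer = -r**3 / 6 + r**2 - 11 * r / 6 + 1     # |x| in [1, 2]
    return np.where(r <= 1, inner, np.where(r <= 2, outer, 0.0))

def interpolate(values, x):
    """Interpolate samples located at nodes 0, 1, ..., len(values)-1 to the point x."""
    j = np.arange(len(values))
    return float(np.sum(np.asarray(values, dtype=float) * w4(x - j)))

# A cubic sampled at the integers is reproduced at an off-node point
# (away from the ends of the sample array).
samples = np.arange(8.0) ** 3
print(interpolate(samples, 3.3), 3.3 ** 3)
```

The kernel weights sum to one at any point, and polynomials up to degree three are reproduced, which is the property a fourth-order discrete delta approximation needs.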

Page 8: Myers_SIAMCSE15

High-order time integrators with fewer bottlenecks - Example: RK4

• 4th order RK methods require 3 or 4 force evaluations per time step, and 2 or 3 particle pushes. These must be done sequentially.
• An alternative is to store the force evaluations from the RK4 stages of the previous time step at grid points, and extrapolate to get approximate forces with which to compute the displacements for your next time step.
• Cheap with many p.p.c.

[Figure: timeline of stage times $t^n - \Delta t$, $t^n - \tfrac{1}{2}\Delta t$, $t^n$, $t^n + \tfrac{1}{2}\Delta t$, $t^n + \Delta t$, with the force values $f(\cdot)$ at each time marked as past or future]

Page 9: Myers_SIAMCSE15

High-order time integrators, w/ fewer bottlenecks - Example: Extrapolating RK4

• Here is an example for a velocity-independent force using a 2nd order interpolating polynomial to extrapolate the forces.

1 The Method

The goal is to solve the following system of equations for the particle positions and velocities x and v given the force F:

$$ \dot{x} = v, \qquad \dot{v} = F(t, x). \tag{1} $$

The standard, fourth-order Runge-Kutta method applied to this system gives, for the special case of a force that does not depend on v:

$$ \begin{aligned}
x^{n+1} &= x^n + v^n \Delta t + \frac{1}{6}\left(k_1 + 2k_2\right)\Delta t^2 \\
v^{n+1} &= v^n + \frac{1}{6}\left(k_1 + 4k_2 + k_3\right)\Delta t,
\end{aligned} \tag{2} $$

where

$$ \begin{aligned}
k_1 &= F(t^n, x^n) \\
k_2 &= F\left(t^n + \frac{1}{2}\Delta t,\; x^n + \frac{1}{2}v^n \Delta t + \frac{1}{8}k_1 \Delta t^2\right) \\
k_3 &= F\left(t^n + \Delta t,\; x^n + v^n \Delta t + \frac{1}{2}k_2 \Delta t^2\right).
\end{aligned} \tag{3} $$

Note the sequential nature of this algorithm; $k_1$ must be computed before $k_2$, which must be computed before $k_3$. An alternative is to extrapolate the forces from the previous time step. For example, the forces at the stages corresponding to times $t^n - \Delta t$, $t^n - \tfrac{1}{2}\Delta t$, and $t^n$ were $k_1^n$, $k_2^n$, and $k_3^n$. Defining

$$ \tau = \frac{t - (t^n - \Delta t)}{\Delta t}, \tag{4} $$

a 2nd order interpolating polynomial that passes through the required points is

$$ \tilde{f}(\tau) = \left(2k_1^n - 4k_2^n + 2k_3^n\right)\tau^2 + \left(-3k_1^n + 4k_2^n - k_3^n\right)\tau + k_1^n. \tag{5} $$

For $\tau = 0$, $1/2$, and $1$, we recover $k_1^n$, $k_2^n$, and $k_3^n$, respectively. Extrapolated forward to $\tau = 1$, $3/2$, and $2$ ($t = t^n$, $t^n + \tfrac{1}{2}\Delta t$, and $t^n + \Delta t$), we find:

$$ \begin{aligned}
\tilde{f}(1) &= k_3^n \\
\tilde{f}\left(\tfrac{3}{2}\right) &= k_1^n - 3k_2^n + 3k_3^n \\
\tilde{f}(2) &= 3k_1^n - 8k_2^n + 6k_3^n.
\end{aligned} \tag{6} $$

If we use these approximate forces to compute the displacements, we have

$$ \begin{aligned}
k_1^{n+1} &= F(t^n, x^n) \\
k_2^{n+1} &= F\left(t^n + \frac{1}{2}\Delta t,\; x^n + \frac{1}{2}v^n \Delta t + \frac{1}{8}\tilde{f}(1)\,\Delta t^2\right) \\
k_3^{n+1} &= F\left(t^n + \Delta t,\; x^n + v^n \Delta t + \frac{1}{2}\tilde{f}\left(\tfrac{3}{2}\right)\Delta t^2\right).
\end{aligned} \tag{7} $$

An alternative time-stepping scheme with better arithmetic intensity is thus

$$ \begin{aligned}
k_1^{n+1} &= F(t^n, x^n) \\
k_2^{n+1} &= F\left(t^n + \frac{1}{2}\Delta t,\; x^n + \frac{1}{2}v^n \Delta t + \frac{1}{8}k_3^n \Delta t^2\right) \\
k_3^{n+1} &= F\left(t^n + \Delta t,\; x^n + v^n \Delta t + \frac{1}{2}\left(k_1^n - 3k_2^n + 3k_3^n\right)\Delta t^2\right) \\
x^{n+1} &= x^n + v^n \Delta t + \frac{1}{6}\left(k_1^{n+1} + 2k_2^{n+1}\right)\Delta t^2 \\
v^{n+1} &= v^n + \frac{1}{6}\left(k_1^{n+1} + 4k_2^{n+1} + k_3^{n+1}\right)\Delta t.
\end{aligned} \tag{8} $$

This is quite similar to the classical method, except that the displacements for all three k values can be computed without doing any force solves. Using Taylor expansions, one can verify that this is still 4th-order accurate. In fact, one could also use linear extrapolation for $\tilde{f}(\tau)$ with any two of the RK stages from time step $n - 1$ and still retain 4th-order accuracy.

2 Stability Analysis

To check for stability, we consider the linearized model system:

$$ \dot{x} = v, \qquad \dot{v} = -x. \tag{9} $$

The numerical scheme in equation (8), applied to (9), can be written in a compact form if we consider the lagged forces $k_1^n$, $k_2^n$, and $k_3^n$ to be part …

Slide annotations:
• All the right hand sides for Poisson can be computed at once for a subset of particles that will fit in cache. Still 4th order accurate. Real issue is time step.
• 3 - 4 x AI
• Total for going to high order: ~ 10 x
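The extrapolating scheme can be sketched for a single particle as follows. This is an illustrative reading of the lagged-force update, not the authors' code: the function names are hypothetical, and the way the first step is started (lagged forces taken from the exact solution of the test oscillator) is an assumption, since the slides do not say how the scheme is bootstrapped.

```python
import math

def extrapolating_rk4_step(F, t, x, v, dt, lagged):
    """One step for x'' = F(t, x) using forces lagged from the previous step.

    `lagged` holds (k1, k2, k3) from the previous step, i.e. forces at
    t - dt, t - dt/2 and t.
    """
    k1n, k2n, k3n = lagged
    f_1 = k3n                             # quadratic extrapolant at tau = 1
    f_32 = k1n - 3.0 * k2n + 3.0 * k3n    # quadratic extrapolant at tau = 3/2
    # All three stage positions are known before any force is evaluated.
    x2 = x + 0.5 * v * dt + 0.125 * f_1 * dt**2
    x3 = x + v * dt + 0.5 * f_32 * dt**2
    k1 = F(t, x)
    k2 = F(t + 0.5 * dt, x2)
    k3 = F(t + dt, x3)
    x_new = x + v * dt + (k1 + 2.0 * k2) * dt**2 / 6.0
    v_new = v + (k1 + 4.0 * k2 + k3) * dt / 6.0
    return x_new, v_new, (k1, k2, k3)

def integrate_oscillator(dt, n_steps):
    """Integrate x'' = -x from x = 1, v = 0 (exact solution x = cos t)."""
    F = lambda t, x: -x
    t, x, v = 0.0, 1.0, 0.0
    # Assumed start-up: lagged forces taken from the exact solution.
    lagged = tuple(-math.cos(s) for s in (-dt, -0.5 * dt, 0.0))
    for _ in range(n_steps):
        x, v, lagged = extrapolating_rk4_step(F, t, x, v, dt, lagged)
        t += dt
    return x, v

x, v = integrate_oscillator(0.01, 100)
print(abs(x - math.cos(1.0)), abs(v + math.sin(1.0)))
```

Since the three stage positions depend only on data from the previous step, the three force evaluations no longer have to be performed one after another, which is the point of the scheme.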

Page 10: Myers_SIAMCSE15

Another important question: Will these techniques actually give better results?

• Need to evaluate the utility of these methods on realistic problems
• For plasma PIC, yes. Convergence theory and empirical evidence from Wang, Miller, and Colella 2011. Need remapping and high particle counts (100-1000 particles per cell).

[Embedded page: B. Wang, G. H. Miller, and P. Colella, p. 3526 (Wang+2011)]

[Figure: (a) error vs. t and (b) convergence rate vs. t, for hx=L/64, hv=vmax/128; hx=L/128, hv=vmax/256; hx=L/256, hv=vmax/512]

Fig. 10. Error and convergence rate plots for the two-stream instability without remapping. We set r_h = 1/2. Scales (hx, hv) denote the particle grid mesh spacing at the base level. (a) The L∞ norm of the electric field errors on three different resolutions. (b) The convergence rates for the errors on plot (a). Second-order convergence rates are lost around t = 20.

[Figure: (a) error vs. t and (b) convergence rate vs. t, same resolutions as above]

Fig. 11. Error and convergence rate plots for the two-stream instability with remapping. We set r_h = 1/2. Scales (hx, hv) denote the particle grid mesh spacing at the base level. (a) The L∞ norm of the electric field errors on three different resolutions. (b) The convergence rates for the errors on plot (a). Second-order convergence rates are observed until t = 28. The loss of accuracy after t = 28 is due to filamentation.

… even with r_h = 1. This further demonstrates our error formula (3.12), which says the consistency error is second order as long as r_h ≤ 1.

We also compare the distribution function at the same instant time t = 20 by both methods in Figure 14. For visualization purposes, in the case of the PIC method without remapping, we interpolate the particle-based distribution on phase-space grids. We see that the standard PIC method results in a very noisy solution in Figure 14(a). In addition, the maximum of the approximated distribution function has a large error compared with the analytic value, fmax = 0.3. Figure 14(b) shows the distribution function solved by the PIC with remapping. Compared to the case without remapping, remapping obviously controls numerical noise and reduces the maximum error. We preserve the maximum of the distribution function by applying the mass redistribution algorithm as in positivity preservation.

Finally, we compare the evolution of the total number of particles in three cases. In the first case, we initialize and remap the problem on two levels of grids, with …

Page 11: Myers_SIAMCSE15

For Cosmology, Evidence Suggests Not...

[Embedded slide: Romain Teyssier, Computational Astrophysics 2009]

• High-resolution with constant force softening
• Particle discreteness effects show up quite dramatically in Warm Dark Matter simulations (from Wang & White, MNRAS, 2007)
• Very slow convergence N^(-1/3)
• These effects can be neglected if … (d: local inter-particle spacing)
• Adaptive force softening?
• Splinter, R.J., Melott, A.L., Shandarin, S.F., Suto, Y., “Fundamental Discreteness Limitations of Cosmological N-Body Clustering Simulations”, ApJ, 497, 38, (1998)
• Romeo, A.B., Agertz, O., Moore, B., Stadel, J., “Discreteness Effects in ΛCDM Simulations: A Wavelet-Statistical View”, ApJ, 686, 1, (2008)

[Embedded page: Heitmann+2005, "Robustness of Cosmological Simulations. I.", No. 1, 2005, p. 33]

… maintain the strict planar symmetry of the pancake collapse that is apparently the root cause of the problem. To reiterate, the 256^3 and 512^3 PM runs roughly span the force resolutions used for the other codes, and since only the 512^3 run shows a very mild failure of convergence, force resolution alone cannot be the source of the difficulty.

Our results provide a different and more optimistic interpretation of the findings of Melott et al. (1997; see also Binney 2004). While high-resolution codes when run with small smoothing lengths (or several refinement levels in the case of AMR) are not able to pass the pancake test after the formation of several caustics, the main culprit appears to be an inability to maintain the planar symmetry of the problem and not direct collisionality (at least at the force resolutions relevant for this paper), which would have been far more serious. Whether the failure to treat planar collapse is a problem in more realistic situations can be tested by comparing results from the high-resolution codes against brute-force PM simulations. A battery of such tests have been carried out in §§ 4 and 5. At the force resolutions investigated, these tests failed to yield evidence for significant deviations.

4. THE SANTA BARBARA CLUSTER

4.1. Description of the Test

Results from the Santa Barbara Cluster Comparison Project were reported in 1999 in Frenk et al. (1999). The aim of this project was to compare different techniques for simulating the formation of a cluster of galaxies in a cold dark matter universe and to decide if the results from different codes were consistent and reproducible. For this purpose outputs from 12 different codes were examined, representing numerical techniques ranging from SPH to grid methods with fixed, deformable, and multilevel meshes. The starting point for every code was the same set of initial conditions given either by a set of initial positions or an initial density field. Every simulator was then allowed to evolve these initial conditions in a way best suited for the individual code, i.e., implementations of smoothing strategies, integration …

Fig. 2. Pancake test at z = 0, 64^3 particles, following Fig. 1. Very close to the center of the spiral, there is a seven-stream flow. Here FLASH is run with an effective resolution equivalent to a 512^3 mesh (for the equivalent resolution MC² results, see Fig. 3). For a discussion of all of the results, see the text.

Fig. 3. Failure of convergence near the midplane for the pancake test: MC² results, 64^3 particles with four grid sizes at z = 0. Convergence fails at the final resolution reduction step (going from a 256^3 mesh to a 512^3 mesh). See the text for a discussion of these results.

Figure sources: Wang + White 2007; Heitmann+2005

Also: “Demonstrating Discreteness and Collision Error in Cosmological N-body Simulations of Dark Matter Gravitational Clustering” - Melott + 1997

Need to be addressed before benefiting from high order

Page 12: Myers_SIAMCSE15

Talk outline

• Motivation - why is understanding these errors relevant for us here?
• Standard PIC methods don’t converge for Cosmology applications
• Two modifications:
  – Regularization
  – Adaptive Remapping
• Summary and Future Research

Page 13: Myers_SIAMCSE15

Vlasov-Poisson for Cosmology Simulations

$$ \frac{\partial f}{\partial t} = -\frac{v}{a}\cdot\frac{\partial f}{\partial x} + \left[\left(\frac{\dot{a}}{a}\right) v + \frac{1}{a}\nabla\phi\right]\cdot\frac{\partial f}{\partial v} $$

• Important point: all cosmology simulations are run with singular initial conditions:

$$ f(x, v, t_{\mathrm{ini}}) = \rho(x, t_{\mathrm{ini}})\,\delta(v - \bar{v}) $$

• Low particle counts.
• Done for sound physical reasons, numerically problematic.
• We use PIC to solve this. Run a “Zel’dovich Pancake” setup.
• First, we do a 1D problem.

Page 14: Myers_SIAMCSE15

The Zel’dovich Pancake - 1D Convergence Results

• Convergence is bad after particle trajectories cross
• Poor convergence rates in 1D hint at more serious problems in higher dimensions...

Page 15: Myers_SIAMCSE15

The Zel’dovich Pancake - 2D, Tilted Results

• Spurious fragmentation regardless of the number of particles per Poisson cell
• Does not respect the initial symmetry of the problem setup
• Suggestive of Wang + White 2007

[Figure: panels labeled 1/4, 1, and 256]

Page 16: Myers_SIAMCSE15

Talk outline

• Motivation - why is understanding these errors relevant for us here?
• Standard PIC methods don’t converge for Cosmology applications
• Two modifications:
  – Regularization
  – Adaptive Remapping
• Summary and Future Research

Page 17: Myers_SIAMCSE15

Regularized Initial Conditions

• Remove the singularity in the initial data
• Natural approach is to regularize the initial conditions via a finite, artificial initial velocity dispersion, $\sigma_i$, for which we choose a Gaussian form:

$$ \delta(v - \bar{v}) \rightarrow \left(\frac{1}{2\pi\sigma_i^2}\right)^{D/2} \exp\left[-\frac{\left(v - \bar{v}(x, t_{\mathrm{ini}})\right)^2}{2\sigma_i^2}\right] $$

• Makes things look more like plasma case. Many particles per cell.
• Analogy with shock-capturing schemes in gas dynamics is instructive.

Page 18: Myers_SIAMCSE15

Regularized Initial Conditions

$$ \lim_{\sigma_i \to 0} \left(\frac{1}{2\pi\sigma_i^2}\right)^{D/2} \exp\left[-\frac{\left(v - \bar{v}(x, t_{\mathrm{ini}})\right)^2}{2\sigma_i^2}\right] = \delta(v - \bar{v}) $$

[Figure: panels labeled $\sigma_i = 0.2$, $\sigma_i = 0.4$, $\sigma_i = 0.8$]

Page 19: Myers_SIAMCSE15

Regularized Convergence Results - 1D

• For finite $\sigma_i$, we do obtain the expected order of accuracy.

Page 20: Myers_SIAMCSE15

The Double Limit

• This approach gives us a way to obtain solutions to the original, cold problem
• For a given $\sigma_i$, increase resolution until a converged solution is obtained.
• Then, look to see how the solutions behave as $\sigma_i \to 0$.
• Inspired by a similar technique in vortex methods

[Embedded page: Krasny, 1986, PERIODIC VORTEX SHEET ROLL-UP, p. 331]

… is increased. This convergence occurs at any time, even past the time of singularity formation in the vortex sheet (t_c = 0.375). Figure 4 illustrates this, showing the results at t = 4 for δ = 0.25 with N = 50, 100, and 200. The time step was small enough to ensure that for each value of N the point positions are an accurate solution of Eqs. (7), (8) to within the plotting resolution. With a small value of N, the interpolating curve in Fig. 4 is tangled, but as N increases, the tangling disappears. When N = 200, the curve’s shape has already converged to within plotting resolution, as may be seen by comparison with the N = 400 solution in the last panel of Fig. 3. It is therefore presumed that the curves in Figs. 2 and 3 are essentially the solution of the δ equations (1), (2) for the two particular values of δ chosen, over the time interval 0 ≤ t ≤ 4. Comparable accuracy can be obtained at later times by using smaller Δt and larger N.

The effect of decreasing δ at a fixed time (t = 1) greater than the vortex sheet’s critical time (t_c = 0.375) is shown in Fig. 5, which plots the interpolating curve for several values of δ between 0.2 and 0.05. These calculations used N = 406 and …

FIG. 5. Solution of the δ equations (1), (2) at t = 1.0 using δ = 0.2, 0.15, 0.1, 0.05.

Page 21: Myers_SIAMCSE15

The Double Limit

• The converged, regularized solutions approach a well-defined curve.
• Artificial $\sigma_i$ smooths out structures smaller than some length scale
• In practice, pick a length scale below which you won’t believe the results

Page 22: Myers_SIAMCSE15

Regularized Results - 2D

• Regularization works in 1D. However, the problem with fragmentation in 2D persists...

Page 23: Myers_SIAMCSE15

Talk outline

• Motivation - why is understanding these errors relevant for us here?
• The failure of basic PIC for Cosmology applications
• Two modifications:
  – Regularization
  – Adaptive Remapping
• Summary and Future Research

Page 24: Myers_SIAMCSE15

Particle Remapping

• In plasma convergence theory, error for field contains exponential term:

$$ e_E(x, t) \propto \exp(at) $$

• Periodically restart problem with new particles

Slide annotations:
• High-order deposition
• Positivity not guaranteed, need mass redistribution
• Particles with tiny masses are discarded
• Requires regularization

[Figure: Before remap / After remap, from Wang+2011]

Page 25: Myers_SIAMCSE15

Particle Remapping, with AMR

• Wrinkle: In comoving coordinates, velocities shrink with time.
• $\sigma_v$ shrinks as box expands, … must as well
• Solution: remap with AMR
• Resolves … with same # of particles throughout
• Example, 4 levels

Page 26: Myers_SIAMCSE15

Remapping preserves order of method in 1D...

• Once this is done, still get 2nd order in 1D

Page 27: Myers_SIAMCSE15

And greatly improves artificial fragmentation issue

[Figure: panels labeled "Remapped" and "Not remapped"; 3 levels, $\Delta a = 0.01$]

Page 28: Myers_SIAMCSE15

Talk outline

• Motivation - why is understanding these errors relevant for us here?
• The failure of basic PIC for Cosmology applications
• Two modifications:
  – Regularization
  – Adaptive Remapping
• Summary and Future Research

Page 29: Myers_SIAMCSE15

Conclusions and Future Research

• We know how to make PIC converge on Cosmology problems at the stated order of accuracy. Can now benefit from high-order PIC. Interpolation kernels, etc. for doing so are there.
• The necessary scheme looks a lot like PIC for electrostatic plasmas: with particle remapping and high particle counts.
• We can exploit this information for designing high-AI methods. Example - extrapolating RK4.
• Results on the convergence of PIC schemes for cosmology have been submitted to ApJ, paper and code available here:

https://bitbucket.org/atmyers/cosmologicalpic

Page 30: Myers_SIAMCSE15

Thank you for listening!

Controlling Numerical Error in Particle-in-Cell Simulations of Collisionless Dark Matter

Andrew Myers ([email protected])
Applied Numerical Algorithms Group, Computational Research Division
with Phillip Colella, Brian Van Straalen

SIAM-CSE Meeting, March 17th, 2015
Extreme Resilient Discretizations

Submitted to ApJ

Page 31: Myers_SIAMCSE15

Eulerian Methods

• VP equation is a non-linear advection equation in phase space
• Can be solved using Eulerian methods in phase space on up to 128^6 domains (Yoshikawa + 2013)
• Expense of working in high-dimensional spaces is significant, both in terms of memory requirements and the number of operations involved.
• Large range of scales involved implies that adaptivity is usually required.

Page 32: Myers_SIAMCSE15

Particle Methods

• Discretize system with set of Lagrangian interpolating points, $P$:

$$ f(x, v, t_{\mathrm{ini}}) \approx \sum_{p \in P} m_p\, \delta\left(x - x_p^i\right) \delta\left(v - v_p^i\right) $$

• Reduces problem to system of ODEs for particle trajectories:

$$ \frac{dm_p}{dt} = 0, \qquad \frac{dx_p}{dt} = \frac{1}{a} v_p, \qquad \frac{dv_p}{dt} = -\frac{\dot{a}}{a} v_p + \frac{1}{a} g_p $$

• Can reconstruct distribution at later times from $(x_p(t), v_p(t))$

Page 33: Myers_SIAMCSE15

Particle Methods

• In the velocity equation $\frac{dv_p}{dt} = -\frac{\dot{a}}{a} v_p + \frac{1}{a} g_p$, the first term is a “viscous drag” term associated with the comoving coordinate system.

Page 34: Myers_SIAMCSE15

Particle Methods

• Naturally adaptive
• Do not require keeping track of full, phase-space distribution function
• Basically all of the workhorse Dark Matter codes take this approach (e.g. Enzo, Flash, Nyx, RAMSES, Gadget, ART, CHARM)
• Differ mainly in the way they compute $g_p$

Page 35: Myers_SIAMCSE15

2nd order PIC for Cosmology

[Flowchart: Start → Initialize Particles → Particle Kick → Particle Drift → Particle Deposition → Poisson Solve → Force Interpolation → Particle Kick → Time to stop? (no: back to the first Particle Kick; yes: End)]

• Deposition / Interpolation handled by CIC
• Poisson’s equation solved w/ 2nd order FD
• Kick-Drift-Kick scheme (Miniati+Colella 2007)

Kick:

$$ v_p^{n+1/2} = \frac{a^n}{a^{n+1/2}} v_p^n + \frac{1}{a^{n+1/2}} g_p^n \frac{\Delta t}{2} $$

Drift:

$$ x_p^{n+1} = x_p^n + \frac{1}{a^{n+1/2}} v_p^{n+1/2} \Delta t $$

Kick:

$$ v_p^{n+1} = \frac{a^{n+1/2}}{a^{n+1}} v_p^{n+1/2} + \frac{1}{a^{n+1}} g_p^{n+1} \frac{\Delta t}{2} $$

All these pieces should be 2nd order.
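A minimal sketch of one Kick-Drift-Kick step in comoving coordinates is given below. The callables `accel` and `scale_factor` are hypothetical stand-ins: in a PIC code the acceleration $g_p$ would come from deposition, a Poisson solve, and force interpolation, and the scale factor from the background cosmology.

```python
def kdk_step(x, v, t, dt, accel, scale_factor):
    """Advance one particle by dt with the Kick-Drift-Kick update.

    accel(x) returns g at position x; scale_factor(t) returns a(t).
    Both are placeholders supplied by the caller.
    """
    a_n = scale_factor(t)
    a_half = scale_factor(t + 0.5 * dt)
    a_new = scale_factor(t + dt)
    # Kick: half-step velocity update with the force at the old position.
    v_half = (a_n / a_half) * v + accel(x) / a_half * 0.5 * dt
    # Drift: full-step position update with the half-step velocity.
    x_new = x + v_half * dt / a_half
    # Kick: second half-step velocity update with the force at the new position.
    v_new = (a_half / a_new) * v_half + accel(x_new) / a_new * 0.5 * dt
    return x_new, v_new

# With a(t) = 1 the update reduces to ordinary leapfrog for x'' = g(x).
x1, v1 = kdk_step(1.0, 0.0, 0.0, 0.1, lambda x: -x, lambda t: 1.0)
print(x1, v1)
```

The second kick needs the force at the new positions, which is why deposition, the Poisson solve, and interpolation sit between the drift and the final kick in the flowchart.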

Page 36: Myers_SIAMCSE15

The Zel’dovich Pancake

• Collapse of a single, sinusoidal perturbation in an expanding background
• A common test case for cosmological dark matter codes
• Analytic solution exists prior to the “first caustic” - the time at which the first matter parcels cross
• “Single-mode” analysis of cosmological structure formation

Page 37: Myers_SIAMCSE15

The Zel’dovich Pancake

• Usually, a uniform, zero-temperature fluid is discretized with evenly-spaced, equal mass particles.
• These particles are then perturbed from the initial positions using the Zel’dovich approximation.
• Each point in space has only one particle, no velocity dispersion

Page 38: Myers_SIAMCSE15

The Zel’dovich Pancake

• These initial conditions represent an initial distribution function that is singular in velocity space:

$$ f(x, v, t_{\mathrm{ini}}) = \rho(x, t_{\mathrm{ini}})\,\delta(v - \bar{v}) $$

• This approximation is made for good physical reasons.
• However, singular initial data can pose problems for numerical solution methods. Problem may be ill-posed.
• When we look at the Richardson-extrapolated order as a function of time:
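One common way to estimate an observed order of accuracy without an exact solution is to compare the same quantity at three resolutions. The sketch below shows that standard estimate; whether the talk computes its Richardson-extrapolated order in exactly this way is an assumption, and the function name is hypothetical.

```python
import math

def richardson_order(u_coarse, u_medium, u_fine, ratio=2.0):
    """Estimate the convergence order from three successively refined results.

    Assumes the error behaves like C * h**p and that the mesh spacing is
    divided by `ratio` between successive runs.
    """
    return math.log(abs(u_coarse - u_medium) / abs(u_medium - u_fine)) / math.log(ratio)

# Synthetic check: a quantity with a pure second-order error term.
u = lambda h: 1.0 + 3.0 * h**2
print(richardson_order(u(0.4), u(0.2), u(0.1)))
```

Evaluating this estimate at each output time gives the order as a function of time.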

Page 39: Myers_SIAMCSE15

Regularized Initial Conditions

• Sample the regularized distribution on a Cartesian grid in phase space, discarding those with tiny masses.
• $(h_x, h_v)$ = initial particle spacing in physical, velocity space.
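The sampling step can be sketched in 1D as follows. The function name, the relative mass threshold `rel_mass_floor`, and the uniform test density are assumptions for illustration; the slides do not state the cutoff used to discard tiny masses.

```python
import numpy as np

def sample_regularized_ic(rho, vbar, sigma, x_grid, v_grid, rel_mass_floor=1e-12):
    """Place one particle per phase-space grid node, weighted by a Gaussian in v.

    rho(x) and vbar(x) are callables for the initial density and mean velocity;
    x_grid and v_grid are uniform grids with spacings h_x and h_v.
    """
    hx = x_grid[1] - x_grid[0]
    hv = v_grid[1] - v_grid[0]
    X, V = np.meshgrid(x_grid, v_grid, indexing="ij")
    gauss = np.exp(-(V - vbar(X))**2 / (2.0 * sigma**2)) / np.sqrt(2.0 * np.pi * sigma**2)
    mass = rho(X) * gauss * hx * hv
    # Discard particles whose mass is negligible relative to the largest one.
    keep = mass > rel_mass_floor * mass.max()
    return X[keep], V[keep], mass[keep]

x_grid = (np.arange(16) + 0.5) / 16.0
v_grid = np.linspace(-2.0, 2.0, 81)
x, v, m = sample_regularized_ic(lambda x: np.ones_like(x),
                                lambda x: np.zeros_like(x),
                                0.2, x_grid, v_grid)
print(m.size, m.sum())
```

Discarding the tiny-mass particles keeps the particle count tied to the width of the velocity distribution and not to the full extent of the velocity grid.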