A parallel electrostatic solver for the VORPAL code

Please download to get full document.

View again

of 4
All materials on our website are shared by users. If you have any questions about copyright issues, please report us to resolve them. We are always happy to assist you.
Document Description
A parallel electrostatic solver for the VORPAL code
Document Share
Document Tags
Document Transcript
  Computer Physics Communications 164 (2004) 118–121www.elsevier.com/locate/cpc A parallel electrostatic solver for the VORPAL code Peter Messmer ∗ , David L. Bruhwiler Tech-X Corporation, 5621 Arapahoe Avenue, Suite #A, Boulder, CO 80301, USA Available online 21 July 2004 Abstract Here we report on the addition of a parallel electrostatic (ES) field solver to the hybrid plasma simulation code VORPAL.Poisson’s equation is solved by finite differences in real space. The large sparse system of linear equations is solved byiterative methods, like Krylov subspace algorithms and algebraic multigrid (AMG). Utilizing solver libraries optimized forparallel platforms (AZTEC and ML, both developed at Sandia National Labs) results in good overall performance of the solver.Accuracy of the solver is compared to analytical results for examples relevant to electron-cooling problems. © 2004 Elsevier B.V. All rights reserved. Keywords:  Poisson solver; Iterative methods; Algebraic multigrid; Electron cooling; Particle-in-cell (PIC) 1. Introduction VORPAL [1] is a parallel hybrid PIC/fluid codeunder development by the University of Colorado atBoulder and by Tech-X Corp. It is implemented inC ++ , usingtemplate metaprogrammingtechniquestoenabletheexpressionofalgorithmsinarbitrarydimen-sion. This code has been extensively used for electro-magnetic(EM)modelingoflaser–plasma interactions.An electrostatic (ES)field solverhas recentlybeenim-plemented for VORPAL, so the code can be appliedto electron-coolingphysics [2]and otherwide-rangingproblems.The EM treatment of plasmas is more fundamentalthan ES, but computationally much more demanding.In particular, the Courant condition requires that lightcannot cross a grid cell in one timestep—a restrictionthat does not make sense for nonrelativistic plasmas * Corresponding author.  E-mail address:  messmer@txcorp.com (P. Messmer). interacting with low-frequency fields. In such cases,an ES treatment of the problem is justified and can beorders of magnitude faster. Also, an ES field solve isoftenrequiredto establish correctinitial conditionsforan EM simulation.The numerical techniques used in an ES code dif-fer significantly from those used in an EM code,especially for parallel computing. In an EM code,the field advance requires only local quantities, mak-ing it relatively straightforward to efficiently paral-lelize the computation with low communication over-head. In contrast, an ES field solve requires globalinformation—starting from the global charge distri-bution, the electric potential is obtained by solvingPoisson’s equation. The decision to use freely avail-able solvers, which are already optimized for mas-sively parallel architectures, yielded excellent per-formance while greatly simplifying implementationwithin VORPAL.Solving the discretized Poisson equation is a wellstudied field in numerical analysis and several algo- 0010-4655/$ – see front matter  © 2004 Elsevier B.V. All rights reserved.doi:10.1016/j.cpc.2004.06.018  P. Messmer, D.L. Bruhwiler / Computer Physics Communications 164 (2004) 118–121  119 rithms are known, including methods based on FastFourier transforms, or Green’s function. The electro-static solver presented here discretizes the Poissonequation in real space, which reduces to the solutionof a sparse linear system  Ax = r , where  A  is an  n × n matrix of finite difference kernels with  n  the numberof grid cells in the system,  x  the electrostatic poten-tial, and  r  the charge density on the grid. The matrixis very sparse, with usually 5–7 non-zeroelements perrow, depending on the dimensionality of the problem.The electric field is finallyobtainedbyfinite differenc-ing the potential. Currently, arbitrary shaped Dirichletboundaries are supported, leading in general to a non-symmetric matrices. 2. Implementation Designing robust and fast linear solvers on parallelarchitectures is a challenging task. We therefore usedfor our implementation the AZTEC library fromSandia National Lab [3]. The library offers a largecollection of solver and preconditioner algorithms.Additionally, AZTEC allows the user to easily specifythedecompositionofthematrixandthevectorsamongthe processors. This avoids redistribution of field databefore and after the field solver step.In addition to the conventional Krylov subspacesolvers for non-symmetric matrices, like ConjugateGradient Squared (CGS) or Stabilized Bi-ConjugateGradients (BiCGStab) available in AZTEC, multi-grid based solvers can be very efficient for Poisson-type (elliptic) problems. The new electrostatic solverfor VORPAL offers therefore an Algebraic Multi-grid (AMG) based preconditioner and a full AMGsolver. The implementation of the smoothed-aggrega-tion AMG algorithm is based on the multigrid pre-conditioning package ML [4], developed by SandiaNatl. Lab. While ML offers a variety of smoothersto chose from, only Gauss–Seidel without damp-ing was used in the following numerical experi-ments. 3. Results Here we consider accuracy of the solvers, timingissues, and scaling with number of processors. There-fore, solutions to ellipsoidal and Gaussian charge dis-tributions in conducting boxes are computed. An ana-lytical treatment for a Gaussian distribution is used toprecisely study errors. 3.1. Accuracy A non-trivialproblemof relevanceto electroncool-ing is to determine the potential of a Gaussian elec-tron distribution  ρ = ρ 0 e − x 2 / 2 a 2 e − y 2 / 2 b 2 in a conduct-ing box. For simplicity, we only show the 2D case, butthe extensionto 3D is straightforward.The charge dis-tribution is confined in a box of length  L d  / 2 in all di-mensions  d  , centered around the srcin. The analyticexpression for the potential can be found by expand-ing both the potential and the charge distribution in aFourier series, e.g. Φ(x,y) =ℜ ∞  m,n = 0 Φ m,n e ik m x e ik n y , (1) k m =  2 πL x m, k n =  2 πL y n, where  ℜ  denotes the real part. Letting the Laplacianoperate on the potential leads to the following relationbetween the Fourier coefficients of the potential andthe charge density:(2) Φ m,n =−  1 k 2 m + k 2 n ρ m,n . The Fourier coefficients for a Gaussian charge distrib-ution can be expressed as a product of error functions, ρ m,n = 2 πab erf    L x 2 √  2 a  erf    L y 2 √  2 b  (3) × e − 12 (k 2 m a 2 + k 2 n b 2 ) . Eqs.(3) and(2)determine Φ m,n , thecoefficientsof thepotential. Imposing conducting boundary conditions,and the symmetry of the charge distribution, thepotential is finally determined by(4) Φ(x,y) = 4 ∞  m,n = 0 Φ ˆ m, ˆ n cos (k ˆ m x) cos (k ˆ n y), with  ˆ m = 2 m + 1,  ˆ n = 2 n + 1.Fig. 1, left, shows the relative error of the potentialof a Gaussian electron distribution on a 400 × 400  120  P. Messmer, D.L. Bruhwiler / Computer Physics Communications 164 (2004) 118–121 Fig. 1. Left: Relative error of the potential for a 2D Gaussian charge distribution on a 400 × 400 cell grid with 50 particles per cell. Right:Transverse electric field of a uniformly charged ion distribution on a 65 × 65 cells grid with aspect ratio 1000. Solid: 100 particles/cell, 10 − 10 residual. Coincides with the theoretical result. Dashed: 100 particles/cell, 10 − 1 residual. Dotted: 1 particle/cell, 10 − 10 residual. cell grid with 50 particles per cell on average. Theseries in Eq. (4) was approximated by a sum of thefirst 50 coefficients. The convergence criterion for theiterativesolver(preconditionedCGS)wasaresidualof 10 − 10 . The largest contribution to the error srcinatesfrom the center of the distribution. The random natureof the error indicates its srcin in the noise of theparticle distribution. The effect of low sampling of theparticle distribution will be discovered again in thenext example. 3.2. Aspect ratio The electron cooling problem is characterized byan approximately Gaussian ion distribution which ishighly elongated along the dimension parallel to anexternal magnetic field. In order to avoid an excessivenumber of grid points in the longitudinal direction,the electrostatic solver has to be capable of obtainingsolutions for grids with large aspect ratios. Thereforethe electric field of a uniformlychargedellipsoidal iondistribution with axes  a x = 0 . 5 mm and  a y = 0 . 5 m ona 65 × 65 cells grid is investigated.The right panel of  Fig. 1 shows the electric fieldalong the short dimension, obtained with the new ESsolver, using preconditioned CGS. No artifacts arevisible at the interface between plasma and vacuumand the solution is indistinguishable from the theo-retical solution. The error due to a crude solution of Poisson’s equation is marginal. However, undersam-pling the charge distribution leads to a relatively largeerror. 3.3. Performance The availability of different iterative solvers andpreconditioners in AZTEC allows to easily compareconvergence.Fig. 2, left panel, shows the convergenceof the electrostatic potential for a Gaussian particledistribution on a 1026 ×  65 × 65 cells grid with acell aspect ratio of 500. Relative residuals versus timefor runs on 32 processors of NERSC’s IBM-SP areshown, using differentsolvers. A three step symmetricGauss–Seidel preconditioner was used for CGS andBiCGStab.For this particular problem, full AMG has the bestperformanceand convergencebehavior. Each iterationofthe AMG preconditionedCGS takes relativelylong,but the overall solution time is comparable to the fullAMG solver. They are both almost a factor of twofaster than preconditioned CGS and much faster thanCGS or BiCGStab.The convergence criterion was chosen arbitrarilyto be 10 − 6 . The previous example showed that theoverall accuracy of the potential in this particular caseis dominatedby the accuracyin the chargedistributionand not by the accuracy of the solution. Relaxingthe convergence criteria could therefore lead to muchfaster solution time.Another criteria is the scalability of the algorithmon parallel architectures. Even though the algorithm  P. Messmer, D.L. Bruhwiler / Computer Physics Communications 164 (2004) 118–121  121Fig. 2. Left: Convergence behavior for different solvers for a 3D Gaussian charge distribution on a 1026 × 65 × 65 cell grid on 32 processors:Full AMG (solid, thick), AMG preconditioned CGS (solid, thin), Gauss–Seidel preconditioned CGS (dotted, thick), CGS (dotted, thin),preconditioned BiCGStab (dashed, thick), BiCGStab (dashed, thin). The execution time per timestep is shown as horizontal lines. Right:Execution times for the electrostatic simulation of a 3D Gaussian beam, on a grid of 1026 × 65 × 65 cells (solid) and 4104 × 65 × 65 cells(dotted), using AMG preconditioned CGS (diamond) or Gauss–Seidel preconditioned CGS (stars). can handle large aspect ratios, it may be desirable tohavea largecomputationalgrid,e.g.inordertoresolvethe longitudinal Debye length. Fig. 2 (right panel)shows the execution times of the solver on the IBM-SP at NERSC, using the AMG preconditioned CGSfield solver and preconditioned CGS. The scaling ispromising up to a large number of processors. 4. Conclusions A parallel electrostatic field solver has been addedto the hybrid simulation code VORPAL. Utilizingsolver libraries optimized for parallel platforms(AZTEC and ML, both developed at Sandia NationalLabs), good performance for the solution of the largesystem of linear equations resulting from the dis-cretizationofPoisson’sequationisobtained.Accuracyof the solver is compared to analytical results for ex-amples relevant to electron-cooling problems. Acknowledgement The ML library was kindly supplied by Ray Tu-minaro, Sandia National Lab. For more informa-tion on ML, see http://www.cs.sandia.gov/~tuminaro/ ML_Description.html. This work was funded by DOEcontractDE-FG03-01ER83313and by Tech-XCorpo-ration. References [1] C. Nieter, J.R. Cary, VORPAL: a versatile plasma simulationcode, J. Comput. Phys. 196 (2004) 448.[2] I. Ben-Zvi, et al., Electron cooling for RHIC, in: Proc. Part.Accel. Conf., 2001, p. 48.[3] R.S. Tuminaro, M. Heroux, S.A. Hutchinson, J.N. Shadid,Official Aztec User’s Guide: Version 2.1, Technical Report,SAND99-8801J, December, 1999.[4] C. Tong, R.S. Tuminaro, ML 2.0 smoothed aggregation user’sguide, Technical Report SAND2001-8028, Sandia NationalLaboratories, December 2000.
Search Related
We Need Your Support
Thank you for visiting our website and your interest in our free products and services. We are nonprofit website to share and download documents. To the running of this website, we need your help to support us.

Thanks to everyone for your continued support.

No, Thanks