Multi-Grid Methods in Lattice Gauge Theory
Multi-grid Methods in Lattice Field Theory
Rich Brower, Physics, Boston University
Jan 10, 2017
(SciDAC Software Co-director and NVIDIA CUDA Fellow)

Machine Learning/Multi-scale Physics
[Figure: The Multigrid V-cycle. Labels: smoothing, restriction, prolongation (interpolation), Fine Grid, Smaller Coarse Grid.]

Lattice Field Theory has Come of Age
• CM-2: 100 Mflops (1989)
• BG/Q: 1 Pflops (2012)
• $10^7$ increase in 25 years

Lagrangian for QCD
What's so difficult about this!
\[ S = \int d^4x \, \mathcal{L} \]
\[ \mathcal{L}(x) = \frac{1}{4g^2} F^{ab}_{\mu\nu} F^{ab}_{\mu\nu} + \bar\psi_a \gamma_\mu \left( \delta^{ab}\partial_\mu + A^{ab}_\mu \right) \psi_b + m \bar\psi \psi \]
(the quark mass $m$ is the only explicit scale)
• 3x3 “Maxwell” matrix field & 2+ Dirac quarks
• 1 “color” charge g & “small” quark masses m.
• Sample quantum “probability” of gluonic “plasma”:
\[ \mathrm{Prob} \sim \int \mathcal{D}A_\mu(x) \, \det\left[ D^\dagger_{\rm quark}(A) D_{\rm quark}(A) \right] e^{-\int d^4x \, F^2/2g^2} \]

All predictions from Quantum Field Theory require “Algorithms”
Z = Path Integral exp[-Action]
[Diagram labels: Feynman Diagrams; Wilson Lattice QCD; PDE/FEM; Schwartz; Schurs; OPE & Renormalization Group; Multi-Grid; ‘tHooft Dim Reg; Domain Wall; Twisters; Wilson Flow; DD; Bootstrap; AdS/CFT; Peter’s Multiscale (P. Boyle, yesterday).]

What about multi-scales inside of Lattice QCD?
\[ a(\text{lattice}) \ll 1/M_{\rm proton} \ll 1/m_\pi \ll L(\text{box}) \]
\[ 0.06 \text{ fermi} \ll 0.2 \text{ fermi} \ll 1.4 \text{ fermi} \ll 6.0 \text{ fermi} \]
$\Rightarrow$ L = O(100) in lattice units, or a minimum lattice volume of $100^4$!

Multigrid: Case History in Algorithm Development
• History Lessons (1989-1992)*
  - Cause of early failure
• Modern Era (2008-2013)
  - 5 years to put into production the QCD MG solver for Wilson-clover
• Future** (2013-2018)
  - Domain Wall & Staggered solvers, HMC evolution, etc.
  - Adaptation to heterogeneous architectures, etc.
* “MG is always the Future”: Anonymous, JLab 2008
** “But, the future has arrived!”: Rich, Oak Ridge 2013

Outline
1. Lattice Gauge Multi-grid: Why was it so difficult?
   - Lessons from 1990s
   - The scaling & RG metaphor*
2. The Adaptive Geometric MG breakthrough
   - Wilson clover (twisted mass)
   - Staggered (?)
   - Domain Wall (overlap/Peter Boyle!)
3. Multi-scale extensions:
   - Monte Carlo MG (Endres et al ?)
   - Quantum Finite Elements (FEM + quantum)
   - SUSY/Graphene/etc
* “Combining Renormalization Group and Multigrid Methods” (1988), R. Brower, R. Giles, K.J.M. Moriarty, P. Tamayo

A faithful Discrete Dirac PDE on Lattice is not trivial
Standard Finite Difference or Finite Element Methods fail.
There are several popular choices and trade-offs. Each poses different Multigrid challenges:
• Wilson Clover & Twisted Mass
• Staggered (or Kogut-Susskind, SUSY, Graphene)
• Domain Wall & Overlap (exact chiral)
• Simplicial Lattice (Random Flat: Christ et al) (General Riemann: Brower et al)

Dirac Equation: Math Preliminaries
Continuum:
\[ \gamma_\mu (\partial_\mu - i A_\mu) \psi(x) + m \psi(x) = b(x) \]
or
\[ \sum_\mu \gamma^{ij}_\mu \left( \frac{\partial}{\partial x_\mu} - i A^{ab}_\mu \right) \psi_{jb}(x) + m \psi_{ia}(x) = b_{ia}(x) \]
with 3 by 3 gauged colors and 4 x 4 ($2^{d/2} \times 2^{d/2}$) spin matrices.

Even more compact Linear Op form:
\[ (D + m)\psi = b \]
\[ (D + m)|\lambda\rangle = (i\lambda + m)|\lambda\rangle \]
(D is anti-Hermitian, which implies this spectral form.)
Normal Operator:
\[ (D + m)^\dagger (D + m) = -(\partial_\mu - i A_\mu)^2 - \sigma_{\mu\nu} F_{\mu\nu} + m^2 \]

Putting it on hypercubic lattice
Wilson:
\[ D_{\rm quark} = d - \frac{1 - \gamma_\mu}{2} U(x, x+\mu) - \frac{1 + \gamma_\mu}{2} U(x+\mu, x) + m_q \]
Staggered:
\[ D(U, m) = \frac{1}{2} \sum_{\mu=1}^{d} \eta_\mu(\vec x) \left[ U_\mu(\vec x) \delta_{\vec x, \vec y - \hat\mu} - U^\dagger_\mu(\vec x - \hat\mu) \delta_{\vec x, \vec y + \hat\mu} \right] + m \delta_{\vec x, \vec y} \]
\[ \eta_\mu = (-1)^{x_1 + x_2 + \cdots + x_{\mu-1}} \quad \text{(just phases!)} \]
Domain Wall is 5d Wilson-like.
Gauge link from x to x+µ:
\[ U_{\rm glue}(x, x+\mu) = e^{i a A_\mu(x)} \]

Scaling & Ren Group Metaphor
Massless Laplace: $\phi(x) \to \lambda^{d-2} \phi(\lambda x)$
\[ -\nabla^2 \phi(x, y, \cdots) = -\left[ \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \cdots \right] \phi = e^2 \delta^d(x, y, \cdots) \]
Solution (Green’s function):
\[ \phi(x, y, \cdots) = \frac{1}{S_d} \frac{e^2}{\left( \sqrt{x^2 + y^2 + \cdots} \right)^{d-2}} \]
\[ \phi(x, y) = -\frac{1}{2\pi} \log\sqrt{x^2 + y^2} \quad \text{for } d = 2 \]

Naive Scaling and 1-d MG Toy Example
\[ (D\phi)(x) = -\left[ \phi(x+a) - 2\phi(x) + \phi(x-a) \right] + a^2 m^2 \phi(x) = b(x) \]
• a ➔ 2a: Restriction $R = P^\dagger$
• 2a ➔ a: Prolongation $P$
(1) Blocking preserves the scale-invariant constant solutions (null state).
(2) Coarse operator is renormalized: m ➔ 2m (in units a = 1).
\[ D_{\rm wilson} = d - \frac{1 - \gamma_\mu}{2} U(x, x+\mu) - \frac{1 + \gamma_\mu}{2} U(x+\mu, x) + m_q = \frac{1}{2} \gamma_\mu (\Delta_\mu - \Delta^\dagger_\mu) + \frac{1}{2} \Delta^\dagger_\mu \Delta_\mu + m_q \]
where $\Delta_\mu \psi(x) = U(x, x+\mu)\psi(x+\mu) - \psi(x) \simeq a(\partial_\mu - A_\mu(x))\psi(x)$.

Wilson Spectra in 2d QCD

MG attempts in 1990’s
See Thomas Kalkreuter, hep-lat/9409008, review on “MG Methods for Propagators in LGT”.
• Israel: Ben-Av, M. Harmatz, P.G. Lauwers & S. Solomon
• Boston: Brower, Edwards, Rebbi & Vicari
• Amsterdam: A. Hulsebos, J. Smit, J. C. Vink

QCD MG “failure” in 1990s: 2x2 Blocks for U(1) Dirac
$U_\mu(x)$ on links, $\Psi(x)$ on sites.
[Figure: β = 1, 2-d lattice. Gauss-Jacobi (diamond), CG (circle), V cycle (square), W cycle (star).]

Universality of Critical Slowing Down
[Figure: three data sets labeled 3 (cross), 10 (plus), 100 (square). Gauss-Jacobi (diamond), CG (circle), 3 level (square & star).]
\[ \tau = F(m l_\sigma) \]

Lessons from MG attempts in 1990s
- Partial success at weak coupling.
- “Local” real space RG blocking but not perfect.
- Maintain Gauge invariance
- Maintain $\gamma_5$ Hermiticity
- BUT why did it fail at small mass? Perturbative RG Metaphor?
“Education is knowing how far to push a metaphor!”

Why Didn’t It Work?
Instantons, Topological Zero Modes (Atiyah-Singer index) and Confinement length $l_\sigma$
\[ \psi^{ia}_0(x) = \frac{1}{x^2 + \rho^2} \hat x_\mu \gamma^{ik}_\mu \left( \frac{1 + \gamma_5}{2} \right)^{kj} \epsilon^{ja} \]

“A little knowledge is a dangerous thing”

                                   Lagrangian      Lattice (a > 0)    Quantum Theory*
                                   (i.e. PDE’s)    (i.e. Computer)    (i.e. Nature)
Rotational (Lorentz) Invariance        ✔                ✘                  ✔
Gauge Invariance                       ✔                ✔                  ✔
Scale Invariance                       ✔                ✘                  ✘
Chiral Invariance                      ✔                ✘/✔                ✘

*Must take lattice spacing to zero for the quantum averages.

2. Adaptive Multigrid Revolution?
2. Machine Learning Multigrid Revolution!

Adaptive Smooth Aggregation Algebraic MultiGrid
Slow convergence of the Dirac solver is due to small eigenvalues for vectors in the near-null subspace $S$:
\[ D : S \simeq 0 \]
Split the vector space into the near-null space $S$ and the complement $S_\perp$.
[Figure: The Multi-grid V-cycle. Labels: smoothing, restriction, prolongation (interpolation), Fine Grid, Smaller Coarse Grid.]

2-level Multi-grid Cycle (simplified)
• Smooth: $x' = (1 - D)x + b$, $r' = (1 - D)r$
• Project: $\hat D = P^\dagger D P$, $\hat r = P^\dagger r$
• Approx. Solve: $\hat D \hat e = \hat r \Rightarrow \hat e \simeq \hat D^{-1} P^\dagger r$
• Prolongate: $e = P \hat e$
• Update: $x'' = x' + e$

Petrov-Galerkin Oblique Projector: Exact Upscale
\[ r_{\rm exact} = \left[ 1 - D P \frac{1}{P^\dagger D P} P^\dagger \right] r \;\Rightarrow\; P^\dagger r_{\rm exact} = 0 \]

Adaptive Smooth Aggregation Algebraic Multigrid
“Adaptive multigrid algorithm for the lattice Wilson-Dirac operator”, R. Babich, J. Brannick, R. C. Brower, M. A. Clark, T. Manteuffel, S. McCormick, J. C. Osborn, and C. Rebbi, PRL (2010).
Good News/Bad News
More Data: Should save MG projectors with lattice
Actually MG error is smaller at fixed residual

Staggered Multigrid: Preliminary (Brower, Clark, Strelchenko and Weinberg)
\[ D(U, m) = \frac{1}{2} \eta_\mu(x) \left[ U_\mu(x, x+\mu) - U(x+\mu, x) \right] + m = \frac{1}{2} \eta_\mu(\vec x) \left[ \Delta_\mu - \Delta^\dagger_\mu \right] + m \]

2d Staggered Spectrum
Normal Equation (e/o precond): Free field is 2a Laplace
\[ D^\dagger(U, m) D(U, m) = \begin{pmatrix} m & -D_{eo} \\ -D_{oe} & m \end{pmatrix} \begin{pmatrix} m & D_{eo} \\ D_{oe} & m \end{pmatrix} = \begin{pmatrix} m^2 - D_{eo} D_{oe} & 0 \\ 0 & m^2 - D_{oe} D_{eo} \end{pmatrix} \]
[Figure: Total $D_{eo}$, $D_{oe}$ applications, normal equation, versus m (0 to 0.1); vertical axis 100 to 10000. Curves: $128^2$, CG on Even-Odd; $256^2$, CG on Even-Odd; $128^2$, MG-GCR; $256^2$, MG-GCR.]

Spurious Galerkin Eigenmodes
[Figure: Eigenvalues, $32^2$, β = 6.0, Naive Galerkin; vertical axis Im(λ).]
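
The 2-level cycle listed on the earlier slide (smooth, project, solve, prolongate, update) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the QCD solver: the fine operator is a hypothetical 1-d periodic stand-in, D = -Laplacian + m² with a = 1; P is a fixed piecewise-constant aggregation over blocks of two sites rather than an adaptively built near-null-space projector; the coarse system is solved exactly instead of approximately; the smoother is damped Richardson; and a post-smoothing sweep is added. All function names are made up for this sketch.

```python
import numpy as np

def fine_operator(n, mass):
    """Hypothetical 1-d periodic stand-in: D = -Laplacian + mass^2 (a = 1)."""
    D = (2.0 + mass**2) * np.eye(n)
    for x in range(n):
        D[x, (x + 1) % n] -= 1.0
        D[x, (x - 1) % n] -= 1.0
    return D

def aggregation_prolongator(n, block=2):
    """P copies each coarse value onto its block, so constants are preserved."""
    P = np.zeros((n, n // block))
    for x in range(n):
        P[x, x // block] = 1.0
    return P

def smooth(D, x, b, omega):
    """One damped Richardson sweep: x' = x + omega * (b - D x)."""
    return x + omega * (b - D @ x)

def two_level_cycle(D, P, x, b, omega):
    x = smooth(D, x, b, omega)               # Smooth
    r = b - D @ x                            # residual after smoothing
    D_hat = P.T @ D @ P                      # Project: D_hat = P^dagger D P
    r_hat = P.T @ r                          #          r_hat = P^dagger r
    e_hat = np.linalg.solve(D_hat, r_hat)    # Solve (exactly here, by assumption)
    x = x + P @ e_hat                        # Prolongate and update
    return smooth(D, x, b, omega)            # post-smoothing (added in this sketch)

def relative_residual(D, x, b):
    return np.linalg.norm(b - D @ x) / np.linalg.norm(b)

def run_demo(n=64, mass=0.1, cycles=30, seed=0):
    """Return (two-level residual, smoothing-only residual) at equal sweep count."""
    rng = np.random.default_rng(seed)
    D = fine_operator(n, mass)
    P = aggregation_prolongator(n)
    b = rng.normal(size=n)
    omega = (2.0 / 3.0) / (2.0 + mass**2)    # damped-Jacobi weight for this D
    x_mg = np.zeros(n)
    x_sm = np.zeros(n)
    for _ in range(cycles):
        x_mg = two_level_cycle(D, P, x_mg, b, omega)
        x_sm = smooth(D, smooth(D, x_sm, b, omega), b, omega)
    return relative_residual(D, x_mg, b), relative_residual(D, x_sm, b)

if __name__ == "__main__":
    mg, sm = run_demo()
    print(f"two-level: {mg:.2e}   smoothing only: {sm:.2e}")
```

Because the coarse solve is exact in this sketch, the corrected residual satisfies $P^\dagger r_{\rm exact} = 0$, which is the oblique-projector property stated on the Petrov-Galerkin slide.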
Coarse operators at each level ($D$, $\hat D$, $\hat{\hat D}$):
\[ \hat D = P^\dagger D P, \qquad \hat{\hat D} = \hat P^\dagger \hat D \hat P \]

Removing Spurious Galerkin Eigenmodes
[Figure: Eigenvalues, $32^2$, β = 6.0, Hybrid Algorithm; axes Re(λ) (-0.05 to 0.25) and Im(λ) (0 to 0.2). Legend: L1: $D$; L2: $\hat D + 0.16 [\widehat{D^\dagger D}]_T$; L3: $\hat{\hat D} + 0.32 [\widehat{\widehat{D^\dagger D}}]_T$.]
\[ \hat D = P^\dagger D P + [P^\dagger D^\dagger D P]_T \quad \text{(Truncated Normal)} \]

Stabilized Staggered MG
[Figure: Total $D_{eo}$, $D_{oe}$ applications versus m (0 to 0.1); vertical axis 100 to 10000. Curves: $128^2$, CG on Even-Odd; $256^2$, CG on Even-Odd; $128^2$, MG-GCR; $256^2$, MG-GCR.]

Domain Wall
• “Hierarchically deflated conjugate gradient”, P. A. Boyle (Edinburgh U.). Feb 11, 2014. 37 pp. EDINBURGH-2014-03, arXiv:1402.2585 (QCD)
• “Multigrid Algorithms for Domain-Wall Fermions”, Saul D. Cohen (Washington U., Seattle), R.C. Brower (Boston U., Ctr. Comp. Sci.), M.A. Clark (Harvard-Smithsonian Ctr. Astrophys.), J.C. Osborn (Argonne). May 2012. 7 pp. Published in PoS LATTICE2011 (2011) 030 (2d U(1))
Domain Wall: 4 + 1 with extra 5th dimension of size $L_s$ (2 + 1 for the 2d U(1) case)

Domain Wall Spectrum: Violently Non-Normal Operator
• Non Normal
• Non Hermitian
• Non Pos. Def.
(Saul Cohen)

High Performance on GPUs
• Cost in dollars reduced by a factor of at least 100+: GPU O(10+) and MG O(10+)

QUDA: NVIDIA GPU
• “QCD on CUDA” team – http://lattice.github.com/quda
  - Ron Babich (BU -> NVIDIA)
  - Kip Barros (BU -> LANL)
  - Rich Brower (Boston University)
  - Michael Cheng (Boston University)
  - Mike Clark (BU -> NVIDIA)
  - Justin Foley (University of Utah)
  - Steve Gottlieb (Indiana University)
  - Bálint Joó (Jlab)
  - Claudio Rebbi (Boston University)
  - Guochun Shi (NCSA -> Google)
  - Alexei Strelchenko (Cyprus Inst. -> FNAL)
  - Hyung-Jin Kim (BNL)
  - Mathias Wagner (Bielefeld -> Indiana Univ)
  - Frank Winter (UoE -> Jlab)

Mapping Multi-scale Algorithms to Multi-scale Architecture

First Step: Wilson-Dslash on GPUs
• REDUCE MEMORY TRAFFIC:
• (1) Lossless Data Compression:
  - SU(3) matrices are all unitary complex matrices with det = 1.
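
One standard way to exploit unitarity and det = 1 is to store only the first two rows of each SU(3) matrix (12 real numbers instead of 18) and rebuild the third row as the complex conjugate of their cross product. The NumPy sketch below illustrates that identity only; the function names are hypothetical and nothing here is taken from the QUDA source.

```python
import numpy as np

def random_su3(rng):
    """Draw a random SU(3) matrix: unitary factor of a QR split, rescaled to det = 1."""
    z = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    q, _ = np.linalg.qr(z)                       # q is unitary
    return q / np.linalg.det(q) ** (1.0 / 3.0)   # remove the overall phase

def compress(u):
    """Keep rows 0 and 1 only: 6 complex = 12 real numbers."""
    return u[:2].copy()

def reconstruct(rows):
    """Rebuild the third row as conj(row0 x row1), valid for SU(3) only."""
    a, b = rows
    c = np.conj(np.array([
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]))
    return np.vstack([a, b, c])

if __name__ == "__main__":
    u = random_su3(np.random.default_rng(1))
    err = np.abs(reconstruct(compress(u)) - u).max()
    print(f"max reconstruction error: {err:.1e}")
```

The trade is extra arithmetic on reconstruction in exchange for reading one third fewer numbers per link from memory, which is the point of the "reduce memory traffic" bullet.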