Uncorrected Proof
Total Page:16
File Type:pdf, Size:1020Kb
Chemical Physics Letters LLL (2018) xxx-xxx Contents lists available at ScienceDirect Chemical Physics Letters journal homepage: www.elsevier.com )9G95F7< D5D9F An 9::=7=9BH =AD@9A9BH5H=CB C: G9A=-numerical 7CADIH5H=CB C: H<9 Hartree-Fock 9L7<5B;9 CB H<9 BH9@ '<= DFC79GGCF Fenglai #=I, !=B; "CB;⁎ Department of Chemistry and Center for Computational Science, Middle Tennessee State University, TN, 37130, United States ARTICLE INFO ABSTRACT Article history: ,B=EI9 H97<B=75@ 7<5@@9B;9G 5B8 H<9=F GC@IH=CBG :CF =AD@9A9BH=B; G9A=-numer=75@ Hartree-Fock 9L7<5B;9 CB )979=J98 30 !5BI5FM 2018 H<9 '<=@ 'FC79GGCF 5F9 8=Gcussed, 9GD97=5@@M 7CB79FB=B; H<9 G=B;@9- =BGHFI7H=CB-mulH=D@9-data HMD9 C: DFC79GG- Accepted 11 $5M 2018 =B; 5B8 GA5@@ 757<9 size. BenchA5F? 75@7I@5H=CBG CB 5 G9F=9G C: 6I7?M65@@ AC@97I@9G K=H< J5F=CIG Gauss=5B Available CB@=B9 LLL 65G=G G9HG CB 5 '<= DFC79GGCF 5B8 5 G=L-core CPU G<CK H<5H H<9 '<= DFC79GGCF DFCJ=89G 5G AI7< 5G 12 H=A9G C: GD998ID K=H< @5F;9 65G=G G9HG 7CAD5F98 K=H< H<9 7CBJ9BH=CB5@ four-cenH9F 9@97HFCB F9DI@G=CB =BH9;F5H=CB 5D- DFC57< D9F:CFA98 CB H<9 CPU. +<9 577IF57M C: H<9 G9A=-numer=75@ G7<9A9 =G 5@GC 9J5@I5H98 5B8 :CIB8 HC 69 7CAD5F56@9 HC H<5H C: H<9 F9GC@IH=CB-of-idenH=HM 5Dproach. Q 2018. 1. Introduction :CF CCSD(T) 75@7I@5H=CB [8], 5B8 H<9 =AD@9A9BH5H=CB 5@GC 57<=9J98 2–3 H=A9G speedup. &B H<9 CH<9F hand, )9=8 etc. [9] AC8=:=98 CP2K +<9 89BG=HM :IB7H=CB5@ H<9CFM (DFT) =G H<9 ACGH K=89@M 5DD@=98 'FC;F5A :CF H<9 KNC 5B8 A589 D9F:CFA5B79 7CAD5F=GCB K=H< 5 8I5@ EI5BHIA A97<5B=75@ A9H<C8G :CF 7<9A=75@ 5B8 A5H9F=5@ GHI8ies. &B9 /9CB CPU(16 cores) GMGH9A :CF 9B9F;M 75@7I@5tions, 5B8 :CIB8 H<5H K5M HC A5?9 DFT 7CADIH5H=CB ACF9 DFC8I7H=J9 =G HC H5?9 H<9 58- H<9 7C89 F5B 56CIH 5.43 H=A9G G@CK9F CB H<9 KNC H<5B CB H<9 J5BH5;9 C: B9K 7CADIH9F H97<BC@Cgies, GI7< 5G H<9 Gen9F5@-Purpose CPU-only system, 9J9B H<CI;< H<9 H<9CF9H=75@ FLOPS 7CIBH :CF H<9 Graph=7G 'FC79GG=B; ,B=HG (GPGPU) 5B8 H<9 BH9@ $5BM BH9;F5H98 8I5@ CPUs =G 56CIH 0.8 H9FaFLOPS J9FGIG 1 H9FaFLOPs :CF H<9 KNC Core (MIC) 5F7<=H97ture. +<9 GPGPU 9LH9B8G H<9 IG9 C: H<9 ;F5D<=7G DFC79Gsor. +<9 G97CB8 ;9B9F5H=CB C: '<= DFC79GGCF S"B=;<H #5B8=B;T DFC79GG=B; IB=H HC D9F:CFA ;9B9F5@ DIFDCG9 7CADIH5H=CB GI7< 5G G7=- (KNL) GIDDCFHG H<9 8=F97H 7CAD=@5H=CB 5B8 9L97IH=CB C: H<9 GC:HK5F9 9BH=:=7 75@7I@5tions. ComD5F98 K=H< 5 HF58=H=CB5@ CPU, 5 GPGPU 75F8 CB H<9 DFC79Gsor. *I7< 5B 5HH9ADH CB H<9 EI5BHIA 7<9A=GHFM GC:HK5F9 DCGG9GG9G ACF9 DFC79GG=B; 7CF9G 5B8 CJ9F5@@ <5G AI7< <=;<9F H<9CF9H- #*DalHCB G<CK98 [10] H<5H =H =G GH=@@ G9J9F5@ H=A9G G@CK9F CB H<9 "%# =75@ D95? :@C5H=B; DC=BH CD9F5H=CBG D9F G97CB8 (FLOPS). &B H<9 CH<9F H<5B 5 8I5@ /9CB CPU system. +<9 8=::=7I@HM K5G ACGH@M 5HHF=6IH98 HC <5B8 H<9 BH9@ MIC H97<BC@C;M =G 5 F989G=;B C: H<9 DF9J=CIG ;9B9F- H<9 =B9::=7=9BH IG9 C: VPUs, K<=7< 5F9 H<9 SIMD (Sin;@9 =BGHFI7tion, 5H=CB C: Celeron 5B8 '9BH=IA 7CF9G 5B8 7CB89BG9G H<9G9 7CF9G CBHC AI@H=D@9 data) DFC79GG=B; IB=HG =B H<9 '<= DFC79GGCF [1,2,11]. For 9::97- CB9 DFC79Gsor. AlH<CI;< H<9 MIC 5F7<=H97HIF9 <5G :5F :9K9F 7CF9G H<5B H=J9@M IH=@=N=B; H<9 '<= DFC79GGCF CB9 IGI5@@M B998 HC F9KF=H9 GC:HK5F9 H<9 GPGPU (curF9BH@M 56CIH 60–70 7CF9G D9F DFC79Gsor), =H =BHFC8I79G G=;B=:=75BH@M GC H<5H =H 75B 69 9::=7=9BH@M 9L97IH98 CB 5B SIMD 5F7<=- H<9 -97HCF 'FC79GG=B; ,B=HG (VPU) HC 957< core, K<=7< 75B 9L97IH9 8 H97ture. 8CI6@9-pre7=G=CB :@C5H=B; DC=BH CD9F5H=CBG D9F CPU =BGHFI7H=CB 7M7@9 B 5 DF57H=75@ DFT 75@7I@5tion, H<9 7CADIH5H=CB C: Hartree-Fock [1,2]. .=H< H<9 -', KCF?=B; 5G 5 AI@H=D@=9F H<9 MIC DFC79GGCF =G 75- (HF) 9L7<5B;9 =G H<9 ACGH H=A9-conGIA=B; 7CAponent, 9GD97=5@@M D56@9 C: DFCJ=8=B; G=A=@5F 7CADIH=B; 75D57=HM 5G H<5H C: H<9 GPGPU 8I9 HC H<9 89J9@CDA9BH C: J5F=CIG H97<B=EI9G :CF H<9 7CADIH5H=CB C: card. Coulomb 5B8 H<9 BIA9F=75@ =BH9;F5H=CB C: 9L7<5B;9-corF9@5H=CB :IB7- B 7CAD5F=GCB HC H<9 G=;B=:=75BH 89J9@CDA9BH 9::CFH A589 CB H=CB5@G [12–14]. .9 <5J9 F979BH@M DI6@=G<98 H<9 =AD@9A9BH5H=CB C: 5 GPGPU :FCA EI5BHIA 7<9A=GHFM GC:HK5F9 7CAAIB=HM [3–6], @9GG 89- G9A=-numer=75@ =BH9;F5H=CB G7<9A9 :CF 7CADIH=B; H<9 HF 9L7<5B;9 J9@CDA9BH 9::CFHG <5J9 699B F9DCFH98 :CF H<9 '<= DFC79GGCF =B 7CApu- 9B9F;M 5B8 A5HF=L K=H< G9@:-conG=GH9BH field(SCF) :CF 7CBJ9BH=CB5@ H5H=CB5@ EI5BHIA 7<9Aistry, D9F<5DG 6975IG9 =H 75A9 56CIH ACF9 F9- CPUs [15]. +<9 G7<9A9 =G G=A=@5F HC H<9 COS-X 5@;CF=H<A [16,17] 5B8 79BH@M H<5B H<9 GPGPU. #95B; 9H al. [7] GHI8=98 H<9 9::=7=9B7M C: A5- DG9I8C-specHF5@ G7<9A9 [18,19]. B H<=G KCF? K9 89G7F=69 H<9 =AD@9- HF=L CD9F5H=CBG CB H<9 :=FGH ;9B9F5H=CB C: H<9 '<= DFC79GGCF S"B=;<H A9BH5H=CB C: H<9 G7<9A9 CB H<9 MIC 5F7<=H97ture. .9 G<CK98 H<5H H<9 CorB9FT (KNC) 5B8 G<CK98 H<5H KNC 75B M=9@8 ID HC H<F99 H=A9G G9A=-numer=75@ G7<9A9 =G ACF9 9::=7=9BH :CF @5F;9 65G=G G9HG 5B8 @5F;9 GD998ID 7CAD5F98 K=H< H<9 <CGH CPU.UNCORRECTED Apra 9H 5@. 9AD@CM98 H<9 KNC AC@97I@9G 8I9 HC EI58F5H=7 G75@=B; K=H< PROOF F9GD97H H<9 65G=G G9H size. Fur- H<9Fmore, =H F9EI=F9G :9K9F H9ADCF5FM J5F=56@9G H<9F9:CF9 :=H 69HH9F H<9 '<= 'FC79GGCF H<5H <5G AI7< GA5@@9F 757<9 G=N9 H<5B 5 GH5B85F8 CPU. +<9 G9A=-numer=75@ HF 9L7<5B;9 =G 5@GC 5B 9GG9BH=5@ =B;F98=9BH :CF ⁎ Corresponding author. H<9 9A9F;=B; @C75@ <M6F=8 :IB7H=CB5@G [17,20–24]. Email address: [email protected] (J. Kong) https://doi.org/10.1016/j.cplett.2018.05.026 0009-2614/ Q 2018. 2 Chemical Physics Letters LLL (2018) xxx-xxx 2. The semi-numerical algorithm for HF exchange =BH9;F5@ 75@7I@5H=CB 5@GC =BJC@J9G 5 @CCD CJ9F H<9 7CBHF57H=CB :CF 957< D5=F C: Gauss=5B 65G=G :IB7tions. +<9 HCH5@ 7CGH C: ESP =BH9;F5@G H<9B +<9 G9A=-numer=75@ 5@;CF=H<A HC 75@7I@5H9 H<9 HF 9L7<5B;9 A5- 75B 69 5DDFCL=A5H9@M 9GH=A5H98 5G HF=L <5G 699B 8=G7IGG98 =B 89H5=@ =B CIF DF9J=CIG D5D9F [15]. .9 6F=9:@M (“6:T GH5B8G :CF S65G=G :IB7H=CBT). +<9F9:CF9 CIF =AD@9A9BH5H=CB 9:- GIAA5F=N9 =H here. :CFH 7CB79BHF5H9G CB HF5BG:9FF=B; H<9 ESP =BH9;F5@ 75@7I@5H=CB :FCA H<9 +<9 HF 9L7<5B;9 A5HF=L =G 89F=J98 H<FCI;< H<9 89F=J5H=J9 C: H<9 BCFA5@ CPU D@5H:CFA HC H<9 '<= DFC79Gsor. HF 9L7<5B;9 9B9F;M K=H< F9GD97H HC H<9 GD=B-reGC@J98 89BG=HM A5HF=L : 3. Implementation on the Phi processor &IF :=FGH HFM K=H< H<9 '<= DFC79GGCF =G HC 7CAD=@9 H<9 5:CF9A9B- H=CB98 =AD@9A9BH5H=CB C: H<9 G9A=-numer=75@ G7<9A9 K=H< H<9 BH9@ 7CAD=@9F H<5H <5G H<9 5IHCA5H98 CDH=A=N5H=CB :95HIF9 :CF H<9 BH9@ MIC (1) 5F7<=H97ture. +<9 H9GH C: H<9 F9GI@H=B; 6=B5FM CB H<9 '<= DFC79GGCF KN- L7250 =G HKC H=A9G G@CK9F H<5B FIBB=B; CB CB9 G=L-core E5-1650 CPU, G=;B=:=75BH@M IB89FD9F:CFAG K=H< F9GD97H HC H<9 DCH9BH=5@ C: H<9 K<9F9 % F9DF9G9BHG 5 ;9B9F5@ Gauss=5B 65G=G :IB7tion, ),*, ( 5B8 ' 5F9 '<= DFC79Gsor. Clearly, H<9 DFC;F5A B998G HC 69 F9KF=HH9B :CF 5 69HH9F 65G=G :IB7H=CB =Bdexes. $ 89BCH9G H<9 GD=B C: 5 $C@97I@5F orbital, 9=- IG5;9 C: H<9 '<= DFC79Gsor. B H<=G G97H=CB K9 K=@@ 8=G7IGG H<9 GD97=5@ H<9F # CF &.