
Migrating large codebases to C++ Modules

Yuka Takahashi - The University of Tokyo
Oksana Shadura - UNL
Vassil Vassilev - Princeton University

Yuka Takahashi 13.03.2019 Migrating large codebases to C++ Modules, ACAT 2019

Agenda

1. Motivation of C++ Modules
2. C++ Modules in ROOT
3. C++ Modules in CMSSW
4. CMS Performance Results
5. Conclusion

Motivation of C++ Modules

C++ Modules technology:
- Cache parsed header file information
- Avoid header re-parsing
- Avoid runtime header parsing (in ROOT)
- Part of C++20

#include can be handled in three ways:
- Textual Include
- Precompiled Headers (PCH)
- Modules

Problems of the existing approaches: Expensive, Inseparable, Fragile

Textual Include

Original code:

    #include "TVirtualPad.h"
    #include <vector>
    #include <set>
    int main() {

After preprocessing, one big file:

    …… TVirtualPad.h ……
    # 286 "/usr/include/c++/v1/vector" 2 3
    namespace std { inline namespace __1 {
    template <…> class __vector_base_common {
      __attribute__ ((__visibility__("hidden"), __always_inline__))
      __vector_base_common() {}
    ……
    # 394 "/usr/include/c++/v1/set" 3
    namespace std {inline namespace __1 {
    …
    template <…> class set {
    public:
      typedef _Key key_type;
    ……
    int main() {
    ……

The big file is then parsed and compiled into a .o file.

Textual Include

[Figure: the same .h files are textually included into several .c files, each compiled into its own .o]

1. Expensive: the same header is reparsed
2. Fragile: name collisions

    Rcpp library:   #define PI 3.14
    Users' code:    #include …
                    double PI = 3.14;   // => double 3.14 = 3.14;

PCH (Precompiled Headers)

1. Stores precompiled header information (same as modules)
2. Stored in one big file - monolithic

[Figure: a single PCH is shared by every .c to .o compilation; in ROOT, the interpreter uses one allDict.cxx.pch]

Modules

- Module files contain parsed header information
- PCMs are separated
- Each PCM file (a.pcm) corresponds to a library (liba.so)

[Figure: a.pcm, b.pcm and c.pcm each serve a .c to .o compilation; in ROOT, the interpreter uses separate files a.pcm, b.pcm, c.pcm, d.pcm, e.pcm]

Modules provide:
- Compile-time scalability
- No fragility
- Separability

C++ Modules in ROOT

Technology Preview released in ROOT 6.16

Overview - Dependency Graph

[Figure: dependency graph connecting Clang, Cling, rootcling, ROOT, ROOT Dictionaries and ROOT PCMs; the legend distinguishes binaries from files]

- Cling calls the Clang API
- rootcling generates dictionaries (rootmap, rdict)
- ROOTMAP: used to map symbols and identifiers to libraries
- RDICT: efficiently stores information needed for serialization
- Dictionaries are used at ROOT runtime

Mechanism of loading modules

Preloading of modules
- Replaces some functionality of RDICT and ROOTMAP with a more stable implementation
- Loads all ROOT modules at startup time

[Figure: RSS memory of HSimple over 4 months, Production vs. Development; 665 MB with preloading of all modules. Source: http://root-bench.cern.ch (login required)]

Global Modules Index
- Removes further overhead in ROOT
- Mechanism to create a table of symbols and PCM names
- ROOT will be able to load the corresponding library when a symbol lookup fails
- The prototype shows promising results

Mechanism of loading libraries

Bloom filter
- Hash tables of symbols are stored in the .gnu.hash section of shared object files
- ROOT can skip unnecessary libraries by reading it

[Figure: RSS memory of HSimple over 4 months, Production vs. Development; 232 MB with the Bloom filter. Source: http://root-bench.cern.ch (login required)]

C++ Modules in CMSSW

Available in CMS CXXMODULE IB

Overview - Dependency Graph

[Figure: dependency graph connecting SCRAM (build system), Genreflex (rootcling), the C++ compiler, CMS Dictionaries, CMS Libraries, CMS C++ Modules PCMs, the Modulemap and the CMS Runtime; the legend distinguishes binaries from files]

- Genreflex and GCC are executed by SCRAM
- CMS dictionaries are generated by Genreflex
- Libraries are compiled by GCC
- Not all CMS libraries were modularized
- Modules can co-exist with the old infrastructure

Explicit PCMs in CMSSW

module.modulemap
- Definition file of headers used to build a PCM in Clang
- Contains all "interface" headers, which are used by libraries

    module "MathCore" {
      module "TComplex" {
        header "TComplex.h"
        export *
      }
      module <name> {
        header …
      }
    }

The modulemap will contain all interface header files, hence the autogeneration of the modulemap.

Autogeneration of modulemap

- CMSSW has "interface" headers, which are exposed to outside libraries
- The modulemap is generated automatically by adding the interface headers
- The modulemap needs to be generated before genreflex is executed
- The build system is responsible for the autogeneration

Explicit PCMs in CMSSW

[Figure: the CMSSW dependency graph again, now including the Modulemap next to the CMS C++ Modules PCMs]

Mechanism of the modulemap
modulemap, modulemap overlay file, virtual modulemap overlay

[Figure: directory tree with /usr/include/some/directory/ holding *.h (headers), $build_dir, and a module.modulemap with the content below]

    module "header.h" {
      header "some/directory/headers.h"
    }

Modulemap for system headers
- The modulemap needs to know the location of the headers
- The location of system headers cannot be hardcoded

Solution: Modulemap Overlay File

Modulemap Overlay File

    name : "/usr/include"                                (location of system headers, generated from CMake)
    contents : [
      { name : "module.modulemap"
        external-contents : "path/to/stl.modulemap" }    (relative path to stl.modulemap)
    ]

stl.modulemap:

    module "stl" {
      module "algorithm" {
        header "algorithm"
      }
    }

[Figure: directory tree with /usr/include/some/directory/ holding *.h (headers), $build_dir, stl.modulemap and module.modulemap; the legend distinguishes directories from files]

Clang interprets this information as follows:
- module.modulemap exists in the location of the system headers (/usr/include, in this example)
- module.modulemap has the contents of stl.modulemap

The Modulemap Overlay File can handle system headers, but it needs to be generated at configuration time.

The location of system headers needs to be generated at CMake (configuration) time:
- Not binary distributable
- Not relocatable
- CMS builds and distributes binaries to other locations

Solution: Virtual Modulemap Overlay File

Virtual Modulemap Overlay File (VMOF)

    name : (dynamically determined)
    contents : [
      { name : "module.modulemap"
        external-contents : "path/to/stl.modulemap" }
    ]

- Not an actual file (a virtual file in memory)

[Figure: directory tree with /usr/include/some/directory/ holding *.h (headers), $build_dir, stl.modulemap and module.modulemap]

C++ Modules in CMSSW
Summary

- C++ Modules integration in CMSSW
- Genreflex generates pcm files
- Autogeneration of modulemap from build system
- Modulemap for system headers (libc, stl)
- Virtual Modulemap overlay file

CMS Performance Results

Each plot compares 0 pcms, 96 pcms and 121 pcms (random 100 events, 15 executions, shown with the standard error of the mean).

"Fast simulation test"
- ESetup Lock (seconds): 22.5 seconds better than ROOT Master
- ESetup Get (seconds): 15.2 seconds better than ROOT Master

"Digitization test"
- CPU Total Loop (seconds): 331 seconds better than ROOT Master
- RSS Memory (MBytes): 143 MBytes worse than ROOT Master

CMS Performance Results
Summary

- Results suggest performance benefits at runtime, especially at initialization time
- ~150 MBytes RSS overhead
- Investigation is ongoing

Conclusion

- C++ Modules were implemented and tested in ROOT and in CMSSW
- They improve the header modularity of libraries
- A preliminary performance study suggests a performance improvement at runtime
- Work on performance improvement is ongoing

Thank you for your attention!

Backup slides

C++ Modules in CMSSW
Implicit PCMs in CMSSW

Implicit pcms: implicitly generated without modulemaps
- Add all possible header files needed for the generation of the dictionary
- Huge header duplication

Explicit pcms: explicitly generated with modulemaps
- Only add defined headers to the PCM
- Reduce header duplication
