
The ATLAS (Offline) Software

Attila Krasznahorkay

Preliminaries

• ATLAS’s software infrastructure is quite complicated
  • On the whole there’s no single piece of software to talk about
• This is an introduction to the ATLAS offline software infrastructure
  • The analysis software will be discussed on Tuesday
  • And the trigger software on Thursday
• This talk is not specific to Run 2
  • While many aspects of the offline software changed for Run 2, the basics didn’t, and the basics are what will be discussed here.


Athena

• Is the name of the software framework in which (offline) data processing is performed
• Is based on Gaudi, originally developed by the LHCb collaboration
• Is used for nearly all aspects of offline (and even online) data processing
  • Simulation, reconstruction, the High Level Trigger, and even data analysis tasks
• Most of the code is written in C++
  • But all of the scripting and configuration (and even some algorithmic code) is in Python

Athena: Greek goddess of wisdom, war, the arts, industry, justice and skill; sprang fully grown out of her father’s (Zeus’s) head

Gaudi: Antoni Gaudi, Barcelona architect 1852 – 1926, famed for the design of Barcelona’s La Sagrada Familia, an immense Basilica which has been under construction since the beginning of the 20th century and is not scheduled for completion until 2026

Software Framework

• A skeleton into which software developers plug in their code
• Defines a high level architecture for the organisation of the software
• Provides functionality common to several applications
• Provides communication between different components of the software
• Controls the configuration, loading and execution of the code that the software developer provides
• In the end an absolute necessity once >10 developers have to write code for a common goal…
• A generic description of the basic idea can be found at:
  • http://en.wikipedia.org/wiki/Software_framework

What Are We Doing?

• We are dealing with a lot of structured data
• We have individual events that can be treated independently of each other, and that need to go through several steps during the data processing
  • Simulation, reconstruction, analysis
• We have metadata (data about data) for the events that we simulate and record
  • Which trigger stream the event came from
  • Is it simulated data?
  • Which trigger menu was used
  • Which detector geometry was used
  • …
• So the framework needs to load the correct metadata for each processed event, and then process it with the code that the developers wrote

Basic Concepts

Algorithm

• Is pretty much the simplest concept:
  • A piece of code that initialises itself before the event processing (initialize)
  • Some code that does something using the event data during event processing (execute)
  • Finally, some code that possibly does something after the event processing has finished (finalize)

Algorithm

• In Athena we implement algorithms in C++ very simply like:

    #include "AthenaBaseComps/AthAlgorithm.h"

    class MyAlgorithm : public AthAlgorithm {

    public:
       MyAlgorithm( const std::string& name, ISvcLocator* svcLoc )
          : AthAlgorithm( name, svcLoc ) {}

       // Called once, before the event loop starts
       StatusCode initialize() {
          // Do something…
          return StatusCode::SUCCESS;
       }
       // Called once for every event
       StatusCode execute() {
          // Do something…
          return StatusCode::SUCCESS;
       }
       // Called once, after the event loop has finished
       StatusCode finalize() {
          // Do something…
          return StatusCode::SUCCESS;
       }
    };

Algorithm

• Is implemented with a C++ class that inherits from one of the algorithm base classes
  • Generic algorithms inherit from AthAlgorithm
  • Event filtering algorithms inherit from AthFilterAlgorithm
  • Histogram filling algorithms inherit from AthHistogramAlgorithm
  • A few more base classes can be found in the AthenaBaseComps package (Control/AthenaBaseComps)
• The Athena framework takes care of:
  • Running the initialize() function once before the event loop starts
  • Running the execute() function once per event
  • Running the finalize() function once after the event loop is finished
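The division of labour described above (the framework calls initialize once, execute once per event, and finalize once) can be sketched with a toy event loop. This is a minimal Python sketch and not Athena code: the Algorithm base class, the EventCounter example and the run_job function are hypothetical names made up for illustration, and plain booleans stand in for StatusCode.

```python
class Algorithm:
    """Toy stand-in for an algorithm base class (not the Athena API)."""

    def __init__(self, name):
        self.name = name

    def initialize(self):
        return True

    def execute(self, event):
        return True

    def finalize(self):
        return True


class EventCounter(Algorithm):
    """Hypothetical user algorithm: counts events and sums a toy quantity."""

    def initialize(self):
        self.n_events = 0
        self.total_energy = 0.0
        return True

    def execute(self, event):
        self.n_events += 1
        self.total_energy += event["energy"]
        return True


def run_job(algorithm, events):
    """Drive one algorithm: initialize once, execute per event, finalize once."""
    if not algorithm.initialize():
        return False
    for event in events:
        if not algorithm.execute(event):
            return False
    return algorithm.finalize()


counter = EventCounter("MyCounter")
ok = run_job(counter, [{"energy": 10.0}, {"energy": 2.5}, {"energy": 7.5}])
print(ok, counter.n_events, counter.total_energy)
```

The user code never contains the loop itself; it only fills in the three hooks, which is the point of having a framework.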

Reporting Back

• Most functions in Athena tell their caller whether they succeeded or not, by returning a StatusCode object
  • A function will not always be able to do what was asked of it. If it can fail, it needs to report back about its success or failure using StatusCode.
• StatusCode is a very simple C++ class that has a few states, and knows whether it was checked or not
  • Not explicitly checking the StatusCode returned by a function is an error, which stops the execution of an Athena job.
  • Checking can be done by using one of the object’s functions (isSuccess()), or by comparing it to a value (in an if(…) statement)
• The possible values of a StatusCode are:
  • StatusCode::SUCCESS: The function succeeded
  • StatusCode::FAILURE: The function failed, and the job needs to stop
  • StatusCode::RECOVERABLE: The function failed, but the job could continue with the next event. (After discarding the current one.)
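The idea of a return value that "knows whether it was checked" can be illustrated with a small toy model. This is only a sketch of the concept in Python, not the real StatusCode class: ToyStatusCode, its checked attribute and the open_input function are hypothetical, and how the real framework detects and reports unchecked codes is not modelled here.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = 0
    FAILURE = 1
    RECOVERABLE = 2


class ToyStatusCode:
    """Toy status object that remembers whether anybody looked at it."""

    def __init__(self, status):
        self._status = status
        self.checked = False

    def is_success(self):
        self.checked = True  # asking the question counts as checking
        return self._status is Status.SUCCESS

    def is_recoverable(self):
        self.checked = True
        return self._status is Status.RECOVERABLE


def open_input(path):
    """Hypothetical function that can fail and reports back to its caller."""
    if path.endswith(".root"):
        return ToyStatusCode(Status.SUCCESS)
    return ToyStatusCode(Status.FAILURE)


ignored = open_input("xAOD.pool.root")   # returned, but never inspected
inspected = open_input("notes.txt")
failed = not inspected.is_success()      # inspecting marks it as checked
print(ignored.checked, inspected.checked, failed)
```

A framework can use such a flag to find callers that silently drop a failure, which is the behaviour the slide describes for Athena.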

• Note on StatusCode::RECOVERABLE: hardly ever used…

Algorithm Sequence

• The code is never all put into a single algorithm
  • Instead we write multiple algorithms that need to be executed one after the other
  • Possibly running the same algorithm in multiple instances with different configurations
• An algorithm sequence is pretty much just a list of algorithms that Athena runs in the specified order
  • The order applies to the execution of all 3 main functions of the algorithms
• An Athena job means more or less the execution of the main algorithm sequence (called “top sequence”)

Algorithm Sequence

[Figure: the top sequence, holding CaloClusterRecoAlg, IDTrackRecoAlg and ElectronRecoAlg]

Algorithm Sequence

• The algorithm sequence class (AthSequencer) is itself an algorithm!
  • You can add an algorithm sequence to another algorithm sequence

[Figure: the top sequence holds Alg1, a filter sequence (FilterAlgorithm followed by ConditionalAlgorithm) and Alg2. ConditionalAlgorithm doesn’t run if FilterAlgorithm didn’t “accept” the event.]
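The nested filter sequence from the figure can be mimicked with a few toy classes. This is a hedged Python sketch, not AthSequencer: the Alg, PtFilter and Sequence classes are hypothetical, a boolean return value stands in for the "accept" decision, and it is an assumption of this sketch that a rejected event stops only the filter sequence while the rest of the top sequence (Alg2) still runs.

```python
class Alg:
    """Toy algorithm: records that it ran and always accepts the event."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def execute(self, event):
        self.log.append(self.name)
        return True  # True means "event accepted"


class PtFilter(Alg):
    """Hypothetical filter: accepts only events above a pt threshold."""

    def __init__(self, name, log, min_pt):
        super().__init__(name, log)
        self.min_pt = min_pt

    def execute(self, event):
        self.log.append(self.name)
        return event["pt"] >= self.min_pt


class Sequence(Alg):
    """A sequence is itself an algorithm, so sequences can be nested."""

    def __init__(self, name, log, members):
        super().__init__(name, log)
        self.members = members

    def execute(self, event):
        for member in self.members:
            if not member.execute(event):
                break  # assumption: stop this sequence only, not its parent
        return True


log = []
filter_seq = Sequence("FilterSeq", log, [
    PtFilter("FilterAlgorithm", log, min_pt=20.0),
    Alg("ConditionalAlgorithm", log),
])
top = Sequence("TopSequence", log, [Alg("Alg1", log), filter_seq, Alg("Alg2", log)])

top.execute({"pt": 5.0})    # rejected by the filter
rejected_run = list(log)
log.clear()
top.execute({"pt": 50.0})   # accepted by the filter
accepted_run = list(log)
print(rejected_run)
print(accepted_run)
```

Because Sequence derives from Alg, the top sequence does not need to know whether one of its members is a single algorithm or a whole nested sequence.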

Whiteboard

• Algorithms don’t know about each other, they don’t talk directly to each other
  • How do they exchange information then?
• Via a whiteboard
  • All (event) data is published on a central whiteboard, including the data read in from an input file
  • Each algorithm can retrieve the objects that it needs from the whiteboard (retrieve)
  • If an algorithm produces new data, it puts it on the whiteboard as well (record)
• In Athena the class implementing the whiteboard is StoreGateSvc
  • There are multiple instances of the whiteboard for different purposes
  • “StoreGateSvc” holds event data, and is cleaned at the end of each event
  • “DetectorStore” holds detector information, and keeps its contents between events
  • “MetaDataSvc” holds metadata about the event being processed at the moment, and normally remains unchanged for many events

Whiteboard

• In an algorithm’s code you use StoreGateSvc like:

    StatusCode MyAlgorithm::execute() {

       // evtStore() and detStore() are convenient accessors to the
       // different StoreGateSvc instances. ATH_CHECK is a helper macro
       // to check the returned StatusCode objects.

       // Retrieve an existing object from the event store:
       const CaloClusterContainer* clusters = 0;
       ATH_CHECK( evtStore()->retrieve( clusters, "Clusters" ) );

       // Record a new object into the event store:
       MyObject* obj = new MyObject( 3.141592 );
       ATH_CHECK( evtStore()->record( obj, "MySuperObject" ) );

       // Retrieve an object from the detector store:
       const DetectorInfo* info = 0;
       ATH_CHECK( detStore()->retrieve( info, "SomeDetectorInfo" ) );

       return StatusCode::SUCCESS;
    }
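To make the whiteboard pattern itself concrete, here is a toy keyed store in Python. It is only a sketch of the idea and not StoreGateSvc: the Whiteboard class is hypothetical, it ignores object types, and refusing to overwrite an existing key is an assumption of this sketch. The event store is cleared after every event, while the detector store keeps its contents.

```python
class Whiteboard:
    """Toy keyed object store (a stand-in, not StoreGateSvc itself)."""

    def __init__(self):
        self._objects = {}

    def record(self, obj, key):
        """Publish obj under key; refuse to overwrite an existing entry."""
        if key in self._objects:
            return False
        self._objects[key] = obj
        return True

    def retrieve(self, key):
        """Return the object stored under key, or None if nothing is there."""
        return self._objects.get(key)

    def clear(self):
        self._objects.clear()


evt_store = Whiteboard()   # event data, cleared after each event
det_store = Whiteboard()   # detector information, kept between events
det_store.record({"n_layers": 4}, "SomeDetectorInfo")

sums = []
for clusters in ([1.0, 2.0], [5.0]):
    # Data "read from the input file" is published first:
    evt_store.record(clusters, "Clusters")
    # A producer retrieves it and records a new object:
    evt_store.record(sum(evt_store.retrieve("Clusters")), "ClusterSum")
    # A consumer only needs to know the key, not the producer:
    sums.append(evt_store.retrieve("ClusterSum"))
    evt_store.clear()

print(sums, det_store.retrieve("SomeDetectorInfo"))
```

The producer and the consumer never reference each other; the key on the whiteboard is their only contract.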

Data Flow

[Figure: data flow diagram (slide by Karsten Köneke)]

Interface/Factory

• In ATLAS we adopted a component-based software engineering model
  • See: http://en.wikipedia.org/wiki/Component-based_software_engineering
• This allows us to reduce the dependency between different parts of the software
  • We define a generic interface class in a central place
  • All clients only make use of this generic interface, and ask the framework to give them an object that implements this interface
  • The framework creates the requested objects using a factory model
• It is used already for the algorithms. The framework calls the algorithm functions using the IAlgorithm interface class.
  • But the concept will be much more important for the component types discussed next…
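The interface/factory idea can be sketched in a few lines of Python. This is a generic illustration of the pattern under stated assumptions, not how Gaudi or Athena implement it: the IGreeter interface, the register decorator, the create function and the PoliteGreeter component are all hypothetical names.

```python
from abc import ABC, abstractmethod


class IGreeter(ABC):
    """Generic interface, defined in one central place."""

    @abstractmethod
    def greet(self, who):
        ...


_FACTORIES = {}  # toy "framework" registry: type name -> class


def register(type_name):
    """Class decorator that tells the toy framework how to build a component."""
    def decorator(cls):
        _FACTORIES[type_name] = cls
        return cls
    return decorator


def create(type_name):
    """Build a component by name; the client never names the concrete class."""
    return _FACTORIES[type_name]()


@register("PoliteGreeter")
class PoliteGreeter(IGreeter):
    """One concrete implementation, which could live in another package."""

    def greet(self, who):
        return "Good day, " + who


# Client code depends only on IGreeter and on a configurable string:
component = create("PoliteGreeter")
message = component.greet("ATLAS")
print(message)
```

Swapping in a different implementation then means changing the string given to create(), typically from the job configuration, without touching the client code.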

Service

• There are pieces of code that provide a service to other software components
  • We saw one already: StoreGateSvc
• Athena initialises and finalises a service, but doesn’t make it do something for every event
  • i.e. it doesn’t have an execute() function
• You usually make use of it through an interface directly
  • But some services just do useful things behind the scenes, without user code ever accessing them
• A lot of basic services are provided by the framework. (And of course you can implement others.) For instance:
  • MessageSvc: Handles the printing of text messages from all components of the software
  • THistSvc: Handles the management of ROOT histograms and trees
  • …

Tool

• There are often tasks that we want to perform multiple times per event, and from multiple algorithms
  • For instance to extrapolate a track from the inner detector to the calorimeter
• A tool is similar to a service in that it is only initialised and finalised by the framework; everything else is done by the user asking the tool to do some work
• The user interacts with a tool through an interface, implementing the component model discussed previously

Tool

    // The interface that clients of the tool use:
    class IMyTool : public virtual IAlgTool {
    public:
       virtual StatusCode process( xAOD::Muon& mu ) = 0;
    };

    // The tool implementing the interface:
    class MyTool : public virtual IMyTool, public AthAlgTool {
    public:
       MyTool( const std::string& type, const std::string& name,
               IInterface* parent )
          : AthAlgTool( type, name, parent ) {}

       virtual StatusCode process( xAOD::Muon& mu ) {
          // Do something with the muon…
          return StatusCode::SUCCESS;
       }
    };

Tool

• Most of the offline code (I think…) is implemented in tools
  • So it’s good to get familiar with them if you want to use ATLAS code
• Tools can be instantiated by Athena either as private or public tools
  • Public tools are shared by multiple components
  • Private tools are owned by another component, and can’t be accessed by other components
  • Could discuss the difference in more detail during the hands on tutorials…
• Some tools are designed such that they can be used outside of the Athena framework as well
  • These are mostly tools meant for data analysis. They can be used “by hand” in a very lightweight environment as well.
  • See tomorrow’s tutorials for more details

Package/Build Manager

• We organise the ATLAS software into software packages
  • A software package is a set of C++/Python files that are compiled into libraries/executables
• Packages depend on each other
  • If in your algorithm you want to use a tool, your package depends on the package holding the interface of that tool
  • The dependencies define in a large part how the C++ code needs to be compiled
  • Such that it would find all the necessary header files for instance
• Athena packages are compiled using CMT
  • It’s a package management software that allows us to describe the build rules of the different packages in a relatively simple manner
  • In the analysis code we rather use RootCore. More tomorrow…
  • There is an effort to replace CMT with CMake in the future…

Software Release

• ATLAS’s offline software is made up of >2000 packages!
  • Where each package is developed by a different set of people, on differing timescales, following slightly different software revisioning tactics
• We create an overall offline software release from these >2000 packages by taking a particular snapshot (subversion tag) of each of them, and compiling those against each other
  • So, for instance offline release 20.0.1 is composed of Package1-00-00-02, Package2-01-03-04, etc.
• This is a concept used with the analysis software as well
  • Although in a much simpler way than for the offline software
  • See tomorrow’s tutorials for more details

Persistency

Objects in Memory and on Disk

• The objects put on the whiteboard can range from very simple (a class with a few primitive member variables) to very complex (a track that has links to each of the measurement points that it was made from)
• We need to be able to write such objects into a file on disk, and then to read them back into memory in another application
  • Otherwise the entire data processing would need to happen in one processing step
• ATLAS uses ROOT’s persistency system to achieve this
  • See https://root.cern.ch/drupal/content/inputoutput for a generic introduction
  • Will discuss in some more detail in the xAOD EDM tutorial tomorrow
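The round trip from an object in memory to a file on disk and back can be illustrated with a small Python sketch. It deliberately does not use ROOT: JSON serves here only as a stand-in for the persistency system, and the Hit and Track classes, the conversion helpers and the file name are hypothetical.

```python
import json
import os
import tempfile


class Hit:
    def __init__(self, layer, position):
        self.layer = layer
        self.position = position


class Track:
    def __init__(self, pt, hits):
        self.pt = pt
        self.hits = hits  # links to the measurements the track was made from


def track_to_dict(track):
    """Convert the in-memory object into a plain, storable representation."""
    return {
        "pt": track.pt,
        "hits": [{"layer": h.layer, "position": h.position} for h in track.hits],
    }


def track_from_dict(data):
    """Rebuild the in-memory object from its stored representation."""
    hits = [Hit(h["layer"], h["position"]) for h in data["hits"]]
    return Track(data["pt"], hits)


# "First application": write the object to disk.
track = Track(42.0, [Hit(0, 1.5), Hit(1, 3.0)])
path = os.path.join(tempfile.mkdtemp(), "event.json")
with open(path, "w") as output_file:
    json.dump(track_to_dict(track), output_file)

# "Second application": read it back into memory.
with open(path) as input_file:
    restored = track_from_dict(json.load(input_file))

print(restored.pt, [(h.layer, h.position) for h in restored.hits])
```

The restored track is a new object with the same content, which is what allows one processing step to continue where an earlier one stopped.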

Event Selectors

• In Athena we use “event selectors” to read in objects from input files, and put them on the whiteboard
• There are a few different event selector types based on the input file type
  • ByteStream input: This selector constructs C++ objects from the data coming out of the detector
  • POOL input: This selector reads in objects from ROOT files, as discussed before
  • NTuple input: This selector puts simple objects (primitive types, and vectors of primitives) from simple ROOT ntuple files into StoreGate
• As with everything else, the job’s configuration decides what input files will be read, and how exactly

Fitting It All Together

The Application Manager

• An Athena job is operated by a finite state machine, which is a general concept in computing
  • See: http://en.wikipedia.org/wiki/Finite-state_machine
• The application manager is the core of Athena
  • It configures and steers everything

[Figure: application manager diagram by Karsten Köneke. Labels: Application Manager (state machine), Configuration Manager, User Configuration Files, Python interface (scriptable/interactive), Algorithms (initialize(), execute(), finalize()), Services, AlgTools, Data Converter, Transient Data Store, Persistent Storage, External Libraries. States: configure | initialize | execute (n) | finalize.]
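The state sequence listed for the application manager (configure, initialize, execute n times, finalize) can be written down as a small finite state machine. This is a toy Python sketch, not the real application manager: the class name, the state names and the transition table are assumptions chosen for illustration.

```python
class ToyApplicationManager:
    """Toy state machine; state and transition names are illustrative only."""

    # (current state, requested transition) -> next state
    TRANSITIONS = {
        ("offline", "configure"): "configured",
        ("configured", "initialize"): "initialized",
        ("initialized", "execute"): "initialized",  # may repeat n times
        ("initialized", "finalize"): "finalized",
    }

    def __init__(self):
        self.state = "offline"
        self.history = []

    def fire(self, transition):
        """Perform one transition, refusing anything the table doesn't allow."""
        key = (self.state, transition)
        if key not in self.TRANSITIONS:
            raise RuntimeError("cannot %s while %s" % (transition, self.state))
        self.state = self.TRANSITIONS[key]
        self.history.append(transition)

    def run(self, n_events):
        """Walk through a complete job."""
        self.fire("configure")
        self.fire("initialize")
        for _ in range(n_events):
            self.fire("execute")
        self.fire("finalize")


app = ToyApplicationManager()
app.run(n_events=2)
print(app.state, app.history)
```

The transition table is what guarantees the ordering: asking a freshly created manager to execute before it was configured and initialised raises an error instead of running user code in an undefined state.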

Setting Up the Environment

• Running offline (or analysis) jobs requires a careful setup of the runtime environment
  • To tell the shell which directories to take shared libraries from, where to find executables, which database servers to connect to, etc.
  • -> We use helper code (AtlasSetup) to do this for us
• On a “properly configured” machine you do this like:

    # cd myWorkDirectory/
    # setupATLAS
    # asetup AtlasProduction,20.0.0.1,here

• In the asetup arguments, AtlasProduction is the project name and 20.0.0.1 is the release version; for “here”, see later…

Running an Athena Job

• After setting up your environment, the incantation is usually something as simple as:

    # cd myWorkDirectory/
    # athena.py myJobOptions.py 2>&1 | tee athena.log

• i.e. we use an executable called athena.py, and give it a configuration file to describe what the job should do
• The final part just saves the text output from the job into a log file, which is a very good habit to get into for all jobs.

JobOptions?

• Athena jobs are configured using Python scripts
  • In which we often make use of a lot of Python’s capabilities
  • Will be further discussed in the hands on part
• But a simple jobOption would look something like this:

    # Set up the reading of an xAOD file:
    import AthenaPoolCnvSvc.ReadAthenaPool
    ServiceMgr.EventSelector.InputCollections = [ "xAOD.pool.root" ]

    # Access the top algorithm sequence:
    from AthenaCommon.AlgSequence import AlgSequence
    topSequence = AlgSequence()

    # Add a user algorithm to the job:
    topSequence += CfgMgr.MySuperAlgorithm( "MyAlgorithm" )

    # Specify how many events to process from the input ("theApp" is the
    # application manager instance):
    theApp.EvtMax = 200

Summary

• This was just a very quick run through the main aspects of the ATLAS offline software
  • Hopefully you’ll get more familiar with these concepts throughout the week
• Remember that the offline software has many millions of lines of C++ and Python code (which cost >$200M to develop)
  • I still occasionally find surprising things in it after many years of working with it
• But once you’re familiar with the basics, you should be able to understand all parts of the ATLAS code (after looking at it for a while…)
• You should definitely ask us questions throughout the week as you bump into them