
MC4QCD: web-based analysis and workflow tool for Lattice QCD

Massimo Di Pierro
School of Computing, DePaul University, Chicago

Outline

The problem (analysis of lattice QCD data)
The solution (MC4QCD)
The architecture and tools used (web2py + matplotlib)

The problem

Goal: compute masses, lifetimes, and other properties of hadronic matter from first principles (QCD).

This is done via large-scale MCMC computation:
The QCD Lagrangian is Wick rotated into Euclidean time.
Euclidean space-time is discretized on a lattice.
The Feynman path integral is approximated with a finite sum.
The paths (field configurations in space-time) are generated via MCMC.
Observables are mapped into correlation functions.

The MCMC process

[Figure: a proton, ~2 fm]

The MCMC process

[Figure: field configurations 1, 2, ..., 1000]

A Feynman path is a gluonic field configuration in 4D space-time.
Storage: ~1000 x 64 x 32^3 x 4 x 3^2 x 8 B ~ 1 TB
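The storage estimate above can be checked with a few lines of arithmetic. This is only a sketch: the variable names and the meaning attached to each factor (number of configurations, lattice sites, link directions, colour-matrix entries, bytes per entry) are an interpretation of the formula, not something stated on the slide.

```python
# Factors copied from the estimate above; the meaning attached to each one
# is an interpretation, not something the slide states.
n_configurations = 1000      # gauge configurations in the ensemble
lattice_sites = 64 * 32**3   # time extent x spatial volume
links_per_site = 4           # one link per space-time direction
entries_per_link = 3**2      # entries of a 3x3 colour matrix
bytes_per_entry = 8          # as written in the estimate

total_bytes = (n_configurations * lattice_sites * links_per_site
               * entries_per_link * bytes_per_entry)
print(f"{total_bytes / 1e12:.2f} TB")
```

With these factors the product comes to roughly 0.6 TB, which is the same order of magnitude as the ~1 TB quoted.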

[Figure: gauge configurations 1, 2, ..., 1000 passing through Steps 1 to 3]

Step 1: Paths are generated from a Markov Chain Monte Carlo. The algorithm encodes the laws of dynamics (QCD).
Step 2: On each path (gauge configuration) we measure multiple quantities of interest. The output is unstructured text files; everybody has their own format.
Step 3: The results of the measurements are combined and averaged to produce results (m, t, ...) with their statistical errors, and plots. This analysis step is where MC4QCD comes in.

Goals

Automate the workflow and analysis process (Step 3)
Allow physicists to collaborate more by sharing data, processes, results, and comments
Keep track of previous analyses and results

Log files patterns and regex

[Figure: stacked log files, one block per gauge configuration (0, 1, 2, 3, ..., 1000), each with the layout below]

loading gauge configuration 0
2pt[00] = 1.913939
2pt[01] = 1.537977
...
2pt[15] = -1.029314
3pt[00][00] = 5.203888
3pt[00][01] = 4.372048
...
3pt[15][15] = 4.372048
...
comments ...

Log files patterns and regex

[Figure: the same log files, with example queries next to them]

Queries:
"2pt[00]"
"2pt[]"
log("2pt[]")
"3pt[][]"/("2pt[]"*"2pt[]")

Pattern Syntax

log("2pt[]")

"2pt[...]" defines the repeated pattern to look for, e.g. 2pt[24] = 0.1252345
The index in the brackets defines a variable t to be parsed
log(...) is a function to be applied after the average

Pattern Explained

log("2pt[]")

Finds all instances of the pattern, e.g. 2pt[24] = 0.1252345
For each instance, determines t (24) and 2pt[t] = 0.125...
For each t and each instance, computes log(2pt[t])
Averages the results and computes errors using bootstrap
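The steps above can be sketched in plain Python. This is not MC4QCD's code: the regular expression, the function names `parse` and `bootstrap`, and the sample values for configurations 1 and 2 are all hypothetical. The sketch also assumes that log is applied to the mean of each bootstrap resample, following the statement that the function is applied after the average.

```python
import math
import random
import re
from collections import defaultdict

# Hypothetical log excerpt in the layout shown on the slides
# (values for configurations 1 and 2 are made up for illustration).
LOG = """\
loading gauge configuration 0
2pt[00] = 1.913939
2pt[01] = 1.537977
loading gauge configuration 1
2pt[00] = 1.902145
2pt[01] = 1.541230
loading gauge configuration 2
2pt[00] = 1.921873
2pt[01] = 1.529954
"""

# Regex standing in for the "2pt[...]" pattern: captures t and the value.
PATTERN = re.compile(
    r"^2pt\[(\d+)\]\s*=\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*$",
    re.MULTILINE,
)

def parse(log_text):
    """Return {t: [value from each configuration, in file order]}."""
    samples = defaultdict(list)
    for t, value in PATTERN.findall(log_text):
        samples[int(t)].append(float(value))
    return dict(samples)

def bootstrap(values, func, n_resamples=200, seed=0):
    """Estimate func(mean(values)) and its bootstrap standard error."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_resamples):
        # Resample the configurations with replacement, then average.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(func(sum(resample) / n))
    center = func(sum(values) / n)
    mean_est = sum(estimates) / n_resamples
    error = math.sqrt(
        sum((e - mean_est) ** 2 for e in estimates) / (n_resamples - 1)
    )
    return center, error

samples = parse(LOG)
results = {t: bootstrap(v, math.log) for t, v in samples.items()}
for t, (value, error) in sorted(results.items()):
    print(f"log(2pt[{t}]) = {value:.4f} +/- {error:.4f}")
```

Three configurations are far too few for a meaningful error estimate; the sample is kept small only so the flow from pattern to parsed values to bootstrap is easy to follow.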

[Figure: the log files and example queries, repeated]

Workflow

Plots: raw and autocorrelations

"2pt[]"

Plot: moving averages

"2pt[]"

Plot: bootstrap samples

"2pt[]"

Plot: aggregate with errors

"2pt[]"

Minkowski time U(t)|s> = exp(iHt)|s> = exp(imt)|s>

Euclidean time (after Wick rotation) U(t)|s> = exp(-Ht)|s> = exp(-mt)|s>
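The Euclidean form is what makes a mass readable from a correlator: if the signal behaves like a*exp(-m*t), its logarithm is a straight line in t with slope -m. The sketch below illustrates this on a synthetic, noiseless correlator. The values of a and m and all variable names are illustrative assumptions, and the straight-line least-squares fit is a simple stand-in, not the fitting method MC4QCD uses.

```python
import math

# Hypothetical noiseless correlator C(t) = a * exp(-m * t).
a_true, m_true = 2.0, 0.1
ts = list(range(16))
corr = [a_true * math.exp(-m_true * t) for t in ts]

# Effective mass from neighbouring time slices: log(C(t) / C(t+1)) = m.
m_eff = [math.log(corr[t] / corr[t + 1]) for t in range(len(ts) - 1)]

# Least-squares straight line through (t, log C(t)):
# slope = -m, intercept = log(a).
ys = [math.log(c) for c in corr]
n = len(ts)
t_mean = sum(ts) / n
y_mean = sum(ys) / n
slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
         / sum((t - t_mean) ** 2 for t in ts))
intercept = y_mean - slope * t_mean
m_fit, a_fit = -slope, math.exp(intercept)

print(f"m = {m_fit:.4f}, a = {a_fit:.4f}")
```

On real data the points carry statistical errors and contributions from excited states, so the fit would need weights and a restricted range of t.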

Fits

"2pt[]"

Fitting features

Non-linear fits: "a*exp(-m*t)@a=1,m=0.1"
Bayesian fits: "a*exp(-m*t)@a=1,a_bu=0.2,m=0.1"
Correlated fits: "{1:a,2:b}[s]*exp(-m*t)@a=1,b=2,m=0.1"

"3pt[][]"/("2pt[]"*"2pt[]")

Advantages

Very general and easy to use
Users can share their data and their results (optional)
Users can easily find their data
Users can repeat old analyses on new data
Users can comment on each other's data and results, including LaTeX expressions
Improves collaboration and reduces development time
All functions can be scripted for batch processing

Architecture

[Figure: architecture diagram with the labels: web forms, MC4QCD (MVC), security, web2py, matplotlib, auth, session, Python, cache, Windows/Mac/...]

web2py?

web2py includes its own web server, its own database, and a web-based IDE

web2py requires no installation and no configuration

web2py provides a Database Abstraction Layer (generates SQL dynamically for SQLite, PostgreSQL, MySQL, Oracle, MSSQL, FireBird, Informix, DB2, and Ingres or the Google App Engine). It writes SQL for you.
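To make "it writes SQL for you" concrete, here is a toy sketch of the idea using only the Python standard library. It is not web2py's DAL and does not use its API: the helper `create_table_sql` and the table and column names are hypothetical, and a real abstraction layer also handles dialect differences, migrations, and queries.

```python
import sqlite3

def create_table_sql(table, fields):
    """Build a CREATE TABLE statement from (name, sql_type) pairs."""
    columns = ", ".join(f'"{name}" {sql_type}' for name, sql_type in fields)
    return (f'CREATE TABLE "{table}" '
            f'(id INTEGER PRIMARY KEY AUTOINCREMENT, {columns})')

# The table is described as data; the SQL text is generated from it.
sql = create_table_sql(
    "dataset",
    [("name", "TEXT"), ("description", "TEXT"), ("public", "INTEGER")],
)

conn = sqlite3.connect(":memory:")
conn.execute(sql)
conn.execute(
    'INSERT INTO "dataset" (name, description, public) VALUES (?, ?, ?)',
    ("run1", "test", 0),
)
rows = conn.execute('SELECT name, public FROM "dataset"').fetchall()
print(sql)
print(rows)
```

The point is only that the application describes its tables in Python and never spells out SQL by hand.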

web2py automatically generates forms from SQL tables

web2py handles security (Input validation / Role Based Access Control)

web2py has an API for the semantic web

web2py enforces good programming practices

(Model-View-Controller, postbacks, validation)

Model View Controller

web2py separates data representation (Model)

from data presentation (View)

from application logic and workflow (Controller)

MC4QCD follows the Model-View-Controller pattern

Example of Model

# Model: one record per uploaded dataset
db.define_table('dataset',
    Field('name', requires=IS_NOT_EMPTY()),
    Field('description', 'text'),
    Field('public', 'boolean', default=False),
    Field('file', 'upload', requires=IS_NOT_EMPTY()),
    Field('created_on', 'datetime', default=request.now, writable=False, readable=False),
    Field('created_by', db.auth_user, default=auth.user_id, writable=False, readable=False))

Example of Controller / View

http://...../create_dataset

# Controller: only logged-in users reach the auto-generated create form
@auth.requires_login()
def create_dataset():
    return dict(form=crud.create(db.dataset))

{{extend 'layout.html'}}
Upload new dataset
....
{{=form}}

Controller interface to matplotlib

http://...../plot

@auth.requires_login()
def plot():
    # one list of points per named series
    data = {"series1": [(0,0), (0,1), ....],
            "series2": [(0,0), (0,1), ....],
            "series3": [(0,0), (0,1), ....]}
    return matplotlib(title="my plot", data=data)

Conclusions

Web apps can be more than a way to present data.
They can allow users to interact with data.
They can allow users to collaborate better.
MC4QCD provides an example of a scientific application built with web2py.
web2py and matplotlib make it very easy to build this kind of app.