RESTful for Scientific Computing in Django

Shreyas Cholia - [email protected]

Annette Greiner - [email protected]

Outreach, Software and Programming Group

NERSC - LBNL

SciPy 2011, Austin, TX

Thursday, July 21, 2011

NERSC

• National Energy Research Scientific Computing Center

• DOE Office of Science HPC User Facility at Lawrence Berkeley Lab

• Provides high performance compute, data, network and information services to scientists across the world

NERSC Resources

• Multiple HPC clusters: Hopper, Franklin, Carver, Magellan, Euclid, PDSF

• HPSS archival storage system

• Global File System

Web Gateways

• Old way - SSH + command line + batch system

• People now expect web interfaces for everything

• Usability - scientific computing should be as easy as online banking

• Users don't want generic options/tools that are not applicable to their science

• Users don't want to deal with the backend, UNIX CLI, etc.

Motives

• Make it very easy for science teams to build web gateways to their data and computation

• We have already built several science-specific gateways and want to encapsulate their common patterns

Web Stack

• Browser + AJAX

• REST

• JSON

Web Frameworks

• Python has a number of very powerful web frameworks - Django, Pylons/Pyramid ...

• Model-View-Controller pattern

• Separation of:

  • data model

  • routing/request processing

  • templates

• DRY - Don’t Repeat Yourself

• Building blocks for common web tasks (auth, HTTP headers, themes and templates)

• Database abstraction (ORM) - masks SQL layer

Django

• Powerful MVC framework in Python

• Probably the most widely adopted and best documented Python web framework

She turned me into a NEWT

• NEWT - NERSC Web Toolkit

• RESTful ... ish API

• access HPC resources over the web using HTTP + JSON

• Built using Django

NERSC Web Toolkit

• The NEWT service exposes NERSC resources as HTTP URIs

• REST API - HTTP + verbs + JSON

• newt.js JavaScript library for frontend development

Things you can do ...

• Authenticate using NERSC credentials

• Check machine status

• Upload and download files

• Submit a compute job

• Monitor a job

• Get user account information

• Store app data

• Issue UNIX commands

[Architecture diagram]

• Client: Web Application - HTML 5/AJAX

• The client sends HTTP requests to NEWT and gets back JSON data

• NEWT (Django)

  • Authentication: MyProxy CA

  • Internal DB: session, cred, user information

  • System Resources (via Globus): Files, Batch Jobs, Shell Commands, Status

  • Accounting Information: NIM

  • Persistent Store (NoSQL DB): CouchDB

Quick Demo

• Make request at a URL

GET https://portal-auth.nersc.gov/newt/queue/hopper/

• Get back JSON response
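The request/response cycle above can be sketched from Python without touching the network. This is a minimal illustration, not NEWT's own client code: the base URL, the helper names `resource_url` and `decode_response`, and the sample response body (including its field names) are all assumptions made for the example.

```python
import json
from urllib.parse import urljoin

# Assumption: the base URL of the NEWT service; adjust for your deployment.
NEWT_BASE_URL = "https://portal-auth.nersc.gov/newt/"


def resource_url(*segments):
    """Join path segments into a resource URI that ends with a slash."""
    path = "/".join(segment.strip("/") for segment in segments) + "/"
    return urljoin(NEWT_BASE_URL, path)


def decode_response(body):
    """Decode a JSON response body into plain Python objects."""
    return json.loads(body)


# Build the URI for the queue resource on the machine "hopper".
url = resource_url("queue", "hopper")

# Hypothetical response body, for illustration only.
sample_body = '[{"jobid": "hop1234.id", "status": "R"}]'
jobs = decode_response(sample_body)
```

In a real client, the body would come from an authenticated HTTP GET on `url` rather than from a hard-coded string.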

Django makes it easy!

• Django already gives us most of the glue to build these kinds of frameworks

• Python has tons of special sauce to interface with different backend resources

Building an API

• Decide on the resources you want to expose

• CORE: Map URI paths to custom methods

• Plug in an authentication backend if needed

• To access the full complement of HTTP verbs you will need a REST plugin for Django, such as Piston or Tastypie
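The core step, mapping URI paths to custom methods, can be sketched in plain Python. This is a simplified stand-in for what a URL dispatcher does, not Django's implementation; the handler names, the `ROUTES` table and the `resolve` helper are hypothetical.

```python
import re


def api_root(request):
    """Handler for the API root: list the exposed resources."""
    return {"resources": ["status", "file"]}


def status(request):
    """Handler for the status resource (canned data for illustration)."""
    return {"system": "carver", "status": "up"}


# Ordered (pattern, handler) pairs; the first matching pattern wins.
ROUTES = [
    (re.compile(r"^/?$"), api_root),
    (re.compile(r"^status"), status),
]


def resolve(path):
    """Return the handler for a request path, or None if nothing matches."""
    for pattern, handler in ROUTES:
        if pattern.match(path):
            return handler
    return None
```

Calling `resolve("status/")` returns the `status` handler, and an unknown path returns `None`, which a real framework would turn into a 404 response.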

urls.py

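# Django 1.3-era URLconf helpers
from django.conf.urls.defaults import patterns, include

# Each entry maps a URI pattern to a view (or to another app's URLconf)
urlpatterns = patterns('',
    (r'^/?$', 'newt.home.views.apiroot'),
    (r'^status', 'newt.status.views.statusAdapter'),
    (r'^file', include('newt.file.urls')),
)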

views.py

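from json import JSONEncoder
from django.http import HttpResponse, HttpResponseServerError

# logger, Status and PublicJsonResourceAdapter are defined elsewhere in NEWT
class StatusAdapter(PublicJsonResourceAdapter):
    def get(self, request):
        logger.debug("display status for all")
        try:
            status_dict = Status.get()
        except Exception:
            return HttpResponseServerError("Could not connect")
        # Pass the backend's HTTP status through along with the JSON payload
        output = JSONEncoder().encode(status_dict['machine_info'])
        return HttpResponse(output,
                            status=status_dict['httpstatus'],
                            content_type='application/json')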

Verb    Resource                Description

POST    /queue/R                Submits POST data to queue on R; returns job id

GET     /file/R/path/           Returns directory listing for /path/ on R

GET     /account/user/<user>    Returns user account info for <user>

DEL     /store/DB/DOC           Deletes DOC on DB
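One way to route each HTTP verb to its own method on a resource is a small dispatching base class. The sketch below is an illustration only and makes no claim about how NEWT, Piston or Tastypie implement this; `ResourceAdapter`, `StoreAdapter`, the in-memory dictionary and the status codes chosen are assumptions.

```python
class ResourceAdapter(object):
    """Dispatch a request to the method named after its HTTP verb."""

    allowed_verbs = ("get", "post", "put", "delete")

    def dispatch(self, verb, *args):
        name = verb.lower()
        handler = getattr(self, name, None) if name in self.allowed_verbs else None
        if handler is None:
            # The verb is unknown or the resource does not implement it.
            return 405, {"error": "method not allowed"}
        return handler(*args)


class StoreAdapter(ResourceAdapter):
    """Hypothetical document store resource backed by a dictionary."""

    def __init__(self):
        self._docs = {}

    def post(self, db, doc_id, doc):
        self._docs[(db, doc_id)] = doc
        return 201, {"status": "OK"}

    def get(self, db, doc_id):
        if (db, doc_id) not in self._docs:
            return 404, {"error": "not found"}
        return 200, self._docs[(db, doc_id)]

    def delete(self, db, doc_id):
        if (db, doc_id) not in self._docs:
            return 404, {"error": "not found"}
        del self._docs[(db, doc_id)]
        return 200, {"status": "OK"}


store = StoreAdapter()
created = store.dispatch("POST", "mydb", "doc1", {"energy": -1.5})
fetched = store.dispatch("GET", "mydb", "doc1")
deleted = store.dispatch("DELETE", "mydb", "doc1")
missing = store.dispatch("GET", "mydb", "doc1")
unsupported = store.dispatch("PUT", "mydb", "doc1", {})
```

Because `StoreAdapter` defines no `put` method, the PUT request falls through to the 405 branch instead of raising an exception.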

Other Cool Things About Django!

• Very pluggable - easily drop in external apps

• Middleware layer - can intercept and tweak HTTP requests and responses (useful for handling cross-site headers)

• Lots of nice decorators for handling authorization, sessions, caching etc.

• Handles users, sessions, DB (ORM) stuff automatically
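As a rough illustration of the middleware point, a response hook can attach cross-site (CORS) headers to every response. The class below follows the shape of Django's classic `process_response(request, response)` middleware hook, but the class name and the allowed origin are hypothetical, and a plain dictionary stands in for the response object so the sketch runs without Django.

```python
class CrossSiteHeadersMiddleware(object):
    """Add cross-site headers to every outgoing response."""

    # Hypothetical gateway origin that is allowed to call the API.
    allowed_origin = "https://gateway.example.org"

    def process_response(self, request, response):
        # Django responses support item assignment for headers,
        # which is why a dict works as a stand-in here.
        response["Access-Control-Allow-Origin"] = self.allowed_origin
        response["Access-Control-Allow-Credentials"] = "true"
        return response


# Stand-in response object for the sketch.
response = CrossSiteHeadersMiddleware().process_response(None, {})
```

Keeping this logic in middleware means individual views do not need to know about cross-site access at all.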

The AJAX way!

$.newt_ajax({
    url: "/queue/hopper/",
    type: "POST",
    data: {"jobfile": filename},
    success: function(data){
        $("#output").append(data.jobid);
    }
});

This is a jQuery JavaScript function that calls the NEWT API. NEWT returns a JSON object that looks like this:

{"status": "OK", "error": "", "jobid" : "hop1234.id" }
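The same reply can be handled from Python. The sketch below assumes that any `status` other than "OK" signals a failure and that `error` then carries a message; the helper name `extract_jobid` and the failing payload are hypothetical.

```python
import json


def extract_jobid(body):
    """Return the job id from a job submission reply, or raise on failure."""
    reply = json.loads(body)
    if reply.get("status") != "OK":
        # Assumption: a non-OK status means the submission failed.
        raise RuntimeError(reply.get("error") or "job submission failed")
    return reply["jobid"]


jobid = extract_jobid('{"status": "OK", "error": "", "jobid" : "hop1234.id" }')

# Hypothetical failure reply, for illustration only.
try:
    extract_jobid('{"status": "ERROR", "error": "queue unavailable", "jobid": ""}')
    failure_message = None
except RuntimeError as exc:
    failure_message = str(exc)
```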

NERSC’s Online VASP Application

What it all means …

• Create an API that allows science groups to build custom web applications

• A simple RESTful API makes it very easy for science groups to build science-specific interfaces to data and computing

=> Science-As-A-Service

The End

Questions?

https://newt.nersc.gov for examples*, tutorial etc. (*you will need a NERSC account for most examples)

Contact:

Shreyas Cholia: [email protected]

Annette Greiner: [email protected]

Code Samples

job_list template

{% for job in all_jobs %}
    . . .
    {{ job.jobname }}
    {% if job.time_submitted %}{{ job.time_submitted|date:"M j, Y g:i A" }}{% else %}-{% endif %}
    Completed Copy Move Delete Convergence 3
{% endfor %}

newt.status.models.py

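import httplib2
from json import JSONDecoder

class Status(object):
    @classmethod
    def get(cls, machine_name=None):
        # _settings is the NEWT settings module, imported elsewhere
        base_url = _settings.STATUS_URL
        # With the default machine_name=None the query string is "system=None"
        url = '%s?%s=%s' % (base_url, 'system', machine_name)
        conn = httplib2.Http()
        response, content = conn.request(url, 'GET')
        httpstatus = int(response['status'])
        # content looks like {"system":"carver", "status":"up"}
        jd = JSONDecoder().decode(content)
        od = {'httpstatus': httpstatus, 'machine_info': jd}
        return od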

get_dir in the job model

def get_dir(self, *args, **kwargs):
    """
    >>> j.get_dir()
    [{listing1},{listing2},]
    """
    if 'dir' in kwargs:
        path = kwargs['dir']
    else:
        path = self.jobdir

    cookie_str = self.user.cookie
    url = '/file/%s%s' % (self.machine, path)
    response, content = util.newt_request(url, 'GET', cookie_str=cookie_str)
    if response['status'] != '200':
        raise IOError(content)

    dir_info = JSONDecoder().decode(content)
    return dir_info

newt_request in util

def newt_request(url, req_method, params=None, cookie_str=None):
    newt_base_url = getattr(settings, 'NEWT_BASE_URL')
    full_url = newt_base_url + url
    conn = httplib2.Http(disable_ssl_certificate_validation=True)

    # Massage inputs
    if cookie_str:
        headers = {'Cookie': cookie_str}
    else:
        headers = None

    if type(params) is dict:
        body = urllib.urlencode(params)
    elif (type(params) is str) or (type(params) is unicode):
        body = params
    else:
        body = None

    logger.debug("NEWT: %s %s" % (req_method, full_url))
    response, content = conn.request(full_url, req_method, body=body, headers=headers)
    logger.debug("NEWT response: %s" % response.status)
    return (response, content)
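The "massage inputs" step can be isolated and tested on its own. The following is a Python 3 adaptation offered as a sketch, not the code from the talk: the helper name `build_request_parts` is hypothetical, and `urllib.parse.urlencode` replaces the Python 2 `urllib.urlencode` call.

```python
from urllib.parse import urlencode


def build_request_parts(params=None, cookie_str=None):
    """Turn optional parameters and a cookie string into (body, headers)."""
    # Forward the session cookie only when one was supplied.
    headers = {"Cookie": cookie_str} if cookie_str else None

    if isinstance(params, dict):
        # Form-encode dictionaries, e.g. {"jobfile": "run.pbs"}.
        body = urlencode(params)
    elif isinstance(params, str):
        # Strings are sent as the request body unchanged.
        body = params
    else:
        body = None
    return body, headers


body, headers = build_request_parts({"jobfile": "run.pbs"}, cookie_str="sid=abc")
```

Separating this step from the network call makes it possible to check the encoding logic without a running NEWT service.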

view_job view

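from operator import itemgetter
from django.http import HttpResponseBadRequest
from django.shortcuts import render_to_response
from django.template import RequestContext

def view_job(request, jobid, *args, **kwargs):
    j = Job.objects.get(id=jobid)
    try:
        # get_dir reads its path from the 'dir' keyword argument
        dir_info = j.get_dir(dir=j.jobdir)
    except IOError as ex:
        return HttpResponseBadRequest("File Not Found: %s" % str(ex))

    # sort by filename
    dir_info = sorted(dir_info, key=itemgetter('name'))

    return render_to_response('main/job_view.html',
                              {'job_name': j.jobname,
                               'job_id': jobid,
                               'job_jobdir': j.jobdir,
                               'dir_info': dir_info,
                               'pbs_id': j.pbsjobid,
                               'machine': j.machine},
                              context_instance=RequestContext(request))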

job_view template

Files

{% for fileline in dir_info %}
    {% if 'd' not in fileline.perms %}
        {{ fileline.name }}
    {% endif %}
{% endfor %}
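The effect of the template filter, combined with sorting by filename, can be reproduced in plain Python. This is a sketch under the assumption that each listing entry is a dictionary with `name` and `perms` keys, as the template suggests; the helper name `visible_files` and the sample listing are made up for illustration.

```python
from operator import itemgetter


def visible_files(dir_info):
    """Keep entries that are not directories and sort them by filename."""
    files = [entry for entry in dir_info if "d" not in entry["perms"]]
    return sorted(files, key=itemgetter("name"))


# Hypothetical directory listing, for illustration only.
sample_listing = [
    {"name": "OUTCAR", "perms": "-rw-r--r--"},
    {"name": "subdir", "perms": "drwxr-xr-x"},
    {"name": "INCAR", "perms": "-rw-r--r--"},
]
names = [entry["name"] for entry in visible_files(sample_listing)]
```

Doing the filtering in the view instead of the template is a design choice; the template version keeps the full listing available for other parts of the page.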
