RESTful APIs for Scientific Computing in Django
Shreyas Cholia - [email protected]
Annette Greiner - [email protected]
Outreach, Software and Programming Group, NERSC - LBNL
SciPy 2011, Austin, TX
Thursday, July 21, 2011
NERSC
• National Energy Research Scientific Computing Center
• DOE Office of Science HPC User Facility at Lawrence Berkeley Lab
• Provides high performance compute, data, network and information services to scientists across the world
NERSC Resources
• Multiple HPC clusters: Hopper, Franklin, Carver, Magellan, Euclid, PDSF
• HPSS archival storage system
• Global File System
Web Gateways
• Old way - SSH + command line + batch system
• People now expect web interfaces for everything
• Usability - scientific computing should be as easy as online banking
• Users don't want generic options or tools that don't apply to their science
• Users don't want to deal with the backend, middleware, UNIX CLI, etc.
Motives
• Make it very easy for science teams to build web gateways to their data and computation
• We have already built several science-specific gateways - want to encapsulate common patterns
Web Stack
• Browser + AJAX
• REST
• JSON
Web Frameworks
• Python has a number of very powerful web frameworks - Django, Web2py, Pylons/Pyramid ...
• Model-View-Controller pattern
• Separation of
• data model
• routing/request processing
• templates
• DRY - Don’t Repeat Yourself
• Building blocks for common web tasks (auth, HTTP headers, themes and templates)
• Database abstraction (ORM) - masks SQL layer
Django
• Powerful MVC framework in Python
• Probably the most widely adopted and best-documented Python web framework
She turned me into a NEWT
• NEWT - NERSC Web Toolkit
• RESTful ... ish API
• access HPC resources over the web using HTTP + JSON
• Built using Django
NERSC Web Toolkit
• NEWT service exposes NERSC resources as HTTP URIs
• REST API - HTTP + verbs + JSON
• newt.js JavaScript library for front-end development
Things you can do ...
• Authenticate using NERSC credentials
• Check machine status
• Upload and download files
• Submit a compute job
• Monitor a job
• Get user account information
• Store app data
• Issue UNIX commands
[Architecture diagram]
• Client: web application (HTML 5/AJAX); sends HTTP requests, receives JSON data
• NEWT (Django) sits between the client and the backends:
  • Authentication: MyProxy CA
  • Internal DB: session, cred, user information
  • System resources (via Globus): files, batch jobs, shell commands, status
  • Accounting information: NIM
  • Persistent store (NoSQL DB): CouchDB
Quick Demo
• Make request at a URL
GET https://portal-auth.nersc.gov/newt/queue/hopper/
• Get back JSON response
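The same round trip can be made from a script rather than a browser. The following is a minimal Python 3 sketch of the two pure steps involved: building a resource URL and decoding the JSON body. The base URL, the helper names, and the shape of the response body are assumptions for illustration, not part of NEWT.

```python
import json


def resource_url(base_url, *segments):
    """Join path segments onto the API base URL, ending with a slash."""
    parts = [base_url.rstrip("/")]
    parts += [s.strip("/") for s in segments if s.strip("/")]
    return "/".join(parts) + "/"


def decode_json_body(body):
    """Decode a JSON response body (bytes or str) into Python objects."""
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    return json.loads(body)


# Hypothetical base URL and response body, for illustration only.
url = resource_url("https://example.org/newt", "queue", "hopper")
jobs = decode_json_body(b'[{"jobid": "hop1234.id", "status": "R"}]')
```

Keeping URL construction and decoding separate from the actual HTTP call makes both easy to test without network access.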
Django makes it easy!
• Django already gives us most of the glue to build these kinds of frameworks
• Python has tons of special sauce to interface with different backend resources
Building an API
• Decide on the resources you want to expose
• CORE: Map URI paths to custom methods
• Plug in an authentication backend if needed
• To access the full complement of HTTP verbs you will need a REST plugin for Django - Piston, Tastypie
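The core idea of mapping a URI to a method per HTTP verb can be sketched without any framework. In the Python 3 sketch below, the Request and Response tuples, the ResourceAdapter base class, and StatusResource are all hypothetical stand-ins; plugins such as Piston or Tastypie provide their own, richer versions of this dispatch.

```python
import json
from collections import namedtuple

# Stand-ins for the framework's request/response objects (hypothetical).
Request = namedtuple("Request", ["method", "path"])
Response = namedtuple("Response", ["status", "body"])


class ResourceAdapter(object):
    """Dispatch a request to the method named after its HTTP verb."""

    allowed = ("GET", "POST", "PUT", "DELETE")

    def __call__(self, request, *args, **kwargs):
        verb = request.method.upper()
        handler = getattr(self, verb.lower(), None)
        if verb not in self.allowed or handler is None:
            return Response(405, json.dumps({"error": "method not allowed"}))
        return handler(request, *args, **kwargs)


class StatusResource(ResourceAdapter):
    def get(self, request, machine=None):
        # Hard-coded data; a real adapter would query a backend.
        return Response(200, json.dumps({"system": machine, "status": "up"}))


status_view = StatusResource()
ok = status_view(Request("GET", "/status/carver"), machine="carver")
rejected = status_view(Request("DELETE", "/status/carver"), machine="carver")
```

A resource only needs to define the verbs it supports; anything else is answered with 405.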
urls.py
    from django.conf.urls.defaults import patterns, include

    urlpatterns = patterns('',
        (r'^/?$', 'newt.home.views.apiroot'),
        (r'^status', 'newt.status.views.statusAdapter'),
        (r'^file', include('newt.file.urls')),
    )
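URL patterns are tried top to bottom, and the first regular expression that matches wins. A rough stand-alone Python 3 sketch of that lookup follows, using plain `re` and placeholder view names rather than Django's own resolver:

```python
import re

# (pattern, view name) pairs, tried in order; the names are placeholders.
ROUTES = [
    (r"^/?$", "apiroot"),
    (r"^status", "status"),
    (r"^file", "file"),
]


def resolve(path, routes=ROUTES):
    """Return the name of the first route whose regex matches the path."""
    # Django matches patterns against the path without its leading slash.
    path = path.lstrip("/")
    for pattern, name in routes:
        if re.search(pattern, path):
            return name
    return None
```

Because `^status` has no trailing anchor, it matches any path that merely starts with "status"; a stricter pattern would end in `/?$` or capture the remaining segments explicitly.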
views.py
    class StatusAdapter(PublicJsonResourceAdapter):
        def get(self, request):
            logger.debug("display status for all")
            try:
                status_dict = Status.get()
            except Exception:
                return HttpResponseServerError("Could not connect")
            # Serialize the machine info and pass the backend's HTTP status through.
            output = JSONEncoder().encode(status_dict['machine_info'])
            return HttpResponse(output,
                                status=status_dict['httpstatus'],
                                content_type='application/json')
Verb    Resource            Description
POST    /queue/R            Submits POST data to queue on R; returns job id
GET     /file/R/path/       Returns directory listing for /path/ on R
GET     /account/user/U     Returns user account info for U
DEL     /store/DB/DOC       Deletes o…
Other Cool Things About Django!
• Very pluggable - easily drop in external apps
• Middleware layer - can intercept and tweak HTTP requests and responses (useful for handling cross-site headers)
• Lots of nice decorators for handling authorization, sessions, caching, etc.
• Handles users, sessions, and DB (ORM) plumbing automatically
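As an illustration of the middleware point, here is a sketch of a response hook that adds cross-site (CORS) headers for known origins. It follows the old-style Django middleware shape (a class with a `process_response` method) but imports nothing from Django so that it runs on its own. The class name, the origin whitelist, and the FakeRequest stand-in are assumptions, not NEWT code.

```python
# Hypothetical whitelist of gateway origins allowed to call the API.
ALLOWED_ORIGINS = {"https://gateway.example.org"}


class CrossSiteHeadersMiddleware(object):
    """Old-style Django middleware: only process_response is implemented."""

    def process_response(self, request, response):
        origin = request.META.get("HTTP_ORIGIN")
        if origin in ALLOWED_ORIGINS:
            # Django's HttpResponse accepts headers by item assignment.
            response["Access-Control-Allow-Origin"] = origin
            response["Access-Control-Allow-Credentials"] = "true"
            response["Access-Control-Allow-Methods"] = "GET, POST, PUT, DELETE"
        return response


class FakeRequest(object):
    """Minimal request stand-in so the sketch runs without Django."""

    def __init__(self, origin=None):
        self.META = {"HTTP_ORIGIN": origin} if origin else {}
```

Echoing back only whitelisted origins, rather than a wildcard, is what allows credentialed (cookie-carrying) cross-site requests to work in browsers.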
The AJAX way!
$.newt_ajax({
url: "/queue/hopper/",
type: "POST",
data: {"jobfile": filename},
success: function(data){
$("#output").append(data.jobid);
},
});
This is a jQuery JavaScript call to the NEWT API. NEWT returns a JSON object that looks like:
{"status": "OK", "error": "", "jobid" : "hop1234.id" }
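A non-browser client would handle the same reply in much the same way. The Python 3 sketch below parses a submit response and extracts the job id. It assumes that any status other than "OK" signals a failure described by the error field; that convention and the JobSubmitError name are assumptions, not documented NEWT behaviour.

```python
import json


class JobSubmitError(Exception):
    """Raised when the API reports a failed submission (hypothetical name)."""


def jobid_from_response(body):
    """Return the job id from a submit response, or raise on failure."""
    reply = json.loads(body)
    if reply.get("status") != "OK":
        raise JobSubmitError(reply.get("error") or "unknown error")
    return reply["jobid"]


jobid = jobid_from_response('{"status": "OK", "error": "", "jobid": "hop1234.id"}')
```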
NERSC’s Online VASP Application
What it all means …
• Create an API that allows science groups to build custom web applications
• A simple RESTful API makes it very easy for science groups to build science-specific interfaces to data and computing
→ Science-As-A-Service
The End
Questions?
https://newt.nersc.gov for examples*, tutorial etc. (*you will need a NERSC account for most examples)
Contact:
Shreyas Cholia: [email protected]
Annette Greiner: [email protected]
Code Samples
job_list template
{% for job in all_jobs %}
. . .
{% endfor %}
newt.status.models.py
    # Imports are assumed; they are not shown on the slide.
    import httplib2
    from json import JSONDecoder

    class Status(object):
        @classmethod
        def get(cls, machine_name=None):
            # Query the status service over HTTP and wrap its JSON reply.
            base_url = _settings.STATUS_URL
            url = '%s?%s=%s' % (base_url, 'system', machine_name)
            conn = httplib2.Http()
            response, content = conn.request(url, 'GET')
            httpstatus = int(response['status'])
            # content looks like {"system":"carver", "status":"up"}
            jd = JSONDecoder().decode(content)
            od = {'httpstatus': httpstatus, 'machine_info': jd}
            return od
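With the default machine_name=None, the string formatting in get() yields a query of system=None. Whether the status service treats that as "all systems" is not stated on the slide. A small Python 3 sketch of an alternative that omits the parameter when no machine is named, and URL-encodes it otherwise, follows; the helper name and the base URL used in testing are hypothetical.

```python
from urllib.parse import urlencode


def status_url(base_url, machine_name=None):
    """Build the status URL, adding ?system=<name> only when a name is given."""
    if machine_name is None:
        return base_url
    return "%s?%s" % (base_url, urlencode({"system": machine_name}))
```

Using urlencode also keeps machine names containing spaces or other reserved characters from corrupting the query string.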
view_job view
    from operator import itemgetter

    def view_job(request, jobid, *args, **kwargs):
        j = Job.objects.get(id=jobid)
        try:
            # get_dir() reads the path from the 'dir' keyword argument
            dir_info = j.get_dir(dir=j.jobdir)
        except IOError as ex:
            return HttpResponseBadRequest("File Not Found: %s" % str(ex))

        # sort by filename
        dir_info = sorted(dir_info, key=itemgetter('name'))

        return render_to_response('main/job_view.html',
                                  {'job_name': j.jobname,
                                   'job_id': jobid,
                                   'job_jobdir': j.jobdir,
                                   'dir_info': dir_info,
                                   'pbs_id': j.pbsjobid,
                                   'machine': j.machine},
                                  context_instance=RequestContext(request))
get_dir in the job model
    def get_dir(self, *args, **kwargs):
        """
        >>> j.get_dir()
        [{listing1},{listing2},]
        """
        # Use the 'dir' keyword if given, else the job's own directory.
        path = kwargs.get('dir', self.jobdir)

        cookie_str = self.user.cookie
        url = '/file/%s%s' % (self.machine, path)
        response, content = util.newt_request(url, 'GET', cookie_str=cookie_str)
        if response['status'] != '200':
            raise IOError(content)

        dir_info = JSONDecoder().decode(content)
        return dir_info
newt_request in util
    def newt_request(url, req_method, params=None, cookie_str=None):
        newt_base_url = settings.NEWT_BASE_URL
        full_url = newt_base_url + url
        # SSL certificate validation is switched off for this connection.
        conn = httplib2.Http(disable_ssl_certificate_validation=True)

        # Massage inputs
        if cookie_str:
            headers = {'Cookie': cookie_str}
        else:
            headers = None

        if isinstance(params, dict):
            body = urllib.urlencode(params)
        elif isinstance(params, basestring):
            body = params
        else:
            body = None

        logger.debug("NEWT: %s %s" % (req_method, full_url))
        response, content = conn.request(full_url, req_method, body=body, headers=headers)
        logger.debug("NEWT response: %s" % response.status)

        return (response, content)
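The input-massaging step of newt_request can be isolated as a pure function, which makes it testable without a network connection. This Python 3 sketch is an illustration, not the NEWT implementation: the function name is hypothetical, and setting a Content-Type header for form-encoded bodies is an addition the original code does not make.

```python
from urllib.parse import urlencode


def prepare_request(params=None, cookie_str=None):
    """Return (body, headers) in the form an HTTP client call expects."""
    headers = {"Cookie": cookie_str} if cookie_str else {}
    if isinstance(params, dict):
        # Form-encode dictionaries; the Content-Type header is an addition here.
        body = urlencode(params)
        headers["Content-Type"] = "application/x-www-form-urlencoded"
    elif isinstance(params, (str, bytes)):
        body = params
    else:
        body = None
    return body, headers
```

The returned pair can be passed straight to a call such as `conn.request(url, method, body=body, headers=headers)`.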
job_view template
Files
{% for fileline in dir_info %}
  {% if 'd' not in fileline.perms %}
  {% endif %}
{% endfor %}
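The template's test, `'d' not in fileline.perms`, skips directory entries so that only files are listed. The same filter, combined with the sort-by-name step from view_job, can be sketched in Python 3 as below. The sample listing is invented for illustration; only the 'name' and 'perms' keys come from the code samples.

```python
from operator import itemgetter


def files_only(dir_info):
    """Drop directory entries and sort the rest by name."""
    files = [entry for entry in dir_info if "d" not in entry["perms"]]
    return sorted(files, key=itemgetter("name"))


# Hypothetical listing, for illustration only.
listing = [
    {"name": "out.log", "perms": "-rw-r--r--"},
    {"name": "results", "perms": "drwxr-xr-x"},
    {"name": "INCAR", "perms": "-rw-r--r--"},
]
visible = files_only(listing)
```

Doing the filtering in the view rather than the template keeps the template free of logic, at the cost of one more pass over the listing.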