Granary Documentation Release 1.12
Total Page:16
File Type:pdf, Size:1020Kb
granary Documentation Release 1.12 Ryan Barrett Mar 24, 2018 Contents 1 Using 3 2 Using the REST API 5 3 Using the library 7 4 Troubleshooting/FAQ 9 5 Future work 11 6 Development 13 7 Related work 15 8 Changelog 17 8.1 1.12 - 2018-03-24............................................ 17 8.2 1.11 - 2018-03-09............................................ 17 8.3 1.10 - 2017-12-10............................................ 18 8.4 1.9 - 2017-10-24............................................. 19 8.5 1.8 - 2017-08-29............................................. 19 8.6 1.7 - 2017-02-27............................................. 20 8.7 1.6 - 2016-11-26............................................. 21 8.8 1.5 - 2016-08-25............................................. 21 8.9 1.4.1 - 2016-06-27............................................ 22 8.10 1.4.0 - 2016-06-27............................................ 22 8.11 1.3.1 - 2016-04-07............................................ 22 8.12 1.3.0 - 2016-04-06............................................ 23 8.13 1.2.0 - 2016-01-11............................................ 23 8.14 1.1.0 - 2015-09-06............................................ 24 8.15 1.0.1 - 2015-07-11............................................ 25 8.16 1.0 - 2015-07-10............................................. 25 i ii granary Documentation, Release 1.12 Granary is a library and REST API that fetches and converts between a wide variety of data sources and formats: • Facebook, Flickr, GitHub, Google+, Instagram, and Twitter native APIs • Instagram and Google+ scraped HTML • ActivityStreams 1.0 and 2.0 • microformats2 HTML and JSON • Atom • JSON Feed • XML Free yourself from silo API chaff and expose the sweet social data foodstuff inside in standard formats and protocols! Here’s how to get started: • Granary is available on PyPi. Install with pip install granary. • Supports Python 2.7+ and 3.3+. • Click here for getting started docs. • Click here for reference docs. • The REST API and demo app are deployed at granary.io. License: This project is placed in the public domain. Contents 1 granary Documentation, Release 1.12 2 Contents CHAPTER 1 Using All dependencies are handled by pip and enumerated in requirements.txt. We recommend that you install with pip in a virtualenv.(App Engine details.) The library and REST API are both based on the OpenSocial Activity Streams service. Let’s start with an example. This code using the library: from granary import twitter ... tw= twitter.Twitter(ACCESS_TOKEN_KEY, ACCESS_TOKEN_SECRET) tw.get_activities(group_id='@friends') is equivalent to this HTTP GET request: https://granary.io/twitter/@me/@friends/@app/ ?access_token_key=ACCESS_TOKEN_KEY&access_token_secret=ACCESS_TOKEN_SECRET They return the authenticated user’s Twitter stream, ie tweets from the people they follow. Here’s the JSON output: { "itemsPerPage": 10, "startIndex":0, "totalResults": 12, "items": [{ "verb":"post", "id":"tag:twitter.com,2013:374272979578150912", "url":"http://twitter.com/evanpro/status/374272979578150912", "content":"Getting stuff for barbecue tomorrow. No ribs left! Got some nice ,!tenderloin though. (@ Metro Plus Famille Lemay) http://t.co/b2PLgiLJwP", "actor":{ "username":"evanpro", "displayName":"Evan Prodromou", "description":"Prospector.", "url":"http://twitter.com/evanpro", }, 3 granary Documentation, Release 1.12 "object":{ "tags": [{ "url":"http://4sq.com/1cw5vf6", "startIndex": 113, "length": 22, "objectType":"article" },"..."], }, },"..."] "..." } The request parameters are the same for both, all optional: USER_ID is a source-specific id or @me for the authenti- cated user. GROUP_ID may be @all, @friends (currently identical to @all), @self, @search, or @blocks; APP_ID is currently ignored; best practice is to use @app as a placeholder. Paging is supported via the startIndex and count parameters. They’re self explanatory, and described in detail in the OpenSearch spec and OpenSocial spec. When using the GROUP_ID @search (for platforms that support it — currently Twitter and Instagram), provide a search string via the q parameter. The API is loosely based on the OpenSearch spec, the OpenSocial Core Container spec, and the OpenSocial Core Gadget spec. Output data is JSON Activity Streams 1.0 objects wrapped in the OpenSocial envelope, which puts the activities in the top-level items field as a list and adds the itemsPerPage, totalCount, etc. fields. Most Facebook requests and all Twitter, Google+, Instagram, and Flickr requests will need OAuth access tokens. If you’re using Python on Google App Engine, oauth-dropins is an easy way to add OAuth client flows for these sites. Otherwise, here are the sites’ authentication docs: Facebook, Flickr, Google+, Instagram, Twitter. If you get an access token and pass it along, it will be used to sign and authorize the underlying requests to the sources providers. See the demos on the REST API endpoints above for examples. 4 Chapter 1. Using CHAPTER 2 Using the REST API The endpoints above all serve the OpenSocial Activity Streams REST API. Request paths are of the form: /USER_ID/GROUP_ID/APP_ID/ACTIVITY_ID?startIndex=...&count=...&format=FORMAT&access_ ,!token=... All query parameters are optional. FORMAT may be json (the default), xml, or atom, both of which return Atom. atom supports a boolean reader query parameter for toggling rendering appropriate to feed readers, e.g. location is rendered in content when reader=true (the default). The rest of the path elements and query params are described above. Errors are returned with the appropriate HTTP response code, e.g. 403 for Unauthorized, with details in the response body. By default, responses are cached and reused for 5m without re-fetching the source data. (Instagram responses are cached for 60m.) You can prevent this by adding the cache=false query parameter to your request. To use the REST API in an existing ActivityStreams client, you’ll need to hard-code exceptions for the domains you want to use e.g. facebook.com, and redirect HTTP requests to the corresponding endpoint above. The web UI (granary.io) currently only fetches Facebook access tokens for users. If you want to use it to access a Facebook page, you’ll need to get an access token manually with the Graph API Explorer (click on the Get To.. drop-down) . Then, log into Facebook on granary.io and paste the page access token into the access_token text box. (Google+ pages aren’t supported in their API.) 5 granary Documentation, Release 1.12 6 Chapter 2. Using the REST API CHAPTER 3 Using the library See the example above for a quick start guide. Clone or download this repo into a directory named granary (note the underscore instead of dash). Each source works the same way. Import the module for the source you want to use, then instantiate its class by passing the HTTP handler object. The handler should have a request attribute for the current HTTP request. The useful methods are get_activities() and get_actor(), which returns the current authenticated user (if any). See the individual method docstrings for details. All return values are Python dicts of decoded ActivityStreams 1 JSON. The microformats2.*_to_html() functions are also useful for rendering ActivityStreams 1 objects as nicely formatted HTML. 7 granary Documentation, Release 1.12 8 Chapter 3. Using the library CHAPTER 4 Troubleshooting/FAQ Check out the oauth-dropins Troubleshooting/FAQ section. It’s pretty comprehensive and applies to this project too. For searchability, here are a handful of error messages that have solutions there: bash:./bin/easy_install:...bad interpreter: No such file or directory ImportError: cannot import name certs ImportError: cannot import name tweepy File".../site-packages/tweepy/auth.py", line 68, in _get_request_token raise TweepError(e) TweepError: must be _socket.socket, not socket 9 granary Documentation, Release 1.12 10 Chapter 4. Troubleshooting/FAQ CHAPTER 5 Future work We’d love to add more sites! Off the top of my head, YouTube, Tumblr, WordPress.com, Sina Weibo, Qzone, and RenRen would be good candidates. If you’re looking to get started, implementing a new site is a good place to start. It’s pretty self contained and the existing sites are good examples to follow, but it’s a decent amount of work, so you’ll be familiar with the whole project by the end. 11 granary Documentation, Release 1.12 12 Chapter 5. Future work CHAPTER 6 Development Pull requests are welcome! Feel free to ping me with any questions. You’ll need the App Engine Python SDK version 1.9.15 or later (for vendor support). Add it to your $PYTHONPATH, e.g. export PYTHONPATH=$PYTHONPATH:/usr/local/google_appengine, and then run: virtualenv local source local/bin/activate pip install-r requirements.txt python setup.py test If you send a pull request, please include (or update) a test for the new functionality if possible! The tests re- quire the App Engine SDK or the Google Cloud SDK (aka gcloud) with the gcloud-appengine-python and gcloud-appengine-python-extras components. If you want to work on oauth-dropins at the same time, install it in “source” mode with pip install -e <path to oauth-dropins repo>. To deploy: python-m unittest discover&& gcloud-q app deploy granary-demo *.yaml To deploy facebook-atom, twitter-atom, instagram-atom, and plusstreamfeed after a granary change: #!/bin/tcsh foreach s (facebook-atom twitter-atom instagram-atom plusstreamfeed) cd ~/src/$s && gcloud -q app deploy $s *.yaml end The docs are built with Sphinx, including apidoc, autodoc, and napoleon. Configuration is in docs/conf.py To build them, first install Sphinx with pip install sphinx. (You may want to do this outside your virtualenv; if so, you’ll need to reconfigure it to see system packages with virtualenv --system-site-packages local.) Then, run docs/build.sh. This ActivityStreams validator is useful for manual testing. 13 granary Documentation, Release 1.12 14 Chapter 6. Development CHAPTER 7 Related work Apache Streams is a similar project that translates between storage systems and database as well as social schemas. It’s a Java library, and its design is heavily structured. Here’s the list of formats it supports.