Release 2.5.5 Ask Solem Contributors
Celery Documentation, Release 2.5.5
Ask Solem, Contributors
February 04, 2014

CHAPTER 1 Getting Started

Release: 2.5
Date: February 04, 2014

1.1 Introduction

Version: 2.5.5
Web: http://celeryproject.org/
Download: http://pypi.python.org/pypi/celery/
Source: http://github.com/celery/celery/
Keywords: task queue, job queue, asynchronous, rabbitmq, amqp, redis, python, webhooks, queue, distributed

• Synopsis
• Overview
• Example
• Features
• Documentation
• Installation
  – Bundles
  – Downloading and installing from source
  – Using the development version

1.1.1 Synopsis

Celery is an open source asynchronous task queue/job queue based on distributed message passing. Focused on real-time operation, but supports scheduling as well.

The execution units, called tasks, are executed concurrently on one or more worker nodes using multiprocessing, Eventlet or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).

Celery is used in production systems to process millions of tasks every hour.

Celery is written in Python, but the protocol can be implemented in any language. It can also operate with other languages using webhooks. There’s also RCelery for the Ruby programming language, and a PHP client.

The recommended message broker is RabbitMQ, but support for Redis, MongoDB, Beanstalk, Amazon SQS, CouchDB and databases (using SQLAlchemy or the Django ORM) is also available.

Celery is easy to integrate with web frameworks, some of which even have integration packages:

    Django     django-celery
    Pyramid    pyramid_celery
    Pylons     celery-pylons
    Flask      flask-celery
    web2py     web2py-celery

1.1.2 Overview

This is a high level overview of the architecture.

The broker delivers tasks to the worker nodes. A worker node is a networked machine running celeryd.
This can be one or more machines depending on the workload.

The result of the task can be stored for later retrieval (called its “tombstone”).

1.1.3 Example

You probably want to see some code by now, so here’s an example task adding two numbers:

    from celery.task import task

    @task
    def add(x, y):
        return x + y

You can execute the task in the background, or wait for it to finish:

    >>> result = add.delay(4, 4)
    >>> result.wait() # wait for and return the result
    8

Simple!

1.1.4 Features

Messaging
    Supported brokers include RabbitMQ, Redis, Beanstalk, MongoDB, CouchDB, and popular SQL databases.

Fault-tolerant
    Excellent configurable error recovery when using RabbitMQ ensures your tasks are never lost.

Distributed
    Runs on one or more machines. Supports broker clustering and HA when used in combination with RabbitMQ. You can set up new workers without central configuration (e.g. use your grandma’s laptop to help if the queue is temporarily congested).

Concurrency
    Concurrency is achieved by using multiprocessing, Eventlet, gevent or a mix of these.

Scheduling
    Supports recurring tasks like cron, or specifying an exact date or a countdown after which the task should be executed.

Latency
    Low latency means you are able to execute tasks while the user is waiting.

Return Values
    Task return values can be saved to the selected result store backend. You can wait for the result, retrieve it later, or ignore it.

Result Stores
    Database, MongoDB, Redis, Cassandra, or AMQP (message notification).

Webhooks
    Your tasks can also be HTTP callbacks, enabling cross-language communication.

Rate limiting
    Supports rate limiting by using the token bucket algorithm, which accounts for bursts of traffic. Rate limits can be set for each task type, or globally for all.

Routing
    Using AMQP’s flexible routing model you can route tasks to different workers, or select different message topologies, by configuration or even at runtime.
Remote-control
    Worker nodes can be controlled remotely by using broadcast messaging. A range of built-in commands exist in addition to the ability to easily define your own. (AMQP/Redis only)

Monitoring
    You can capture everything happening with the workers in real-time by subscribing to events. A real-time web monitor is in development.

Serialization
    Supports Pickle, JSON, YAML, or easily defined custom schemes. One task invocation can have a different scheme than another.

Tracebacks
    Errors and tracebacks are stored and can be investigated after the fact.

UUID
    Every task has a UUID (Universally Unique Identifier), which is the task id used to query task status and return value.

Retries
    Tasks can be retried if they fail, with a configurable maximum number of retries, and delays between each retry.

Task Sets
    A Task set is a task consisting of several sub-tasks. You can find out how many, or if all, of the sub-tasks have been executed, and even retrieve the results in order. Progress bars, anyone?

Made for Web
    You can query status and results via URLs, making it possible to poll task status using Ajax.

Error Emails
    Can be configured to send emails to the administrators when tasks fail.

1.1.5 Documentation

Documentation for the production version can be found here:

    http://docs.celeryproject.org/en/latest

and the documentation for the development version can be found here:

    http://celery.github.com/celery/

1.1.6 Installation

You can install Celery either via the Python Package Index (PyPI) or from source.

To install using pip:

    $ pip install -U Celery

To install using easy_install:

    $ easy_install -U Celery

Bundles

Celery also defines a group of bundles that can be used to install Celery and the dependencies for a given feature.

The following bundles are available:

celery-with-redis
    for using Redis as a broker.

celery-with-mongodb
    for using MongoDB as a broker.
django-celery-with-redis
    for Django, and using Redis as a broker.

django-celery-with-mongodb
    for Django, and using MongoDB as a broker.

bundle-celery
    convenience bundle installing Celery and related packages.

Downloading and installing from source

Download the latest version of Celery from http://pypi.python.org/pypi/celery/

You can install it by doing the following:

    $ tar xvfz celery-0.0.0.tar.gz
    $ cd celery-0.0.0
    $ python setup.py build
    # python setup.py install # as root

Using the development version

You can clone the repository by doing the following:

    $ git clone git://github.com/celery/celery.git

1.2 Brokers

Release: 2.5
Date: February 04, 2014

Celery supports several message transport alternatives.

1.2.1 Using RabbitMQ

• Installation & Configuration
• Installing the RabbitMQ Server
  – Setting up RabbitMQ
  – Installing RabbitMQ on OS X
    * Configuring the system host name
    * Starting/Stopping the RabbitMQ server

Installation & Configuration

RabbitMQ is the default broker so it does not require any additional dependencies or initial configuration, other than the URL location of the broker instance you want to use:

    >>> BROKER_URL = "amqp://guest:guest@localhost:5672//"

For a description of broker URLs and a full list of the various broker configuration options available to Celery, see Broker Settings.

Installing the RabbitMQ Server

See Installing RabbitMQ over at RabbitMQ’s website. For Mac OS X see Installing RabbitMQ on OS X.
Note: If you’re getting nodedown errors after installing and using rabbitmqctl then this blog post can help you identify the source of the problem:
http://somic.org/2009/02/19/on-rabbitmqctl-and-badrpcnodedown/

Setting up RabbitMQ

To use celery we need to create a RabbitMQ user, a virtual host and allow that user access to that virtual host:

    $ rabbitmqctl add_user myuser mypassword
    $ rabbitmqctl add_vhost myvhost
    $ rabbitmqctl set_permissions -p myvhost myuser ".*" ".*" ".*"

See the RabbitMQ Admin Guide for more information about access control.

Installing RabbitMQ on OS X

The easiest way to install RabbitMQ on Snow Leopard is using Homebrew, the new and shiny package management system for OS X.

In this example we’ll install Homebrew into /lol, but you can choose whichever destination, even in your home directory if you want, as one of the strengths of Homebrew is that it’s relocatable.

Homebrew is actually a git repository, so to install Homebrew, you first need to install git. Download and install from the disk image at http://code.google.com/p/git-osx-installer/downloads/list?can=3

When git is installed you can finally clone the repository, storing it at the /lol location:

    $ git clone git://github.com/mxcl/homebrew /lol

Brew comes with a simple utility called brew, used to install, remove and query packages. To use it you first have to add it to PATH, by adding the following line to the end of your ~/.profile:

    export PATH="/lol/bin:/lol/sbin:$PATH"

Save your profile and reload it:

    $ source ~/.profile

Finally, we can install rabbitmq using brew:

    $ brew install rabbitmq

Configuring the system host name

If you’re using a DHCP server that is giving you a random host name, you need to permanently configure the host name. This is because RabbitMQ uses the host name to communicate with nodes.
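Before changing anything, it can help to see which host name the machine currently reports and whether that name resolves to an address. The following is a minimal sketch using only Python's standard socket module; the function name describe_host_name is hypothetical and is not part of Celery or RabbitMQ.

```python
import socket


def describe_host_name():
    """Return (host_name, ip_address) for this machine.

    ip_address is None when the reported host name cannot be resolved,
    which is the situation the steps in this section are meant to fix.
    """
    host_name = socket.gethostname()
    try:
        ip_address = socket.gethostbyname(host_name)
    except socket.error:
        # Resolution failed (for example, the name is missing from /etc/hosts).
        ip_address = None
    return host_name, ip_address


if __name__ == "__main__":
    name, address = describe_host_name()
    if address is None:
        print("host name %s does not resolve" % name)
    else:
        print("host name %s resolves to %s" % (name, address))
```

If the name does not resolve, or changes between network sessions, the host name likely needs to be set permanently.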
Use the scutil command to permanently set your host name:

    $ sudo scutil --set HostName myhost.local

Then add that host name to /etc/hosts so it’s possible to resolve it back into an IP address:

    127.0.0.1 localhost myhost myhost.local

If you start the rabbitmq server, your rabbit node should now be rabbit@myhost, as verified by rabbitmqctl:

    $ sudo rabbitmqctl status
    Status of node rabbit@myhost ...
    [{running_applications,[{rabbit,"RabbitMQ","1.7.1"},
                            {mnesia,"MNESIA CXC 138 12","4.4.12"},
                            {os_mon,"CPO CXC 138 46","2.2.4"},
                            {sasl,"SASL CXC 138 11","2.1.8"},
                            {stdlib,"ERTS CXC 138 10","1.16.4"},
                            {kernel,"ERTS CXC 138 10","2.13.4"}]},
     {nodes,[rabbit@myhost]},
     {running_nodes,[rabbit@myhost]}]
    ...done.
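Once the server is running, the user and virtual host created under Setting up RabbitMQ can be combined into a broker URL of the same shape as the one shown under Installation & Configuration. The sketch below only assembles the string; make_broker_url is a hypothetical helper, not a Celery API, and it assumes the credentials and virtual host contain no characters that need escaping (such as "@" or "/").

```python
def make_broker_url(user, password, vhost, host="localhost", port=5672):
    """Assemble an amqp:// broker URL from its parts.

    The default virtual host "/" produces the trailing "//" seen in
    amqp://guest:guest@localhost:5672//; any other virtual host name
    is appended after a single slash.
    """
    return "amqp://{0}:{1}@{2}:{3}/{4}".format(user, password, host, port, vhost)


# Values taken from the rabbitmqctl commands in "Setting up RabbitMQ".
BROKER_URL = make_broker_url("myuser", "mypassword", "myvhost")
print(BROKER_URL)  # amqp://myuser:mypassword@localhost:5672/myvhost
```

The resulting value would go wherever BROKER_URL is configured. The host defaults to localhost as in the earlier example; pass a different host argument if the broker runs on another machine.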