Gruvi Documentation Release 0.10.3
Geert Jansen
Jun 14, 2017

Contents

1 Documentation
2 Project files
Python Module Index

Improved ergonomics for Python programmers wanting to use asynchronous IO.

Gruvi is an asynchronous IO library for Python. It focuses on the following desirable properties:

• Simple. Async IO code should look just like normal code, with simple, sequential control flow and regular functions.
• Efficient. An IO library should have a very low memory and CPU overhead, so that it can scale to small systems or to many concurrent connections.
• Powerful. Common protocols like SSL/TLS and HTTP should be part of every IO library.

CHAPTER 1 Documentation

Rationale

Gruvi is an asynchronous I/O library for Python, just like asyncio, gevent and eventlet. It is therefore fair to ask why I decided to create a new library. The short and simple answer is that I didn’t agree with some of the decisions that went into these other projects, and thought it would be good to have a library based on different assumptions.

These are the design requirements that I had when creating Gruvi:

• I wanted to use green threads. Compared to Posix threads, green threads are very lightweight and scalable. Compared to a generator-based yield from approach, they provide better ergonomics because there is just one type of function, rather than two that cannot easily call into each other.
• I wanted to use libuv, a very high performance asynchronous I/O library known from node.js, which has excellent cross-platform support including Windows.
• I wanted to use a PEP 3156 style internal API based on transports and protocols. This model matches very well with libuv’s completion based model. Compared to a socket based interface, a transport/completion based interface forces a strict separation between “network facing” and “user facing” code.
In a socket based interface it is easy to make mistakes such as putting a socket read deep in some protocol code, and it incentivizes hacks such as monkey patching.

Comparison to other frameworks

The table below compares some of the design decisions and features of Gruvi against asyncio, gevent and eventlet.

Note: tables like these necessarily compress a more complex reality into a much more limited set of answers, and they are also a snapshot in time. Please assume good faith. If you spot an error in the table, please let me know and I will change it.

Feature             Gruvi                     Asyncio                     Gevent                 Eventlet
IO library          libuv                     stdlib                      libev                  stdlib / libevent
IO abstraction      Transports / Protocols    Transports / Protocols      Green sockets          Green sockets
Threading           fibers                    yield from                  greenlet               greenlet
Resolver            threadpool                threadpool                  threadpool / c-ares    blocking / dnspython
Python: 2.x         YES (2.7)                 YES (2.6+, via Trollius)    YES                    YES
Python: 3.x         YES (3.3+)                YES                         NO                     NO
Python: PyPy        NO                        NO                          YES                    YES
Platform: Linux     FAST                      FAST                        FAST                   FAST
Platform: Mac OSX   FAST                      FAST                        FAST                   FAST
Platform: Windows   FAST (IOCP)               FAST (IOCP)                 SLOW (select)          SLOW (select)
SSL: Posix          FAST                      FAST                        FAST                   FAST
SSL: Windows        FAST (IOCP)               FAST (IOCP 3.5+)            SLOW (select)          SLOW (select)
SSL: Contexts       YES (also Py2.7)          YES (also Py2.6+)           NO                     NO
HTTP                FAST (via http-parser)    NO (external)               SLOW (stdlib)          SLOW (stdlib)
Monkey Patching     NO                        NO                          YES                    YES

Motivations for choosing green threads

Green threads have a very low memory overhead, and can therefore support a high level of concurrency. But more importantly, green threads are cooperatively scheduled. This means that a thread switch happens only when specifically instructed by a switch() call, and with a bit of care, we can write concurrent code that does not require locking.

When combining a thread implementation with an IO framework, one of the key design decisions is whether to implement explicit or implicit switching. Green thread switching in Gruvi is implicit.
Implicit switching means that whenever a function would block, for example to wait for data from the network, it switches to a central scheduler called the hub. The hub then switches to another green thread if one is ready, or it runs the event loop. A function that can block is just like any other function. It is called as a regular function, and can call other functions, blocking or not.

A common criticism of the implicit switching approach is that the locations where these switches happen, the so-called switch points, are not clearly marked. As a programmer you could call into a function that, three levels down in the call stack, causes a thread switch. The drawback of this, according to the criticism, is that switch points could occur essentially anywhere, and that it is therefore like pre-emptive multi-threading, where you need full and careful locking.

The alternative to implicit switching is explicit switching. This is the approach taken for example by the asyncio framework. In this approach, every switch point is made explicit. In the case of asyncio this is because a switch point is not called as a normal function but is instead used with the yield from construct.

In my view, a big disadvantage of the explicit approach is that the explicit behavior needs to percolate all the way up the call chain: any function that calls a switch point also needs to be called as a switch point. This requires too much up-front planning, and it reduces composability.

Rather than implementing explicit switching, Gruvi sticks with implicit switching but tries to address the “unknown switch points” problem in the following way:

• First, it makes explicit that a switch can only happen when you call a function. If you need to change some global state atomically, it is sufficient to do this without making function calls, or by doing it in a leaf function. In this case, it is guaranteed that no switch will happen.
• Secondly, in case you do need to call out to multiple functions to change a shared global state, then as a programmer you need to make sure, by reading the documentation of the functions, that they do not cause a switch. Gruvi assists you here by marking all switch points with a special decorator. This puts a clear notice in the function’s docstring. In addition, Gruvi provides a gruvi.assert_no_switchpoints context manager that triggers an assertion if a switch point does get called within its body. You can use this to be sure that no switch will happen inside a block.
• Thirdly, as the “traditional” option, Gruvi provides a full set of synchronization primitives, including locks, that you can use if the two other approaches don’t work.

In the end, the difference between implicit and explicit switching is a trade-off. In my view, with the safeguards of Gruvi, e.g. the marking of switch points and the gruvi.assert_no_switchpoints context manager, the balance tips in favor of implicit switching. Some people have come to the same conclusion, others to a different one. Both approaches are valid, and as a programmer you should pick the approach you like most.

Motivations for not doing Monkey patching

One other important design decision that I made early on in Gruvi is not to implement monkey patching. Monkey patching is an approach employed by e.g. gevent and eventlet, where they make the Python standard library cooperative by replacing blocking functions with cooperative functions using runtime patching.

In my experience, monkey patching is error prone and fragile. You end up distributing parts of the standard library yourself, bugs included. This is a maintenance burden that I’m not willing to take on. Also, the approach is very susceptible to dependency loading order problems, and it only works for code that calls into the blocking functions via Python. Extension modules using e.g.
the C-API don’t work, nor do extension modules that use an external library for IO (e.g. psycopg). Finally, monkey patching does not work well with libuv, because libuv provides a completion based interface while the standard library assumes a readiness based interface.

The solution that Gruvi offers is two-fold:

• Either use Gruvi’s own API if available. For example, Gruvi includes classes to work with streams and processes, and it also provides an excellent HTTP client and server implementation. This is the preferred option.
• Or, when integrating with third-party blocking code, run it in the Gruvi maintained thread pool. The easiest way is to call this code via the gruvi.blocking() function.

Installation

Gruvi uses setuptools, so installation is relatively straightforward. The following Python / OS combinations are supported:

OS          Python versions    Notes
Posix       2.7, 3.3+          Only Linux is regularly tested
Mac OSX     2.7, 3.3+          PPC is not tested
Windows     2.7, 3.3+          No SSL backports on 2.7

Gruvi and some of its dependencies contain C extensions. This means that you need to have a C compiler, and also the Python development files. Gruvi also uses CFFI, so you need to have that installed as well.

Installation using pip

If you have the “pip” package manager, then you can install Gruvi directly from the Python Package Index:

    $ pip install cffi
    $ pip install gruvi

You need to install CFFI first because the Gruvi setup script depends on it.

Installation from source

The following instructions install Gruvi in a virtualenv development environment:

    $ pyvenv gruvi-dev # "virtualenv gruvi-dev" with Python <= 3.3
    $ .