A Multi-Core Python HTTP Server (Much) Faster Than Go (Spoiler: Cython)
Jean-Paul Smets, jp (at) nexedi (dot) com
Bryton Lacquement, bryton (dot) lacquement (at) nexedi (dot) com

Details
This presentation shows the progress Nexedi has made in bringing the Python language closer to its corporate goals in terms of performance and concurrency.

QR Code

Details
This presentation is available online by scanning the QR code on the slide.

Agenda
- Nexedi
- Conclusion
- Rationale
- Code

Details
The presentation has four parts. First, we provide some information about Nexedi. Then we go straight to the conclusions, so that those who are in a hurry can leave. Next, we take some time to explain the rationale behind our efforts. Finally, we show some code.

Nexedi

Nexedi - Profile
- Largest Free Software Publisher in Europe
- Founded in 2001 in Lille (France) - 35 engineers worldwide
- Enterprise Software for mission critical applications
- Build, deploy, train and run services
- Profitable since day 1, long term organic growth

Details
Nexedi is probably the largest Free Software publisher in Europe, with more than 10 products and 15 million lines of code. Nexedi does not depend on any investor and has been profitable since day 1.

Nexedi - Clients
nexedi.com/success

Details
Nexedi clients are mainly large companies and governments looking for scalable enterprise solutions such as ERP, CRM, DMS, data lake, big data cloud, etc.

Nexedi - a Python Shop
Nexedi Software Stack
stack.nexedi.com

Details
Nexedi software is primarily developed in Python, with some parts in JavaScript.

Rapid.Space: ERP5, buildout, re6st, etc.
https://rapid.space

Details
One of the recent services launched by Nexedi is a Big Data cloud based entirely on Open Hardware and running entirely on Free Software that anyone can contribute to. The target cost is 30% to 60% lower than low cost European clouds, and 10 times lower than US based public clouds. Target use cases include performance testing, disaster recovery and big data batch processing.
Conclusions

2016: uvloop
Blazing fast Python networking...
https://magic.io/blog/uvloop-blazing-fast-python-networking/
https://pyvideo.org/europython-2016/fast-async-code-with-cython-and-asyncio.html

Details
This presentation was inspired by a talk at EuroPython 2016 which showed that uvloop could lead to very fast HTTP servers, even faster than in Go.

...as long as Python does nothing
https://www.nexedi.com/NXD-Document.Blog.UVLoop.Python.Benchmark

Details
But we later found that as soon as some Python code was added, performance dropped by at least an order of magnitude. Also, all tests were run on a single core, whereas the Go HTTP server runs on multiple cores if needed. We therefore considered the results quite biased and looked for a better solution.

2018: HTTP Hello World
https://www.nexedi.com/NXD-Blog.Multicore.Python.HTTP.Server

Details
By using LWAN and Cython, we managed to create a small HTTP server in Python that runs faster than the fastest Go HTTP server and that also scales linearly on multiple cores.

2018: Multi-core HTTP Fibonacci
https://www.nexedi.com/NXD-Blog.Multicore.Python.HTTP.Server

Details
If code is added to this server using Cython cdef functions to create dynamic pages (e.g. a page with a Fibonacci result), the server remains as scalable as the Go one and is still faster. We have thus demonstrated how to equal Go in concurrency and beat it in performance: use Cython with the nogil option.
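To make the "dynamic page" workload concrete, here is a minimal pure-Python sketch of a Fibonacci page built on the standard library's http.server. It is not the LWAN/Cython server described in this talk: the names fib, render_fib_page and FibHandler, the /fib/<n> URL scheme and the port are all assumptions made for illustration. It only shows the kind of per-request Python code that runs under the GIL, which is the part the talk moves into Cython cdef functions with nogil.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Tuple


def fib(n: int) -> int:
    """Return the n-th Fibonacci number iteratively (fib(0) == 0, fib(1) == 1)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def render_fib_page(path: str) -> Tuple[int, bytes]:
    """Map a request path such as '/fib/10' to (HTTP status, response body)."""
    prefix = "/fib/"  # hypothetical URL scheme, not taken from the talk
    arg = path[len(prefix):] if path.startswith(prefix) else ""
    if not (arg.isascii() and arg.isdigit()):
        return 404, b"not found\n"
    n = int(arg)
    return 200, f"fib({n}) = {fib(n)}\n".encode("ascii")


class FibHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: every request executes Python code under the GIL."""

    def do_GET(self) -> None:
        status, body = render_fib_page(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main() -> None:
    # Port 8080 is an arbitrary choice for this sketch; this call blocks forever.
    ThreadingHTTPServer(("127.0.0.1", 8080), FibHandler).serve_forever()
```

Calling main() starts the server; the page logic can also be exercised without any network by calling render_fib_page("/fib/10") directly. Although ThreadingHTTPServer uses one thread per request, the fib loop still holds the GIL, so this baseline is not expected to scale across cores.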
2018: Coroutines
500 000 empty coroutines in several technologies with a concurrency model

Name                                  Language  Spawn time (sec)  Run time (sec)  Total time (sec)
asyncio                               Python    1.11              9.70            10.82
asyncio (with uvloop)                 Python    1.11              6.83            7.91
gevent                                Python    N/A               N/A             8.30
goroutines                            Go        N/A               N/A             0.39
Lwan coroutines with work-stealing    Cython    1.15              0.27            1.49

https://www.nexedi.com/NXD-Blog.Cython.Multithreaded.Coroutines

Details
We studied the underlying LWAN library further and found that it includes a coroutine library. Based on the ideas of Juliusz Chroboczek's system programming project at Paris 7 University, we created a small coroutine library for Cython and compared it with asyncio, gevent and goroutines on an "empty coroutine" benchmark. Surprisingly, it is an order of magnitude faster than existing Python libraries. The run-time part is probably as good as in Go. The spawn time is still high, most likely because we rely on malloc, but it could be improved.

Future: cdef class nogil

    cdef class Foo nogil:
        cdef double a
        cdef double b

        cdef int bar(self):
            self.b = 1.0
            self.a = self.b

Then..

    cdef int baz() nogil:
        o = Foo()
        o.bar()

    baz()

Details
Based on those results, we expect to extend the Cython language to turn Extension Types, also known as cdef classes, into a full object system independent of CPython. This way, one can use Cython to develop high performance algorithms that never hit the GIL of CPython, while keeping all the benefits of easy integration with CPython.
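For readers who want to get a feel for the asyncio row of the coroutine benchmark, the following is a minimal sketch of an "empty coroutine" measurement using only the standard library. It is an assumption about how such a benchmark can be structured, not the code used to produce the published figures: the function names and the way spawn time and run time are separated are illustrative.

```python
import asyncio
import time
from typing import Tuple


async def empty() -> None:
    """A coroutine that does nothing, as in an "empty coroutine" benchmark."""


async def _run_all(n: int) -> Tuple[float, float]:
    start = time.perf_counter()
    # Spawn: wrap each coroutine in a Task so the event loop schedules it.
    tasks = [asyncio.ensure_future(empty()) for _ in range(n)]
    spawned = time.perf_counter()
    # Run: wait until every task has completed.
    await asyncio.gather(*tasks)
    done = time.perf_counter()
    return spawned - start, done - spawned


def bench_empty_coroutines(n: int) -> Tuple[float, float]:
    """Return (spawn_seconds, run_seconds) for n empty asyncio coroutines."""
    return asyncio.run(_run_all(n))
```

A call such as bench_empty_coroutines(500_000) returns the two timings in seconds. Absolute values depend on the machine and the Python version, so they should not be expected to match the figures reported in the table.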
References
- High Performance Multi-core Python at Nexedi
  https://www.nexedi.com/NXD-Blog.Multicore.Python
- A multi-core Python HTTP server (much) faster than Go (spoiler: Cython)
  https://www.nexedi.com/NXD-Blog.Multicore.Python.HTTP.Server
- Python multi-core benchmark in uvloop context
  https://www.nexedi.com/NXD-Document.Blog.UVLoop.Python.Benchmark
- Multi-threaded coroutines with work stealing in Cython
  https://www.nexedi.com/NXD-Blog.Cython.Multithreaded.Coroutines
- Pygolang: golang concurrency and features in python
  https://pypi.org/project/pygolang/
- Actalk: generic, clean, concurrency model
  http://www-poleia.lip6.fr/~briot/actalk/papers/PAPERS.html

Details
All the details that led to the above conclusions are available in the references above.

Rationale

Details
Let us now dig into the rationale behind our motivation to improve the concurrency and performance of Python. In this section, we express Nexedi's views in an unambiguous way. Our views may differ from what is usually expressed in the Python community. Yet they are based on actual experience of bringing large Python-based systems with millions of users to market and maintaining them for more than 10 years. They are also based on the observation that some early Python adopters are leaving Python because of difficulties in adopting Python 3 or in steering the evolution of Python 3 in the directions they care about most.

State of Python: Nexedi Perspective
- as good as it was in 2000 when we selected it
- favourite language of developers in 2018
- still (very) slow
- still (very) poor concurrency (models)
- still unusable inside a Web browser
- competitors: JS, Go

Details
In 2018, the Python language is in our eyes as good as it was in 2000, when Nexedi selected it as its primary language (No 2 was... Ruby) because of its reflective object system and the existence of an object database (ZODB) based on pickles.
The existence of ZODB was a key argument for us because we had known since 1993 that object-relational mapping cannot work by design (no matter how hard people still try) and that an object database was essential for clustering and for passing objects from one process to another.

Python is in 2018 the favourite language of developers, which is great news. But in 2018 it is still very slow. Compared to other dynamic languages, and to JavaScript in particular, it is unacceptably slow. What was acceptable for a scripting language in the 90s, at a time when other scripting languages (Tcl, AppleScript, etc.) or dynamic languages (CLOS, Smalltalk, AppleScript, Visual Basic, etc.) were equally slow, is no longer acceptable considering the progress made by runtime implementations.

The concurrency model of Python is still very poor. Asyncio, by adding keywords to the language specification, imposes a specific and questionable concurrency model that was fashionable in the early 2010s but, like any fashion, is not future-proof. It does little to help with multi-core. And it is not reflective. In a sense, it achieves by changing the language something that a library could have done. In doing so, it ignores the existence of solutions such as Actalk, which provide a generic, universal approach to object-oriented concurrent programming able to cover all forms of concurrency: agent programming, map reduce, implicit concurrency, etc.

Python is still unusable in a web browser in 2018, either because CPython compiled to WebAssembly is too slow or because other implementations of Python are too restrictive.

The weaknesses of Python in 2018 are becoming too visible. Many projects that would have started in Python 10 years ago now start with Go or JavaScript.

Python 3 Benefits: Nexedi Perspective
- ...
- ...
- ...
- ...
- long term support by a dozen individuals -> > 200 K€ / year

Details
The state of Python in 2018 includes the state of Python 3.
Nexedi has just started to migrate all its code to Python 3. Yet we see close to zero benefit in doing so, besides better long term maintenance (Python 2.7 maintenance will stop in a few months or years) and its side effects: better system libraries, new versions of libraries that are not released for Python 2.7, etc. However hard we look, we cannot say what Python 3 brings to our products and to our customers.

Python 3 Losses: Nexedi Perspective
- Incompatible strings -> 200 K€ + 30 K€/year patch maintenance
- Slower -> 20 K€/year
- Requires more memory -> 20 K€/year
- Needlessly increasingly complex -> 5 K€/staff
- Perverted by fashion (ex. asyncio, unicode) -> 10+ K€/year

Details
Porting Nexedi's code to Python 3 has a high cost.