Parallel Programming with Python

Martin v. Löwis

Overview

• Threads

• Concurrent programming

• Processes

• Clusters

MvL ParProg | Python 2 2011

Threads

• Motivation: speed

• Motivation: seemingly simpler programming of concurrent problems

• Problem: thread safety

  • in the application

  • in the interpreter
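
Thread safety in the application can be sketched as follows. This is a minimal illustration, assuming Python 3; the names Counter and count_in_threads are hypothetical and not part of the talk. The lock makes the read-modify-write in increment atomic with respect to other threads.

```python
import threading


class Counter:
    """A counter whose increment is protected by a lock (illustrative name)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, "self.value += 1" is a read-modify-write
        # sequence that two threads could interleave.
        with self._lock:
            self.value += 1


def count_in_threads(n_threads=4, n_increments=10000):
    """Increment one shared counter from several threads; return the total."""
    counter = Counter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Calling count_in_threads(4, 10000) returns 40000, since no increment is lost.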

import thread

• Since Python 0.98 (1992; SGI threads + Sun LWP)

• Design goal: "native" threads

• Supported thread APIs: POSIX, NT, OS/2, Solaris, SGI, Windows CE, LWP, pth, AtheOS, BeOS

• Python/thread_*.h

• start_new_thread, exit_thread, allocate_lock, get_ident
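
The low-level functions listed above can be sketched like this. The example assumes Python 3, where the module is named _thread and the lock functions are otherwise the same; run_workers and worker are hypothetical names. Since the module has no join, each worker signals completion by releasing a lock that the main thread acquired beforehand.

```python
import _thread  # named "thread" in Python 2


def run_workers(n):
    """Start n native threads and collect (index, thread id) pairs."""
    results = []
    results_lock = _thread.allocate_lock()
    # One "done" lock per worker: acquired here, released by the worker.
    done_locks = [_thread.allocate_lock() for _ in range(n)]

    def worker(index, done_lock):
        with results_lock:
            results.append((index, _thread.get_ident()))
        done_lock.release()

    for index, done_lock in enumerate(done_locks):
        done_lock.acquire()
        _thread.start_new_thread(worker, (index, done_lock))

    # Acquiring again blocks until the corresponding worker has released.
    for done_lock in done_locks:
        done_lock.acquire()
    return results
```

The threading module wraps exactly this kind of bookkeeping in Thread.join().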

import threading

• Python 1.5.2 (1999)

• class Thread

• Lock, RLock, Condition, Event, Semaphore, BoundedSemaphore

• class local

• dummy_thread.py

• Queue.py
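
A minimal sketch of how class Thread, Lock and the queue module fit together, assuming Python 3 (where Queue.py became the queue module). The function square_all and the None sentinel convention are illustrative choices, not taken from the talk.

```python
import queue  # "Queue" in Python 2
import threading


def square_all(numbers, n_workers=3):
    """Square the numbers using a pool of worker threads fed by a queue."""
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more work for this thread
                break
            with results_lock:
                results.append(item * item)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for n in numbers:
        tasks.put(n)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return sorted(results)
```

The queue does its own locking, so only the shared results list needs an explicit Lock.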

GIL

• Global Interpreter Lock

• Goal: thread-safe interpreter

• Protected resources:

  • Reference counts

  • Containers (dict, list)

  • Memory management (obmalloc, GC)

  • Interpreter state

  • Current thread state

GIL (2)

• Concurrency:

  • Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS

  • sys.getcheckinterval/sys.setcheckinterval

  • 3.2: GIL drop request/eval breaker/sys.getswitchinterval
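
The effect of releasing the GIL around blocking calls can be observed from Python code. The sketch below assumes Python 3.2 or later; timed_sleepers and with_switch_interval are hypothetical helper names. time.sleep releases the GIL while it blocks, so several sleeping threads overlap instead of running one after another.

```python
import sys
import threading
import time


def timed_sleepers(n_threads=4, seconds=0.2):
    """Run threads that each block in time.sleep; return elapsed wall time."""
    threads = [
        threading.Thread(target=time.sleep, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def with_switch_interval(interval, func):
    """Call func() with a temporary switch interval, then restore the old one."""
    old = sys.getswitchinterval()
    sys.setswitchinterval(interval)
    try:
        return func()
    finally:
        sys.setswitchinterval(old)
```

With four threads sleeping 0.2 s each, the elapsed time stays close to 0.2 s rather than 0.8 s. The switch interval only affects how often threads running pure Python code are asked to give up the GIL.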

Free Threading

• Greg Stein (ca. 2000)

• main problem: dictionary synchronization

• Uniprocessor: speedup 0.6

• Two processors: 1.2

• Three processors: 1.6

Concurrency

• Python workload: web applications (e.g. TurboGears)

• Multiple simultaneous HTTP requests

• I/O: sockets, database, file system

import asyncore

• Sam Rushing, Python 2.0

• Coordinated management of sockets

• select, poll

• class asyncore.dispatcher (handle_read, handle_write, ...)

• Reactor pattern

• asyncore.loop()
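
The reactor pattern behind asyncore can be sketched with the select module alone. This is not asyncore itself (the module was removed from the standard library in Python 3.12); the Dispatcher class and loop function below are simplified, hypothetical stand-ins that only borrow the handler names handle_read and handle_write.

```python
import select
import socket


class Dispatcher:
    """Simplified stand-in for asyncore.dispatcher (illustrative only)."""

    def __init__(self, sock):
        self.sock = sock
        self.outgoing = b""
        self.received = []

    def writable(self):
        return bool(self.outgoing)

    def handle_read(self):
        data = self.sock.recv(4096)
        if data:
            self.received.append(data)

    def handle_write(self):
        sent = self.sock.send(self.outgoing)
        self.outgoing = self.outgoing[sent:]


def loop(dispatchers, timeout=0.1, count=5):
    """Reactor loop: wait for ready sockets, then call the matching handler."""
    by_fd = {d.sock.fileno(): d for d in dispatchers}
    for _ in range(count):
        readers = [d.sock for d in dispatchers]
        writers = [d.sock for d in dispatchers if d.writable()]
        ready_r, ready_w, _ = select.select(readers, writers, [], timeout)
        for sock in ready_w:
            by_fd[sock.fileno()].handle_write()
        for sock in ready_r:
            by_fd[sock.fileno()].handle_read()


def demo():
    """Send b"ping" across a socket pair using only the reactor loop."""
    a, b = socket.socketpair()
    sender, receiver = Dispatcher(a), Dispatcher(b)
    sender.outgoing = b"ping"
    try:
        loop([sender, receiver])
    finally:
        a.close()
        b.close()
    return b"".join(receiver.received)
```

All I/O happens inside the handlers, and a single thread serves every socket; the application code never blocks on an individual connection.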

Twisted Python

• J.P. Calderone, G. Lefkowitz, M. Zadka, I. Shtull-Trauring, ...

• "networking engine" (web server, IRC server, mail server, DNS server, * client)

• Programming model for concurrent processing of datagrams

Twisted: Client

from twisted.internet import protocol

class EchoClient(protocol.Protocol):
    """Once connected, send a message, then print the result."""

    def connectionMade(self):
        self.transport.write(b"hello, world!")

    def dataReceived(self, data):
        """As soon as any data is received, print it and disconnect."""
        print("Server said: %r" % (data,))
        self.transport.loseConnection()

    def connectionLost(self, reason):
        print("connection lost")

Twisted: Deferred

from twisted.internet import reactor
from twisted.internet.defer import DeferredList
from twisted.web.client import getPage

def listCallback(results):
    # results holds one (success, value) pair per Deferred in the list
    print(results)

def finish(ign):
    reactor.stop()

def test():
    d1 = getPage('http://www.google.com')
    d2 = getPage('http://yahoo.com')
    dl = DeferredList([d1, d2])
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

Twisted: Applications

• BitTorrent

• Zope3

• Launchpad

Stackless Python

• Christian Tismer

• Goal: "unlimited" recursion

• Spaghetti stack

• Goal: arbitrarily many threads

• Goal: "persistent" threads

Stackless: Implementation Strategy

• Version 1: restructuring the interpreter to avoid C recursion

• Problem: callbacks in C code (map, Tkinter)

• Version 2: copying the stack away into the heap

• Adjusting the stack pointer (setjmp, longjmp)

Stackless: API

• Tasklets (microthreads):

  • t = stackless.tasklet(aCallable)

  • t.insert, t.remove, stackless.run()

• Channels

• Scheduler

• Serialization
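
The idea of tasklets driven by a scheduler can be approximated in standard Python with generators. The sketch below does not use the stackless module; the Scheduler class only mimics the insert/run vocabulary from the slide, and each yield marks a point where a microthread voluntarily gives up control.

```python
from collections import deque


class Scheduler:
    """Round-robin scheduler for generator-based microthreads (illustrative)."""

    def __init__(self):
        self.runnable = deque()

    def insert(self, task):
        """Add a generator object to the end of the run queue."""
        self.runnable.append(task)

    def run(self):
        """Resume each task in turn until all of them have finished."""
        while self.runnable:
            task = self.runnable.popleft()
            try:
                next(task)  # run the task up to its next yield
            except StopIteration:
                continue  # task finished: do not reschedule it
            self.runnable.append(task)


def demo():
    """Interleave two microthreads and return the order of their steps."""
    log = []

    def worker(name, steps):
        for i in range(steps):
            log.append((name, i))
            yield  # cooperative switch back to the scheduler

    scheduler = Scheduler()
    scheduler.insert(worker("a", 2))
    scheduler.insert(worker("b", 2))
    scheduler.run()
    return log
```

demo() returns [("a", 0), ("b", 0), ("a", 1), ("b", 1)]. Unlike real tasklets, generators can only switch at the top level of the generator function, which is the limitation Stackless removes.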

Stackless: Applications

• Eve Online (CCP Games)

• IronPort (Cisco)

• Sylphis3D

• ...

Processes

• import subprocess

• import multiprocessing

• Web:

  • Application server?

  • Apache: mod_python, mod_wsgi, mod_fastcgi

  • CGI
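
Process-based parallelism with the subprocess module can be sketched as follows, assuming Python 3. The function parallel_squares and the one-line child program are hypothetical. All children are started before any result is read, so they run concurrently, each in its own interpreter with its own GIL.

```python
import subprocess
import sys

# Child program: square the integer passed as the first argument.
CHILD_CODE = "import sys; print(int(sys.argv[1]) ** 2)"


def parallel_squares(numbers):
    """Square each number in a separate Python process."""
    processes = [
        subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, str(n)],
            stdout=subprocess.PIPE,
            universal_newlines=True,  # decode stdout as text
        )
        for n in numbers
    ]
    results = []
    for process in processes:
        output, _ = process.communicate()  # waits for this child to exit
        results.append(int(output))
    return results
```

The multiprocessing module offers a higher-level interface on top of the same idea, with pools and queues for passing Python objects between processes.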

Futures (PEP 3148)

• Brian Quinlan (Python 3.2)

• Task Queues

• concurrent.futures.Executor.submit

  • returns a Future object (.done(), .cancel(), .result(timeout=None), .exception(), .add_done_callback(cb))

• Thread pools, process pools
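
A minimal sketch of the Future interface using a thread pool, assuming Python 3.2 or later. The function demo and the chosen tasks (pow and a deliberately failing int call) are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor


def demo():
    """Submit two tasks to a thread pool and inspect their Future objects."""
    callback_values = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        future = pool.submit(pow, 2, 10)
        # The callback receives the finished Future as its only argument.
        future.add_done_callback(lambda f: callback_values.append(f.result()))
        value = future.result(timeout=5)  # blocks until the result is ready

        failing = pool.submit(int, "not a number")
        # exception() returns the raised exception instead of re-raising it.
        error = failing.exception(timeout=5)
    # Leaving the with block waits for all submitted work to finish.
    return value, callback_values, type(error).__name__, future.done()
```

demo() returns (1024, [1024], "ValueError", True). Replacing ThreadPoolExecutor with ProcessPoolExecutor gives the process pool variant, provided the submitted callables and arguments can be pickled.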
