
    nohup python fcgihandler.py &

Then, (re)start the web server with:

    /etc/init.d/lighttpd restart

Notice that FastCGI binds the server to a Unix socket, not to an IP socket:

    /tmp/fcgi.sock

Lighttpd forwards HTTP requests to this socket and receives the responses from it. Unix sockets are lighter than Internet sockets, and this is one of the reasons Lighttpd+FastCGI+web2py is fast. As in the case of Apache, it is possible to set up Lighttpd to serve static files directly and to force some applications over HTTPS. Refer to the Lighttpd documentation for details.

The administrative interface must be disabled when web2py runs on a shared host with FastCGI, or it will be exposed to the other users.
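To confirm that the handler is listening on a Unix socket rather than on a TCP port, a check along the following lines can help. This is a minimal sketch that uses only the Python standard library; the function name is_unix_socket is illustrative and is not part of web2py or Lighttpd.

```python
import os
import stat


def is_unix_socket(path):
    """Return True if `path` exists and is a Unix domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing file or no permission to inspect it.
        return False
    return stat.S_ISSOCK(mode)


if __name__ == "__main__":
    # /tmp/fcgi.sock is the socket path used in this section.
    print(is_unix_socket("/tmp/fcgi.sock"))
```

The check only inspects the file type; it does not verify that a process is accepting connections on the socket.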

11.8 Apache2 and mod_python in a shared hosting environment

There are times, specifically on shared hosts, when one does not have permission to edit the Apache configuration files directly. You can still run web2py. Here we show an example of how to set it up using mod_python (6):

• Place the contents of web2py into the "htdocs" folder.
• In the web2py folder, create a file "web2py_modpython.py" with the following contents:

    from mod_python import apache
    import modpythonhandler

    def handler(req):
        req.subprocess_env['PATH_INFO'] = \
            req.subprocess_env['SCRIPT_URL']
        return modpythonhandler.handler(req)

• Create/update the file ".htaccess" with the following contents:

    SetHandler python-program
    PythonHandler web2py_modpython
    ##PythonDebug On
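The handler above only copies the SCRIPT_URL entry of the request environment into PATH_INFO before delegating to web2py's own mod_python handler. The sketch below imitates that single step with a stand-in request object so that it can be run without Apache. FakeRequest and copy_script_url are hypothetical names introduced for illustration; the real req object is supplied by mod_python.

```python
class FakeRequest(object):
    """Stand-in for the mod_python request; only subprocess_env is modeled."""

    def __init__(self, env):
        self.subprocess_env = dict(env)


def copy_script_url(req):
    """Mirror the assignment made in the web2py_modpython handler."""
    req.subprocess_env['PATH_INFO'] = req.subprocess_env['SCRIPT_URL']
    return req


# Hypothetical request for the welcome application.
req = copy_script_url(FakeRequest({'SCRIPT_URL': '/welcome/default/index'}))
print(req.subprocess_env['PATH_INFO'])
```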

(6) Examples provided by Niktar.

296 DEPLOYMENT RECIPES

11.9 Setup Cherokee with FastCGI

Cherokee is a very fast web server and, like web2py, it provides an AJAX-enabled web-based interface for its configuration. Its web interface is written in Python. In addition, most changes do not require a restart. Here are the steps required to set up web2py with Cherokee:

• Download Cherokee [76]
• Untar, build, and install:

    tar -xzf cherokee-0.9.4.tar.gz
    cd cherokee-0.9.4
    ./configure --enable-fcgi && make
    make install

• Start web2py normally at least once to make sure it creates the "applications" folder.
• Write a shell script named "startweb2py.sh" with the following code:

    #!/bin/bash
    cd /var/web2py
    python /var/web2py/fcgihandler.py &

  Give the script execute privileges and run it. This starts web2py under the FastCGI handler.
• Start Cherokee and cherokee-admin:

    sudo nohup cherokee &
    sudo nohup cherokee-admin &

By default, cherokee-admin listens only on the local interface, on port 9090. This is not a problem if you have full, physical access to that machine. If this is not the case, you can force it to bind to an IP address and port by using the following options:

    -b, --bind[=IP]
    -p, --port=NUM

or do an SSH port-forward (more secure, recommended):

    ssh -L 9090:localhost:9090 remotehost

• Open "http://localhost:9090" in your browser. If everything is OK, you will get cherokee-admin.
• In the cherokee-admin web interface, click "info sources". Choose "Local Interpreter". Enter the following values, then click "Add New".


    Nick: web2py
    Connection: /tmp/fcgi.sock
    Interpreter: /var/web2py/startweb2py.sh

• Click "Virtual Servers", then click "Default".
• Click "Behavior", then, under that, click "default".
• Choose "FastCGI" instead of "List and Send" from the list box.
• At the bottom, select "web2py" as "Application Server".
• Put a check in all the checkboxes (you can leave Allow-x-sendfile). If a warning is displayed, disable and re-enable one of the checkboxes. (This should automatically re-submit the application server parameter. Sometimes it doesn’t, which is a bug.)
• Point your browser to "http://ipaddressofyoursite", and "Welcome to web2py" will appear.

11.10 Setup PostgreSQL

PostgreSQL is a free and open source database which is used in demanding production environments, for example, to store the .org domain name database, and has been proven to scale well into hundreds of terabytes of data. It has very fast and solid transaction support, and provides an autovacuum feature that frees the administrator from most database maintenance tasks.

On an Ubuntu or other Debian-based Linux distribution, it is easy to install PostgreSQL and its Python API with:

    sudo apt-get -y install postgresql
    sudo apt-get -y install python-psycopg2

It is wise to run the web server(s) and the database server on different machines. In this case, the machines running the web servers should be connected with a secure internal (physical) network, or should establish SSL tunnels to securely connect with the database server.

Start the database server with:

    sudo /etc/init.d/postgresql restart

When the PostgreSQL server restarts, it should report the port it is running on. Unless you have multiple database servers, it should be 5432.

The PostgreSQL configuration file is:


    /etc/postgresql/x.x/main/postgresql.conf

(where x.x is the version number). The PostgreSQL logs are in:

    /var/log/postgresql/

Once the database server is up and running, create a user and a database so that web2py applications can use it:

    sudo -u postgres createuser -P -s myuser
    createdb mydb
    echo 'The following databases have been created:'
    psql -l
    psql mydb

The first of the commands will grant superuser access to the new user, called myuser. It will prompt you for a password.

Any web2py application can connect to this database with the command:

    db = DAL("postgres://myuser:mypassword@localhost:5432/mydb")

where mypassword is the password you entered when prompted, and 5432 is the port where the database server is running.

Normally you use one database for each application, and multiple instances of the same application connect to the same database. It is also possible for different applications to share the same database.

For database backup details, read the PostgreSQL documentation; specifically the commands pg_dump and pg_restore.
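The connection string shown above follows the pattern postgres://user:password@host:port/database. A small helper, sketched here under the hypothetical name postgres_uri, keeps the pieces separate so that credentials can come from configuration rather than being hard-coded in the model. It only formats the string and does not import web2py.

```python
def postgres_uri(user, password, dbname, host="localhost", port=5432):
    """Format a DAL connection string in the form used in this section."""
    return "postgres://%s:%s@%s:%d/%s" % (user, password, host, port, dbname)


uri = postgres_uri("myuser", "mypassword", "mydb")
print(uri)
# In a web2py model file the result would then be passed to DAL:
#     db = DAL(uri)
```

The helper performs no escaping, so it assumes that the user name and password contain no characters such as "@", ":" or "/" that would make the string ambiguous.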

11.11 Security Issues

It is very dangerous to publicly expose the admin application and the appadmin controllers unless they run over HTTPS. Moreover, your password and credentials should never be transmitted unencrypted. This is true for web2py and any other web application.

In your applications, if they require authentication, you should make the session cookies secure with:

    session.secure()

An easy way to set up a secure production environment on a server is to first stop web2py and then remove all the parameters_*.py files from the web2py installation folder. Then start web2py without a password. This will completely disable admin and appadmin. Next, start a second Python instance accessible only from localhost:

    nohup python web2py.py -p 8001 -i 127.0.0.1 -a '' &


and create an SSH tunnel from the local machine (the one from which you wish to access the administrative interface) to the server (the one where web2py is running, example.com), using:

    ssh -L 8001:127.0.0.1:8001 username@example.com

Now you can access the administrative interface locally via the web browser at localhost:8001.

This configuration is secure because admin is not reachable when the tunnel is closed (the user is logged out).

This solution is secure on shared hosts if and only if other users do not have read access to the folder that contains web2py; otherwise users may be able to steal session cookies directly from the server.
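Before opening the browser, it can be useful to verify that the local end of the SSH tunnel described above is accepting connections. The following is a minimal sketch that uses only the standard library; port_is_open is an illustrative name, and the check says nothing about what is listening on the port.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # 8001 is the local end of the tunnel created above.
    print(port_is_open("127.0.0.1", 8001))
```

The same function can be pointed at port 9090 to check a tunnel to cherokee-admin.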

11.12 Scalability Issues

web2py is designed to be easy to deploy and to set up. This does not mean that it compromises on efficiency or scalability, but it does mean that you may need to tweak it to make it scalable.

In this section we assume multiple web2py installations behind a NAT server that provides local load-balancing. In this case, web2py works out-of-the-box if some conditions are met. In particular, all instances of each web2py application must access the same database server and must see the same files. The latter condition can be met by sharing the following folders:

    applications/myapp/sessions
    applications/myapp/errors
    applications/myapp/uploads
    applications/myapp/cache

The shared folders must support file locking. Possible solutions are ZFS (7), NFS (8), or Samba (SMB). It is possible, but not a good idea, to share the entire web2py folder or the entire applications folder, because this would cause a needless increase in network bandwidth usage.

We believe the configuration discussed above to be very scalable because it reduces the database load by moving to the shared filesystems those resources that need to be shared but do not need transactional safety (only one client at a time is supposed to access a session file, cache always needs a global lock, uploads and errors are write-once/read-many files).

Ideally, both the database and the shared storage should have RAID capability. Do not make the mistake of storing the database on the same storage as the shared folders, or you will create a new bottleneck there.

On a case-by-case basis, you may need to perform additional optimizations, and we discuss them below. In particular, we discuss how to get rid of these shared folders one by one, and how to store the associated data in the database instead. While this is possible, it is not necessarily a good solution. Nevertheless, there may be reasons to do so. One such reason is that sometimes we do not have the freedom to set up shared folders.

(7) ZFS was developed by Sun Microsystems and is the preferred choice.
(8) With NFS you may need to run the nlockmgr daemon to allow file locking.
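Because the shared folders must support file locking, it is worth probing a mounted folder before relying on it. The sketch below tries to take and release an exclusive lock on a temporary file inside the folder. It assumes a Unix host (the fcntl module is not available on Windows), and supports_locking is a hypothetical helper name. A successful result is necessary but not sufficient: it does not prove that locks are honored between different machines.

```python
import fcntl
import os
import tempfile


def supports_locking(folder):
    """Try to take and release an exclusive lock on a temp file in `folder`."""
    fd, path = tempfile.mkstemp(dir=folder)
    try:
        # Non-blocking, so an unsupported or stuck lock fails at once.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    except OSError:
        return False
    finally:
        # Always clean up the probe file.
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    # Example path from this section; adjust to the actual mount point.
    print(supports_locking("applications/myapp/sessions"))
```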

Sessions in Database

It is possible to instruct web2py to store sessions in a database instead of in the sessions folder. This has to be done for each individual web2py application, although they may all use the same database to store sessions.

Given a database connection

    db = DAL(...)

you can store the sessions in this database (db) by simply stating the following, in the same model file that establishes the connection:

    session.connect(request, response, db)

If it does not exist already, web2py creates a table in the database called web2py_session_appname containing the following fields:

    Field('locked', 'boolean', default=False),
    Field('client_ip'),
    Field('created_datetime', 'datetime', default=now),
    Field('modified_datetime', 'datetime'),
    Field('unique_key'),
    Field('session_data', 'text')

"unique_key" is a uuid key used to identify the session in the cookie. "session_data" is the cPickled session data.

To minimize database access, you should avoid storing sessions when they are not needed with:

    session.forget()

With this tweak the "sessions" folder does not need to be a shared folder because it will no longer be accessed.

Notice that, if sessions are disabled, you must not pass the session to form.accepts and you cannot use session.flash or CRUD.
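The idea behind the session table can be illustrated without web2py: the session data is pickled and stored in a row identified by a uuid key, and only that key travels in the cookie. The sketch below is not web2py's implementation. It assumes an in-memory SQLite database, keeps only the unique_key and session_data columns, and uses the hypothetical helper names save_session and load_session.

```python
import pickle
import sqlite3
import uuid

# Assumption: SQLite stands in for whatever database the DAL connects to.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE web2py_session_myapp ("
    "id INTEGER PRIMARY KEY, unique_key TEXT, session_data BLOB)"
)


def save_session(data):
    """Pickle `data`, store it, and return the key that goes in the cookie."""
    key = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO web2py_session_myapp (unique_key, session_data) "
        "VALUES (?, ?)",
        (key, pickle.dumps(data)),
    )
    return key


def load_session(key):
    """Return the stored session data, or None if the key is unknown."""
    row = conn.execute(
        "SELECT session_data FROM web2py_session_myapp WHERE unique_key = ?",
        (key,),
    ).fetchone()
    return pickle.loads(row[0]) if row else None


key = save_session({"user_id": 7})
print(load_session(key))
```

Every request that touches the session costs at least one query in this scheme, which is why the text recommends skipping session storage when it is not needed.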


Pound, a High Availability Load Balancer

If you need multiple web2py processes running on multiple machines, instead of storing sessions in the database or in cache, you have the option to use a load balancer with sticky sessions.

Pound [78] is an HTTP load balancer and reverse proxy that provides sticky sessions. By sticky sessions, we mean that once a session cookie has been issued, the load balancer will always route requests from the client associated with that session to the same server. This allows you to store the session in the local filesystem.

To use Pound:

First, install Pound, on our Ubuntu test machine:

    sudo apt-get -y install pound

Second, edit the configuration file "/etc/pound/pound.cfg" and enable Pound at startup:

    startup=1

Bind it to a socket (IP, Port):

    ListenHTTP 123.123.123.123,80

Specify the IP addresses and ports of the machines in the farm running web2py:

    UrlGroup ".*"
    BackEnd 192.168.1.1,80,1
    BackEnd 192.168.1.2,80,1
    BackEnd 192.168.1.3,80,1
    Session IP 3600
    EndGroup

The ",1" indicates the relative strength of the machines. The "Session IP 3600" line will maintain sessions by client IP for 3600 seconds.

Third, enable this config file and start Pound:

    /etc/default/pound
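The effect of sticky sessions keyed by client IP can be sketched in a few lines. This is an illustration of the routing idea only, not Pound's algorithm: it ignores the relative strength of the back ends, picks new back ends in round-robin order, and assumes that the 3600 seconds act as an idle timeout that is renewed on every request. StickyBalancer is a hypothetical class name.

```python
import itertools


class StickyBalancer(object):
    """Route each client IP to one backend until its entry expires."""

    def __init__(self, backends, ttl=3600):
        self.ttl = ttl
        self._next_backend = itertools.cycle(backends)
        self._table = {}  # client_ip -> (backend, expiry_time)

    def route(self, client_ip, now):
        """Return the backend for `client_ip` at time `now` (in seconds)."""
        entry = self._table.get(client_ip)
        if entry is None or entry[1] <= now:
            backend = next(self._next_backend)  # new or expired client
        else:
            backend = entry[0]  # still sticky
        # Assumption: the timeout is renewed on every request.
        self._table[client_ip] = (backend, now + self.ttl)
        return backend


lb = StickyBalancer(["192.168.1.1", "192.168.1.2", "192.168.1.3"])
first = lb.route("10.0.0.5", now=0)
again = lb.route("10.0.0.5", now=100)
print(first == again)
```

As long as a client keeps returning within the timeout, it reaches the same machine, which is what makes it safe to keep its session file on that machine's local disk.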

Cleanup Sessions

If you choose to keep your sessions in the filesystem, you should be aware that in a production environment they pile up fast. web2py provides a script called:

    scripts/sessions2trash.py