Proceedings of the FREENIX Track: 2003 USENIX Annual Technical Conference
Total Page:16
File Type:pdf, Size:1020Kb
USENIX Association Proceedings of the FREENIX Track: 2003 USENIX Annual Technical Conference San Antonio, Texas, USA June 9-14, 2003 THE ADVANCED COMPUTING SYSTEMS ASSOCIATION © 2003 by The USENIX Association All Rights Reserved For more information about the USENIX Association: Phone: 1 510 528 8649 FAX: 1 510 548 5738 Email: [email protected] WWW: http://www.usenix.org Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein. GNU Mailman, Internationalized Barry A. Warsaw Zope Corporation [email protected] [email protected] http://www.zope.com http://barry.warsaw.us Abstract When Mailman was first released, python.org quickly adopted it and has been using it ever since. Mailman 2.0 GNU Mailman is a mailing list manage- marked a milestone in its development, as ment system that has been in production use version 2.0.13 is quite stable, and deployed since 1998. In December 2002, a version 2.1 at thousands of sites. It runs everything was released containing many new features. from small special interest group lists to This paper will describe one of the most im- huge announcement lists, at sites ranging portant – Mailman 2.1’s internationalization from the commercial (RedHat, SourceForge, support. Presented here are the tools that Apple, Dell, SAP, and Zope Corporation), were built and the approaches Mailman took to the hacker community (XEmacs, Samba, to marking and translating text, as well a re- Gnome, KDE, Exim, and of course Python), view of some of the benefits and pitfalls of to numerous educational organizations Mailman’s solution. Also presented will be and non-profits. There are lots of host- some future directions for internationalized ing facilities providing Mailman services, Mailman, as well as other complex Python and increasingly, quite a few international applications such as Zope. organizations. One of the reasons for the increased interest from the non-English speaking world is that 1 Introduction Mailman 2.1, which had been in development for about two years, is fully internationalized. Internationalization is the process of prepar- ing an application for use in multiple locales. GNU Mailman was invented by John Viega Localization is the process of specializing the sometime before 1997, or so is indicated by application for a specific locale. For example, the earliest known archived message on the during internationalization, all end-user dis- subject. The earliest hit for “viega mail- playable text in the Mailman 2.1 source code man” in Google groups is about the Dave was specially marked as requiring translation. Matthews band mailing list that John was Mailman 2.1.1 (the latest available patch re- running [Viega97]. lease at the time of writing), has been local- ized to almost 20 natural languages. At the time, python.org [Python.Org] was running a hacked version of Majordomo for While Mailman 2.1 may appear to be only a all its special interest group (SIG) mailing minor revision over 2.0.13, it really represents lists, but this had two problems: first, the site quite an extensive rewrite. It could easily was becoming unmaintainable as the admin- have been argued that this version should be istrators tried to customize new features into called Mailman 3.0. Before describing the de- Majordomo, and second, it just wouldn’t do tails of the internationalization work, a brief to run Python’s mailing lists on a Perl-based overview of Mailman is provided, including list server. USENIX Association FREENIX Track: 2003 USENIX Annual Technical Conference 39 a quick tour of some of the other important the message. This means that various as- features in the 2.1 release. pects of the message, i.e. the header or footer, or the To field, can contain infor- mation specific to the member receiving the message. 2 What is GNU Mailman? • Extensive privacy options which allow a list administrator to select policies for subscribing and unsubscribing (open, “GNU Mailman” (informally referred to as confirmation required, or approval re- just “Mailman”), is a system for managing quired), policies for posting to the list electronic mailing lists. It is implemented pri- (open, moderated, members only, ap- marily in Python, an object-oriented, very proved posters only), and some limited high-level, open source programming lan- spam defenses. guage. Mailing lists are administered by a list owner, and users can interact with the list • Automatic bounce processing. Bouncing – including subscribing and unsubscribing – addresses are the bane of any mailing through the web and through email. Site ad- list, and Mailman provides two mech- ministrators can also interact with Mailman anisms for automatic bounce detection, via a suite of command line scripts, or even regular expression based bounce match- via the interactive Python prompt. Mailman ing and Variable Envelope Return Paths is the official mailing list manager of the GNU [VERP]. project and is available under terms of the RFC 3464 [RFC3464] and the older RFC GNU General Public License [GPL]. it replaces [RFC1894] describe a stan- dard format for bounce notifications. Mailman strives for standards compliance, However, many mail systems ignore or and as such is interoperable with a wide range incorrectly implement this standard. For of web servers and browsers, and mail servers recognizing bounce messages, Mailman and clients. Of the web servers, it requires the has an extensive set of regular expression ability to execute CGI scripts, and of mail based matches used to dig the bouncing servers it requires the ability to filter mes- address out of the notice. For fool-proof sages through programs. Apache is probably bounce detection, Mailman also supports the most widely used web server for Mail- VERP, a technique where the intended man, and any of the Big 4 mail servers (Send- recipient’s address as it appears on the mail, Postfix, Qmail, and Exim) will work mailing list is encoded into the envelope just fine. The HTML that Mailman out- sender of the message. Because remote puts is extremely pedestrian so just about mail servers are required to send bounces any web browser should work with it, as long to the envelope sender, Mailman can un- as it supports cookies. Mailman should work ambiguously decode the intended recip- with any MIME-compliant mail reader. Mail- ient’s address and register an accurate man works on any Unix-like operating sys- bounce. Note that technically, VERP tem, such as GNU/Linux. must be implemented in the mail server, but Mailman’s use of the technique is Mailman supports a wide range of features, close enough to warrant the label. such as: • Archiving. Mailman comes bundled with an archiver called Pipermail. Pipermail’s • User selectable delivery modes. Mem- chief advantages are that it comes bun- bers can elect to receive messages im- dled, that it is implemented in Python, mediately, or in batches called digests. and that in Mailman 2.1 it is interna- Two forms of digests are supported, RFC tionalized, allowing the display of mes- 1153 style plain text digests [RFC1153], sages in alternative languages and char- and MIME multipart/digest style di- acter encodings. Its primary disadvan- gests. Non-digest deliveries can be per- tages are that it doesn’t support search- sonalized specifically for the recipient of ing and isn’t very customizable. Mail- 40 FREENIX Track: 2003 USENIX Annual Technical Conference USENIX Association man is easily integrated with external 3 Internationalization Issues archivers. The new features in Mailman 2.1 are ex- • A mail to news gateway. Mailman can tensive, but the most visible addition is the be configured to gateway lists to and support for multiple natural languages. This from Usenet newsgroups. For example, means that all the administrative and pub- the comp.lang.python newsgroup is gate- licly visible web pages, all the email notifica- wayed to the [email protected] tions, and even the built-in archiver can be mailing list. Even moderated lists, such configured to produce text in any of nearly as comp.lang.python.announce can be 20 natural languages out of the box. A large gatewayed, with Mailman serving as the part of the re-architecting of Mailman for 2.1 moderation tool. has been to provide a framework for easily adding new natural languages as they become available from volunteer translation teams. • Auto-responder, content filtering, and topics. The auto-responder can be set up to send a canned message when- 3.1 Message IDs ever someone posts to the list, or emails the list owner or -request robot. Con- tent filtering allows the list owner to explicitly filter or pass specific MIME Not every string in an application needs to (Multipurpose Internet Mail Extensions be translated. For example, some strings are [RFC2045]) content types. Topics allow used as keys in dictionaries, or represent mail the list owner to assign incoming mes- headers, or contain HTML tags. To make sages to any of a configurable number of the proper distinction we refer to strings that groups, and members can “subscribe” to are intended for human readability as “text” a specific topic, receiving only the sub- or “messages”. One of the most labor in- set of list traffic that matches the desired tensive parts of internationalizing an exist- topics. ing code base such as Mailman’s is to go through every string in the software and dis- tinguish messages from ordinary strings. In • Virtual domains. Mailman can be used addition to the non-translatable strings de- on a mail server that supports multi- scribed above, the decision was made to not ple virtual domains.