CanIt-PRO Administration Guide for Version 8.0.3 Roaring Penguin Software Inc. 28 February 2011

CanIt-PRO — Roaring Penguin Software Inc. Contents

1 Introduction
1.1 Principles of Operation
1.2 Handling False-Positives
1.3 Organization of this Manual
1.4 Definitions

2 Operation
2.1 Principles of Operation
2.2 Interaction between Whitelists, Blacklists and Mismatch Rules
2.2.1 RCPT TO: Actions
2.2.2 Post-DATA Actions
2.3 Streaming
2.4 How Addresses are Streamed
2.5 How Streaming Methods are Chosen
2.6 Status of Messages
2.7 Handling of Suspect Messages
2.7.1 Handling Methods
2.7.2 Secondary MX Relays
2.8 The Database
2.9 Remailing Messages

3 Streams
3.1 Introduction to Streams
3.2 The Definition of a Stream
3.3 Users and E-Mail Addresses
3.4 Mapping
3.5 The Home Stream
3.6 The “default” Stream

4 CanIt-PRO Setup
4.1 Accessing The Web Interface
4.1.1 License Key Screen
4.1.2 Login Screen
4.2 The Setup Menu
4.3 Wizards
4.3.1 Basic Setup Wizard
4.3.2 RPTN Setup Wizard
4.3.3 Dictionary Attack Detection Wizard
4.4 Verification Servers
4.4.1 Wildcard Verification Server
4.5 Mail Routing
4.5.1 Outbound Relaying
4.6 Cluster Management
4.6.1 Altering Services on a Cluster Member
4.6.2 Renaming of Cluster Members
4.7 Known Networks
4.7.1 Overlapping Networks
4.7.2 The SMTP-AUTH Pseudo-Network
4.8 Rate-Limiting Outbound Mail
4.8.1 Rate-Limiting by IP Address
4.9 Features
4.9.1 Direct Queue Injection
4.10 System Check
4.11 Templates
4.12 HTTPS
4.13 The Domain Mapping Table
4.14 The Address Mapping Table
4.14.1 Wild-Card Entries
4.15 The default Stream
4.16 Mapping Scenarios
4.16.1 Central Scanning with Opt-Out
4.16.2 Single Domain
4.16.3 Single Domain with Aliases and Mailing Lists

5 CanIt-PRO Administration
5.1 Global Settings
5.2 Real-Time DNS Blacklists
5.2.1 Entering the Master List of DNS RBLs
5.2.2 combined.bl.rptn.ca
5.3 Users
5.3.1 User Privileges
5.3.2 Adding a User
5.3.3 Editing a User
5.3.4 Deleting a User
5.3.5 Granting Access to Streams
5.4 Permitting Users to Opt In
5.5 Groups
5.5.1 Creating, Deleting and Editing Groups
5.6 Viewing Active Streams
5.6.1 Definition of an Active Stream
5.6.2 The Active Stream Display
5.6.3 Deleting a Stream
5.7 Filtering Outgoing Mail
5.8 Copying Rules from One Stream to Another
5.9 Secondary MX Hosts
5.10 Avoiding Backscatter
5.11 Test Plugins
5.11.1 The PhishingAddress Plugin
5.12 Emergency Blocking of Delivery Status Notifications
5.13 Removing All Rules and Settings from a Stream

6 External Authentication
6.1 Introduction
6.2 User Lookups
6.2.1 IMAP and POP3 Authentication
6.2.2 LDAP Authentication and Streaming
6.2.3 Program Authentication and Streaming
6.2.4 Program Authentication (Legacy Method)
6.2.5 The account-info Script
6.3 Authentication Mappings

7 Bayesian Filtering
7.1 Introduction to Bayesian Filtering
7.2 Unauthenticated Voting
7.3 The Bayes Journal
7.4 RPTN
7.5 Ruleset and Geolocation Data Updates

8 Permissions
8.1 Introduction
8.2 Stream Permissions
8.3 Determining Permissions
8.4 Granting Permissions
8.4.1 Granting Stream Permissions
8.4.2 Granting User Permissions

9 Streams, Inheritance and the Simple GUI
9.1 Simplification
9.2 Stream Inheritance
9.3 Special Streams
9.3.1 Final Streams
9.3.2 Creating Special Streams
9.3.3 Deleting Special Streams
9.4 The Simplified GUI
9.5 Inheritance from Non-Final Streams
9.6 Inheritance from Opted-Out Streams

10 Periodic Reports
10.1 Introduction
10.1.1 Periodic Reports
10.1.2 Charts
10.2 Creating Charts
10.3 Creating Periodic Reports
10.4 Editing Periodic Reports
10.5 Running a Report on Demand

11 Locked Addresses
11.1 Introduction to Locked Addresses
11.2 Preparing to use Locked Addresses
11.2.1 Create a new domain
11.2.2 Configure mail for the new domain
11.2.3 Inform CanIt-PRO about the locked address domain
11.2.4 Associate each login name with an e-mail address

12 Attachment Handling
12.1 General Filename and MIME Type Rules
12.2 Delaying Attachments
12.2.1 Enabling the Feature
12.2.2 Creating Delay Rules
12.2.3 How It Works
12.3 Stripping Attachments

13 CanIt Storage Manager
13.1 Storage Manager Concepts
13.1.1 Principles of Operation
13.2 Configuring the Storage Manager
13.2.1 Enabling the Storage Manager
13.2.2 The Configuration Wizard
13.2.3 Local Configuration
13.2.4 Starting the Storage Manager
13.2.5 Data Stored in the Storage Manager
13.3 Backup Considerations
13.4 Running multiple Storage Managers
13.5 ps Output

14 Searching Logs
14.1 Introduction
14.2 Installing the Log Indexer
14.2.1 Disk Space
14.2.2 Installing the Module
14.2.3 Configuring rsyslog.conf
14.2.4 Configuring the Log Correlator
14.2.5 Cluster Considerations
14.3 Log Basics
14.4 Searching the Logs
14.4.1 Performing a Search
14.4.2 Result List
14.4.3 Detailed Results
14.5 Advanced Search Queries

15 Tips
15.1 Greylisting
15.2 Don’t Trust Sender Addresses
15.3 Don’t Trust Sender Domains
15.4 You May Trust Relay Hosts
15.5 Custom Rules
15.5.1 General Recommendations
15.5.2 Things to avoid
15.6 Group High-Scoring Messages Together
15.7 Roaring Penguin Best-Practices
15.8 General Anti-Spam Tips
15.8.1 Use Receive-Only Addresses on your Web Site
15.8.2 Do Not Reply to Spam

16 Security
16.1 Don’t Run as Root
16.2 Ownership and Permissions
16.3 SSH
16.4 PostgreSQL Security
16.5 PHP Security
16.6 Network Security
16.7 Backups

A The Domain Configuration Wizard
A.1 Introduction
A.2 Entering the Domain Name
A.3 Configuring Streaming
A.4 Configuring Authentication
A.5 Configuring Routing and Verification
A.6 Summary

B Release Notes

C A Testing Topology for CanIt-PRO
C.1 Introduction
C.2 Assumptions
C.3 Network Setup
C.4 Build the CanIt-PRO Server
C.5 Configure the CanIt-PRO Server to Relay Mail
C.5.1 Enable Relaying
C.5.2 Configure Forwarding Relays
C.5.3 Rebuild Databases
C.6 Route Test Mail
C.6.1 Direct Injection
C.6.2 Create a Test Subdomain
C.7 Route Real Mail
C.8 Outgoing Mail

D CanIt-PRO Architecture
D.1 Introduction
D.2 CanIt-PRO Architecture
D.3 Starting and Stopping CanIt-PRO
D.4 Static Configuration Files
D.4.1 Database Settings
D.4.2 Cron Settings
D.4.3 MIMEDefang Settings
D.4.4 Filter Settings
D.4.5 Ticker Settings
D.4.6 Storage Manager Settings
D.5 Tuning CanIt-PRO
D.5.1 Memory
D.5.2 Disk
D.5.3 Solaris-Specific tmpfs Note
D.5.4 CPU
D.5.5 Sendmail
D.6 Dealing with Overload
D.6.1 Tune CanIt-PRO and Sendmail
D.6.2 Network Architecture

E CanIt-PRO HOWTOS
E.1 Restoring a Database from a Dump
E.2 Firewall Settings
E.2.1 Firewall Rules: External Hosts
E.2.2 Firewall Rules: Internal Hosts
E.2.3 Firewall Rules: Intra-Cluster Hosts
E.3 Running Something after the Nightly Cron Job Completes
E.4 Migrating CanIt-PRO to a Different Machine
E.4.1 CanIt-PRO Clusters
E.4.2 Storage Manager
E.4.3 Migration Procedure
E.5 Using canit-api-client, the CanIt-PRO Command-Line Tool
E.6 Cloning a CanIt-PRO Machine

F Using CanIt-PRO with memcached
F.1 Introduction
F.2 Using memcached
F.2.1 Installing memcached
F.2.2 Configuring memcached
F.2.3 Single vs. Multiple Caches
F.2.4 Configuring CanIt-PRO to use memcached
F.3 What is Cached

G Using CanIt-PRO with PgBouncer
G.1 Introduction
G.2 Installation
G.3 Configuration
G.3.1 Configuring userlist.txt
G.3.2 Configuring pgbouncer.ini
G.3.3 Configuring CanIt-PRO to use PgBouncer

H CanIt-PRO Logging
H.1 General Information
H.2 Event Log Format

I SNMP Agents for CanIt-PRO
I.1 Introduction
I.2 The SNMP Agent
I.2.1 Enabling the agent
I.2.2 Configuring SNMPd
I.2.3 Agent Data

J Additional Scripts
J.1 reset-password.pl

K Bayes Database Back-Ends
K.1 PostgreSQL Bayes Data Storage
K.2 Berkeley Database Bayes Storage
K.3 CDB Database Bayes Storage
K.4 Cluster Considerations
K.4.1 Propagating Updates
K.5 Switching back to PostgreSQL Bayes Storage

L System Check Tests

M The CanIt-PRO License
M.1 THE CANIT DATA LICENSE

Index

List of Figures

2.1 Flow of Mail through CanIt-PRO
2.2 RCPT TO: Decision
2.3 Post-Data Decision
2.4 Address Streaming
2.5 Database Agents

3.1 Streaming Scenarios

4.1 License Key Screen
4.2 Login Screen
4.3 Welcome Screen
4.4 Verification Servers
4.5 Domain Routing Screen
4.6 Domain Routing Detail
4.7 Cluster Management Page
4.8 Known Networks
4.9 System Check
4.10 Templates
4.11 Domain Mappings
4.12 Address Mappings

5.1 Global Settings
5.2 Master RBLs
5.3 Users
5.4 Add User
5.5 Edit User
5.6 Granting Access to Streams
5.7 Stream Opt-In Approval
5.8 Groups
5.9 Group Members
5.10 Active Streams
5.11 Copying Rules
5.12 Test Plugins
5.13 Block Delivery Status Notifications Page

6.1 User Lookup List
6.2 User Lookup Wizard
6.3 User Lookup: Method Selection
6.4 IMAP/POP3 User Lookup
6.5 LDAP User Lookup
6.6 Program User Lookup
6.7 Authentication Mappings

8.1 Permissions Page
8.2 Permissions Page
8.3 Stream Permissions Page
8.4 User Permissions Page

9.1 Stream Inheritance Terminology
9.2 Stream Inheritance Table
9.3 Special Stream Table
9.4 Simplified Interface

10.1 Periodic Reports
10.2 Add Periodic Report

12.1 Delayed Attachments
12.2 Attachment-Stripping Rules

13.1 CanIt Storage Manager
13.2 Storage Manager Configuration

14.1 Log Search Page
14.2 Log Search Results
14.3 Log Search Details

A.1 Domain Configuration: Enter Domain Name
A.2 Domain Configuration: Configuring Streaming
A.3 Domain Configuration: Configuring Authentication
A.4 Domain Configuration: Configuring Routing and Verification

C.1 Network Configurations

D.1 CanIt-PRO Architecture

Chapter 1

Introduction

CanIt-PRO is server-based anti-spam software that stops spam from entering your network. This guide explains how to administer CanIt-PRO, and is intended for e-mail administrators. For installation instructions, please see the Installation Guide, and for end-user instructions, see the User’s Guide.

1.1 Principles of Operation

CanIt-PRO uses many sophisticated rules and mechanisms to detect spam. These rules include those in an open-source anti-spam package, and are very effective and broad-spectrum. Once CanIt-PRO decides that a message is probably spam, it holds the message for review. You can configure CanIt-PRO to return an SMTP “temporary failure” code to the sending relay host for any message held for review. In this way, the message body is held in the sender’s spool directory and not in yours. A more complete description of how CanIt-PRO operates is given in Chapter 2.

1.2 Handling False-Positives

Although CanIt-PRO’s rules for identifying spam are very accurate, no purely automated process can be 100% correct. That is why CanIt-PRO relies, in the end, on human intervention. In this way, it can guarantee that no legitimate e-mail message will ever be rejected, and you will never lose an important e-mail because of automated scanning. At first glance, it seems that requiring human intervention is a step backwards—spam messages again must be reviewed by a person. In reality, CanIt-PRO still saves time and money for the following reasons:

• CanIt-PRO includes many features to lower your workload. (These features are described later in this manual.) You can scan and categorize e-mail messages using CanIt-PRO much more quickly than using mail reader software.

• As time passes, you will begin to recognize mailing-list traffic and other traffic that tends to be falsely flagged as spam, and tell CanIt-PRO to always whitelist that traffic. Over time, this reduces the amount of human intervention required.

• If you are willing to take the risk of inappropriately rejected messages, you can configure CanIt-PRO to automatically reject very high-scoring messages.

1.3 Organization of this Manual

This manual is divided as follows:

Chapter 1, “Introduction”, is this chapter. You should familiarize yourself with the terms in Section 1.4 before proceeding.

Chapter 2, “Operation”, describes the principles behind CanIt-PRO’s operation.

Chapter 3, “Streams”, describes the concepts behind streaming. You must read and understand this chapter before using CanIt-PRO in production.

Chapter 4, “CanIt-PRO Setup”, describes basic setup steps you need to take to configure CanIt-PRO.

Chapter 5, “CanIt-PRO Administration”, describes tasks undertaken by the CanIt-PRO administrator.

Chapter 6, “External Authentication”, describes how to integrate CanIt-PRO with an external authentication mechanism (such as LDAP or POP3).

Chapter 7, “Bayesian Filtering”, explains CanIt-PRO’s Bayesian filtering module. Bayesian filtering uses statistical analysis and training so that CanIt-PRO “learns” to recognize spam based on user feedback.

Chapter 8, “Permissions”, describes how to control access to various parts of the CanIt-PRO Web interface.

Chapter 9, “Streams, Inheritance and the Simple GUI”, describes how the CanIt-PRO administrator can set up different groups of spam-handling settings and allow end-users to select from one of a limited number of predetermined setups. The simplified interface is very useful if you wish to provide “canned” settings for unsophisticated users.

Chapter 11, “Locked Addresses”, describes how CanIt-PRO permits users to generate addresses that they can give out to strangers, but that those strangers cannot in turn give or sell to third-parties.

Chapter 12, “Attachment Handling”, describes CanIt-PRO options for handling various attachments.

Chapter 14, “Searching Logs”, describes CanIt-PRO’s log-indexing and searching feature (available only on appliance builds).

Chapter 15, “Tips”, contains guidelines for reducing the workload of the spam-control officer and dealing with spam more effectively.

Chapter 16, “Security”, contains information about CanIt-PRO security.

Appendix C, “A Testing Topology for CanIt-PRO”, gives tips on how to test CanIt-PRO before putting it into production. This appendix also contains useful information on production network topology, so if you are planning on using CanIt-PRO as a relay-only server, you should read this appendix.

Appendix D, “CanIt-PRO Architecture”, discusses CanIt-PRO’s filter architecture in detail. It provides tips on tuning CanIt-PRO and describes the various configuration files used by CanIt-PRO.

Appendix E, “CanIt-PRO HOWTOs”, gives short “how-to” recipes for performing common CanIt-PRO administrative tasks, such as restoring a database from the text dump, or moving CanIt-PRO to another machine. It also briefly describes the command-line tool canit-api-client.

Appendix H, “CanIt-PRO Logging”, explains how CanIt-PRO logs statistics, warning, and error messages.

Appendix J, “Additional Scripts”, describes some additional scripts bundled with CanIt-PRO that you might find useful.

1.4 Definitions

We use many terms related to Internet e-mail in this manual. Here are definitions of some of the terms we use.

Backscatter Unwanted DSNs (see “DSN”) caused when e-mail systems respond to faked sender addresses.

Bayesian Analysis is a method whereby an anti-spam system keeps track of how often words appear in spam and non-spam. Once enough statistics have been accumulated, the system can calculate the likelihood that a new message is spam.

Blacklist A list of domains, senders or hosts that are blocked from sending e-mail.

CIDR “Classless Inter-Domain Routing”. A method for specifying an entire set of contiguous IP addresses.

CanIt-Domain-PRO is an enhanced version of CanIt-PRO that allows two levels of delegation of responsibility. See the next three definitions for more details.

CanIt-PRO is an enhanced version of CanIt that allows flexible delegation of spam-control responsibilities rather than requiring a single spam-control officer.

CanIt is extra software built on top of MIMEDefang that provides sophisticated spam-management functions.

Cron A UNIX program that runs tasks periodically.

DNS “Domain Name System”. The mechanism used on the Internet to translate host names to IP addresses and more generally, to associate various sorts of information with domain names.

DSN “Delivery Status Notification”. A message generated automatically to notify senders of problems or failure to deliver an e-mail.

Daemon A long-running UNIX program that typically starts at system boot and continues running in the background until the system is shut down.

Envelope Mail messages often have headers specifying the sender (the “From:” header) and recipients (typically the “To:” header). However, SMTP has a completely separate set of commands for specifying the sender and recipients. The sender and recipients specified in the SMTP commands are referred to as the envelope sender and envelope recipients, and do not necessarily match the information in the message headers. CanIt-PRO always uses the envelope sender and recipient addresses in its rules.

Hash An algorithm that computes a short “signature” given a chunk of data. Different inputs are very likely to yield different signatures, so that a signature can be considered as a short-hand identifier for the original data.

Greylisting A technique to block spam from certain spam-sending software. It works by issuing a Temporary Failure Code the first time an e-mail arrives from an unknown sender and IP address. Legitimate SMTP servers will retry, allowing the message to be delivered. Some spam-sending software does not retry, and messages sent by such software will be blocked without any content-scanning if greylisting is enabled.

Joe-Job A technique in which spammers fake the sending address to be that of an innocent victim, who often receives DSNs (see “DSN”) and complaints.

MIMEDefang is a free (GPL’d) e-mail scanning program that integrates with Sendmail’s Milter API. It forms the basis for CanIt.

MIME “Multipurpose Internet Mail Extensions”. A set of rules for encoding different types of attachments as plain-text messages for transmission over SMTP.

Milter is a Sendmail interface that allows external programs to listen in on the SMTP dialog, and potentially modify Sendmail’s actions and SMTP responses.

Mismatch Rule A rule that restricts which hosts are allowed to send mail on behalf of a particular domain.

Permanent Failure Code Also called reject, this is a code sent to a relay host telling it that e-mail transmission has failed and will not succeed. (For example, this code is sent if someone tries to send e-mail to a nonexistent user.) The relay host typically e-mails a failure notification to the original sender and discards the message.

Phishing An attack in which someone forges e-mail pretending to be from a security organization, a bank, etc. and convinces naive users to reveal sensitive information like user-names and passwords.

PostgreSQL A free and open-source SQL database heavily used by CanIt-PRO.

RBL “Real-time Blocklist”. A DNS-based system for checking in real-time whether or not hosts or domains should be blocked.

RPTN is the Roaring Penguin Training Network. This is a system whereby multiple CanIt-PRO installations can share Bayes training data.

RSS stands for “Really Simple Syndication” and is a format for publishing “news feeds” on the Web. CanIt-PRO can produce an RSS feed showing pending incidents.

Relay Host When a mail server wishes to transmit e-mail to your server using SMTP, it establishes a connection with your mail server. The machine attempting to transmit mail to your server is called a relay host.

Root Privileges A CanIt-PRO user with root privileges can create other users and configure basic operating parameters. Also, he or she can edit other users’ preferences and stream settings.

SMTP Dialog During the course of e-mail transmission, the two ends of an SMTP connection transmit commands and results back and forth. This conversation is called the SMTP dialog.

SMTP “Simple Mail Transfer Protocol”, as described in Internet RFC 2821. This is the protocol used to transmit e-mail over the Internet.

SPF stands for “Sender Policy Framework”. It is a mechanism that allows a domain’s administrator to list which hosts are allowed to originate e-mail claiming to come from that domain. For more details, please see http://www.openspf.org.

Sender’s Domain This is the domain part (everything after the @ sign) in the sender’s e-mail address.

Sendmail A UNIX-based program for sending and receiving e-mail. Sendmail is designed to route mail from one mail server to another.

Spam Score A numerical score computed by CanIt-PRO that rates the likelihood that a message is spam.

Stream is a “virtual CanIt” machine offered by CanIt-PRO. If an incoming e-mail arrives for more than one recipient, and each recipient wishes to have his or her own private spam trap, CanIt-PRO re-mails the original message so each recipient has his or her own copy, and can dispatch it as he or she sees fit.

Syslog A UNIX program that centralizes the logging of messages from various system daemons.

Tempfail See “Temporary Failure Code”.

Temporary Failure Code Also called tempfail, this is a code sent to a relay host telling it that e-mail transmission has failed temporarily, and it should retry in a little while. Typically, the relay host retains the e-mail message in a spool directory and retries transmission periodically. The host eventually gives up after a certain period (typically, a few days) has elapsed without successful transmission.

Ticker A CanIt-PRO program that runs periodic maintenance tasks.

Whitelist A list of domains, senders or hosts whose e-mail is permitted through without spam-scanning.

Chapter 2

Operation

2.1 Principles of Operation

CanIt-PRO watches each incoming SMTP message and operates as follows. Because different recipients can have different settings, CanIt-PRO makes the following decisions at RCPT time (once the recipient is known):

• If the SMTP connection is from a blacklisted host, the RCPT command is rejected.

• If the message sender is blacklisted (or the domain is blacklisted), the RCPT command is rejected.

• Otherwise, the message is collected and scanned.

After CanIt-PRO has scanned the message, it performs the following operations:

• Messages containing dangerous files (such as viruses) are discarded or rejected, depending on which option you choose.

• If the sender, relay host or domain is whitelisted, the message is accepted without being scanned for spam.

• Many spam-detection rules are applied to the message. If the message is judged not to be spam, it is accepted and the SMTP transaction succeeds. Otherwise, CanIt-PRO may hold the message locally or tempfail the SMTP transaction, depending on the options you configure.

For messages judged to be spam, CanIt-PRO takes the following steps:

• A unique ID is calculated by running the message body through a special hash function. The hash calculation is designed to be resistant to some forms of trivial message modification.

• The ID is looked up in a database.

1. If the ID is not found in the database, it is entered as a pending message. CanIt-PRO will either hold a copy of the message locally or send a temporary failure code to the SMTP sender, depending on how CanIt-PRO has been configured.

2. If the ID is in the database with status pending, CanIt-PRO may either save a local copy or return a temporary failure code to the SMTP sender, depending on how CanIt-PRO has been configured.

3. If the ID is in the database with status spam, a permanent rejection code is sent to the SMTP sender.

4. If the ID is in the database with status not-spam, the message is accepted for delivery.
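The ID lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the in-memory dictionary standing in for the database, and the whitespace and case normalization used in place of the “special hash function” are all hypothetical, and the real hash algorithm is not described in this manual.

```python
import hashlib
import re

# The three status names come from the text; the constants are illustrative.
PENDING, SPAM, NOT_SPAM = "pending", "spam", "not-spam"


def message_id(body: str) -> str:
    """Hypothetical stand-in for the special hash function.

    Collapsing whitespace and case is only a guess at what resistance to
    trivial message modification could look like.
    """
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def handle_suspect(body: str, db: dict, hold_locally: bool) -> str:
    """Return the action for a message already judged to be spam."""
    mid = message_id(body)
    status = db.get(mid)
    if status is None:
        db[mid] = PENDING            # step 1: record a new pending message
        status = PENDING
    if status == PENDING:            # steps 1 and 2: hold locally or tempfail
        return "hold-locally" if hold_locally else "tempfail"
    if status == SPAM:               # step 3: permanent rejection
        return "reject"
    return "accept"                  # step 4: status is not-spam
```

Because the status is keyed on the message ID rather than on the SMTP transaction, a retry of the same message after a temporary failure finds the decision that was made in the meantime.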

The flow of mail through CanIt-PRO is summarized in Figure 2.1. Note that this is the conceptual flow; in reality, several optimizations are performed that would only complicate the figure. See also Figures 2.2 and 2.3 for more accurate details about blacklisting and whitelisting.

[Flowchart. At the RCPT command: Blacklisted? Y: reject RCPT. N: accept RCPT and proceed to DATA. At end of DATA: Virus? Y: discard message. N: Whitelisted? Y: deliver message. N: Looks like spam? Y: hold message. N: deliver message.]

Figure 2.1: Flow of Mail through CanIt-PRO

2.2 Interaction between Whitelists, Blacklists and Mismatch Rules

CanIt-PRO must prioritize whitelists and blacklists. For example, suppose a sender is whitelisted, but the host the message comes from is blacklisted. What should CanIt-PRO do?

2.2.1 RCPT TO: Actions

At the SMTP RCPT TO: command, CanIt-PRO examines the envelope sender and SMTP relay address, and makes decisions according to Figure 2.2.

[Flowchart with the decision boxes Invalid Recipient?, Relay Blacklisted?, Sender Blacklisted?, Relay Whitelisted?, Sender Whitelisted?, Relay on Reject RBL?, Domain Blacklisted?, “Reject” Mismatch? and Domain Whitelisted?. Each Y branch ends in REJECT or ALLOW, and the final N branch ends in ALLOW.]

Figure 2.2: RCPT TO: Decision

Here are the steps illustrated in Figure 2.2. They determine the response to the RCPT TO: command. The first rule that matches returns the result; subsequent rules are not tested.

1. If the recipient is blacklisted, the command is rejected. Blacklisted recipients can never receive e-mail.

2. If the recipient has opted out of spam-scanning, the command is accepted.

3. If the sender address is blacklisted, reject the command with an SMTP failure code.

4. If the sender address is whitelisted, accept the command. (That is, permit the SMTP transaction to continue. The message may be rejected later for other reasons.)

5. If the domain of the sender is blacklisted, reject the command.

6. If the domain of the sender is whitelisted, accept the command.

7. If the sending relay’s IP address is blacklisted, reject the command.

8. If the sending relay’s IP address is whitelisted, accept the command.

9. If the sending relay is on a real-time blacklist for rejection, then reject the command.

10. If a mismatch rule is triggered based on the sender and the host name, and the action for the mismatch rule is “reject”, then reject the command.

11. Otherwise, accept the command.
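As an illustration, the numbered steps above can be read as a first-match rule chain. The Python sketch below follows the order of the numbered list; the class, the field names and the way a mismatch rule is modelled (a required host-name suffix per sender domain) are hypothetical and are not taken from CanIt-PRO.

```python
from dataclasses import dataclass, field


@dataclass
class StreamRules:
    """Hypothetical per-stream rule data; all names are illustrative."""
    recipient_blacklist: set = field(default_factory=set)
    opted_out: set = field(default_factory=set)
    sender_blacklist: set = field(default_factory=set)
    sender_whitelist: set = field(default_factory=set)
    domain_blacklist: set = field(default_factory=set)
    domain_whitelist: set = field(default_factory=set)
    relay_blacklist: set = field(default_factory=set)
    relay_whitelist: set = field(default_factory=set)
    reject_rbl_hits: set = field(default_factory=set)    # relay IPs on a "reject" RBL
    reject_mismatch: dict = field(default_factory=dict)  # domain -> required host suffix


def rcpt_decision(rules, recipient, sender, relay_ip, relay_host):
    """Return "accept" or "reject" for one RCPT TO: command."""
    domain = sender.rpartition("@")[2].lower()
    required = rules.reject_mismatch.get(domain)
    mismatch = required is not None and not relay_host.lower().endswith(required)
    checks = [
        (recipient in rules.recipient_blacklist, "reject"),  # step 1
        (recipient in rules.opted_out, "accept"),            # step 2
        (sender in rules.sender_blacklist, "reject"),        # step 3
        (sender in rules.sender_whitelist, "accept"),        # step 4
        (domain in rules.domain_blacklist, "reject"),        # step 5
        (domain in rules.domain_whitelist, "accept"),        # step 6
        (relay_ip in rules.relay_blacklist, "reject"),       # step 7
        (relay_ip in rules.relay_whitelist, "accept"),       # step 8
        (relay_ip in rules.reject_rbl_hits, "reject"),       # step 9
        (mismatch, "reject"),                                # step 10
    ]
    for matched, action in checks:
        if matched:          # the first rule that matches returns the result
            return action
    return "accept"          # step 11
```

With this ordering, a whitelisted sender arriving from a blacklisted relay is accepted at RCPT time, because the sender rules are tested before the relay rules.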

2.2.2 Post-DATA Actions

After the SMTP “DATA” command has transmitted the entire message, CanIt-PRO has enough information to determine a spam score. At this point, it makes decisions according to Figure 2.3.

[Flowchart. Decision boxes and their Y outcomes: Virus Found? (virus handling); Bad MIME type or Extension? (bad attachment handling); Sender Whitelisted? (accept message); Sender Blacklisted? (reject message); Sender “Hold”? (hold in trap); Domain Whitelisted? (accept message); Domain Blacklisted? (reject message); Domain “Hold”? (hold in trap); Relay Whitelisted? (accept message); Relay Blacklisted? (reject message); Relay “Hold”? (hold in trap); “Hold” Mismatch? (hold in trap); “Hold” RBL Rule? (hold in trap); High Spam Score? (hold, tag or reject). If no box matches, the message is accepted.]

Figure 2.3: Post-Data Decision

Here are the steps illustrated in Figure 2.3. They determine the response to the DATA command. The first rule which matches returns the result; subsequent rules are not tested. (There is one exception: If a “Hold Sender”, “Hold Domain” or “Hold Relay” rule is hit, but the message scores over the auto-reject threshold, the message is rejected rather than held for review.) When a message is “held in the trap”, an SMTP temporary-failure code may be issued, or the message may be queued locally, depending on your global settings. When a message is “rejected”, the sending relay receives an SMTP failure code. If the message being rejected was queued locally, it is simply discarded. When a message is “accepted”, it is delivered, and removed from the local queue if it was queued locally.

1. If a virus was found in the message, then the action depends on the virus-handling setting. Here’s what happens for the various settings: • Hold – the message is held in the trap. • Reject – the message is rejected with an SMTP failure code. • Discard – the message is discarded. An SMTP success code is returned. • Accept – processing continues to step (2) below. 2. If a bad MIME part or filename extension was found, then if the bad part has a “Reject” setting, the message is rejected. Otherwise, the message is held in the trap.

3. If the user has opted out of spam-scanning, the message is accepted.

4. If the sender is whitelisted, the message is accepted.

5. If the sender is blacklisted, the message is rejected. It may seem superfluous to check for a blacklist here, given that the blacklist was checked during the RCPT command. However, by the DATA command, we have the From: header, and CanIt-PRO applies sender checks to the From: header address also.

6. If the sender has a “Hold” setting, the message is held in the trap. However, if it scores over the auto-reject threshold, it will be rejected.

7. If the domain is whitelisted, the message is accepted.

8. If the domain is blacklisted, the message is rejected. Again, at this point, CanIt-PRO can make use of the From: header address.

9. If the domain has a “Hold” setting, the message is held in the trap. However, if it scores over the auto-reject threshold, it will be rejected.

10. If the relay is whitelisted, the message is accepted.

11. If the relay has a “Hold” setting, the message is held in the trap. However, if it scores over the auto-reject threshold, it will be rejected.

12. If a relay “Hold” mismatch rule applies, the message is held in the trap.

13. If the relay is on a “Hold” real-time DNS blacklist, the message is held in the trap.

14. If CanIt-PRO is in “Tag Only” mode, the message is tagged (if it looks like spam) and accepted.

15. If the spam score is equal to or above the auto-reject threshold, the message is rejected. Otherwise, if the spam score is equal to or above the spam threshold, the message is held in the trap.

16. Otherwise, the message is accepted.
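The first-match ordering of steps 1 to 16 can be sketched as a short function. This is an illustrative model only, not CanIt-PRO's code: the Message class, its field names, the result strings and the two threshold values are all hypothetical. It reads “over the auto-reject threshold” in the Hold exception as strictly greater, and it assumes that “looks like spam” in Tag Only mode means a score at or above the spam threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    """Hypothetical summary of what the filter knows after DATA."""
    virus_found: bool = False
    virus_handling: str = "hold"         # "hold", "reject", "discard" or "accept"
    bad_part: bool = False               # bad MIME type or filename extension
    bad_part_reject: bool = False        # True if that part has a "Reject" setting
    opted_out: bool = False
    sender_rule: Optional[str] = None    # "whitelist", "blacklist", "hold" or None
    domain_rule: Optional[str] = None    # same values as sender_rule
    relay_rule: Optional[str] = None     # "whitelist", "hold" or None
    relay_mismatch_hold: bool = False
    relay_on_hold_rbl: bool = False
    tag_only: bool = False
    score: float = 0.0
    spam_threshold: float = 5.0          # hypothetical value
    auto_reject_threshold: float = 20.0  # hypothetical value


def post_data_decision(m: Message) -> str:
    """Return the outcome of the first matching rule."""

    def held() -> str:
        # Exception from the text: a Hold rule becomes a rejection when the
        # score is over the auto-reject threshold ("over" read as >).
        return "reject" if m.score > m.auto_reject_threshold else "hold"

    if m.virus_found and m.virus_handling != "accept":        # step 1
        return m.virus_handling
    if m.bad_part:                                            # step 2
        return "reject" if m.bad_part_reject else "hold"
    if m.opted_out:                                           # step 3
        return "accept"
    for rule in (m.sender_rule, m.domain_rule):               # steps 4-9
        if rule == "whitelist":
            return "accept"
        if rule == "blacklist":
            return "reject"
        if rule == "hold":
            return held()
    if m.relay_rule == "whitelist":                           # step 10
        return "accept"
    if m.relay_rule == "hold":                                # step 11
        return held()
    if m.relay_mismatch_hold or m.relay_on_hold_rbl:          # steps 12-13
        return "hold"
    if m.tag_only:                                            # step 14
        return "tag-and-accept" if m.score >= m.spam_threshold else "accept"
    if m.score >= m.auto_reject_threshold:                    # step 15
        return "reject"
    if m.score >= m.spam_threshold:
        return "hold"
    return "accept"                                           # step 16
```

Because the function returns at the first matching rule, a whitelisted sender is accepted no matter how high the message scores, which is the behavior the numbered steps describe.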

2.3 Streaming

Because CanIt-PRO allows different recipients to have different spam-processing rules, an incoming message for more than one recipient must be streamed. The diagram in Figure 2.1 shows what happens to messages after they have been streamed. If an incoming message arrives for more than one stream, copies are re-mailed to recipients in each stream, and the original message is discarded. Then, each re-mailed message follows the flow in Figure 2.1, with some minor differences that will be explained later. In Figure 2.1, all of the blacklisting and whitelisting decisions are unique to a stream. It is perfectly feasible for one stream to whitelist a sender, a second stream to blacklist it, and a third stream to do neither.

Messages that are streamed and re-mailed are not held by issuing a temporary-failure code, because they would then reside in your own mail spool and waste resources during repeated sending attempts (until they are approved or rejected.) Instead, held messages are stored in the database, and re-mailed if approved or discarded if rejected.

2.4 How Addresses are Streamed

CanIt-PRO can map e-mail addresses to streams using the following techniques:

Database CanIt-PRO maintains a table of address-to-stream mappings in the Address Mapping Table. If you choose the Database technique, then this table is consulted to perform the mapping. You hand-enter the mappings between addresses and streams. In addition, the Database technique allows a “wildcard” lookup if the original lookup does not exist.

AsIs This method simply uses the entire e-mail address as the stream name, after stripping angle brackets and converting to lower-case. Therefore, [email protected] gets mapped to [email protected].

ChopDomain This method simply chops the domain part off the e-mail address. Therefore, [email protected] gets mapped to xzyyz.

ChopUser This method chops the user part off the e-mail address. Therefore, [email protected] gets mapped to example.com.

Program This method runs the account-info program to determine the stream. Please see Section 6.2.4 on page 95 for details.

Sendmail This method invokes Sendmail (it assumes that the recipient is local on the CanIt-PRO machine.) This method is deprecated and should no longer be used; it was intended for installations that run CanIt-PRO on their actual final mail server (something we no longer recommend.)

Note: No matter what stream method you choose, an exact-match database lookup is always done first. This lets you override the mapping for special cases. For example, if you host only a single domain, then the ChopDomain method is probably fine for most addresses. However, if you also host mailing lists, you may want to stream spam for the lists to the mailing list owners. In that case, you can add a special mapping from [email protected] to joe-owner (where joe-owner is the person responsible for list-name).

Because the Program method is somewhat inefficient, CanIt-PRO caches results in the database table. This improves efficiency while retaining flexibility. By default, cached entries are valid for 24 hours, but you can adjust the timeout.
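The three purely textual methods (AsIs, ChopDomain, ChopUser) and the exact-match override can be illustrated with a few lines of Python. This is a sketch under stated assumptions, not the product's implementation: the function names and the sample addresses are hypothetical, and applying the bracket-stripping and lower-casing to every method (rather than to AsIs only, which is all the text states) is an assumption.

```python
def normalize(address: str) -> str:
    """Strip surrounding angle brackets and convert to lower-case."""
    return address.strip().strip("<>").lower()


def stream_for(address: str, method: str, overrides=None) -> str:
    """Map an address to a stream name using a textual streaming method.

    `overrides` stands in for exact-match entries in the Address Mapping
    Table; it is consulted first, whatever the method.
    """
    addr = normalize(address)  # assumption: normalization applies to all methods
    if overrides and addr in overrides:
        return overrides[addr]
    user, _, domain = addr.rpartition("@")
    if method == "AsIs":
        return addr
    if method == "ChopDomain":
        return user
    if method == "ChopUser":
        return domain
    raise ValueError(f"method not covered by this sketch: {method!r}")
```

For instance, with the hypothetical override table {"list-name@example.com": "joe-owner"}, stream_for("list-name@example.com", "ChopDomain", overrides) returns joe-owner instead of list-name, which is the mailing-list case described in the note.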

2.5 How Streaming Methods are Chosen

Each domain can be streamed using its own method. To select a streaming method, CanIt-PRO first looks up the domain in the Domain Mapping Table. This table holds a list of streaming methods for each domain. If the lookup fails, CanIt-PRO looks up the wildcard entry “*” in the Domain Mapping Table and uses that method to stream the address. Figure 2.4 illustrates how addresses are streamed.

[Flowchart. Nodes, in order: look up the full address in the Address Mapping Table; look up "domain.tld", then "*", in the Domain Mapping Table, defaulting to method "Database"; ChopDomain or AsIs (stream = "user" or the full address); Sendmail (run sendmail -bv to determine the local user); Program (run the account-info script); LDAP (look up the stream in the LDAP directory); then Address Mapping Table lookups of "*@domain.tld", "user@*" and "*". Sendmail and Program results are cached in the Address Mapping Table.]

Figure 2.4: Address Streaming

Figure 2.4 looks complicated, but the streaming process is very flexible, and actually quite simple. Here is a description of the figure, with some more details that would crowd the figure too much.

1. For an incoming message to [email protected], CanIt-PRO first looks up example.com in the Domain Mapping Table. If that lookup succeeds, CanIt-PRO will have a method (ChopDomain, ChopUser, Sendmail, Program or Database), and CanIt-PRO proceeds to Step 4.

2. If the lookup fails, the leading component of the domain name is dropped (i.e., “subdomain.example.com” becomes “example.com”) and we retry Step 1 with the shorter name.

3. If lookups on all domain components fail, CanIt-PRO looks up * in the Domain Mapping Table. This allows you to set a default streaming method for all domains. If that lookup fails, the method defaults to Database.

4. Regardless of the method chosen, CanIt-PRO looks up [email protected] in the Address Mapping Table. If an exact match is found (and it is not expired if it is a cached entry), the result of that lookup is used as the stream.

5. Otherwise, CanIt-PRO determines the stream as follows:

• If the method is ChopDomain, the @example.com part is deleted, and the stream becomes address.
• If the method is ChopUser, the address@ part is deleted, and the stream becomes example.com.
• If the method is AsIs, the entire e-mail address is used as the stream name.
• If the method is Sendmail, CanIt-PRO runs “sendmail -bv” with [email protected] as an argument. If that address resolves to a single local mailbox, that local mailbox name is used as the stream.
• If the method is Program, CanIt-PRO runs the account-info program as described in Section 6.2.4.

If the stream determination succeeded (ChopDomain and ChopUser always succeed; Sendmail fails if the address does not resolve to a single local mailbox and Program fails if the program produces no output), then the stream is returned. In the case of Sendmail and Program, the stream is also cached in the database.

6. If the previous step failed to determine a stream, or the method was set to Database, CanIt-PRO looks up user@* in the Address Mapping Table. If that fails, it looks up *@example.com. This allows you to map all addresses in a particular domain to a stream. If that fails, as a last resort, CanIt-PRO looks up * in the Address Mapping Table. If that final lookup fails, then a special stream named default is used.
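Two parts of this procedure lend themselves to a short sketch: the Domain Mapping Table lookup that drops leading domain components (steps 1 to 3), and the wildcard fallback in the Address Mapping Table (step 6). The code is a hedged illustration, with both tables modeled as plain dictionaries and all function names hypothetical. The wildcard order used here (user@*, then *@domain, then *) follows step 6 as written in the text.

```python
def find_method(domain: str, domain_table: dict) -> str:
    """Steps 1-3: try the domain, then each shorter parent, then '*'."""
    parts = domain.split(".")
    while parts:
        candidate = ".".join(parts)
        if candidate in domain_table:
            return domain_table[candidate]
        parts = parts[1:]  # drop the leading component and retry
    # No component matched: use the wildcard entry, else default to Database.
    return domain_table.get("*", "Database")


def database_fallback(address: str, address_table: dict) -> str:
    """Step 6: wildcard lookups, ending with the stream named 'default'."""
    user, _, domain = address.rpartition("@")
    for key in (f"{user}@*", f"*@{domain}", "*"):
        if key in address_table:
            return address_table[key]
    return "default"
```

With a hypothetical Domain Mapping Table of {"example.com": "ChopDomain"}, find_method("subdomain.example.com", ...) fails on the full name, retries with example.com and returns ChopDomain.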

2.6 Status of Messages

Every message in the database has one of three statuses. The status names and their meanings are:

pending Messages enter pending state when they arrive, and remain there until they are marked as spam or nonspam. These messages are displayed in the Web-based “Pending Messages” list.

spam The spam-control officer can mark a message as spam. If a message marked as spam is received, a rejection notice is sent to the sending mail server, and the message is not delivered.

not-spam The spam-control officer can mark a message as not-spam. If a message marked as not-spam is received, it is delivered as usual.

2.7 Handling of Suspect Messages

As discussed earlier, CanIt-PRO may be configured to issue an SMTP temporary failure response if a message is held because it is suspected of being spam. This response ensures that the message remains in the sender’s queue. The sender will retry transmission periodically, until one of three things happens:

• The message is marked as spam. On the next transmission attempt, it will be rejected with a permanent failure response.

• The message is marked as not-spam. On the next transmission attempt, it will be accepted and delivered.

• The sending relay times out and bounces the message. Most relays retry transmissions for at least 3 days, so this will not happen unless you do not check the spam trap often enough.

2.7.1 Handling Methods

While keeping the message in the sender’s queue is useful, it does mean that your CanIt-PRO installation relies on the sending server to retransmit. It also may consume excessive bandwidth on a busy site. Therefore, CanIt-PRO has three options for handling suspicious messages:

1. The default handling, Until-Dispatched, always replies with a temporary failure indication until the CanIt-PRO operator marks a message as spam or not-spam.

2. The First-Time handling replies with a temporary failure indication the first time a suspicious message is received. A lot of spamming software ignores error returns and will never retransmit the message. Failing it the first time, therefore, stops a lot of spam without human intervention. If the message is transmitted a second time, however, it is accepted and held in the CanIt-PRO database. If the operator marks the message spam, it is simply deleted from the database. If the message is marked not-spam, CanIt-PRO re-mails it to the original recipient before deleting it from the database.

3. The Never handling never replies with a temporary failure indication. Suspicious messages are always accepted and then held in CanIt-PRO’s database. Incoming messages immediately move to the pending state.

Please note that holding messages locally may greatly increase the disk space used by CanIt-PRO. Be sure to leave enough disk space to handle all messages you anticipate will be held locally.
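The difference between the three handling methods comes down to what the sending relay sees on each transmission attempt. The following sketch models that as a single function. It is illustrative only: the function name, the result strings and the attempt counter are hypothetical, and for First-Time and Never it assumes that an operator verdict acts on the copy stored in the database rather than on a later SMTP attempt.

```python
from typing import Optional


def on_transmission(handling: str, attempt: int,
                    verdict: Optional[str] = None) -> str:
    """Model the response to one transmission of a suspect message.

    handling: "Until-Dispatched", "First-Time" or "Never"
    attempt:  1 for the first transmission, 2 for the first retry, ...
    verdict:  None while pending, else "spam" or "not-spam"
    """
    if handling == "Until-Dispatched":
        # Tempfail until the operator has marked the message.
        if verdict == "spam":
            return "reject"
        if verdict == "not-spam":
            return "deliver"
        return "tempfail"
    if handling == "First-Time":
        # Fail the first attempt; accept and store any retransmission.
        return "tempfail" if attempt == 1 else "accept-and-hold"
    if handling == "Never":
        # Never tempfail: always accept and store in the database.
        return "accept-and-hold"
    raise ValueError(f"unknown handling method: {handling!r}")
```

Only the "accept-and-hold" outcome stores a message body locally, which is why First-Time and Never are the methods that affect disk usage.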

2.7.2 Secondary MX Relays

Many organizations have secondary MX hosts that queue mail if the primary host is down. They then relay the queued mail when the primary MX host comes back up. Ideally, CanIt-PRO should run on all of your MX hosts. However, if it can only run on your primary MX host, then all other MX hosts should relay to the CanIt-PRO machine. You should then tell CanIt-PRO the IP addresses of the secondary MX hosts via the “Known Networks” facility so that CanIt-PRO can use the Never Tempfail handling for messages from those hosts. (There is no point in keeping mail queued and retransmitted on your secondary MX hosts; it’s better to accept and hold the message on the CanIt-PRO machine.)

2.8 The Database

The incident database is key to the correct operation of CanIt-PRO. Three different agents operate on the database as shown in Figure 2.5:

[Diagram: the CanIt Filter, the Web-Based GUI and Periodic Jobs each connect to the Incidents Database.]

Figure 2.5: Database Agents

The agents operating on the database are:

• The CanIt-PRO Filter – This is the portion of CanIt-PRO that integrates with Sendmail and disposes of spam messages.

• The Web-Based GUI – This is used by users or administrators to mark messages as spam or legitimate. The Web-Based GUI also lets you monitor the levels of spam and take action against specific senders, domains or relay hosts.

• Periodic Jobs – These housekeeping jobs perform operations like moving expired pending messages into spam status and purging very old messages from the database. Periodic jobs may be started from one of two places:

1. The /usr/share/canit/scripts/canit.cron script, which should be run once a night.
2. As part of the operation of the CanIt-PRO daemon (canitd). Canitd is a daemon that starts on bootup and runs continuously, performing background maintenance tasks.

2.9 Remailing Messages

On occasion, CanIt-PRO will be forced to remail a message after discarding the original. The following scenarios cause remailing:

1. If a message comes in for recipients in more than one stream, CanIt-PRO generates one new copy for each stream and mails out the copies. The original message is then discarded. You may see a message in the log file indicating that the message has been discarded; don’t panic. The copies are safely queued.

2. If a Pending message is held in the database and subsequently approved for release, CanIt-PRO fetches the message body from the database and remails it. This always takes place on the designated ticker host, no matter which host processed the original message.

In all cases when CanIt-PRO remails a message, the message goes into Sendmail’s submission queue (most likely in the queue directory /var/spool/clientmqueue or /var/spool/mqueue-client). The message is only processed on the next run of the submission queue. For this reason, you should keep the submission queue interval short (on the order of a minute or two).

On CanIt-PRO appliances, the submission interval is automatically configured for you. On other platforms, consult your system’s documentation for details on how to shorten Sendmail’s submission queue interval.

Chapter 3

Streams

3.1 Introduction to Streams

The stream is a central concept in CanIt-PRO. Understanding streams is essential to understanding CanIt-PRO. Please be sure to read this chapter before configuring a production CanIt-PRO server.

3.2 The Definition of a Stream

A stream is a collection of rules and policies. Each stream in CanIt-PRO can have its own rules, settings, thresholds and policies. Associated with each stream is a trap. A trap consists of messages that have been held based on the stream’s settings. For example, a message can be held because of its spam score, or because it contains a suspicious MIME type.

3.3 Users and E-Mail Addresses

Under many circumstances, a single e-mail address corresponds to a single user. For example, the e-mail address [email protected] corresponds to the single user dfs. However, most mail setups are more complicated than this.

The first complication comes from aliases. For example, the user dfs may have, in addition to his normal e-mail address, aliases like [email protected] and [email protected]. We would most likely want the same settings and policies to apply to all three aliases.

Another complication comes from list addresses. For example, the e-mail address [email protected] does not correspond to any particular user. Instead, it is a list alias that expands to several users. It might make sense to have a different set of policies for sales than for real users, or it might make sense to assign the policies used by one of the recipients on the sales list.

As we see above, the mapping between users and e-mail addresses is not simple. A single e-mail address may result in delivery to several users (the sales example), or a single user may have several e-mail addresses that all deliver to the same place (the aliases example).

Streams were created to give you the flexibility of assigning policies. They act as an intermediate container between e-mail addresses and actual users, and let you assign policies any way you choose. As an example, consider Figure 3.1:

[Diagram with two panels, (a) and (b). Each panel has three columns (E-Mail Address, Stream, User-ID) and shows the streams dfs and paul and the user-IDs dfs and paul; panel (b) adds a sales stream. Both panels are described in the text after the caption.]

Figure 3.1: Streaming Scenarios

We assume that there are two users, dfs and paul. We assume that dfs has the three aliases shown, and that the sales address actually gets delivered to both dfs and paul.

In Figure 3.1(a), all mail for dfs’s aliases goes into the dfs stream. Mail for paul goes into the paul stream. Furthermore, mail for sales also goes into paul. Although mail for sales is delivered to two users, all of the settings and policies are controlled by the paul stream, and paul is responsible for clearing the trap.

In Figure 3.1(b), sales has its own stream. It can thus have different settings and rules from either paul or dfs. Furthermore, both paul and dfs are given access to the stream, so either of those users can adjust the settings and check the trap for sales.

3.4 Mapping

When e-mail comes in, each recipient address is mapped to a stream. We call this process address mapping. Once the stream is determined, CanIt-PRO knows which settings and rules to apply for that recipient. The process by which CanIt-PRO maps addresses to streams is illustrated in Figure 2.4 on page 31. An e-mail address is mapped to a stream in a two-step process:

1. The domain part of the address (everything after the “@” sign) is looked up in the Domain Mapping Table. This lookup results in a method by which to map the address to a stream.
2. Once the method has been determined, then the address is mapped to a stream using the appropriate method. Details are in Section 4.13 on page 59.

3.5 The Home Stream

When a user logs in to the Web interface, CanIt-PRO must associate a stream with the user name. By default, CanIt-PRO chooses a stream with the same name as the user’s login; this is called the home stream. For example, the user dfs would automatically be sent to the stream dfs upon login. However, it is possible to give users access to additional streams, and to change the default login stream. Also, it is possible to change the user’s home stream with the account-info script (Section 6.2.4).

Note: Stream names are case-sensitive. Thus, a stream called dfs is completely separate from a stream called DFS.

3.6 The “default” Stream

CanIt-PRO treats the stream named default specially in several ways:

• When the database initialization script runs, it sets the login stream for the CanIt-PRO administrator to default.
• If a stream mapping cannot be found for an address, the address is mapped to default.
• Any blacklists, whitelists and rules defined in the default stream are inherited by all other streams. (However, stream owners can turn this inheritance off if they wish.)

Chapter 4

CanIt-PRO Setup

4.1 Accessing The Web Interface

Using your Web browser, open the URL where you installed the CanIt-PRO web pages. For example, if your server is mailserver.mydomain.com and you installed the GUI in the directory canit under your Apache document root, the URL to open would be:

http://mailserver.mydomain.com/canit/

(By default, our binary packages and our Debian-based appliances put the web pages at http://machine.yourdomain.net/canit/)

4.1.1 License Key Screen

The very first time you log in, you will see the License Key Screen (Figure 4.1):

Figure 4.1: License Key Screen

Enter or cut-and-paste your license key into the entry box and click Submit Key. The license key includes all the text starting from License and continuing to the end of the string of letters and numbers after Check=.

4.1.2 Login Screen

Once the license key has been entered, navigating to the CanIt-PRO URL reveals the Login Screen (Figure 4.2):

Figure 4.2: Login Screen

Log in using the name and password you selected when you initialized the CanIt-PRO database. (See Section J.1 on page 227 if you’ve forgotten the password.) In the Installation Guide example, we used “admin” and “secret”. If you have a CanIt-PRO appliance, the defaults are “admin” and “canit”. (Naturally, you should change the password before connecting your CanIt-PRO appliance to the Internet!)

Normally, CanIt-PRO will set a session cookie in your browser. This means that if you close your browser, your session will automatically end. If you want CanIt-PRO to remember your session even if you close the browser, enable the “Remember Me” checkbox. This puts a cookie that lasts longer (by default, 7 days) on your computer. Do not use the “Remember Me” option on a public computer; you should only use it on a workstation to which you alone have access.

Once logged in, you should see the CanIt-PRO welcome screen:

Figure 4.3: Welcome Screen

4.2 The Setup Menu

The Setup main menu entry contains sub-entries for various parts of basic CanIt-PRO setup. Under the Setup menu, you will find:

• Wizards – a collection of tools for easily configuring certain common scenarios.
• License Key – a page to enter your CanIt-PRO license key.
• Verification Servers – a table allowing you to check recipients against internal servers before CanIt-PRO will accept them.
• Known Networks – a table allowing you to change aspects of CanIt-PRO behavior for mail originating from certain known networks.
• Features – a page allowing you to turn off certain CanIt-PRO functionality to improve performance.
• System Check – a page that performs a few simple “sanity checks” on your CanIt-PRO system.
• Templates – a page for configuring templates that control how CanIt-PRO appends Bayesian voting information to e-mail and the format of Pending Message Notifications.
• Domain Routing – a page for configuring e-mail routing. Please note that this link is available only on Debian-based appliances or on RPM installations with the appliance RPMs installed.
• HTTPS – a page for configuring HTTPS. Please note that this link is available only on Debian-based appliances. (It is not available on RPM or source installations.)
• Cluster Management – a page for viewing and managing cluster members.
• Domain Mappings and Address Mappings – two tables that tell CanIt-PRO how to convert an e-mail address to a stream.
• Authentication Mappings and User Lookups – pages for integrating CanIt-PRO with external directories or authentication mechanisms. These are fully described in Chapter 6.

4.3 Wizards

The Wizards menu item allows you to ease CanIt-PRO setup by using a wizard to speed through choosing some basic settings. The available wizards are shown on the Setup page. The wizards are self-documenting and guide you through the steps required to configure CanIt-PRO. However, the following wizards are important enough to warrant mention:

4.3.1 Basic Setup Wizard

The Basic Setup Wizard helps you set some basic settings essential to the operation of CanIt-PRO. On a new CanIt-PRO installation, you should follow the steps in this wizard to set some basic settings to sensible values. It is important not to operate CanIt-PRO until you have worked through the Basic Setup Wizard.

4.3.2 RPTN Setup Wizard

The RPTN Setup Wizard configures RPTN, the Roaring Penguin Training Network. (RPTN is a mechanism for sharing Bayes data to increase scanning accuracy. See Section 7.4 on page 98 for details.)

4.3.3 Dictionary Attack Detection Wizard

Note: Dictionary Attack Detection works only on Linux.

A Dictionary Attack is an attack whereby an attacker tries to send mail to hundreds or thousands of different e-mail addresses within a domain in the hopes of discovering some valid addresses. CanIt-PRO (on Linux only) can react to dictionary attacks by blocking them using kernel firewall rules. To enable dictionary-attack detection:

1. Click on Setup : Wizards and then Dictionary Attack Detection Wizard.

2. Select Yes when asked “Would you like to enable the dictionary-attack detector?” Click Next.

3. Adjust the parameters as follows:

• Time span over which to track bad recipients specifies for how long CanIt-PRO will keep history. For example, if you specify 900 seconds, then CanIt-PRO tracks bad recipients over the last 15 minutes.
• Number of bad recipients to trigger firewalling specifies how many bad RCPT commands a host must issue (within the tracking time) to be firewalled off. Continuing the example, if you specify 5 for this parameter, then any host that issues 5 invalid RCPT commands within 900 seconds will be firewalled off.
• Length of time in seconds to remain firewalled specifies how long a host remains firewalled once CanIt-PRO decides it is an attacker. The default is 3600 seconds (one hour.)

4. Click Next.

5. Review your settings and click Finish to make them take effect.

You may wish to exclude certain hosts from ever being banned because of bad RCPT commands. You can exclude such hosts by adding them to the Known Networks list (Section 4.7) with the Omit from Dictionary Attack Detection flag set.
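The way the three wizard parameters interact can be seen in a small sliding-window model. This is a sketch of the counting logic only, under stated assumptions: the class and method names are hypothetical, time is passed in as plain seconds, and the real product blocks hosts with kernel firewall rules rather than a Python dictionary. The exempt set stands in for Known Networks entries flagged Omit from Dictionary Attack Detection.

```python
from collections import defaultdict, deque


class DictionaryAttackDetector:
    """Hypothetical model of the bad-RCPT counting described above."""

    def __init__(self, window=900, threshold=5, ban_seconds=3600, exempt=()):
        self.window = window            # time span over which to track bad recipients
        self.threshold = threshold      # number of bad recipients to trigger firewalling
        self.ban_seconds = ban_seconds  # length of time to remain firewalled
        self.exempt = set(exempt)       # hosts that are never banned
        self._bad = defaultdict(deque)  # host -> timestamps of bad RCPTs
        self._banned_until = {}         # host -> time at which the ban ends

    def is_firewalled(self, host, now):
        return now < self._banned_until.get(host, float("-inf"))

    def record_bad_rcpt(self, host, now):
        """Record one invalid RCPT; return True if the host is firewalled."""
        if host in self.exempt:
            return False
        if self.is_firewalled(host, now):
            return True
        history = self._bad[host]
        history.append(now)
        # Forget bad recipients that fall outside the tracking window.
        while history and history[0] <= now - self.window:
            history.popleft()
        if len(history) >= self.threshold:
            self._banned_until[host] = now + self.ban_seconds
            history.clear()
            return True
        return False
```

With the example values from the wizard (900, 5 and 3600), a host that issues five invalid RCPT commands within 15 minutes is blocked for one hour, while a host that spreads the same five commands over a longer period is not.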

Note: When a host is firewalled off, the Sendmail process that triggered the firewall rule will not receive any traffic from the host. By default, Sendmail will wait one hour between commands. This is far too long if you use the dictionary-attack detector; we recommend shortening Sendmail’s Timeout.command parameter to 5 minutes or shorter. On CanIt-PRO appliances, this configuration change has been done for you. On other platforms, include the line:

define(`confTO_COMMAND', `5m')dnl

in your sendmail.mc file and rebuild sendmail.cf.

4.4 Verification Servers

If CanIt-PRO acts as a filtering server that always forwards mail on to other machines, you can have it check recipient addresses against other machines. The internal machine that verifies recipient addresses is called a Verification Server.

Note: This feature only works if the internal machines fail RCPT commands for unknown users. That is, the internal machine must be configured to reject unknown recipients during the SMTP transaction. Some SMTP servers accept any recipient address and then later on generate a failure notification. Servers that delay the rejection of invalid addresses in this manner will not work as Verification Servers.

Versions of Microsoft Exchange prior to Exchange 2003 will not work as verification servers. Recent Exchange versions can be configured to reject unknown recipients during the SMTP transaction. Please follow the instructions at http://support.microsoft.com/kb/823866/#6.

CanIt-PRO allows you to enter a list of domains and the machines that will verify mail for the domains. (Note that this does not change your Sendmail configuration; you need to ensure that Sendmail’s mailertable routes mail appropriately.)

To edit the verification server list, click on Setup and then Verification Servers. The following page appears:

Figure 4.4: Verification Servers

In this example, CanIt-PRO performs the following checks:

• Any recipient whose domain is blacky.roaringpenguin.com is verified against the machine blacky.roaringpenguin.com

• Any recipient whose domain is canit.ca is verified against the machine mail.canit.ca

• Any recipient whose domain is roaringpenguin.com is verified against the machine mail.roaringpenguin.com

To add a domain/server pair to the table:

• Enter the domain name in the Domain box and the server name or IP address in the Server box. Note that you can enter multiple verification servers in the Server box by separating the names or addresses with commas. If you enter multiple servers, CanIt-PRO tries them in order until it receives a definite positive or negative response.

• Sometimes, your verification server may be down or unreachable. If you would like CanIt-PRO to tempfail mail in this case, then select “Tempfail” as the Action if Unavailable. If you would prefer CanIt-PRO to queue mail, select “Queue”.

Note: CanIt-PRO only queues mail for recipients that have been successfully validated within the past 60 days. If a recipient has not received mail within that time and the verification server is down, CanIt-PRO will tempfail mail for that recipient and not queue it.

• Click Submit Changes

To delete a domain/server pair from the table, enable the appropriate Delete checkbox and click Submit Changes. If you enter a string in the “Filter:” box, then CanIt-PRO limits the display to entries whose Domain or Server columns contain that string.

If your verification server listens on a non-standard port (that is, a port other than port 25), you may specify the port number by following the server name with a slash and the number. For example, if you have a server called mail.example.com that listens on port 2525, you can use mail.example.com/2525 in the Server column.
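The Server column syntax (comma-separated entries, each optionally followed by a slash and a port) and the try-in-order behavior can be sketched as follows. This is an assumption-laden illustration rather than CanIt-PRO's parser: the function names are hypothetical, and probe is a caller-supplied stand-in for the actual SMTP check that returns True (recipient accepted), False (recipient rejected) or None (no definite answer).

```python
def parse_servers(column: str, default_port: int = 25):
    """Split a Server column into (host, port) pairs, in the order given."""
    servers = []
    for entry in column.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, slash, port = entry.partition("/")
        servers.append((host, int(port) if slash else default_port))
    return servers


def verify(recipient: str, servers, probe):
    """Try each server in order until one gives a definite answer."""
    for host, port in servers:
        answer = probe(host, port, recipient)
        if answer is not None:
            return answer
    return None  # every server was unavailable or gave no definite response
```

A result of None from verify corresponds to the case where the Action if Unavailable setting (Tempfail or Queue) applies.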

Note: If you use a verification server, ensure that the server does not throttle or rate-limit the CanIt-PRO server in any way. Because CanIt-PRO runs an SMTP connection for each RCPT command, some naive SMTP server software may think it’s under attack and rate-limit the CanIt-PRO server, with disastrous results.

4.4.1 Wildcard Verification Server

You may optionally choose to add a Verification Server entry for the wildcard domain of ’*’. This will cause mail for any domain that does not have a specific entry to be checked against that server.

Note: If you are relaying outbound mail via your CanIt-PRO server, you should NOT use a wildcard verification entry, as it will likely result in the rejection of all outbound mail.

4.5 Mail Routing

Note: This section is applicable only to Debian-based CanIt-PRO appliances or to Red Hat installations with the appliance RPMs installed. On other CanIt-PRO installations, you need to configure routing manually by editing Sendmail’s access and mailertable files.

Please note the following important requirement: All of the features in Sections 4.5 through 4.6 rely on SSH to operate. Your system must be running an SSH server listening on port 22 and it must allow public-key authentication. If you are running a cluster, all cluster members must be running an SSH server on port 22 and permit connections from all other cluster members. If your SSH server listens on a different port, the features will not work.

To configure mail routing, click on Setup and then Domain Routing. The Domain Routing page comes up:

Figure 4.5: Domain Routing Screen

To add a domain for routing:


1. Enter the domain name in the “Domain” box.

2. Click Add Domain.

The Domain Routing Detail screen will come up:

Figure 4.6: Domain Routing Detail

1. Enter the servers to which mail should be routed for the given domain. You can enter more than one server; if you need more than one, enter them one per line. The servers are tried in order, until one successfully accepts the mail.

2. If you wish the routing server(s) to be treated as MX records, set Treat route entries as MX records to Yes. Otherwise, leave it at No.

Note: You should normally not treat your route entries as MX records. Unless you know for sure that they specify correct MX records that will route your mail correctly, setting this option to Yes could cause mail loops. If you use IP addresses rather than host names for your routes, you must not set Treat route entries as MX records to Yes.

3. Click Submit Changes.

4.5.1 Outbound Relaying

Normally, CanIt-PRO refuses to relay mail for domains that do not appear in the Domain Routing Screen. However, if you wish to relay outbound mail through CanIt-PRO, you can specify networks for which relaying should be enabled. To do this:

1. Click on Setup and then Known Networks.

2. Enter the network from which relaying should be allowed. For example, to allow all machines on the Class C network 192.168.2.0 to relay outbound mail, enter 192.168.2.0/24.

3. Enable the Allow Relaying checkbox.

4. Click on Submit Changes.


4.6 Cluster Management

The CanIt-PRO Web interface has a page for managing your CanIt Cluster. To access the page, click on Setup and then Cluster Management. The Cluster Management page appears:

Figure 4.7: Cluster Management Page

The various machines in your cluster are shown. Each member of a CanIt-PRO cluster can run one or more services. The services are:

• Scanner – this service scans mail flowing through the cluster member. Typically, all members of a CanIt-PRO cluster will run this service, although large installations may not run a scanner on the database host. NOTE: All nodes should be marked “Scanner” even if they don’t actually act as MX hosts. This is to permit locally-generated traffic such as cron messages to be delivered. Do not turn off the “Scanner” service on any cluster members. If you think you need to, please contact Roaring Penguin support first.

• Ticker – this service runs periodic maintenance jobs. Exactly one host in the cluster must run this service. That host must also run the Scanner service.

• Main Database – this service is the main PostgreSQL database. One host in the cluster must be an active database server, but it is possible to set up a failover database server.

• Web Server – this service provides the Web interface and REST-based API. It can run on as many hosts as you like.

• Inbound – this host processes inbound email. Always leave this setting enabled; if you think you need to disable it, please contact Roaring Penguin technical support.

• Outbound – this host processes outbound email. Always leave this setting enabled; if you think you need to disable it, please contact Roaring Penguin technical support.

• Sync Bayes – this host requires Bayes data for processing email. Always leave this setting enabled; if you think you need to disable it, please contact Roaring Penguin technical support.

• Storage Manager – if you are using the Storage Manager, the table will indicate on which hosts it is running. You can run Storage Manager on as many hosts as you like.
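The constraints stated in the service list above (exactly one Ticker host, which must also run Scanner, and at least one Main Database host) can be expressed as a small consistency check. The following Python sketch is illustrative only: the function check_cluster, the host names, and the dictionary layout are assumptions and are not part of CanIt-PRO.

```python
def check_cluster(members):
    """Return a list of problems found in a cluster layout.

    `members` maps a host name to the set of service names it runs.
    Hypothetical helper; service names follow the list above.
    """
    problems = []
    tickers = sorted(h for h, services in members.items() if "Ticker" in services)
    if len(tickers) != 1:
        problems.append(f"expected exactly one Ticker host, found {len(tickers)}")
    for host in tickers:
        # The Ticker host must also run the Scanner service.
        if "Scanner" not in members[host]:
            problems.append(f"{host} runs Ticker but not Scanner")
    if not any("Main Database" in services for services in members.values()):
        problems.append("no host runs Main Database")
    return problems
```

An empty result means the layout satisfies the three constraints checked here; it does not replace the Cluster Management page or the System Check page.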

4.6.1 Altering Services on a Cluster Member

To alter the services running on a cluster member:

1. Check or uncheck the appropriate checkboxes or radio buttons in the Scanner, Ticker and Cron-Runner columns. (Note that the Database and Web Server checkboxes are informational; changing them won’t actually change which services run on the host. The Storage Manager column is read-only because Storage Manager hosts are configured in the Storage Manager Wizard.)

2. Click Submit.

4.6.2 Renaming of Cluster Members

If you rename a CanIt-PRO host, the cluster management software usually picks up on the name change automatically. If, however, nonexistent or dead hosts appear in the Cluster Management table, you can delete them. To delete hosts:

1. Enable the appropriate checkboxes in the Delete column.

2. Click Submit.

4.7 Known Networks

CanIt-PRO allows you to enter a list of “known networks”. These are typically networks that you control, and for which you wish to alter the normal CanIt-PRO processing flow. For example, you may not wish to scan outgoing mail for spam; if all outgoing mail originates from a known set of IP addresses, you can tell CanIt-PRO to skip spam-scanning for mail originating from those IP addresses. To edit the list of known networks, click on Setup and then Known Networks. The Known Networks page appears:


Figure 4.8: Known Networks

Each network appears as a row in the table. By default, all of the network attributes are hidden so as not to clutter the display. To show the network attributes, click on the Show link. The attributes will appear and the link name will change to Hide. Click on Hide to hide the network attributes. In the example in Figure 4.8:

• The host 192.168.10.6 will not be looked up in any RBL blacklists.

• Mail originating from 192.168.10.6 will not be scanned for spam.

• Mail originating from 192.168.10.6 cannot be blacklisted. That is, any sender, domain or host blacklists will be ignored.

• Greylisting will be turned off for 192.168.10.6.

• 192.168.10.6 will never be banned by the Dictionary Attack Detector.

• Mail originating from 192.168.10.6 will be streamed into the Outgoing stream, no matter what.

To add a network to the list of known networks:

1. Enter the network address in the Network box. A network address can either be a single IP address, or a network address in CIDR notation: a.b.c.d/bits. In this notation, a through d are decimal numbers from 0 to 255, and bits is a number from 1 to 32 specifying how many bits of the address are significant. Note that the remaining bits (32 – bits) must be zero. (For more information on CIDR notation, please see http://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing.) Here are examples of network addresses:

  • 192.168.1.0/24 specifies the Class C network 192.168.1.0 through 192.168.1.255.

  • 10.5.2.0/23 specifies the IP addresses 10.5.2.0 through 10.5.3.255.

  • 192.168.5.5/24 is invalid, because the lower 8 bits of the address must be zero.

2. Choose the characteristics you wish to apply to hosts in the known network (you may need to click on Show to see the list of characteristics.)

  • To skip DNS-based RBL lookups, enable Skip RBL Lookups.

  • To skip spam-scanning, enable Skip Spam Scan.

  • To skip virus-scanning, enable Skip Virus Scan.

  • To skip filename and filename extension checking, enable Skip Extension Rules.

  • To skip MIME-type checking, enable Skip MIME-Type Rules.

  • To skip enforcement by CanIt-PRO of maximum message size, enable Skip Size Limit Checks.

  • To prevent sender, domain or host blacklists from applying to mail sent from the network, enable Prohibit Blacklisting.

  • To skip greylisting for hosts in the network, enable Skip Greylisting.

  • To skip SPF checks for hosts in the network, enable Skip SPF Checks.

  • To disable delay rules for hosts in the network, enable Skip Delay Rules.

  • To disable attachment-stripping rules for hosts in the network, enable Skip Attachment Stripping.

  • To prevent any hosts in the network from being banned by the Dictionary Attack Detector (Section 4.3.3), enable Omit from Dictionary Attack Detection.

  • To have CanIt-PRO hold suspect messages locally if they originate from the known network (rather than tempfailing them), enable Don’t Tempfail Incidents.

  • To have CanIt-PRO parse Received: headers to find the actual relay, enable Parse Received Headers. CanIt-PRO parses through the headers until it finds a host that isn’t in a known network with this flag set.

  • To auto-whitelist recipients of messages from a known network, enable Auto-Whitelist Recipients. This means that for messages originating from the network, the recipients of the message are whitelisted in the Sender Rule table. Note that auto-whitelisting is not applied if any of these conditions holds:

    – There is already a sender rule for the recipient in the stream in which the Sender Whitelist rule would normally be created.

    – The message has a “Precedence: bulk” or “Precedence: junk” header.

    – The message has an “Auto-Submitted” header, as specified in RFC 3834.

    – The message is a bounce message (in other words, the sender is <>).

    – The message subject contains “[no-whitelist]”. In this case, the [no-whitelist] tag is removed before the message is delivered (so that the recipients do not see it.)

    – The message subject matches the regular expression ^out of.*office case-insensitively.

    – Auto-whitelisting has been disabled under Preferences : Stream Settings for the sender’s stream.

    Note that some auto-responder software ignores RFC 3834 and fails to add an “Auto-Submitted” header. This could lead to situations in which CanIt-PRO auto-whitelists someone because of an auto-response. If you cannot convince your auto-responder software to add an Auto-Submitted header, you should complain to the vendor of that software in an attempt to make it RFC-compliant.

    If a stream inherits from a final stream, then the whitelist rule is created in the final stream. Otherwise, it is created in the actual stream itself. Please see Section 9.3.1 on page 109 for the precise definition of a final stream.

  • To allow outbound mail from the network to be relayed through the CanIt-PRO machine, enable Allow Relaying. Note: Outbound relaying can be enabled from the Web interface only on CanIt-PRO appliances or Linux-based RPM builds with the appliance RPMs installed.

  • To rate-limit how many recipients per hour a given sender can send to, enter a number in Recipient Rate Limit. If you use this option, you must also enter a Force To Stream value. Rate-limiting is described more fully in Section 4.8.

  • If you wish to rate-limit by sending IP address as well as sending email address, enable Rate-Limit by IP Address. See Section 4.8.1 for details.

  • To force all mail from the network to be streamed into a specific stream, enter the name of the stream in the Force To Stream box.

3. Click Submit Changes to have your changes take effect.
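The CIDR rule from step 1 (all bits beyond the prefix must be zero) can be checked before you type a network into the Network box. The sketch below uses Python's standard ipaddress module; it is only an illustration, and the function name parse_known_network is hypothetical. Note one difference from the text above: the ipaddress module also accepts a prefix length of 0, whereas this manual describes bits as ranging from 1 to 32.

```python
import ipaddress

def parse_known_network(text):
    """Parse a single IP address or a.b.c.d/bits network.

    Raises ValueError if any bit beyond the prefix is non-zero.
    A single IP address is treated as a /32 network.
    """
    return ipaddress.IPv4Network(text, strict=True)

# The /23 example from the manual spans two adjacent /24 blocks.
net = parse_known_network("10.5.2.0/23")
first, last = net[0], net[-1]  # 10.5.2.0 and 10.5.3.255
```

Calling parse_known_network("192.168.5.5/24") raises ValueError, matching the invalid example in step 1.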

To edit an existing known network, simply adjust the attributes as required and click Submit Changes. To delete a known network, enable the Delete? checkbox and click Submit Changes.
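The message-level exclusions listed under Auto-Whitelist Recipients can be summarized in code. The following Python sketch is an approximation under stated assumptions, not CanIt-PRO's implementation: the function name and return values are hypothetical, and the two stream-level conditions (an existing sender rule, and auto-whitelisting disabled in Stream Settings) are omitted because they depend on database state.

```python
import re
from email.message import EmailMessage

# Pattern quoted from the manual, applied case-insensitively.
OUT_OF_OFFICE = re.compile(r"^out of.*office", re.IGNORECASE)

def skip_auto_whitelist(msg, envelope_sender):
    """Return a reason string if auto-whitelisting should be skipped, else None."""
    precedence = (msg.get("Precedence") or "").strip().lower()
    if precedence in ("bulk", "junk"):
        return "precedence"
    if msg.get("Auto-Submitted") is not None:
        return "auto-submitted"
    if envelope_sender in ("", "<>"):
        return "bounce"  # null envelope sender
    subject = msg.get("Subject") or ""
    if "[no-whitelist]" in subject:
        return "no-whitelist tag"
    if OUT_OF_OFFICE.search(subject):
        return "out of office"
    return None

# Usage: a vacation auto-reply without an Auto-Submitted header.
reply = EmailMessage()
reply["Subject"] = "Out of the office until Monday"
reason = skip_auto_whitelist(reply, "jane@example.com")
```

The addresses and subject line in the usage example are made up for illustration.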

4.7.1 Overlapping Networks

If you add two networks that overlap, CanIt-PRO will use the most-specific network for a given host. That is, CanIt-PRO will choose the smallest network that contains a given host. For example, if you create the known networks 192.168.1.0/24 and 192.168.1.240/28, then hosts in the range 192.168.1.240 through 192.168.1.255 will use the 192.168.1.240/28 settings, whereas hosts from 192.168.1.0 through 192.168.1.239 will use the 192.168.1.0/24 settings.
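The most-specific-network rule amounts to choosing the matching network with the longest prefix. A minimal Python sketch using the standard ipaddress module follows; the function name most_specific_network is hypothetical and the code is not taken from CanIt-PRO.

```python
import ipaddress

def most_specific_network(host, networks):
    """Return the smallest network in `networks` containing `host`, or None."""
    addr = ipaddress.ip_address(host)
    matches = [n for n in map(ipaddress.ip_network, networks) if addr in n]
    # The longest prefix length corresponds to the smallest network.
    return max(matches, key=lambda n: n.prefixlen, default=None)

known = ["192.168.1.0/24", "192.168.1.240/28"]
chosen = most_specific_network("192.168.1.250", known)
```

Here chosen is the 192.168.1.240/28 network, whereas a host such as 192.168.1.10 would fall back to 192.168.1.0/24.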


4.7.2 The SMTP-AUTH Pseudo-Network

CanIt-PRO supports a pseudo-network called SMTP-AUTH. (It must be entered exactly like that in upper-case.) Any Known Network settings for this network will be applied to users who authenticate using SMTP AUTH. This lets you do things like force authenticated mail into a particular stream or skip spam-scanning for authenticated users.

4.8 Rate-Limiting Outbound Mail

The Known Networks feature allows you to limit the number of recipients a given sender can mail in an hour. This can be useful to catch compromised internal hosts that are used to send spam. Here is how rate-limiting works:

• You can only rate-limit mail from a Known Network. This is because rate-limiting is designed to rate-limit outbound mail from a set of machines under your control.

• To specify a rate limit, enter the maximum number of recipients per hour that a given sender can send to. A reasonable value might be 500 to 1000; a value of 0 disables rate-limiting completely. (Enter the value in the Recipient Rate Limit column of Known Networks.)

• You must also specify a Force To Stream value in order to use rate-limiting.

If a sender exceeds the rate limit, CanIt-PRO creates a Sender rule in the Force To Stream stream. The rule rejects all mail from the sender. This has the effect of completely disabling all outbound mail from the sender address. The sender rule that CanIt-PRO creates is set to expire automatically three days after it is created. CanIt-PRO also sends an email to the CanIt-PRO administrator informing him or her of the rule that blocks the sender. Note that the sender will be unable to send outbound mail until the administrator goes into the Force To Stream stream and manually removes the rule that blocks the sender (or until the rule expires after three days.)
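The behaviour described above can be modelled roughly as follows. This Python sketch makes several assumptions that the manual does not confirm: it uses a sliding one-hour window, it treats "exceeds" as strictly greater than the limit, and it keeps state in memory rather than creating a Sender rule in the Force To Stream stream. The class and method names are hypothetical.

```python
from datetime import datetime, timedelta

class RecipientRateLimiter:
    """Illustrative per-sender recipient limit with a three-day block."""

    def __init__(self, limit_per_hour):
        self.limit = limit_per_hour
        self.events = {}   # sender -> list of (timestamp, recipient_count)
        self.blocked = {}  # sender -> time at which the block expires

    def record(self, sender, recipient_count, now):
        """Record a message; return True if it is allowed, False if blocked."""
        if self.limit == 0:
            return True  # a value of 0 disables rate-limiting
        expiry = self.blocked.get(sender)
        if expiry is not None:
            if now < expiry:
                return False
            del self.blocked[sender]  # the blocking rule has expired
        window_start = now - timedelta(hours=1)
        recent = [(t, n) for t, n in self.events.get(sender, []) if t > window_start]
        recent.append((now, recipient_count))
        self.events[sender] = recent
        if sum(n for _, n in recent) > self.limit:
            # Model the automatically expiring reject rule.
            self.blocked[sender] = now + timedelta(days=3)
            return False
        return True
```

In the real product an administrator can also remove the blocking rule by hand before the three days elapse; the sketch does not model that step or the notification email.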

Note: Any sender that is whitelisted will not be subject to rate-limiting. You can use this as an “escape hatch” to permit certain senders to send high volumes of mail; simply whitelist those senders in the forced-to stream. However, you should be very careful to only whitelist legitimate senders who are unlikely to have their accounts hijacked. If a sender is whitelisted for any reason (i.e., a sender whitelist, domain whitelist or host whitelist), rate-limiting will not apply. For this reason, you should be very judicious about the whitelists you create in the forced-to stream and consider setting up the forced-to stream not to inherit from the default stream.

Note: If you enable rate-limiting on a Known Network, be sure that you do not enable the “Prohibit Blacklisting” option for that network. Otherwise, rate-limiting rules will be ignored! In addition, if you rate-limit the SMTP-AUTH pseudo-network, be sure not to enable the global setting “Whitelist users who use SMTP authentication” (G-3600) or rate-limiting will be ignored.


4.8.1 Rate-Limiting by IP Address

Normally, CanIt-PRO applies rate-limiting on a per-sender email address basis. If you enable the Rate Limit by IP Address feature in Known Networks, CanIt-PRO will also apply rate-limiting to the sending IP address. If the Known Networks entry has Parse Received Headers enabled, then the IP address that is rate-limited is extracted from the Received: headers.

Note: Be very careful when enabling this feature. If all of your mail goes out through one server and you accidentally turn on rate-limiting by IP address without enabling Received: header parsing, you may end up blocking all outbound mail. The rule of thumb is as follows:

• If various clients connect directly to the CanIt-PRO server to send outbound email, you must not enable Parse Received Headers on the Known Network containing the client IP addresses.

• If clients relay via an SMTP server that subsequently relays out via the CanIt-PRO server, then you must enable Parse Received Headers.

4.9 Features

The Features page allows you to globally disable certain CanIt-PRO features to reduce the number of database queries. Note that disabling a feature completely disables it system-wide. Unless you know for sure that you don’t need a feature, and you know that the load savings will be worth turning it off, you should leave all features in their default states. To disable a set of features, click on No in the Enabled column for the features you want to disable. Then click Submit Changes. Some features are disabled by default because they are considered dangerous or are only useful in special situations. You can enable such features by selecting Yes in the Enabled column and then clicking Submit Changes.

4.9.1 Direct Queue Injection

Normally, when CanIt-PRO needs to split an incoming message destined for several streams into several single-stream messages, it performs the following actions:

1. It remails a copy of the message for each stream by invoking sendmail with appropriate arguments.

2. It discards the original message.

Remailing a message with Sendmail is expensive because multiple copies of the message data are made and Sendmail uses expensive disk synchronization operations after each copy. CanIt-PRO can instead directly inject copies of the streamed messages into Sendmail’s local client queue. This saves disk I/O because only one expensive synchronization operation is needed (not one per copy.) Also, the data can be hard-linked instead of copied, saving disk space.

In order for this to work, the defang user must be a member of the smmsp group. (This is the case if you are running an appliance or an RPM build.) Additionally, you must enable the “Insert Streamed Mail Directly Into Sendmail Queue” feature under Setup : Features.

4.10 System Check

The System Check page runs some sanity checks on your CanIt-PRO installation. It also displays the current versions of RPTN data and Roaring Penguin rule sets. A typical System Check page is shown in Figure 4.9:

Figure 4.9: System Check

In addition to running a few local tests, viewing the System Check page also shows the results of cluster-wide tests performed on a periodic basis. If System Check indicates a problem, you should take action to fix it immediately. The various System Check tests are outlined in Appendix L.

4.11 Templates

CanIt-PRO uses templates to configure how Bayes training information is added to messages and to configure the appearance of Pending Message Notifications. To configure templates, click on Setup and then Templates. The Templates screen appears:

Figure 4.10: Templates

The various templates you can configure are:

• Base URL of CanIt installation is used to construct URLs in messages sent out by CanIt-PRO.

• E-Mail address of CanIt System Administrator is the e-mail address to which CanIt-PRO sends certain warning messages or alerts.

• Source E-Mail address of CanIt notifications is the sender address used by CanIt-PRO when it e-mails notifications.

• Full name for sender of CanIt notifications is the full name placed in the From: header of CanIt-PRO notifications.

• SMTP reply for a held incident is the text returned with the SMTP temporary failure code when CanIt-PRO tempfails a suspect e-mail.

• SMTP reply for a rejected incident is the text returned with the SMTP permanent failure code when CanIt-PRO rejects an incident.

• SMTP reply for a blacklist entry is the text returned with the SMTP permanent failure code when CanIt-PRO rejects a host, sender or domain that is blacklisted.


• Header note for a whitelist entry is the note CanIt-PRO places in the X-Spam-Score header when a host, sender or domain is whitelisted.

• Plain-text training link body specifies the appearance of Bayesian training links added to plain-text messages.

• HTML training link body specifies the appearance of Bayesian training links added to HTML messages.

• Pending notification e-mail subject specifies the subject to put in Pending Notification messages.

• Pending notification e-mail body specifies the body of Pending Notification messages.

• Preamble before notification details specifies the preamble before the detailed list of held messages (for users who select verbose notifications.)

• Detailed pending notification entry specifies the format for each held message in detailed notifications.

• Subject for Add Alternate Address e-mail specifies the subject of the confirmation e-mail sent when someone attempts to add an Alternate Address to his/her stream.

• Body for Add Alternate Address e-mail is the body of the confirmation e-mail described above.

• Header for ’Webform’-style Pending Notification is the HTML preamble used for “Webform” pending notifications.

• Footer for ’Webform’-style Pending Notification is the HTML postamble used for “Webform” pending notifications.

• Subject line for Periodic Reports is the subject used by CanIt-PRO when mailing out periodic reports.

• Body of Periodic Report e-mail is the body used by CanIt-PRO when mailing out periodic reports. It should consist of valid HTML.

• Text boilerplate when attachments are stripped is appended to the first text/plain email part if an attachment is stripped and stored on the CanIt-PRO server.

• HTML boilerplate when attachments are stripped is appended to the first text/html email part if an attachment is stripped and stored on the CanIt-PRO server.

• Text boilerplate when attachments are discarded is appended to the first text/plain email part if an attachment is stripped and discarded.

• HTML boilerplate when attachments are discarded is appended to the first text/html email part if an attachment is stripped and discarded.

• Forgot-your-Password Link or Text is the link or text used for the Forgot your Password? message on the login page.

Note that many templates include various “replacement tags”. For example, in the training link templates, the sequence of characters %spamurl or %{spamurl} will be replaced with a URL that votes the message as spam. To see the list of available replacement tags, click on the “(Tags)” link near the template entry box.
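To make the two tag spellings concrete, here is a hedged Python sketch of how %name and %{name} tags could be expanded. It is not CanIt-PRO's template engine: the function expand_tags, the choice to leave unknown tags untouched, and the example URL are all assumptions made for illustration.

```python
import re

# Either %{name} or %name; the braced form is tried first.
TAG = re.compile(r"%\{(\w+)\}|%(\w+)")

def expand_tags(template, values):
    """Replace known tags with their values; leave unknown tags as written."""
    def substitute(match):
        name = match.group(1) or match.group(2)
        return values.get(name, match.group(0))
    return TAG.sub(substitute, template)

text = expand_tags(
    "Vote spam: %spamurl (same link: %{spamurl})",
    {"spamurl": "https://canit.example.com/vote/spam"},  # hypothetical URL
)
```

The braced form is useful when a tag is immediately followed by letters or digits, since %spamurlx would otherwise be read as a tag named spamurlx.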

4.12 HTTPS

Note: This feature is available only on Debian-based Appliances.

On CanIt-PRO appliances, HTTPS is enabled by default, but with dummy self-signed certificates. If you would like to install your own certificates, click on Setup : HTTPS. Then:

1. Copy-and-paste your SSL certificate into the first text box.

2. Copy-and-paste the corresponding server key into the second text box. The server key must not be encrypted or the Web server on the appliance will fail to start.

3. Click Submit Changes to install the key and certificate.

4.13 The Domain Mapping Table

Recall from Figure 2.4 on page 31 that CanIt-PRO uses a Domain Mapping Table to determine how to stream messages for each domain. The table contains a list of domains with a corresponding lookup method. To edit the Domain Mapping Table, click on Setup and then Domain Mappings. The Domain Mappings page appears:

Figure 4.11: Domain Mappings


To add a mapping method for a particular domain, enter the domain name in the top row of the table and select a value in the Mapping column. The possible choices are:

• Database – CanIt-PRO will look up a stream mapping in the Address Mapping Table (Section 4.14).

• AsIs – CanIt-PRO converts an address to a stream by removing any angle-brackets and converting letters to lower-case.

• ChopDomain – CanIt-PRO converts an address to a stream simply by chopping off the @domain.tld part, removing any angle-brackets, and converting to lower-case.

• ChopUser – CanIt-PRO converts an address to a stream simply by chopping off the address@ part, leaving just the domain (without angle-brackets and converted to lower-case.)

• Program – CanIt-PRO converts an address to a stream by executing the account-info program. Please see Section 6.2.4 on page 95 for more details. Note that Program is deprecated; you should create and use a User Lookup method instead.

• Sendmail – this method invokes Sendmail (it assumes that the recipient is local on the CanIt-PRO machine.) This method is deprecated and should no longer be used; it was intended for installations that run CanIt-PRO on their actual final mail server (something we no longer recommend.)

• None – CanIt-PRO removes the domain from the Domain Mapping Table.

• If you have added external User Lookup methods (Chapter 6), some of them may appear as additional choices. For example, the LDAP and Program User Lookup methods can convert an address to a stream.
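The three purely textual methods in the list above (AsIs, ChopDomain and ChopUser) can be illustrated with a few lines of Python. These functions are a sketch based on the descriptions above, with hypothetical names; the handling of an address that contains no @ sign is an assumption.

```python
def _normalize(address):
    """Remove surrounding angle-brackets and convert to lower-case."""
    return address.strip().strip("<>").lower()

def as_is(address):
    """AsIs: the stream is the normalized address itself."""
    return _normalize(address)

def chop_domain(address):
    """ChopDomain: drop the @domain.tld part, keeping the user part."""
    return _normalize(address).partition("@")[0]

def chop_user(address):
    """ChopUser: drop the address@ part, keeping only the domain."""
    return _normalize(address).rpartition("@")[2]
```

For an illustrative address such as <Joe@Example.COM>, these return joe@example.com, joe and example.com respectively.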

Click Submit Changes to save your changes.

To modify the mapping for an existing domain, select a new mapping in the Mapping column and click Submit Changes.

Given a domain sub.example.com, CanIt-PRO looks up entries in the Domain Mapping Table in the following order, stopping at the first one found:

1. sub.example.com

2. example.com

3. com

4. *

The special domain * is used as a last resort if no better match is found. You may enter a mapping for * to set a default mapping. If there is no * entry and a domain is not found in the Domain Mapping Table, then CanIt-PRO uses a default lookup method of Database. If you enter a string in the “Filter:” box, then CanIt-PRO limits the display to entries whose Domain or Mapping columns contain that string.
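The lookup order and the Database fallback described above can be sketched in Python as follows. The table is modelled as a plain dictionary and the function names are hypothetical; CanIt-PRO itself stores the table in its database.

```python
def lookup_candidates(domain):
    """List the Domain Mapping Table keys to try, most specific first."""
    labels = domain.lower().split(".")
    return [".".join(labels[i:]) for i in range(len(labels))] + ["*"]

def mapping_method(domain, table, fallback="Database"):
    """Return the mapping method for `domain`, stopping at the first match."""
    for candidate in lookup_candidates(domain):
        if candidate in table:
            return table[candidate]
    # No entry at all, not even "*": the default lookup method applies.
    return fallback

order = lookup_candidates("sub.example.com")
```

The value of order is the list sub.example.com, example.com, com, * in that sequence, matching the numbered list above.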


4.14 The Address Mapping Table

CanIt-PRO uses an Address Mapping Table (Figure 2.4 on page 31) to map e-mail addresses to streams. The Address Mapping Table is used both for hand-entered entries placed there by the CanIt-PRO administrator, and for caching the results of the Program mapping method.

To edit the address mapping table, click on Setup and then Address Mappings. The Address Mappings page will appear:

Figure 4.12: Address Mappings

To add an entry for a new e-mail address, enter the new address in the Address column of the first row, and enter the stream name in the Mapping column. Then click Submit Changes. To edit an existing entry, edit the text in the Mapping column and click Submit Changes. To delete an entry from the table, click the Delete link in the appropriate row. Click on Not Cached to see only non-cached (hand-entered) entries, Cached to see only cached entries, or Any to see all entries in the Address Mapping Table. If you enter a string in the “Filter:” box, then CanIt-PRO limits the display to entries whose Address or Mapping columns contain that string.

4.14.1 Wild-Card Entries

The address mapping table may contain three types of wildcard entries:

1. The entry user@* is used if CanIt-PRO is unable to map an address to a stream with an exact match (or if the Program method fails.) If you run several domains, but all user-parts are the same, this wildcard can be useful.


2. The entry *@domain.tld is used if the previous wildcard does not match anything. Use this entry to set up a default stream for e-mail to a particular domain.

3. The entry * is used as a last resort if the previous wildcards did not match.

Note: The addresses postmaster, postmaster@localhost and postmaster@machine name are always mapped to the default stream unless you have a specific entry in the Address Mapping Table for those addresses. That is, for those three specific addresses, CanIt-PRO will not use wildcard matches or User Lookups to determine the stream. (In the third address, machine name is the name of the host processing the email.)
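The wildcard precedence in Section 4.14.1 (exact match, then user@*, then *@domain.tld, then *) can be sketched as a simple ordered lookup. This Python example is a simplification under stated assumptions: the table is a dictionary, the Program and User Lookup steps are left out, the postmaster exception from the note above is not modelled, and the final fallback to the default stream follows Section 4.15. The function name and sample addresses are hypothetical.

```python
def map_address(address, table):
    """Map an e-mail address to a stream name using wildcard precedence."""
    address = address.strip("<>").lower()
    user, _, domain = address.partition("@")
    for key in (address, f"{user}@*", f"*@{domain}", "*"):
        if key in table:
            return table[key]
    return "default"  # no exact or wildcard entry matched

table = {"joe@*": "joe", "*@example.com": "example", "*": "admin"}
stream = map_address("joe@example.com", table)
```

In this example stream is joe, because the user@* entry is consulted before the *@domain.tld entry.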

4.15 The default Stream

CanIt-PRO has a built-in stream name that is reserved, and which cannot be used for other purposes. This stream is named default, and is used as follows: If CanIt-PRO is unable to map an address to a stream (for example, if there are no exact or wildcard matches in the database and the Program method fails), the address is mapped to the hard-coded stream default. The CanIt-PRO administrator should check the default stream from time to time. The default stream also contains whitelists, blacklists, custom rules and mismatch rules that all other streams can inherit. The factory default is for all streams to inherit the lists and rules from default, but you can disable this if you wish. List and rule inheritance work as follows for streams that inherit from default:

• Senders, hosts, domains, extension rules, MIME type rules and mismatch rules are first looked up in the stream’s table. If no entry is found, they are looked up in default’s table.

• Custom rules are evaluated first for the given stream, and then for default. Their scores are added together. Note that if the same rule appears in both the stream’s rule set and default’s rule set, it is counted twice.
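The two inheritance behaviours above differ in an important way: list lookups stop at the first table containing an entry, while custom rule scores are summed across both rule sets. The Python sketch below illustrates this; the data structures (dictionaries for lists, and predicate/score pairs for custom rules) and function names are assumptions made for the example.

```python
def lookup_with_inheritance(key, stream_table, default_table):
    """Look up a list entry in the stream first, then in default."""
    if key in stream_table:
        return stream_table[key]
    return default_table.get(key)

def custom_rule_score(message, stream_rules, default_rules):
    """Sum the scores of all matching rules from both rule sets.

    Each rule is a (predicate, score) pair. A rule present in both
    sets contributes twice, as the manual warns.
    """
    rules = list(stream_rules) + list(default_rules)
    return sum(score for matches, score in rules if matches(message))

rule = (lambda text: "free money" in text.lower(), 5.0)
total = custom_rule_score("FREE MONEY inside", [rule], [rule])
```

Here total is 10.0 rather than 5.0, because the same rule appears in both the stream's rule set and default's rule set.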

4.16 Mapping Scenarios

To give a feel for how to use the mapping, we illustrate a few common scenarios.

4.16.1 Central Scanning with Opt-Out

If you run a mail server and wish to centralize spam-scanning, but you have some users who wish to opt out or handle their own spam, you can do it as follows: In the Address Mapping Table, add this catch-all entry:

Address              Stream
*                    admin

This streams most users’ e-mail to the “admin” stream for centralized processing. If user [email protected] does not want his mail examined by the spam control officer, simply add another entry:

Address              Stream
[email protected]    joe

This streams mail for [email protected] to joe.

4.16.2 Single Domain

If you host a single e-mail domain, and each user’s login name is simply the first part of his/her e-mail address, setting up mappings is easy. In the Domain Mapping Table, add a single entry:

Domain               Mapping Method
*                    ChopDomain

4.16.3 Single Domain with Aliases and Mailing Lists

Most likely, your scenario is more complex than in Section 4.16.2. You probably host mailing lists, and have aliases. Let’s suppose you host a list called [email protected], which is run by jane, and that your [email protected] is an alias which gets expanded to jim and bob. You can still use the same Domain Mapping as Section 4.16.2. You have two options for handling the mailing list and sales alias:

1. Allow jane to access the tv-list stream, and allow jim and bob (or delegate one of them) to access the sales stream. Jane will have to remember to check the tv-list trap as well as her own trap, and similarly for Bob and Jim.

2. Add address mappings like this:

Address              Stream
[email protected]    jane
[email protected]    bob

Explicit entries in the Address Mapping Table will override even the ChopDomain method. Here, Jane’s trap will contain messages both for herself directly and the mailing list she runs. Bob’s trap will contain his messages and messages for sales. (Clearly, you’ve delegated spam handling for sales to Bob alone.)

(You can, of course, use Method 1 for tv-list and Method 2 for sales. It’s up to you.)

Chapter 5

CanIt-PRO Administration

5.1 Global Settings

The first administrative task you should undertake is to set up global settings. Click on the Administration link. You will see the global settings screen:

Figure 5.1: Global Settings

Note that the Basic Setup Wizard (Section 4.3.1) populates some of these settings. The “ID” column is a unique identifier for each setting; it is not used except as a convenient way for Roaring Penguin support personnel to indicate a particular setting over the phone. The global settings have the following meanings:

G-1100 Maximum size of message to scan for spam (kB) Spam-scanning can be very slow on large messages. If a message comes in that is larger than this threshold, CanIt-PRO attempts to reduce its size by removing non-text attachments before feeding the message to the scanning engine. If this succeeds, the reduced message is scanned. If the message is still too large even after the reduction, it is not scanned for spam.

G-2400 Handling for messages containing viruses If you have a virus-scanner compatible with CanIt-PRO, this setting controls how CanIt-PRO deals with virus-bearing messages. Hold holds the message in the trap for approval. Accept permits the message to pass, while Reject rejects it with an SMTP failure code. Finally, Discard simply discards the message. We recommend setting this option to Discard. Note: This setting may be overridden on a per-stream basis.

G-1500 Expire statistics after this many days Once a day, a cron job removes old entries from the statistics table. By default, CanIt-PRO keeps statistics for 10,000 days (around 27 years), but you can lower this setting to as low as 90 days if you do not want to keep old statistics around.

G-1550 Number of hours to keep detailed statistics CanIt-PRO keeps very detailed statistics for a limited time. This setting lets you adjust the length of this time.

G-1600 Expire old data after this many days Once a day, a cron job purges old messages, log entries and incidents from the database. We recommend retaining at least 14 days’ worth of data, although you might want to lower this on a busy mail server. Note: This setting is the number of days from the creation of the incidents being expired, regardless of whether or when they were marked as spam or non-spam.

G-1700 Expire messages marked as spam after this many days This setting controls when the cron job expires messages you have marked as spam. Note that it only applies to closed incidents— that is, messages that have not only been marked as spam, but have also actually been rejected by CanIt-PRO.

G-1800 Expire messages marked as non-spam after this many days This setting controls when the cron job expires messages you have marked as non-spam. Note that it only applies to closed incidents—that is, messages that have not only been marked as non-spam, but have also actually been delivered by CanIt-PRO.

G-4010 Number of hours to cache address-to-stream lookups As mentioned in Section 2.4, address-to-stream mappings may be cached in the Address Mapping Table. This setting specifies for how long cached entries remain valid.

G-4050 Time in hours to delay messages with Delayed Attachments If you use the Delayed Attachments feature, this setting controls the length of the delay.

G-4800 Number of days to keep mail signatures for Bayesian analysis This setting specifies how long after a message first arrives a user may vote on whether it is spam or non-spam.

G-4900 Number of generations before cleaning common Bayes tokens CanIt-PRO periodically cleans old data out of the Bayes database. This setting controls how long CanIt-PRO retains a token that has been seen frequently, but not recently. We recommend leaving it at the default value.

G-5000 Number of generations before cleaning uncommon Bayes tokens CanIt-PRO periodically cleans old data out of the Bayes database. This setting controls how long CanIt-PRO retains a token that has been seen infrequently and not recently. We recommend leaving it at the default value.

G-4020 Users must opt in to anti-spam scanning? If you set this to Yes, then users must explicitly opt in to anti-spam scanning. If users do not opt in, their mail is simply passed through unchanged. If you set this to No, then all users are implicitly opted in. They can, however, explicitly opt out if they choose.

G-4030 Users must be approved for anti-spam scanning? If you set this to Yes, then the CanIt-PRO administrator’s approval is required before a user can opt in to anti-spam scanning. If you are selling anti-spam scanning as a value-added service, you should set this to Yes. If anti-spam scanning is part of your basic service, set it to No. Note that opting in and opting out is done on a per-stream basis. Usually, a stream corresponds to a user, but it is possible for a stream to correspond to more than one user, and for a single user to be responsible for more than one stream.

G-600 Send tempfail indications for suspect messages This entry controls how CanIt-PRO holds messages. There are four choices:

• Until-Dispatched makes CanIt-PRO send temporary failure responses until the administrator handles the held message. This is the default setting. It keeps suspicious messages trapped in the sending relay’s queue. Note, however, that CanIt-PRO will not tempfail messages from the local host, or from networks marked as “Don’t Tempfail Incidents” under Known Networks; it will store those locally.

• First-Time makes CanIt-PRO send a temporary failure notification the first time a suspicious message is received. If the message is retransmitted, CanIt-PRO accepts it, but holds it in its local database until the message is approved or discarded.

• Never makes CanIt-PRO always hold suspicious messages in its internal database. It will never keep a message in the sender’s queue by replying with a temporary failure response.

• Always makes CanIt-PRO send temporary failure responses until the administrator handles the held message, no matter what. It differs from Until-Dispatched in that it will even tempfail messages from the local host and from hosts marked as “Don’t Tempfail Incidents”. Do not use this setting unless you have administrative control over all of your secondary MX systems or other relay hosts and are willing to put up with the extra load the retransmissions place on them.

Use the following guidelines to pick a value for G-600:

• If your mail server is not heavily loaded, and you are willing to tolerate delays when releasing incorrectly-trapped mail, select Until-Dispatched.

• If your mail server is heavily loaded, or you want complete control over delays when releasing incorrectly-trapped mail, select Never. (If you are not using Greylisting, but want some of its benefits, select First-Time instead.)
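The four G-600 choices can be summarized as a small decision function. This is only a sketch of the behaviour described above, not CanIt-PRO's actual code: the names TempfailMode and suspect_response are hypothetical, and it assumes that the “Don’t Tempfail Incidents” exemption also applies in First-Time mode, which the text states explicitly only for Until-Dispatched.

```python
from enum import Enum


class TempfailMode(Enum):
    UNTIL_DISPATCHED = "Until-Dispatched"
    FIRST_TIME = "First-Time"
    NEVER = "Never"
    ALWAYS = "Always"


def suspect_response(mode, exempt_source, seen_before):
    """Return "tempfail" or "hold-locally" for a suspect message awaiting review.

    exempt_source: True for the local host or a "Don't Tempfail Incidents" network.
    seen_before:   True if this message has already been tempfailed once.
    """
    if mode is TempfailMode.ALWAYS:
        return "tempfail"        # Always ignores the exemption
    if mode is TempfailMode.NEVER or exempt_source:
        return "hold-locally"    # stored in the local database
    if mode is TempfailMode.FIRST_TIME and seen_before:
        return "hold-locally"    # the retransmission is accepted and held
    return "tempfail"            # message stays in the sending relay's queue
```

For example, suspect_response(TempfailMode.UNTIL_DISPATCHED, exempt_source=True, seen_before=False) returns "hold-locally", while the same call with TempfailMode.ALWAYS returns "tempfail".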

G-4300 Minimum size of spam corpus for Bayesian analysis CanIt-PRO will not use Bayes data until at least this many messages have been trained as spam.

G-4400 Minimum size of non-spam corpus for Bayesian analysis CanIt-PRO will not use Bayes data until at least this many messages have been trained as non-spam.

G-3600 Whitelist users who use SMTP authentication If your version of Sendmail is compiled to support the SMTP AUTH extension, you can whitelist mail from authenticated senders by setting this to Yes. (The default is No.) In this case, mail from authenticated users will not be scanned for spam (but will still be scanned for viruses and bad filename extensions or MIME parts.) Note: CanIt currently cannot preserve SMTP AUTH-based whitelisting when messages are streamed. Thus, if an AUTH’ed user sends mail to recipients in more than one stream, the whitelisting will not be applied.

G-3900 Store both raw and decoded messages in incident database Some e-mail messages are obscured using Base64 encoding or some other encoding scheme. If you change this setting to Yes, CanIt-PRO stores both the “raw” and “decoded” message in the incident database. This lets you view encoded messages more reliably, but approximately doubles the disk space used by the incident database. If you set it to No (the default), CanIt-PRO stores only the raw message. The message display Web page can decode some encoded messages, but it is not completely reliable. If you need a completely reliable way to view encoded messages, you should change this setting to Yes.

G-4000 Obscure To, Cc and Bcc fields for non-root users Because CanIt-PRO stores messages that hash identically only once, the To:, Cc: and Bcc: headers of messages may leak recipient information to other recipients of the message. To hide this information, change this setting to Yes.

G-4060 Users authenticated by external means default to simple GUI? If you set this to Yes, then users who authenticate via an external authentication mechanism have a much simplified interface to CanIt-PRO by default. This simplified interface is described in Chapter 9.

G-4075 Switching to expert mode cancels stream inheritance If you use the Simple Interface (Chapter 9), then you may wish to cancel inheritance whenever a user selects the expert interface. In that case, change this setting to Yes. That is, if a user has selected a particular spam-scanning level in the Simple Interface, then when they switch to Expert Interface, the selected level is no longer used; instead, individual settings are used that do not depend on any of the preconfigured spam-scanning settings.

G-4080 Support the Sendmail ‘plus hack’ for streaming Some Sendmail configuration files allow users to add a “+” sign followed by arbitrary text to their user names, and use the resulting e-mail addresses for various purposes such as filtering e-mail. If you change this setting to Yes, then CanIt-PRO ignores a “+” sign and any following text after the user name part when mapping e-mail addresses to streams. Note that if you use the “Program” method to stream e-mail, the “+” sign and any following text is retained; it is up to your program to implement the sendmail “plus hack” if you choose.
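As an illustration of the “plus hack” described for G-4080, the following sketch strips a “+” sign and any following text from the user name part of an address before it is mapped to a stream. The function name stream_lookup_address is hypothetical and the code is not taken from CanIt-PRO; it only mirrors the behaviour described above.

```python
def stream_lookup_address(address, plus_hack=True):
    """Return the address used for stream mapping.

    With plus_hack enabled, "+" and any following text in the user name part
    are ignored; the domain part is left untouched.
    """
    local, sep, domain = address.rpartition("@")
    if not sep:
        # No "@" present: treat the whole string as the user name part.
        local, domain = address, ""
    if plus_hack:
        local = local.split("+", 1)[0]
    return local + sep + domain
```

With the setting enabled, jane+lists@example.com and jane@example.com both map through the address jane@example.com; with it disabled, the address is used unchanged.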

G-4090 Scan for viruses prior to streaming incoming mail If you know for sure that you always want to reject or discard viruses, regardless of any per-stream settings, then change this setting to Yes. It causes any viruses to be discarded or rejected (according to the global virus-handling setting) before any streaming takes place. If a virus comes in for more than one recipient, this can greatly reduce the load on CanIt-PRO. Note that the global virus-handling setting must not be set to Hold for this setting to take effect.

G-4100 Timeout in seconds for Verification Server queries If you are using the Verification Server feature, CanIt-PRO will time out Verification Queries according to the value of this setting. You should keep it reasonably low so that a slow or dead verification server does not interfere with delivery to other domains.

To make your changes permanent:

• Click on Update Global Settings

5.2 Real-Time DNS Blacklists

Both Sendmail and CanIt-PRO can make use of DNS-based real-time blacklists. These blacklists allow you to look up the IP address of a host in a special DNS domain, and take action if the host is blacklisted. You can configure Sendmail to use DNS-based blacklists directly, but you may prefer to handle this with CanIt-PRO, because CanIt-PRO allows you to hold or score messages from hosts on the blacklist rather than outright rejecting them.
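To illustrate what looking up a host “in a special DNS domain” involves, the sketch below builds the DNS name that is conventionally queried for a given IP address: the address is reversed label by label and the RBL domain is appended. This follows the general DNS blacklist convention rather than anything specific to CanIt-PRO's internals; the function name and the domain rbl.example.org are hypothetical.

```python
import ipaddress


def dnsbl_query_name(ip, rbl_domain):
    """Build the DNS name to query for `ip` in the blacklist zone `rbl_domain`."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        # IPv4: one label per octet.
        labels = str(addr).split(".")
    else:
        # IPv6: one label per hexadecimal nibble of the fully expanded address.
        labels = list(addr.exploded.replace(":", ""))
    return ".".join(reversed(labels)) + "." + rbl_domain
```

For example, dnsbl_query_name("192.0.2.1", "rbl.example.org") gives 1.2.0.192.rbl.example.org. Whether the host counts as listed then depends on the records returned for that name.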

5.2.1 Entering the Master List of DNS RBLs

To use DNS-based RBLs, you first enter a master list of RBLs that CanIt-PRO can potentially use. To do this, click on Administration and then Master RBLs. The Master RBLs page appears:

Figure 5.2: Master RBLs

To enter an RBL:

1. Enter the domain in the RBL Domain box.

2. Enter a brief (but meaningful) description in the Description box.

3. Enter a short tag in the Tag box. This tag is used in the mail log and incident reports to identify the RBL. If you leave it blank, CanIt-PRO will construct a unique identifier for the RBL based on the domain, type and data.

4. Select how the RBL is to be used:

(a) A Block RBL is used to block unwanted mail. Users will be able to create “Ignore”, “Hold”, “Reject” or “Score” RBL rules. Any “Score” rule will have to have a non-negative score.

(b) An Allow RBL is used to list known good mail servers. Users will be able to create “Ignore” or “Score” rules, but any “Score” rule will have to have a non-positive score. In addition, no extra greylist delay may be created for an Allow RBL.

5. Select the type of addresses listed by the RBL:

(a) If you know that the RBL lists only IPv4 addresses, set the Address Family to IPv4.

(b) If you know that the RBL lists only IPv6 addresses, set the Address Family to IPv6.

(c) If the RBL lists both IPv4 and IPv6 addresses, set the Address Family to Both IPv4 and IPv6. If you are not certain whether or not the RBL lists IPv6 addresses, the “Both” setting is safest.

6. Select the type of the RBL:

(a) If the RBL is considered to be “hit” if any record is returned, set the type to normal. Most DNS-based blocklists are of this type.

(b) If the RBL returns specific A records to indicate a hit, set the type to match and enter the A record that indicates a hit in the Data field.

(c) If the RBL returns information in a bitmask in the returned A record, set the type to mask and enter the mask (for example, 0.0.0.4) in the Data field. A mask-type RBL is considered to be hit if the returned A record bitwise-ANDed with the data field returns non-zero.

7. Click Submit Changes
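The three RBL types in step 6 differ only in how the returned A records are interpreted. The following is a minimal sketch of that interpretation, assuming the records are available as dotted-quad strings; the function name rbl_hit and the sample addresses are hypothetical, and this is not CanIt-PRO's own code.

```python
import ipaddress


def rbl_hit(rbl_type, returned_records, data=None):
    """Decide whether an RBL lookup counts as a hit.

    returned_records: list of A records (dotted-quad strings) from the lookup.
    data:             contents of the Data field (match and mask types only).
    """
    if rbl_type == "normal":
        # Any returned record is a hit.
        return len(returned_records) > 0
    if rbl_type == "match":
        # Only the specific A record in the Data field is a hit.
        return data in returned_records
    if rbl_type == "mask":
        # A hit if any returned record ANDed with the mask is non-zero.
        mask = int(ipaddress.IPv4Address(data))
        return any(int(ipaddress.IPv4Address(r)) & mask for r in returned_records)
    raise ValueError("unknown RBL type: " + repr(rbl_type))
```

With a mask of 0.0.0.4, a returned record of 127.0.0.6 is a hit (6 AND 4 is non-zero), while 127.0.0.2 is not.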

To delete an RBL, enable the checkbox beside the entry you wish to delete and click Submit Changes. Deleting a master RBL also deletes all RBL rules that refer to it.

You can change the timeout for RBL lookups by adjusting the value in the Timeout in seconds for DNS-RBL lookups box.

The master RBL list is merely a list of all the RBLs that CanIt-PRO can potentially use. To actually set up RBL rules, please see the User’s Guide. RBL rules can be created on a per-stream basis, so different streams can elect to use none, some or all of the predefined Master RBLs.

Note: Various RBLs have different terms-of-service. Some require licensing or payment; please be sure you are allowed to use an RBL before entering it into CanIt-PRO’s RBL list.

5.2.2 combined.bl.rptn.ca

Roaring Penguin Software Inc. publishes four DNS-based lists for CanIt-PRO customers. These lists are automatically entered into the Master RBL list (but no rules are created automatically). The four lists are:

• The Greylist-Stumbler list. These are machines known to have trouble getting past greylisting. The machines are very likely compromised PCs. We recommend making a rule to add one point for machines on this list, and also to extend the greylist period (if you use greylisting) to 60 minutes.

• The Dictionary-Attacker list. These are machines known to send mail to many nonexistent addresses. We recommend a rule to add one point for machines on this list.

• The Spam-Source list. These are machines known to send spam and relatively little non-spam. We recommend adding three points for machines on this list.

• The Good list. These are machines that send relatively little spam, quite a lot of non-spam, and have no trouble with greylisting or sending to nonexistent recipients. We recommend subtracting 0.5 points for machines on this list.

Note: The combined.bl.rptn.ca list requires a secret token for lookups to succeed; this token is changed once a day. CanIt-PRO automatically obtains and uses the token for as long as your support term is in force. This means that you cannot use the list outside of CanIt-PRO. If you do a high volume of lookups, please contact Roaring Penguin Software to arrange for a zone transfer via rsync.
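The recommended score adjustments for the four lists can be written down as a small table. The sketch below is purely illustrative: the list keys and the function rbl_points are hypothetical names, and it assumes that the points from several “Score” rules are simply added together. The actual rules are created through the Web interface as described in the User’s Guide.

```python
# Recommended points per list, taken from the recommendations above.
RECOMMENDED_POINTS = {
    "greylist-stumbler": 1.0,
    "dictionary-attacker": 1.0,
    "spam-source": 3.0,
    "good": -0.5,
}


def rbl_points(lists_hit):
    """Total score adjustment for a host that appears on the given lists."""
    return sum(RECOMMENDED_POINTS[name] for name in lists_hit)
```

Under these assumptions, a host on both the Spam-Source and Dictionary-Attacker lists would gain four points, and a host on the Good list alone would lose half a point.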

5.3 Users

CanIt-PRO maintains its own table of users. You should enter users into this table to create CanIt-PRO administrative users, or users with different privileges from the default (for example, a demo user.) Click on Administration and then Users to set up users. You will see the user management screen.

Figure 5.3: Users

If you enter a string in the “Filter:” box, then CanIt-PRO limits the display to entries whose User-ID or E-Mail column contains that string.

5.3.1 User Privileges

When a user logs in to CanIt-PRO, he or she can see a single stream at a time. Every user always has access to a stream that (usually) has the same name as his user name. The CanIt-PRO administrator can give users permission to see additional streams. For example, the user janedoe always has access to the stream janedoe. However, if she manages a mailing list called joke-list, you have two options:

1. You can stream messages for the list to janedoe, so she has only a single spam trap to consider.

2. You can create a new stream called joke-list and give access to that stream to janedoe. In this way, she can use different settings, blacklists and whitelists for the list than she does for her personal e-mail.

Each CanIt-PRO user has two special privileges, which can be on or off:

• A user with root privilege can add, edit and delete other users.

• A user with write privilege can mark messages as spam or not-spam, and can blacklist and whitelist hosts, domains and senders.

Note that CanIt-PRO allows for additional flexibility in controlling which parts of the Web interface are available to various users. For details, see Chapter 8.

5.3.2 Adding a User

To add a user, click on the Add User link. The Add User screen appears:

Figure 5.4: Add User

• Enter the user-ID of the user in the User-ID box.

• To set the user’s e-mail address, enter it in the E-Mail field. (If CanIt-PRO knows a user’s e-mail address, the “Locked Addresses” feature can be used.)

• Enter a password for the user in the Password and Confirm Password fields.

• If you set Locked Password? to Yes, then the user will have a “locked” password and will not be able to log in. However, if you have configured an alternate user authentication method, the user will be able to log in using a password that the alternate method accepts.

• If you only want the user to have read-access to the spam trap, set Write Access? to No.

• If you want to make the user an administrator, set Has Root Access? to Yes.

Once you have filled in the fields, click Add User to add the user.

Note: Both user-names and passwords are case-sensitive; a user named user1 is completely different from one named User1.

5.3.3 Editing a User

To edit a user, click on the User-ID on the user management screen. You will see the user-editing screen.

Figure 5.5: Edit User

• To set the user’s e-mail address, enter it in the E-Mail field.

• If you wish to change the user’s password, enter it in the Password and Confirm Password fields. If you leave these fields blank, the password will not be changed.

• If you set Locked Password? to Yes, then the user will have a “locked” password and will not be able to log in. However, if you have configured an alternate user authentication method, the user will be able to log in using a password that the alternate method accepts.

• Adjust the write-access privilege by setting the Write-Access? checkbox appropriately. Note that you cannot grant or revoke root privileges by editing a user; root privileges are given or denied at user-creation time.

To make the changes take effect, click Submit Changes.

5.3.4 Deleting a User

If there is more than one user, a Delete checkbox appears beside those users that can be deleted. Enable the checkbox and then click Submit Changes to delete the selected user or users. Note that it is not possible to undo the deletion! Note that if you delete a user, he may still have access if he can be authenticated using an external authentication mechanism.

5.3.5 Granting Access to Streams

If you wish to grant a user access to additional streams, click on the Edit Accessible Streams button (Figure 5.5). The following page will appear:

Figure 5.6: Granting Access to Streams

To grant access to a stream, enter the stream name in the input box and click Add Stream. To revoke access to a stream, click on the Delete link next to the stream name. Note that a user always has access to a stream with the same name as his user name, and this access cannot be revoked. Also, the CanIt-PRO administrator can access any stream, regardless of the settings on this page.
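The access rules above amount to a three-way check. The following is a minimal sketch under stated assumptions: the function can_access and the granted table are hypothetical, “administrator” is taken to mean a user with root privilege, and the user's own stream is assumed to have the same name as the user (which, as noted in Section 5.3.1, is the usual case).

```python
def can_access(user, stream, is_root, granted):
    """Return True if `user` may view `stream`.

    granted: dict mapping a user name to the set of additional streams
             that the administrator has granted to that user.
    """
    if is_root:
        return True              # the administrator can access any stream
    if stream == user:
        return True              # own stream; this access cannot be revoked
    return stream in granted.get(user, set())
```

For example, with granted = {"janedoe": {"joke-list"}}, janedoe can view janedoe and joke-list but not sales.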

5.4 Permitting Users to Opt In

In the CanIt-PRO global settings (Section 5.1), the CanIt-PRO administrator can control:

• Whether or not people are permitted to opt-in to spam scanning.

• Whether the default setting is opt-in or opt-out.

There are three useful combinations:

1. Permit everyone to opt-in, and have the default be opt-in.

2. Permit everyone to opt-in, and have the default be opt-out.

3. Permit only selected people to opt-in, and have the default be opt-out.

In the first two cases, the administrator need not do anything special. In the third case, you must add entries to the Stream Approval Table. Click on Administration and then Opt Others In/Out to see this table:

Figure 5.7: Stream Opt-In Approval

If the “Approved?” column is checked, then the stream may opt in to spam scanning. If it is not checked, then the stream may not opt in to spam scanning. If the “Opted-In?” column is checked, the stream is currently opted in to spam scanning. Otherwise, it is not.

To add a stream to the table, enter the stream name in the input box and set “Approved?” and “Opted-In?” appropriately. Then click Submit Changes.

To edit existing streams, adjust “Approved?” and “Opted-In?” appropriately and click Submit Changes.

To delete a stream from the opt-in table, enable the Delete? checkbox on the appropriate row and click Submit Changes.

If the default setting is to permit anyone to opt in to spam scanning, you can nevertheless exclude particular streams from being able to opt in by entering them in the Stream Approval Table and turning off the “Approved?” checkbox.

In order for spam-scanning to occur, a stream must be both approved and opted-in. If the stream is not found in the Stream Approval Table, then the defaults are taken from the Global Settings.

If you enter a string in the “Filter:” box, then CanIt-PRO limits the display to entries whose Stream column contains that string.
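The rule that a stream must be both approved and opted-in, with defaults taken from the Global Settings, can be sketched as follows. The function is_scanned and its arguments are hypothetical, and the mapping of defaults is an assumption: G-4020 set to Yes is taken to mean “not opted in by default”, and G-4030 set to Yes is taken to mean “not approved by default”.

```python
def is_scanned(stream, approval_table, must_opt_in, must_be_approved):
    """Return True if mail for `stream` is spam-scanned.

    approval_table:   dict mapping stream name -> (approved, opted_in)
    must_opt_in:      value of G-4020 (True for Yes)
    must_be_approved: value of G-4030 (True for Yes)
    """
    entry = approval_table.get(stream)
    if entry is None:
        # Stream not in the Stream Approval Table: use the global defaults.
        approved = not must_be_approved
        opted_in = not must_opt_in
    else:
        approved, opted_in = entry
    return approved and opted_in
```

Under these assumptions, a stream that is absent from the table is scanned only when both G-4020 and G-4030 are set to No, and a stream entered with “Approved?” turned off is never scanned.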

5.5 Groups

For the purpose of granting permissions, CanIt-PRO allows you to create groups. A group is simply a collection of users. To edit groups, click on Administration and then Groups. The Groups Page appears:

Figure 5.8: Groups

5.5.1 Creating, Deleting and Editing Groups

To create a new group:

1. Enter the name of the group in the Group box.

2. Enter a description of the group in the Description box.

3. Click Submit Changes

To delete an existing group:

1. Enable the Delete checkbox for the group you want to delete.

2. Click Submit Changes

To edit a group:

1. Click on the Edit link next to the appropriate group. The Group Members page appears:

Figure 5.9: Group Members

2. Enter new members (one per line) in the Member text area.

3. If you want to delete existing members, enable the appropriate Delete checkbox.

4. Click Submit Changes

Note: External authentication methods can affect group membership. See Chapter 6 for details.

In the Groups Page (Figure 5.8), click on Permissions to edit the permissions associated with the group. Permissions will be discussed in detail in Chapter 8.

5.6 Viewing Active Streams

The CanIt-PRO administrator can look at all the streams with entries in the incidents table. To do this, select Administration and then See Active Streams. The Active Streams Page appears:

Figure 5.10: Active Streams

5.6.1 Definition of an Active Stream

A stream is considered “active” if it has at least one message in the trap (pending, spam or non-spam) or has any rules, blacklists or whitelists defined.

5.6.2 The Active Stream Display

The columns in the display are:

Stream The name of the stream. Each stream name is a hyperlink; if you click on the link, you will switch streams to that stream.

Pending The number of pending messages in the stream’s trap.

Spam The number of spam messages in the stream’s trap.

Non-Spam The number of non-spam messages in the stream’s trap.

Opted-In? Set to Yes if the stream is both approved for anti-spam scanning and opted-in; set to No otherwise.

Delete A column of links for deleting streams.

If you enter a string in the “Filter:” box, then CanIt-PRO limits the display to entries whose Stream column contains that string.

5.6.3 Deleting a Stream

To delete a stream, click on the Delete link in the Active Streams page. Then click on Yes, delete it! to confirm deletion. Deleting a stream deletes all incidents, rules, settings, etc. associated with the stream.

5.7 Filtering Outgoing Mail

Some organizations like to add boilerplate disclaimers to outgoing mail. CanIt-PRO can achieve this by streaming all outgoing mail to an “outgoing” stream, and adding boilerplate options for that stream. One way to stream all outgoing mail to a particular stream is to set up your domain mappings as follows:

• All of your own domains (that is, domains considered “internal”) should have mappings set up. The mappings could be ChopDomain, Sendmail, or whatever, as long as the mappings exist.

• The wild-card domain * should have a domain mapping of Database.

• The wild-card address * should have an address mapping mapping it to the stream outgoing. (You can name your outgoing stream however you like.)

With these settings, mail for internal recipients will be streamed appropriately, and mail for external recipients will all be streamed to outgoing.

For the outgoing stream, enter the appropriate boilerplate to add to outgoing messages. You can also add custom body-matching rules if you want to trap mail containing certain words, for example, “Do Not Distribute Externally”. Such rules on an outgoing stream may help prevent unauthorized distribution of confidential information.

See also Known Networks (Section 4.7 on page 50) for another way to force outgoing mail into a specific stream. Using Known Networks may be simpler than using address mappings if all your outgoing mail originates from a limited set of IP addresses.
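The effect of the wild-card mappings described in this section can be sketched as a simple lookup. This is a simplified model, not CanIt-PRO's mapping code: the function stream_for, the domain example.com, and the dictionaries are hypothetical, only the ChopDomain and Database methods are modelled, and ChopDomain is assumed to use the user name part of the address as the stream name.

```python
def stream_for(recipient, domain_map, address_map):
    """Pick a stream for `recipient` using domain and address mappings."""
    local, _, domain = recipient.rpartition("@")
    # Use the mapping for the recipient's domain, else the wild-card domain "*".
    method = domain_map.get(domain, domain_map.get("*"))
    if method == "ChopDomain":
        return local
    if method == "Database":
        # Look for the address itself, then fall back to the wild-card address "*".
        return address_map.get(recipient, address_map.get("*"))
    return None  # other streaming methods are not modelled in this sketch


# Hypothetical setup: one internal domain, everything else goes to "outgoing".
domain_map = {"example.com": "ChopDomain", "*": "Database"}
address_map = {"*": "outgoing"}
```

Here mail to jane@example.com is streamed to jane, while mail to any recipient outside example.com is streamed to outgoing.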

5.8 Copying Rules from One Stream to Another

Occasionally, it is useful to copy or move rules from one stream to another. To do this, click on Administration and then Copy Rules. The Copy Rules page appears:

Figure 5.11: Copying Rules

To copy rules:

1. Choose which rules you wish to copy by activating the appropriate check boxes under Objects to Migrate.

2. Put the name of the stream you want to copy from in the From stream: box.

3. Put the name of the stream you want to copy to in the To stream: box.

4. Select “Preserve Original” or “Overwrite” to handle the case of conflicting rules in the source and destination streams.

5. Click on Copy Rules to copy rules from the source stream to the destination stream.

Move Rules is similar, but any rule that is successfully placed in the destination stream is deleted from the source stream.
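The difference between the conflict options, and between Copy Rules and Move Rules, can be sketched with rules modelled as dictionary entries. This is an illustration only: the function copy_rules is hypothetical, and it assumes that “Preserve Original” keeps the rule already present in the destination stream while “Overwrite” replaces it.

```python
def copy_rules(source, dest, overwrite=False, move=False):
    """Copy rules from `source` to `dest`; return the keys that were placed.

    source, dest: dicts mapping a rule key to its definition.
    overwrite:    False models "Preserve Original", True models "Overwrite".
    move:         True models Move Rules.
    """
    placed = []
    for key in list(source):
        if key in dest and not overwrite:
            continue                 # conflict: keep the destination's rule
        dest[key] = source[key]
        placed.append(key)
        if move:
            del source[key]          # only rules actually placed are removed
    return placed
```

Note that in this model a rule skipped because of a conflict stays in the source stream even when moving, since it was never placed in the destination.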

5.9 Secondary MX Hosts

Secondary MX hosts require special handling by CanIt-PRO. Secondary MX hosts which relay to the CanIt-PRO system should always be listed in “Known Networks”, with the options below checked, as it is usually desirable to modify CanIt-PRO behaviour as follows:

Note that localhost (127.0.0.1) is always considered a secondary MX host for the purposes below:

Don’t Tempfail Incidents When checked, suspect mail is not responded to with an SMTP temporary-failure code; instead, it is held locally in the CanIt-PRO database. This prevents additional load on your CanIt-PRO systems and your secondaries caused by continual retransmission. (However, if you set the Send tempfail indications for suspect messages setting to Always, then CanIt-PRO will tempfail mail from these hosts even if this setting is checked.)

Parse Received Headers When checked, CanIt-PRO trusts the Received: header added by that connecting host or network. This means that CanIt-PRO will be able to apply host checks against the host that submitted the message to your network, rather than against your secondary MX server. In addition, CanIt-PRO will avoid applying Mismatch rules, and rule checks against the “HELO” field if the connecting IP address is in a network with this setting enabled.

Prohibit Blacklisting When checked, CanIt-PRO ignores any host-blacklists for hosts in this network. This will prevent locally-generated mail from your secondary MX hosts from being blacklisted. Note that if “Parse Received Headers” is enabled, mail relayed via the secondary system will show as being from the upstream IP, and blacklists will not be ignored.

Skip RBL Lookups When checked, CanIt-PRO will suppress DNS blacklist lookups.

Skip Greylisting When checked, CanIt-PRO will suppress first-time sender checks.

Any machine under your control that you expect to forward mail to your machine should be considered a secondary MX host. For example, if a number of users have accounts on a machine that forward mail to your machine using .forward files, you should consider entering that machine as a secondary MX host. Also, note that if CanIt-PRO is able to determine the “real” relay IP by parsing the Received: headers, and you have enabled this option, then CanIt-PRO runs all the host checks as usual, using the real relay IP address. However, these checks are (of necessity) delayed until after the DATA phase of the SMTP transaction, because CanIt-PRO does not have the required information at the MAIL FROM: or RCPT TO: phases.

5.10 Avoiding Backscatter

Under most circumstances, if CanIt-PRO rejects a message, it responds with an SMTP failure code. This generally causes the sending relay to mail a failure notification to the original sender. However, because most spam and viruses have faked sender addresses, you may not want this behavior for messages relayed from a secondary MX host or for messages split into multiple streams.

That’s because if a message is rejected after having been accepted by one of your mail servers, it’s the responsibility of the sending server to generate a failure Delivery Status Notification or DSN. If (as is likely) the sender address is faked, that failure message may arrive at an unsuspecting third-party. This is what is known as backscatter.

It is a violation of RFC 821, and is generally considered bad behavior, to silently discard mail; however, many sites are beginning to lump hosts responsible for generating backscatter into the same category as spammers. Because of this, CanIt-PRO will not generate a failure notification for mail from the local host or from a designated secondary MX host.

5.11 Test Plugins

Some anti-spam tests are very specific and are implemented as plugins. Currently, CanIt-PRO ships with one plugin: An anti-phishing plugin that examines messages for reply addresses that are on a known list of phishing addresses. If a plugin matches against a particular message, the plugin is said to have fired. To configure test plugins, click Rules and then Plugins. The Test Plugins page appears:

Figure 5.12: Test Plugins

For each plugin, you can configure actions to be taken on a per-stream basis (although we recommend creating rules only in the default stream.) To configure a plugin:

1. Select the action to be taken if the plugin fires. The action can be one of:

   • Ignore: do not use this plugin at all.
   • Hold: hold mail in the trap if the plugin fires.
   • Score: add the score in the Score column if the plugin fires.
   • Reject: reject the mail if the plugin fires.

2. If you chose Score, enter a decimal score in the Score column.

3. If you wish, you can add a comment in the Comment column.

4. Click Submit Changes to make the changes take effect.
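The four plugin actions can be summarized in a short sketch. The function apply_plugin and its return convention are hypothetical; in particular, how a Hold or Reject result interacts with the rest of CanIt-PRO's scoring is not described here and is not modelled.

```python
def apply_plugin(action, fired, message_score, plugin_score=0.0):
    """Return (verdict, new_score) for one plugin on one message.

    verdict is None (no direct effect), "hold" or "reject".
    """
    if not fired or action == "Ignore":
        return None, message_score
    if action == "Hold":
        return "hold", message_score
    if action == "Reject":
        return "reject", message_score
    if action == "Score":
        # Add the value from the Score column to the message score.
        return None, message_score + plugin_score
    raise ValueError("unknown action: " + repr(action))
```

For instance, a plugin configured with the Score action and a score of 10 raises a message score of 2.5 to 12.5 when it fires, and leaves it unchanged when it does not.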

5.11.1 The PhishingAddress Plugin

The PhishingAddress plugin consults a dynamically-updated list of e-mail addresses known to be used in phishing attacks. We recommend configuring it as follows:

• In the default stream, configure the test to add 10 points to the message score. Alternatively, you may wish to configure it to reject mail.


• If you are routing outbound mail through CanIt-PRO, then you should be sending outbound mail through a dedicated outbound stream. In that stream, configure the PhishingAddress plugin to reject mail. If users accidentally reply to a phishing e-mail that somehow got through, at least by rejecting their replies you will prevent sensitive information from reaching the attackers.

5.12 Emergency Blocking of Delivery Status Notifications

Sometimes, a spammer will process a large spam run and fake the sender address to be within a domain you control. Faking the sender address as if it comes from an innocent third party is called a joe-job. Unfortunately, in a typical spam run, a large percentage of the recipient addresses are invalid, so the run creates many delivery failure notifications (officially called Delivery Status Notifications or DSNs). Because of the faked sender address, all of these notifications come back to you, the innocent third party. These spurious failure notifications are called backscatter and can cause a huge load on your CanIt-PRO scanners, as well as a huge annoyance for end-users.

CanIt-PRO has a feature that allows you to block DSNs for selected domains for a limited time. This is an emergency measure and should only be used for a limited time in the face of large amounts of backscatter.

Normally, this feature is disabled. To enable the feature, click on Setup : Features and enable the Permit Emergency Blocking of Delivery Status Notifications feature. Next, click on Rules : Block DSNs. The Block Delivery Status Notifications page appears:

Figure 5.13: Block Delivery Status Notifications Page

To turn on DSN-blocking for a domain:

1. Enter the domain name in the Domain box.

2. Pick an expiry date. The default expiry date is 5 days in the future. CanIt-PRO will not let you pick an expiry date more than 30 days in the future.

3. If you wish, add a comment explaining why you are enabling DSN-blocking.

4. Click Submit Changes.

To edit the expiry date and comment for existing entries, change the text in the appropriate boxes and click Submit Changes.

To remove DSN-blocking from a domain, enable the appropriate Delete? checkbox and click Submit Changes.

5.13 Removing All Rules and Settings from a Stream

On occasion, it may be necessary to delete all rules, blacklists, whitelists, settings, etc. from a stream. If a novice user has created many such rules and settings, the stream may be unusable and a “factory reset” advised. To delete all rules and settings from a stream:

1. Switch to the stream in question with the View This Stream button.

2. Click on Rules.

3. Click on the link after the phrase “To delete all rules and stream settings for stream streamname, click here.”

4. Click Purge Rules to delete all the rules and settings, or Cancel to cancel.

Note: It is not possible to purge all rules from the default stream.

Chapter 6

External Authentication

6.1 Introduction

In addition to its built-in user list, CanIt-PRO can authenticate users using external mechanisms. To enable the use of external authentication mechanisms, these basic steps must be followed:

1. A User Lookup must be defined. A User Lookup describes to CanIt-PRO how to look up user information from an external source.

2. An Authentication Mapping must be created. An Authentication Mapping tells CanIt-PRO which User Lookup to use for a given domain. You can use different authentication mechanisms for different domains, which gives CanIt-PRO considerable flexibility.

Some User Lookups can also perform streaming. That is, given an email address, they can return the name of the stream associated with that email address. The LDAP (Section 6.2.2) and Program (Section 6.2.3) User Lookups can perform streaming. Using a User Lookup to perform streaming is very powerful; for example, you could use an LDAP lookup to stream all of a user’s aliases into his single stream.

6.2 User Lookups

To create a User Lookup:

• Click on Setup and then User Lookups. You will see the User Lookup list:


Figure 6.1: User Lookup List

• Click on Add a New User Lookup, and the User Lookup Wizard appears:

Figure 6.2: User Lookup Wizard

• Pick a name for the User Lookup, and click Next. The User Lookup method selection screen appears:

Figure 6.3: User Lookup: Method Selection

• Enter a comment for the lookup method. The comment can be anything you like; its purpose is to document the method so you remember what it does.


• Select a lookup method. CanIt-PRO supports the following methods:

  – POP3: CanIt-PRO authenticates users against a POP3 server.

  – IMAP: CanIt-PRO authenticates users against an IMAP server.

  – LDAP (Generic): CanIt-PRO authenticates users against an LDAP server.

  – LDAP (Active Directory): This is very similar to generic LDAP, but default values in the User Lookup creation wizard are more suitable for Active Directory lookups. Note that you can select LDAP (Active Directory) only when you create a new user lookup. Once the user lookup has been created, it is treated as a generic LDAP lookup.

  – Program: CanIt-PRO invokes a program (that you supply) to perform authentication.

  – Program (Legacy method): CanIt-PRO invokes external programs in the same way as older versions did (using the “Alternate Authentication” global setting that has since been removed).

• Click Next.

6.2.1 IMAP and POP3 Authentication

If you selected IMAP or POP3 authentication methods, then the wizard looks like this:

Figure 6.4: IMAP/POP3 User Lookup

To complete the setup:

• Enter the IP address or host name of the IMAP or POP3 server. If the server is listening on a non-standard port, add a slash followed by the port number to the server name. For example, if you have an IMAP server listening on port 1143 on the host magnesium, you could enter magnesium/1143 as the server.

• If you would like to strip the domain name from the login name before attempting authentication, set the “Strip domain name” setting to Yes. If someone logs in to CanIt-PRO as [email protected] and this setting is Yes, then the username passed to the IMAP or POP3 server is simply user. (The default home stream, however, is normally the full [email protected].)

• If you would like to strip the domain name from the home stream, set “Strip domain name from home stream after authentication?” to Yes. This means that if someone logs in as [email protected], her home stream will be user.

• If you want CanIt-PRO to validate the SSL certificate of the server (assuming SSL or TLS is used), set “Validate server certificate” to Yes.

• Pick the appropriate encryption settings for CanIt-PRO to use when communicating with the POP3 or IMAP server.

• Click Next to see a summary of your settings.

• If all of the settings are correct, click Finish to create the POP3 or IMAP User Lookup.
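The server notation and the “Strip domain name” setting described above can be illustrated with a small shell sketch. This is not CanIt-PRO's own code: the function names split_server and strip_domain are hypothetical, the default port is whatever the caller supplies (143 is the standard IMAP port), and the sketch assumes the login name contains at most one “@”.

```shell
#!/bin/sh
# Illustrative sketch only; function names are hypothetical.

# Print "host port" for a server entry such as "magnesium/1143".
# $1 = server entry, $2 = port to assume when no "/port" is given.
split_server() {
    case $1 in
        */*) printf '%s %s\n' "${1%%/*}" "${1##*/}" ;;
        *)   printf '%s %s\n' "$1" "$2" ;;
    esac
}

# Print the login name with the "@domain" part removed, which is the
# effect described for the "Strip domain name" setting.
strip_domain() {
    printf '%s\n' "${1%%@*}"
}
```

For example, split_server magnesium/1143 143 prints the host magnesium and the port 1143, while split_server magnesium 143 falls back to port 143.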

6.2.2 LDAP Authentication and Streaming

There are two types of LDAP user lookups possible within CanIt: LDAP (Generic) and LDAP (Active Directory). Both of these methods are very similar; the only difference is the defaults that are offered by the User Lookup Wizard when you first create the User Lookup. (Once the user lookup has been created, it is always referred to as LDAP (Generic) thereafter.) LDAP user lookups can be used for one or both of user authentication and stream mapping. When used for stream mapping, the LDAP lookup method will also validate incoming email addresses against the LDAP server, allowing rejection of invalid recipients immediately at the CanIt gateway.

Note: In order for the LDAP User Lookup to validate incoming recipient addresses, it must be used for streaming in Domain Mapping. Be sure to use another method of validation (e.g. Verification Servers (see Section 4.4) or the Valid Recipients table) if you do not use your User Lookup for streaming.

If you select one of the LDAP methods, you will see the LDAP User Lookup Wizard:


Figure 6.5: LDAP User Lookup

To complete the setup:

• If you wish to use this User Lookup for authentication, set “Use this method for authentication?” to Yes.

• In the “LDAP server(s)” box, enter the IP address or host name of your LDAP server. You can enter a comma-separated list of servers if you have more than one LDAP server. As with the IMAP and POP3 User Lookups, if a server listens on a non-standard port, enter a slash followed by the port number after the server name. For example, if you have two LDAP servers serverA and serverB, and the second listens on non-standard port 3389, enter the following into the server box:

    serverA, serverB/3389

  If you want to use LDAPS (LDAP over SSL), enter the host name as an “ldaps” URL. For example:

    ldaps://server.example.com/


• Normally, CanIt-PRO tries the LDAP servers in order. If you would like it to try them in a random order (for load-balancing), set “Load-balance LDAP servers” to Yes.

• Enter the Base DN of your LDAP tree in the “Base DN” box.

• Typically, CanIt-PRO needs to bind to the LDAP directory before it can search it. Enter the Bind DN in the “Bind DN” box. If a password is required, enter it in the “Bind password” box. Note that Active Directory does not support anonymous bind; a Bind DN and Bind password are required.

• Some LDAP servers require CanIt-PRO to disconnect and reconnect and re-bind between queries. (Active Directory requires this.) If your LDAP server requires this, set the “Reconnect” setting to Yes.

• Enter the search filter for login authentication. The string %s will be replaced by the user’s login name. For most UNIX LDAP servers, a search filter of (uid=%s) is appropriate. For Active Directory, it might be (sAMAccountName=%s).

• If you would like to strip the domain name from the login name before attempting authentication, set the “Strip domain name” setting to Yes. If someone logs in to CanIt-PRO as [email protected] and this setting is Yes, then the username passed to the LDAP server is simply user.

• To use the Locked Addresses feature, CanIt-PRO needs to know the e-mail address of a logged-in user. In most UNIX LDAP servers, this is stored in the mail attribute, while in many Active Directory servers, this is stored in the attribute proxyAddresses.

• If you wish to control group membership using LDAP, enter the name of an LDAP attribute in the Attribute containing group names box. This attribute should contain a comma-separated list of group names. When a user authenticates, he/she will be considered to be a member of all of the groups listed in this attribute.

• If you wish to use the LDAP server to stream addresses as well as authenticate, set “Use this method for streaming” to Yes.

• For streaming, CanIt-PRO needs to look up an e-mail address in the LDAP server. For most UNIX servers, the appropriate search filter is (mail=%s), while for Active Directory, it is probably (proxyAddresses=smtp:%s). In the search filter, the string %s is replaced with the e-mail address, %u is replaced with the local part of the e-mail address (everything before the ‘@’) and %d is replaced with the domain part of the address (everything after the ‘@’).

• If you would like CanIt-PRO to force stream names (as determined by the LDAP lookup) to lower-case, set “Force stream name to lower-case?” to Yes. (This is the default.) If you want to preserve mixed-case stream names, set this setting to No.

• If you would like CanIt-PRO to cache stream lookups, set “Cache stream lookups in database” to Yes.

• CanIt-PRO needs to know which LDAP attribute contains the stream name. For most UNIX servers, the appropriate attribute is uid, while for Active Directory, it is probably sAMAccountName.


• If CanIt-PRO successfully looks up an e-mail address, but the LDAP record lacks an attribute for the stream name, CanIt-PRO can either tempfail the mail or simply place it in the default stream. Set “Action if stream attribute missing” to the choice that is appropriate for your organization.

Note: Recipient Validation (i.e. rejecting SMTP RCPT with “User unknown” when the address is not found in LDAP) is only done if CanIt-PRO receives a positive response that there is no corresponding LDAP record for the given e-mail address. Changes to this setting do not affect validation.

• You can change the connect timeout from the default value of 120 seconds to any value from 2 to 120 seconds. This timeout only applies to streaming lookups by the filters. It does not apply to authentication, because PHP (used for the Web interface) does not have a way to specify an LDAP connect timeout.

Once you have entered the LDAP parameters, click Next to review your entries, and Finish to create the User Lookup.
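The placeholder substitution described for the streaming search filter (%s, %u and %d) can be sketched in a few lines of shell. This is only an illustration of the documented substitution, not CanIt-PRO's implementation; the function name expand_filter is hypothetical, and the sketch does not escape characters that are special in LDAP filters.

```shell
#!/bin/sh
# Illustrative sketch only; expand_filter is a hypothetical name.
# $1 = search filter template, $2 = e-mail address being streamed.
expand_filter() {
    template=$1
    addr=$2
    case $addr in
        *@*) user=${addr%%@*}; domain=${addr#*@} ;;
        *)   user=$addr; domain= ;;
    esac
    out=
    # Walk the template one token at a time, replacing placeholders.
    while [ -n "$template" ]; do
        case $template in
            %s*) out=$out$addr;   template=${template#??} ;;
            %u*) out=$out$user;   template=${template#??} ;;
            %d*) out=$out$domain; template=${template#??} ;;
            *)   first=${template%"${template#?}"}   # first character
                 out=$out$first
                 template=${template#?} ;;
        esac
    done
    printf '%s\n' "$out"
}
```

Calling expand_filter '(proxyAddresses=smtp:%s)' with an address prints the filter with the address filled in, which can be handy when checking a filter by hand against your directory.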

6.2.3 Program Authentication and Streaming

With the Program User Lookup method, CanIt-PRO invokes an external program to authenticate users and map addresses to streams. If you select Program as your User Lookup type, the Program User Lookup Wizard appears:

Figure 6.6: Program User Lookup

To configure the Program User Lookup:

• Enter the full path to your “account-info” script. This is an executable script or program that you must supply. The path you supply must be an absolute path name. If you are running a CanIt-PRO cluster, this script must exist (and be identical!) on all scanning servers and the Web server.

• If you would like to strip the domain name from the login name before attempting authentication, set the “Strip domain name” setting to Yes. If someone logs in to CanIt-PRO as [email protected] and this setting is Yes, then the username passed to the program is simply user. The home stream, however, is normally [email protected].


• If you would like to strip the domain name from the home stream, set “Strip domain name from home stream after authentication?” to Yes. This means that if someone logs in as [email protected], her home stream will be user.

• If you would like to cache stream lookups, set “Cache stream lookups in database?” to Yes. We strongly recommend enabling caching.

How the Program User Lookup is Invoked

• For authentication, the program is invoked as follows:

    /path/to/script --authenticate

  The program is then expected to read two lines from its standard input: The first line is a login name, and the second line is a password. The program must then validate the login name and password, and exit with one of the following exit codes:

  – 0 — Authentication was successful.

  – 1 — Authentication failed.

• For obtaining user information, the program is invoked as follows:

    /path/to/script --info username

  Here, the program is passed the successfully logged-on user name as a command-line argument. It should print a series of key=value lines to its standard output, and exit with an exit status of 0. (The script doesn’t have to produce any output, but it can produce output if you want to pass extra information to CanIt-PRO.) The key/value pairs currently used by CanIt-PRO are:

  – home_stream=stream-name — sets the user’s home stream to stream-name instead of his or her login name. One possible use could be to convert a login name to all lower-case on systems that permit case-insensitive authentication. This ensures that no matter how the person logs in, she is directed to the correct stream name.

  – groups=group1,group2,...,groupN — when the user logs in, add her to all of the groups listed in the comma-separated list.

  – mail=email-address — set the user’s e-mail address to email-address.

• For mapping an e-mail address to a stream, the program is invoked as follows:

    /path/to/script --info-email address

  Here, address is an e-mail address that must be streamed. The script should write key=value lines to its standard output, and exit with one of the following exit codes:

  – 0 — the address exists and was successfully streamed.

  – 1 — there was a temporary failure streaming the address. The mail will be tempfailed.

  – 67 — the address is not valid. CanIt-PRO will fail the SMTP RCPT command with a “User unknown” failure code.


If the address was streamed successfully, the script must print the following line to standard output:

    stream=stream-name

This causes address to be mapped to stream-name. If no stream=stream-name line is emitted, but the script exits with a zero status, then CanIt-PRO falls back to database lookups, as described in Section 2.5 on page 29.
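Before pointing CanIt-PRO at a lookup script, it can be useful to exercise the script by hand in the same way the invocations above describe. The following is a minimal sketch of such a check; the helper names try_authenticate and try_stream are hypothetical, and the messages they print simply restate the documented meaning of each exit status.

```shell
#!/bin/sh
# Illustrative sketch only; helper names are hypothetical.

# Feed a login name and a password on standard input, as described
# for --authenticate. The return status is the script's exit status.
# $1 = script path, $2 = login name, $3 = password
try_authenticate() {
    printf '%s\n%s\n' "$2" "$3" | "$1" --authenticate
}

# Run --info-email and report how the result would be interpreted.
# $1 = script path, $2 = e-mail address
try_stream() {
    if output=$("$1" --info-email "$2"); then
        status=0
    else
        status=$?
    fi
    case $status in
        0)
            stream=$(printf '%s\n' "$output" | sed -n 's/^stream=//p' | head -n 1)
            if [ -n "$stream" ]; then
                echo "streamed to $stream"
            else
                echo "no stream= line: falls back to database lookups"
            fi ;;
        1)  echo "temporary failure: mail is tempfailed" ;;
        67) echo "invalid address: RCPT fails with User unknown" ;;
        *)  echo "undocumented exit status $status" ;;
    esac
}
```

For example, try_authenticate /path/to/script foo bar followed by echo $? shows whether the script accepts that login, and try_stream /path/to/script with an address shows which of the three documented outcomes applies.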

Sample Program for the Program User Lookup Method

The following is a very simple Bourne shell script illustrating how the Program User Lookup method works. Real scripts would obviously be more complex and probably written in a more appropriate language like Perl.


#!/bin/sh

do_auth () {
    read user
    read pass
    # In reality, we would do a directory lookup against LDAP or similar
    if test "$user" = "foo" -a "$pass" = "bar" ; then
        exit 0
    fi
    exit 1
}

do_info () {
    user="$1"
    # In reality, we would do a directory lookup against LDAP or similar
    if test "$user" = "foo" ; then
        echo "home_stream=foobar";
        echo "[email protected]";
    fi
    exit 0
}

do_info_email () {
    email="$1"
    # In reality, we would do a directory lookup against LDAP or similar
    if test "$email" = "[email protected]" ; then
        echo "stream=foobar-stream";
    fi
    if test "$email" = "[email protected]" ; then
        # No such user
        exit 67
    fi
    exit 0
}

# Main program
case "$1" in
    --authenticate)
        do_auth
        ;;
    --info)
        do_info "$2"
        ;;
    --info-email)
        do_info_email "$2"
        ;;
    *)
        exit 1;
        ;;
esac


6.2.4 Program Authentication (Legacy Method)

If you select this User Lookup method, then CanIt-PRO falls back to behavior compatible with previous versions. (This behavior is deprecated, however. New installations should use Program Authentication as described in Section 6.2.3.)

• If a program called /usr/share/canit/scripts/account-info exists and is executable, CanIt-PRO invokes it as if it were the script supplied for a Program User Lookup method.

• Otherwise, CanIt-PRO invokes /usr/share/canit/scripts/authenticate-user to authenticate users and /usr/share/canit/scripts/address-to-stream to convert an e-mail address to a stream. These scripts have been in use since CanIt-PRO 2.0 and are deprecated; you should convert to the new Program User Lookup method.

6.2.5 The account-info Script

Some User Lookup methods (such as POP3 or IMAP) as well as a lookup in the built-in user database are not capable of passing extra information back to CanIt-PRO. For that reason, if any User Lookup method other than Program or LDAP is used, CanIt-PRO still attempts to execute:

    /usr/share/canit/scripts/account-info --info username

to obtain extra attributes (mail, groups and home_stream) after a user logs in. If you need to set users’ e-mail addresses or home streams, but have them authenticate against an IMAP or POP3 server, simply supply an appropriate account-info script.
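As an illustration of what such an account-info script might emit, the sketch below prints a lower-cased home stream and, when the login name looks like an e-mail address, a mail attribute. The function name print_account_info and the policy it applies (lower-casing, and reusing the login name as the e-mail address) are assumptions made for the example; a real site would look these values up in its own directory.

```shell
#!/bin/sh
# Illustrative sketch only; the policy shown here is an assumption.
# $1 = login name of the user who has just authenticated.
print_account_info() {
    login=$1
    lower=$(printf '%s' "$login" | tr '[:upper:]' '[:lower:]')
    # Direct every spelling of the login name to one stream.
    echo "home_stream=$lower"
    # Only claim an e-mail address if the login name contains "@".
    case $lower in
        *@*) echo "mail=$lower" ;;
    esac
}

# In an installed script, the dispatch could look like this:
#   case "$1" in
#       --info) print_account_info "$2" ;;
#   esac
#   exit 0
```

The script only needs to answer --info in this scenario, because authentication itself is still handled by the IMAP or POP3 User Lookup.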

6.3 Authentication Mappings

Once you have set up your User Lookup methods, you need to tell CanIt-PRO which method to invoke for each domain. To do this, click on Setup and then Authentication Mappings. The Authentication Mappings page appears:


Figure 6.7: Authentication Mappings

To create a new authentication mapping:

1. Enter the domain name in the Domain field. If you enter a single asterisk (“*”) in this field, then it is used as the default authentication mapping if an exact match is not found.

2. Select the User Lookup from the Mapping field.

3. Click on Submit Changes.

In Figure 6.7, we see that anyone who logs in as [email protected] will be authenticated using the POP3-Sample User Lookup. Anyone logging in with a different domain (or no domain at all—simply user) will be authenticated using the LDAP-Sample User Lookup.

If you enter a string in the “Filter:” box, then CanIt-PRO limits the display to entries whose Domain or Mapping columns contain that string.
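The selection rule just described (an exact domain match first, then the “*” entry) can be sketched as follows. This is an illustration of the documented behavior rather than CanIt-PRO's code: the function name choose_lookup and the “domain lookup-name” table format are hypothetical, and the comparison is a plain case-sensitive string match.

```shell
#!/bin/sh
# Illustrative sketch only; choose_lookup and the table layout are
# hypothetical. The mapping table is read from standard input as
# "domain lookup-name" lines; $1 is the login name.
choose_lookup() {
    case $1 in
        *@*) domain=${1##*@} ;;
        *)   domain= ;;          # no domain at all: only "*" can match
    esac
    awk -v d="$domain" '
        $1 == d   { exact = $2 }
        $1 == "*" { fallback = $2 }
        END {
            if (exact != "") print exact
            else if (fallback != "") print fallback
        }'
}
```

With a table containing an entry for one domain mapped to POP3-Sample and a “*” entry mapped to LDAP-Sample, a login in that domain selects POP3-Sample and every other login selects LDAP-Sample.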

Chapter 7

Bayesian Filtering

7.1 Introduction to Bayesian Filtering

Bayesian filtering is a statistical technique whereby CanIt-PRO assigns a spam probability based on training from users. Bayesian filtering can greatly improve the accuracy of CanIt-PRO, and makes it harder for spammers to evade filtering. Please consult the CanIt-PRO User’s Guide for additional details on using Bayesian filtering. This guide only contains information relevant when setting up and administering CanIt-PRO.

7.2 Unauthenticated Voting

Normally, a user must log in to vote on whether a message is spam or not spam. You can configure CanIt-PRO to permit unauthenticated voting; this can make life easier for end-users, who can just click on a link without worrying about entering a user name and password.

Note: Think carefully about permitting unauthenticated voting. If voting links ever escape your organization (as part of a forwarded message, for example), and your CanIt-PRO Web interface is externally accessible, outsiders can cast votes. We strongly recommend permitting unauthenticated voting only if access to the CanIt-PRO Web interface is controlled in some other way.

To permit unauthenticated voting:

• Under Preferences and Stream Settings, set Permit unauthenticated voting to Yes.

You can permit unauthenticated voting on a stream-by-stream basis. If you permit it in the default stream, then it will be permitted in all streams that inherit from default (and that don’t override the setting.)


7.3 The Bayes Journal

Bayesian training can be slow because it involves many database updates. For that reason, when you train a message, CanIt-PRO simply makes a note of the fact that the message is to be trained in a special table called the Bayes Journal. Periodically, a CanIt-PRO daemon process goes through the Bayes Journal and actually updates the Bayes data. For this reason, if you train some messages, these results will not immediately appear in the Bayes Settings page. The Bayes Journal is run every 10 minutes or so, so your training should appear within 10-15 minutes.

7.4 RPTN

RPTN stands for the Roaring Penguin Training Network, and is a mechanism whereby multiple CanIt installations can share Bayes votes. RPTN contains two main parts:

1. In the reporting phase, CanIt-PRO installations send reports about whether or not mail they have seen is spam. A report essentially consists of a list of tokens in the mail message and a spam or not-spam flag, depending on how the incident was disposed of. The RPTN server aggregates all of the reports it receives and builds a database of Bayesian statistics from the reports.

2. In the download phase, a CanIt-PRO installation downloads the aggregated data and installs it in its database. This data can subsequently be used for Bayesian analysis.

To set up RPTN, click on Setup and then Wizards. Choose the RPTN Setup Wizard. The wizard leads you through the following steps:

1. You are asked if you would like to download Bayes data from RPTN.

2. If you answered Yes in Step 1, you are given an opportunity to limit when RPTN data is downloaded. Downloading RPTN data can place a fair amount of load on the server, so you should limit RPTN downloads to off-peak hours. Be sure to leave at least a four-hour download window, because RPTN checks are made every two hours. If the download window is too short, you may miss a download.

3. You are asked if you would like to submit reports to RPTN.

4. If you answered Yes in Step 1 or Step 3, you are prompted for your download username and password. You cannot submit RPTN reports or download RPTN data unless you supply a valid username and password.

5. Your settings are summarized, and you are prompted to click Finish to save the changes.
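The four-hour recommendation in Step 2 is easy to check with a little arithmetic. The sketch below assumes the window is expressed as whole start and end hours on a 24-hour clock (an assumption about how you record the window, not a description of the wizard's fields); the function names are hypothetical. Pass plain integers such as 8 rather than 08.

```shell
#!/bin/sh
# Illustrative sketch only; function names are hypothetical.

# Print the length in hours of a window from start hour $1 to end
# hour $2 (0-23), allowing windows that wrap past midnight.
window_hours() {
    echo $(( ($2 - $1 + 24) % 24 ))
}

# Succeed if the window is at least the recommended four hours.
window_ok() {
    [ "$(window_hours "$1" "$2")" -ge 4 ]
}
```

For example, a window from 22 to 3 is five hours long and passes the check, while a window from 2 to 4 is only two hours long and could fall between two consecutive RPTN checks.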

RPTN data are downloaded into a stream called @@RPTN. If you would like to use RPTN data in Bayesian analysis, you must include @@RPTN in the stream setting “Inherit Bayes training history from these streams”. If you want all streams to inherit Bayes data from @@RPTN, then set the “Inherit Bayes training history from these streams” setting in the default stream.


Note: To download RPTN data, the CanIt-PRO server must be able to make outgoing HTTPS connections (over TCP port 443) to the machine server.rptn.ca. To submit RPTN reports, the server must be able to make outgoing HTTPS connections to server.rptn.ca and also be permitted to send outgoing e-mail to [email protected]. If you have a firewall in front of the CanIt-PRO server, please ensure that the firewall rules permit the RPTN traffic.

7.5 Ruleset and Geolocation Data Updates

In addition to downloading Bayes data, CanIt-PRO uses your RPTN credentials to download two other sets of data:

• Updated rules that are pushed out from time-to-time by Roaring Penguin.

• Geolocation data that maps IP addresses to countries and cities. (The data are derived from the GeoLite City data from MaxMind, which requires the following acknowledgement: This product includes GeoLite data created by MaxMind, available from http://www.maxmind.com/)

The updated rulesets are simply SpamAssassin rules that Roaring Penguin publishes as required when a new spam variant is detected. The geolocation data is used by the country rules as described in the User’s Guide. CanIt-PRO also tokenizes the country, region, city and latitude/longitude of the sending relay for use in the Bayes database.


Chapter 8

Permissions

8.1 Introduction

In addition to the fairly coarse-grained settings described in Section 5.3.1, “User Privileges”, CanIt-PRO allows you to implement fine-grained control over access to various parts of the Web-based interface. CanIt-PRO has two kinds of permissions:

1. Stream Permissions control access to CanIt-PRO features that affect the filtering of e-mail. For example, the ability to whitelist or blacklist senders, create custom rules, and so on are all Stream Permissions. Stream Permissions depend on both the user and the stream; a given user may have different permissions in different streams.

2. User Permissions control access to various parts of the CanIt-PRO user interface not directly connected to filtering mail. For example, access to the different GUI preferences and the ability to do WHOIS lookups are both User Permissions.

CanIt-PRO can associate permissions with users and with groups. Any user can be a member of zero or more groups. CanIt-PRO always grants a user the union of all his user-specific permissions and all his group permissions. Adding a user to a group, therefore, can only ever grant additional permissions. It cannot take away permissions.

8.2 Stream Permissions

Every stream has associated with it an ordered list of stream classes. When CanIt-PRO looks up stream permissions, it first calculates the list of stream classes associated with a particular user and stream. Here is how CanIt-PRO computes the list of stream classes:

1. The name of the stream always comes first. Thus, for example, if you are viewing a stream called mystream, then the list of stream classes starts with mystream.


2. If mystream happens to be your “home stream” (Section 3.5), then @@HOME is added to the list of stream classes.

3. If you have write-access in mystream, then @@WRITABLE is added to the list of stream classes.

4. If you have read-access in mystream, then @@READABLE is added to the list of stream classes.

5. Finally, the wildcard value * is added to the end of the list of stream classes.
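The five steps above can be restated as a short shell sketch that prints the ordered list of stream classes, one per line. It is purely illustrative: the function name stream_classes and the yes/no arguments are hypothetical stand-ins for facts that CanIt-PRO determines internally.

```shell
#!/bin/sh
# Illustrative sketch only; names and argument layout are hypothetical.
# $1 = stream name
# $2 = "yes" if the stream is the user's home stream
# $3 = "yes" if the user has write-access in the stream
# $4 = "yes" if the user has read-access in the stream
stream_classes() {
    echo "$1"                                   # the stream name comes first
    if [ "$2" = yes ]; then echo "@@HOME"; fi
    if [ "$3" = yes ]; then echo "@@WRITABLE"; fi
    if [ "$4" = yes ]; then echo "@@READABLE"; fi
    echo "*"                                    # the wildcard comes last
}
```

For a user viewing his own writable and readable stream mystream, the list is mystream, @@HOME, @@WRITABLE, @@READABLE and finally *.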

When CanIt-PRO determines what permissions you have in a particular stream, it uses the following procedure:

1. It looks for permissions granted in the actual stream name. If it finds any, it stops searching the stream classes.

2. Otherwise, it checks the stream classes and adds all permissions found to the set of granted permissions.
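The two-step procedure above can be sketched in shell as follows. The sketch is an illustration under stated assumptions: permissions for one user are held in a table of “class permission” lines, the permission names are made-up single words, and the function name resolve_stream_permissions is hypothetical.

```shell
#!/bin/sh
# Illustrative sketch only; table layout and names are hypothetical.
# $1 = table of "class permission" lines
# $2 = actual stream name
# remaining arguments = the other stream classes, in order
resolve_stream_permissions() {
    table=$1
    stream=$2
    shift 2
    # Step 1: permissions granted in the actual stream name win outright.
    direct=$(printf '%s\n' "$table" |
        awk -v c="$stream" '$1 == c { print $2 }' | sort -u)
    if [ -n "$direct" ]; then
        printf '%s\n' "$direct"
        return 0
    fi
    # Step 2: otherwise collect permissions from every stream class.
    for class in "$@"; do
        printf '%s\n' "$table" | awk -v c="$class" '$1 == c { print $2 }'
    done | sort -u
}
```

Note the consequence of Step 1: once any permission is granted in the stream by name, permissions granted through @@HOME, @@WRITABLE, @@READABLE or * are not consulted for that stream.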

8.3 Determining Permissions

To determine a particular user’s permissions, CanIt-PRO performs the following steps:

1. First, it gathers all permissions associated with the particular user’s login ID. (These permissions are shown in Figures 8.3 and 8.4.)

2. Next, it adds all permissions granted to all the groups to which the user belongs.

3. If there was no entry in the permissions table for the particular user (that is, if Step 1 found no entries), then CanIt-PRO performs the following steps:

(a) If the user has root privileges, then CanIt-PRO adds all permissions granted to the pseudo-user *root*.

(b) Next, CanIt-PRO adds all permissions granted to the wild-card user *.
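The steps above amount to taking a union of several permission sets, with the *root* and * entries acting as defaults only when the user has no entry of his own. A minimal shell sketch, assuming a table of “subject permission” lines in which groups are written as group:NAME (a made-up notation) and using hypothetical function and permission names:

```shell
#!/bin/sh
# Illustrative sketch only; table layout and names are hypothetical.

# Print the permissions recorded for one subject.
# $1 = table of "subject permission" lines, $2 = subject
lookup_perms() {
    printf '%s\n' "$1" | awk -v s="$2" '$1 == s { print $2 }'
}

# $1 = table, $2 = login ID, $3 = "yes" if the user has root
# privileges, remaining arguments = groups the user belongs to
effective_permissions() {
    table=$1
    login=$2
    is_root=$3
    shift 3
    own=$(lookup_perms "$table" "$login")           # Step 1
    {
        if [ -n "$own" ]; then printf '%s\n' "$own"; fi
        for group in "$@"; do                        # Step 2
            lookup_perms "$table" "group:$group"
        done
        if [ -z "$own" ]; then                       # Step 3
            if [ "$is_root" = yes ]; then
                lookup_perms "$table" '*root*'
            fi
            lookup_perms "$table" '*'
        fi
    } | sort -u
}
```

A user with his own entry therefore never picks up the * defaults, whereas group permissions are always added.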

8.4 Granting Permissions

To grant or deny permissions, click on Administration and then Permissions. The Permissions Page appears:


Figure 8.1: Permissions Page

If you enter a string in the “Filter:” box, then CanIt-PRO limits the display to entries whose User column contains that string. If you want to edit permissions for groups rather than users, click on the Groups link:

Figure 8.2: Permissions Page

8.4.1 Granting Stream Permissions

To grant stream permissions, click on the Edit link in the Stream Permissions column. The Stream Permissions page appears:


Figure 8.3: Stream Permissions Page

• To enable a stream permission in a particular stream or stream class, enable the checkbox in the appropriate row and column.

• To enter the name of a stream or stream class, enter it into the text box in the Per-Stream Permission row. Note that when you enter permissions for a new user, you must enter the stream class in the text box, or your changes will be discarded.

• To delete all permissions for a particular stream or stream class, click the Delete link at the bottom of the appropriate column.

• To view permissions only for one stream or stream class, click on the stream or stream class name.

• To make your changes take effect, click Submit Changes.

The Stream Permissions are:

• Blacklist Senders – The user is permitted to blacklist senders.

• Whitelist Senders – The user is permitted to whitelist senders.

• Hold Senders – The user is permitted to add a hold rule for senders.

• Blacklist/Whitelist/Hold Domains – These permissions are similar to the Sender Action permissions, but they apply to domain rules.

• Blacklist/Whitelist/Hold Networks – These permissions are similar to the Sender Action permissions, but they apply to network rules.

• Reject/Accept/Hold MIME Types – These permissions are similar to the Sender Action permissions, but they apply to MIME type rules.

• Reject/Accept/Hold Filename Extensions – These permissions are similar to the Sender Action permissions, but they apply to filename extension rules.

• Custom Rules – The user is permitted to create custom rules.

• Mismatch Rules – The user is permitted to create mismatch rules.

• SPF Rules – The user is permitted to create SPF rules.

• RBL Rules – The user is permitted to create RBL rules.

• Country Rules – The user is permitted to create country-code rules.

• Bayes Settings – The user is permitted to edit Bayes scoring rules.

• Blacklisted Recipients – The user can blacklist recipients.

• Valid Recipients – The user can enter recipients into the Valid Recipients Table.

• See Pending/Non-Spam/Spam Message – The user can see the specified message type in the trap. Note that these permissions are normally off for @@READABLE streams; otherwise, the user could see default’s spam trap.

• Add Alternate Addresses to Streams – The user can add aliases to his/her stream.

• Opt In/Out – The user can opt in or out of spam-scanning.

• Adjust Notification Settings – The user can adjust his or her notification settings.

• See Per-Stream/Global Reports – The user can see the specified reports.

• Stream Settings – Every stream setting has an associated permission. The user can only see a stream setting if its corresponding permission has been granted. The user can only change a stream setting if the permission has been granted and the user has write-access in the stream.

Note: If a user does not have write-access in a stream, then permissions such as Custom Rules, Whitelist Senders, etc. merely permit the user to see the rules. He or she still cannot change them.
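The interaction between a granted permission and write-access can be summarized in a short sketch. The function name, parameter names and return strings below are illustrative only; they model the rule stated above and are not part of CanIt-PRO:

```python
def setting_access(has_permission: bool, has_write_access: bool) -> str:
    """Return the access level a user has to a stream setting or rule type.

    Hypothetical helper that models the rule in the text, not CanIt-PRO code.
    """
    if not has_permission:
        return "hidden"      # without the permission, the item is not shown
    if has_write_access:
        return "read-write"  # permission plus write-access: may change it
    return "read-only"       # permission alone: may only view it


print(setting_access(True, False))  # read-only
```

In other words, write-access never widens what a user can see; it only decides whether the items the user is permitted to see can also be changed.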

8.4.2 Granting User Permissions

To grant user permissions, click on the Edit link in the User Permissions column. The User Permissions page appears:

Figure 8.4: User Permissions Page

The following User Permissions may be granted:

• Preferences – Unless this permission is granted, the user will not have access to the Preferences menu or any of its sub-menus.

• WHOIS Lookups – If this permission is granted, the user will be allowed to do WHOIS lookups.

• See Statistics – Allows the user to see the Reports : Statistics page.

• Use API – Allows the user to access the REST-based CanIt-PRO API. See the API Guide for details.

• Use Log Searching – Allows the user to use the Log Searching feature (Chapter 14).

Note: Users must have root privileges to use Log Searching; non-root users cannot use it even if Use Log Searching is enabled. Also, the log-searching feature is available only on CanIt-PRO appliances.

• See User’s Guide – Enables the link to the user’s guide.

• Use Expert Interface – Grants the user access to the expert interface.

• Create RSS Feed – Grants the user permission to create an RSS feed link for pending messages.

• Turn off Stream Inheritance – Grants the user permission to completely isolate his stream by disabling inheritance from the default stream. We do not recommend granting this permission as a matter of course.

• Preferences – Each preference setting has an associated permission. A user can only change those settings for which permission has been granted.

Chapter 9

Streams, Inheritance and the Simple GUI

9.1 Simplification

CanIt-PRO is extremely versatile, allowing end-users to set many parameters such as blacklists, whitelists, custom rules, and so on. For many users, this is intimidating: they may be unsophisticated and just want to “make spam stop.”

CanIt-PRO allows the administrator to set up special streams with pre-configured settings. Unsophisticated users then see a very simple interface which allows them to choose one of these settings. CanIt-PRO achieves this with stream inheritance and special streams.

Note: Users who use the Simple GUI will not have their own traps. Special streams should be configured to pass, tag or reject. If any incidents are actually created, someone with administrative access will need to check the special streams’ traps periodically.

9.2 Stream Inheritance

Streams in CanIt-PRO inherit rules and settings from other streams. By default, all streams inherit rules and settings from the default stream. If a stream stream1 inherits from another stream stream2, we refer to stream2 as the parent of stream1. Conversely, we call stream1 the child of stream2. Furthermore, suppose that stream2 inherits from stream3. We then call stream3 and stream2 the ancestors of stream1. These terms are illustrated in Figure 9.1:

[Diagram: stream3 is the parent of stream2, and stream2 is the parent of stream1. stream2 inherits from stream3; stream1 inherits from stream2 and is the child of stream2; stream3 and stream2 are the ancestors of stream1.]

Figure 9.1: Stream Inheritance Terminology

In addition to the default inheritance, streams can be configured to inherit rules and settings from Special Streams (discussed next in Section 9.3.) To determine a stream’s inheritance, CanIt-PRO consults the Stream Inheritance Table. To see this table, click on Administration and then Inheritance:

Figure 9.2: Stream Inheritance Table

To determine a stream’s parent, CanIt-PRO first looks up the stream in the inheritance table. If there is an entry, then that entry is used to determine the parent. If there was no entry, CanIt-PRO looks up the key “*” in the inheritance table. If such an entry exists, it is used to determine the parent. In the example in Figure 9.2:

• user3 inherits from 01 Tag Only.

• user4 inherits from 00 Opt Out.

• user5 does not inherit from any other stream.

• user9 inherits from default.

• All other streams (except for default) inherit from 01 Tag Only, because of the wildcard entry.
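The parent lookup described above can be sketched as follows. This is a minimal illustration, not CanIt-PRO's implementation: the dictionary is an assumed stand-in for the Stream Inheritance Table, and using None to mean "does not inherit from any other stream" is an assumption made for the example.

```python
from typing import Dict, Optional

# Illustrative copy of the entries listed for Figure 9.2. None stands for
# "does not inherit from any other stream" (an assumed representation).
INHERITANCE: Dict[str, Optional[str]] = {
    "user3": "01 Tag Only",
    "user4": "00 Opt Out",
    "user5": None,
    "user9": "default",
    "*": "01 Tag Only",
}


def parent_of(stream: str, table: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the parent of a stream, or None if it has no parent."""
    if stream == "default":
        return None           # the wildcard entry does not apply to default
    if stream in table:
        return table[stream]  # an explicit entry always wins
    if "*" in table:
        return table["*"]     # otherwise fall back to the wildcard entry
    return "default"          # no entry at all: the default inheritance


print(parent_of("user4", INHERITANCE))     # 00 Opt Out
print(parent_of("somebody", INHERITANCE))  # 01 Tag Only
```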

If you enter a string in the “Filter:” box, then CanIt-PRO limits the display to entries whose Stream or Inherits From columns contain that string.

9.3 Special Streams

A Special Stream is a normal stream with two extra behaviors:

• Other streams are allowed to inherit from special streams. Normally, a stream can only have default as its parent. If you add special streams, however, other streams are allowed to make the special streams their parents.

• If a stream inherits from a special stream, then mail for the child stream is trapped in the parent’s trap. That is, by inheriting from a special stream, a stream “loses” its trap, giving responsibility for any trapped mail to the special stream.

9.3.1 Final Streams

A special stream may be marked final. If a special stream is marked final, then children of that stream may not override the special stream’s rules or settings. If a stream inherits from a final special stream, it’s as if the stream has given all control over to the special stream. To see special streams, click on Administration and then Special Streams. The Special Stream Table appears:

Figure 9.3: Special Stream Table

9.3.2 Creating Special Streams

To create a special stream, enter the name of the stream in the Stream text box, and a user-friendly description in the Description box. Then click Add Special Stream.

In the example, the four streams 00 Opt Out, 10 Tag Only, 20 IT Staff and 30 Aggressive have been created. (Special streams are presented to end-users in order of the stream name, so we named the streams beginning with numbers so they would sort from least to most aggressive. We leave gaps between the stream numbers so we can insert more streams in between if required.)

Once you have created the special streams, configure them appropriately. For example, for 00 Opt Out, you’d switch into that stream, and then under Preferences : Opt In/Out, you’d opt that stream out. (For convenience, you can click on a stream name in the Special Stream Table to switch into that stream.) For 30 Aggressive, you might change the stream settings to auto-discard anything scoring 8 or more on the spam scale. For 20 IT Staff, you could have CanIt-PRO hold suspect spam, and have a member of your IT staff check 20 IT Staff’s trap and release false-positives.

Note that 00 Opt Out and 20 IT Staff are marked final. This means that rules and settings in streams inheriting from these two special streams are ignored; only the special streams’ settings and rules are used. On the other hand, streams inheriting from 10 Tag Only and 30 Aggressive may define their own rules, settings, whitelists and blacklists.

You can define as many special streams with as many different settings as you deem appropriate. Note that all special streams (by default) inherit from the default stream.

9.3.3 Deleting Special Streams

To delete a special stream, enable the checkbox in the Delete? column for the appropriate stream. Then click Submit Changes. Warning: If you delete a special stream, then all inheritances from that stream are deleted. Please see Section 9.2 for more details.

9.4 The Simplified GUI

If the CanIt-PRO administrator enabled the global setting G-4060 Users authenticated by alternate means default to simple GUI? (Section 5.1), then such users only see the Simplified Interface:

Figure 9.4: Simplified Interface

The simplified interface simply lists the possible Special Streams. The currently-inherited special stream is highlighted in bold red print. To inherit from a different stream, the user simply clicks on the appropriate radio button and clicks Set Spam-Scanning Level. This adjusts the entry in the inheritance table. To log out, the user clicks on Log Out.

If the user clicks on Enable Expert Interface, then he or she will have access to the usual CanIt-PRO interface. He or she can then turn off inheritance (via Preferences : Set Default Stream) and take control over his or her own blacklists, whitelists, rules and spam trap.

Note: If you have set the global setting G-4075 Switching to expert mode cancels stream inheritance to Yes, then the act of clicking Enable Expert Interface cancels any inheritance that was in force, making the stream inherit from default again. To get back to the simple GUI, click on the Simple Interface top-level menu entry. Note that this menu entry does not appear until at least one special stream has been defined.

9.5 Inheritance from Non-Final Streams

If a stream inherits from a non-final stream, CanIt-PRO uses the following procedures to resolve rules. In these examples, we assume that stream john inherits from the non-final stream 10 Tag Only.

• For sender, domain and network blacklists and whitelists, and for MIME type and mismatch rules, CanIt-PRO first looks for a rule associated with the original stream (in our example, john.) If no such rule is found, it then tries the parent stream (in our example, 10 Tag Only) and then the parent of the parent, and so on up the inheritance chain.

• For custom rules, CanIt-PRO uses all the rules associated with the original stream in addition to rules associated with the ancestor streams.

• Bayes data is associated with the original stream (john) and not the parent stream (10 Tag Only).
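The two resolution strategies above (first match up the inheritance chain versus accumulation across the chain) can be contrasted in a short sketch. The data structures, function names, addresses and rule labels are hypothetical and chosen only for illustration; this is not CanIt-PRO's actual code.

```python
def ancestry(stream, parents):
    """Return the stream followed by its ancestors, nearest first."""
    chain = []
    while stream is not None and stream not in chain:  # guard against loops
        chain.append(stream)
        stream = parents.get(stream)
    return chain


def lookup_first(stream, parents, rules, key):
    """First-match lookup, as for sender, domain, network, MIME type and
    mismatch rules: the nearest stream that has a rule for key wins."""
    for s in ancestry(stream, parents):
        if key in rules.get(s, {}):
            return rules[s][key]
    return None


def custom_rules(stream, parents, custom):
    """Custom rules accumulate: the stream's own rules plus its ancestors'."""
    collected = []
    for s in ancestry(stream, parents):
        collected.extend(custom.get(s, []))
    return collected


# Hypothetical example data: john inherits from the non-final "10 Tag Only".
PARENTS = {"john": "10 Tag Only", "10 Tag Only": "default"}
SENDER_RULES = {
    "john": {"friend@example.com": "whitelist"},
    "10 Tag Only": {"friend@example.com": "hold", "bulk@example.com": "blacklist"},
}
CUSTOM_RULES = {"john": ["rule-a"], "default": ["rule-z"]}

print(lookup_first("john", PARENTS, SENDER_RULES, "friend@example.com"))  # whitelist
print(lookup_first("john", PARENTS, SENDER_RULES, "bulk@example.com"))    # blacklist
print(custom_rules("john", PARENTS, CUSTOM_RULES))  # ['rule-a', 'rule-z']
```

Bayes data has no place in either function: as stated above, it stays with the original stream and is never looked up in a parent.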

9.6 Inheritance from Opted-Out Streams

If a stream or any of its ancestors is opted-out of spam-scanning, then no spam scanning is performed.
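As a minimal sketch of this rule, assuming a hypothetical mapping from each stream to its parent and a set of opted-out stream names (neither is a real CanIt-PRO structure):

```python
def is_scanned(stream, parents, opted_out):
    """Return False if the stream or any of its ancestors is opted out."""
    seen = set()
    while stream is not None and stream not in seen:  # guard against loops
        if stream in opted_out:
            return False
        seen.add(stream)
        stream = parents.get(stream)
    return True


# Hypothetical data: user4 inherits from the opted-out stream "00 Opt Out".
parents = {"user4": "00 Opt Out", "00 Opt Out": "default"}
print(is_scanned("user4", parents, {"00 Opt Out"}))  # False
```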

Chapter 10

Periodic Reports

10.1 Introduction

CanIt-PRO can generate PDF reports about mail filtering activity and e-mail them to specified recipients.

10.1.1 Periodic Reports

A periodic report has a name, a page size, a recipient and a recurrence. The name can be anything you pick. The page size can be one of “US Letter” or “A4”. And the recipient can be any valid e-mail address. The recurrence specifies how often the report should be generated and mailed out. Possible choices for the recurrence are:

• On Demand – the report is never generated and mailed automatically, but only when specifically requested from the Web interface.

• Daily – the report is generated and mailed daily.

• Weekly – the report is generated and mailed weekly. You can choose the day of the week.

• Monthly – the report is generated and mailed monthly. You can choose either the first or fifteenth day of the month.
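The recurrence choices can be modelled as a simple date check. The sketch below is illustrative only: the function name, the recurrence strings and the use of Python's weekday numbering (Monday is 0) are assumptions, and it says nothing about how CanIt-PRO actually schedules reports.

```python
import datetime


def report_is_due(recurrence, today, weekday=None, month_day=None):
    """Return True if a report with this recurrence is generated on `today`.

    weekday follows datetime.date.weekday() (Monday is 0);
    month_day must be 1 or 15 for monthly reports.
    """
    if recurrence == "on-demand":
        return False  # never generated automatically
    if recurrence == "daily":
        return True
    if recurrence == "weekly":
        return today.weekday() == weekday
    if recurrence == "monthly":
        if month_day not in (1, 15):
            raise ValueError("monthly reports run on the 1st or the 15th")
        return today.day == month_day
    raise ValueError("unknown recurrence: %r" % (recurrence,))


print(report_is_due("monthly", datetime.date(2011, 2, 15), month_day=15))  # True
```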

10.1.2 Charts

A chart is a single page in a periodic report. It contains a chart corresponding to a particular statistical query. A chart has a name (which can be anything you pick) and a type. The available chart types are described below. Note that all charts accept parameters that modify the results. For example, you can restrict the types of mail counted (you might only want to count spam, for example), the destination domains, etc.

• Classification of Recent Mail. A pie chart showing the breakdown of recently-received e-mail. (“Recent” e-mail is defined by Global Setting G-1550, “Number of hours to keep detailed statistics”.)

• Top Mail Countries. A pie chart showing the top countries sending recent e-mail.

• Top Domains. A pie chart showing the top recipient domains receiving recent e-mail.

• Top Mail Relays. A pie chart showing the top sending relays that have sent recent e-mail.

• Top Recipients. A pie chart showing the top recipient addresses receiving recent e-mail.

• Top Streams. A pie chart showing the top streams receiving recent e-mail.

• Top Viruses. A pie chart showing top recently-received viruses.

• Summary of Greylisting per Hour. A bar-chart showing how much recent e-mail was greylisted and ungreylisted.

• Summary of Mail per Hour. A bar-chart showing the classification of recent e-mail per hour.

• Classification of Long-Term Mail. A pie chart showing the breakdown of received e-mail over the long term. The timespan available in long-term statistics is determined by Global Setting G-1500, “Expire statistics after this many days”.

• Top Domains (Long-Term Statistics). A pie chart showing the top recipient domains over the long-term.

• Top Streams (Long-Term Statistics). A pie chart showing the top recipient streams over the long-term.

• Summary of Greylisting per Day. A bar chart showing how much mail was greylisted and ungreylisted over the long-term.

• Summary of Mail per Day. A bar chart showing the daily classification of mail over the long-term.

10.2 Creating Charts

The first step in creating a periodic report is to create one or more charts. Click on Reports : Periodic Reports. The main Periodic Reports page appears:

Figure 10.1: Periodic Reports

To add a chart:

1. Click Add a New Chart.

2. Enter a name for your chart. This name will appear as the page title in the final reports.

3. Select a chart type.

4. Click Next...

Once you have selected a chart type, CanIt-PRO will display a page for setting parameters for the chart. Set the parameters as appropriate for your chart and click Save Chart.

To edit an existing chart’s parameters, click on its name in the Name column. To rename a chart, enter its new name in the Rename To... box and click Submit Changes. To delete a chart, enable the corresponding checkbox in the Delete... column and click Submit Changes.

10.3 Creating Periodic Reports

Once you have created one or more charts, you can create periodic reports. To create a new periodic report, click Add a New Report. The Add Periodic Report page appears:

Figure 10.2: Add Periodic Report

To create the report:

1. Pick a name for the report and enter it in the appropriate box.

2. Pick a time when the report should be sent. You can pick daily, weekly or monthly reports. You can also select “On-Demand Only”. Such reports are never sent automatically, but are only generated on demand.

3. Enter an e-mail address to which the report should be sent. You can enter multiple addresses by separating them with commas.

4. Select a page size for the report (A4 or US Letter).

5. Pick one or more charts for the report by enabling the appropriate Add checkboxes.

6. Click one of the Submit Changes buttons.

10.4 Editing Periodic Reports

To edit an existing periodic report, click on the report’s name in the Name column. From the report editing page, you can alter the report’s parameters, add or remove charts, or move existing charts up or down. To delete a periodic report, enable the appropriate Delete... checkbox and click Submit Changes.

10.5 Running a Report on Demand

To run a specific periodic report on demand, enable the appropriate Run Now... checkbox and click Submit Changes. The report will be queued for processing. Note that it can take anywhere from a few minutes to a few hours for the report queue to be processed, so the report might take a while to be mailed out.

Chapter 11

Locked Addresses

11.1 Introduction to Locked Addresses

Locked Addresses are designed to solve the following problem: You want to give out your e-mail address to someone, but you don’t trust that person or organization not to turn around and give or sell it to others. You want an address that can only be used by the person or organization you give it to, and not by anyone else. CanIt-PRO has a complete solution to this problem. However, it does require some administrative overhead before users can take advantage of the feature.

11.2 Preparing to use Locked Addresses

Before end-users can use locked addresses, you need to perform the following steps.

11.2.1 Create a new domain

Choose a new domain, specifically for locked addresses. This domain should be a subdomain of your “real” domain. For example, if you own the domain roaringpenguin.com, you might choose to place all your locked addresses in la.roaringpenguin.com. The domain you use for locked addresses should contain only locked addresses and should not be used for any “real” e-mail addresses.

11.2.2 Configure mail for the new domain

The next step is to configure the CanIt-PRO machine to receive mail for the new domain. Obviously, the first thing you need to do is publish an MX record for the domain. For example, if your locked address domain is la.roaringpenguin.com and your CanIt-PRO server’s name is canit.roaringpenguin.com, you might add a DNS record that looks like this:

la.roaringpenguin.com. 1d IN MX 1 canit.roaringpenguin.com.

Also, you need to configure the CanIt-PRO machine to accept and discard all mail for the locked domain. (Mail should never be delivered to addresses in the locked domain, but just in case, there should be a mechanism to discard them.) Configuring Sendmail to accept mail for the locked domain is easy: Just add an entry in the access database. In our example, it would be:

To:la.roaringpenguin.com RELAY

(If you are running a CanIt-PRO Appliance, you can use Domain Routing from the Web interface instead of manually editing Sendmail configuration files.) The easiest way to configure Sendmail to discard mail for the locked domain is to make use of the virtusertable feature. Add an entry like this in virtusertable:

@la.roaringpenguin.com [email protected]

and ensure that mail to [email protected] gets discarded (by making an alias from devnull to /dev/null). (Of course, you need to substitute your own locked address domain for la.roaringpenguin.com and your own CanIt-PRO server name for canit.roaringpenguin.com.)

11.2.3 Inform CanIt-PRO about the locked address domain

CanIt-PRO needs to know the domain you’re using for locked addresses, so it can treat any such addresses specially. In the Web interface, click on Administration : Global Settings and enter the locked address domain into the global setting G-10000 “Domain for Locked Addresses”.

11.2.4 Associate each login name with an e-mail address

CanIt-PRO can only generate locked addresses if it has a real e-mail address for each logged-in user. For users in CanIt-PRO’s built-in user table (Section 5.3 on page 71), simply ensure that you enter an e-mail address for each user. For users authenticated via external means, the User Lookup method must return the user’s e-mail address upon login. For some User Lookup methods such as POP3 or IMAP that cannot return the e-mail address, you need to create an account-info script (Section 6.2.4 on page 95) and ensure that a mail=email-address attribute is always emitted for each login that should be permitted to use locked addresses. Once all of these steps in Sections 11.2.1 through 11.2.4 have been performed, the Locked Address feature is ready to use. Please consult the CanIt-PRO User’s Guide for details about how to use a Locked Address.

Chapter 12

Attachment Handling

CanIt-PRO can handle file attachments in a number of different ways. Messages can be delayed, rejected or held based on the attachment’s type. They can be scanned for viruses and held or rejected using one or more configured virus scanners. If desired, attachments can also be removed from the message and discarded, or held for access via a web-based system.

12.1 General Filename and MIME Type Rules

Whole messages can be rejected or held on a per-stream basis using the Filename Extensions or MIME Types rules. See the section entitled Blacklists, Whitelists and Rules in the CanIt-PRO Users Guide for full details.

12.2 Delaying Attachments

On a site-wide basis, it is sometimes useful to delay certain attachment types temporarily, without placing them in a stream’s trap area. By delaying these attachments for a short period of time, you can give your virus scanners and RBLs time to catch up with new virus and spam content.

12.2.1 Enabling the Feature

First, the feature must be enabled via the Web GUI. Log in as an admin user, and enable Delay Attachments on the Setup : Features page.

Next, configure the time delay by modifying Time in hours to delay messages with Delayed Attachments under Global Settings.

12.2.2 Creating Delay Rules

To create a delay rule, click on Administration and then Delayed Attachments. The Delayed Attachments screen appears:

Figure 12.1: Delayed Attachments

To add a rule:

1. Enter a filename pattern in the Filename Pattern box. A filename pattern is normally interpreted as a filename extension. For example, exe will match a file with the extension .exe. Note that the pattern should not contain a period. If a filename pattern begins with ^, then it matches an entire filename. For example, the pattern ^bad.exe matches (only) the filename bad.exe.

2. Enter a comment in the Comment box. This will help you remember why you are delaying the given filename pattern.

3. Click Submit Changes to add the rule.

Note: Attachment-delaying is global. It cannot be adjusted on a per-stream basis.
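The two kinds of filename pattern can be illustrated with a small matcher. This is a sketch under stated assumptions, not CanIt-PRO's matching code: the function name is hypothetical, and the comparison is exact because the text does not say how letter case is handled.

```python
def matches_pattern(pattern: str, filename: str) -> bool:
    """Return True if filename matches a delayed-attachment pattern.

    A pattern starting with a caret names an entire filename;
    any other pattern is treated as a filename extension.
    """
    if pattern.startswith("^"):
        return filename == pattern[1:]        # whole-filename match
    return filename.endswith("." + pattern)   # extension match


print(matches_pattern("exe", "setup.exe"))         # True
print(matches_pattern("^bad.exe", "bad.exe"))      # True
print(matches_pattern("^bad.exe", "notbad.exe"))   # False
```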

12.2.3 How It Works

As an administrator, you may configure any number of file extensions or full filenames to be delayed. When a message arrives matching that filename or extension, it will be held in a special @@DELAYED stream for the number of hours specified in the Time in hours to delay messages with Delayed Attachments configuration.

Once that time has elapsed, the message is automatically released from the @@DELAYED trap, proceeding through the CanIt-PRO filtering process where normal scanning will proceed as if that mail had just arrived.

Should it be necessary for a message to be released from @@DELAYED early, the admin user (or other user with appropriate permissions) may manually release it. Note, however, that a message released from @@DELAYED may be re-trapped in its normal stream because of spam-scoring rules. That is because messages released from @@DELAYED are scanned by CanIt-PRO as if they had never been seen before; CanIt-PRO does not correlate what it believes to be a brand new message with anything in the @@DELAYED stream.

12.3 Stripping Attachments

In addition to delaying, holding or rejecting mail based on characteristics of attachments, CanIt-PRO can strip attachments out of messages before forwarding the message. You can configure CanIt-PRO to strip out attachments and store them for retrieval via the Web interface, or simply to strip them out and discard them. Attachment-stripping rules can be set per-stream, but only the CanIt-PRO administrator can create or edit attachment-stripping rules; normal users cannot. In addition, all streams inherit default’s attachment-stripping rules, even if the “Inherit rules from ’default’ stream” setting is set to No. To create attachment-stripping rules:

1. Click on Rules and then Attachment Stripping. You see the Attachment Stripping Screen:

Figure 12.2: Attachment-Stripping Rules

2. Enter a filename pattern in the Filename Pattern box. This pattern is interpreted exactly as for Delayed Attachments.

3. Enter a comment in the Comment box.

4. Choose an Action setting to determine how CanIt-PRO handles the filename pattern:

• Keep in Message indicates that CanIt-PRO should not strip the attachment out. This setting can be used in a particular stream to override settings in default.

• Strip and Store on Server indicates that CanIt-PRO should remove the attachment and store it in the PostgreSQL database. CanIt-PRO will also add a message indicating that the attachment was stripped, and provide a link whereby the message recipient can retrieve the attachment.

• Strip and Discard indicates that CanIt-PRO should remove and discard the attachment. CanIt-PRO will add a note to the message indicating that the attachment was discarded and cannot be retrieved.

5. Click Submit Changes to create the rule.

Chapter 13

CanIt Storage Manager

13.1 Storage Manager Concepts

Normally, CanIt-PRO stores all incident-related data in the PostgreSQL database. For many sites, this works very well and there is no need for any alternate storage mechanisms. However, for large sites, storing large amounts of text in the database can be very burdensome, leading to very large databases and the consequent very long database dump and VACUUM processes. To alleviate this problem, CanIt-PRO ships with a program called canit-storage-manager. This program allows you to store large textual data in the file system rather than in the PostgreSQL database. The benefits of using the storage manager are:

1. The large amounts of text do not have to be dumped with each database backup, and they do not have to be VACUUMed.

2. Because the data are stored as ordinary files, you can easily back up and synchronize the data to other machines.

3. canit-storage-manager is optimized for the quick storage and retrieval of textual data, so it reduces the burden on the database server.

4. canit-storage-manager can be run on a different machine from the database server, which improves scalability.

13.1.1 Principles of Operation

Figure 13.1 illustrates how the storage manager works:

[Diagram: two scanners and the Web UI each exchange TCP traffic with canit-storage-manager, which exchanges disk traffic with the file system.]

Figure 13.1: CanIt Storage Manager

• The storage manager daemon runs on one machine and stores data locally on that machine’s file system.

• The scanners, ticker and Web interface processes (running on the same machine or in general on different machines) communicate with the storage manager daemon via a TCP connection.

• The scanners, ticker and Web interface make requests to fetch and store data and the storage manager daemon carries out those requests.

• Old data are expired by the cron job. The storage manager daemon supports a special “purge” request to delete old data.

13.2 Configuring the Storage Manager

Before configuring the storage manager, you need to make the following decisions:

• You need to pick one or more machines to run the storage manager. These machines should be fast with plenty of memory and (most importantly) fast disks.

• You need to pick a directory under which the storage manager can store data. (It has to be the same directory on each machine that runs storage manager.) Be sure there is sufficient disk space for your expected mail storage! The required disk storage is given approximately by the following formula. (Note that this is a worst-case estimate. It assumes that 100% of your mail volume is spam and that every message is larger than 8kB and is held locally.)

$$S = (D_{sig} \times M \times V) + (D_{data} \times 8\,\text{kB} \times V) + (D_{data} \times M \times V)$$

where:

– $S$ is the required amount of disk space.

– $V$ is the average number of messages received in a day.

– $M$ is the average size of a message.

– $D_{sig}$ is the number of days before you expire old Bayes signatures.

– $D_{data}$ is the number of days before you expire old data.

For example, if you receive 50,000 messages per day averaging 20kB per message, you retain Bayes signatures for 3 days and you expire old data after 28 days, the required disk space is:

$$S = (3 \times 20 \times 50000) + (28 \times 8 \times 50000) + (28 \times 20 \times 50000) = 42200000\,\text{kB}$$

or about 42GB.
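For planning purposes, the same worst-case estimate can be computed with a few lines of Python. The function and parameter names are illustrative; the arithmetic is exactly the formula above, with all sizes in kB.

```python
def storage_kb(msgs_per_day, avg_msg_kb, sig_days, data_days, fixed_kb=8):
    """Worst-case disk space in kB for the storage manager.

    msgs_per_day is V, avg_msg_kb is M, sig_days is Dsig, data_days is Ddata;
    fixed_kb is the 8kB per-message term from the formula.
    """
    return (sig_days * avg_msg_kb * msgs_per_day
            + data_days * fixed_kb * msgs_per_day
            + data_days * avg_msg_kb * msgs_per_day)


# The worked example: 50,000 messages/day, 20kB average, 3 and 28 days.
print(storage_kb(50000, 20, 3, 28))  # 42200000
```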

13.2.1 Enabling the Storage Manager

Before using the storage manager, ensure that all machines in your CanIt-PRO cluster can connect to the storage manager daemon on port 6568 (or whatever port you choose for it to listen on.)
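One way to verify connectivity from each cluster member is a plain TCP connection test. The sketch below uses only the Python standard library; the host name in the usage line is a placeholder, and the check confirms only that the port accepts connections, not that a storage manager is the process answering.

```python
import socket


def can_connect(host: str, port: int = 6568, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # "storage.example.com" is a placeholder for your storage manager host.
    print(can_connect("storage.example.com"))
```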

13.2.2 The Configuration Wizard

Once you have decided on the machine and directory, you can begin configuring the storage manager from the Web interface. Click on Setup and then Storage Manager Wizard.

1. First, you are asked whether or not you wish to use the storage manager. Answer Yes. Then click Next. The storage manager configuration page appears:

Figure 13.2: Storage Manager Configuration

2. Enter the following information into the wizard:

(a) For each host in your cluster, select whether you want the host to run storage manager in Read/Write mode, Read-Only mode, or not at all. (Normally, you should never run storage manager in Read-Only mode; this mode is intended only when you are retiring a storage manager node and want to leave it in the pool until all data on it expires.)

(b) If you have more than one host running a storage manager daemon and you want CanIt-PRO to store data only on some subset of them, enter the number of hosts on which to attempt writes in the “Number of Copies to Write” box.

(c) If CanIt-PRO is writing more than one copy of the data and you want it to continue operating even if some writes fail, enter the number of writes required to succeed in the “Success Threshold” box.

(d) Enter the port on which the storage manager daemon should listen. The default port is 6568. (The port must be the same for all storage manager hosts.)

(e) If you want to restrict the daemon to listen on a particular IP address, enter it. A typical entry might be 127.0.0.1 for a CanIt-PRO installation running entirely on a single machine. Normally, you should leave this field blank. If you are running storage manager on more than one host, you must leave this field blank.

Once you have entered the settings, click Next. 3. Review the settings and then click Finish.
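The relationship between “Number of Copies to Write” and “Success Threshold” can be pictured with a small Python model. This is only an illustrative sketch, not CanIt-PRO's implementation: the function name, the write callback and the way hosts are picked (simply the first ones in the list) are all assumptions.

```python
def store_with_threshold(data, hosts, copies_to_write, success_threshold, write):
    """Attempt `copies_to_write` writes and report whether enough succeeded.

    `write(host, data)` is a hypothetical callback that raises OSError on failure.
    """
    successes = 0
    # Assumption: simply try the first `copies_to_write` hosts in the list.
    for host in hosts[:copies_to_write]:
        try:
            write(host, data)
            successes += 1
        except OSError:
            # A failed write is tolerated as long as the threshold is still met.
            pass
    return successes >= success_threshold
```

With two copies attempted and a threshold of one, a single dead host does not cause the store to fail; with a threshold of two, it does.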

13.2.3 Local Configuration

On each host, four settings in the [storagemanager] section of /usr/share/canit/canit.conf control various aspects of the storage manager. If you want to change the settings, create a [storagemanager] section in /etc/mail/canit/canit.conf; do not edit /usr/share/canit/canit.conf directly. The settings are:

• pidfile – A file used by the Storage Manager server to write its process ID and to lock against concurrent Storage Managers. The default value is /var/run/canit-storage-manager.pid.

• rootdir – The root directory under which data are stored. The default value is /var/lib/canit-storage-manager.

• user – The UNIX user as which the Storage Manager server should run. The default value is defang.

• client_retry_delay – The delay in seconds to avoid retrying a connection to a dead storage manager. The default is 90 seconds.

13.2.4 Starting the Storage Manager

Once the settings have been saved, you should log in to each host that will run the storage manager daemon. Become root and start the storage manager daemon:

# /etc/init.d/canit-system check

(Your canit-system program may be located in /usr/share/canit/scripts/canit-system instead.)

The canit-system startup script should run on bootup; it will start the Storage Manager if required.

13.2.5 Data Stored in the Storage Manager

Once the storage manager is enabled, CanIt-PRO stores the following data in it rather than in the PostgreSQL database:

• Bayes signatures.

• Message previews (the first portion of an incident’s message).

• Entire messages (if the message is being held locally for some reason.)

In addition, CanIt-PRO uses the storage manager rather than the database for collecting statistics. These statistics are periodically summarized out of storage manager and the summaries are placed in the database.

13.3 Backup Considerations

Once you start using the storage manager, the nightly database dump will not contain all of the information about incidents. In addition to backing up the nightly database dump, you should also back up the entire storage manager directory tree. (This directory is specified in /etc/mail/canit/canit.conf as the rootdir setting in the [storagemanager] section.)

The files in that directory are ordinary files; you can back them up with tar or rsync or your favourite backup tool. However, there are a very large number of small files spread across many directories and subdirectories. Test to confirm that your backup tool can handle the directory.

The best time to back up Storage Manager is after the nightly cron job has finished. This is because (a) expired data will have just been purged; and (b) the system should be less busy, resulting in less contention for disk I/O.

If you have more than one CanIt-PRO server (in other words, a cluster) then it is best to run storage manager on multiple CanIt-PRO servers rather than using a backup tool. See section 13.4.

Warning: We have seen instances where rsync is not able to handle the storage manager directory due to the extreme number of files and subdirectories. We have also seen instances where scp takes an extremely long time to transfer the directory.
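Before choosing a backup tool, it can help to know how many files and directories the storage manager tree actually holds. The following Python sketch counts them; the function name is our own, and the path in the comment is the default rootdir from section 13.2.3, which may differ on your system.

```python
import os

def count_tree(root):
    """Return (directories, files) found under `root`, counting `root` itself."""
    dirs = files = 0
    for _path, _subdirs, filenames in os.walk(root):
        dirs += 1
        files += len(filenames)
    return dirs, files

# Example call (default rootdir; adjust if you changed the setting):
#   count_tree("/var/lib/canit-storage-manager")
```

On a busy system the walk itself may take a while, so it is best run at a quiet time.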

13.4 Running multiple Storage Managers

If you have more than one CanIt-PRO server running in a cluster then we strongly recommend running storage manager on at least two servers. There are several advantages:

• Storage Manager automatically load balances between its nodes;

• Multiple redundant copies of the data eliminate the need for backups;

• When migrating a server, you can skip the process of migrating storage manager as the other nodes will carry the data.

13.5 ps Output

If possible, canit-storage-manager changes the string shown by the ps command to reflect what it is doing. For example, ps might show the following output:

canit-storage-manager: 10.0.0.1 scanner_6448 store bayes_sig 19819

The output above means that this instance of the storage manager is connected to the scanner with process-ID 6448 on the machine 10.0.0.1. It is currently executing the command “store bayes_sig 19819”.
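If you want to pick such a line apart in a monitoring script, a sketch along these lines may be enough. It assumes that every title follows the layout of the single example above (program name, client machine, client name and process-ID joined by an underscore, then the command); that layout is inferred from the example only, and the function name is hypothetical.

```python
def parse_ps_title(title):
    """Split a canit-storage-manager ps title into its parts (assumed layout)."""
    program, _, rest = title.partition(": ")
    host, client, *command = rest.split()
    # "scanner_6448" -> client name "scanner", process-ID 6448
    name, _, pid = client.rpartition("_")
    return {
        "program": program,
        "host": host,
        "client": name,
        "pid": int(pid),
        "command": " ".join(command),
    }
```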

Chapter 14

Searching Logs

14.1 Introduction

CanIt-PRO has the ability to index mail logs in a full-text database and search them. This can be used to diagnose many mail problems such as missing messages, duplicate messages, etc.

Note: The log-searching feature is available only on our Debian-based appliance build. It is not available in the source or RPM versions of CanIt-PRO.

In addition to presenting search results from the log files, CanIt-PRO also annotates the log-lines to provide a clear explanation of what each line means. This can greatly ease troubleshooting.

14.2 Installing the Log Indexer

If you have upgraded your CanIt-PRO appliance from a version older than 8.0.0, the log-indexer may not be installed by default. Follow the instructions in this section to install it. If you have installed from the ISO image (version 8.0.0 or newer), then the log-indexer is installed and you can skip this section.

14.2.1 Disk Space

To determine how much disk space the log indexer will require, it’s helpful if you have a week or so worth of log data already. On CanIt-PRO appliances, the mail logs are in /var/log in files named mail.log.*.

Take a look at how much disk space one day’s worth of logs uses. (Pick a weekday that has a relatively high mail volume and measure the uncompressed size.) Then multiply that by 2 for indexing overhead. Multiply that by the number of days over which you wish to permit indexing. The answer is how much disk space you will require.

For example, suppose your system generates 250MB worth of logs per day and you want to retain 30 days of logs. Then the disk space required is:

S = 250MB × 2 × 30 = 15GB

You’ll need to have 15GB of spare disk space set aside for the logs.
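The same estimate can be written as a one-line calculation. The Python function below is just a restatement of the rule of thumb above; its name and parameter names are our own.

```python
def log_index_disk_mb(daily_log_mb, retention_days, overhead_factor=2):
    """Estimate disk space (in MB) for the log indexer.

    daily_log_mb    -- uncompressed size of one busy weekday's mail log
    retention_days  -- number of days of logs to keep indexed
    overhead_factor -- multiplier for indexing overhead (2, per the text)
    """
    return daily_log_mb * overhead_factor * retention_days
```

For the example above, log_index_disk_mb(250, 30) gives 15000 MB, that is, 15GB.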

14.2.2 Installing the Module

To install the log-searching module, log on to your CanIt-PRO appliance as root (either at the console or over SSH) and type:

apt-get install canit-log-correlator

14.2.3 Configuring rsyslog.conf

Once canit-log-correlator is installed, the system logs all mail logs to files in the directory /var/log/mail-daily. You should turn off logging to /var/log/mail.log since that file is redundant. To turn off the logging:

1. Edit the file /etc/rsyslog.conf. A simple command to do this is pico /etc/rsyslog.conf.

2. Find the line that reads:

mail.* -/var/log/mail.log

3. Comment out that line by prepending a hash mark # before mail so that the line now reads:

#mail.* -/var/log/mail.log

4. Save your changes. In pico, hit Control-O and then Enter to save the file.

5. Type: /etc/init.d/rsyslog restart

14.2.4 Configuring the Log Correlator

By default, the log correlator keeps 30 days’ worth of logs. If you want to change this, edit the file /etc/mail/canit/canit.conf and add the following lines:

[logindexer]
keep_log_days=21

(In the example, we changed it from 30 days to 21 days.)

The log correlator keeps uncompressed copies of mail logs in the directory /var/log/mail-daily. Each day’s file is named mail-YYYY-MM-DD.log, where YYYY, MM and DD are the year, month and day respectively. CanIt-PRO automatically deletes files older than the keep_log_days setting.

The log correlator creates indexes in the directory /var/lib/canit-log-correlator. You should not touch anything under that directory.
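To see which daily files a given keep_log_days value would retain, the naming scheme and the age test can be sketched in Python. The helper names are hypothetical, and the exact boundary (whether a file exactly keep_log_days old is kept) is an assumption; CanIt-PRO performs the real cleanup itself.

```python
from datetime import date

def daily_log_name(day):
    """File name used in /var/log/mail-daily for the given date."""
    return day.strftime("mail-%Y-%m-%d.log")

def is_expired(day, today, keep_log_days=30):
    """True if the log for `day` is older than the retention period (assumed boundary)."""
    return (today - day).days > keep_log_days
```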


14.2.5 Cluster Considerations

If you have a cluster of CanIt-PRO machines, you should install canit-log-correlator on all machines. In Setup : Cluster Management, you should mark all machines that store mail logs as a “Log Host”.

Note: Even if you have only a single machine, you should visit Setup : Cluster Management and enable the “Log Host” flag.

After installing and configuring the log correlator, you should run /etc/init.d/canit-system check to ensure the log indexer is running.

14.3 Log Basics

CanIt-PRO uses the Sendmail program to transfer mail. It also uses the MIMEDefang filtering tool as the basis for its filtering. There are therefore three sources of log lines:

1. Sendmail.

2. The core MIMEDefang tool.

3. CanIt-PRO itself.

The log indexer groups log lines for a given message into a log document. A log document consists of the set of log lines that describe the process of one message transmission through the CanIt-PRO system.

The common element between different log lines that allows them to be grouped together is the Sendmail queue ID. This is an identifier assigned by Sendmail to each message transmission. A typical queue ID might look like this:

oBGIkIUj026238
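The grouping idea can be illustrated with a few lines of Python. This is a simplified sketch, not the log indexer's actual code: it assumes that a queue ID is a 14-character alphanumeric token followed by a colon (as in the example above), which may not hold for every log line or Sendmail version.

```python
import re
from collections import defaultdict

# Assumption: queue IDs look like "oBGIkIUj026238" and are followed by ":".
QUEUE_ID = re.compile(r"\b([A-Za-z0-9]{14}):")

def group_by_queue_id(lines):
    """Collect log lines into 'log documents' keyed by queue ID."""
    documents = defaultdict(list)
    for line in lines:
        match = QUEUE_ID.search(line)
        if match:
            documents[match.group(1)].append(line)
    # Lines without a recognizable queue ID are ignored in this sketch.
    return dict(documents)
```

Each key in the returned dictionary corresponds to one message transmission, and its value holds the Sendmail, MIMEDefang and CanIt-PRO lines that mention that queue ID.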

14.4 Searching the Logs

There is a 15-minute delay between a log-line being created and the indexer indexing it. Therefore, you can search for log lines starting as far back as your logs go up until 15 minutes before the current time.

14.4.1 Performing a Search

Note: Only the system administrator can use the log-searching facility. In addition, the user must have permission to see trap contents.

To search the logs, click on Administration : Search Logs. The Log Search page appears:


Figure 14.1: Log Search Page

In the search page, you can specify search parameters as follows:

• Start date/time and End date/time restrict the time interval over which the search is performed.

• Queue ID lets you search for a specific Sendmail Queue ID.

• Stream lets you restrict results to messages within a given stream.

• Incident ID lets you search for a specific CanIt-PRO incident ID.

• Sender lets you specify a sender’s email address.

• Recipient lets you specify a recipient’s email address.

• Subject lets you specify the subject of a message.

• CanIt Host lets you restrict the search to messages processed by a particular host. Note that you need to specify the host name as it appears in the log file.

• Classification lets you restrict messages based on their classification. Possible values for classification are:

– accepted
– rejected
– tagged
– discarded
– greylisted
– pending

You do not need to fill in all of the field values, but you should fill in as many as you can to make the search as specific as possible. Once you have filled in the search fields, click Submit to perform the search.


14.4.2 Result List

After you submit a search request, CanIt-PRO returns a list of matching results. This list might look something like Figure 14.2:

Figure 14.2: Log Search Results

Within the results page:

• Click on the small up- or down-arrows next to each column to sort by that column in ascending or descending order. The current sort order is shown by the red arrow.

• Click on a Queue ID link to view the detailed log lines for that queue ID.

• If there is an incident ID, click on the Incident ID link to see the Incident Details page.

Note: CanIt-PRO’s full-text search engine returns an estimate of the number of hits. While this estimate is usually quite accurate, it can sometimes overestimate the number of hits. As you move from page to page in the search results, the estimate may change. This is expected behavior and nothing to worry about.

14.4.3 Detailed Results

If you click on a queue ID, the Detailed Results page appears:


Figure 14.3: Log Search Details

This shows each log line related to the message transmission. To see the timestamp in a more readable format, hover the mouse cursor over the timestamp. For a detailed explanation of a log line, click on the question-mark icon next to the line. You can expose details for all log lines by clicking Show All Explanations. Finally, if you need the raw log lines (for example, to send to someone for analysis), click on Show Raw Logs.

14.5 Advanced Search Queries

CanIt-PRO uses a full-text engine to index the mail logs. The different search fields have different types; the types are:

• Atom. A search field that is an atom is considered one indivisible unit. You can only search for exact matches.

• Phrase. A search field that is a phrase is treated as natural-language text. You can search for partial matches by supplying a few words instead of the whole phrase.

Note: Fields that are phrases will be searched using “OR” logic if you enter multiple words. For example, if you enter Meeting Tomorrow Canceled in the Subject search box, CanIt-PRO will return hits whose subject contains “Meeting” or “Tomorrow” or “Canceled”. This is probably not what you want; to search for an entire phrase, enclose it in double-quotes: "Meeting Tomorrow Canceled"

The following fields are atoms:

• Queue ID

• Incident ID

• Stream


• Source Host IP

• Destination Host IP

• CanIt Host

• Classification

The Start date/time and End date/time fields are neither atoms nor phrases. Rather, they restrict the time range over which logs are searched.

The following fields are phrases, meaning you can enter partial information. (You have to split the information on a word boundary. For example, if you want to search for senders in the example.com domain, you can enter example.com or example, but not exam.)

• Subject

• Sender

• Recipient

• Message ID
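The difference between atoms and phrases can be modelled in a few lines of Python. This is a rough sketch of the behaviour described in this section, not the actual search engine: the tokenization (runs of letters and digits) and the case-insensitive comparison are assumptions.

```python
import re

def words(text):
    """Split text into lower-case words (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def atom_match(query, value):
    """Atoms match only when the whole value is identical."""
    return query == value

def phrase_match(query, value):
    """Phrases use OR logic on whole words, unless the query is double-quoted."""
    value_words = words(value)
    if len(query) >= 2 and query.startswith('"') and query.endswith('"'):
        wanted = words(query)
        n = len(wanted)
        # Quoted: all words must appear consecutively and in order.
        return n > 0 and any(
            value_words[i:i + n] == wanted
            for i in range(len(value_words) - n + 1)
        )
    # Unquoted: any single word is enough.
    return any(word in value_words for word in words(query))
```

In this model, the unquoted query Meeting Tomorrow Canceled matches a subject such as “Lunch meeting moved”, while the quoted form does not; likewise example matches a sender in example.com, but exam does not.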


Chapter 15

Tips

Managing spam requires constant attention, but there are many things you can do to reduce the work- load of the administrator. This chapter offers advice for fine-tuning CanIt-PRO and making it more effective.

15.1 Greylisting

In the past, spammers would use open SMTP relays to send spam. With the advent of inexpensive residential broadband, many spammers use special software to send bulk mail directly from their PCs. Because spammers want wide distribution, they want each message to be sent as cheaply as possible. Some spam software, therefore, ignores SMTP errors if a message cannot be delivered.

CanIt-PRO can deal very effectively with software that never retries by sending a temporary failure indication at the end of DATA when mail from an unknown sender arrives. If you set the “Tempfail unknown senders on first transmission” stream setting to Yes, then CanIt-PRO uses the combination of sender e-mail address, recipient e-mail address, sending relay IP address and message subject to calculate a hash. If this hash has never been seen before, CanIt-PRO tempfails the message. Once the hash reappears, CanIt-PRO marks the host as “known to retry” and lets the message proceed to content-scanning. A host marked “known to retry” is allowed to bypass greylisting for 40 days.

There are some down-sides to using greylisting. Valid mail from new senders may be delayed by anywhere from 15 minutes to four hours, depending on the retry interval on the sending relay. You can avoid this delay by setting up a secondary MX record. In fact, you can simply give the CanIt-PRO machine a virtual interface with another IP address and publish this other IP address as a secondary MX record. In this way, when proper SMTP relays receive a temporary failure indication on the primary MX machine, they immediately try to send to the secondary MX machine. Often, spamware won’t retry.

On a similar note, CanIt-PRO will not issue temporary failures for messages relayed from any server in a Known Network with Skip Greylisting configured (see Section 4.7 on page 50). If a message is received by such a server, greylisting will not be used. In some cases, this can cause greylisting statistics to be skewed. For example, if mail is initially received by a CanIt-PRO server and marked as greylisted, then is received by a secondary MX server and either relayed to the CanIt-PRO server, or to an internal mail server, the message will appear in the CanIt-PRO statistics as having been greylisted, even if it was received and processed.

In general, we find that setting Tempfail unknown senders on first transmission to Yes is a cheap and effective way to reduce spam.

WARNING: Some mailing list programs use “disposable” sender addresses which always change. These lists do not work well with greylisting. To work around the problem, you should whitelist the domain of the mailing list sender. CanIt-PRO tries to detect disposable-address schemes. It ignores everything in the sender address following a plus sign or a dash followed by a digit. These rules catch most common methods for generating disposable addresses, but they are not exhaustive.
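The greylisting logic described in this section can be sketched in Python as follows. This is a simplified model for illustration, not CanIt-PRO's code: the class and function names, the use of SHA-256, the in-memory storage, counting time in whole days, and the reading of the disposable-address rule (drop everything in the local part from a plus sign, or from a dash that is followed by a digit) are all assumptions.

```python
import hashlib
import re

KNOWN_TO_RETRY_DAYS = 40  # from the text: bypass greylisting for 40 days

def normalize_sender(sender):
    """Strip the disposable part of the local part (assumed interpretation)."""
    local, at, domain = sender.rpartition("@")
    if not at:
        return sender
    local = re.sub(r"(\+|-(?=\d)).*$", "", local)
    return local + "@" + domain

class Greylist:
    def __init__(self):
        self.seen = set()         # hashes that have been tempfailed once
        self.known_to_retry = {}  # relay IP -> day it was marked

    def check(self, sender, recipient, relay_ip, subject, today):
        """Return "tempfail" or "accept" (accept = proceed to content scanning)."""
        marked = self.known_to_retry.get(relay_ip)
        if marked is not None and today - marked < KNOWN_TO_RETRY_DAYS:
            return "accept"
        key = "\0".join([normalize_sender(sender), recipient, relay_ip, subject])
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        if digest not in self.seen:
            self.seen.add(digest)   # first sighting: ask the sender to retry
            return "tempfail"
        self.known_to_retry[relay_ip] = today  # the relay retried
        return "accept"
```

In this model a relay that retries once is accepted for the following 40 days, even for new sender and recipient combinations, which mirrors the “known to retry” behaviour described above.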

15.2 Don’t Trust Sender Addresses

Many spammers use one-time disposable sender addresses. Many addresses are not even valid. So we do not recommend blacklisting addresses unless you receive many different spam messages from the same address. Therefore:

Blacklisting individual addresses is usually not effective. Whitelisting known good addresses (for example, mailing-list sending addresses) can be very effective. The sender report may, however, highlight a persistent spam sender address which is worth blacklisting.

15.3 Don’t Trust Sender Domains

Just as sender addresses are often fake, sender domains are too. However, some domains are known spammers and these can be profitably blacklisted. The tip:

Blacklisting entire domains can be effective under limited circumstances. Whitelisting known good addresses (for example, mailing-list sending addresses) can be very effective. Holding all mail from free e-mail services like Hotmail and Yahoo can be effective if you use it in conjunction with whitelisting of known good senders from those services. Use the domain report to help make these decisions.

15.4 You May Trust Relay Hosts

It is rather difficult to fake the IP address of the SMTP relay host, so this attribute can usually be trusted. We recommend using a DNS-based blacklist service in your Sendmail configuration file or the CanIt-PRO GUI to reject the most obvious offenders. However, if you receive multiple spam messages from a given relay host, it can be effective to block the host:


Blacklisting a repeat-offender relay host is effective. Whitelisting known good hosts such as internal hosts is effective and recommended. Use the host report to determine which hosts are persistent spam relays.

15.5 Custom Rules

15.5.1 General Recommendations

There are a few custom rules which are quite effective:

1. If you know that your CanIt-PRO server only accepts inbound mail from the Internet, then no server should ever claim to be in your domain in the HELO command. If your CanIt-PRO server is called canit.mydomain.tld, a custom rule to add 5 points if HELO ends with mydomain.tld can be very effective. In fact, you might want to make high-scoring rules which automatically reject messages with obviously-fake HELO arguments.

2. Similarly, no machine should ever put an IP address as the argument of HELO. Some spammers use random IP addresses here to confuse spam-reporting tools. A custom rule which “regexp-matches” HELO against ^\d+\.\d+\.\d+\.\d+$ can be quite effective.

3. Custom rules which specify Sender contains “offer”, “bounce”, “return” and “noresponse” can often trap spam. You should use only moderate scores on these rules, because some legitimate mail comes from such senders. However, adding a rule which scores around 3 for these patterns can help catch a lot of spam which might otherwise sneak under the scoring threshold.

4. Subject-matching rules for the most obnoxious spams are very effective. For example, Subject regexp-match rules against v\Sagra and (increase|enlarge).*penis are very effective.
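To try the first three rules against sample values before entering them in the Web interface, you can model them in Python, whose regular-expression syntax accepts the patterns shown above unchanged. This is only a test harness under stated assumptions: the function name is hypothetical, the score for an IP-address HELO is our own choice (the text does not give one), and the sender rule is applied once even if several words match.

```python
import re

HELO_IS_IP = re.compile(r"^\d+\.\d+\.\d+\.\d+$")
SENDER_WORDS = ("offer", "bounce", "return", "noresponse")

def custom_score(helo, sender, my_domain="mydomain.tld"):
    """Return the points these example rules would add to a message."""
    score = 0.0
    if helo.lower().endswith(my_domain):
        score += 5   # rule 1: HELO claims to be in our own domain
    if HELO_IS_IP.match(helo):
        score += 5   # rule 2: HELO is a bare IP address (assumed score)
    if any(word in sender.lower() for word in SENDER_WORDS):
        score += 3   # rule 3: moderate score for suspicious sender words
    return score
```

Running the function over HELO and sender values taken from your own logs gives a quick impression of how often each rule would fire on legitimate mail.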

15.5.2 Things to avoid

Be very careful when writing custom rules, especially rules that can match on the message body. For example, a straightforward rule that contains “cum” in the body will match mail containing “document”, “cumulative”, “modicum” and at least 64 other common English words. Similarly, “sex” will match “sexton”, “Essex” and others.

If you want to match words in a message body, we recommend that you use a regular-expression match, and use Perl’s word-boundary operators. For example, the Perl regular expression \bcum\b matches the word “cum”, but not “document”, “cumulative” or “modicum”.
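The difference between a “contains” rule and a word-boundary regular expression is easy to check. The sketch below uses Python, whose \b operator behaves like Perl's for plain ASCII text; the function names are our own.

```python
import re

WORD_RULE = re.compile(r"\bcum\b")

def contains_rule(body):
    """Naive substring test, like a 'contains' custom rule."""
    return "cum" in body

def word_rule(body):
    """Word-boundary test, like a regexp-match custom rule."""
    return WORD_RULE.search(body) is not None
```

For a body such as “Please review the attached document”, contains_rule returns True while word_rule returns False; both return True for “magna cum laude”.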

15.6 Group High-Scoring Messages Together

We recommend that you set the default sort order to sort by Score, Descending. This groups high-scoring messages at the beginning and low-scoring messages at the end of the pending list. This makes it easier for the spam-control officer to dispose of the messages.


15.7 Roaring Penguin Best-Practices

At Roaring Penguin Software Inc., we’ve spent quite a bit of time analyzing spam and spammers. You may wish to try out some of our anti-spam rules to see if they work well for you. Here is a quick summary of the rules we use; they may inspire you to develop your own anti-spam rules.

• We use custom rules to add 4 to any message whose Sender contains “offer”, “noresponse”, “remove”, “marketing” or “promo”. These rules may be a touch aggressive for very busy sites, but are quite effective for smaller sites.

• Another custom rule adds 1.2 to any Relay containing “[” (left square bracket.) This indicates a reverse-DNS failure on the sending host, which is mildly correlated with spamming.

• We use a Spam threshold of 4.6, because we find the default of 5 is somewhat conservative.

• We use a discard threshold of 20; this seems quite safe.

• We set Tempfail unknown senders on first transmission to Yes. Again, this may be unacceptable for some sites.

15.8 General Anti-Spam Tips

15.8.1 Use Receive-Only Addresses on your Web Site

Spammers love to extract e-mail addresses from Web sites, and not only do they use them for the obvious purpose of spam targeting, but also they use them as fake sender addresses. Therefore, we recommend a general policy of publishing only generic e-mail addresses on your Web site, like [email protected] and [email protected]. When you reply to inquiries, always use a real, personal e-mail address like [email protected]. This has two benefits:

1. If someone sends e-mail purporting to come from [email protected], you know immediately that it is spam, and you can reject it. You can blacklist all your generic addresses inside CanIt-PRO.

2. If someone complains about receiving e-mail from one of the generic addresses, you can point to your policy and assure the recipient that the sender address was faked.

15.8.2 Do Not Reply to Spam

Do not ever reply to spam e-mail; such replies simply serve to validate your e-mail address. Similarly, do not visit Web sites purporting to offer opt-out services; they also serve to validate your address for further spamming.

Chapter 16

Security

Running a secure CanIt-PRO installation is relatively straightforward, but there are many issues you have to watch out for. This chapter gives you guidance on how to secure your CanIt-PRO installation.

16.1 Don’t Run as Root

The most basic security principle is to run as little software as root as possible. Therefore:

• Always create the Sendmail smmsp user and group, and do not run Sendmail suid-root. Instead, the permissions on the Sendmail executable should look like this:

-r-xr-sr-x root smmsp sendmail

That is, the sendmail binary should be owned by root, group smmsp and have mode 2555.

• Always create the MIMEDefang defang user and group, and run MIMEDefang as defang. In /etc/mail/canit/canit.conf, enable mx_user=defang in the [mimedefang] section.

16.2 Ownership and Permissions

All system configuration directories like /etc and their descendants should be owned by root and writeable only by root. Here are suggested ownership and permissions for various files and directories. Note that where we use group root, your system may use wheel or some other group for root-owned files.

File or Directory                      Owner   Group            Mode
/etc/mail/canit and ancestors          root    root             0755
/etc/mail/canit/canit.conf             apache  defang           0640
/var/spool and ancestors               root    root             0755
/var/spool/MIMEDefang                  defang  defang           0700
The PHP files in Apache’s Web space    root    Apache’s group   0644
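A short script can compare the modes actually in place with the suggested ones. The Python sketch below checks permission bits only (not owner or group), covers only the four fixed paths from the table, and uses names of our own choosing; treat it as a starting point rather than a complete audit.

```python
import os
import stat

EXPECTED_MODES = {
    "/etc/mail/canit": 0o755,
    "/etc/mail/canit/canit.conf": 0o640,
    "/var/spool": 0o755,
    "/var/spool/MIMEDefang": 0o700,
}

def mode_problems(expected=EXPECTED_MODES):
    """Return a list of messages for paths whose mode differs from the table."""
    problems = []
    for path, want in expected.items():
        try:
            have = stat.S_IMODE(os.stat(path).st_mode)
        except OSError as exc:
            problems.append(f"{path}: cannot stat ({exc.strerror})")
            continue
        if have != want:
            problems.append(f"{path}: mode {have:04o}, expected {want:04o}")
    return problems
```

An empty list means every listed path exists and has the suggested mode.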


16.3 SSH

The various nodes in a CanIt-PRO cluster communicate via SSH. Each node must be able to SSH to all other nodes on port 22.

For intra-cluster communication to work, root SSH login must be permitted. However, you do not need to permit general root login because the CanIt-PRO nodes only use a forced command for communication. The safest setting in /etc/ssh/sshd_config is therefore:

PermitRootLogin forced-commands-only

16.4 PostgreSQL Security

By default, PostgreSQL trusts any connection coming from the local host. Therefore, if you use PostgreSQL on your CanIt-PRO server with the default access rules, do not allow normal users to have shell accounts on the CanIt-PRO server. This cannot be emphasized strongly enough: If you allow normal users shell access on the CanIt-PRO server with PostgreSQL’s default setup, anyone can access or change the spam database.

If you must allow shell accounts on the CanIt-PRO server, then you must password-protect your PostgreSQL installation. See the PostgreSQL documentation (“Authentication Methods” section) for details. You must also protect your database passwords:

• The file /etc/mail/canit/canit.conf must be owned by apache and group defang. Both the defang user and the apache user need read-access to this file, which should have mode 0640. (We assume your Web server runs as user apache; if not, substitute the Web server user as appropriate.)

For best security, we strongly recommend that you do not allow ordinary users to have shell accounts on your mail server. If the CanIt-PRO database server is on a different machine, you should not permit shell accounts on that machine either.

16.5 PHP Security

PHP has a parameter called register_globals, which automatically sets global variables based on GET, POST or COOKIE variables. This setting may be a security risk, and CanIt-PRO does not require it. We strongly recommend that you set register_globals to off.

16.6 Network Security

When you log on to CanIt-PRO, your username and password are transmitted in cleartext. While you interact with CanIt-PRO, your browser passes a session cookie back so CanIt-PRO can keep track of your session. Both your password and the cookie are vulnerable to network sniffing. If you interact with CanIt-PRO over an untrusted network, or a network whose traffic may be sniffed, you should use HTTPS and SSL encryption. Setting this up is beyond the scope of this manual, but CanIt-PRO should operate with no changes over HTTPS.

16.7 Backups

The daily CanIt-PRO cron job dumps a text backup of the spam database to the file /var/spool/Canit-Spam-DB-Backup/SPAM-DATABASE-BACKUP. You should back this file up regularly in case the CanIt-PRO server suffers a hardware or other problem. You should also make sure the file is not readable by normal users.

You should also back up the entire directory tree rooted at /var/spool/MD-Bayes. If you are using the Storage Manager, you should also back up the Storage Manager directory on each Storage Manager node.

Some CanIt-PRO settings are stored in /usr/share/canit as well as /etc/mail/canit; you should back up that directory any time that you change a file in it. You may wish to back up /etc/mail in its entirety to capture Sendmail configuration files in your backup as well.

Note: When restoring from backups, never replace existing /etc/mail/ or /usr/share/canit files with backed up versions! Rather, use your backup versions as reference.

Finally, please remember to back up any customizations you have made to your CanIt-PRO installation, including web interface files, custom account-info or other scripts, et cetera.

Note: When restoring from backups, be careful when replacing web interface files, especially (but not only) if you are restoring to a different version of CanIt-PRO than that from which your backup was made.


Appendix A

The Domain Configuration Wizard

A.1 Introduction

The Domain Configuration Wizard provides a simple way to quickly configure the most important settings for a domain. All of the pages in the Domain Configuration Wizard are available in greater detail in the Setup and Administration menus. However, because the Domain Configuration Wizard centralizes the important settings in one simple workflow, you may prefer to use it to set up new domains. To access the Domain Configuration Wizard, click on Setup and then Wizards. Click on Domain Configuration Wizard.

A.2 Entering the Domain Name

The first step in the Domain Configuration Wizard requires you to enter a domain name (Figure A.1). Enter the domain name and click Next.

Figure A.1: Domain Configuration: Enter Domain Name

A.3 Configuring Streaming

The next step (Figure A.2) requires you to choose how mail for the domain should be streamed. Streaming is explained in detail in Chapter 3.


Figure A.2: Domain Configuration: Configuring Streaming

You can configure streaming in several ways:

• You can simply chop the domain part off the e-mail address so that mail for [email protected] goes into the stream user.

• You can chop the local part off the e-mail address so that mail for [email protected] goes into the stream example.net.

• You can keep the entire e-mail address as the stream name. This is the recommended method for most installations.

• You can invoke the User Lookup Wizard to set up a more complex streaming method (for example, using LDAP). The User Lookup Wizard is described in Chapter 6.

Note that if you have created User Lookup methods (either in the past or after stepping through the User Lookup Wizard from the Domain Configuration Wizard), you will be presented with additional choices for streaming.
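The three simple streaming choices amount to different ways of cutting up the recipient address. The Python sketch below shows the idea; the function name and the method labels are hypothetical, and the address user@example.net is used purely for illustration.

```python
def stream_for(address, method):
    """Return the stream name for an e-mail address under a simple method."""
    local, _, domain = address.rpartition("@")
    if method == "chop_domain":     # keep only the local part
        return local
    if method == "chop_local":      # keep only the domain
        return domain
    if method == "full_address":    # keep the entire address
        return address
    raise ValueError(f"unknown streaming method: {method}")
```

For user@example.net, the three methods give the streams user, example.net and user@example.net respectively.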

A.4 Configuring Authentication

Once streaming has been configured, you will be asked to configure authentication (Figure A.3).


Figure A.3: Domain Configuration: Configuring Authentication

To allow end-users to log into CanIt-PRO and manage their traps, you can set up an authentication mechanism. From the Domain Configuration Wizard, you have several choices:

• IMAP allows you to authenticate users against an IMAP server.

• POP3 allows authentication against a POP3 server.

• Other allows you to skip setting up authentication. You can do it at a later time, or (if you do not want to allow end-users to log in) skip it entirely. You can also step through the User Lookup Wizard to set up a more complex authentication mechanism.

If you select IMAP or POP3, you will be prompted to enter the name (or IP address) of the IMAP or POP3 server. If CanIt-PRO should strip the domain name off the login name before attempting to authenticate, set the “Strip domain name from login” parameter to Yes. You can also configure CanIt-PRO to validate SSL certificates and to use (or require) an encrypted connection to the POP3 or IMAP server. If you step through the User Lookup Wizard to create an authentication method, the newly-created method will be presented as an authentication choice when you return to the Domain Configuration Wizard.


A.5 Configuring Routing and Verification

Finally, CanIt-PRO will ask you to configure routing and verification (Figure A.4).

Figure A.4: Domain Configuration: Configuring Routing and Verification

Note: Configuring routing via the Web interface is only available on CanIt-PRO appliance builds. If you are not running an appliance build, you will need to configure routing using Sendmail’s mailertable feature; consult the Sendmail documentation for details.

To route mail for the domain, enter the host name or IP address of the back-end SMTP server that will accept e-mail for the domain.

We strongly recommend configuring some method for CanIt-PRO to validate recipient addresses. If you do not validate recipient addresses, CanIt-PRO is forced to accept mail for any address within the domain, likely resulting in many failure notifications. If your back-end mail server validates recipients during the SMTP transaction, enter its name or IP address as the verification server. If it does not, you will have to leave the verification server blank and use some other method (such as LDAP streaming) to validate recipients.
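To check whether a back-end server validates recipients during the SMTP transaction, you can probe it by hand. The sketch below uses Python's standard smtplib module to send MAIL FROM with a null sender and a single RCPT TO, then classifies the reply. It illustrates the general SMTP callback technique and is not CanIt-PRO's implementation; the function names and the three result labels are assumptions.

```python
import smtplib


def classify(code: int) -> str:
    """Map an SMTP reply code for RCPT TO onto a coarse result."""
    if 200 <= code < 300:
        return "accept"      # server says the recipient is valid
    if 400 <= code < 500:
        return "tempfail"    # temporary failure; try again later
    return "reject"          # 5xx (or anything unexpected)


def verify_recipient(server: str, recipient: str,
                     port: int = 25, timeout: float = 30.0) -> str:
    """Ask the back-end server whether it would accept mail for recipient."""
    try:
        with smtplib.SMTP(server, port, timeout=timeout) as smtp:
            smtp.ehlo_or_helo_if_needed()
            smtp.mail("")                    # null envelope sender: MAIL FROM:<>
            code, _reply = smtp.rcpt(recipient)
    except (OSError, smtplib.SMTPException):
        return "tempfail"                    # server unreachable or misbehaving
    return classify(code)
```

If the server answers "accept" for an address that does not exist, it does not validate recipients during the SMTP transaction and is not useful as a verification server.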

A.6 Summary

After configuring routing and verification, CanIt-PRO will display a summary of your settings. Click Finish to make them take effect.

Appendix B

Release Notes

Version 8.0.3 released on 2011-03-01

• NEW FEATURE (CanIt-Domain-PRO appliance only): The log-indexer can forward log lines (using the SYSLOG protocol) to remote hosts on a per-realm basis.

• IMPROVEMENT: Reduce the disk I/O consumed by the log-indexing daemon.

• POLICY CHANGE: Access to “Search Logs” is now controlled by a separate permission rather than the “View Trap” permissions.

• POLICY CHANGE: We have clarified the licensing terms of our data feeds such as RPTN and our RBLs. Please see the LICENSE.TXT file for details.

• POLICY CHANGE: On CanIt appliances, we now auto-create a mail alias for “root” that goes to the CanIt Administrator email address.

• UPDATE: Upgrade to ClamAV 0.97.

• BUG FIX: Package a missing cron task in canit-log-correlator.

• BUG FIX: When searching logs, correctly handle the case when the index is updated during a search.

• BUG FIX: Fix a number of deprecation warnings with PHP 5.3.

• BUG FIX: Fix typo in theme file that caused browser to request a nonexistent CSS file.

Version 8.0.2 released on 2011-02-14

• NEW FEATURE: In addition to rate-limiting outbound mail by sender, you can also rate-limit it by originating IP address.


• NEW FEATURE (Appliance only): Implemented new API calls to perform log searches. Note: The API client libraries were updated to permit query parameters on GET requests; to use the new API calls, you will need to use the latest API client libraries.

• POLICY CHANGE: We have a new “Mixed” real-time DNS-based list. This list includes hosts that send a large amount of both spam and non-spam. They should be penalized somewhat, but not as much as hosts in “SpamSource”.

• GUI IMPROVEMENT: All pages display the “Pager”, “Filter” and “Enter specific object” elements in that consistent order.

• BUG FIX: The audit-trail for Preferences did not work; it works correctly now.

• BUG FIX: MIMEDefang failed to install on (really old versions of) FreeBSD. This has been fixed.

• BUG FIX: Voting links were sometimes not removed even if they should have been. This has been fixed.

Version 8.0.1 released on 2011-01-31

• BUG FIX: The log-indexer would sometimes produce incorrect timestamps when indexing log files. This has been fixed.

• BUG FIX: A third-party PHP module was not compatible with PHP4. We have backported it.

• BUG FIX: The new Web code broke the RSS Feed feature; this has been fixed.

• BUG FIX: A minor display problem when clicking on the message subject in a Pending Notification message has been fixed.

• BUG FIX: A very rare edge case in the PHP code that could produce illegal SQL has been fixed.

• BUG FIX: In very rare cases, a dead storage manager node could cause a scanning process to terminate unexpectedly. This has been fixed.

• BUG FIX: On appliances, clamd could be disabled unintentionally during an upgrade. This has been fixed.

• BUG FIX: The Verification Server feature could sometimes misinterpret an ESMTP “SIZE” keyword from the back-end server. This has been fixed.

• BUG FIX: Remove deprecated call-time pass-by-reference instances in the PHP code.

• DOC FIX: In the Administration Guide, document the fact that the SNMP agent requires a helper cron job to run once a minute.


Version 8.0.0 released on 2011-01-24

• MAJOR NEW FEATURE: CanIt keeps an audit trail of all changes to settings, rules, etc. You can review the audit trail from most pages by clicking “Show Changes”.

• MAJOR NEW FEATURE: CanIt can perform full-text indexing and searching of all mail logs. Note that this feature is available only on our Debian appliances (“Lenny” release) and is not installed by default. See the Administration Guide for installation details.

• NEW FEATURE: All of the online documentation can now be searched live from the CanIt interface.

• MAJOR CHANGE: The PHP Web interface has been completely rewritten, making theming much easier. NOTE INCOMPATIBILITY: IF YOU HAVE THEMED CANIT, YOU WILL NEED TO REWORK YOUR CUSTOMIZATIONS.

• POLICY CHANGE: We no longer provide binary RPMs for SuSE Linux Enterprise Server 9.

• POLICY CHANGE: Default database connection timeout has been increased from 10 seconds to 20 seconds. 10 was causing problems on some systems.

• POLICY CHANGE: In a Verification Server entry, if you choose “Queue” rather than “Tempfail” when the back-end server is down, CanIt only permits recipients who have been seen within the last 60 days. It tempfails any others. This makes it much safer to use “Queue” with much less risk of backscatter.

• UPDATE: Update ClamAV from 0.96.4 to 0.96.5.

• NEW FEATURE: The Pending Notification page has a button to send a notification immediately.

• NEW FEATURE: We now have IPv6 geolocation data (though it is less granular than the IPv4 data, listing only the country.)

• NEW FEATURE: When the failover code fails over, it executes all scripts in /usr/share/canit/failover/notify.d/. This lets you write scripts to send administrators notice that the database has failed over.

• NEW API CALLS:
  /realm/@@/streams/with_pending: List streams with new pending incidents.
  /realm/@@/stream/@@/pending_flag: Get/set the Pending Notification flag.
  /realm/somerealm/realm_mappings: Get all realm mappings for a given realm.

• API IMPROVEMENT: The API returns more information about incidents than before, including sender, recipient, host, etc.


• IMPROVEMENT (CanIt-Domain-PRO only): A configuration file setting lets you use AJAX-enabled auto-completing text fields wherever a realm entry box appears. The AJAX code has been optimized since the previous release and should be much faster, especially for the site administrator.

• IMPROVEMENT: Verification Server SMTP callbacks understand the ESMTP “SIZE” keyword.

• IMPROVEMENT: You can control caching of valid recipients and invalid recipients using memcached independently. (For example, you can cache valid recipients, but not invalid ones.)

• IMPROVEMENT: The Permissions page forces you to enter a stream class if there isn’t one instead of silently ignoring input.

• IMPROVEMENT: The POP3 authentication method would fail against Exchange 2007 and newer because of a Microsoft bug. We have code in CanIt to work around Microsoft’s bug.

• COSMETIC IMPROVEMENT: The charts displayed in Reports have been improved: Redundant trailing zeros are deleted and all load charts are lined up.

• COSMETIC IMPROVEMENT: The User Lookup test page has been made clearer. Tests that can’t work for a particular user lookup method are suppressed.

• BUG FIX: If the system has been configured to force user-names to lower-case, then all components of the system enforce that setting on data entry.

• BUG FIX: When CanIt stripped out existing training links from messages, it would sometimes be a bit greedy and strip out too much of the message. This has been fixed.

• BUG FIX: SNMP monitoring code has been rewritten and cleaned up. Permissions problems with monitoring PostgreSQL have been fixed. NOTE INCOMPATIBILITY: The SNMP code has changed to require only a single SNMP agent process. If you have configured SNMP, you will need to reconfigure it to use the new SNMP agent.

• BUG FIX: sendmail-account-info.pl could occasionally fail. (This affects very few people...)

• BUG FIX: Update the Python API client library to work with Python 2.6 and later.

• BUG FIX: In CanIt-Domain-PRO, the Domain Setup Wizard would sometimes create a Verification Server entry in the base realm instead of the appropriate realm. This has been fixed.
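The 8.0.0 policy change for Verification Server entries (“Queue” versus “Tempfail” when the back-end server is down) can be pictured with the following sketch. It only restates the rule from the release note; the function name, the policy labels and the way the last-seen time is supplied are assumptions, not CanIt internals.

```python
from datetime import datetime, timedelta
from typing import Optional

SEEN_WINDOW = timedelta(days=60)  # window stated in the release note


def action_when_backend_down(policy: str,
                             last_seen: Optional[datetime],
                             now: datetime) -> str:
    """Decide what to do with a recipient while the back-end is unreachable.

    policy    : "queue" or "tempfail" (hypothetical labels for the setting)
    last_seen : when this recipient was last seen, or None if never
    """
    if policy == "tempfail":
        return "tempfail"
    if policy == "queue":
        if last_seen is not None and now - last_seen <= SEEN_WINDOW:
            return "queue"       # recently seen recipient: accept and queue
        return "tempfail"        # unknown or stale recipient: do not queue
    raise ValueError(f"unknown policy: {policy!r}")
```

Queuing only for recently seen recipients avoids accepting mail for addresses that later turn out not to exist, which is what produces backscatter.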

Version 7.0.8 released on 2010-11-09

• NEW FEATURE: CanIt now has built-in support for DKIM (DomainKeys Identified Mail). See http://dkim.org for a description of DKIM.


• NEW FEATURE: We have experimental support for Vouch By Reference (RFC 5518). We may change the way it is implemented in a future release of CanIt, but will provide an automatic upgrade path.

• IMPROVEMENT: If you are using outbound rate limiting *and* use SMTP AUTH on the CanIt server, we limit outbound mail based on the authentication name as well as the purported envelope sender. This is to avoid spammers bypassing rate limiting by changing the sender address.

• NEW API CALLS: GET /realm/xx/stream/yy/addresses_seen and GET /realm/xx/domains_seen return statistics about observed email addresses per stream/realm.

• IMPROVEMENT: Graphical reports now show statistics for “bad” things (e.g. spam and viruses) using a reddish palette and “good” things using a greenish palette.

• IMPROVEMENT: The failover scripts include additional checks to prevent certain misconfigurations.

• IMPROVEMENT: You can specify a database connection timeout. This improves response time if you have configured a databaseless-filter mode and the database is down. The default timeout is 10 seconds.

• UPGRADE: Updated ClamAV to version 0.96.4.

• PLATFORM SUPPORT: We have dropped support for RPMs for Red Hat Enterprise Linux 3. We have dropped binary package support for Debian Sarge (3.1).

• POLICY CHANGE: The CanIt Appliance setup screen will not let you name your CanIt host “something.local” or “something.localdomain”.

• BUG FIX: The username “defang” was hard-coded in a few places. Now everything respects the setting in canit.conf.

• BUG FIX: Verification Servers were documented as taking a “/port” suffix. Now that actually works!

• BUG FIX: In the API server, using ’@@’ for realm and/or stream now works as documented.

• BUG FIX: A long-standing bug that could cause Pending Notifications to fail if there are pending messages with invalidly-encoded subjects has been fixed.

• BUG FIX: The old message-hashing algorithm could sometimes consider two different messages to be the same incident. This has been fixed, but there will be a transition period of several days for your CanIt installation to move completely to the new algorithm. (We must retain the old algorithm until all incidents hashed with it have expired.)

• MISCELLANEOUS: Many minor bug fixes and code cleanups.
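The 7.0.8 rate-limiting improvement counts outbound mail against the SMTP AUTH name as well as the envelope sender. The sketch below shows why that helps, using a simple sliding one-hour window. The class name, the counting method and the limit are all assumptions; the release notes do not describe how CanIt implements its counters.

```python
from collections import defaultdict, deque
from typing import Optional


class HourlyLimiter:
    """Sliding-window limiter keyed by envelope sender and by AUTH name."""

    def __init__(self, limit: int, window: float = 3600.0) -> None:
        self.limit = limit
        self.window = window
        self._events = defaultdict(deque)   # key -> timestamps of accepted mail

    def _count(self, key, now: float) -> int:
        q = self._events[key]
        while q and now - q[0] >= self.window:
            q.popleft()                     # forget events older than the window
        return len(q)

    def allow(self, sender: str, auth_name: Optional[str], now: float) -> bool:
        keys = [("sender", sender.lower())]
        if auth_name:
            keys.append(("auth", auth_name.lower()))
        # Refuse if either the sender or the authenticated user is over limit.
        if any(self._count(k, now) >= self.limit for k in keys):
            return False
        for k in keys:
            self._events[k].append(now)
        return True
```

Because the authenticated name is counted separately, a compromised account that changes its sender address on every message still runs into the limit.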


Version 7.0.7 released on 2010-09-20

• POLICY CHANGE: Add default rules for the Roaring Penguin reputation lists. *** NOTE *** If you are already using our lists, please check the RBL rules in the default stream (base realm in Domain-PRO) carefully after upgrading!

• NEW FEATURE: You can configure CanIt to create an incident for all messages, even non-spam ones. The feature is disabled by default and we do not recommend enabling it on busy sites.

• NEW FEATURE / POLICY CHANGE: The default notification format has been changed to “HTML with Links” instead of “Clickable Webform”. The “Clickable Webform” form fails in many email clients.

• NEW FEATURE: Add a button to the Preferences : Notification page that allows a user to request an immediate notification.

• UPDATE: Prepare CanIt to work properly with forthcoming PostgreSQL 9.0.0.

• MINOR IMPROVEMENT: Add a “tag” field in Master RBL List. This tag (if present) is used in log messages rather than the long RBL name.

• IMPROVEMENT: Add more convenient do_get, do_put, do_delete and do_post methods to Perl API client library.

• IMPROVEMENT (CanIt-Domain-PRO only): Instead of a pull-down list of realms, you can use an AJAXy auto-completion text field in the “Switch Realm” box. See $Config['RealmSelectWidget'] in the config-domain-pro.php file.

• BUG FIX: The API server includes the “parent” field of each realm in the “GET /realms” result.

• BUG FIX: Previously, the API server did not convert a user name of ’@@’ to the logged-in username. Now it works as advertised.

• BUG FIX: The Known Networks page would sometimes display a spurious error message: “You cannot use Force To Stream on a network containing 127.0.0.1 or ::1”. This has been fixed.

• BUG FIX: Uploading an SSL certificate under Setup : HTTPS would put the certificate on all cluster members. Now it only puts it on the particular web server that your browser communicates with.

• BUG FIX: When disposing of locally-held messages, correctly report the sending relay to the Roaring Penguin Reputation Collection system.

• BUG FIX: Properly implement the retry-delay for dead Storage Manager nodes.

• BUG FIX: Use the --compress flag on all “rsync” commands.

• BUG FIX: Do not check RPTN data for freshness on cluster members that are not marked as “Sync Bayes?” nodes.


Version 7.0.6 released on 2010-07-27

• NEW FEATURE: CanIt can stream inbound messages by directly injecting files into the Sendmail client queue. This can significantly improve performance and reduce disk I/O when streaming messages.

• IMPROVEMENT (CanIt-Domain-PRO only): Realm administrators now have access to the Greylisting Report.

• POLICY CHANGE: SORBS has been removed from the suggested set of useful RBLs.

• BUG FIX: The “Known Networks” cache size has been increased to avoid “ping-ponging” the cache in certain situations.

• BUG FIX: The sample PHP API client code now implements DELETE.

• BUG FIX: The CanIt-Connectwise integration module no longer uses the DateTime module.

• BUG FIX: Subject lines in the Pending Notification messages would sometimes be converted to UTF-8 incorrectly. This has been fixed.

Version 7.0.5 released on 2010-07-02

• BUG FIX (CanIt-PRO only): Brand new installations would fail with a PHP error. This has been fixed.

Version 7.0.4 released on 2010-06-29

• BUG FIX: On CanIt-PRO only, if the login/password were the default “admin/canit”, the system would fail with a fatal PHP error. This has been fixed.

Version 7.0.3 released on 2010-06-29

• UPDATE: Update MIMEDefang to 2.70

• NEW FEATURE: Scanners can be marked as “Inbound” and/or “Outbound”. You can also mark a scanner as not needing Bayes data if it’s only used for outbound scanning.

• IMPROVEMENT: You can specify whether an RBL applies to IPv4 addresses, IPv6 addresses or both.

• IMPROVEMENT: The Bayes calculation handles edge-cases better rather than biasing them towards “spam”.

• IMPROVEMENT: The stream (and possibly realm) are included in the URLs generated by Pending Notifications.


• CHANGE: The “User”, “PID File” and “Root Directory” settings for Storage Manager are now specified in canit.conf rather than being stored in the database and updated via the Web interface. This allows you to use different values on different machines, and also makes the canit-system startup script more robust in the face of a missing database.

• CHANGE: “Sender-Whitelisted” messages are reported to RPTN (but not trained locally.)

• CHANGE: We no longer track the “Expired from Trap” statistic. Instead, when an incident is created, it increments a new “Quarantined” statistic.

• BUG FIX: IPv6 addresses were not always correctly parsed out of Received: headers; this has been fixed.

• BUG FIX: Avoid useless DNS lookups in CanIt::Socket.

• BUG FIX: Make notifications stream settings accessible via API.

• BUG FIX: Auto-whitelisting could inadvertently create mixed-case sender rules. This has been fixed.

• BUG FIX (CanIt-Domain-PRO only): A realm’s “default” stream always inherited from base:default even if the admin had explicitly turned off inheritance. This has been fixed.

• BUG FIX: The LDAP lookup Perl code broke with very new versions of Net::LDAP. This has been fixed.

• BUG FIX: The test for Blacklisted Recipients was case-sensitive; this has been fixed.

• BUG FIX: If you had a wildcard Verification Server, it would sometimes be used even if there was a more-specific entry. This has been fixed.

• BUG FIX: Country Code rules can now be exported to CSV files and imported from CSV files.

• BUG FIX: The Web interface formerly took quadratic time to obtain the tree of realms; it now takes linear time.

• BUG FIX: Provide API-level access for setting a stream’s parent.

• BUG FIX: Improve CanIt API server handling of JSON vs YAML.

Version 7.0.2 released on 2010-05-03

• IMPROVEMENT: HTTPS (with self-signed certificates) is enabled on appliances by default.

• DOCUMENTATION FIX: Fix typo in Administration Guide Memcached configuration instructions.

• BUG FIX: Allow a wildcard SPF rule (broken in 7.0.0).

• BUG FIX: Add workaround for ancient Perl LWP library shipped with Red Hat Enterprise Linux 4.


• BUG FIX: Periodic Reports were broken on CanIt-PRO. They have been fixed. Note: If you are upgrading from pre-7.0.0, you may still have to edit all of your charts and save them to update the stored report configuration.

• BUG FIX: Avoid warning caused by ancient version of PostgreSQL shipped with Red Hat Enterprise Linux 4.

• BUG FIX: Only allow selection of LDAP (Active Directory) when first creating a User Lookup. Thereafter, it becomes (and stays) LDAP (Generic).

• BUG FIX: Suppress pointless warning when entering a Verification Server of “ignore”.

• BUG FIX: Prevent startup code from always registering a new cluster member as a standalone machine.

• BUG FIX: Don’t attempt DNS lookups on hostnames that are already IPv4 or IPv6 addresses.

• BUG FIX: 9-digit old-style incident IDs would cause the work journal task to fail. This has been fixed.

• BUG FIX (CanIt-Domain-PRO only): Allow Base URL of CanIt Installation to be set on a per-realm basis.

Version 7.0.1 released on 2010-04-20

• UPDATE: Update to ClamAV version 0.96.

• BUG FIX: Remove the long-obsolete “Sendmail” domain-mapping option.

• BUG FIX: Subject lines with NUL characters could produce badly-rendered Pending Notification reports. This has been fixed.

• BUG FIX: Fix compilation error in Storage Manager on Gentoo.

• BUG FIX: Avoid spurious warning in database upgrade script.

• BUG FIX: Include correct text in incident report when sender is whitelisted due to SMTP AUTH.

• BUG FIX: Fix the optional “Next Msg” and “Prev Msg” links in the incident details page; these were broken by the 7.0.0 release.

• BUG FIX: In 7.0.0, a user-lookup method whose name matched the method name would fail. This has been fixed.

• BUG FIX (CanIt-Domain-PRO only): The 7.0.0 upgrade accidentally reduced the permissions of realm administrators. This has been fixed.

• BUG FIX: The Known Networks page produced invalid HTML; this has been fixed.

• BUG FIX: The database upgrade script could fail on certain databases that were upgraded from old versions of PostgreSQL. This has been fixed.


Version 7.0.0 released on 2010-04-13

• MAJOR NEW FEATURE: CanIt can automatically block senders who send too many messages per hour. This rate-limiting is controlled by a Known Networks flag and can be used to detect and block internally-compromised email accounts.

• MAJOR NEW FEATURE (CanIt-Domain-PRO only): Realms can now be hierarchical. This allows many levels of administrative control; a customer in charge of a realm can be allowed to manage sub-realms.

• MAJOR NEW FEATURE: CanIt installations collect data about IP address reputation and send the data back to Roaring Penguin Software. This will be used to build a set of DNS-based blocklists usable by CanIt customers. NOTE: You should open UDP port 6568 outbound so the CanIt machines can report the IP reputation data.

• MAJOR CHANGE: The API server has been rewritten in PHP. As a result, it is much easier to deploy and does not need FastCGI or Catalyst. Also, if you choose, you can make the API available to realm administrators or even end-users. (Normal permission checks apply.) NOTE INCOMPATIBILITY: The API version number has changed from 1.0 to 2.0. You should rework any scripts that use the API and make sure they still work correctly.

• MAJOR CHANGE: Incident IDs are no longer integers, but string identifiers. This is to support a future add-on component that allows CanIt incident data to be spread across multiple PostgreSQL servers.

• MAJOR NEW FEATURE: Roaring Penguin Software Inc. provides four new DNSBLs to CanIt customers; see the Administration Guide for details.

• NEW FEATURE: Master RBLs can be marked “Block” or “Allow”, which controls the RBL rules that can be created.

• NEW FEATURE: You can extend the enforced greylisting “quiet time” for hosts listed on a DNSBL.

• NEW FEATURE: (CanIt-Domain-PRO only) You can store up to four user-defined pieces of information per realm.

• ENHANCEMENT: DNSBLs now let you specify an A record to match or a bitmask to mask against. This lets CanIt handle combined DNSBLs that return multiple pieces of information encoded in the A record.

• UPGRADE: Upgraded SpamAssassin from 3.2.5 to 3.3.0.

• IMPROVEMENT: You can set a timeout on Verification Server lookups and User Lookups.

• IMPROVEMENT: “Subject” custom rules apply to both raw and decoded subject lines.


• IMPROVEMENT: CanIt Storage Manager packs old data into CDB databases; this reduces the number of files in the Storage Manager Tree making it easier to back up and consuming fewer inodes.

• IMPROVEMENT: The Sanity Checker module checks for many more problems and misconfigurations.

• IMPROVEMENT: The Dictionary Attack Detector uses a Known Networks flag to avoid banning friendly hosts. This replaces the older text entry box with a list of hosts.

• IMPROVEMENT: The “Clickable Webform” Pending Notification has been improved so large pending lists don’t generate over-long URLs. Also, all subject lines are decoded and presented in UTF-8.

• IMPROVEMENT: The failover code refuses to fail over if the standby database is active for some reason.

• IMPROVEMENT: WAL-file copying in the failover code has been made more robust.

• IMPROVEMENT: System Load graphs are now in a zoomable vector format on browsers that support the HTML Canvas tag (this means any modern browser except Internet Explorer.)

• IMPROVEMENT: The API always returns a stream’s parent when returning stream data.

• IMPROVEMENT: DNSBL descriptions are shown in the Spam Analysis Report.

• IMPROVEMENT: We don’t use DB_File unless we actually encounter a Berkeley DB file. This can reduce memory usage.

• IMPROVEMENT: French and Portuguese translations have been overhauled.

• SECURITY ENHANCEMENT: The “goto” redirection parameter is sanitized to avoid cross-site scripting attacks.

• IMPROVEMENT: (Appliance only) The curses-based “setup” utility lets you reset CanIt user passwords.

• IMPROVEMENT: (CanIt-Domain-PRO only) Many formerly-global settings like the CanIt administrator email address are settable on a per-realm basis.

• PERFORMANCE IMPROVEMENT: You can use memcached to cache Verification Server results.

• MINOR NEW FEATURE: You can purge all rules and settings from a stream.

• MINOR NEW FEATURE: You can remove permission for end-users to disable stream inheritance.

• MINOR IMPROVEMENT: Text in the “Strip Attachments” notification is now templatable.

• MINOR IMPROVEMENT: You can set the default action for the “Clickable Webform” notification, and also add “Blacklist/Whitelist Sender” options to the notification.


• MINOR IMPROVEMENT: You can tell the LDAP lookup not to validate the server certificate (if, for example, it uses a self-signed certificate.)

• MINOR IMPROVEMENT: System Check test names are hyper-linked to descriptions in the Administration Guide.

• PERFORMANCE IMPROVEMENT: The Storage Manager server uses “sendfile” to send data if possible. Otherwise, it tries “mmap” and only as a last resort falls back to “read/write”.

• CLEANUP: The database schema has been cleaned up to improve performance and maintainability.

• GUI CLEANUP: The Pending Trap displays incident dates in a more readable way.

• GUI CLEANUP: The Known Networks interface has been reworked to avoid very wide pages that require side-scrolling.

• POLICY CHANGE: The “Handling for Windows Executables” setting has been removed. Instead, use Filename Extension rules. On upgrade, appropriate Filename Extension rules are created to keep the same behaviour as the pre-upgrade version.

• POLICY CHANGE: The “Secondary MX Machines” setting has been removed. Instead, use Known Networks flags. On upgrade, appropriate Known Networks entries are created.

• POLICY CHANGE: We no longer tokenize Microsoft Word documents. They were leading to too many false positives. This change may be revisited in a future release.

• POLICY CHANGE: The “Database Cron Runner” flag in Cluster Management has been ignored since release 6.1.0. The flag has therefore been removed.

• BUG FIX: We don’t count addresses in addresses_seen unless a domain is known to validate recipients.

• BUG FIX (Appliance only): The “Set Timezone” menu option now works properly and actually sets the time zone.

• BUG FIX: Bayes training is more robust in the face of corrupt CDB files.

• BUG FIX: Known Networks would refuse to allow a “Force-to-Stream” value for SMTP-AUTH. This has been fixed.

• BUG FIX: The “Top N” reports in long-term statistics used to issue PHP errors on PHP 5.3; this has been fixed.

• BUG FIX: CanIt would incorrectly force local parts of email addresses to lower-case when doing verification server lookups. This can break things like SRS, so we leave the local part alone now. (For the purposes of rules, however, the local part is still compared case-insensitively.)

• BUG FIX (Domain-PRO only): Realm Administrators are now given full access to traps within their realms.

• BUG FIX: Daily reports were broken on PHP 5.3.


• BUG FIX: Remove all PHP calls to the deprecated “ereg*” functions in favour of “preg*”.

• BUG FIX: (CanIt-Domain-PRO appliance only) Deleting a realm also deletes domain routes associated with the realm.

• BUG FIX: SPF “permerror” and “temperror” return codes are handled properly.

• BUG FIX: Oversize text/plain parts are no longer scanned with SpamAssassin.

• BUG FIX: The system is much more robust in the face of corrupt Bayes CDB files.

• BUG FIX: canit-failover-verify-setup.pl would report a spurious test failure.

• BUG FIX: CSS stylesheets have been fixed up to have more consistent font selection.
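The 7.0.0 DNSBL enhancement (matching a specific A record, or masking the returned A record against a bitmask) can be illustrated as follows. This is a sketch of the general technique for combined DNSBLs, not CanIt's code. The function name, the dotted-quad form of the mask and the choice to apply the mask to the whole 32-bit address are assumptions.

```python
import ipaddress
from typing import Optional


def dnsbl_matches(a_record: str,
                  exact: Optional[str] = None,
                  bitmask: Optional[str] = None) -> bool:
    """Decide whether a DNSBL answer counts as a hit.

    a_record : the A record returned by the list, e.g. "127.0.0.6"
    exact    : if given, hit only when the answer equals this address
    bitmask  : if given, hit when (answer AND mask) is non-zero
    With neither option, any answer counts as a hit.
    """
    value = int(ipaddress.IPv4Address(a_record))
    if exact is not None:
        return value == int(ipaddress.IPv4Address(exact))
    if bitmask is not None:
        return (value & int(ipaddress.IPv4Address(bitmask))) != 0
    return True
```

For example, if a combined list returned 127.0.0.6 with bit 2 and bit 4 each standing for a different sub-list, a rule with mask 0.0.0.4 would match while a rule with mask 0.0.0.1 would not.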

Version 6.1.3 released on 2009-10-15

• BUG FIX: Make startup code regenerate mailertable and access databases on appliances.

• BUG FIX: Make startup code coexist more peacefully with PgBouncer.

• BUG FIX: Fix errors in IPv6 validation in Rules : Networks.

• BUG FIX: Force scanners to notice changes to Known Networks immediately.

• BUG FIX: Make “PUT /domain_routes/activate” API command actually work.

• MINOR BUG FIX: Suppress warnings that cron job has not run on new installations; we only trigger the test after system has been installed for at least a day.

• MINOR BUG FIX: Fix possible print formatting error in Custom Rule test.

• MINOR BUG FIX: Suppress “Use of uninitialized variable” warning in Bayes code.

• MINOR BUG FIX: Make Bayes sync work on symbolically-linked source directories.

Version 6.1.2 released on 2009-08-17

• BUG FIX: Total token counts for local streams were being reset to zero. This has been fixed. Note that no Bayes training was lost; only the token counts in the PostgreSQL database were affected. As streams undergo Bayes training, the token counts will be corrected automatically.

Version 6.1.1 released on 2009-08-12

• BUG FIX: On certain systems, the database upgrade code would fail. This has been fixed.

• BUG FIX: In some cases, depending on how it was sorted, the trap display would produce an SQL error. This has been fixed.

• BUG FIX: Storage manager refused to compile on NetBSD; this has been fixed.


Version 6.1.0 released on 2009-08-11

• NEW REPORT: A “Dormant Streams” report lists all streams that have not received mail in the last 60 days.

• NEW FEATURE: “Host” rules have been replaced by “Network” rules, which can apply to CIDR blocks as well as individual hosts.

• NEW FEATURE: We have *experimental* support for IPv6. Anywhere an IPv4 address can be used, so can an IPv6. And anywhere an IPv4 CIDR can be used, so can an IPv6 CIDR. Please note that there may be many untested edge cases, hence the designation “experimental”.

• NEW FEATURE (CanIt-Domain-PRO only): Realms can have an expiry date; when it nears, realm administrators get a warning. This lets hosting providers keep track of when customer services are to expire.

• MINOR NEW FEATURE: The trap display can be sorted by the domain of the sender.

• DOCUMENTATION FIX: The theming guide has been overhauled. It’s now linked from Setup : Wizards (just like all the other manuals.)

• CHANGE: The “Incident Note” feature has been removed. It cluttered the interface and is almost useless in CanIt-PRO and Domain-PRO.

• CHANGE: Internally, we use Mail::SPF rather than the deprecated Mail::SPF::Query to handle SPF lookups.

• POLICY CHANGE: When looking for SPF scoring rules, we recursively strip off domain components in the same way as for Domain Action rules.

• IMPROVEMENT: There is no need to manually create and maintain a script to synchronize Bayes data files. Instead, the CanIt cluster system automatically synchronizes Bayes data to all scanners.

• IMPROVEMENT: The text of the “Periodic Report” e-mail can be templated.

• IMPROVEMENT: We tokenize the “HELO” string for Bayes.

• MAJOR INTERNAL CHANGE: The internal mechanism used to run tasks across cluster members has been drastically overhauled and should be much more robust.

• PERFORMANCE IMPROVEMENT: CanIt can now use the PgBouncer connection pooler to reduce load on the database. PgBouncer is packaged for our Debian-based appliances.

• PERFORMANCE IMPROVEMENT: We use the “CDB” database format for storing Bayes data rather than Berkeley DB. CDB should be faster than Berkeley DB and the files are portable across operating systems and CPU architectures.

• IMPROVEMENT: If a message is oversize, we attempt to scan it anyway after removing non-text parts. (If the remaining text parts are still oversize, we do not scan the message for spam.) This should help considerably in catching spams that are artificially inflated with image attachments.


• BUG FIX: CanIt is now compatible with PostgreSQL 8.4.0.

• BUG FIX: The “Bogus MX” check now considers 0.0.0.0 to be bogus.

• BUG FIX: A performance regression when viewing very large traps has been fixed.

• BUG FIX: Crash-inducing typos in the SPF Rules page and the daily-mail-by-realm report have been fixed.
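
The recursive stripping of domain components mentioned in the SPF policy change above can be pictured roughly as follows. This is only a sketch in Python, not CanIt's actual code: the function names, the rule table and the choice to continue all the way down to the top-level domain are assumptions made for illustration.

```python
def domain_candidates(domain):
    """Return the domain followed by each parent domain, most specific first."""
    labels = domain.lower().strip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def find_rule(domain, rules):
    """Return the rule for the most specific matching domain, or None."""
    for candidate in domain_candidates(domain):
        if candidate in rules:
            return rules[candidate]
    return None


# Hypothetical rule table, keyed by domain.
rules = {"example.com": "score SPF failures at 5"}
print(domain_candidates("mail.sub.example.com"))
print(find_rule("mail.sub.example.com", rules))
```

Under these assumptions, a rule entered for a parent domain also covers its subdomains unless a more specific rule exists.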

Version 6.0.3 released on 2009-05-28

• NEW FEATURE: Sessions can be made to last longer than 8 hours with the new “Remember Me” checkbox on login screen. (Default Remember Me time is one week.)

• NEW FEATURE: The POP3, IMAP and Program external authentication methods let you strip the domain name from the login name to generate the home stream. For example, you can configure it so that “[email protected]” is placed in a home stream called “user”.

• NEW FEATURE: An administrator (or realm administrator) can choose to allow sender whitelisting/blacklisting directly from notification messages. The administrator can also set the default pulldown settings in notification messages to “Reject” or “Do nothing”.

• NEW REPORT: An administrator (or realm administrator) can pull a report showing the number of addresses seen per stream.

• IMPROVEMENT: In the “Clickable Webform” notification, clicking on the message subject displays the message body without requiring logging in.

• CHANGE: canit-storage-manager has an official IANA port number (6568). The default port has been changed to reflect that.

• FIX: The 70_sare_stocks.cf ruleset is obsolete and has been deleted.

• BUG FIX: The API server would sometimes return a 500 error code instead of a 404 not found code.

• BUG FIX: If an incoming message has an X-Spam-Flag: header, we delete it.

• BUG FIX: /etc/init.d/canit-system behaves more reliably if the database happens to be down when it is run.

• BUG FIX: When we download a new ruleset, we now signal all scanners to re-read the rules files.

• BUG FIX: Several other minor bugfixes and cosmetic improvements.
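
As an illustration of the domain-stripping option for external authentication described above, the mapping from login name to home stream might look like the sketch below. The function name, the `strip_domain` flag and the sample address are hypothetical and do not reflect CanIt's internal implementation.

```python
def home_stream(login, strip_domain=True):
    """Derive a home stream name from a login name.

    With strip_domain enabled, everything from the first "@" onward is dropped.
    """
    if strip_domain:
        return login.split("@", 1)[0]
    return login


print(home_stream("alice@example.com"))         # domain stripped
print(home_stream("alice@example.com", False))  # login used unchanged
```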


Version 6.0.2 released on 2009-04-20

• UPDATES: Updated to MIMEDefang 2.68 and ClamAV 0.95.1.

• PERFORMANCE IMPROVEMENT: When using Embedded Perl, scanner startup time is improved. Also, more memory can be shared among scanners, reducing the total memory footprint.

• IMPROVEMENT: In CanIt 6.0.0 and 6.0.1, incidents that expired out of the trap were never counted in daily statistics. Now they are counted in their own category (“Expired from Trap”).

• PERFORMANCE IMPROVEMENT: Performance of greylisting was improved on large and busy clusters by partitioning the greylisting table in the database.

• POLICY CHANGE: If the score for AutoRejectNoIncident is lower than AutoReject, we increase it to match AutoReject.

• IMPROVEMENT: canit-prepare-system warns if it notices that SELinux is enabled. It also sets reasonable defaults for mx_maximum in canit.conf.

• NEW SETTINGS: canit.conf has new settings in the [mimedefang] section: conserve_descriptors, md_required_fds and mx_required_fds. See /usr/share/canit/canit.conf for details.

• BUG FIX: Several typos in the HTML manuals were fixed.

• BUG FIX: The address-count-by-domain report used SQL that didn’t work on old versions of PostgreSQL; this has been fixed.

• BUG FIX: The Domain Setup Wizard would refuse to let you choose “Other” for the streaming method; this is fixed.

• BUG FIX: A minor rendering error on the Bayes Rules page was fixed.

• BUG FIX: The “Bulk Entry” page performs basic validation of entered data.

• BUG FIX: A bug in the Storage Manager Wizard that would only let you change one host’s Storage Manager Settings has been fixed.

• BUG FIX: PDF pie charts would render incorrectly if there were many entries; this has been (partially) fixed so that even if some entries are truncated, the pie graph displays correctly.

• BUG FIX: The “PhishingAddress” test worked, but put a nonsensical value in the list of fired test names. This has been fixed.
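
The AutoRejectNoIncident policy change above amounts to clamping one threshold against the other. A minimal sketch, assuming both settings are plain numeric scores (the function name is hypothetical):

```python
def effective_thresholds(auto_reject, auto_reject_no_incident):
    """Return (AutoReject, AutoRejectNoIncident) after applying the policy.

    AutoRejectNoIncident is never allowed to be lower than AutoReject.
    """
    return auto_reject, max(auto_reject, auto_reject_no_incident)


print(effective_thresholds(20, 15))  # second value is raised to match the first
print(effective_thresholds(20, 50))  # already higher, so left unchanged
```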

Version 6.0.1 released on 2009-03-25

• NEW FEATURE: CanIt can produce reports showing the number of valid e-mail addresses seen per domain (and per-realm, for CanIt-Domain-PRO.)


• BUG FIX: In certain very unusual situations, the database upgrade script could abort. This has been fixed.

• BUG FIX: The Storage Manager wizard in 6.0.0 did not work correctly with multiple storage manager nodes; this has been fixed.

• BUG FIX: If you create a periodic report with no charts, CanIt used to produce invalid PDF. Now, it produces a single-page PDF containing an error message.

• BUG FIX: On new installations only, the RunBayesJournal background task would die. This has been fixed.

• BUG FIX: A typo in version 6.0.0 would sometimes cause a filtering process to terminate abnormally. This has been fixed.

• BUG FIX (CanIt-Domain-PRO only): Deleting a realm would sometimes leave some realm data in the database. Now, it is all cleaned out.

• BUG FIX: Reports now have fields for realms and streams, rather than Yes/No fields “Show All Realms” and “Only This Stream”.

• BUG FIX: Non-root users could not create periodic reports; this has been fixed.

• BUG FIX: Under VMWare, the master multiplexor process could consume a lot of CPU time. This has been fixed (but we still do not recommend running CanIt under VMWare, especially the PostgreSQL database server.)

Version 6.0.0 released on 2009-03-12

• UPGRADE NOTE: It is no longer possible to upgrade versions of CanIt less than 4.0.0 to the current version. If you are running CanIt older than 4.0.0, you must first upgrade to 5.0.2 before upgrading to 6.0.0.

• NEW FEATURE: CanIt blocks mail to or from addresses on a dynamically-maintained phishing address list. This list is distributed several times a day over the RPTN distribution channel.

• NEW FEATURE: On an emergency, per-domain basis, you can block Delivery Status Notifications to cope with severe backscatter. (This feature is dangerous, so must be explicitly enabled under Setup : Features.)

• NEW FEATURE: CanIt can generate and e-mail PDF reports on a periodic basis. You can configure which reports you want and how often you want to receive them.

• POLICY CHANGE (Appliance Only): If your sources.list file contains a non-RoaringPenguin repository, automatic updates are suppressed; you have to run the update manually in this case.

• POLICY CHANGE: The nightly RPTN download submits some statistics back to the RPTN server for Roaring Penguin’s monitoring and analysis purposes. In particular, it submits the PostgreSQL version, license key, operating system name and version, CanIt version, count of number of hosts in your cluster, count of number of valid inbound email addresses and domains seen in the last 60 days, and daily cluster load statistics. No personally-identifying information is reported back.

• PERFORMANCE IMPROVEMENT: If you are using Storage Manager, we do not store anything in PostgreSQL for a Bayes signature unless it is trained. On busy systems, this can considerably reduce the load on the database.

• BUG FIX: Fixed some rendering errors in the Web interface on Opera and Chrome.

• BUG FIX: RPTN downloads now validate the server certificate against Roaring Penguin’s certification authority file.

• BUG FIX: The Storage Manager Wizard is better integrated with the Cluster Management GUI.

• BUG FIX: canitd, the CanIt Daemon, has been completely rewritten to improve reliability.

• BUG FIX: You can now specify a port number if PostgreSQL is listening on a non-standard port.

• BUG FIX: Silenced annoying (but harmless) log messages about duplicate keys from the LogLoad daemon task.

Version 5.0.2 released on 2009-01-15

• MAJOR POLICY CHANGE: Self-whitelists are ignored. That is, if the sender e-mail address is the same as the recipient e-mail address, any whitelist for that address is ignored. Similarly, if the sender domain is the same as the recipient domain (or a subdomain thereof), then any domain whitelist is ignored.

• NEW RULES: We ship a SpamAssassin plugin that detects many kinds of targeted phishing attempts (known as “spear phishing”). Look for the RP_PHISH rule in incident reports.

• PERFORMANCE IMPROVEMENT: Performance of Bayes training SQL queries has been improved.

• IMPROVEMENT: We do not auto-whitelist messages if the outgoing message looks like an out-of-office auto-reply.

• BUG FIX: “Import Rules” did not correctly import rules exported by “Export Rules”; this has been fixed.

• BUG FIX (Domain-PRO only): Only the site administrator can switch into the special “@@” streams (which are always in the base realm.)

• BUG FIX: The MIMEDefang SNMP agent reported inaccurate data; this has been fixed.

• BUG FIX: If the master “canitd” daemon lost connection with the PostgreSQL database, it would incorrectly stop various daemon tasks. This has been fixed.


• BUG FIX: Under certain circumstances, CanIt would inappropriately store an additional copy of released messages. This has been fixed.

• BUG FIX: (Appliance Only) Automatic upgrades were broken in 5.0.1; they are fixed in 5.0.2. (Note that we released interim 5.0.1 packages that fixed the problem also, so very few appliances should be affected.)

• BUG FIX: The Sendmail address mapper Path in the database was not updated to reflect the new location of sendmail-account-info.pl. It is now.
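
The self-whitelist policy introduced in 5.0.2 can be summarized with a short sketch. It is illustrative only: the function names and the case-insensitive comparison are assumptions, and CanIt's real rule evaluation involves far more than this check.

```python
def _domain(address):
    """Return the lower-cased domain part of an e-mail address."""
    return address.rsplit("@", 1)[-1].lower()


def whitelist_applies(sender, recipient, kind):
    """Decide whether a whitelist of the given kind may be honoured.

    kind is "address" for a sender whitelist or "domain" for a domain whitelist.
    """
    if kind == "address":
        # A sender whitelist is ignored when sender and recipient are identical.
        return sender.lower() != recipient.lower()
    if kind == "domain":
        # A domain whitelist is ignored when the sender domain equals the
        # recipient domain or is a subdomain of it.
        s, r = _domain(sender), _domain(recipient)
        return not (s == r or s.endswith("." + r))
    raise ValueError("kind must be 'address' or 'domain'")


print(whitelist_applies("a@example.com", "a@example.com", "address"))
print(whitelist_applies("b@mail.example.com", "a@example.com", "domain"))
print(whitelist_applies("b@other.org", "a@example.com", "domain"))
```

The point of the policy is that spammers commonly forge the recipient's own address or domain as the sender, so a whitelist matching the recipient's own identity offers no real assurance.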

Version 5.0.1 released on 2008-12-03

• NEW RPMS: We now supply RPMs for Fedora 10 on i386. NOTE: This will be the LAST version of Fedora for which we will supply RPMs. For future Fedora releases, you will need to install CanIt from source.

• UPDATE: ClamAV updated from 0.94.1 to 0.94.2.

• IMPROVEMENT (Appliance Only): The upgrade process on appliances has been improved; e-mail notifications are more meaningful. You can configure Automatic vs Manual upgrade via the Web interface. The system will refuse to do automatic upgrades on a cluster.

• COSMETIC IMPROVEMENT: If the geolocation code cannot determine the location of a relay, a special small flag is shown rather than “Location Unknown”.

• BUG FIX: The RPMS for Red Hat Enterprise Linux 3 did not work because of the ancient version of Perl on RHEL3. We have since made the code work.

• BUG FIX: The “Reset Inheritance” button for stream settings did not work in 5.0.0. It now works as designed.

• BUG FIX: We inadvertently used a PHP function only available in newer versions of PHP. This broke some reports. We’ve fixed the code to use functions available in all supported versions of PHP.

• BUG FIX: The sanity-checker emitted false reports of failed ticker tasks. This has been fixed.

• BUG FIX: New cluster members automatically register themselves as scanners. While this may not always be the case, it almost always is and is a better default behaviour.

• BUG FIX: The country-name selection menu on the Rules : Countries page was too narrow in Internet Explorer. This has been fixed.

• BUG FIX: The CanIt API server would ignore realm restrictions for certain stream-listing queries. This has been fixed.

• BUG FIX: Advanced Search would sometimes double-escape input data, leading to failed searches that really should have succeeded.

• BUG FIX: Various minor PHP and JavaScript warnings have been fixed.


Version 5.0.0 released on 2008-11-18

• MAJOR NEW FEATURE: CanIt determines the country in which a sending relay is located. You can make rules based on sending country; geolocation information is also used as Bayes tokens.

• MAJOR NEW FEATURE: Cluster management has been completely revamped. It is much easier to set up a cluster now.

• MAJOR NEW FEATURE: The cluster-management system collects performance data for all scanners in the cluster; the Web interface lets you plot the data minute-by-minute, hour-by-hour or day-by-day.

• MAJOR NEW FEATURE: CanIt can do dictionary-attack detection and block abusive hosts at the firewall level. This feature is ONLY available on Linux.

• MAJOR NEW FEATURE: The Web interface includes a Domain Setup Wizard that walks you through all the major steps required to set up a new domain.

• NEW FEATURE: You can set expiry dates on most rules. This lets you avoid “rule creep” as many rules accumulate and last forever.

• NEW FEATURE: The Administration Guide and Users Guide are available in HTML format. Most CanIt GUI pages link to corresponding manual sections.

• NEW FEATURE: Auto-whitelists now expire (by default after 180 days).

• NEW FEATURE: SNMP tools are packaged with CanIt. Note that you need to install and configure net-snmp yourself to enable the SNMP tools.

• IMPROVEMENTS: Many additional reports were added; existing reports were made more configurable.

• MAJOR IMPROVEMENT (Appliance Only): The text-based interface for setting up a CanIt appliance has been completely rewritten and is much more usable and stable.

• MAJOR IMPROVEMENT: Sender and Domain rules work both with the SMTP envelope sender and the address in the From: header. The previous behaviour of ignoring From: was very confusing to end-users.

• MAJOR IMPROVEMENT: The “sanity checker” that checks for common misconfigurations has been revamped. It now e-mails the CanIt administrator if it discovers problems.

• MAJOR CHANGE: The “ticker” is gone. Replacing it is the CanIt daemon “canitd” that runs on all hosts in a cluster.

• MAJOR IMPROVEMENT: The different CanIt startup scripts have been unified into /etc/init.d/canit-system which starts or stops all processes required on a particular node.

• IMPROVEMENT: You can set “Allow Unauthorized Voting” on a per-stream (hence per-realm in Domain-PRO) basis.


• POLICY CHANGE: SURBL is now charging large users for access to SURBL data. Please read the comments in /etc/mail/spamassassin/72_score_compensate.cf. Similar comments apply to the URIBL.COM URI blocklist.

• POLICY CHANGE: Support terms have been changed. We have increased our excess support fee from $75/hour to $100/hour, and have added the following clause: After Hours Support: If mail delivery is interrupted, we reserve the right to make minimal changes to get mail flowing again (including disabling filtering entirely) until our normal office hours, at which time we will attempt to make a complete correction.

• POLICY CHANGE: By popular demand, the “Whitelisted Action” for filenames and extensions also applies if a domain or host is whitelisted, and not only if an actual sender is whitelisted.

• POLICY CHANGE: We avoid greylisting very large messages (to conserve bandwidth).

• UPDATE: Updated ClamAV to version 0.94.1.

• CLEANUP: SpamAssassin rules shipped with CanIt were cleaned up and updated.

• CHANGE: The “Sendmail” address-mapping method has been removed. The upgrade process replaces any Sendmail methods with an equivalent Program method.

• CLEANUP: Many global variables that had identical per-stream variables have been removed (the globals were really only necessary for plain-CanIt and are not needed for CanIt-PRO or CanIt-Domain-PRO.)

• CLEANUP: Several global variables that were of marginal use have been removed and the recommended behaviour has been hard-coded.

• CLEANUP: Many scripts and paths have been moved. We place most scripts under /usr/share/canit/scripts rather than in /etc/mail/canit.

• CLEANUP: The entire concept of “one-shot messages” has been removed. It was not useful and served mostly to confuse.

• CLEANUP: “Hit-and-Run” is now consistently referred to as “Greylisting” to keep in line with standard terminology.

• IMPROVEMENT: The formerly-global “Auto-populate pending notification addresses” is now per-stream (therefore per-realm in CanIt-Domain-PRO.)

• IMPROVEMENT: The internal storage-manager code has been changed to make it easier to add, remove and rename storage-manager nodes.

• MAJOR REORGANIZATION: Most configuration files (mimedefang.conf, db-settings, etc.) have been reorganized into one master configuration file canit.conf.

• BUG FIX: Mail log rotation on Debian-based appliances has been fixed. Previously, logs could be rotated twice in quick succession.

• OTHERS: Many other minor bug-fixes and improvements.


Version 4.1.3 released on 2008-08-13

• UPDATES: Updated ClamAV from 0.93.1 to 0.93.3

• BUG FIX: The global setting “Silently discard rejected messages rather than remailing with ticker” was removed from the Web interface, but not from the filter code. On busy servers, this could result in large queues on the ticker host and delays in remailing released messages.

• BUG FIX: A missing JavaScript check on one of the “Reject All as Spam” buttons in the trap display was fixed.

• BUG FIX: The CanIt Domain Routing API call would fail if you supplied only one server for domain routing. This has been fixed.

Version 4.1.2 released on 2008-06-16

• UPDATES: Updated SpamAssassin from 3.2.3 to 3.2.5. Updated ClamAV from 0.92.1 to 0.93.1. NOTE INCOMPATIBILITY: Some clamd options have been removed; you MUST remove them from your clamd.conf file or clamd will refuse to start. (CanIt appliances will automatically remove the options.) The options to remove are: ArchiveMaxFileSize, ArchiveMaxRecursion, ArchiveMaxFiles, ArchiveMaxCompressionRatio and ArchiveBlockMax.

• POLICY CHANGE: On new installations, the default for “Tempfail Suspect Messages” is now “Never” rather than “Until-Dispatched”. We decided the change was necessary to avoid amplification effects on very busy systems and to avoid support queries when people release mail after several days.

• NEW BINARY PACKAGES: We have added support for Fedora 9 RPMs on i386. We have dropped binary packages for Fedora Core 5 and 6.

• IMPROVEMENT: We have provided a new command-line tool called “canit-api-client”. You should begin using it rather than “canit-cmd” because “canit-cmd” will be removed in CanIt 4.2.0.

• WORKAROUND: We implemented a workaround for Outlook 2007’s broken form-handling behaviour; it lets you accept or reject individual messages from the notification e-mail.

• IMPROVEMENT (CanIt-Domain-PRO only): “Source Address of CanIt Notifications” and “Full name for sender of CanIt notifications” can now be set on a per-realm basis.

• IMPROVEMENT: Global Settings and Stream Settings are now grouped into related sets. You can hide the display of sets you’re not interested in seeing.

• IMPROVEMENT: If the DNS lookup to find the RPTN version fails, the System Check page alerts the administrator.


• IMPROVEMENT: In the trap display, if you have accepted any messages but click “Reject All as Spam”, CanIt prompts for confirmation first.

• WORKAROUND: If the system time on the ticker is set far in the future and then reset to the correct time, ticker tasks may not run. The ticker code now detects clock skew and compensates for it.

• IMPROVEMENT/BUG FIX: The API server has had many validation bugs fixed.

• BUG FIX: The PostgreSQL failover module has been updated with various bug fixes as well as a workaround for bugs in PostgreSQL 8.3.0 and 8.3.1.

• BUG FIX: If the database is still starting up, the CanIt Storage Manager startup script keeps trying for a while before giving up.

• BUG FIX: A quoting error in decoding certain encoded subject lines has been fixed.

• BUG FIX: The Authentication Mapping web page did not handle a Filter correctly. This is now fixed.

• BUG FIX: The “Dashboard” Web page did not respect all permissions correctly. This is now fixed.

• BUG FIX: The ForceToStream attribute is ignored for mail originating from the loopback address. The old behaviour would sometimes cause released mail to be re-trapped in a different stream.

• BUG FIX: CanIt would sometimes leave SpamAssassin temporary files littering /tmp; this has been fixed.

Version 4.1.1 released on 2008-04-03

• WORKAROUND: “Clickable Webforms” do not work with Outlook 2007 and cannot be made to work; we added a note to that effect for Outlook 2007 users.

• COSMETIC FIX: The “View System Load” button on the Server Management page was absurdly big. It now matches the other buttons.

• BUG FIX: If users defaulted to the “Simplified Interface”, the RSS feed did not work. This is now fixed.

• BUG FIX: Templates for “Clickable Webform” pending notifications were not being set correctly. This is now fixed.

• BUG FIX: Pending Messages were not being triggered as incidents were created. This is now fixed.

• BUG FIX (Domain-PRO only): “Clickable Webform” sometimes did not work, depending on which realm the user was in. This is now fixed.


• BUG FIX: Updating Templates didn’t take effect immediately; this is now fixed.

• BUG FIX: The canit-cmd tool and the API server would not let you set the treat_as_mx flag for verification servers. This is now fixed.

• BUG FIX: The API server did not work on some versions of Red Hat because Red Hat ships a truly ancient version of Sys::Syslog. We have worked around the problem.

• BUG FIX: We inadvertently packaged the wrong version of the PostgreSQL failover code. This has been fixed.

Version 4.1.0 released on 2008-03-25

• MAJOR NEW FEATURE: Users can configure an RSS feed of pending incidents. This allows you to use your favourite RSS feed reader to monitor your trap.

• MAJOR IMPROVEMENT: Pending notifications are only sent out if there are new pending messages. Also, you can configure notifications to be submittable forms; this allows you to accept or reject messages directly from your e-mail client without having to authenticate with CanIt.

• IMPROVEMENT: You can restrict the days on which Pending Notifications are sent. (You can skip them on weekends, for example.)

• IMPROVEMENT: We use the creation date of an incident when logging it in the statistics tables rather than the resolution date. However, pending incidents that are resolved some time after they are created appear only in daily statistics, not hourly statistics.

• NEW FEATURE: You can set the full name for the source of CanIt notifications.

• IMPROVEMENT: CanIt can be disabled for maintenance from the Web interface. You no longer have to create /etc/mail/canit/disabled on all machines.

• POLICY CHANGE: We have disabled all spamhaus.org DNS-based RBLs. Spamhaus is becoming more adamant about enforcing its terms-of-use; if you wish to use Spamhaus-based tests and do not qualify for free use of the RBLs, please arrange directly with Spamhaus for a data feed contract.

• POLICY CHANGE: We have removed the Sendmail domain-mapping method. The upgrade script replaces it with a Program method for backward-compatibility.

• PERFORMANCE IMPROVEMENT: Marking old incidents as spam has been moved out of the cron job into a ticker task that can operate more leisurely.

• PERFORMANCE IMPROVEMENT: If a stream has too little Bayes training, we don’t use that stream’s training database. (Before, we’d add the database totals together and if the total was large enough, we would use all the data.) This greatly improves performance on sites with many streams where most of the streams have little Bayes data.


• IMPROVEMENT: Log the IP address of HTTP clients in incident logs.

• IMPROVEMENT: The site administrator can temporarily disable pending notifications.

• SECURITY IMPROVEMENT (CanIt-Domain-PRO only): Only the super-root can create “Program” or “Program Legacy” user-lookups.

• GUI IMPROVEMENT: The GUI prompts for confirmation before deleting rules, users, etc.

• UPDATES: Update to ClamAV 0.92.1 and Net::DNS 0.63.

• BUG FIX: Fix internal handling of “duplicate key violation” error message from PostgreSQL 8.3.

• BUG FIX: Various bugs in the CanIt-API server were fixed.

• BUG FIX: Build problem on NetBSD and FreeBSD was fixed.

• BUG FIX: Under certain conditions, CanIt would break S/MIME signed messages. This has been fixed.

• BUG FIX: Storage-manager startup script incorrectly distinguished upper- and lower-case in host names. This has been fixed.

• BUG FIX: Prohibit deletion of @@READABLE and @@WRITABLE permission-sets (in base realm only in CanIt-Domain-PRO.)

• BUG FIX: The PHP code would fail if one Storage Manager node was marked read-only. This has been fixed.

• BUG FIX (CanIt-Domain-PRO only): Invalid recipients would always be logged in the “base” realm’s statistics instead of the correct realm. This has been fixed.
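
The change in how sparsely-trained Bayes databases are selected (see the performance improvement above) can be contrasted in a few lines. This is a sketch under stated assumptions: the threshold value, the function names and the per-stream training counts are invented for illustration and are not CanIt's real figures.

```python
MIN_TRAINED = 200  # hypothetical minimum number of trained messages


def usable_streams_old(counts, minimum=MIN_TRAINED):
    """Old behaviour: add all totals together; if the sum is large enough, use every stream."""
    return list(counts) if sum(counts.values()) >= minimum else []


def usable_streams_new(counts, minimum=MIN_TRAINED):
    """New behaviour: use only streams that individually have enough training."""
    return [stream for stream, n in counts.items() if n >= minimum]


# Hypothetical training counts per stream.
counts = {"default": 5000, "alice": 3, "bob": 250}
print(usable_streams_old(counts))
print(usable_streams_new(counts))
```

With many streams that each hold only a handful of trained messages, the new selection skips those databases entirely, which is where the performance gain comes from.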

Version 4.0.3 released on 2007-12-11

• NEW FEATURE: You can enter multiple verification servers (separated by commas) for a given domain. CanIt tries the servers in order until it receives a definite success or failure indication.

• NEW FEATURE (CanIt-Domain-PRO only): The site administrator can view statistics aggregated across all realms.

• CHANGE: We emit key/value logging information when an incident is held or streamed.

• POLICY CHANGE: All of the products (CanIt, CanIt-PRO and CanIt-Domain-PRO) now use the same filter file. NOTE INCOMPATIBILITY: IF YOU HAVE MODIFIED YOUR FILTER FILE, BE SURE TO TEST YOUR MODIFICATIONS WITH THE NEW FILTER BEFORE INSTALLING ON A PRODUCTION SERVER. If you have not modified your filter file, an upgrade will proceed safely.


• PERFORMANCE IMPROVEMENT: If you use Storage Manager, the expiry job prunes storage-manager data in the background.

• BUG FIX (Standard CanIt only): The displayed RPTN statistics would be incorrect if local Bayes training occurred. This has been corrected.

• BUG FIX: A PHP warning on the Reports page was corrected.

• BUG FIX: Some minor CSS settings were updated to work better with Internet Explorer.

• BUG FIX: Some minor bugs in the CanIt API and canit-cmd were fixed.

• BUG FIX: Errors in rendering pie charts with very small wedges were corrected.

• BUG FIX: The upgrade code from CanIt to CanIt-PRO would fail to initialize some CanIt-PRO templates; this has been fixed.
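
The ordered fallback across multiple verification servers described in the first item above might be sketched as follows. The `check` callback is a hypothetical stand-in for the real verification query; it is assumed to return True (address valid), False (address invalid) or None (no definite answer, for example on a timeout).

```python
def verify_recipient(address, servers, check):
    """Try each comma-separated server in order until one gives a definite answer."""
    for server in servers.split(","):
        server = server.strip()
        if not server:
            continue
        result = check(server, address)
        if result is not None:
            return result  # definite success or failure: stop here
    return None  # no server could give a definite answer


# Hypothetical canned answers standing in for real verification queries.
answers = {"mx1.example.com": None, "mx2.example.com": False, "mx3.example.com": True}
result = verify_recipient(
    "user@example.com",
    "mx1.example.com, mx2.example.com, mx3.example.com",
    lambda server, address: answers[server],
)
print(result)
```

In this sketch the third server is never consulted, because the second one already returned a definite failure.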

Version 4.0.2 released on 2007-11-27

• NEW PACKAGES: We have Debian packages for Etch as well as sarge, and a new Etch-based ISO image.

• NEW FEATURE: The “Stream Settings” page can show where each setting comes from (in other words, which stream it is inherited from.)

• PACKAGING IMPROVEMENTS: Several formerly appliance-only features such as PostgreSQL failover and configuring mail routing from within the CanIt web interface are now available in the RPM versions.

• BUG FIX: The file wal_archive_command.pl was inadvertently left out of the failover packages; this has been corrected. Additionally, we include a sample failover configuration file.

• BUG FIX: The charts in the Statistics page were adjusted to avoid cutting off Y-axis labels.

• BUG FIX (CanIt-Domain-PRO only): The Known Networks page would insist on an entry in the force-to-stream column even if you didn’t want one. This has been fixed.

• BUG FIX: Storage Manager would not compile on some old C compilers. This has been fixed.

• BUG FIX: The index on the daily statistics table was suboptimal, leading to slow queries. This has been fixed.

• BUG FIX: The Stream Permission page did not expand/contract properly under Internet Explorer. This has been fixed.


Version 4.0.1 released on 2007-10-25

• NEW FEATURE: A stream setting can limit the maximum number of entries in the Valid Recipients Table. By removing permissions from this entry, a site administrator can limit the maximum number of valid recipients per stream.

• NEW FEATURE: A verification server can listen on a non-standard port; use “servername/port” in the Verification Server table.

• EXPERIMENTAL FEATURE: We calculate Bayes probability based on the Robinson-Fisher calculation. This calculation is not used, but information about it appears in headers and reports. More testing is needed to see if the Robinson-Fisher calculation is actually any better than the Naive Bayes calculation.

• POLICY CHANGE: The main Reports page now shows statistics for all streams if the user has root privileges. Unprivileged users only see statistics for their particular stream.

• IMPROVEMENT: The canit-convert-statistics.pl script now works with standard CanIt as well as CanIt-PRO and CanIt-Domain-PRO.

• BUG FIX: On standard CanIt appliances, we accidentally omitted the Perl Log::Syslog::Abstract module. It is now correctly included.

• BUG FIX: Fixed compilation failure of canit-storage-manager on FreeBSD 5.0.

• BUG FIX (Domain-PRO only): The web interface insists on a fully-qualified stream name for the “Force-to-Stream” attribute.

• BUG FIX: On *new* installations only, the init-database script did not create /etc/mail/canit/db-settings. This is now fixed.

• BUG FIX: The CanIt REST-based API did not return the same information for list-active-streams as the Web interface. This has been fixed.
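
To illustrate the “servername/port” notation for verification servers introduced in this release, here is a minimal parsing sketch. The function name is hypothetical, and the fallback to port 25 (the standard SMTP port) when no port is given is an assumption.

```python
DEFAULT_PORT = 25  # assumption: standard SMTP port when none is specified


def parse_server(entry):
    """Split a "servername/port" entry into (host, port)."""
    host, sep, port = entry.strip().partition("/")
    return host, (int(port) if sep else DEFAULT_PORT)


print(parse_server("mail.example.com/2525"))
print(parse_server("mail.example.com"))
```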

Version 4.0.0 released on 2007-10-15

• MAJOR NEW FEATURE: Statistics and reporting have been completely reworked. There are many more reports available and if your PHP installation has the GD extension, you get graphical charts.

• MAJOR NEW FEATURE: We have scripts for automatic database failover using PostgreSQL’s Point-in-Time-Recovery feature. The failover feature is only available on our Debian-based appliances, however.

• MAJOR ARCHITECTURAL CHANGE: CanIt includes a dedicated “Storage Manager” daemon for storing large blocks of textual data. This cluster-aware daemon greatly relieves the load on the PostgreSQL database. It should considerably shrink the size of the database, with attendant improvements in expiry, VACUUM and dump times and overall performance.


• MAJOR NEW FEATURE (PRO and Domain-PRO only): The command-line tool has been replaced with a completely-new REST-based API. (We ship a replacement command-line tool that uses the REST API rather than direct database manipulation.)

• NEW FEATURE (Appliance Only): You can specify that the appliance is to treat a name in the domain routing table as an MX record rather than a host name. This allows for load-balancing back-end servers using DNS.

• NEW FEATURE: Each stream can request mail to be blind-carbon-copied to an additional e-mail address.

• NEW FEATURE: We have implemented a mechanism similar to RPTN for automatically pushing out SpamAssassin rules.

• NEW FEATURE: The cron job can be configured to rotate nightly dumps, keeping a configurable number of nightly dumps.

• NEW FEATURE: “Known Networks” has been enhanced to include a pseudo-network called “SMTP-AUTH”. Settings for that network apply to senders who authenticate using SMTP AUTH.

• NEW FEATURE: A ’*’ entry in the domain for a verification server acts as a wildcard. Do not use this feature if your CanIt server relays outbound mail!

• NEW FEATURE: You can specify how many hours to keep mail in “Current Statistics.” The default setting of three days is much too long on very busy mail servers.

• UPGRADES: Upgraded bundled software: SpamAssassin from 3.1.8 to 3.2.3; ClamAV from 0.90.3 to 0.91.2.

• PACKAGING CHANGES: We have added packages for Fedora 7. We have dropped binary packages for Solaris, Fedora Core 3 and Fedora Core 4. You will need to install from source on those platforms.

• GUI IMPROVEMENT: Release notes and all PDF manuals are accessible from the Web interface.

• GUI IMPROVEMENT: The “System Check” page has been improved to show the results of all system tests. It also shows the currently-loaded RPTN and ruleset versions.

• IMPROVEMENT: We tokenize additional parts of messages for Bayes, such as the sending relay and the local and domain parts of the envelope sender.

• UPDATE: Removed “Chickenpox” SpamAssassin rules.

• CHANGE: All relics of the old CanIt-SMB codebase have been removed and consolidated into CanIt-PRO.

• GUI CHANGE: Web pages are rendered in UTF-8 rather than ISO-8859-1 character set.


• CHANGE: The X-Antispam-Training headers are renamed to X-Antispam-Training-{Forget,Nonspam,Spam}. Some marginal mail clients seem to delete multiple headers with the same name.

• MINOR IMPROVEMENT (PRO, Domain-PRO only): Administrators can sort incidents by stream when viewing the “*” pseudo-stream.

• GUI IMPROVEMENT: The GUI character set is now UTF-8. This should allow for more accurate display of message subjects in non-Western character sets.

• MINOR IMPROVEMENT: The X-Bayes-Prob header now lists which streams’ tokens were used. This makes it easier to verify that RPTN is being used.

• MINOR IMPROVEMENT: The sample script for synchronizing Bayes databases uses the -S and -O options with rsync (if your version of rsync supports them.)

• MINOR IMPROVEMENT: The administrator can set the “Forgot your Password?” link from Setup : Templates.

• RULE CHANGE: Removed the VIRUS_WARNING64 SpamAssassin rule, which could cause false-positives.

• BUG FIX (appliances only): You could not edit a domain-routing entry. (You would have to delete/add it). Editing now works properly.

• BUG FIX (PRO, Domain-PRO only): Editing an Address Mapping in the Web interface now explicitly clears the “cached” flag.

• BUG FIX: The BccAddress stream permission was not correctly granted by a database upgrade. Now fixed.

• BUG FIX: Obsolete settings are now correctly deleted from the setting_desc table upon database upgrade.

• BUG FIX: The attachment-stripping code would sometimes log a filename of “unknown” rather than the proper filename.

• BUG FIX: The watch-clamd script uses a lock to prevent two concurrent instances.

• BUG FIX: The “Domain Rules” page accepts top-level domains like “jp” or “ca” without complaining that they are invalid.

• BUG FIX (Debian appliances only): Ownership and permissions of /var/lib/clamav have been fixed.

• BUG FIX: The incident creation code has been cleaned up to reduce the chance of race conditions.

• BUG FIX (Plain CanIt only): If RPTN downloads are disabled, do not use RPTN data (it is probably stale anyway.)


• BUG FIX: Sendmail accepts addresses like and <\d\f\[email protected]> equivalently. This can mess up AsIs or ChopDomain streaming, so CanIt canonicalizes addresses by removing backslashes.

• BUG FIX: The Simplified Interface is disabled if you’re in the “default” stream. (Before, it would appear but do nothing because default’s inheritance cannot be changed.)

• BUG FIX: In several places in the Web interface, potentially-dangerous actions were performed by a GET request. These have all been converted to POST to make it difficult to accidentally bookmark a dangerous request.

• BUG FIX: The “WHOIS” page would sometimes lose the “s=xxx” parameter in the URL, causing it to refuse to send abuse complaints. This has been fixed.

• BUG FIX: If “One-Shots” are disabled, the Show Active Streams page no longer has a One-Shot column.

• BUG FIX: Generating the “Hit-and-Run” report could consume huge amounts of memory, causing PHP to abort. This has been fixed.

• BUG FIX: We test mail against a Verification Server before attempting to stream it. This can reduce load considerably in some configurations.
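The address canonicalization mentioned in the notes above (removing backslashes so that equivalent spellings of an address land in the same stream) can be illustrated with a minimal Python sketch. The function name and the sample address are hypothetical; this is only an illustration of the idea, not CanIt's actual code.

```python
def canonicalize_address(address: str) -> str:
    """Return the address with every backslash removed.

    Illustrative only: a hypothetical helper, not CanIt's implementation.
    """
    return address.replace("\\", "")


# Both spellings of a made-up address canonicalize to the same string,
# so a streaming method that uses the address as-is sees a single stream.
plain = canonicalize_address("<user@example.com>")
escaped = canonicalize_address("<\\u\\s\\e\\r@example.com>")
```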

Version 3.4.6 released on 2007-07-05

• ENHANCEMENT: Hit-and-Run detection now takes into account mutating message subjects, making greylisting even more effective.

• UPGRADE: Packaged version of ClamAV has been upgraded to 0.90.3.

• GUI ENHANCEMENT (Appliance Only): The Domain Routing page has a Filter box that lets you limit which domains are displayed.

• GUI ENHANCEMENT: The message display page makes better use of screen real-estate to reduce back-and-forth scrolling.

• POLICY CHANGE (PRO and Domain-PRO only): We now do greylisting (AKA Hit-and-Run Detection) before streaming. If a message comes in for more than one stream, and *all* of the streams have greylisting enabled, we greylist the message.

• POLICY CHANGE (PRO and Domain-PRO only): Greylisting applies even to opted-out streams. (However, since you can now disable greylisting on a per-stream basis, you can turn off greylisting for opted-out streams if you wish by explicitly disabling greylisting.)

• BUG FIX: The RPTN download task obeys proxy settings in environment variables. (See the LWP::UserAgent man page for details.)


Version 3.4.5 released on 2007-04-30

• UPGRADE: Packaged version of ClamAV has been upgraded to 0.90.2.

• MINOR ENHANCEMENT: The message display page displays only the main headers (Return-Path, From, To, Subject and Date) by default, with a JavaScript link to reveal/hide all headers. NOTE INCOMPATIBILITY: If you have re-themed CanIt, please test your theme. The base theme now uses an external JavaScript library rather than emitting inline JavaScript. If your themes inherit from rather than replace our themes (the recommended approach), they will most likely work fine.

• BUG FIX (PRO and Domain-PRO only): If an attachment is stripped from an e-mail that has no text/plain or text/html parts, we add the notice of stripping as a separate text/plain part.

• BUG FIX (PRO and Domain-PRO only): Parameter validation was improved; previously, a stream named “0” was not allowed.

• BUG FIX (PRO and Domain-PRO only): If a stream inherited from another stream, the inheriting stream’s owner could get spurious Pending Notifications. This has been fixed.

Version 3.4.4 released on 2007-04-09

• UPGRADE: Packaged version of ClamAV has been upgraded to 0.90.1. NOTE INCOMPATIBILITY: ON SOURCE AND RPM INSTALLATIONS, YOU MAY NEED TO EDIT clamd.conf AND freshclam.conf. THE SYNTAX HAS CHANGED. Instead of single words like “LogSyslog”, you must now use “LogSyslog Yes”. PLEASE VERIFY YOUR CLAM CONFIGURATION FILES AFTER UPGRADING. See http://wiki.clamav.net/Main/UpgradeNotes090 for details. On our Debian-based CanIt appliances, the upgrade script will fix the Clam configuration files automatically.

• MINOR CLEANUP: The cron job removes its lock file when it has finished. There is no harm from leaving the lock file lying around, but it is cleaner to remove it.

• BUG FIX: All hard-coded colors in the PHP code have been eliminated in favour of CSS stylesheets and classes.

• BUG FIX: A typo in the PHP code could produce a PHP warning; this has been fixed.

• BUG FIX (CanIt-PRO and higher only): Inheritance of the “Use Simplified GUI” preference was broken; it is now inherited just like other preferences.

• BUG FIX (CanIt-PRO and higher only): A User Lookup would not accept a comma-separated list of servers unless there was whitespace around each comma. This has been fixed so the whitespace is allowed but not required.


• BUG FIX (CanIt-PRO and higher only): The “Filter” on the Inheritance page now filters both by stream and inherited-stream.

• BUG FIX (CanIt-Domain-PRO only): If you deleted a realm mapping, some settings for a domain (Verification Server, Authentication Mappings and Domain Mappings) would be “orphaned”. Now, they are moved into the new realm that contains the domain.

Version 3.4.3 released on 2007-03-26

• NEW ARCHITECTURE: We now have RPM packages for Red Hat Enterprise Linux 5.

• BUG FIX: The VACUUM and database backup cron jobs would fail on password-protected databases. They now work correctly.

• BUG FIX: When Bayes votes were acted upon, CanIt forgot to record the vote in the Bayes signature table (although it did correctly update the Bayes statistics.) The vote is now properly recorded.

• BUG FIX (CanIt-PRO and higher only): The Web interface and Perl code disagreed about the default value for “Permit use of auto-whitelisting”. They are now in agreement.

• BUG FIX: The cron script would sometimes fail to find required Perl modules on Solaris, and the locking mechanism would fail on Solaris. Both of these problems have been fixed.

• BUG FIX: When switching streams after doing an Advanced Query, the query would be forgotten. It is now correctly remembered.

• BUG FIX (CanIt only): Some PHP pages such as Bayes Settings would fail to render on Standard CanIt. This has been fixed.

• BUG FIX: The host IP in the Report pages did not link to a proper WHOIS query URL. This has been fixed.

Version 3.4.2 released on 2007-03-13

• NEW FEATURE: When you create an LDAP user-lookup, you can specify a connect timeout for streaming. (The timeout does not apply to authentication because PHP unfortunately lacks a mechanism to specify the timeout.)

• TRANSLATION IMPROVEMENT: The French and Spanish translations have been updated.

• BUG FIX: The RPM installer correctly recognizes all versions of CentOS 4.

• BUG FIX: Some architecture-specific Debian packages were incorrectly marked “all”. They are now marked “i386” as they should have been.

• BUG FIX: Hit-and-run statistics could be incorrect depending on your greylisting settings. This has now been fixed.


• BUG FIX: On some versions of DBI and PostgreSQL, the expiry job could fail with a bunch of SQL syntax errors being emitted. This has been fixed.

• BUG FIX: When verifying RPTN signatures, gpg would sometimes warn about insecure memory. This warning has been suppressed.

• BUG FIX: In the 3.4.x series, whitespace was stripped from the beginning and end of all form entries. If your password began or ended with a space, this would make logging in impossible. 3.4.2 no longer strips spaces from password-entry boxes.

• BUG FIX: Entering a blank or non-numeric score for an RBL rule would cause a PHP error. This has been fixed.

• BUG FIX: Upgrading from CanIt to CanIt-PRO could fail when updating the Bayes table. This has been fixed.

• BUG FIX: Some missing “NOT NULL” column constraints were added.

• BUG FIX: A useless warning “Non-multipart entity with no bodyhandle??” would sometimes appear in the mail logs; this has been removed.

• BUG FIX: Warnings about undefined variables if a sender is whitelisted due to SMTP AUTH have been suppressed.

Version 3.4.1 released on 2007-03-01

• BUG FIX: Fix a bug in the database upgrade code which could make the schema upgrade fail.

• BUG FIX: Remove a harmless but annoying PHP warning from the trap display.

Version 3.4.0 released on 2007-02-28

• MAJOR NEW FEATURE: The old “Access Rights” page is gone. In its place are far more flexible “Permissions” pages. These allow fine-grained control over permissions on a per-user and per-group basis. The database upgrade code should migrate the old Access Rights to the new Permissions accurately.

• MAJOR CHANGE (CanIt-PRO and Domain-PRO only): The old “Stream Redirection” concept is gone. In its place we have “Stream Inheritance”. This is more flexible and simplifies the code a lot. The database upgrade code will migrate Redirection to Inheritance.

• MAJOR IMPROVEMENT (CanIt-PRO and Domain-PRO only): Hit-and-Run (also known as “Greylisting”) can be enabled on a per-stream basis. However, greylisting is now always done after the DATA phase of SMTP.

• GUI IMPROVEMENT (CanIt-PRO and Domain-PRO only): When you are viewing the ’*’ pseudo-stream, many things are writable (you can dispose of incidents, change rules, etc.)


• IMPROVEMENT (CanIt-PRO and Domain-PRO only): The canit-cmd command-line tool has additional commands and is fully modular.

• PERFORMANCE IMPROVEMENT: A new caching scheme has reduced the number of database queries per e-mail substantially, sometimes by more than 50%.

• PERFORMANCE IMPROVEMENT: Many internal code changes and code refactoring have been performed to reduce memory usage and improve performance.

• NEW FEATURE: The Verification Servers feature allows you to queue mail (rather than tempfail it) if the verification server is unreachable. (The default is still to tempfail mail.)

• NEW FEATURE (CanIt-Domain-PRO Appliance Only): Realm administrators can set up mail routing for domains within their realms.

• NEW FEATURE (CanIt-PRO and higher only): Locked Addresses can be locked to a comma-separated list of domains or addresses, any of which is permitted to send mail to the locked address.

• GUI IMPROVEMENT: (Almost) any GUI page can be set as your default home page.

• GUI IMPROVEMENT: Layout of “Incident Details” page has been made much cleaner.

• IMPROVEMENT: Many more CanIt-generated messages are templatable so you can translate them or tailor them to fit your site’s policies.

• IMPROVEMENT: The RBL timeout defaults to 7 seconds instead of 30 seconds and is configurable rather than hard-coded.

• POLICY CHANGE: CanIt’s “System Check” page warns if you do not enable RPTN downloads.

• POLICY CHANGE: The “Only See Spam” attribute has been removed from the Users table. Instead, use the equivalent Permissions.

• POLICY CHANGE: We have removed support for a shared Bayes database in CanIt-PRO. It complicated the code very much and never really worked well.

• POLICY CHANGE: Obsolete headers X-CanIt-Tag-Reason and X-CanIt-Warning are no longer added. The information they would have contained is included in the X-Spam-Score header.

• POLICY CHANGE: Source packages no longer ship with Sendmail. You are expected to have Sendmail and Milter installed as prerequisites.

• CHANGE: The canit.cron cron job is now written in Perl rather than Bourne shell.

• BINARY PACKAGES: We have added RPMs for Fedora Core 6 and dropped them for Fedora Core 2.

• IMPROVEMENT: The GUI decodes base64- or quoted-printable-encoded subject lines.


• INCOMPATIBILITY: CanIt now requires PostgreSQL 7.3 or newer. It will NO LONGER WORK with PostgreSQL 7.2.x or older!

• CHANGE: We no longer bundle Crypt-SSLeay with the source installer. You must install one of Crypt::SSLeay or IO::Socket::SSL as a prerequisite for CanIt.

• IMPROVEMENT: More messages are templatable.

• IMPROVEMENT (Appliance Only): The GUI allows you to remove cluster members if you remove or rename a node.

• IMPROVEMENT: The Advanced Search allows you to search by date range.

• BUG FIX: RPTN would sometimes fail GnuPG signature verification because of bad timestamps. This has been fixed.

• BUG FIX (CanIt only): RPTN data would not be reflected correctly in the Web interface. This has been fixed.

• BUG FIX: All rules that can add to the score (Bayes, SPF, Mismatch) are now reflected in the X-Spam-Score: header.

• BUG FIX: The LDAP user-lookup ignored the “Mail Attribute” setting when authenticating. This has been fixed.

• BUG FIX (CanIt-Domain-PRO only): Realm-mapping lookups use the entire domain, then the parent domain, and so on until a match is found. (For example, “foo.example.com” will search the Realm Mapping Table for “foo.example.com”, “example.com” and “com” until it finds a match.)

• BUG FIX (CanIt-Domain-PRO only): The command-line invocation “canit-cmd del-realm REALMNAME” deletes all traces of a realm (including streams and users.)

• BUG FIX (CanIt-PRO and Domain-PRO): If you require streams to opt-in, then the database upgrade script could opt-out the default stream. This has been fixed.
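The realm-mapping lookup order described in the notes above (the entire domain first, then each parent domain in turn) can be sketched in Python as follows. The function names and the dictionary standing in for the Realm Mapping Table are hypothetical; this is an illustration of the search order, not the product's code.

```python
from typing import Dict, List, Optional


def candidate_domains(domain: str) -> List[str]:
    """List the domain followed by each parent domain, most specific first."""
    labels = domain.lower().split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def lookup_realm(domain: str, realm_map: Dict[str, str]) -> Optional[str]:
    """Return the realm of the first candidate found in the (hypothetical) map."""
    for candidate in candidate_domains(domain):
        if candidate in realm_map:
            return realm_map[candidate]
    return None


# Hypothetical mapping table: only the parent domain has an entry.
example_map = {"example.com": "realm-a"}
order = candidate_domains("foo.example.com")
realm = lookup_realm("foo.example.com", example_map)
```

Searching from the most specific name outward means an entry for a subdomain always takes precedence over an entry for its parent.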


Appendix C

A Testing Topology for CanIt-PRO

C.1 Introduction

The best way to evaluate CanIt-PRO is to route real-world mail through it. However, you may be hesitant to place CanIt-PRO in production without testing it first. So we’ll show you how to set up CanIt-PRO for test purposes, and then how to put it into production in a safe way. The test topology makes it very easy to back out of CanIt-PRO if you decide to do so.

C.2 Assumptions

We make the following assumptions about your current e-mail setup:

• You already have a mail server that is your primary MX record, and you control that server and its network. The existing mail server may run Sendmail, but it doesn’t have to—it could run Netscape Messenger, Microsoft Exchange, or any other mail server software of your choice.

• You have a spare Intel-architecture server for installing Linux and CanIt-PRO. This server should have sufficient horsepower to handle all of the mail for your domain or domains. While you can use other supported UNIX operating systems for CanIt-PRO, the instructions in this paper are specific to Linux. If you are an experienced UNIX and Sendmail system administrator, you can probably translate them for your own system.

• You control your DNS settings and can publish MX records for your domains.

C.3 Network Setup

Figure C.1 illustrates the assumed existing network setup followed by the new network setup. Note that your actual setup may be more complex and may include firewalls, demilitarized zones, etc. Conceptually, however, we assume you have an existing mail server which is the primary MX machine for your domains, and which is connected to the Internet.


The test network shows how the CanIt-PRO server is configured to accept mail from the Internet and relay it to your actual mail server.

[Figure C.1: Network Configurations. Original Network: Internet, Existing Mail Server. Test Network: Internet, CanIt Server, Existing Mail Server, with the labels “Secondary MX” and “Primary MX”.]

C.4 Build the CanIt-PRO Server

To build the CanIt-PRO server, install Linux on an Intel Architecture server. Be sure to install Apache, PHP and PostgreSQL, which are included with most Linux distributions. Alternatively, install our Debian-based appliance build.

C.5 Configure the CanIt-PRO Server to Relay Mail

You’ll need to edit two files on the CanIt-PRO server to configure Sendmail to relay mail. Make a list of all the domains for which your existing mail server accepts mail. Let’s suppose you own the domains example1.com and example2.net, and accept mail for both on the machine mail.example1.com. Finally, we’ll assume the CanIt-PRO server is called canit.example1.com.

C.5.1 Enable Relaying

First, you must enable relaying for the domains you control. To do this, edit the file /etc/mail/access and add a line for each domain, something like this:


To:domainname.tld RELAY

In our example, we’d add two lines to /etc/mail/access:

To:example1.com RELAY
To:example2.net RELAY

C.5.2 Configure Forwarding Relays

Next, you have to tell CanIt-PRO where to relay mail for the domains. Edit the file /etc/mail/mailertable and add a line for each domain, something like this:

domainname.tld esmtp:[relay.domainname.tld]

In our example, recall that mail.example1.com handles mail for both domains, so our mailertable would look like this:

example1.com esmtp:[mail.example1.com]
example2.net esmtp:[mail.example1.com]
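If you relay for many domains, you may prefer to generate the access and mailertable entries rather than type them by hand. The following Python sketch produces lines in the format shown above; the function names are hypothetical, and you would still paste or append the output into /etc/mail/access and /etc/mail/mailertable yourself.

```python
from typing import Dict, List


def access_lines(domains: List[str]) -> List[str]:
    """Build one 'To:domain RELAY' line per domain for /etc/mail/access."""
    return ["To:{}\tRELAY".format(domain) for domain in domains]


def mailertable_lines(routes: Dict[str, str]) -> List[str]:
    """Build one 'domain esmtp:[host]' line per domain for /etc/mail/mailertable."""
    return ["{}\tesmtp:[{}]".format(domain, host) for domain, host in routes.items()]


# The example domains from this appendix, both handled by mail.example1.com.
routes = {
    "example1.com": "mail.example1.com",
    "example2.net": "mail.example1.com",
}
access = access_lines(list(routes))
mailertable = mailertable_lines(routes)
```

Printing each list with `print("\n".join(access))` and `print("\n".join(mailertable))` gives text you can review before adding it to the two files.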

C.5.3 Rebuild Sendmail Databases

Finally, you need to rebuild Sendmail’s internal databases to reflect these changes. Simply execute the following Linux commands as root:

cd /etc/mail
make

C.6 Route Test Mail

Up until this point, your existing mail server has continued to act as it always does. The CanIt-PRO machine, although “live” and on the network, is not handling any mail traffic. Now comes the time to route mail through the CanIt-PRO server. There are two options to route test mail through the CanIt-PRO server:

C.6.1 Direct Injection

The least disruptive method is to directly inject test messages into the CanIt-PRO server. Run an SMTP client and send messages via the CanIt-PRO server. Verify that they are received and that spam messages are held.


You can use an e-mail client such as Mozilla or Microsoft Outlook for testing purposes. Simply set the outgoing SMTP machine to be the CanIt-PRO relay (in our example, canit.example1.com) and send messages to people in your organization. Alternatively, you can use a UNIX or Linux machine with its own DNS server. Create an MX record for your domain pointing to the CanIt-PRO server and send messages. Remember, only the test machine thinks that CanIt-PRO is your mail relay; the rest of the Internet still uses your existing mail server.
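If you would rather script the test than use a graphical mail client, a short Python program using the standard smtplib module can inject a message directly. This is a minimal sketch: the sender and recipient addresses are placeholders, and the relay host is the example name canit.example1.com used in this appendix.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender: str, recipient: str,
                       subject: str = "CanIt-PRO relay test") -> EmailMessage:
    """Create a small plain-text message to push through the relay."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content("Test message injected directly into the CanIt-PRO server.")
    return msg


def send_via_relay(msg: EmailMessage, relay_host: str, port: int = 25) -> None:
    """Hand the message to the given SMTP relay (opens a network connection)."""
    with smtplib.SMTP(relay_host, port, timeout=30) as smtp:
        smtp.send_message(msg)


# Example use (placeholder addresses; run only when the relay is reachable):
#   msg = build_test_message("tester@example1.com", "someone@example1.com")
#   send_via_relay(msg, "canit.example1.com")
```

After sending, check that the message arrives at the existing mail server and that it appears in the CanIt-PRO Web interface as expected.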

C.6.2 Create a Test Subdomain

Another option is to create a test subdomain, such as test.example1.com. Configure your regular mail server to accept mail for that domain, and don’t forget to modify the CanIt-PRO server’s access and mailertable files to relay mail for that domain. Then publish an MX record for test.example1.com pointing to canit.example1.com. You can then send mail from anywhere in the Internet to someone at test.example1.com and it will be relayed through the CanIt-PRO server. Existing mail to your proper domain, however, will still travel via your old mail server.

C.7 Route Real Mail

Once CanIt-PRO has passed the initial tests, it’s time to route real e-mail through it. The safest way to do this is to add an additional MX record for your domains. This record should have the highest priority (that is, the lowest preference number), and point to the CanIt-PRO server. For example, let’s suppose your existing MX records look like this:

example1.com. 1d IN MX 10 mail.example1.com.
example1.com. 1d IN MX 15 m2.example1.com.

Simply add another MX record like this:

example1.com. 1d IN MX 5 canit.example1.com.

and propagate the DNS changes. Mail for your domain will now be routed through the CanIt-PRO machine. In an emergency, if you need to take the CanIt-PRO machine offline, simply kill Sendmail on the CanIt-PRO server. Relays attempting to deliver mail to your domain will first try the CanIt-PRO server and immediately get a “Connection refused” error. They will fall back very quickly to the remaining MX records, and mail will flow as usual.
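The fallback behaviour can be modelled with a few lines of Python. This is a simplified sketch of how a sending relay chooses among MX records (lowest preference number first, skipping hosts that refuse the connection); the function names are hypothetical, and real mail servers add retry queues and other details not shown here.

```python
from typing import Iterable, List, Optional, Set, Tuple

MxRecord = Tuple[int, str]  # (preference number, host name)


def delivery_order(records: Iterable[MxRecord]) -> List[str]:
    """Return hosts in the order a sender tries them: lowest preference first."""
    return [host for _pref, host in sorted(records)]


def first_reachable(records: Iterable[MxRecord], reachable: Set[str]) -> Optional[str]:
    """Return the first host in delivery order that accepts connections."""
    for host in delivery_order(records):
        if host in reachable:
            return host
    return None


# The MX records from the example above.
records = [
    (10, "mail.example1.com"),
    (15, "m2.example1.com"),
    (5, "canit.example1.com"),
]
everything_up = {"mail.example1.com", "m2.example1.com", "canit.example1.com"}
canit_down = everything_up - {"canit.example1.com"}

normal_target = first_reachable(records, everything_up)
emergency_target = first_reachable(records, canit_down)
```

With all hosts up, mail goes to the CanIt-PRO server; with Sendmail stopped there, the sender falls back to mail.example1.com.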

Note: This test setup is not a viable topology for stopping spam. Because CanIt-PRO sends temporary-failure codes for suspect mail, if your real mail server has an MX record, the sender will simply relay the spam directly to it. For production use, all of the hosts in your public MX records should either:

• Be running CanIt-PRO, or


• Relay to a machine running CanIt-PRO.

The actual internal mail server should be hidden (no MX record) and ideally firewalled off, so only the CanIt-PRO relay can connect to it.

C.8 Outgoing Mail

If you want to pass outgoing mail through CanIt-PRO, configure your mail server to use the CanIt-PRO server as a “SmartHost”. This is a host to which all non-local mail will be sent. The details of SmartHost configuration differ among mail servers; consult your mail server documentation for details.


Appendix D

CanIt-PRO Architecture

D.1 Introduction

CanIt-PRO is based on the Sendmail Milter API, described at http://www.milter.org/developers/design. Milter is a scalable API for doing site-wide filtering of e-mail. Figure D.1 shows how CanIt-PRO interfaces with Milter.


[Figure D.1: CanIt-PRO Architecture. Several Sendmail processes connect through the Milter interface to mimedefang, which connects over a Unix-domain socket to mimedefang-multiplexor, which connects through pipes to several mimedefang.pl processes.]

D.2 CanIt-PRO Architecture

In Figure D.1, we show multiple sendmail processes communicating with a single mimedefang process. The mimedefang executable uses the Milter reference library, and is therefore multi-threaded. The mimedefang process is shown in cyan because it is the only multi-threaded process in CanIt-PRO; all others are single-threaded. The interface between mimedefang and sendmail may be a local (UNIX-domain) socket or a TCP socket.

mimedefang takes care of accepting e-mail headers and bodies from sendmail and writing them to a temporary spool directory (typically, /var/spool/MIMEDefang). It then sends short commands to mimedefang-multiplexor.

mimedefang-multiplexor listens on a UNIX-domain socket and manages a pool of Perl processes which do the actual filtering. The multiplexor has the following responsibilities:

1. It listens for requests from mimedefang and assigns them to one of the Perl processes.

2. It starts more Perl processes (up to a configured limit) if load increases.


3. During times of low load, it kills off Perl processes (down to a configured limit.)

4. It kills Perl processes which have processed a configured number of messages. This is done to avoid potential memory leaks.

5. It kills Perl processes which take too long to scan a message or which stop responding to requests.

mimedefang.pl is the actual Perl filtering program. It listens for requests (from the multiplexor) on its standard input, and writes results to its standard output. The commands and results exchanged are quite short; any modifications to the e-mail message are done in the spool directory.

Because the multiplexor manages several Perl processes, the Perl filters do not have to be thread-safe. In addition, the “pool-of-preforked-processes” architecture scales very well on SMP systems, and is efficient, robust and reliable.
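The multiplexor's pool-management responsibilities can be modelled with a small Python simulation. This is only a conceptual sketch under simplifying assumptions: the class and method names are hypothetical, workers are plain objects rather than real processes, and the timeout handling of responsibility 5 is omitted. The actual multiplexor is a separate program and is not written this way.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare workers by identity, not by field values
class Worker:
    handled: int = 0
    busy: bool = False


@dataclass
class Pool:
    minimum: int       # workers to keep even when idle
    maximum: int       # hard limit on concurrent workers
    max_requests: int  # retire a worker after this many requests
    workers: List[Worker] = field(default_factory=list)

    def assign(self) -> Optional[Worker]:
        """Give a request to an idle worker, starting a new one if allowed."""
        for worker in self.workers:
            if not worker.busy:
                worker.busy = True
                return worker
        if len(self.workers) < self.maximum:
            worker = Worker(busy=True)
            self.workers.append(worker)
            return worker
        return None  # every worker is busy and the limit is reached

    def finish(self, worker: Worker) -> None:
        """Record a completed request; retire the worker at its request limit."""
        worker.handled += 1
        worker.busy = False
        if worker.handled >= self.max_requests:
            self.workers.remove(worker)

    def reap_idle(self) -> None:
        """During low load, stop idle workers until only the minimum remain."""
        idle = [w for w in self.workers if not w.busy]
        while len(self.workers) > self.minimum and idle:
            self.workers.remove(idle.pop())
```

Because each worker handles one request at a time and is simply replaced when it is retired, no worker needs to be thread-safe, which mirrors the point made above about the Perl filters.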

D.3 Starting and Stopping CanIt-PRO

CanIt-PRO is started by a script called /usr/share/canit/scripts/canit-system. This script handles the starting and stopping of multiple CanIt-PRO services and is invoked with a single argument; possible arguments are:

start Starts all relevant CanIt-PRO services on this host.

stop Stops all CanIt-PRO services running on this host.

restart Equivalent to stop followed by start.

check Starts all CanIt-PRO services that should be running on this host but are not, and stops all services that are running on this host but should not be.

status Prints the status of CanIt-PRO services on this host. Exits with an exit code of 0 if all services that should be running are running, and all services that should be stopped are stopped. Exits with a code of 1 otherwise.
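Because the status argument reports its result through the exit code, it is easy to call from a monitoring script. The following Python sketch runs the script and interprets the exit code as described above; the function name is hypothetical, and the command can be overridden, which is useful for trying the sketch on a machine without CanIt-PRO installed.

```python
import subprocess
from typing import Sequence

CANIT_SYSTEM = "/usr/share/canit/scripts/canit-system"


def services_healthy(command: Sequence[str] = (CANIT_SYSTEM, "status")) -> bool:
    """Run the status command and return True when it exits with code 0."""
    result = subprocess.run(
        list(command),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    return result.returncode == 0


# Example use on a CanIt-PRO host:
#   if not services_healthy():
#       print("Some CanIt-PRO services are not in the expected state")
```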

D.4 Static Configuration Files

Most CanIt-PRO services read a configuration file called /etc/mail/canit/canit.conf for static configuration settings, before reading the remainder of the configuration from the database. This file contains local configuration items that differ from factory defaults. The meanings of some of the configuration settings are described below. Boolean variables can take the values yes or no, while other variables are integers or strings.


D.4.1 Database Settings

The following settings exist in the [database] section of the configuration file.

db_host (string) should be set to the host name or IP address of the database server. If the database server is on this host, this setting should be blank.

db_name (string) should be set to the name of the CanIt-PRO database, typically spam.

db_super (string) should be set to the name of the PostgreSQL super-user, typically postgres.

db_user (string) should be set to the name of the PostgreSQL user for normal database access, typically spam.

db_super_passwd (string) should be set to the PostgreSQL super-user’s password, if you are using MD5 authentication.

db_passwd (string) should be set to the PostgreSQL normal user’s password, if you are using MD5 authentication.

db_port (integer) may be set to the TCP port number of the PostgreSQL server. You should set this only if your PostgreSQL server listens on a non-standard port.

db_connect_timeout (integer) should be set to the timeout for connecting to the PostgreSQL database in seconds. The default timeout is 20 seconds.

D.4.2 Cron Settings

The [cron] section contains two settings that control the nightly database cron-runner. They are:

compress_dump (boolean) If set to yes, then the nightly database dump will be compressed with gzip.

keep_dumps (integer) Specifies how many nightly dumps to keep. CanIt-PRO will rotate the nightly dumps, keeping at least keep_dumps days’ worth of dumps.
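To show how the [database] and [cron] settings fit together, here is a Python sketch that parses a sample configuration with the standard configparser module. Several things here are assumptions: the “key = value” line syntax, the underscored spelling of the setting names, and the sample values (for instance keep_dumps = 7). Consult your own /etc/mail/canit/canit.conf for the real syntax before relying on this.

```python
import configparser

# Hypothetical sample; the exact file syntax is assumed, not confirmed.
SAMPLE_CONF = """
[database]
db_host =
db_name = spam
db_user = spam
db_connect_timeout = 20

[cron]
compress_dump = yes
keep_dumps = 7
"""


def load_settings(text: str) -> configparser.ConfigParser:
    """Parse configuration text into sections and settings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


settings = load_settings(SAMPLE_CONF)

# A blank db_host means the database server is on this host.
database_is_local = settings.get("database", "db_host") == ""
connect_timeout = settings.getint("database", "db_connect_timeout")
compress_dump = settings.getboolean("cron", "compress_dump")  # accepts yes/no
keep_dumps = settings.getint("cron", "keep_dumps")
```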

D.4.3 MIMEDefang Settings

These settings exist in the [mimedefang] section of the configuration file. As most do not need to be modified from factory defaults, you may not have a [mimedefang] section in your /etc/mail/canit/canit.conf file, so do not be alarmed if it does not exist.

mx_user (string) should be set to the user ID of the mimedefang processors. This should nearly always be a dedicated user called defang.

mx_relay_check (boolean) enables filtering of relay IP addresses during SMTP connection. CanIt-PRO does not filter at connect-time, so this should be set to no.


mx_sender_check (boolean) enables checking of the sender address in the SMTP “MAIL FROM:” command. CanIt-PRO performs sender checks at “RCPT TO:” time in order to take recipient streams into account, so this should always be set to no.

mx_recipient_check (boolean) enables checking of the recipient address in the SMTP “RCPT TO:” command. CanIt-PRO requires this check, so it must be set to yes.

mx_log (boolean) enables logging. This should always be set to yes.

mx_requests (integer) specifies how many requests each Perl filter will handle before being killed. The filters are killed after this number of requests to eliminate any possibility of problems due to memory leaks. The default is 200, which should be reasonable for most installations.

mx_max_recipok_per_domain (integer) specifies how many concurrent “RCPT TO:” checks are allowed per domain. The default is zero, which means no limit. You can set this to some number smaller than mx_maximum to reduce the likelihood of one domain affecting others adversely. For example, if one domain has a slow or dead verification server, limiting this setting protects other domains from denial of service caused by all scanners doing RCPT TO: checks for the slow domain.

mx_minimum (integer) specifies the minimum number of Perl filters to keep running, even if the system is idle.

mx_maximum (integer) specifies the maximum number of Perl filters to run concurrently, no matter how busy the system is. Note that each Perl filter requires a database connection. The default installation of PostgreSQL permits only 32 simultaneous database connections. If you need more than this, you should increase the number of PostgreSQL back-ends with the “-N” and “-B” postmaster options when you start the database. Please see the postmaster(1) and pg_ctl(1) man pages for details.

mx_idle (integer) specifies how long in seconds a Perl process should be idle before it is killed off. After a period of heavy load, idle processes eventually get killed off until there are mx_minimum Perl filters running.

mx_busy (integer) specifies how long in seconds a Perl filter is allowed to process a message. If the filter takes longer than this, it is assumed to have hung and is killed, and the message is tempfailed.

mx_cmd_timeout (integer) specifies how long in seconds to wait for commands and results to be transferred between mimedefang and mimedefang-multiplexor.

mx_slave_delay (integer) specifies how long to wait after starting each Perl filter. If the system is idle, but fewer than the minimum number of filters are running, a new filter is started each mx_slave_delay seconds.

mx_min_slave_delay (integer) specifies that the multiplexor must not start slaves more quickly than the specified delay, no matter what. Even if the system is busy, a new filter will not be started more often than every mx_min_slave_delay seconds. Setting this to 1 or 2 seconds may help your machine withstand a sudden surge in e-mail; it helps smooth out sudden load increases. However, it may cause delays as some mail is tempfailed.
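The relationships among these settings can be checked mechanically before you change them. The Python sketch below encodes three rules drawn from the descriptions above: the minimum must not exceed the maximum, each Perl filter needs its own database connection (32 by default in PostgreSQL, per the text), and a per-domain “RCPT TO:” limit is only useful if it is smaller than the maximum. The function name and the wording of the messages are hypothetical.

```python
from typing import List


def check_mx_settings(mx_minimum: int,
                      mx_maximum: int,
                      max_db_connections: int = 32,
                      mx_max_recipok_per_domain: int = 0) -> List[str]:
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    if mx_minimum > mx_maximum:
        problems.append("mx_minimum is larger than mx_maximum")
    if mx_maximum > max_db_connections:
        # Each Perl filter holds one database connection.
        problems.append("mx_maximum exceeds the available database connections")
    if mx_max_recipok_per_domain and mx_max_recipok_per_domain >= mx_maximum:
        # Zero means no limit; a non-zero limit should be below mx_maximum.
        problems.append("mx_max_recipok_per_domain is not smaller than mx_maximum")
    return problems


# Hypothetical values: 40 filters cannot all connect with 32 connections.
issues = check_mx_settings(mx_minimum=2, mx_maximum=40)
```

Note that other programs (such as the Web interface) may also need database connections, so in practice you would leave some headroom below the PostgreSQL limit.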


mx_max_rss (integer) specifies the maximum resident-set size in kB of each Perl filter process. On systems which support this limit, a Perl filter which exceeds this limit is killed. If set to zero, the limit is ignored.

mx_max_as (integer) specifies the maximum virtual address space in kB of each Perl filter process. On systems which support this limit, a Perl filter which exceeds this limit is killed. If set to zero, the limit is ignored.

mx_stats (boolean) specifies that the multiplexor should log statistical information in /var/log/mimedefang/stats.

mx_flush_stats (boolean) specifies that the multiplexor should flush /var/log/mimedefang/stats each time it writes a line to the file.

mx_stats_syslog (boolean) specifies that the multiplexor should log statistical information using syslog.

mx_socket (string) specifies the full path to the UNIX-domain socket used for communication between mimedefang and mimedefang-multiplexor. For CanIt-PRO, this should not be changed.

log_times_to_syslog (boolean) specifies whether or not to log filter times using syslog. If you set this to yes, then CanIt-PRO will log lines similar to this in your mail log:

    gBNEeeI9004056: Filter time is 231ms

syslog_ident (string) specifies the identifier to include in syslog messages. It defaults to CanIt. You should not change this; it will be used by future versions of CanIt-PRO for log analysis.

mx_embed_perl (boolean) specifies whether or not the multiplexor should use an embedded Perl interpreter. Normally, when a Perl slave is needed, the multiplexor forks and the child execs a Perl program. If you set this to yes, then the multiplexor uses an embedded Perl interpreter that reads the Perl filters only once. When a new slave is needed, only a fork is done. The overhead of the exec and the Perl interpreter initialization is avoided.

On some systems, it is not possible to embed a Perl interpreter. If you set this flag to yes on such a system, a warning is logged to syslog and CanIt-PRO continues as if the flag were no.

On some systems, it is possible to embed a Perl interpreter, but not to safely destroy it and create another interpreter in the same process. On such systems, a warning is logged if you force a filter reread. This will not affect the operation of CanIt-PRO, but if you edit the actual Perl filter file, you will need to do a (more expensive) mimedefang-ctrl restart rather than the cheaper mimedefang-ctrl reread.

D.4.4 Filter Settings

A few filter settings are stored in the configuration file rather than in the PostgreSQL database. They reside in the [filter] section. These filtering settings are used in the event that the database is not available.


admin_address The e-mail address of the CanIt-PRO administrator.

database_down_action This setting can take one of three values:

• tempfail (the default). CanIt-PRO will tempfail mail if the PostgreSQL database server is non-responsive.

• accept. If the database is down, mail will be delivered un-scanned with a warning added in the X-Spam-Score: header.

• minimalfilter. CanIt-PRO will run with bare SpamAssassin rules in tag-only mode.

database_down_virus_action If database_down_action is set to 'accept', this parameter controls how viruses are handled while the database is down. It must be set to one of reject, discard or accept.

database_down_tag_score If database_down_action is set to 'minimalfilter', this parameter controls the threshold at which incoming mail is tagged. The default value is 5.

database_down_reject_score If database_down_action is set to 'minimalfilter', this parameter controls the threshold at which incoming mail is auto-rejected. The default value is 20.
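As an illustration of the settings above, a [filter] section that selects minimal filtering while the database is down might look roughly like the following. This is only a sketch: the administrator address is a placeholder, the two scores are simply the documented defaults written out, and the key = value layout is assumed to match the other sections of canit.conf.

```ini
[filter]
admin_address = postmaster@example.com
database_down_action = minimalfilter
database_down_tag_score = 5
database_down_reject_score = 20
```

With database_down_action set to accept instead, the two score settings would not apply and database_down_virus_action would be the relevant setting.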

D.4.5 Ticker Settings

The [ticker] section contains only one setting related to the ticker tasks run by the CanIt daemon. We do not recommend that you change this setting; it is documented solely for completeness.

tick_interval (integer) specifies the interval in seconds between “ticks”. The CanIt daemon task checks every tick_interval seconds for work to do.

D.4.6 Storage Manager Settings

The [storagemanager] section contains the following settings related to Storage Manager:

pidfile (string) A file used by the Storage Manager server to write its process ID and to lock against concurrent Storage Managers. The default value is /var/run/canit-storage-manager.pid.

rootdir (string) The root directory under which data are stored. The default value is /var/lib/canit-storage-manager.

user (string) The UNIX user as which the Storage Manager server should run. The default value is defang.

client_retry_delay (integer) specifies the delay in reconnecting to a dead storage manager node. If a CanIt-PRO cluster node fails to connect to a storage manager node, it will not retry the connection for client_retry_delay seconds. This can help prevent a dead storage manager node from bogging down the clients in blocked connect calls.


D.5 Tuning CanIt-PRO

Tuning CanIt-PRO is a bit like tuning Sendmail: a black art. Nevertheless, we can offer some guidelines which should help improve the performance of your CanIt-PRO installation.

D.5.1 Memory

Your CanIt-PRO server should have sufficient memory. As a rule of thumb, you should have about 50MB of memory for each concurrent Perl filter. If you set the maximum number of Perl filters to 16, for example, your machine should have at least 800MB of physical memory.

Your CanIt-PRO server should also have sufficient swap space that a sudden flood of e-mail does not cause exhaustion of virtual memory. An additional 32MB of swap space for each Perl filter is probably a good rule of thumb.
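The rule of thumb above is simple multiplication, and a small script can make it easy to re-run for different filter counts. The following is a minimal sketch; the function names ram_mb and swap_mb are made up for this illustration and are not part of CanIt-PRO.

```shell
#!/bin/sh
# Rule-of-thumb sizing from this section: about 50MB of RAM and
# 32MB of swap per concurrent Perl filter.
# ram_mb and swap_mb are hypothetical helper names.
ram_mb()  { echo $(( $1 * 50 )); }
swap_mb() { echo $(( $1 * 32 )); }

filters=${1:-16}   # illustrative default; pass your own maximum filter count
echo "filters=$filters ram=$(ram_mb "$filters")MB swap=$(swap_mb "$filters")MB"
```

These figures cover the Perl filters only; the operating system, Sendmail and (if local) PostgreSQL need memory of their own.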

D.5.2 Disk

You should have fast, reliable disks on your CanIt-PRO server. In particular, the CanIt-PRO spool directory (/var/spool/MIMEDefang) is heavily used, and it may be worth putting it on its own disk.

Even better, put the spool directory on a RAM disk, assuming you have sufficient memory. A RAM-based CanIt-PRO spool directory is a large win, especially on systems like Solaris with relatively conservative file systems.

To calculate the amount of RAM you’ll need for the spool, multiply the size of the largest message you’ll accept by the maximum number of concurrent filters, and then multiply by 3 as a safety factor for CanIt-PRO processing. For example, if you accept messages up to 3MB, and you’ll have at most 8 Perl filters running, then your /var/spool/MIMEDefang space should be at least 72MB. If you use a RAM disk for the spool directory, add this memory to the memory requirements in the previous section.
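The spool calculation can be written down the same way. This is a small sketch of the formula in the paragraph above; spool_mb is a hypothetical helper name.

```shell
#!/bin/sh
# Spool size = largest message (MB) x concurrent filters x safety factor of 3.
# spool_mb is a hypothetical helper name used only for this illustration.
spool_mb() { echo $(( $1 * $2 * 3 )); }

# The example from the text: 3MB messages and at most 8 Perl filters.
echo "spool needs at least $(spool_mb 3 8)MB"
```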

D.5.3 Solaris-Specific tmpfs Note

Solaris is very conservative about committing writes to disk. On a busy Solaris server, consider it mandatory to put /var/spool/MIMEDefang on a RAM-based tmpfs file system. The performance improvement will be dramatic.

D.5.4 CPU

Spam-scanning is quite CPU-intensive, but in modern computers, the CPU is unlikely to be the bottleneck. If the CPU does prove to be a bottleneck, you should consider a faster machine, or even a multiprocessor machine.


D.5.5 Sendmail

Tuning Sendmail is quite complex; for a review of some of the issues involved, we recommend “Sendmail Performance Tuning” by Nick Christenson, Addison-Wesley, ISBN 0-321-11570-8.

D.6 Dealing with Overload

Normally, the resources which first become overloaded in a mail server are disk or network bandwidth. However, a server with CanIt-PRO installed is more likely to run out of CPU power or memory, simply because content-scanning is relatively expensive. If your CanIt-PRO machine becomes overloaded to the point that very little mail is flowing and the machine is struggling, here are tuning tips to help you recover.

D.6.1 Tune CanIt-PRO and Sendmail

In addition to the tuning tips in Section D.5, two parameters are particularly helpful in letting the CanIt-PRO server deal with overload:

In /etc/mail/canit/canit.conf, set mx_maximum in the [mimedefang] section to a fairly low number, around 5 or 6. On most hardware, this should limit the impact of scanning on CPU and memory. It will allow the CanIt-PRO machine to process incoming mail smoothly until the overload conditions abate.

In conjunction with mx_maximum, it is very useful to set Sendmail’s ConnectionRateThrottle option. If you set this to 3, for example, Sendmail will accept at most 3 SMTP connections per second. Again, this lets your machine process mail smoothly until overload conditions abate.

So if your server becomes overloaded, follow these recovery steps:

• Set mx_maximum to 5, and ConnectionRateThrottle to 3. (If you use M4 to generate the sendmail configuration file, the M4 parameter is called confCONNECTION_RATE_THROTTLE.)

• Watch the load carefully. If your machine appears to have idle time and free memory on its hands, cautiously increase the parameters until throughput seems to be maximized.
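For the first recovery step, the change to canit.conf might look like the fragment below. It is a sketch that assumes the [mimedefang] section uses the same key = value layout as the other sections of the file; the value 5 is the starting point suggested above, not a tuned figure.

```ini
[mimedefang]
mx_maximum = 5
```

On the Sendmail side, the usual M4 form for the matching option in sendmail.mc is define(`confCONNECTION_RATE_THROTTLE', `3')dnl, after which the configuration file must be regenerated and Sendmail restarted.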

D.6.2 Network Architecture

A good way to deal with temporary overload conditions is to have a secondary MX machine that simply relays mail without doing any scanning. It will queue messages that the primary machine cannot handle, and then deliver them serially to the primary machine, smoothing out the load. The disadvantage of this scheme is that some relay-IP tests do not work as effectively, and the secondary MX machine may have to generate bounce messages.

If your CanIt-PRO machine is overloaded a lot of the time, we suggest setting up a second equal-weighted MX machine with CanIt-PRO installed. The two CanIt-PRO machines can share the same PostgreSQL database, since database access is rarely the bottleneck. Having two equal-weighted MX records will spread the load over both machines.


Appendix E

CanIt-PRO HOWTOS

E.1 Restoring a Database from a Dump

The CanIt-PRO cron job makes a text dump of the entire database every night; the database is dumped into /var/spool/Canit-Spam-DB-Backup/SPAM-DATABASE-BACKUP. You should back this file up to ensure the integrity of your spam database.

If, for some reason, you need to restore the database from the text file, follow this procedure. Note that you may need to supply the full path to the PostgreSQL utilities like pg_dump, psql, createuser, etc.

All of these examples assume that the PostgreSQL superuser is named postgres. This is likely to be true on Linux and Solaris, but some platforms use pgsql instead (this is the setting in FreeBSD’s port of PostgreSQL).

1. Stop CanIt-PRO, the ticker, Sendmail and the CanIt-PRO Web interface.

2. Dump your existing database, just to be safe. Be sure to do this in a directory with sufficient space:

    $ pg_dump -U postgres spam > spam-dump-file.txt

3. Drop the database:

    $ dropdb -U postgres spam

4. Create an empty database:

    $ createdb -U postgres -E sql-ascii -T template0 spam

5. Restore the database contents from the nightly dump file:

    $ psql -U postgres -d spam < SPAM-DATABASE-BACKUP

6. Analyze the database to update statistics for the query optimizer:

    $ psql -U postgres -d spam -c 'ANALYZE VERBOSE'

   Do not omit the ANALYZE step or your database will be very slow.

7. Restart CanIt-PRO, Sendmail and the CanIt-PRO Web interface.


E.2 Firewall Settings

Many people run CanIt-PRO behind a packet-filtering firewall. If you do, be sure to permit access to the following ports. “Inbound” and “Outbound” are from the perspective of the CanIt-PRO machine. For example, if we say that outbound TCP port 80 must be open, we mean CanIt-PRO must be able to initiate TCP connections to an external machine with a destination of port 80. And when we say inbound TCP port 25 must be open, we mean CanIt-PRO must be able to accept TCP connections from another machine destined to port 25.

E.2.1 Firewall Rules: External Hosts

CanIt-PRO needs the following ports open for communication with external machines on the Internet:

• Inbound and outbound TCP port 25 for SMTP.

• Inbound TCP port 22 for SSH access.

• Outbound TCP and UDP port 53 for DNS lookups.

• Outbound TCP port 80 and port 443 for software updates, RPTN downloads and ClamAV signature downloads.

• Outbound UDP port 6568 to report IP address reputation data back to Roaring Penguin Software.

• Possibly inbound and outbound UDP port 123 if you are using NTP to synchronize the clock.
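On a Linux packet filter, the list above could be turned into iptables rules along the following lines. This is a sketch under several assumptions: the firewall runs on the CanIt-PRO machine itself, the built-in INPUT and OUTPUT chains are used, and a separate rule already accepts established and related traffic. The script only prints the commands so they can be reviewed first; emit_rule and external_rules are hypothetical helper names.

```shell
#!/bin/sh
# Print (not apply) iptables commands for the external-host ports listed above.
# Chain names and the ACCEPT target are assumptions about your firewall layout.
emit_rule() {  # usage: emit_rule INPUT|OUTPUT tcp|udp PORT
    echo "iptables -A $1 -p $2 --dport $3 -j ACCEPT"
}

external_rules() {
    emit_rule INPUT  tcp 25     # SMTP, inbound
    emit_rule OUTPUT tcp 25     # SMTP, outbound
    emit_rule INPUT  tcp 22     # SSH access
    emit_rule OUTPUT tcp 53     # DNS lookups
    emit_rule OUTPUT udp 53
    emit_rule OUTPUT tcp 80     # updates, RPTN and ClamAV signature downloads
    emit_rule OUTPUT tcp 443
    emit_rule OUTPUT udp 6568   # IP address reputation reports
}

external_rules
```

NTP (UDP port 123) is left out because it is optional; add it with the same helper if you synchronize the clock that way.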

E.2.2 Firewall Rules: Internal Hosts

CanIt-PRO needs the following ports open for communication with internal machines in your organization. CanIt-PRO always needs port 25 open; the other three items are required only if you use the corresponding User Lookup (Chapter 6).

• Outbound TCP port 389 for LDAP lookups and/or 636 for LDAPS lookups.

• Outbound TCP port 110 for POP3 lookups and/or 995 for POP3S lookups.

• Outbound TCP port 143 for IMAP lookups and/or 993 for IMAPS lookups.

• Outbound TCP port 25 for verification servers and e-mail delivery.

E.2.3 Firewall Rules: Intra-Cluster Hosts

If you have a cluster of CanIt-PRO machines, the following ports are used between cluster members and should be open:

• TCP port 5432 (typically) for PostgreSQL database connections.


• TCP port 6568 (typically) for Storage Manager connections.

• TCP port 22 for SSH connections.

Note that the PostgreSQL and Storage Manager ports should be firewalled off from external hosts and hosts that are not members of the CanIt-PRO cluster.

E.3 Running Something after the Nightly Cron Job Completes

The script /usr/share/canit/scripts/canit.cron runs once a night to perform various maintenance tasks. (Note that on some systems, canit.cron might be located in a different directory.)

If a file called post-cron-hook is present in the same directory as canit.cron and is executable, then it will be run as root after all other cron tasks have been completed. You can use this script for whatever purposes you like. For example, you might use it to rsync the nightly database dump to another machine for backup purposes.
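A post-cron-hook for the rsync example might look like the sketch below. The destination backup.example.com:/srv/canit-backups/ is a placeholder, and the DRY_RUN switch and the push_backup function are conveniences invented for this illustration. It assumes root on the CanIt-PRO machine can reach the backup host over ssh without a password prompt.

```shell
#!/bin/sh
# Hypothetical post-cron-hook: copy the nightly dump to a backup host.
# BACKUP_DEST is an assumption; replace it with your own host and path.
DUMP=/var/spool/Canit-Spam-DB-Backup/SPAM-DATABASE-BACKUP
BACKUP_DEST=${BACKUP_DEST:-backup.example.com:/srv/canit-backups/}

push_backup() {
    # With DRY_RUN=1, print the command instead of running it.
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo rsync --archive -e ssh "$DUMP" "$BACKUP_DEST"
    else
        rsync --archive -e ssh "$DUMP" "$BACKUP_DEST"
    fi
}

# Run only when installed under the hook's name, so the file can also be
# sourced from another shell for a dry run.
case "${0##*/}" in
    post-cron-hook) push_backup ;;
esac
```

Remember to make the file executable (chmod +x post-cron-hook), since the hook is ignored otherwise.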

E.4 Migrating CanIt-PRO to a Different Machine

The following instructions will guide you through migrating to a different server. It is necessary to stop processing mail during the migration.

The amount of time this will take depends mostly on the size of your database. Review the time-stamps on START-BACKUP-TIME and STOP-BACKUP-TIME in /var/spool/Canit-Spam-DB-Backup for an idea of this time-frame. Assume restoring the database will take 4/3 as long as dumping it does.

By default, mail will be tempfailed during the migration. Most sending mail servers will not give warning Delivery Status Notifications back to the sender unless attempts to deliver to you have failed for 4 hours. If you determine that your migration down-time will be too long, there are two options: (1) allow mail to flow through un-scanned; (2) implement databaseless filtering. Please contact Roaring Penguin Technical Support for details on these options.

You may wish to upgrade your CanIt-PRO installation at the same time.
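The 4/3 estimate can be computed from the two time-stamp files. The sketch below assumes that the modification times of START-BACKUP-TIME and STOP-BACKUP-TIME mark the start and end of the last nightly dump, and it uses the GNU coreutils form of stat (stat -c %Y), so it is Linux-specific as written. estimate_restore_seconds is a hypothetical helper name.

```shell
#!/bin/sh
# Estimate restore time as 4/3 of the dump time, per the rule of thumb above.
estimate_restore_seconds() { echo $(( $1 * 4 / 3 )); }

dir=/var/spool/Canit-Spam-DB-Backup
# Assumption: the files' modification times bracket the last nightly dump.
# "stat -c %Y" prints the modification time in seconds (GNU coreutils).
if [ -e "$dir/START-BACKUP-TIME" ] && [ -e "$dir/STOP-BACKUP-TIME" ]; then
    start=$(stat -c %Y "$dir/START-BACKUP-TIME")
    stop=$(stat -c %Y "$dir/STOP-BACKUP-TIME")
    dump=$(( stop - start ))
    echo "dump: ${dump}s, estimated restore: $(estimate_restore_seconds "$dump")s"
fi
```

The total outage is roughly the dump time plus the restore time plus the time needed to copy the dump file between machines.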

E.4.1 CanIt-PRO Clusters

If you have a cluster of CanIt-PRO servers there may be additional considerations. If you are migrating the database server it will be necessary to stop all CanIt-PRO servers during the migration since the database will not be available during the dump and restore. It is safe to install the latest version of CanIt-PRO on the new machine as long as all members of the cluster are also upgraded. In a cluster all servers must run the same version of CanIt-PRO. The migration procedure includes the necessary steps for clusters. However, you must consult the Clustering Guide after migrating to ensure that all machines in the cluster remain properly configured.


E.4.2 Storage Manager

If you are running Storage Manager there may be additional considerations. If you are running only one Storage Manager node and it is on the server being migrated then your Storage Manager data must be moved during the migration as well. However, in most cases it takes too long to copy this data from one machine to another. Therefore, in this case, the recommended procedure is to keep your old CanIt-PRO server running Storage Manager in read-only mode for some time after the migration is complete. You may either: (a) leave the old server running until its Storage Manager data expires (typically 30 days); or (b) begin copying the Storage Manager to the new machine once migration is complete. The latter option should take much less time (a few days) but requires extra steps. If you run multiple Storage Manager nodes and have configured your cluster to store at least two copies of all data, then migration will not be a problem since the other nodes will carry all of the data. The migration procedure includes the necessary steps for Storage Manager (both options listed above).

E.4.3 Migration Procedure

1. Install CanIt-PRO on the new server. You may install the latest version.

Note: Please ensure that CanIt-PRO is fully installed on the new server. In particular, PostgreSQL roles must be initialized with the canit-prepare-system command, even though you will restore from another database shortly. It is not necessary to run this command on a fresh CanIt-PRO Appliance/ISO install, although it is safe to do so.

2. Stop CanIt-PRO on the new server, the existing server, and all existing cluster members:

    # /etc/init.d/sendmail stop
    # /etc/init.d/canit-system stop

Disable the CanIt-PRO web interface for non-admin users by visiting Administration : Disable/Enable.

Note: If your CanIt-PRO version is older, this function may not be present. If this is the case, or to be extra careful, stop Apache entirely (e.g. /etc/init.d/apache2 stop).

These services must be stopped to ensure that no process attempts to access the database while the dump or restore is occurring. To completely ensure safety, you may also set PostgreSQL to listen only on the loopback address: find postgresql.conf and set the listen_addresses parameter to 'localhost'. Be sure that only one such parameter exists in the file. Restart PostgreSQL for this to take effect.

Note: After this step CanIt-PRO is no longer processing mail.

3. Dump the database to a file:

    $ pg_dump -U postgres spam > spam-dump-file.txt

Note: This command may run for a long time without producing any output. This is normal.


4. Copy the file to the new server. This can be done with ssh:

    # scp spam-dump-file.txt root@new_machine:/root

5. Copy the entire directory tree rooted at /var/spool/MD-Bayes to the new machine, being sure to preserve ownership and permissions. There are various ways to do this. However, in the common case in which the old and new machine both have rsync and ssh installed:

    # rsync --archive -e ssh /var/spool/MD-Bayes new_machine:/var/spool

You may wish to add the --verbose and --progress flags if you have a lot of data to copy.

6. If your domains were entered manually into /etc/mail/mailertable and /etc/mail/access, you must transfer that information to the new server. If your new server now has Setup : Domain Routing in its Web Interface you should enter the domains there once the database is successfully restored.

You may wish to make a tarball of /etc/mail/ to keep as a reference on the new machine. This is handy if you later need to refer to files such as access, mailertable, sendmail.mc, etc.

Note: WARNING: Do not overwrite /etc/mail/ on the new server!

    # tar zcvf old-etc-mail.tar.gz /etc/mail
    # scp old-etc-mail.tar.gz root@new_machine:/root

7. On the new machine, restore the database from your file:

    # dropdb -U postgres spam
    # createuser -U postgres -S -D -R spam

(The createuser command may fail if the spam user already exists.)

    # createdb -U postgres -E sql-ascii -T template0 spam
    # psql -U postgres spam < /root/spam-dump-file.txt
    # psql -U postgres spam -c 'ANALYZE VERBOSE'
    # canit-prepare-system

Note: psql -U postgres spam < /root/spam-dump-file.txt will produce output. However when it restores the largest table it may appear as though the process has frozen. It is processing a very large table and may take a long time before further output is generated.

8. Log into the Web Interface on the new server and go to Administration : Enable/Disable and disable CanIt-PRO if it is not already disabled. This prevents non-admin access to the Web Interface for recent versions. This is also a good time to ensure that all your domains are entered into Setup : Domain Routing if your new server has this function.

9. Storage Manager migration 1: skip this step if you do not need to migrate your Storage Manager data. Configure the networking on the old server so that it will be accessible when the new server’s final networking configuration is complete.

10. Storage Manager migration 2: skip this step if you do not need to migrate your Storage Manager data. Access the Web Interface for Storage Manager (see section 13.2.2) and update the networking information. The old server must be set as a Read-Only Hostname. Update the new server’s hostname in the Hostnames if it will be different when migration is complete.

11. Clear out all hosts from the cluster members table and system check tables:

    # psql -U postgres spam -c 'DELETE FROM cluster_members'
    # psql -U postgres spam -c 'DELETE FROM cluster_sanity_check'
    # psql -U postgres spam -c 'DELETE FROM cluster_sanity_check_state'

12. Cluster considerations: skip this step if you do not have a cluster. Review the Cluster Checklist in the Clustering Guide to ensure that all cluster members are configured correctly in your new post-migration configuration. For example, the scanners may need to have their database IP address updated.

Note: Before proceeding, ensure all cluster members are running the same version.

13. Reconfigure the networking on the new server to its final configuration if necessary.

14. Restart CanIt-PRO on the new server and all cluster members:

    # /etc/init.d/canit-system start
    # /etc/init.d/sendmail start

Note: Mail may now flow. Adjust any firewalls, networking equipment or MX records if necessary.

15. Click on Setup : Cluster Management and make sure the expected hosts are all present and correctly configured. Adjust any settings as required. You may need to re-run the commands from step 11 to clear the cluster members and system check tables. After doing so, restart CanIt-PRO on the new database server before restarting other cluster members.

16. Storage Manager migration 3: skip this step if you do not need to migrate your Storage Manager data, or if you have chosen to allow the old server to remain operating until all data has expired rather than copying it. Begin copying the Storage Manager data from the old server to the new:

    # scp -r /var/lib/canit-storage-manager/ root@new_machine:/var/lib/

Note: This operation may take a very long time, perhaps many days.

scp does not preserve ownership and permissions. Therefore once the above is complete, set the correct permissions on the new server’s Storage Manager data:

    # chown -R defang:defang /var/lib/canit-storage-manager
    # chmod -R go-rwx /var/lib/canit-storage-manager

Access Storage Manager in the Web Interface and remove the old server from the Read-Only Hostnames. You may now shut down the old server.

E.5 Using canit-api-client, the CanIt-PRO Command-Line Tool

CanIt-PRO includes a tool called canit-api-client, which lets you perform various actions against a CanIt-PRO installation. This lets you script things like addition or deletion of users, address mappings, domain mappings, and so on.

canit-api-client replaces canit-cmd, which provided some of this functionality in a previous release. The canit-api-client tool requires that you have a CanIt-PRO API server running.

canit-api-client is invoked as follows:

    $ canit-api-client [options] command [args...]

Please see the man page (man canit-api-client) for details about required command-line options or configuration-file options.

The command specifies which action you want to take. The additional args may or may not be required, depending on the command. For a full list of the available commands, invoke canit-api-client with the argument --help:

    $ canit-api-client --help

E.6 Cloning a CanIt-PRO Machine

If you clone a CanIt-PRO machine, either with disk-imaging software or a virtual environment’s cloning mechanism, do not bring up the cloned machine without first deleting the file /etc/mail/canit/canit-cluster-member-id. Otherwise, the cluster management system will assume the cloned machine is still the original machine and will become confused.
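If you clone machines regularly, the deletion can be wrapped in a small helper that runs in the clone before CanIt-PRO is started. This is a minimal sketch; clear_cluster_member_id is a hypothetical name, and the optional argument exists only so the function can be tried on a scratch file.

```shell
#!/bin/sh
# Remove the cluster member ID file so the clone is not mistaken for the
# original machine. With no argument, the documented path is used.
clear_cluster_member_id() {
    rm -f "${1:-/etc/mail/canit/canit-cluster-member-id}"
}
```

Call clear_cluster_member_id as root on the clone, with networking down or CanIt-PRO stopped, before the first normal boot.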


Appendix F

Using CanIt-PRO with memcached

F.1 Introduction

Memcached is a “distributed memory object caching system.” CanIt-PRO can use memcached to cache the results of Verification Server Lookups. In future, it might make more extensive use of memcached to improve performance.

F.2 Using memcached

To use memcached with CanIt-PRO, you need to install memcached and then configure CanIt-PRO to use it.

F.2.1 Installing memcached

On our Debian-based CanIt-PRO appliances, you can install memcached and its client libraries by running the following command as root:

    # apt-get install memcached libcache-memcached-perl php5-memcache

On other platforms, you’ll have to use your system’s package manager to install memcached, the Cache::Memcached Perl module, and the Memcache PHP extension.

F.2.2 Configuring memcached

Configuring memcached is beyond the scope of this manual. Consult the memcached documentation for details on setting the various memcached configuration options.

F.2.3 Single vs. Multiple Caches

On a CanIt-PRO cluster, memcached can be run in one of two basic ways:


1. A single cache. In this case, each cluster member communicates with the same memcached daemon (or set of daemons). There is one cache for the entire cluster.

2. Separate caches. In this case, each cluster member runs its own copy of memcached. Each memcached daemon is used only by the node on which it is running.

Each mode has its advantages and disadvantages. A single cache allows more data to be cached and makes cache hits more likely. However, if a cache node fails, then the entire cluster will be affected. Timeouts when doing cache lookups could negatively affect performance. If you use separate caches, then less data can be cached and cache misses become more common. However, the failure of one node does not affect any other nodes in the cluster. To avoid a single point of failure, therefore, we recommend using separate caches: Each CanIt-PRO cluster member should run its own instance of memcached.

F.2.4 Configuring CanIt-PRO to use memcached

Once you have installed memcached and configured it to run, edit the CanIt-PRO configuration file (typically /etc/mail/canit/canit.conf) and add a [cache] section. This section should look something like this:

    [cache]
    use_cache = yes
    driver = memcached
    servers = 127.0.0.1:11211
    single_cache = no
    cache_valid_recipients = yes
    cache_invalid_recipients = no

The lines have the following meanings:

• use_cache = yes is required to enable caching. Otherwise, CanIt-PRO will not use memcached.

• driver = memcached is required. In the future, other drivers may be supported, but for now, only memcached is.

• servers = server list specifies the memcached servers. It should be a comma-separated list of server descriptions. Each server description is a host name or IP address followed by a colon and the TCP port number on which memcached is listening.

• single_cache = no specifies that each CanIt-PRO cluster member runs (and uses) its own independent copy of memcached. If the entire cluster uses the exact same set of memcached servers, then set single_cache to yes.

• cache_valid_recipients = yes specifies that CanIt-PRO should cache valid recipient results from verification servers. A setting of yes is recommended.


• cache_invalid_recipients = no specifies that CanIt-PRO should not cache invalid recipient results from verification servers. A setting of no is strongly recommended unless you know with absolute certainty that the back-end verification server only ever rejects invalid recipients and never rejects valid recipients. Some back-end servers may reject valid recipients for policy reasons and this could cause CanIt-PRO to incorrectly cache a valid recipient as being invalid.

Once you have edited canit.conf, restart CanIt-PRO.
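When checking a multi-server configuration by hand, it can help to split the servers value into individual host and port pairs, for example to probe each one with a tool of your choice. The sketch below does only the splitting; list_cache_servers is a hypothetical helper, and it assumes plain host:port entries with no spaces (IPv6 literals are not handled).

```shell
#!/bin/sh
# Split a "servers" value (comma-separated host:port pairs) into one
# "host port" pair per line. list_cache_servers is a hypothetical helper.
list_cache_servers() {
    echo "$1" | tr ',' '\n' | while IFS=: read -r host port; do
        if [ -n "$host" ]; then echo "$host $port"; fi
    done
}

# The value from the sample [cache] section.
list_cache_servers "127.0.0.1:11211"
```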

F.3 What is Cached

Currently, CanIt-PRO caches verification server results only. It caches a valid recipient for 24 hours, and an invalid one for one hour. This can reduce the number of times CanIt-PRO needs to connect via SMTP to the verification server, and generally improves the recipient-checking time by a factor of two or more.


Appendix G

Using CanIt-PRO with PgBouncer

G.1 Introduction

PgBouncer is a “Lightweight connection pooler for PostgreSQL” developed by Skype and available at http://pgbouncer.projects.postgresql.org/. PgBouncer is very effective at reducing database load in large CanIt-PRO installations by reducing the number of simultaneous PostgreSQL processes. CanIt-PRO works very well with PgBouncer in “Transaction Pooling Mode”, which makes very effective reuse of existing PostgreSQL processes.

G.2 Installation

Note: We will only describe the installation and operation of PgBouncer with our Debian-based CanIt-PRO appliance build. Although it is possible to use PgBouncer on other systems, this is not officially supported; you’ll have to install and configure PgBouncer yourself on those systems.

To install PgBouncer on a CanIt-PRO appliance, type:

    # apt-get update
    # apt-get install pgbouncer

G.3 Configuration

The PgBouncer configuration files are located in /etc/pgbouncer. The files are:

• userlist.txt: A list of PgBouncer users and how they map to PostgreSQL users.

• pgbouncer.ini: The main PgBouncer configuration file.

If you use PgBouncer, you should run one PgBouncer instance on each CanIt-PRO cluster member.


G.3.1 Configuring userlist.txt

Configuring /etc/pgbouncer/userlist.txt is very easy. It should contain exactly the following content:

    "spam" "spam"
    "postgres" "postgres"

G.3.2 Configuring pgbouncer.ini

There is a sample pgbouncer.ini file installed in /usr/share/canit. You should copy it into /etc/pgbouncer and then edit it. pgbouncer.ini has several sections. They should be configured as follows:

• The [databases] section should point all databases at your database host. If your database host is db.example.org, then there should be a single line in the [databases] section that reads:

    * = host=db.example.org

  That line tells PgBouncer to contact db.example.org for all databases.

• The [pgbouncer] section has a wide variety of settings. The defaults in the sample file are probably fine; if you need to tweak them, see the pgbouncer man page.

The sample configuration file causes pgbouncer to listen for client connections on port 6432.

G.3.3 Configuring CanIt-PRO to use PgBouncer

Once PgBouncer has been installed and configured, you need to tell CanIt-PRO to use it. To do this, edit /etc/mail/canit/canit.conf and change the following settings:

• In the [database] section, set:

db_host=127.0.0.1
db_port=6432

• Also in the [database] section, set:

use_pgbouncer=1

This is important if you use failover; it tells the failover code to edit pgbouncer.ini rather than canit.conf when changing the database server during failover.

Once PgBouncer has been configured, run /etc/init.d/canit-system restart on all cluster nodes.
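Before restarting CanIt-PRO, it can be useful to confirm that something is accepting connections on the address and port configured above. The following is a minimal sketch in Python and is not part of CanIt-PRO; the function name port_is_open is hypothetical, and the host and port simply mirror the values given in this section (127.0.0.1, port 6432).

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (for example "connection refused") if nothing is listening.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 127.0.0.1:6432 is where the sample pgbouncer.ini listens, per this appendix.
    if port_is_open("127.0.0.1", 6432):
        print("A service is accepting connections on 127.0.0.1:6432")
    else:
        print("Nothing is listening on 127.0.0.1:6432; check pgbouncer.ini")
```

A successful connection only shows that some process is listening on the port; it does not prove that the process is PgBouncer or that it can reach the database host.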

Appendix H

CanIt-PRO Logging

H.1 General Information

CanIt-PRO logs messages regarding its operation using syslog. By default, these are logged using the mail syslog facility to keep them together with Sendmail’s logs. This is recommended, but if for some reason you wish to change it, you can do so by modifying the syslog_facility configuration setting in the mimedefang section of /etc/mail/canit/canit.conf.

In general, a CanIt-PRO log entry will consist (after the standard syslog preamble of date, host, process name, and process ID) of the word CanIt: followed by the 14-character Sendmail queue ID (or the text NOQUEUE) followed by another colon. After this comes the message-specific information for that log type.

Several types of log message are generated, at different log levels:

Debugging messages Debugging messages provide very verbose, detailed information regarding the internal workings of CanIt-PRO. These are logged at syslog’s ‘debug’ level, and are turned off by default in shipped versions of CanIt-PRO. You will probably never need to enable debug logging, but if you need to do so, you must edit the CanIt-PRO filter file (/etc/mail/canit/canit-pro-filter) and add the line:

CanIt::Logger::set_debuglevel( CanIt::Logger::DEBUG_ON() );

to the filter_initialize() function, and restart the CanIt-PRO service. After the general log entry information described above, a debug message consists of DEBUG:, the message itself, and then, in parentheses, the line, file, function, and caller information for that debug message.

Note: Enabling debug logging is not recommended on a heavily loaded production server, as the extra syslog traffic will slow things down, and greatly increase the disk space required for your logs.

Regular log messages Regular log messages provide information about the normal operation of CanIt-PRO and are logged at the ‘info’ level.


Event messages Event log messages provide information about the normal operation of CanIt-PRO in a format that is both human-readable and machine-parseable. These are logged at the ‘info’ level.

Warning messages Warning messages indicate that an undesirable, but non-fatal, condition has occurred. These are logged at the ‘warning’ level.

Error messages Error messages indicate that a failure has occurred within CanIt-PRO and should be attended to immediately. These are logged at the ‘error’ level.

H.2 Event Log Format

Event messages are logged in a format designed to be both human-readable and machine-parseable. This format consists of comma-separated key=value pairs, where the key consists entirely of lowercase alphabetic characters, and the value consists of arbitrary text appropriate for that key, with problematic characters such as newlines and commas replaced with a % followed by their two-digit hexadecimal value.

With the exception of what, which always appears first, and subject, which will appear last if present, the key/value pairs cannot be assumed to occupy any specific position in the log line. Depending on where and why the message was logged, different keys will be present. An example log message is:

Jan 01 13:10:31 oxygen mimedefang.pl[9813]: CanIt: j4CHAVtu009864:
what=accepted, nrcpts=1, relay=192.168.10.8, score=2.5,
[email protected], stream=user1, tests=HTML_MESSAGE,
subject=Yes%2C this is an example

(We have wrapped the output for readability; in reality, the log message would appear on a single line.)

Here we see the standard date, time, hostname, process name, and process ID from syslog, the name CanIt:, the Sendmail queue ID for the message being processed, and a number of key-value pairs separated by commas.

The keys that can appear in an “event” log line are:

what This field provides the first indication of what happened to the message. The ‘reason’ and ‘detail’ fields provide further information. Valid values for ‘what’ are:

• accepted: Message was accepted and relayed through. The ‘reason’ field may contain one of: approved, sender-whitelisted, domain-whitelisted, host-whitelisted, unscanned-toobig, skip-spam-scan, opt-out, or no reason at all if none of those cases apply.

• rejected: Message (or sender, or recipient) was rejected and the sending relay was given a 5xx failure code. The ‘reason’ field may contain one of: auto-reject, auto-reject-no-incident, blacklisted-recipient, domain-blacklisted, exe, ext, host-blacklisted, invalid-recipient, mime, mismatch-blacklist, rbl-blacklisted, sender-blacklisted, too-large, or virus.

• tagged: Message was tagged and relayed through. what=tagged log lines will not contain a ‘reason’ field.


• discarded: Message was discarded silently. The ‘reason’ field can be auto-reject, auto-reject-no-incident, exe, ext, mime, or virus.

• pending: Message was quarantined and held for human review.

• greylisted: Message was greylisted with a 4xx code. what=greylisted lines will not contain a ‘reason’ field.

reason This provides secondary information (the “why” to the “what” above) regarding the disposition of an incoming connection. Valid values are:

• approved: Message was manually approved from the spam trap interface.

• auto-reject: Message was rejected. An incident is available and is indicated by the value for the incident key.

• auto-reject-no-incident: Message was automatically rejected due to spam score, and no incident was created.

• blacklisted-recipient: The specified recipient was blacklisted.

• domain-blacklisted: The domain of the sender’s address was blacklisted in the specified stream.

• domain-whitelisted: The domain of the sender’s address was whitelisted in the specified stream.

• exe: The message contained a file with an extension considered executable on Microsoft operating systems. detail will contain the extension name. Note that CanIt-PRO no longer generates the exe reason, but older versions used to. New versions of CanIt-PRO only generate the ext reason.

• ext: The message contained a file with a blocked extension. detail will contain the extension name.

• host-blacklisted: The relay host was blacklisted in the specified stream.

• host-whitelisted: The relay host was whitelisted in the specified stream.

• invalid-recipient: The specified recipient was not valid.

• mime: The message contained a file with a blocked MIME type. detail will contain the actual MIME type found.

• mismatch-blacklist: The message triggered a mismatch rule.

• opt-out: The stream containing this message is configured to opt out of spam scanning.

• rbl-blacklisted: The relay sending this message was blocked by an RBL entry.

• sender-blacklisted: The sender address was blacklisted in the specified stream.

• sender-whitelisted: The sender address was whitelisted in the specified stream.

• skip-spam-scan: The originating relay was in a Known Network marked with “Skip Spam Scan”.

• too-large: The message was rejected because it was over the configured maximum size for messages received. The detail key will contain the actual size of the message.

• unscanned-toobig: The message was not scanned for spam because it was over the configured maximum size for scanning and could not be reduced below that size. The detail key will contain the actual size of the message.


• virus: The message contained a virus payload. The detail key will contain the name of the virus found.

detail This provides further detail if necessary (and available) from certain tests. For example, if what=discarded and reason=virus, the detail key will contain the name of the virus found.

city The name of the city in which the SMTP sending relay is located, if it could be determined.

country_code The two-letter ISO-3166 country code of the country in which the SMTP sending relay is located, if it could be determined.

incident The numeric ID of the incident, if available. An incident ID will be available only if an incident is associated with this message, either because it was created, or because the message matched an existing incident.

resolved_by If an incident was present for this message, this field provides the username of the user responsible for accepting or rejecting the message.

nrcpts The number of recipients for the given message. In general, rather than listing the individual recipients (which, in some cases, could number in the hundreds), we use this key to provide only the number. The exception is when a particular single recipient is affected. In that case, we use the recipient key to log the actual address.

recipient If an envelope recipient is rejected for some reason, the recipient address is logged with this key.

relay The IP address of the sending relay. If parsing of Received: headers is enabled, this contains the address retrieved from the headers. Otherwise, the actual connecting relay IP is logged.

score The score for the message, if scoring rules were applied.

sender The envelope sender of the message.

subject The subject line of the message, if available. This key always appears last in the log message.

stream The name of the stream being applied to the message at the time.

tests A semicolon-separated list of test names (both SpamAssassin and CanIt tests) that triggered for the message.
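Because event lines are meant to be machine-parseable, a short parser can illustrate the format described in this section. The following is a minimal sketch in Python, not a tool shipped with CanIt-PRO; the function name parse_event_line and the sample line are illustrative assumptions. It assumes that pairs are separated by a comma with optional whitespace (as in the example above), and that every %XX sequence in a value is an escape.

```python
import re

# Matches "CanIt: <queue id>: what=..."; the queue ID is either a Sendmail
# queue ID or the literal text NOQUEUE, as described in Section H.1.
_EVENT_RE = re.compile(r"CanIt: (?P<qid>\S+): (?P<pairs>what=.*)$")
_ESCAPE_RE = re.compile(r"%([0-9A-Fa-f]{2})")


def _unescape(value: str) -> str:
    """Turn %XX escapes (for example %2C for a comma) back into characters."""
    return _ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), value)


def parse_event_line(line: str):
    """Return (queue_id, fields) for an event log line, or None if the line
    is not an event message (event messages always begin with what=)."""
    match = _EVENT_RE.search(line.rstrip("\n"))
    if match is None:
        return None
    fields = {}
    # Commas inside values are escaped as %2C, so splitting on "," is safe.
    for pair in match.group("pairs").split(","):
        key, sep, value = pair.strip().partition("=")
        if sep:  # ignore fragments that are not key=value
            fields[key] = _unescape(value)
    return match.group("qid"), fields


# Hypothetical sample line, modelled on the example in this section.
sample = (
    "Jan 01 13:10:31 oxygen mimedefang.pl[9813]: CanIt: j4CHAVtu009864: "
    "what=accepted, nrcpts=1, relay=192.168.10.8, score=2.5, "
    "stream=user1, tests=HTML_MESSAGE, subject=Yes%2C this is an example"
)
queue_id, fields = parse_event_line(sample)
print(queue_id, fields["what"], fields["subject"])
# prints: j4CHAVtu009864 accepted Yes, this is an example
print(fields["tests"].split(";"))  # the tests value is semicolon-separated
```

Unescaping happens only after the line has been split into pairs, so an escaped comma in a subject never produces a spurious key.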

Appendix I

SNMP Agents for CanIt-PRO

I.1 Introduction

SNMP (“Simple Network Management Protocol”) is a protocol for monitoring networks. An SNMP monitoring station typically polls an SNMP Agent via UDP and receives data about the monitored facility. CanIt-PRO includes an SNMP agent that integrates with the Net-SNMP package. (For more information on Net-SNMP, please see http://www.net-snmp.org/.)

Note: This chapter is not a tutorial on SNMP, nor will it tell you how to configure Net-SNMP; we assume you’re familiar with both. Also, we support the SNMP agents only on our Debian-based appliance and our Red Hat Enterprise Linux RPMs; on all other systems, the SNMP agents are supplied on an as-is basis without support.

To use the SNMP agent, ensure that Net-SNMP is installed, and that snmpd is configured and set to start on system boot.

Data returned by an SNMP agent is described in a Management Information Base or MIB. You can download the CanIt-PRO MIB by logging in to the Web interface as the administrator, selecting Setup : Wizards and then clicking Download CanIt SNMP MIB File.

I.2 The SNMP Agent

The SNMP agent included with CanIt-PRO monitors:

1. Sendmail queue sizes and number of processes.

2. MIMEDefang busy/free scanners.

3. Failover status.

The agent provides information under the MIB tree .1.3.6.1.4.1.10055, which corresponds to enterprises.roaringpenguin.


I.2.1 Enabling the agent

To enable the SNMP monitoring agent, add this line to snmpd.conf:

pass_persist .1.3.6.1.4.1.10055 /usr/share/canit/scripts/canit-snmp-agent

Additionally, the cron job /usr/share/canit/scripts/canit-snmp-cron must be set up to run once per minute. On CanIt-PRO appliances, edit the file /etc/cron.d/canit-snmp and uncomment the line. On other platforms, create a cron script that runs /usr/share/canit/scripts/canit-snmp-cron as root once per minute.

I.2.2 Configuring SNMPd

You may need to configure your SNMP daemon to allow connections from your external monitoring services. The following instructions apply to the Appliance Build. For other operating systems, consult the distributor’s documentation or support resources. To configure the SNMP daemon to listen on your external network interface, edit /etc/default/snmpd. Find this line:

SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -I -smux -p /var/run/snmpd.pid 127.0.0.1'

Change it to:

SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -I -smux -p /var/run/snmpd.pid'

To tell the SNMP daemon to allow readonly connections, edit /etc/snmpd/snmpd.conf. Find the section of com2sec definitions and add the following two lines (replace ip.addr with your external monitoring service’s IP address):

com2sec readonly localhost public
com2sec readonly ip.addr/32 public

You may prefer a different COMMUNITY name than public. If so, simply change the last parameter of the latter line. Finally, restart the daemon and test with the following two commands:

/etc/init.d/snmpd restart
snmpwalk -v 1 -c public localhost .1.3.6.1.4.1.10055

Note: If the snmpwalk command doesn’t give any output, wait a minute before trying again. The daemon may take a moment to fully start up, or the cron job may not have run yet.


I.2.3 Agent Data

The Sendmail portion of the agent returns information about the number of entries in the main queue and the submission queue. The meanings of the variables are as follows; “sendmail” is short for .1.3.6.1.4.1.10055.100:

• sendmail.1.1.1.1 — constant integer 1

• sendmail.1.1.2.1 — constant string “Main Queue”

• sendmail.1.1.3.1 — number of messages in Sendmail’s primary queue

• sendmail.1.1.1.2 — constant integer 2

• sendmail.1.1.2.2 — constant string “Submission Queue”

• sendmail.1.1.3.2 — number of messages in Sendmail’s submission queue

• sendmail.1.1.1.3 — constant integer 3

• sendmail.1.1.2.3 — constant string “Sendmail Process Count”

• sendmail.1.1.3.3 — number of sendmail processes running

The MIMEDefang portion of the agent returns information about the number of messages processed in the last 10 seconds, 1 minute, 5 minutes and 10 minutes; the average scan time in milliseconds, and the average number of busy scanners. The meanings of the variables are as follows; “mimedefang” is short for .1.3.6.1.4.1.10055.1:

• mimedefang.1.1.1.1 — constant integer 1

• mimedefang.1.1.1.2 — constant integer 2

• mimedefang.1.1.1.3 — constant integer 3

• mimedefang.1.1.2.1 — constant string “Max slaves”

• mimedefang.1.1.2.2 — constant string “Busy slaves”

• mimedefang.1.1.2.3 — constant string “Free slaves”

• mimedefang.1.1.3.1 — maximum number of scanning processes configured

• mimedefang.1.1.3.2 — number of busy scanning processes

• mimedefang.1.1.3.3 — number of free scanning processes

• mimedefang.2.1.1.1 — constant integer 1

• mimedefang.2.1.1.2 — constant integer 2

• mimedefang.2.1.1.3 — constant integer 3


• mimedefang.2.1.1.4 — constant integer 4

• mimedefang.2.1.2.1 — constant string “10 Seconds”

• mimedefang.2.1.2.2 — constant string “1 Minute”

• mimedefang.2.1.2.3 — constant string “5 Minutes”

• mimedefang.2.1.2.4 — constant string “10 Minutes”

• mimedefang.2.1.3.1 — number of messages scanned in the last 10 seconds

• mimedefang.2.1.3.2 — number of messages scanned in the last 1 minute

• mimedefang.2.1.3.3 — number of messages scanned in the last 5 minutes

• mimedefang.2.1.3.4 — number of messages scanned in the last 10 minutes

• mimedefang.2.1.4.1 — average scan time in milliseconds times 1000 (last 10 seconds)

• mimedefang.2.1.4.2 — average scan time in milliseconds times 1000 (last 1 minute)

• mimedefang.2.1.4.3 — average scan time in milliseconds times 1000 (last 5 minutes)

• mimedefang.2.1.4.4 — average scan time in milliseconds times 1000 (last 10 minutes)

• mimedefang.2.1.5.1 — average busy scanners times 1000 (last 10 seconds)

• mimedefang.2.1.5.2 — average busy scanners times 1000 (last 1 minute)

• mimedefang.2.1.5.3 — average busy scanners times 1000 (last 5 minutes)

• mimedefang.2.1.5.4 — average busy scanners times 1000 (last 10 minutes)

Note: The average scan time and average busy scanners values report an average, which is normally a floating-point value internally. Since SNMP does not have a floating-point type, we multiply the raw value by 1000 so that the information can be reported using an SNMP integer type with an acceptable level of precision.

If you have configured PostgreSQL failover as described in the CanIt-PRO Clustering Guide, you will also be able to retrieve information about the status of PostgreSQL failover. The meanings of the variables are as follows; “failover” is short for .1.3.6.1.4.1.10055.2:

• failover.1 — the type of server; one of “master” or “backup”.

• failover.2 — if 1, then the failover system is OK. If 0, then there are problems that should be investigated.

• failover.3 — a count of errors seen in the PostgreSQL log file. (Meaningful only on the “master” server.)

• failover.4 — a count of WAL files waiting to be consumed on the backup server. Always reported as 0 on the master server.


• failover.5 — the count of WAL files in the pg_xlog directory on the master server. Always reported as 0 on the backup server.

• failover.6 — the time of last base backup as a UNIX timestamp (seconds since 1 January 1970 00:00:00 UTC.) Always reported as 0 on the master server.

• failover.7 — the time the last WAL file was shipped to the backup server as a UNIX timestamp. Always reported as 0 on the master server.
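A monitoring script has to expand the short names used in this section into full OIDs, undo the times-1000 scaling, and convert the failover timestamps. The following is a minimal sketch in Python under those assumptions; the function names expand_oid, unscale and failover_time are hypothetical and are not part of CanIt-PRO or Net-SNMP. Fetching the raw values (for example with snmpwalk) is left out.

```python
from datetime import datetime, timezone

# Base OID for enterprises.roaringpenguin, as given in Section I.2.
BASE_OID = ".1.3.6.1.4.1.10055"

# Short names used in this section and the sub-tree each one stands for.
SUBTREES = {
    "sendmail": BASE_OID + ".100",
    "mimedefang": BASE_OID + ".1",
    "failover": BASE_OID + ".2",
}


def expand_oid(short_name: str) -> str:
    """Expand e.g. 'mimedefang.2.1.4.2' into its full numeric OID."""
    prefix, _, suffix = short_name.partition(".")
    return SUBTREES[prefix] + ("." + suffix if suffix else "")


def unscale(raw_value: int) -> float:
    """Undo the times-1000 scaling applied to the reported averages."""
    return raw_value / 1000.0


def failover_time(raw_value: int):
    """Convert a failover.6 or failover.7 value (a UNIX timestamp) to a UTC
    datetime. A value of 0 is returned as None, since 0 is what the master
    server always reports for these variables."""
    if raw_value == 0:
        return None
    return datetime.fromtimestamp(raw_value, tz=timezone.utc)


print(expand_oid("mimedefang.2.1.4.2"))  # .1.3.6.1.4.1.10055.1.2.1.4.2
print(unscale(152340))                   # 152.34
print(failover_time(1298851200))         # 2011-02-28 00:00:00+00:00
```

Here a raw value of 152340 for mimedefang.2.1.4.2 would correspond to an average scan time of 152.34 milliseconds over the last minute.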

Appendix J

Additional Scripts

CanIt-PRO ships with additional scripts that you may find useful. Please note that these scripts are not officially supported by Roaring Penguin Software Inc.

J.1 reset-password.pl

The script /usr/share/canit/scripts/reset-password.pl lets you reset the administrator password if you forget it. To run the script, simply type:

# /usr/share/canit/scripts/reset-password.pl

and follow the prompts.

Appendix K

Bayes Database Back-Ends

K.1 PostgreSQL Bayes Data Storage

By default, versions of CanIt-PRO prior to 3.2.0 store Bayesian statistics in the PostgreSQL database in a table called bayes. At a large site, Bayesian lookups can cause considerable database traffic and substantial load on the database machine. CanIt-PRO has a mechanism to store Bayesian statistics in CDB database files. These files are local to each scanner. Lookups are extremely fast, and involve no database traffic and no load on the PostgreSQL database. Similarly, updates do not involve the PostgreSQL database, which can greatly improve performance.

Note: As of CanIt-PRO version 3.3.0, the PostgreSQL back-end is no longer supported, and cannot be used.

K.2 Berkeley Database Bayes Storage

Versions 6.0.x and earlier of CanIt-PRO used BerkeleyDB to store Bayesian statistics. In current versions, we now use CDB for the same data. CanIt-PRO can now read both BerkeleyDB and CDB Bayesian statistics files, but will only write CDB files. As such, no conversion step is necessary – all Bayesian statistics will be migrated to CDB storage as new training is performed.

K.3 CDB Database Bayes Storage

CDB storage of Bayes data operates as follows:

• The master database files are stored on the machine running the ticker. Each stream has its own database file under the directory /var/spool/MD-Bayes/DB.

• Bayes training is performed by the ticker. It updates the master CDB database files. If you are running a cluster, the ticker then copies the updated database files to each scanning machine.

As a consequence of the way the CDB database files work, you must be aware of the following:


• You must have sufficient room under /var/spool/MD-Bayes/DB for all of your Bayes data on the ticker machine and on each scanner.

• If you want to back up your Bayes data, you must back up /var/spool/MD-Bayes on the ticker machine as well as backing up the nightly database dump.

• The ticker machine must be able to communicate via SSH to all scanning servers. SSH key setup is performed automatically, so no additional configuration should be necessary beyond the setup of a proper CanIt-PRO cluster (see Section K.4.)

K.4 Cluster Considerations

CDB storage requires the Bayes database files to be present on all of your scanning machines. On a new cluster, this will be taken care of with update propagation (see below). However, if you are adding a new server to an existing cluster, you will need to copy over your data. If you have rsync and ssh installed, the following commands can be used to copy the data over. They should be run as defang on the ticker machine; we assume $SCANNERS is a list of all your newly-added scanners.

for mach in $SCANNERS ; do
    rsync -essh --archive --progress --verbose /var/spool/MD-Bayes/DB \
        $mach:/var/spool/MD-Bayes
done

K.4.1 Propagating Updates

Because the ticker can only update CDB databases locally on the ticker machine, a mechanism is required to copy updated files to all scanning machines. In recent versions (post-6.0.3) of CanIt-PRO, this is performed via the standard cluster communication process using an automatically-generated shared SSH key. Previous versions used sync-berkeley-db and sync-berkeley-db-multi scripts to synchronize the data; these are no longer necessary or supported.

K.5 Switching back to PostgreSQL Bayes Storage

As of CanIt-PRO 3.3.0, it is not possible to switch back to the PostgreSQL storage module for Bayes data.

Appendix L

System Check Tests

CanIt-PRO features an extensive self-test system that checks for common misconfigurations and alerts the administrator if problems are detected. You can see an overview of the self tests on the Setup : System Check page. The tests are as follows:

ApplianceDebianRepositories If this test fails, then your appliance has non-Roaring Penguin repositories in its sources list. We do not recommend this.

ApplianceDebianVersion If this test fails, then your CanIt-PRO appliance cannot determine its Debian version. Contact Roaring Penguin support personnel for assistance.

BaseURLConfigured If this test fails, you have not configured the Base URL of the CanIt-PRO installation. Run through the Basic Setup Wizard (under Setup : Wizards) to correct this.

BayesDatabaseFormat If this test fails, the Bayes storage mechanism from an old CanIt-PRO installation is incorrect. Contact Roaring Penguin support for help.

ClamAVCurrent If this test fails, your ClamAV signatures are out of date. Check the Clam logs to see what might be causing the problem.

ClusterMain databaseHost If this test fails, no cluster member is designated as the main database host. Contact Roaring Penguin support for help.

ClusterScannerHost If this test fails, no cluster member is configured as a scanner. Fix this under Setup : Cluster Management.

ClusterStorageManagerHost If this test fails, the Storage Manager is misconfigured. Run through the Storage Manager Wizard to correct the problem.

ClusterTickerHost If this test fails, no cluster member is configured as a ticker. Fix this under Setup : Cluster Management.

ClusterWebserverHost If this test fails, no cluster member is configured as a Web server. Fix this under Setup : Cluster Management.


CopyToCluster If this test fails, it indicates a problem copying Bayes data from the ticker host to another host in the cluster. Make sure all hosts can communicate with each other over SSH on TCP port 22.

Cron If this test fails, then the nightly cron job has not run recently. You should immediately investigate and take corrective action.

DatabaseDump If this test fails, it indicates that the nightly database dump has failed. You should immediately take corrective action; check the log file canit-cron.log in the directory /var/spool/Canit-Spam-DB-Backup to see what went wrong.

DatabaseVacuum If this test fails, it indicates that the nightly database vacuum has failed. You should immediately take corrective action; check the log file canit-cron.log in the directory /var/spool/Canit-Spam-DB-Backup to see what went wrong.

DeprecatedFiles If this test fails, it indicates there are some obsolete files from an old CanIt-PRO installation. Contact Roaring Penguin support for help.

DeleteOnCluster If this test fails, it indicates a problem deleting Bayes data on a host in the cluster. Make sure all hosts can communicate with each other over SSH on TCP port 22.

Hostname If this test fails, then your host is called localhost. You should give it a real host name.

HostnameNotLoopback If this test fails, then a host’s canonical name resolves to the loopback address (i.e., 127.0.0.1 or ::1). This usually indicates a bad entry in /etc/hosts on the affected machine. You should ensure that /etc/hosts does not contain an entry for the host pointing it at 127.0.0.1 or ::1. Only localhost should point to the loopback address.

LicenseValid If this test fails, your license key is invalid. Contact Roaring Penguin support for help.

MainDatabase If this test fails, there isn’t exactly one machine in the cluster marked as the main database server. Contact Roaring Penguin support for help.

MaxFSMPages If this test fails, PostgreSQL’s max_fsm_pages parameter is too low. You should take immediate corrective action: edit the PostgreSQL postgresql.conf file to increase max_fsm_pages and restart PostgreSQL.

Note: You may need to increase your kernel’s maximum shared memory limit. If PostgreSQL fails to restart after updating max_fsm_pages, revert your change and restart. Correct the kernel.shmmax issue, then retry.

PhishListDownload If this test fails, the phishing list download has failed. Check the reason in the description field to diagnose why the download failed.

PostgresEncoding If this test fails, your spam database uses the wrong encoding. Contact Roaring Penguin support for help.

PostgresVersion If this test fails, your version of PostgreSQL is too old. Contact Roaring Penguin support for help.


PostgresXXX If this test fails, the corresponding PostgreSQL configuration value is too low. Increase it in postgresql.conf and restart PostgreSQL.

RecipientVerification:XXX If this test fails, then there is no mechanism to verify recipients for the given domain. You should enable recipient verification by doing one of the following:

1. Set up a Verification Server (Section 4.4.)

2. Set up a User Lookup method that validates recipients (Section 6.2.)

3. Use the Valid Recipients Table (see the Users’ Guide.)

RPTNBayesDownload If this test fails, an RPTN download has failed. Check the reason in the description field to diagnose why the download failed.

RPTNEnabled If this test fails, you have not enabled RPTN downloads. Run through the RPTN Setup Wizard (under Setup : Wizards) to correct this.

RPTNGeoDownload If this test fails, the geolocation data download has failed. Check the reason in the description field to diagnose why the download failed.

RPTNRulesDownload If this test fails, a ruleset download has failed. Check the reason in the description field to diagnose why the download failed.

RPTNSynchronization If this test fails, then at least one scanner has no RPTN data (or outdated RPTN data). Make sure your RPTN downloads are succeeding and that all cluster members can communicate via SSH over TCP port 22.

StorageManagerConfig If this test fails, you have enabled the Storage Manager, but have not configured any read/write Storage Manager nodes. Run through the Storage Manager Wizard to correct the misconfiguration.

SupportValid If this test fails, your support term is about to expire or has expired. Contact Roaring Penguin’s sales department to extend your support.

TickerTable If this test fails, no ticker tasks are running. Contact Roaring Penguin support for help.

TickerTaskXXX If this test fails, the corresponding ticker task has not run recently. Contact Roaring Penguin support for help.

VirusScannerEnabled If this test fails, then no virus scanners are enabled. You should enable a virus scanner in /etc/mail/canit/virus-scanners.pl.

WebserverDeprecatedFiles If this test fails, there are obsolete files in the CanIt-PRO web directory. Contact Roaring Penguin support for help.
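Some of these conditions can also be checked by hand. As an illustration of the HostnameNotLoopback condition described above, the following minimal Python sketch reports whether a name resolves to a loopback address. It is not the code CanIt-PRO itself uses, which is not described in this guide; the function name resolves_to_loopback is hypothetical, and using socket.getfqdn() to obtain the local host's name is an assumption.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname: str) -> bool:
    """Return True if any address the name resolves to is a loopback
    address (127.0.0.0/8 or ::1)."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False  # the name does not resolve at all
    for info in infos:
        # info[4] is the socket address; its first element is the IP string.
        address = info[4][0].split("%", 1)[0]  # drop any IPv6 zone index
        if ipaddress.ip_address(address).is_loopback:
            return True
    return False


if __name__ == "__main__":
    name = socket.getfqdn()
    if resolves_to_loopback(name):
        print(name + " resolves to a loopback address; check /etc/hosts")
    else:
        print(name + " does not resolve to a loopback address")
```

If the script reports a loopback address for anything other than localhost, the /etc/hosts entry described under HostnameNotLoopback is the first thing to inspect.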

Appendix M

The CanIt-PRO License

READ THIS LICENSE CAREFULLY. IT SPECIFIES THE TERMS AND CONDITIONS UNDER WHICH YOU CAN USE CANIT-PRO.

This license may be revised from time to time; any given release of CanIt-PRO is licensed under the license version which accompanied that release.

CanIt-PRO is distributed in source code form, but it is not Free Software or Open-Source Software. Some CanIt-PRO components are Free Software or Open-Source, and we detail them below:

The following files may be redistributed according to the licenses listed here. An asterisk (*) in a file name signifies a version number; the actual file will have a number in place of the asterisk.

File                              License
src/Archive-Tar-*.tar             Perl License
src/Config-Tiny-*.tar             Perl License
src/DBD-Pg-*.tar                  Perl License
src/DBI-*.tar                     Perl License
src/Data-ResultSet-*.tar          Perl License
src/Data-UUID-*.tar               Perl License
src/Digest-MD5-*.tar              Perl License
src/Digest-SHA1-*.tar             Perl License
src/File-Spec-*.tar               Perl License
src/File-Temp-*.tar               Perl License
src/HTML-Parser-*.tar             Perl License
src/HTML-Tagset-*.tar             Perl License
src/IO-Zlib-*.tar                 Perl License
src/IO-stringy-*.tar              Perl License
src/Log-Syslog-Abstract-*.tar     Perl License
src/MIME-Base64-*.tar             Perl License
src/MIME-tools-*.tar              Perl License
src/Mail-SPF-Query-*.tar          Perl License
src/Mail-SpamAssassin-*.tar       Apache License, Version 2.0
src/MailTools-*.tar               Perl License
src/Module-Pluggable-Tiny-*.tar   Perl License
src/Net-CIDR-Lite-*.tar           Perl License
src/Net-DNS-*.tar                 Perl License
src/Net-IP-*.tar                  Perl License
src/Time-HiRes-*.tar              Perl License
src/TimeDate-*.tar                Perl License
src/URI-*.tar                     Perl License
src/YAML-Syck-*.tar               Perl License
src/clamav-*.tar                  GPLv2
src/libwww-perl-*.tar             Perl License
src/mimedefang-*.tar              GPLv2

ALL REMAINING FILES IN THIS ARCHIVE (referred to as “CanIt-PRO”) ARE DISTRIBUTED UNDER THE TERMS OF THE CANIT LICENSE, WHICH FOLLOWS:

THE CANIT LICENSE

1. CanIt-PRO is the property of Roaring Penguin Software Inc. (“Roaring Penguin”). This license gives you the right to use CanIt-PRO, but does not transfer ownership of the intellectual property to you.

2. CanIt-PRO is licensed with a limit on the number of allowable protected domains or mailboxes. This limit is called “the Usage Limit”. CanIt-PRO usage may be purchased on a yearly basis, or you may purchase a perpetual license.

3. You may use CanIt-PRO up to the Usage Limit you have purchased. If you have purchased yearly usage, you may continue to use CanIt-PRO until your purchased usage time expires, unless you purchase additional time. If you have purchased a perpetual license, you may continue to use CanIt-PRO indefinitely, providing you do not violate this license. If you have purchased yearly usage, you may exceed your purchased limit by up to 10% until the yearly renewal date, at which time you must purchase a sufficient limit for the increased number of domains or mailboxes. If you have purchased a perpetual license, or wish to increase your usage more than 10% above your paid-up limit, you must purchase the additional usage within 60 days of the increase.

4. You may examine the CanIt-PRO source code for education purposes and to conduct security audits. You may hire third-parties to audit the code providing you first obtain permission from Roaring Penguin. Such permission will generally be granted providing the third-party signs a non-disclosure agreement with Roaring Penguin.

5. You may modify the CanIt-PRO source code for your own internal use, subject to the restrictions in Paragraph 9 below. However, if you do so, you agree that Roaring Penguin is released from any obligation to provide technical support for the modified software. If you wish your modifications to be incorporated into the mainstream CanIt-PRO release, you agree to transfer ownership of your changes to Roaring Penguin.


6. You may make backups of CanIt-PRO as required for the prudent operation of your enterprise.

7. You may not redistribute CanIt-PRO in source or object form, nor may you redistribute modified copies of CanIt-PRO or products derived from CanIt-PRO.

8. If you violate this license, your right to use CanIt-PRO terminates immediately, and you agree to remove CanIt-PRO from all of your servers.

9. Restrictions on modification:

   (a) Notwithstanding Paragraph 5, you may not make changes to CanIt-PRO or your software environment which would allow CanIt-PRO to run without a valid License Key as issued by Roaring Penguin. You also agree not to set back the time on your server to artificially extend the validity of a License Key, or do anything else which would artificially extend the validity of a License Key.

   (b) You may modify the Web-based interface only providing you adhere to the following restrictions:

   (c) At the bottom of every CanIt-PRO web page, the following text shall appear, in a size, color and font which are clearly legible:

       Powered by CanIt-PRO (Version x.y.z) from Roaring Penguin Software Inc.

       where x.y.z is the product version. In addition, “CanIt-PRO” shall be a clearly-marked hypertext link to http://www.roaringpenguin.com/powered-by-canit.php

   (d) You may not include elements on the CanIt-PRO Web interface that require plug-ins (such as, but not limited to, Macromedia Flash, RealPlayer, etc.) to function.

   (e) You may not include Java applets on the CanIt-PRO Web interface.

   (f) If you include JavaScript on the Web interface, you shall ensure that the interface functions substantially unimpaired in a browser with JavaScript disabled.

   (g) You shall not include browser-specific elements on the Web interface. You shall ensure that the Web interface functions substantially unimpaired on the latest versions of the following browsers:

       • Internet Explorer for Windows
       • Mozilla for Windows
       • Mozilla for Linux
       • Konqueror for Linux

   (h) You may not include banner ads on the CanIt-PRO Web interface.

10. Restrictions on reselling services: Unless you purchased CanIt-PRO as a service provider on the ISP rate plan, you may not use CanIt-PRO to provide spam-scanning services to third parties. You may use CanIt-PRO only for your employees and contractors accounts on your own corporate servers.

11. Disclaimer of Warranty (Virus-Scanning)

    NOTE: ALTHOUGH CANIT-PRO IS DISTRIBUTED WITH CLAM ANTIVIRUS, WE DO NOT MAKE ANY REPRESENTATIONS AS TO ITS EFFECTIVENESS AT STOPPING VIRUSES. ROARING PENGUIN HEREBY DISCLAIMS ALL WARRANTY ON THE ANTI-VIRUS CODE INCLUDED WITH CANIT-PRO, OR WHICH INTERFACES TO CANIT-PRO. WE ARE NOT RESPONSIBLE FOR ANY VIRUSES THAT MIGHT EVADE A VIRUS-SCANNER INTEGRATED WITH CANIT-PRO.

M.1 THE CANIT DATA LICENSE

Roaring Penguin makes available certain data that are used by CanIt. This license covers the RPTN Bayes data and the Roaring Penguin RBLs. The data are owned by Roaring Penguin and their use is licensed under the following terms:

1. You may update the RPTN data once per day per Roaring Penguin download username. Roaring Penguin reserves the right to cut off downloads if more than one download per day per username is attempted.

2. You may use the RPTN data only in conjunction with your properly-licensed CanIt installation.

3. You may not redistribute the RPTN data.

4. If your support term expires, you lose the right to use RPTN data for any purpose whatsoever.

5. You may make use of the Roaring Penguin RBLs from within CanIt. You may not query them with any other software.

6. You may use the Roaring Penguin RBLs only in conjunction with your properly-licensed CanIt installation.

7. You may not redistribute the Roaring Penguin RBL data.

8. If your support term expires, you lose the right to use the Roaring Penguin RBLs.
