Hot Topics for Web Services Research

International Journal of Web Services Research , Vol.X, No.X, 200X

WSEmail: Secure Internet Messaging Based on Web Services

Kevin D. Lux, Michael J. May, and Carl A. Gunter University of Pennsylvania University of Illinois at Urbana-Champaign [email protected], [email protected], [email protected]

ABSTRACT:

Web services offer an opportunity to redesign a variety of older systems to exploit the advantages of a flexible, extensible, secure set of standards. In this paper we explore the objective of improving Internet messaging (email) by redesigning it as a family of web services, an approach we call WSEmail. We illustrate an architecture and describe some applications. Since increased flexibility often mitigates against security and performance, we focus on steps for proving security properties and measuring the performance of our system with its security operations. In particular, we demonstrate an automated proof using TulaFale and ProVerif of a correspondence theorem for an application called on-demand attachments. We also provide performance measures for the basic WSEmail functions in a prototype we have implemented using .NET. Our experiments show a latency of about a quarter of a second per transaction under load.

KEY WORDS: Keywords INTRODUCTION

The advent of web services has created the foundation for highly interoperable distributed systems to communicate over the Internet using standardized protocols and security mechanisms. With the foundation now available, older designs and protocols should be reevaluated to see whether they can benefit from the new architecture, standards, and tools. As a case study of such an analysis and redesign, we present WSEmail, electronic mail redesigned as a family of web services.

Internet electronic mail (email) is based on a collection of protocols—SMTP, POP, IMAP, S/MIME—that have evolved over a vast installed base. The resulting system has shortcomings in the areas of flexibility, security, and integration with other messaging systems. For instance, problems of authentication and extensibility have plagued attempts to reduce spam, while poor integration with browsers has led to a rash of spoofing attacks (“phishing”). There are many ideas about improving email that can be addressed by new design ideas; a listing of a number of these limitations can be found on the mail-ng list of the Internet Mail Consortium (IMC) (www.imc.org). To respond to these shortcomings and pursue new opportunities, we have investigated replacing the existing protocols with protocols based on SOAP, WSDL, XMLDSIG and other XML-based formats. The new protocols are inherently extensible, promise to give stronger guarantees for message authentication, and are amenable to formal modeling and proof.

WSEmail is designed to perform the functions of ordinary email but also enable additional security functions and more flexibility. The primary strategy is to import these virtues from the standards and development platforms for web services. Our exploration of WSEmail is based on a

1 International Journal of Web Services Research , Vol.X, No.X, 200X prototype architecture and implementation. WSEmail messages are SOAP messages that use web service security features to support integrity, authentication, and access control for both end-to- end and hop-by-hop message transmissions. The WSEmail platform supports the dynamic updating of messaging protocols on both client Mail User Agents (MUAs) and server Mail Transfer Agents (MTAs) to enable custom communications. This flexibility supports the introduction of new security protocols, richer message routing (such as routing based on the semantics of a message), and close integration with diverse forms of communication such as instant messaging.

The benefits of flexibility can be validated by showing diverse applications. However, flexibility often has a high cost for security and performance. This paper focuses on techniques to measure and mitigate these costs for WSEmail. We make two contributions. First, we carry out a case study of a formal analysis of an application called on-demand attachments, in which email with an attachment leaves the attachment on the sender’s server rather than placing it on the servers of the recipients. The challenge is to design the associated security for the attachment based on emerging federated identity systems. Second, we carry out a set of experiments intended to determine the efficiency of our base system, including its security operations. Both of these studies demonstrate promise for security and performance for web services in general and WSEmail in particular.

The paper is organized in seven sections. Following this introductory section we sketch the architecture of WSEmail focusing on its security assumptions and then continue with a discussion about how plugin-ins function. In the fourth section we discuss applications we have explored with WSEmail focusing on on-demand attachments. In the fifth section we describe on-demand attachments theoretically and argue that basing email on web services can aid the application of advances in security analysis to new messaging protocols. In the sixth section we discuss our implementation and its performance. The seventh section discusses related work. The eighth section concludes. Interested readers can find more information on our project web page at wsemail.ws. BASE ARCHITECTURE

The baseline protocols for WSEmail are illustrated in Figure 1. In the common case, similar to an SMTP message, an MUA Sender Client SC1 makes a call on its MTA Sender Server SS to send a message M1. This and other calls are SOAP calls over TCP; the message M1 is in the body of the SOAP message and the SOAP header contains information like the type of call and security parameters. The message itself is structured as a collection of XML elements, including, for instance, a subject header. A sample trace of WSEmail messages can be found at wsemail.ws/messages.html. After receiving the call from SC1, the server SS makes a call on the Receiver Server RS to deliver the mail from the Sender Domain SD into the Receiver Domain RD. The Receiver Client RC makes calls to RS to inquire about new messages or download message bodies. In particular, RC makes a call to RS to obtain message headers and then requests the message M1 if it wishes.

2 International Journal of Web Services Research , Vol.X, No.X, 200X

Figure 1. Messaging architecture

Our design is based on a three-tier authentication system combined with an extensible system of federated identities. The first tier provides user (MUA) authentication based on passwords, public keys, or federated identity tokens. The second tier provides server (MTA) authentication based on public keys with certificates similar to those used for TLS. The third tier uses root certificates similar to the ones in browsers. Overall, this addresses interdomain authentication in a practical way at the cost of full end-to-end confidentiality. Confidentiality is preserved between hops by TLS or another tunnel protocol. In a basic instance, the message from SC1 to RC will be given an XMLDSIG signature by SS that is checked by both RS and RC.

The novel aspects of our WSEmail architecture are in the integration and flexibility of the MUA authentication and the ability of both MUAs and MTAs to add new security functions dynamically. To illustrate a variation in the basic protocol, consider our design for instant messaging. Referring again to Figure 1, an instant message M2 dispatched from a client SC2 to RC while SC2 is outside its home domain SD. In this case SC2 contacts SS to obtain a security token T that will be recognized by RS. Once this is obtained, SC sends M2 authenticated with this credential to RS and indicates (in a SOAP header) that it should be treated as an instant message by RS and RC. Instant messages are posted directly to the client, with the client now viewed as a server that accepts the instant message call. RS and RC are able to apply access control for this function based on the security token from SC. This token is recognized because of a prior arrangement between SS and RS.

The WSEmail MUA and MTA are based on a plugin architecture capable of dynamic extensions. Security for such extensions is provided though a policy for trusted sources and the enforcement mechanisms provided by web services. On-demand attachments are an example of such a plug-in, as are a variety of kinds of attachments with special semantics. A party that sends a message with such an attachment automatically includes information for the receiver on where to obtain the software necessary to process the attachment. The client provides hooks for plug-ins to access security tokens, after first performing an access control check on the plug-in. A figure illustrating the MUA (client) components is available at wsemail.ws/client.html and screen shots of the GUI can be seen at wsemail.ws/screenshots.html. A figure illustrating the server (MTA) components is available at wsemail.ws/server.html. PLUG-IN ARCHITECTURE

Before delving into the applications, it might be helpful to describe a bit about how WSEmail is currently designed and how it is able to adapt to various applications. In the application section, each application will make reference to the kinds of interfaces it uses. These interfaces describe

3 International Journal of Web Services Research , Vol.X, No.X, 200X how much access the application has to the mail server or client software, in addition to where the plug-in is activated to affect message delivery or reading.

Server-side Plug-ins

Server plug-ins are just libraries of code that conform to certain interfaces known to the server. During server initialization, the server goes through the list of plug-ins it is to load from its configuration file. For each plug-in, a specific object class is listed along with the name of a library from which it may be loaded. The server attempts to find the library and instantiate the object. If it successfully loads the plug-in, the server will request further configuration data from the plug-in and use this information to place the plug-in in the appropriate execution queue. The process of loading or unloading plug-ins is entirely dynamic and can happen at any time. In addition, the execution queues can be reprioritized or disabled while the server is running.

IDataAccessor

Incoming Arbitrary WSEmail ISendingProcessor IDeliveryProcessor IExtensionProcessor Messages

Figure 2. An overview of server side plugins

All plug-ins must implement the IServerPlugin interface, which allows the server to understand the purpose of the plug-in and add the plug-in to the appropriate processing queues. Beyond that, there are two main classes of plug-ins in the current implementation of WSEmail: message- dependent and RPC-like (or message-independent). Message dependent plug-ins are those plug- ins that depend on a message to be present to execute. They can be inserted in various places in the delivery cycle including the initial receipt of a message (ISendingProcessor) or the final destination of a message (IDeliveryProcessor). To create an analogy to current systems, ISendingProcessor is similar to any processing done by sendmail (such as verifying relay permissions, stripping oversized attachments, etc). IDeliveryProcessor is more like user-space programs such as procmail or a vacation messaging scripts. A diagram depicting the interactions and data flow of incoming WSEmail and extension requests is show in Figure 2.

The RPC-like interface is a generic “catch-all” interface (IExtensionProcessor). Like the IDeliveryProcessor and ISendingProcessor, it too inherits from the IServerPlugin interface. Upon initializing, the plug-in implementing the interface provides the server with an “extension identifier”. This identifier is used by the server route incoming requests to the appropriate plug- in. There is no required or defined structure for the actual requests and most plug-ins tend to view requests as XML documents or fragments. This provides a lot of flexibility in terms of the data an application can process.

All requests that go to any kind of plug-in are first authenticated by the server. In the case of IExtensionProcessor and IDeliveryProcessor plug-ins, an “environment” object is created and

4 International Journal of Web Services Research , Vol.X, No.X, 200X passed to the plug-in to allow access to authentication tokens and raw XML streams directly from the server. Other plug-ins are generally only given the message that triggered their execution.

Figure 3. Example server plug-ins and their classifications

Plug-ins can implement more than one interface, which allows increased functionality in one piece of code. It is possible for a plug-in to implement both the IExtensionProcessor and the IDeliveryProcessor (or any variant) interfaces. This would allow the plug-in to interact with messages as they are being delivered, but also to be configurable using some protocol that interacts with the IExtensionProcessor interface. Implementing multiple interfaces allows plug- ins to share data that should be accessible though multiple paths. Examples of where composite plug-ins (a plug-in implementing multiple interfaces) are useful will be given in the applications section.

Other server plug-ins exist in WSEmail that would be expected for an enterprise application such as data store access (IDataAccessor), database connection management (IDatabaseManager) message queues (IMailQueue) and the local delivery (ILocalMTA).

Client-side Plug-ins

In the current implementation, client-side plug-ins only affect message reading and are much more dynamic in nature than server-side plug-ins. Similar to server-side plug-ins, client-side plug-ins are libraries of code, but they extend an abstract class (DynamicForms.BaseObject) instead of implementing an interface, mostly for convenience of programming. They also contain more information than their server-side components including version and network location information. With this additional information, the plug-in can create messages that can be received by other clients without the plug-in. A common stub can then be executed to download the plug-in using its supplied location information. After presenting the user with the chance to authenticate the downloaded code (currently Authenticode is used), it can be dynamically loaded and then executed. The plug-in information (version, name and library location) is also saved in a registry that will allow the new recipient to use the same plug-in at a later point, assuming the plug-in allows them to.

Plugin Plugin is Registry is instantiated queried. and run.

User User creates Plugin data Message decides to a new serialized in sent to attach a message. to message. recipient(s). “form”.

Figure 4. Client-side plug-ins from the sender's perspective

The originator of a message creates a new message and adds a “form”, picking a specific form from their registry. This form will, usually, present them with a UI that requires information. After filling out the form, the data contained in is then serialized down to an XML document that is “attached” to the message. The original message is notified by the plug-in where message should be sent next for the message to be appropriately processed. The plug-in is unloaded and a flag is set on the message header that indicates the presence of a form. This allows a user to see which messages in their inbox contain a form by viewing the header information and without downloading all message content. Figure 4 illustrates the process.

Deserialize instance data

Obtain

Available?

Plugin Plugin is Registry is instantiated queried. and shown.

Message Message User User updated, if forwarded, if receives a decides to necessary. necessary. form. view.

Figure 5. Client-side plug-ins from the recipient's perspective

From the recipient’s perspective, they will see the new message appear in their inbox with an annotation indicating that the message contains an attached form. They can view the message normally, but will now have the option to view any attached forms. Upon viewing the form, the mail client will attempt to load the plug-in or obtain it if it is not present on the system, using information contained in the recipient’s plug-in registry and the information contained within the form. After a successful plug-in load, the XML document containing the payload of the form is pushed to the plug-in, which deserializes the data and loads whatever state is necessary. The plug- in’s logic then takes control, usually displaying a graphical interface. Figure 5 demonstrates the typical flow for the recipient of a message with a form attached.

Figure 6. Sample application (a timesheet)

A typical question is “what do client-side plug-in user interfaces look like to the user”. There is not a complete answer to such a question, as the answer is up to the plug-in designer. A plug-in can have no interface at all, a few message boxes or a full-blown GUI. Since WSEmail is based upon the .Net framework, plug-ins are able to leverage the rich UI elements that can be in a graphical .Net application, without the need to transfer libraries for rendering graphics, assuming the plug-in designer uses built-in .Net objects. Plug-ins also have access to authentication information in the mail client and may petition the user for access to their federated token. This allows plug-ins to perform secure web service calls to automatically fill in information or perform other functions.

For convenience and to limit bandwidth used, multiple plug-ins can be contained within one library file. Each plug-in will be enumerated (via .Net’s reflection libraries) and registered with the client application. This allows system administrators to deploy one library with updated plug- ins instead of worrying about each and every plug-in separately.

APPLICATIONS

As previously mentioned, WSEmail has created some applications for demonstration purposes. In this section, a few of the applications will be examined for usefulness, integration and “building block” potential. This survey of WSEmail applications should provide the reader with a better idea of how WSEmail extensible is and how modifications are made to the system to support new features.

Instant Messaging

Since its introduction, instant messaging has taken off. The convenience of being able to have a pseudo real-time conversation with a peer (or a group) is obvious. It eliminates sending short email messages back and forth, instead allowing users to view the exchange as a conversation that has a much smaller lag. Despite these features, there are a few properties of instant messaging that cause quite an irritation among users. Chief among these problems is that there is a distinct handful of chat networks, each with their own protocols and each (generally) incapable of dispatching messages to other networks. This leads to users being unable to communicate even though they both have chat software, simply because they are not on the same network. It burdens the user because they must create identities for the most popular chat mediums and have a client for each (or a meta client) in order to correspond with most people.

This is extremely unfortunate. Instant messaging is truly just like email, except with slightly different delivery mechanisms and implied higher priority delivery (over non “instant” messages).

Forwards Send message instant message normally. to recipient.

Internet 3. Deliver Instant Messages

2. Create Instant Message 1. Register Location Sender Recipient Figure 7. Instant messaging overview

The current implementation of WSEmail has a composite plug-in that supports instant messaging. It is made up of an IDeliveryProcessor and an IExtensionProcessor implementation. Upon starting the instant messaging client program, the user automatically registers their location with the server using the IExtensionProcessor interface. The plug-in records the location information of the user in an internal table. Later, when the server receives a message flagged as instant, the server passes delivery control of the message to the instant messaging plug-in using the IDeliveryProcessor interface. The plug-in simply consults its table of user locations and, if it finds a match, sends the message directly to the client. If a match is not found, the plug-in can choose to relinquish control of the message, passing it back to the server. The server will then attempt delivery using the next matching plug-in.

Instant messages are currently sent to clients using a Microsoft .Net technology known as Remoting. Similar to Java’s RMI, Remoting allows an object to be distributed across a network. Clients remote their instant message queues and have a separate thread watch it. As the server pushes messages into the queue, the client watcher thread will pull them out and coordinate the display of them into the typical conversation like interface.

After a conversation has been established, the users have the option of upgrading the security of the channel they are using. An additional server-side plug-in supports coordinating the shift from WSEmail messages marked as instant messages and transmitted in the clear to a TLS secured “party line”. At a user’s request, the plug-in will allocate a TCP port running TLS and notify all participants in the conversation of the available channel. There is no inherent requirement that the TLS ports be opened on the mail server itself, so it is possible for the mail server to simple act as a broker and pass the connection request on to a secure chat server farm. The recipients of the invitations are given a choice of whether they want to accept. Upon connecting to the secure chat server, the client will be presented with the server’s X.509 certificate and will be asked to present their own certificate for authentication purposes.

The secure channel instant message brokering provides some valuable benefits to the users. They are able to create secure channels with arbitrary groups of people at will. Since the messaging now flows over the TLS channel, they do not have the lag of the composition and forwarding of a WSEmail message. If the users have an existing X.509 architecture, they are easily able to authenticate users. Of course, if they lack a shared X.509 architecture, everyone (including the server) will need manual certificate trusts setup, which can be a prohibitive amount of work depending on the number of users and sites involved.

Business Workflows

Email is the typical venue for all kinds of request processing and tracking. It provides a free-form means of accomplishing work both in and out of the enterprise. The problems with such a system are obvious, including the inability for automated systems to easily process messages due to a lack of structure and, usually, a lack of authentication and tracking. The lack of automation is a major hindrance because it requires humans to decide where the message should go next. If the message is sent to the wrong person or to someone on vacation, business processes can be delayed needlessly.

WSEmail has the idea of routed forms to introduce workflow-type scenarios to the platform. A routed form will typically use the client-side plug-ins mentioned previously to create a rich UI. These forms generally look very similar to their paper counterparts such as time sheets or requisition forms.

The sender follows the flow of activities described in client-side plug-in section. In our prototype workflow plug-ins, the forms are much smarter than their paper counterparts. They can, for example, provide basic spreadsheet-type functionality or automatically populate data using the user’s federated token and a secure web service query to an HR database. To address the security of the workflow, we assume each user has a unique X.509 certification such that the certificate’s common name (CN) matches the user’s email address. As approvals are acquired, the XML data contained in the form is signed by each approver. This creates an approval list that can be verified and audited by a third party or program, which demonstrates that the data has not been tampered with and can authenticate each member in the workflow.

A user is able to delegate their responsibility in a workflow. They provide the name or names of people who will approve the request in their stead. This feature provides a powerful automation feature. Using server-side plug-ins, a workflow can be received by a program, which can make decisions about delegation given the current approval list and data contained within the form. A common business process that might require such a feature would be a requisition form. A department might have the ability to buy items under a certain price. If the total requisition is greater than a certain number an additional approval by a member of the purchasing department might be required. A program could easily detect this condition and expedite the entire process by creating delegations.

If so desired, a workflow in our system could end in another program. This program could validate the entire workflow and, if approved, actually order items, perform database manipulations, etc. With an increasing number of online retailers exposing their order process as a web service, it becomes possible to automate a larger number of business functions end-to-end within a common and well known application such as email.

Additional Request resumes approvals original path are added

User generates request `

Complete and approved request ` ` processed

Figure 8. Sample workflow scenario

Because WSEmail is also functional as a decentralized system, a workflow is capable of extending across multiple enterprises. Although all the enterprises would need to have an agreement to trust a common certificate authority (CA) or cross-trust each other, that is the only additional configuration that is needed. Assuming the enterprises were capable of communicating before, they can now use workflows on WSEmail to create multi-enterprise processes. There are many applications that could use such a setup such a negotiating prices with a supplier or gathering approvals for press releases from interdependent corporations. Since the data is contained within the actual email message another common problem of who hosts the data and applications for the exchange is eliminated. The data simply goes where ever it is required.

On-Demand Attachments

Attaching files to email has long been a poor man’s way of distributing files to a group. It’s convenient enough and accessible to anyone with an email client. Unfortunately, there are a variety of problems with current email attachments that WSEmail can improve on. Except for certain proprietary email systems, there is no version control system for email attachments, which usually results in costly message resends, so that each person gets the newest version of the files.

The way attachments are currently bundled also requires users to download the entire message and attachment, even if the user only wants to read the message. A solution is to post the attachment on a secured website and just send a link in the message. Unfortunately, this creates an administrative headache, as each message would likely have different access control requirements and requires servers to be setup.

Figure 9. On-demand attachments protocol

WSEmail solves the problem by introducing the concept of “on-demand” attachments. Simply put, message attachments are handled as a plug-in to the WSEmail base protocols. The plug-in implements both the IExtensionProcessor and ISendingProcessor interfaces. The client creates a message that contains information about the attachment such as its size, description and a SHA-1 hash, in addition to the normal fields a message contains. The request to the server contains the message as normal, along with the attachments in DIME format. As the message is received by the server, the request is intercepted by the “on-demand” plug-in using the ISendingProcessor interface. The plug-in is able to gain access to the DIME attachments in the requests. These attachments are stripped out of the request are saved to a database, along with a list of all the recipients. Globally unique identifier are generated and injected in to the original message such that each GUID relates to one of the DIME attachments. Delivery of the message now continues as it would normally.

A user, who receives a copy of the message, will see that a file is attached but will not actually have a copy of the file. The user simply has the GUID for the file and the location from which it can be obtained. If the user decides to obtain the attachment, they first acquire a federated identity token. The token is presented using the IExtensionProcessor interface to the server that originally

11 International Journal of Web Services Research , Vol.X, No.X, 200X stripped the attachment, along with the unique identifier. The server verifies the authenticity of the token and that the supplied GUID and token’s owner is permitted access. If there is a successful match, the server sends the attachment back to the requestor as a DIME attachment.

On-demand attachments will form the basis for the security analysis of WSEmail. THEORY

Trends in the analysis of security protocols will aid the development of WSEmail security by supporting the analysis of web services generally [3] (securing.ws) and email specifically [1]. To illustrate this, we describe an on-demand attachments protocol in Figure 9 and state a formally verified correctness property. We assume familiarity with digital certificates (Γ) and hope the notation is sufficiently self explanatory. Let S be a public key signature function and H be a one way hash.

Definition: (Public Key Authentication Using a Timed Nonce) We write A → B : M (pkey Γ, r, t) to indicate that A sends to B a message of the form: A | M | r | t | S(priv(Γ),H(A | M |r | t)).

Here t is the current time according to the clock of A and r is a random number selected by A. The principal B processes this message by checking the following conditions in this order: the time t is not older than a given threshold; the validity interval of Γ includes the current time; the nonce r is not in the replay cache of B; the signature checks using public key in Γ; the certificate Γ is trusted. If any of these fails then the remaining steps are omitted and the message is discarded. If all of the conditions succeed, then r is added to the replay cache with an expiration time determined by a given threshold. In this case the message is said to be valid.

Definition: (Salted Password Authentication) We write A → B : M (pswd P, r, t) if A sends B a message of the following form A | M | r | t | MAC(P,A | M | r | t).

Here t is the current time according to the clock of A and r is a random number selected by A. The principal B processes this message by checking the following conditions in this order: the time t is not older than a given threshold; the nonce r is not in the replay cache of B; the MAC is correct for the password associated with A. If any of these fails then the remaining steps are omitted and the message is discarded. If all of the conditions succeed, then r is added to the replay cache with an expiration time determined by a given threshold. In this case the message is said to be valid.

Protocol: On-Demand Attachments

Initiation: Distribution procedures are used to ensure that the following passwords and keys are known only to the specified principals: principals SC and SS share a password PSC; principal SS has the private key for a certificate ΓSS that is trusted by the other principals; principal RS has the private key for a certificate ΓRS that is trusted by the other principals; principals RS and RC share a password PRC. Principal SC creates a message M that includes the return address SC and an attachment N and sends:

Message 1: SC → SS : SS | (RC | RS) | M | N (pswd PSC, r1, t1) If SS gets a message of this form, it checks this using the password PSC. If the message is valid, SS selects a locally unique reference N*, stores the attachment N with this reference and the name SC and sends:

Message 2: SS → RS : RS | (RC | RS) | M | (SS |U*) (pkey ΓSS, r2, t2) If RS gets a message of this form, it checks this using priv(ΓSS). If the message is valid, RS selects a locally unique reference V*, stores M | (SS | U*) with this reference and the name SC and sends:

Message 3: RS → RC : RC | (RC | RS) | V* (pkey ΓRS, r3, t3) If RC gets a message of this form, it checks this using priv(ΓRS). If the message is valid, RC may send the following request:

Message 4: RC → RS : RS | V *? (pswd PRS, r4, t4) If RS gets a message of this form, it first checks whether V* is associated with RC. If it is and the message is valid based on the password of RC, then it sends the following message:

Message 5: RS → RC : RC | V *| (SS | U*) (pkey ΓRS, r5, t5) If RC receives a message of this form, it checks this using ΓRS and may choose to retrieve the attachment N. The reference V*is included to ensure that this is the response to Msg 4. To do this it needs a token to authenticate to SS. If it does not have one it may request this by a key pair with K as its public key and sending:

Message 6: RC → RS : RS | token(K)? (pswd PRC, r6, t6) When RS receives a message of this form, it checks validity and creates a certificate Γ RC for use by RC and sends:

Message 7: RS → RC : RC | ΓRC When RC receives a message of this form, it checks and stores the certificate for use in accordance with its lifetime. It may choose to obtain the attachment by sending:

Message 8: RC → SS : SS | U*? (pkey ΓRC, r7, t7) When SS receives a message of this form, it checks validity using its code for credentials from RS and confirms that the reference U*is associated with RC. It sends:

Message 9: SS → RC : RC | U* | N (pkey ΓSS, r8, t8) When RC receives a message of this form, it checks validity and makes the attachment available on RC

We specified this protocol formally using the TulaFale [4] specification language, which has constructs for public key signatures and salted password authentication. The TulaFale script compiles to a script that is verifiable with the ProVerif protocol verifier of Bruno Blanchet (www.di.ens.fr/~blanchet/crypto-eng.html, version 1.11) [5]. With this we were able to prove the following correspondence theorem for on-demand attachments: if RC retrieves an on-demand attachment with SC as return address, then SC sent the attachment.

The story behind this proof is at least as interesting and important as the theorem itself, given what it illustrates about the potential for formal automated verification of security protocols in general and for secure web services in particular. In our first attempt to do the proof, we formulated the protocol messages and required secrets directly in the ProVerif language. ProVerif’s language allows for atoms called events which programmers use to declare that certain actions have happened or conditions have been met. We created events for message sending and receiving as well as the receiving of the attachment. We then queried the correspondence that we desired: RC’s event that it received the attachment implies SC’s event that it had sent the same

13 International Journal of Web Services Research , Vol.X, No.X, 200X attachment. The ProVerif tool ran on our script and quickly output that our correspondence was true.

We realized that there was a problem when we discussed our results with Andrew Gordon and Karthikeyan Bhargavan. Both remarked at the mismatch between the complexity of our protocol (nine messages and four parties) and the speed with which ProVerif returned a result. In their experience ProVerif verifications of complex protocols required time measured in days, not seconds! With some checking and insertion of extra event checks it became clear that we had proven a trivial theorem. RC’s “receive attachment” event was unreachable and that was why the prover had declared the correspondence to be true. We had made mistakes in our original formalization and, given our initial “success” with the prover, had not inserted the additional checks necessary to discover if our theorem was trivial.

We decided that rather than attempt to rewrite the protocol in ProVerif’s language, we would start from scratch using the TulaFale language and libraries, the aim of which were to help us to focus on just the aspects of the protocol unique to our system. The TulaFale compiler would translate down from the higher level structures to the ProVerif language and then run the ProVerif executable automatically. As expected, this improved confidence and degraded performance. After encoding up to message 5 in the protocol, ProVerif would no longer converge for correspondence checks (that is, receiving Message 5 implies that Message 5 was sent) after two days of run time. The memory footprint for the verifier exploded, taking up to 900MB of memory during execution. Some proofs simply caused an out of memory error. We finished the encoding regardless, just ensuring that events were reachable to prevent trivial theorems. When the encoding was complete we attempted to just prove that receiving the attachment implied it was sent, but even after four days of processing on a 2.4GHz computer, we saw no output. With the assistance of Bruno Blanchet, the creator of ProVerif, we found that the proof could be completed with some optimizations—in particular, by discarding the derivation tree ordinarily created by the proof.

Our analysis of on-demand attachments appears to be the largest application of TulaFale or ProVerif and the largest verification of any web service security protocol yet completed. It illustrates both the value of these tools and challenges in correct representation and performance. The specification can be found in full at wsemail.ws/On-Demand.tf and the output of the proof is at wsemail.ws/On-Demand.tf.an. EXPERIMENTS

Web services are often criticized for being slow based on their design and existing implementation platforms. Security and flexibility also provide a performance challenge. Hence a secure, flexible implementation of messaging based on web services raises concerns about performance. We implemented a prototype for WSEmail as a way to address these concerns at the same time as illustrating the benefits of flexibility. In order to evaluate the efficiency of our messaging system, we built a test bed to stress test our implementation’s application and protocols. In this section we describe the implementation, the test bed, and our experiments.

We simulated a real world email environment where many users share a common email server. Users may exchange messages with other users within the local domain or external domains. Users may also interact with their personal inboxes to view and delete messages. For our test we defined four standard email operations: send, list, retrieve, and delete. These operations are discussed in detail below.

Implementation

Our WSEmail prototype runs on Windows server and client systems. Version 1.0 was implemented over the .NET framework version 1.1 and relies on the Web Services Enhancement (WSE) 1.0, CAPICOM 2.00, SQL Server 2000 (to store messages for the server), and IIS 5.0. The current version consists of 68 interfaces and 343 classes organized into 30 projects (see wsemail.ws/uml.html for a UML model illustrating the design). Most of the software is C# .NET managed code created with Microsoft Visual Studio. Unmanaged code was needed to gain access to lower-level DNS functions necessary to query for SRV records. Our instant messaging system also exploits a TLS package from Mentalis (mentalis.org) since the .NET 1.1 platform does not provide native support for server TLS sockets. In November 2004 we upgraded WSEmail to version 1.1 in order to get WS-Policy support from Microsoft WSE 2.0. This was challenging because primitive functions from WSE 1.0 that we needed for our WSEmail 1.0 implementation were removed from the WSE 2.0 package forcing us to use both WSE 1.0 and WSE 2.0 to implement WSEmail 1.1.

WSEmail uses DNS SRV records (ietf.org/rfc/rfc2782.txt) to determine routing. This makes it possible to run WSEmail over other protocols without changing the way DNS is queried, and we can exploit the priority and weight attributes in the records. These properties of the SRV record allow for future enhancement and present day configuration that is extremely similar to the way SMTP is deployed now.

Test Bed

Our test bed consisted of a total of four client machines, two mail servers (designated as local and external), one test coordinator and one database/DNS server. The arrangement of the test bed is depicted in Figure 10.

Figure 10. Test bed architecture

The test clients (labeled as T1 through T4) all performed operations by sending requests to the “local” email server, Si. The test client actions were coordinated by the test coordinator, Stc. There also was a second server, Se, which acted as both an “external” email server and a load generator for the “local” system. Sdb hosted a message storage database and DNS records for Si and Se. The clients all had Pentium 4 2.8GHz processors with 512MB of memory and the

Windows XP Pro operating system. They performed four different operations during the test execution: send a message to a recipient; list the headers of messages in the client’s inbox, retrieve a particular message, delete a particular message. We explored various ways to include a mixture of applications with these basic operations but found it difficult to isolate performance issues clearly in doing this, so we restricted our focus to a demonstration of the basic operations.

The test coordinator, Stc, was responsible for distributing the test specifications, starting the test and receiving the results from each client. The coordinator Stc broadcasted its network address, instructing all clients to connect to it and download the test specifications file. The clients then waited for Stc to announce the start of the test, after which the clients executed requests to Si in compliance with the specs they downloaded. After each client finished, the latencies for each request were reported back to Stc.

The test specifications document described exactly what each client was to do. It indicated whether the client should authenticate using a username token (user name and password) or X.509 certificate. It also specified how many messages were to be sent from each client, to whom are they were to be sent and the size of the message body. The specification document also indicated the total number of requests that should be sent and the ratios of the four types of requests.

The local server Si was the focus of our test. It accepted incoming messages from the clients and an external server Se. It performed the necessary authentication, and forwarded external messages to the appropriate destination after performing DNS resolution. If the destination was local (for example, the recipient is on Si) then the message was stored in Sdb. If the destination was external, the message was forwarded to Se. We allowed the local and external server to share a database and DNS server since these were not performance bottlenecks in the system.

The external server Se played two roles in our test bed. First, it imitated the entire external client list, so that all emails directed to any external client were forwarded to it. On reception of a message addressed to one of the clients that it simulated, it did not save it to the database server. This was done to prevent Si from experiencing extra latency due to Se’s database transaction. Rather, it performed the required certificate checking to verify authenticity and then discarded the message. Second, it acted as a load generator and sent one message per second addressed to each of the four clients: T1 - T4. These messages were all received by Si, authenticated, and stored in Sdb.

Procedure and Results

The test coordinator Stc provided a test specification document that instructed each client to run one execution thread sending 2,000 requests to Si. The clients chose send, list, retrieve, and delete operations with 25% chance. In cases where the delete operation was to be performed on an unpopulated inbox, it was considered a no-op and not counted towards the results. However, to avoid this condition, each client’s inbox was primed with approximately half a dozen messages. To get the most out of each send event, each message was addressed to both a randomly chosen local client and an external client. The clients were all instructed to authenticate to Si using username token authentication. Si and Se authenticated to each other using X.509 certificate signing. The duration of the entire test was 1826 seconds.

In order to get a client-side view of the efficiency of the system, we measured the latency of each request. A timer was started as the client contacted Si with a request and stopped after the client received the appropriate response (e.g. inbox listing, message received confirmation, and so on).

The time difference between the client’s request and the server’s complete response was the latency of the operation. The results of this calculation point to an average of 0.284 seconds per request with a variance of 0.1389 seconds. The minimum and the maximum latencies were 46.876 ms and 4.0 seconds respectively. Note that the “Message Received” confirmation does not mean that the message was delivered to the ultimate recipient, just that the message was placed in the delivery queue.

The test results in Table 1 show the throughput of bytes sent in MB as a breakdown of the number of requests (send, list, retrieve and delete). Therefore the total data in MB from the clients to Si is 36.18 MB and from the Si to all the clients is 369.35 MB.

Operation Send List Retrieve Delete # of requests 1970 2024 2026 1980 % of all requests 24.6 25.3 25.4 24.7 Client to Server Data (MB) 10.74 8.42 8.62 8.4 Server to Client (MB) 12.31 324.55 20.41 12.08

Table 1. Bytes sent between clients and Si

Since each message is also sent to an external client, each send action also sends a message from Si to Se. This data is measured according to the representation in Table 2. Therefore the total data exchanged from Si to Se is 30.95 MB and from Se to Si 30.69 MB.

Server name # of Send (MB) Received Confirmations (MB) messages

Si 1970 2024 2026

Se 24.6 25.3 25.4

Table 2. Bytes sent between Si and Se

The entire test bed data transfer was recorded using the Ethereal network monitor (www.ethereal.com), which was run at Si and Se. The TCP/IP sessions were then reconstructed using tcpflow (www.circlemud.org/~jelson/software/tcpflow/) and post-processed with Perl and awk. Since Se, acting as a load generator, sent one message per second, 1826 messages were also sent from Se to Si over the entire duration of the test. The corresponding byte count represents the messages that were sent and the notification messages that were received.

Analysis

A best case test of SMTP with no load on the server/network and no contention for resources yielded an average latency of 0.170 ms to send a message of approximately the same size as the WSEmail messages we sent in our experiment. The average difference in latency between WSEmail and the SMTP test is therefore 0.114 ms, which accounts for the additional overhead of the XML parsing and cryptography. In that short time span a large number of operations took place: one secret key signature, one private key signature verification, two public key signatures, and one public key signature verification. Given that the entire system is using XML we conclude that performance is not a barrier to secure web services in this type of application. Indeed, the extra latency probably would not be noticed in a typical client/server environment.

XML and XMLDSIG do have a drawback in their verbosity. Our test bed sent 1 KB mail messages which ballooned into 10 KB responses to the retrieve message action in order to make XMLDSIG work properly. In general, at least 30% of those bytes were the Base64 encoded

17 International Journal of Web Services Research , Vol.X, No.X, 200X representations of the certificates used for signing messages. After the certificate size, the WS- Security structures were also a significant amount of overhead, accounting for about 30% of the bytes transferred. WSEmail might need to explore ways to distribute certificates so that they are not replicated excessively. It might also be fruitful to look at how messages are signed to try to minimize their verbosity.

Our experimentation bodes well for web service efficiency, especially for high volume messaging. Extending our experimental results, we find that WSEmail is theoretically capable of handling approximately 1787 messages a minute (combination of incoming and outgoing). We looked for published benchmarks to compare this against and found that the University of Wisconsin-Parkside (uwp.edu/cgi/netstats.cgi?log=mail) had a peak usage of 1716 (total of incoming and outgoing) messages per minute over the past year, meaning it should be possible for a single WSEmail server similar to our test system to routinely handle the normal load at that institution. Given the modest cost and capability of our test bed system, we predict that much higher capacities are possible with further hardware investment.

We made some modest adjustments in the WSEmail system to improve its efficiency based on early test results. For instance, our early results show that the client to server bandwidth usage was dominated by list commands. This is because in the test bed the new message checking mechanism was inefficient. It would download all the headers and compare them to find new ones rather than just asking for messages newer than a certain date. In response to these results, we changed the functionality of WSEmail to ask for new messages by date, increasing its efficiency.

The experimental results show a promising picture for secure XML-based web services messaging. The responsiveness of the system, combined with the new security features and inherent extensibility, provide a solid framework for new messaging systems. We aim to expand the scope of the test bed to include application tests and use it to regularly confirm performance consequences of changes in the system. RELATED WORK

Work related to WSEmail can be divided into two general areas: improved Internet messaging systems and the analysis of web service security.

Improvements to the SMTP messaging system have often been motivated by two, sometimes overlapping goals: strong message authentication and spam prevention. PGP offers authentication tools that include public/private key signing and encryption. Privacy Enhancement Mail (PEM) (IETF RFCs 1421-4) has mechanisms for privacy, integrity, source authentication, and nonrepudiation using public and private key encryption and end-to-end encryption techniques. Zhou, et. al. [8] use formal tools to verify the properties of the PEM system. Abadi, et. al. [2] use a trusted third party to achieve message and source authentication and formally prove correctness of their protocol.

Changes to the SMTP system aimed at spam reduction include a proposal by Fenton and Thomas [7] which uses public key cryptography and an option for server signed (rather than client-signed) messages and Petmail (petmail.lothar.com). Petmail uses the GPG encryption utility for public key encryption and signing of messages. Users are identified by IDRecords, self-signed binary blobs that include public key, identity, and message routing information. Petmail agents can enforce IDRecord whitelists and policies for contact from first time senders. First-time senders

18 International Journal of Web Services Research , Vol.X, No.X, 200X may be forced to obtain tickets from a third party Ticket Server which may perform checks to ensure that the sender is a human (using CAPTCHA reverse Turing tests.) Messages can be encapsulated and sent using SMTP, Jabber, or some other queuing transport protocol. Patterns and options for sender anonymity are offered as well. Our most recent work on extensions of WSEmail show how to do several of these things and more based on WS-Policy negotiations and our dynamic plug-in capability. There exists an interesting opportunity to modify the TLS based instant messaging system to use federated identities instead of X.509 certificates. Since all users of WSEmail are able to obtain such credentials from their local servers, such a modification would allow secure conversations amongst all peers.

In the area of web services security analyses, the Samoa project at Microsoft Research (securing.ws) has developed important fundamentals, including a formal semantics for proving web services authentication theorems [3] and the TulaFale language for automating web service security protocol proofs [4]. This is our primary foundation for current and future formal work with WSEmail. Damiani et. al. [6] discusses connections between XML, SOAP, and access control languages. CONCLUSIONS

We have explored WSEmail, the development of email functions as a family of web services, by developing a prototype system based on an architecture that emphasizes flexibility, security, and integration. We have shown that WSEmail is amenable to the addition of new protocols and the formal analysis of these protocols. We have also shown that the basic WSEmail functions have satisfactory performance. In ongoing work, we are exploring several directions such as: new applications that exploit improved integration between web-like data retrieval functions and the messaging system; challenges to interoperability with a Java implementation of the MUA; and ways to express and negotiate messaging policies. For widespread use, WSEmail faces substantial problems with standardization and interoperability with SMTP, which may be mitigated by writing more plug-ins like our SMTP-compatible relay agent. However, it is well- suited to some high-security applications even now, offers ideas in exploring the general design space for Internet messaging, and can rely on the standardization advantages of XML as an aid to addressing interoperability challenges. We also aim to support WSEmail on diverse platforms. A project of Heo, Patel, and Shah was partially successful in doing this for a Java WSEmail client based on Sun’s JWSDP 1.4 with X.509 security. This went well for some web service standards like WSDL but was challenged by support for some security functions in the message headers such as OASIS UsernameToken authentication. ACKNOWLEDGEMENTS

This work was supported by a gift from Microsoft University Relations, NSF grants CCR02- 08996 and EIA00-88028, ONR grant N000014-02-1-0715, and ARO grant DAAD-19-01-1-0473.

We are grateful for discussions of WSEmail that we had with Martin Abadi, Raja Afandi, Noam Arzt, Karthikeyan Bhargavan, Luca Cardelli, Dan Fay, Eric Freudenthal, Cedric Fournet, Andy Gordon, Ari Hershl Gordon-Schlosberg, Munawar Hafiz, Jin Heo, Himanshu Khurana, Ralph Johnson, Bjorn Knutsson, Jay Patel, Neelay Shah, Kaijun Tan, and Jianqing Zhang. We are also grateful to Bruno Blanchet for technical support.

References

[1] M. Abadi and B. Blanchet. Computer-assisted verification of a protocol for certified email. In Radhia Cousot, Editor, Static Analysis, 10th International Symposium (SAS’03, volume 2694 of Lecture Notes on Computer Scienece, pages 316–335, San Diego, CA, June 2003. Springer.

[2] M. Abadi, N. Glew, B. Horne, and B. Pinkas. Certified email with a light on-line trusted third party: design and implementation. In Proceedings of the eleventh international conference on World Wide Web, pages 387–395. ACM Press, 2002.

[3] K. Bhargavan, C. Fournet, and A. Gordon. A semantics for web services authentication. In Proceedings of the 31st ACM SIGPLAN-SIGACT symposium on principles of programming languages, pages 198–209. ACM Press, 2004.

[4] K. Bhargavan, C. Fournet, A. D. Gordon, and R. Pucella. TulaFale: A security tool for web services. In International Symposium on Formal Methods for Components and Objects (FMCO’03), LNCS. Springer, 2004.

[5] B. Blanchet. An efficient cryptographic protocol verifier based on prolog rules. In Proceedings of the 14th IEEE Workshop on Computer Security Foundations, page 82. IEEE Computer Society, 2001.

[6] E. Damiani, S. De Capitani di Vimercati, and P. Samarati. Towards securing XML Web services. In Proceedings of the 2002 ACM workshop on XML security. ACM Press, 2002.

[7] J. Fenton and M. Thomas. Identified internet mail. Work in Progress draft-fenton-identified- mail-01, IETF Internet Draft, October 2004. Expires April 2005.

[8] D. Zhou, J. Kuo, S. Older, and S. Chin. Formal development of secure email. In Proceedings of the 32nd Hawaii International Conference on System Sciences. IEEE Computer Society, 1999.

About the Authors

Kevin D. Lux Michael J. May Carl A. Gunter