Anomaly Detection in XML-Structured SOAP Messages Using Tree-Based Association Rule Mining

Technical Report No. TWcL-TR-1501, Trustworthy Computing Laboratory, School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran, 2015. Anomaly Detection in XML-Structured SOAP Messages Using Tree-Based Association Rule Mining Reyhaneh Ghassem Esfahani, Mohammad Abadollahi Azgomi* and Reza Fathi Trustworthy Computing Laboratory, School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran E-mail: [email protected], [email protected] and [email protected] Abstract Web services are software systems designed for supporting interoperable dynamic cross- enterprise interactions. The result of attacks to Web services can be catastrophic and causing the disclosure of enterprises’ confidential data. As new approaches of attacking arise every day, anomaly detection systems seem to be invaluable tools in this context. The aim of this work has been to target the attacks that reside in the Web service layer and the extensible markup language (XML)-structured simple object access protocol (SOAP) messages. After studying the shortcomings of the existing solutions, a new approach for detecting anomalies in Web services is outlined. More specifically, the proposed technique illustrates how to identify anomalies by employing mining methods on XML-structured SOAP messages. This technique also takes the advantages of tree-based association rule mining to extract knowledge in the training phase, which is used in the test phase to detect anomalies. In addition, this novel composition of techniques brings nearly low false alarm rate while maintaining the detection rate reasonably high, which is shown by a case study. Keywords: Web service security, SOAP security, anomaly detection, tree-based association rule mining. 1. Introduction In service-oriented architecture (SOA), each part of the system is implemented in the form of a service or a disjoint unit such that there is little dependency on underlying technologies used in implementing them. Although using SOA brings less maintenance and reusability costs, exposing it in open environments, such as the Internet, can cause instabilities in using these services because of failure, security attacks and so on. Web services, as a realization of SOA, are commonly used not only in the Internet, but also in intranets and inter-organization communications. They usually provide access to vital systems, such as enterprise resource planning, to which a successful attack may cause heavy damages. Therefore, it is important to consider appropriate mechanisms to protect these * Address for correspondence: School of Computer Engineering, Iran University of Science and Technology, Hengam St., Resalat Sq., Tehran, Iran, Postal Code: 16846-13114, Fax: +98-21-73021480, E-mail: [email protected]. services from both accidental failures and security attacks. Intrusion detection systems (IDSs) are proposed for the detection of attacks and intrusions to computer networks. Two major approaches of analyzing data in intrusion detection systems are misuse detection and anomaly detection. In the former approach, attack detection is done on the basis of a set of predefined attack patterns or signatures stored in a database. In the latter, the IDS defines a normal profile of the system under supervision and tries to identify any deviations from it. As a result, this approach is capable of identifying new unseen attacks as well. A special kind of IDSs works in Web service layer and is called Web services intrusion detection system (WS-IDS). WS-IDSs are used to protect Web services from malicious attacks. To design new enhanced WS-IDSs, proposing new techniques for anomaly detection in Web services is a key requirement. In this paper, after an investigation on shortcomings of the existing solutions, a new approach for detecting anomalies in Web services is outlined. More specifically, the proposed technique illustrates how to identify anomalies by employing mining methods on extensible markup language (XML)-structured simple object access protocol (SOAP) messages. This technique also takes the advantages of tree-based association rule mining to extract knowledge in the training phase, which is used in the test phase to detect anomalies. In addition, this novel composition of techniques brings nearly low false alarm rate while maintaining the detection rate reasonably high. This technique is implemented and experimented in a case study. The experiment results are also presented in the paper. The rest of this paper is organized as follows. First, we have a brief review on Web services terminology in Section 2. After a short look at related works in this area in Section 3, we highlight our motivations in Section 4. In Section 5, we explain our proposed technique. Then, the proposed technique is evaluated through a case study in Section 6. Finally, Section 7 concludes the paper and outlines future works. 2. Background In this section, we briefly define some concepts and terms, which are used throughout the paper. 2.1. Web services According to the World Wide Web consortium (W3C), a Web service is a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format especially in web service definition language (WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP messages, typically conveyed using the hypertext transport protocol (HTTP) with an XML serialization in conjunction with other Web-related standards [1]. 2.2. Simple object access protocol Simple object access protocol (SOAP) is an Internet-based platform-independent protocol that provides a way to communicate between applications. Previous application communication protocols, such as remote procedure call (RPC), were raised some security and compatibility issues. Therefore, firewalls and proxy servers, usually block their traffic. SOAP messages, on the other hand, are sent over HTTP, which is a transfer protocol supported by all Internet browsers and servers. Simplicity and extensibility are other features of this XML-based protocol [2]. A SOAP message is an ordinary XML document containing the following elements: - Envelope: the root element that identifies the XML document as a SOAP message. - Header: an optional element that contains the information like authentication or payment. - Body: a required element that contains call and response information. - Fault: an optional element, containing errors and status information. Each SOAP request is embedded in data section of an application layer protocol to be transferred over the Internet. SOAP requests are usually transferred trough GET and POST methods of HTTP protocol. 2.3. Web services and SOAP security Because firewalls and gateways see SOAP messages as a normal HTTP traffic, it is possible for these messages to get around firewalls. Therefore, any possible malicious contents in SOAP messages will remain undetected and this fact can jeopardize the Web services security. In [3], attacks on Web services are classified into three main classes: (1) infrastructure attacks, which are attacks related to web servers where the Web services reside and also the attacks related to the transport protocol used for exchanging the Web services’ request, (2) Web services attacks, that are native to the actual technology fueling Web services, such as WSDL scanning, and (3) XML content attacks, which can be any type of XML-based, content-driven threat that employ the tactic of embedding malicious content with a legitimate XML document. In [4], security problems of XML Web services are investigated and the result of their studies is presented as eight categories of possible attacks on Web services. The categories are as follows [4]: - Identity attacks: dictionary, IP spoofing, message eavesdropping and data tampering attacks. - Session attacks: replay and man-in-the-middle attacks. - Parsing attacks: recursive payloads, oversize payloads and schema poisoning attacks. - Buffer overflow attacks. - Code attacks: SQL injection and XPath injection attacks. - XML denial of service (XDoS) attacks. - WSDL attacks: WSDL scanning and parameter tampering attacks. - Internal attacks: These attacks perform within the firewalls by an employee or former employee that usually have access to confidential information and are familiar with internal system of the organization. According to [5], there are three logical layers for security implementation in SOAP messages: 1. Transport layer, which include protocols for exchanging SOAP messages. This approach has the limitation of point-to-point interaction, but what we need is message-level security to support the multi-application interaction. Another drawback is that no uniform standard exists to carry security context from transport layer to application layer. 2. Application layer, which refers to service providers and service consumers. Implementing custom security mechanisms in applications, besides being error prone, can cause integration problems when two services are to be composed. 3. SOAP layer, which is the engines that help exposing business logic in applications as SOAP-based services. SOAP layer seems to be the only alternative where the security model can be built. OASIS, a nonprofit consortium in charge of developing open standards for information society, proposed WS-Security which is a set of enhancements to SOAP messaging to provide message integrity and confidentiality [6]. Their

Load more