Downloads the Open Services, Described by the WSDL Standard

Data Repository for Security Information and Event Management in Service Infrastructures

Igor Kotenko, Olga Polubelova and Igor Saenko
Laboratory of Computer Security Problems, St. Petersburg Institute for Informatics and Automation (SPIIRAS), 39, 14th Liniya, Saint-Petersburg, Russia

Keywords: Security Repository, Security Information and Event Management, Security Ontology, Data Model, Data Representation, Logical Inference, Service Infrastructure.

Abstract: Design and implementation of the repository is a critical problem in advanced security information and event management (SIEM) systems, that is, SIEM systems of service infrastructures. The paper discusses several innovations realized to address this challenge. These include the application of an ontological approach to repository data modeling and a hybrid approach to repository development, meaning the combined use of relational databases, XML databases and triple stores.

Kotenko I., Polubelova O. and Saenko I. Data Repository for Security Information and Event Management in Service Infrastructures. DOI: 10.5220/0004075303080313. In Proceedings of the International Conference on Security and Cryptography (SECRYPT-2012), pages 308-313. ISBN: 978-989-8565-24-2. Copyright © 2012 SCITEPRESS (Science and Technology Publications, Lda.)

1 INTRODUCTION

At present, one of the most important research directions in computer network security is the technology of security information and event management (SIEM). The essence of this technology is to ensure the coherent loading into a centralized repository of security log records ("security events") from a variety of sources, their long- and/or short-term storage, and their modeling and analysis in order to detect attacks and generate efficient countermeasures. SIEM technology supports effective security decisions based on event correlation, data mining, logical inference and data visualization. Using SIEM systems is extremely important for ensuring the information security of large distributed computer networks and of the management and financial services of companies, as well as of critical infrastructures such as dams, power plants, etc. (Miller et al., 2011).

Advances of SIEM systems in computer network infrastructures have motivated the use of such systems in a broader class of infrastructures, which can be defined as service infrastructures. In these infrastructures, in addition to computer networks, there are infrastructures of various types of services (financial, physical, etc.). SIEM systems that can be used in service infrastructures are considered in this paper as new generation SIEM systems.

MASSIF (MAnagement of Security information and events in Service InFrastructures) is one of the EU projects that aims to develop solutions for building a new generation of SIEM systems (MASSIF, 2011). SIEM systems elaborated in MASSIF must have the following new features: removal of most of the restrictions on functions imposed by the infrastructure; coherent interpretation of incidents and events at various levels; a high degree of reliability and durability in capturing event data; and high scalability.

One of the most important components of SIEM systems used in service infrastructures is the security repository, a data warehouse that stores security information and event data in an internal format and extracts them at the request of other components in order to identify security threats and attacks and to generate countermeasures.

In SIEM systems of service infrastructures, security event data arrive from a variety of sources and can be presented in various input formats. The SIEM system normalizes these data, converting them into an internal format. The security data are then subjected to correlation analysis (Stevens, 2005). In a new generation SIEM system it is also possible to use advanced modeling and simulation modules, which use the data stored in the repository to build attack and countermeasure graphs (Ingols et al., 2009; Kotenko et al., 2006).

For these reasons, the main objectives of the repository development are as follows: to design a unified repository, languages and tools for effective management of security information, events and policies, and for logical inference about security; and to implement software applications for storing, manipulating, visualizing and validating security information, events and policies based on the unified repository.

The paper examines the main issues of data model design and repository development for new generation SIEM systems. The following innovations were used to solve this problem. First, an ontological approach to data modeling is proposed and implemented. It provides the necessary flexibility of internal data representation in the repository and makes more accurate, higher-quality query results possible. Secondly, a hybrid approach to implementing the repository is suggested. It integrates relational databases, XML databases and triple stores. Finally, we propose an advanced repository architecture, implemented and tested with the data used for attack modeling in SIEM systems.

The rest of the paper is organized as follows. Section 2 reviews related work in the field of SIEM data processing, representation and storage. Here we consider standards for security event representation, advanced SIEM systems, languages for data representation, and approaches to implementing the repository. Section 3 considers the ontological vulnerability model used in the repository for attack modeling. Section 4 discusses the issues of repository implementation and testing based on the data of the SIEM attack modeling module. Section 5 summarizes our results and outlines further research.

2 RELATED WORK

In analyzing the state of the art, we considered promising and widely used approaches and standards for data representation in the area of security information and event management.

Information and event management standards provide the most common rules for the representation of security events and incidents. Currently, there are many different standards for security data representation (IDMEF, IODEF, CEE, SCAP, CBE, CEF, XDAS, CIM, etc.). The most popular of them are Common Event Expression (CEE), SCAP (SCAP, 2011), Common Base Event (CBE) (Ogle et al., 2004) and Common Information Model (CIM) (CIM, 2011). For example, CEE takes a comprehensive approach to handling the input stream of information to log management systems, including recommendations to vendors of the hardware and software systems that generate the input stream. SCAP makes it possible to compile a list of system platforms and applications, set their secure configurations, identify the most critical vulnerabilities, etc.

The main repository solutions in advanced SIEM systems (AlienVault OSSIM, AccelOps, QRadar, Prelude, ArcSight, IBM Tivoli, and Novell Sentinel) are based on relational databases. The storage in OSSIM includes a user-defined, searchable knowledge base of incident solutions (AlienVault, 2011). AccelOps SIEM is designed to collect logs generated by Cisco network and security devices and by all the major network vendors' devices. Its repository is implemented as online PostgreSQL storage used for log analysis in real time and for historical log analysis (AccelOps, 2011). QRadar stores the entire input event stream to enable detailed forensics and compliance reporting (Miller et al., 2011). Prelude (Prelude, 2011) supports three databases: MySQL, PostgreSQL, and SQLite. ArcSight Logger 4 collects data in structured and unstructured formats (Shenk, 2009). The system implements role-based access, access through a web-based interface, and intelligent and intuitive search mechanisms with a visual query designer. IBM Tivoli SIEM (Buecker et al., 2010) can provide long-lasting and compact storage of information security events. The collected events are stored in a database as text objects containing information about incidents, management actions, correlation rules, etc. Novell Sentinel Log Manager stores all data in a compressed format (Novell, 2010). Its data storage components use file-based storage and an indexing system. PostgreSQL is used for data management.

One alternative solution for data representation in systems with complex data structures is the ontological approach (OWL, 2009). Using description logic, this approach can express complex relationships between entities much more easily. To represent ontological metadata we suggest using RDFS (RDF Schema) (RDF, 2004). The RDF data model is a directed graph based on elementary statements (triples). A triple is a short formal statement of the form "subject-predicate-object". A triple store is a purpose-built database for storing and retrieving RDF metadata (Triplestore, 2010). In addition to RDF and XML, OWL (Web Ontology Language) can be chosen to represent data. OWL is a Semantic Web language created to represent ontologies (OWL, 2009). OWL presents an ontology in the form of documents, which can be stored and transmitted over a global network. SWRL (Semantic Web Rule Language) (SWRL, 2004) can be used to specify rules. SWRL is a proposal for a Semantic Web rule language based on a combination of OWL sublanguages with RuleML sublanguages (Parsia, 2005). SPARQL Protocol and RDF Query Language [...]

[...] models and the repository of SIEM systems for service [...]
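To make the "subject-predicate-object" model from Section 2 concrete, the following is a minimal sketch of an in-memory triple store with wildcard pattern matching. It is not the repository described in the paper. The class names, the `match` method, and the sample identifiers (`host:web01`, `vuln:example-1`, and so on) are hypothetical and chosen only for illustration. A production system would use a real RDF store and a query language such as SPARQL.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Triple(NamedTuple):
    """One elementary statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


class TripleStore:
    """Hypothetical toy store; real triple stores add indexing and persistence."""

    def __init__(self, triples: Iterable[Triple] = ()) -> None:
        self._triples = set(triples)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add(Triple(subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for t in sorted(self._triples):
            if (
                (subject is None or t.subject == subject)
                and (predicate is None or t.predicate == predicate)
                and (obj is None or t.obj == obj)
            ):
                yield t


# Illustrative data only: invented hosts, applications and vulnerability.
store = TripleStore()
store.add("host:web01", "runs", "app:httpd")
store.add("host:db01", "runs", "app:postgres")
store.add("app:httpd", "hasVulnerability", "vuln:example-1")

# Join two patterns: which hosts run an application that has a vulnerability?
vulnerable_hosts = [
    run.subject
    for vuln in store.match(predicate="hasVulnerability")
    for run in store.match(predicate="runs", obj=vuln.subject)
]
print(vulnerable_hosts)
```

The nested comprehension plays the role that a two-pattern graph query would play in a real triple store: each pattern binds a variable, and the shared application identifier joins the two patterns.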

