Log Management with Open-Source Tools

Risto Vaarandi rvaarandi 4T Y4H00 D0T C0M

Outline

● Why do we need log collection and management? Why use open source tools?

● Widely used logging protocols and recently introduced new standards

● Open-source syslog servers

● Open-source log management tools

Why collect logs from your IT system and network?

● Observation – logs contain information which is often not available from other sources

● Real-time monitoring – analyze logs in real-time (or near-real-time) fashion, in order to discover important changes in the state of the IT system

● Post-factum incident analysis – leverage collected data for discovering unknown past incidents and getting detailed insights into them

Why use open source tools for log management?

● Commercial SIEM and log management frameworks:

✔ many frameworks are consultant-oriented – they have a complex design and insufficient documentation

✔ prohibitive deployment and licensing costs

✔ many frameworks repeat a number of design mistakes of network management solutions (made almost two decades ago!)

● Past experience with network management solutions:

✔ Phase 1: initial marketing hype, followed by a number of success stories in the context of large and wealthy institutions

✔ Phase 2: disappointment among many potential customers (failed deployments, prohibitive pricing, etc.) and search for alternatives

✔ Phase 3: appearance of well-designed open-source solutions which become widely used and acknowledged, especially by small and mid-size enterprises

Traditional log collection protocols

● The scene of log collection protocols was relatively stable for two decades

● BSD syslog – the only cross-vendor protocol designed specifically for logging

● UDP based plaintext, thus resource-efficient, but unreliable and not secure

● Simple message layout in the UDP frame – priority, simple timestamp, host name, program name, unstructured message text
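
The message layout described above can be illustrated with a short Python sketch. It is a minimal example rather than a complete parser: the regular expression and the function name `parse_bsd_syslog` are assumptions made for illustration, and the pattern only covers messages shaped like the sample line.

```python
import re

# Hypothetical pattern for illustration: <PRI>timestamp host program[pid]: text
BSD_SYSLOG = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^\[:\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<text>.*)$"
)

def parse_bsd_syslog(line):
    """Split a BSD syslog line into its fields; return None if it does not match."""
    match = BSD_SYSLOG.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # The priority value is facility * 8 + severity.
    facility, severity = divmod(int(fields.pop("pri")), 8)
    fields["facility"] = facility
    fields["severity"] = severity
    return fields

sample = "<28>Nov 17 12:33:59 myhost2 ids[1299]: port scan from 192.168.1.102"
parsed = parse_bsd_syslog(sample)
print(parsed["facility"], parsed["severity"], parsed["program"], parsed["text"])
```

Everything after the program name remains an unstructured string, so any further detail (such as the scanning IP address) still needs message-specific parsing.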

New log collection protocols

● IETF syslog (2009) – support for including structured data in messages, UDP and TCP based transport, encryption and authentication, detailed timestamps

● CEE (Common Event Expression) logging standard (2012) – use JSON format inside originally unstructured BSD/IETF syslog message fields

● Other protocols – non-RFC flavors of BSD and IETF syslog (e.g., BSD syslog over TCP), GELF, SNMP trap messages, etc.

Examples

# Traditional BSD syslog – priority value 28 encapsulates facility value 3
# (daemon) and severity value 4 (warning): 3*8 + 4 = 28

<28>Nov 17 12:33:59 myhost2 ids[1299]: port scan from 192.168.1.102

# IETF syslog – note high granularity timestamps with timezone information
# and two blocks of structured data

<28>1 2012-11-17T12:33:59.223+02:00 myhost2 ids 1299 - [timeQuality tzKnown="1" isSynced="1"][origin ip="10.1.1.2"] port scan from 192.168.1.102

# CEE message format – use standard BSD syslog message for transporting
# structured data in JSON format

<28>Nov 17 12:33:59 myhost2 ids[1299]: @cee:{"pname":"ids","pid":1299,"msg":"port scan from 192.168.1.102", "originip":"10.1.1.2","action":"portscan","src":"192.168.1.102"}
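
As a rough illustration of how a receiver might use the CEE example, the Python sketch below looks for the `@cee:` cookie and decodes the JSON that follows it. The function name `extract_cee` is hypothetical, and the sketch assumes the JSON payload runs to the end of the message.

```python
import json

CEE_COOKIE = "@cee:"

def extract_cee(message_text):
    """Return the decoded JSON payload of a CEE-tagged message, or None."""
    start = message_text.find(CEE_COOKIE)
    if start == -1:
        return None  # ordinary unstructured message
    try:
        return json.loads(message_text[start + len(CEE_COOKIE):])
    except json.JSONDecodeError:
        return None  # cookie present, but the payload is not valid JSON

line = ('<28>Nov 17 12:33:59 myhost2 ids[1299]: @cee:{"pname":"ids","pid":1299,'
        '"msg":"port scan from 192.168.1.102", "originip":"10.1.1.2",'
        '"action":"portscan","src":"192.168.1.102"}')
event = extract_cee(line)
print(event["action"], event["src"])
```

No message-specific regular expression is needed here: fields such as `action` and `src` are available by name as soon as the JSON is decoded.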

Why pass structured data in log messages?

● Unstructured message fields often contain additional information about the event which needs to be highlighted

● It is much easier to parse structured data (keyword-value pairs) than unstructured free-format strings

● Some structured data can be used without extra parsing – JSON format is supported by several log management frameworks and databases (e.g., Elasticsearch)

Log collection on platform

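[Figure: data flow through a syslog server on a single host. Inputs: local programs logging via openlog(3) and syslog(3) to /dev/log, kernel messages from /proc/kmsg, and messages from other nodes arriving at a network port. The server is configured in /etc/syslog-server.conf. Outputs: local logfiles (/var/log/...), local programs, remote syslog servers, and a database with a GUI.]

Syslog servers – rsyslog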

● http://www.rsyslog.com

+ fast message processing, efficient multithreading, designed to handle at least 150-200K messages per second (see the paper “Rsyslog: going up from 40K messages per second to 250K” by Rainer Gerhards from Linux Kongress 2010)

+ backwards compatible with UNIX syslogd configuration directives

+ has a number of unique features and advantages over competitors (disk based buffers, support for Elasticsearch database, etc.)

- documentation could be better

- configuration language has a non-intuitive syntax

- filtering conditions cannot be named, which prevents their reuse

Syslog servers – syslog-ng

● http://www.balabit.com/network-security/syslog-ng/

+ a flexible and readable configuration language which allows for specifying complex configurations

+ single-threaded until the 3.2 version, but multi-threading has been introduced into recent versions which considerably improves scalability and performance

+ well documented

- open-source edition does not support disk based buffers

- no support for Elasticsearch (although could be configured through a self-developed output plugin)

Syslog servers – nxlog

● http://nxlog-ce.sourceforge.net/

+ native support for Windows platform and Windows Event Log

+ supports the use of embedded constructs for message processing

+ supports a number of input and output types not supported by competitors (e.g., accepting input events from SQL databases, producing output events in GELF format, etc.)

- poor message filtering performance

Elasticsearch DB for log management

● http://www.elasticsearch.org/

✔ Apache Lucene based noSQL database technology that is frequently used for storing log data

✔ native support for distributed operations and building clusters

✔ allows for splitting indexes into parts (shards) and distributing shards over several nodes (e.g., split an index into 2 shards and distribute them over 2 nodes, turning disks at individual nodes into a single logical storage space)

✔ indexes can be configured to have one or more replicas which increases fault tolerance (e.g., split an index into 2 shards and configure the index to have 1 replica, and distribute resulting 4 shards across 4 nodes)

✔ builtin support for data compression (important when storing large volumes of log data)

✔ supported by several log management tools (Kibana, Graylog2, logstash, rsyslog)

Log management tools – Kibana

● http://kibana.org/

✔ Kibana is a GUI for searching log data stored in Elasticsearch

✔ Kibana is designed to work with the logstash log preprocessing tool, but can accept data from any other tool which is able to store it in Elasticsearch in a recognizable way (e.g., rsyslog)

✔ Kibana is lightweight, written in Ruby, accessible over HTTP, and contains only searching and reporting functionality (e.g., user authentication and SSL connectivity have to be accomplished with external tools like an Apache reverse proxy)

✔ When building a Kibana based log management solution, you create the system from well-documented and well-established building blocks, which leaves room for many customizations during initial installation and later maintenance

Kibana web interface

Log management tools – Graylog2

● http://graylog2.org/

✔ A full log management solution consisting of a server for log message reception (syslog, GELF) and a GUI

✔ The GUI is user-friendly with builtin help, and is intuitive to use

✔ Many configuration tasks (such as setting log data retention intervals, etc.) can be accomplished through a web interface

✔ Graylog2 supports users with different roles and password authentication

✔ Earlier versions of Graylog2 employed a single-server approach which limited system scalability, while the most recent versions allow several servers to run in parallel

Graylog2 web interface

Other log management tools

● Logstash (http://www.logstash.net/) - has a web interface for searching logs stored in an Elasticsearch database, but since it supports a large number of input and output types, it is mostly used as a log parsing and conversion tool

● ELSA (http://code.google.com/p/enterprise-log-search-and-archive/) - a log management system which is built on top of syslog-ng, MySQL and Sphinx

Netflow protocol

● Proposed by Cisco in the 1990s, nowadays supported by many major vendors

● A Netflow-enabled network device (e.g., router, switch, dedicated probe) collects network traffic statistics and exports them to a collector over UDP

● Traffic statistics consist of flow records, where each record describes some network flow

● Network flow – a unidirectional sequence of packets which share transport protocol, source and destination IP, source and destination port, and a few other parameters (e.g., type of service)
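
The flow definition above can be sketched in a few lines of Python: packets are grouped under a key built from the fields they share, and each key accumulates packet and byte counters. This is a simplified illustration, not how any particular exporter works. The packet dictionaries, their field names and the byte counts are invented for the example, and real exporters also consider further parameters and flow timeouts.

```python
from collections import defaultdict

def aggregate_flows(packets):
    """Group packets into unidirectional flows keyed by their shared fields."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        key = (pkt["proto"], pkt["src_ip"], pkt["src_port"],
               pkt["dst_ip"], pkt["dst_port"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += pkt["bytes"]
    return dict(flows)

# Invented packets: two from client to server, one in the opposite direction.
packets = [
    {"proto": "TCP", "src_ip": "10.3.1.1", "src_port": 48896,
     "dst_ip": "10.2.1.1", "dst_port": 80, "bytes": 60},
    {"proto": "TCP", "src_ip": "10.2.1.1", "src_port": 80,
     "dst_ip": "10.3.1.1", "dst_port": 48896, "bytes": 60},
    {"proto": "TCP", "src_ip": "10.3.1.1", "src_port": 48896,
     "dst_ip": "10.2.1.1", "dst_port": 80, "bytes": 52},
]
flows = aggregate_flows(packets)
for key, counters in flows.items():
    print(key, counters)
```

Because the key includes the direction (source before destination), the reply packet forms a flow of its own. This is why one TCP connection normally shows up as two flow records.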

Example of collected Netflow data

● The following two records represent a successfully negotiated and completed TCP connection from client 10.3.1.1 port 48896 to the HTTP service (port 80) running at the server 10.2.1.1:

Start = 2013-02-18 00:04:05.733 Duration = 0.014 TCP 10.3.1.1:48896 -> 10.2.1.1:80 TCPflags = .AP.SF Packets = 5 Bytes = 513

Start = 2013-02-18 00:04:05.734 Duration = 0.010 TCP 10.2.1.1:80 -> 10.3.1.1:48896 TCPflags = .AP.SF Packets = 4 Bytes = 375
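
A hedged sketch of how the two records above could be matched programmatically: the Python code below parses the textual record layout shown in this example and checks whether one record is the reverse direction of the other. The regular expression is written for this exact layout with IPv4 addresses, and the function names are hypothetical. Other tools print flow records differently.

```python
import re

RECORD = re.compile(
    r"Start = (?P<start>\S+ \S+) Duration = (?P<duration>[\d.]+) "
    r"(?P<proto>\S+) (?P<src>\S+):(?P<sport>\d+) -> (?P<dst>\S+):(?P<dport>\d+) "
    r"TCPflags = (?P<flags>\S+) Packets = (?P<packets>\d+) Bytes = (?P<bytes>\d+)"
)

def parse_record(line):
    """Turn one textual flow record into a dict; return None if it does not match."""
    match = RECORD.search(line)
    if match is None:
        return None
    record = match.groupdict()
    for name in ("sport", "dport", "packets", "bytes"):
        record[name] = int(record[name])
    return record

def is_reverse(a, b):
    """True if record b describes the opposite direction of record a."""
    return (a["proto"] == b["proto"]
            and (a["src"], a["sport"]) == (b["dst"], b["dport"])
            and (a["dst"], a["dport"]) == (b["src"], b["sport"]))

lines = [
    "Start = 2013-02-18 00:04:05.733 Duration = 0.014 TCP 10.3.1.1:48896 -> "
    "10.2.1.1:80 TCPflags = .AP.SF Packets = 5 Bytes = 513",
    "Start = 2013-02-18 00:04:05.734 Duration = 0.010 TCP 10.2.1.1:80 -> "
    "10.3.1.1:48896 TCPflags = .AP.SF Packets = 4 Bytes = 375",
]
request, reply = (parse_record(line) for line in lines)
paired = is_reverse(request, reply)
total_packets = request["packets"] + reply["packets"]
total_bytes = request["bytes"] + reply["bytes"]
print(paired, total_packets, total_bytes)
```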

How to collect/use Netflow data

● Enable Netflow collection at your network device or use dedicated probes (e.g., fprobe)

● Open-source software packages for collecting Netflow

✔ NfSen (http://nfsen.sourceforge.net/)

✔ SiLK (http://tools.netsa.cert.org/silk/)

✔ Flow-tools (http://www.splintered.net/sw/flow-tools/) - unmaintained

● What you might be interested in finding in Netflow data

✔ Flows with unusual combinations of TCP flags (e.g., FIN without ACK)

✔ Flows which represent connections to/from known bad IP addresses

✔ Unexpected spikes in traffic volumes (measured in number of bytes, packets, flows) associated with certain sources (e.g., foreign IP addresses or bad IP addresses)
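
Two of the checks listed above can be sketched in Python as follows. This is an illustration under stated assumptions: the flag strings are assumed to use the letters U, A, P, R, S, F with a dot for an unset flag (as in the `.AP.SF` example earlier), the flow dictionaries are invented, and `BAD_IPS` is a hypothetical blocklist that uses an address from the documentation range.

```python
BAD_IPS = {"192.0.2.66"}  # hypothetical blocklist for illustration

def decode_flags(flag_string):
    """Return the set of flag letters present in a string such as '.AP.SF'."""
    return {char for char in flag_string if char in "UAPRSF"}

def has_unusual_flags(flag_string):
    """Flag a flow in which FIN was seen but ACK never was."""
    flags = decode_flags(flag_string)
    return "F" in flags and "A" not in flags

def review(flows):
    """Yield (flow, reason) pairs for flows that deserve a closer look."""
    for flow in flows:
        if has_unusual_flags(flow["flags"]):
            yield flow, "FIN without ACK"
        if flow["src"] in BAD_IPS or flow["dst"] in BAD_IPS:
            yield flow, "known bad IP address"

# Invented flows: one normal, one with odd flags, one involving a listed address.
flows = [
    {"src": "10.3.1.1", "dst": "10.2.1.1", "flags": ".AP.SF"},
    {"src": "10.3.1.7", "dst": "10.2.1.1", "flags": ".....F"},
    {"src": "192.0.2.66", "dst": "10.2.1.1", "flags": ".A..S."},
]
findings = list(review(flows))
for flow, reason in findings:
    print(flow["src"], "->", flow["dst"], reason)
```

Detecting traffic spikes needs a baseline of normal volumes per source, so it is left out of this sketch.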