At my place of employment we recently set up Zabbix for our infrastructure monitoring. To complement that system and allow easier, faster diagnosis of the problems it detects, we looked into building a centralized logging system for our servers and applications. This had been an ongoing search, but we never found anything that truly fit our needs. Recently I came across Logstash, a program written in JRuby that operates like many Unix-style tools; it has been described as similar to sed. Logstash is designed to chain a number of filters together to process an input source and send the results to many different outputs. One of those outputs is Elasticsearch, which allows easy searching, pattern matching, and even correlation, without dumping everything into a backend SQL database, which is often slow and cumbersome for unstructured data like log files.
This post is a set of steps to install and configure a centralized Logstash installation. It will be tested with Apache access logs.
# Install Java
sh jdk*rpm.bin* -noregister
# Link the JDK into your environment:
alternatives --install /usr/bin/java java /usr/java/default/bin/java 1
alternatives --config java
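To confirm the alternatives link took effect, a quick check:

# Verify the JDK now on the PATH is the one we just linked
java -version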
# Download Elastic Search
ES_PACKAGE=elasticsearch-0.17.9.zip # Latest release as of publishing.
ES_DIR=${ES_PACKAGE%%.zip}
SITE=http://github.com/downloads/elasticsearch/elasticsearch
if [ ! -d "$ES_DIR" ] ; then
  wget --no-check-certificate $SITE/$ES_PACKAGE
  unzip $ES_PACKAGE
fi
cd "$ES_DIR" # The remaining Elastic Search steps run from inside this directory.
# Install as system service
bin/elasticsearch install
# Run Elastic Search
bin/elasticsearch -f -Xmx3g -Xms3g
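Once Elastic Search is running, you can confirm the node is answering on its HTTP port (9200 by default):

# A JSON response with cluster details means the node is up
curl http://localhost:9200/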
# Download grok
wget --no-check-certificate https://github.com/jordansissel/grok/tarball/master -O grok.tar.gz
tar zxf grok.tar.gz
# Install grok
cd *grok*
make grok
make install
ldconfig # Load the new libraries into the library path.
# If this is a 64-bit machine, create a symbolic link from /usr/lib64/ to the
# installed grok library in /usr/lib/, e.g. (assuming the default library name):
# ln -s /usr/lib/libgrok.so /usr/lib64/libgrok.so
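As a quick sanity check that the linker can now find grok (the library name libgrok is an assumption based on a default install):

# List the linker cache and look for the grok shared library
ldconfig -p | grep grok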
# Download logstash
src=/usr/src/
mkdir -p $src/logstash
cd $src/logstash
wget http://semicomplete.com/files/logstash/logstash-1.0.17-monolithic.jar
cat << EOF > logstash.conf
input {
  stdin {
    format => "plain"
    message_format => "plain"
    type => "apache-access"
  }
  tcp {
    type => "apache-access"
    port => 3333
  }
}
filter {
  grok {
    type => "syslog" # for logs of type "syslog"
    pattern => "%{SYSLOGLINE}"
    # You can specify multiple 'pattern' lines
  }
  grok {
    type => "apache-access" # for logs of type 'apache-access'
    pattern => "%{COMBINEDAPACHELOG}"
  }
  date {
    type => "syslog"
    # The 'timestamp' and 'timestamp8601' names are for fields in the
    # logstash event. The 'SYSLOGLINE' grok pattern above includes a field
    # named 'timestamp' that is set to the normal syslog timestamp if it
    # exists in the event.
    timestamp => "MMM  d HH:mm:ss" # syslog 'day' value can be space-leading
    timestamp => "MMM dd HH:mm:ss"
    timestamp8601 => ISO8601 # Some syslogs use ISO8601 time format
  }
  date {
    type => "apache-access"
    timestamp => "dd/MMM/yyyy:HH:mm:ss Z"
  }
}
output {
  stdout { }
  # If you can't discover using multicast, set the address explicitly
  elasticsearch {
    host => "localhost"
  }
}
EOF
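For reference, %{COMBINEDAPACHELOG} matches Apache's combined log format; the classic sample line from the Apache documentation gives a feel for what the filter expects:

127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"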
# Run logstash
java -jar logstash-*-monolithic.jar agent -f logstash.conf -- web --backend elasticsearch://localhost/
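The agent part reads and processes events according to logstash.conf, while the -- web part starts the bundled web interface for searching the indexed events; in this release it should listen on port 9292 by default.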
# Parse logs
time tail -n100000 /tmp/wwt-virt-extra-web15_access.log | nc 192.168.1.119 3333
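To verify the events actually made it into Elastic Search, you can query it directly on the logstash server; the type value matches the apache-access type set in the config:

# Search indexed apache-access events; add &size=1 to inspect a sample document
curl 'http://localhost:9200/_search?q=type:apache-access&pretty=true'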