Saturday, November 29, 2014

The Elasticsearch, Logstash and Kibana (ELK) Stack Architecture

In this article, we'll be going over the basic architecture of an ELK Stack logging system. If you aren't familiar with logging, check out this article as well as others from different sources to understand why the visibility of your logs is so important. We won't be going over how to implement this solution. That we'll save for another article. For now, let's go over the three major components and how they all fit together.

Elasticsearch

Elasticsearch is an open-source, distributed, RESTful search engine. It's highly performant, highly configurable, and geared toward in-depth data exploration. Taking full advantage of the JVM and the underlying hardware, it can run operations in parallel and return search results and analyses in near real time. It's also easy to scale: an Elasticsearch cluster is composed of nodes, which can take on different roles, such as master nodes, data nodes, and client (load-balancing) nodes.

With a basic two-node setup, you have everything you need to begin storing and retrieving data (documents) distributed across primary and replica shards. Elasticsearch also has very powerful aggregation capabilities (the successor to faceted search), and because of its node-and-shard architecture, it can return results in milliseconds, versus the minutes or hours a batch processing job would take.
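To make the aggregation idea concrete, here's a minimal sketch of a request body you could POST to an index's _search endpoint. The index name (logs) and field name (response_code) are assumptions for illustration, not from the setup described in this post:

```json
{
  "size": 0,
  "aggs": {
    "status_codes": {
      "terms": { "field": "response_code" }
    }
  }
}
```

This asks Elasticsearch to skip returning individual documents ("size": 0) and instead bucket all matching log entries by response code, returning a count per bucket, which is the kind of query a Kibana panel issues behind the scenes.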

Logstash

Logstash is the component of the ELK stack responsible for collecting logs, then parsing and storing them for later analysis. A typical deployment consists of a centralized Logstash server, or hub, that collects logs from a series of logstash-forwarder agents living on your servers. Logstash can be configured to accept a variety of input formats, filter them down using grok patterns, and pass the results along to outputs, for example to Elasticsearch or to a monitoring server such as Sentry or Sensu.
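As a rough sketch (not the exact configuration from this setup), a central Logstash hub that accepts logs from logstash-forwarder agents, parses Apache access-log lines with a grok pattern, and ships the structured results to Elasticsearch might look like this. The certificate paths are illustrative assumptions:

```conf
input {
  lumberjack {
    # logstash-forwarder ships logs over the lumberjack protocol, secured with SSL
    port            => 5043
    ssl_certificate => "/etc/pki/tls/certs/logstash.crt"   # illustrative path
    ssl_key         => "/etc/pki/tls/private/logstash.key" # illustrative path
  }
}

filter {
  grok {
    # Parse standard Apache combined-format access logs into structured fields
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}

output {
  elasticsearch { host => "localhost" }
}
```

The grok filter is what turns an opaque log line into named fields (client IP, response code, bytes, and so on) that Elasticsearch can index and Kibana can chart.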

Kibana

Kibana is the web interface that makes it easier to visually understand your log data in real time. It ships with powerful Lucene querying capabilities and works seamlessly with Elasticsearch and Logstash. Using its widget-based UI, you can add a variety of charts, histograms, and even geographical representations of your data.
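For instance, Kibana's query bar accepts Lucene query syntax, so a filter like the following narrows a dashboard down to server errors from a single instance. The field names (type, response, host) are assumptions for illustration:

```
type:"apache-access" AND response:500 AND host:"fairchild"
```

Every panel on the dashboard then redraws against only the documents matching that query.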

The following screenshot visualizes logs from three Linux instances I've set up. The "elon" instance is where I serve all of my front end applications, the "fairchild" instance runs all of my server-side services, and the "intrepid" instance is where I run all of my logging and monitoring services. If you're curious about the meaning behind all of these host names, make sure to check out this article.

That concludes this post. Head over to this article to find out how to implement a logging system, yourself. Also, take a moment to stop by Jordan Sissel's website. Until next time.
