Wednesday, December 24, 2014

A Grafana Dashboard for Graphite and Sensu

Grafana is described as a feature-rich metrics dashboard and graph editor for various time series databases (TSDB), and it gives end users a much more robust set of capabilities for querying, visualizing, and annotating data. Today, I'm going to show you how to hook Grafana into an existing Graphite installation and explain where Sensu metrics tracking fits in.

Before we dive into the implementation, let's take a step back and look at the system from a high level.

As you can see, we have a pretty robust system in front of us. There are three dashboards to interface with and multiple dependencies, all tied together by our monitoring server. We're also leveraging RabbitMQ, a message broker, and Redis, a key-value store, to buffer the events that come through.

Prerequisites: You'll want to set up Graphite and Sensu first. This article was written for Mac OS X and Ubuntu Linux users.

Grafana Installation

# download and unpack the grafana release tarball
cd /opt
sudo curl -O -L http://grafanarel.s3.amazonaws.com/grafana-1.9.0.tar.gz
sudo tar xf grafana-1.9.0.tar.gz
sudo cp -R grafana-1.9.0 /usr/share/grafana

# clone the config file
cd /usr/share/grafana/
sudo cp config.sample.js config.js

Java 8 Configuration

# fetch oracle java ppa
sudo add-apt-repository -y ppa:webupd8team/java

# update the packages
sudo apt-get update

# install the latest stable oracle java 8
sudo apt-get -y install oracle-java8-installer

Elasticsearch Configuration

# fetch the elasticsearch/logstash public GPG key
wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -

# add the elasticsearch source list
echo 'deb http://packages.elasticsearch.org/elasticsearch/1.1/debian stable main' | sudo tee /etc/apt/sources.list.d/elasticsearch.list

# update the packages
sudo apt-get update

# install the elasticsearch 1.1.1 release
sudo apt-get -y install elasticsearch=1.1.1

# edit a configuration file for elasticsearch
sudo nano /etc/elasticsearch/elasticsearch.yml

# add the following line to this file
script.disable_dynamic: true

# uncomment the following line in this file
network.host: localhost

# restart elasticsearch so the new config takes effect
sudo service elasticsearch restart

# configure elasticsearch to run on startup
sudo update-rc.d elasticsearch defaults 95 10

Grafana Configuration

Since we depend on both Graphite and Elasticsearch, we need to update the datasources block with the absolute URLs of our running Graphite application and our Elasticsearch server. If you don't remember what those are off the top of your head, refer back to your /etc/nginx/sites-available/default file or whichever file you set up your server blocks in. Whether you're running Graphite on port 8080 or on port 80 under your own subdomain, take note of that location. Then, uncomment and update the following lines in /usr/share/grafana/config.js.

// Graphite & Elasticsearch example setup
datasources: {
  graphite: {
    type: 'graphite',
    url: "http://localhost:8080",
    render_method: 'GET' // to prevent CORS issues down the road
  },
  elasticsearch: {
    type: 'elasticsearch',
    url: "http://localhost:9200",
    index: 'grafana-dash',
    grafanaDB: true
  }
},

Note: Setting render_method to GET to prevent CORS issues is a recommendation provided by the Grafana team.
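Before loading the dashboard, it can help to confirm that the Graphite URL in config.js answers render requests. The sketch below only builds the URL. render_url is a hypothetical helper name, and the target *.load_avg.one assumes the metric naming produced by the load metrics plugin later in this article. Graphite's render API accepts the target, from, and format parameters used here.

```shell
# Hypothetical helper: build a Graphite render API URL for a quick sanity check.
# $1 = Graphite base URL (the same value used for "url" in config.js)
# $2 = metric target, $3 = relative start time (for example -1h)
render_url() {
  # ${1%/} drops a trailing slash so the path is not doubled.
  printf '%s/render?target=%s&from=%s&format=json\n' "${1%/}" "$2" "$3"
}

# Assumed values; adjust to your own Graphite host and metric names.
render_url "http://localhost:8080" "*.load_avg.one" "-1h"
```

Passing the printed URL to curl (quoted, so the shell does not interpret the * or &) should return a JSON array once metrics are flowing. An empty array usually means the target does not match any stored series yet.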

Nginx Configuration

Let's add a new server block by running sudo vi /etc/nginx/sites-available/default.

# using a port
server {
  listen                9300;

  access_log            /var/log/nginx/example.access.log;
  error_log            /var/log/nginx/example.error.log;

  location / {
    root /usr/share/grafana;
  }
}

Sensu Server Configuration for Metrics Tracking

The following steps will teach you how to leverage a plugin for tracking load across all of our clients. Make sure you've set up Sensu first. When you complete these steps, you'll be able to make a query that displays the following visualization in Grafana:

We now have to SSH into our sensu server and fetch a few plugins.

Fetching the Required Plugins

Now, we're going to need to pull down the Sensu community plugins repository and copy a few plugins out of it. These plugins are run on every machine that needs to be tracked by Sensu, including the server itself. Perform these steps on the Sensu server and ALL client machines.

# install git if you haven't already
sudo apt-get install git

# clone the repo into /opt, where the copy commands below expect it
sudo git clone https://github.com/sensu/sensu-community-plugins.git /opt/sensu-community-plugins

# copy the graphite mutator plugin
sudo cp /opt/sensu-community-plugins/mutators/graphite.rb /etc/sensu/plugins/

# copy the load metrics plugin
sudo cp /opt/sensu-community-plugins/plugins/system/load-metrics.rb /etc/sensu/plugins

# copy the vmstats plugin
sudo cp /opt/sensu-community-plugins/plugins/system/vmstat-metrics.rb /etc/sensu/plugins

Configuring the Graphite Mutator

Mutators are responsible for altering event data before passing it to a handler. This helps us end up with more consistent formatting for graphite to consume, down the road. So for example, our load metrics flow ends up looking something like:

cat event.json | load-metrics.rb | mutator.rb | handler.rb | transmit over tcp
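To make the mutator stage more concrete, here is a rough sketch of the kind of rewrite it performs. As far as I understand it, the community graphite.rb mutator replaces the dots in the client hostname with underscores, so that a fully qualified hostname does not fan out into several levels of the Graphite tree. The function below only imitates that idea in shell and is not the plugin's code; mutate_line, the hostname, and the metric values are assumptions for illustration.

```shell
# Illustrative only: imitate the hostname rewrite a Graphite mutator performs.
# $1 = client hostname, $2 = one metric line ("<path> <value> <timestamp>")
mutate_line() {
  host="$1"
  line="$2"
  case "$line" in
    "$host".*)
      # Replace dots in the hostname part only; keep the rest of the line intact.
      safe_host=$(printf '%s' "$host" | tr '.' '_')
      printf '%s%s\n' "$safe_host" "${line#"$host"}"
      ;;
    *)
      # Lines that do not start with the hostname pass through unchanged.
      printf '%s\n' "$line"
      ;;
  esac
}

mutate_line "web1.example.com" "web1.example.com.load_avg.one 0.42 1419400000"
```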

Perform this step on the Sensu server.

# prepare folders/files
sudo mkdir -p /etc/sensu/conf.d/mutators
sudo vi /etc/sensu/conf.d/mutators/graphite.json

# add the following content
{
  "mutators": {
    "graphite": {
      "command": "/usr/bin/ruby /etc/sensu/plugins/graphite.rb"
    }
  }
}

Configuring the Graphite Handler

Perform this step on the Sensu server.

# prepare folders/files
sudo mkdir -p /etc/sensu/conf.d/handlers
sudo vi /etc/sensu/conf.d/handlers/graphite.json

# add the following content
{
  "handlers": {
    "graphite": {
      "type": "tcp",
      "socket": {
        "host": "127.0.0.1",
        "port": 2003
      },
      "mutator": "graphite"
    }
  }
}
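A quick way to check that the handler's target is listening is to hand-write one data point in Carbon's plaintext format, which is one "<path> <value> <timestamp>" line per metric. The sketch below assumes Carbon is on 127.0.0.1:2003, as in the handler above, and that nc (netcat) is installed. carbon_line, send_to_carbon, and the metric name test.sensu.handler are made up for this example.

```shell
# Format one data point in Carbon's plaintext protocol: "<path> <value> <timestamp>".
carbon_line() {
  printf '%s %s %s\n' "$1" "$2" "$3"
}

# Send a single data point to Carbon over TCP, stamped with the current time.
# $1 = host, $2 = port, $3 = metric path, $4 = value
# Assumes netcat is available; -w 1 gives up after one second of inactivity.
send_to_carbon() {
  carbon_line "$3" "$4" "$(date +%s)" | nc -w 1 "$1" "$2"
}

# Print what would be sent (no network access needed for this line).
carbon_line "test.sensu.handler" 1 "$(date +%s)"

# Uncomment to actually send the data point to the handler's host and port:
# send_to_carbon 127.0.0.1 2003 "test.sensu.handler" 1
```

If the send goes through, a test/sensu/handler.wsp file should show up under /opt/graphite/storage/whisper shortly afterwards.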

Tracking Metrics

Create the following files on the Sensu server:

# create a new file with the following content
sudo vi /etc/sensu/conf.d/load_metrics.json
{
  "checks": {
    "load_metrics": {
      "type": "metric",
      "handlers": ["graphite"],
      "command": "/etc/sensu/plugins/load-metrics.rb",
      "interval": 60,
      "subscribers": ["all"]
    }
  }
}

# create a second file with the following content
sudo vi /etc/sensu/conf.d/vmstat_metrics.json
{
  "checks": {
    "vmstat_metrics": {
      "type": "metric",
      "handlers": ["graphite"],
      "command": "/etc/sensu/plugins/vmstat-metrics.rb",
      "interval": 60,
      "subscribers": ["all"]
    }
  }
}
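The two check definitions differ only in the check name and the plugin file, so they can be generated from one template if you add more metric plugins later. This is just a sketch: metric_check_json is a hypothetical helper, and it assumes every plugin lives in /etc/sensu/plugins and uses the same handler, interval, and subscribers as the checks above.

```shell
# Hypothetical helper: print a Sensu metric check definition.
# $1 = check name (e.g. load_metrics), $2 = plugin file name (e.g. load-metrics.rb)
metric_check_json() {
  cat <<EOF
{
  "checks": {
    "$1": {
      "type": "metric",
      "handlers": ["graphite"],
      "command": "/etc/sensu/plugins/$2",
      "interval": 60,
      "subscribers": ["all"]
    }
  }
}
EOF
}

# Print the definition; redirect it into /etc/sensu/conf.d/<name>.json to use it.
metric_check_json "vmstat_metrics" "vmstat-metrics.rb"
```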

Now restart Sensu on the master by running sudo service sensu-server restart && sudo service sensu-api restart. Don't proceed until your check shows up under "checks" in the dashboard.

More Sensu Client Configurations

You'll need to add the following file to every Sensu client you've set up. Put the client's own IP address or hostname in the address field; Sensu uses it to identify the client, while the connection to the server is configured separately in the RabbitMQ settings. Notice that subscriptions is set to "all". We're doing this because we want to track stats for all of our servers. If you only want to monitor stats for some of your servers, you can create a "stats" subscription instead, provided you also update the subscribers property of the checks on the server.

# on every client
sudo vi /etc/sensu/conf.d/client.json

# add the following content
{
  "client": {
    "name": "client1",
    "address": "client1.ip.address",
    "subscriptions": ["all"]
  }
}

Now restart Sensu on the client by running sudo service sensu-client restart.

CORS

Configuring Elasticsearch for CORS

Now navigate to port 9300 in your browser (http://<ip.address.of.vm>:9300). Depending on how you set up Elasticsearch, you may run into CORS issues when you open your Grafana dashboard.

# edit your elasticsearch config file on...
# ...the server you installed elasticsearch on
sudo vi /etc/elasticsearch/elasticsearch.yml

# add the following lines to this file
http.cors.allow-credentials: true
http.cors.enabled: true
http.cors.allow-origin: http://my.example.com 

Note: When you set http.cors.allow-origin, make sure you put in the absolute URL that Grafana is served from. Then, restart Elasticsearch by running sudo service elasticsearch restart.

Configuring Nginx for CORS

The Grafana docs recommend adding the following lines to the Nginx server block that serves Graphite, since that is the server answering Grafana's cross-origin requests. Run sudo vi /etc/nginx/sites-available/default.

if ($http_origin ~* (https?://[^/]*\.somedomain\.com(:[0-9]+)?)) {  
    set $cors "true";                                               
}

if ($cors = 'true') {
    add_header  Access-Control-Allow-Origin $http_origin;          
    add_header  "Access-Control-Allow-Credentials" "true"; 
    add_header  "Access-Control-Allow-Methods" "GET, OPTIONS";
    add_header  "Access-Control-Allow-Headers" "Authorization, origin, accept";
}
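Once the configs are reloaded, you can check the response headers from the command line instead of relying on the browser console. The helper below reads HTTP response headers on stdin and succeeds only if Access-Control-Allow-Origin matches the expected origin. has_cors_origin is a hypothetical name, and http://my.example.com stands in for wherever your Grafana is served from.

```shell
# Hypothetical helper: read HTTP response headers on stdin and succeed only if
# Access-Control-Allow-Origin equals the expected origin passed as $1.
has_cors_origin() {
  # Strip carriage returns, then look for an exact, case-insensitive header line.
  tr -d '\r' | grep -i -x -F "Access-Control-Allow-Origin: $1" >/dev/null
}

# Sample headers standing in for a real response.
printf 'HTTP/1.1 200 OK\r\nAccess-Control-Allow-Origin: http://my.example.com\r\n\r\n' |
  has_cors_origin "http://my.example.com" && echo "origin allowed"

# Against a live server (assumed URL), feed it real headers instead:
# curl -s -o /dev/null -D - -H "Origin: http://my.example.com" http://localhost:9200/ |
#   has_cors_origin "http://my.example.com"
```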

More Troubleshooting Tips

You can tail carbon cache by running sudo tail -f /opt/graphite/storage/log/carbon-cache/carbon-cache-a/listener.log. Every minute or so, that log will be updated as bulk events are received. You can run sudo /opt/graphite/bin/carbon-cache.py start --debug, as an alternative.

Clearing Out Our Whisper Data

Whisper is the database Graphite uses to store its time-series data. It's fixed-size, fast, reliable, optimized as far as Python allows, and modeled after round-robin databases (RRD). It is also designed to downsample high-resolution recent data points into lower resolutions for long-term retention.

While you're experimenting with different naming hierarchies early on, you may want to clear out obsolete data to make it easier to build queries in Grafana. Simply navigate to the directory at /opt/graphite/storage/whisper and remove the files you no longer need. You can also run a pattern similar to the following to clear out files in bulk:

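# with the load metrics plugin above, load_avg is a directory of .wsp files, so remove it recursively
sudo find /opt/graphite/storage/whisper -depth -type d -name load_avg -exec rm -r {} +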
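Since deletions under the whisper directory cannot be undone, it is worth listing what a pattern matches before removing anything. Here is a small sketch of that dry run, assuming the default /opt/graphite/storage/whisper layout. list_whisper_matches is a hypothetical helper and only prints paths.

```shell
# Hypothetical helper: list whisper files and directories whose name matches a pattern.
# $1 = whisper root, $2 = name pattern as understood by find -name
list_whisper_matches() {
  find "$1" -name "$2" -print | sort
}

# Demo against a throwaway tree so nothing real is touched.
demo=$(mktemp -d)
mkdir -p "$demo/client1/load_avg" "$demo/client1/vmstat"
touch "$demo/client1/load_avg/one.wsp" "$demo/client1/vmstat/cpu.wsp"
list_whisper_matches "$demo" "load_avg"
rm -r "$demo"
```

On the real server, point it at /opt/graphite/storage/whisper (with sudo if the directory is not readable by your user) and review the output before deleting.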

2 comments:

  1. Nice tutorial. I followed it and now I have an ansible to create my virtual machine with grafana and graphite and it works very well!!
    However I would like to clarify something about the architecture.

    Grafana talks directly with Graphite-webapp (Django) right? For this reason we need two places to store dashboards:
    - ElasticSearch to allow Grafana search, save and store dashboards.
    - SQLite (or PostGres) to allow Graphite save and store dashboards.
    And finally all the data (time series data) is stored in one place: Whisper.

    Thus, even if I only use Grafana I still need Graphite-webapp and all its dependencies like SQLite right?

    1. Hey Mariano. Thanks for pointing all that out. I updated the diagram to reflect these dependencies. To answer your question:
      • Yes, you would still need the webapp and all of its dependencies because the webapp provides the rest api for third-party tools like Grafana. Also, by using Graphite, you have to also depend on elasticsearch to be able to store and search dashboards, but you can use json and scripted dashboards if you don't want to set up Elasticsearch.
      • As an alternative, you can use Influxdb or OpenTSDB.

      By the way, I have another article coming where I'm using Shinken + InfluxDB + Grafana for monitoring. It's much more mature than Sensu and has access to the ecosystem of Nagios plugins. I'll have that published, soon, but definitely start googling it.

      Here are two good articles to read, by the way:
      • https://grey-boundary.io/the-architecture-of-clustering-graphite/
      • http://docs.grafana.org/installation/
