Sunday, July 12, 2015

Manually Monitoring a Client with Shinken

Now that we've set up our Shinken server, we're going to want to monitor a client.

Prerequisites: Check out this introductory guide to Shinken and Nagios, and make sure you've set up your Shinken server, first. You'll need to set up an additional client server, so make sure you're familiar with Vagrant and VirtualBox. This guide was written for Mac OS X users.

More Configuration for the Shinken Server

The first thing we'll want to do is make some changes to our Shinken server to prepare it for the client, so let's ssh back into that machine.

# ssh into the server vm
vagrant ssh hostname1

# log in as root user
sudo su

# switch to the shinken user
su shinken

# create a host file config for the client
vi /etc/shinken/hosts/client.cfg
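As a sketch, the host definition inside that file might look something like the following. Shinken reuses the Nagios object syntax; the host name, address, and template here are assumptions, so adjust them to match your client VM:

```
# /etc/shinken/hosts/client.cfg (hypothetical values)
define host {
    use         generic-host
    host_name   client1
    address     192.168.50.5
}
```

The `use generic-host` line pulls in a standard host template so you don't have to specify every check interval and notification option by hand.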

Manually Setting Up a Shinken Monitoring Server

We're mainly focusing on how to set up a Shinken monitoring server today. Make sure not to skip the prerequisites if you'd like to learn more about Shinken from a high level. When you're finished going through this guide, I'd urge you to learn how to set up a Shinken client, as well.

Introduction to Shinken and Nagios

Nagios is a mature, established monitoring and alerting solution for your entire IT infrastructure. Nagios offers coverage for servers, switches, applications, and services. Nagios is free. Along with the software, you get plenty of simple-to-use plugins, years of development and documentation, and a large community to reach out to or hire from. And that's just about everything good Andy Sykes has to say about it in his hilarious talk at the London DevOps meetup[1].

"Monitoring" by Diglinks at en.wikipedia - Transferred from en.wikipedia; transfer was stated to be made by User:Esquilo.. Licensed under Public Domain via Wikimedia Commons

Friday, July 10, 2015

Setting Up a Firewall with Iptables on Linux

Since we've already become familiar with Packet Filtering and Iptables, we're going to get right into the iptables commands you need to execute to set up rules and manage the packets entering and exiting your server.

Intro to Packet Filtering and Iptables


iptables is an interface to Netfilter, the kernel-level Linux firewall, allowing the administrator to add, remove, or modify packet filtering and NAT rules. These rules govern how different types of packets are treated by the server: whether to accept them, drop them, and so on. Understanding how to set up and configure iptables is the first step to managing your Linux firewall.

The basic flow of a data packet hitting a Linux firewall is as follows:
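A rough text sketch of that traversal, along with a few example commands for manipulating the chains involved (the port numbers are purely illustrative, and these commands require root):

```
# Standard Netfilter chain traversal, roughly:
#
#   inbound packet -> PREROUTING -> routing decision
#       -> INPUT   (destined for this host)    -> local process
#       -> FORWARD (destined for another host) -> POSTROUTING -> out
#   local process  -> OUTPUT -> POSTROUTING -> out

# accept inbound TCP traffic to port 22 (SSH) on the INPUT chain
iptables -A INPUT -p tcp --dport 22 -j ACCEPT

# drop everything else arriving on INPUT
iptables -A INPUT -j DROP

# list the INPUT chain with rule positions
iptables -L INPUT -n --line-numbers
```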

TCP/IP, Subnet Masking, CIDR, Default Gateways, and DHCP


TCP/IP is a collection of protocols. The two main protocols we need to be concerned with are TCP (Transmission Control Protocol), which enables two hosts to connect and exchange data, and IP (Internet Protocol), which handles the routing of information (packets) between servers and devices within a network. So TCP/IP can be categorized as a routable protocol, where you can divide networks into subnetworks and define the communication channels between them. With a non-routable protocol, all devices communicate directly with one another, which results in extremely inefficient bandwidth utilization.

By dividing the hosts across subnetworks that connect to the outside world through a router, you not only have more efficient communication, but more IP addresses afforded to everyone since we don't need to allocate a unique IP address to every single device. Now that we have a better idea of what a subnetwork is in relation to the network, let's talk about the different components of a subnetwork.
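To make the subnetwork idea concrete, here's a small bash sketch that applies a subnet mask to a host address with a bitwise AND to recover its network address (the addresses are made up for illustration):

```shell
#!/usr/bin/env bash
# Apply a 255.255.255.0 (/24) mask to a host address to find its network.
ip="192.168.10.37"
mask="255.255.255.0"

# split each dotted quad into its four octets
IFS=. read -r i1 i2 i3 i4 <<< "$ip"
IFS=. read -r m1 m2 m3 m4 <<< "$mask"

# each octet of the network address is (ip octet AND mask octet)
echo "network: $((i1 & m1)).$((i2 & m2)).$((i3 & m3)).$((i4 & m4))"
# prints: network: 192.168.10.0
```

Every host whose address masks down to the same network address sits on the same subnetwork and can talk without going through the router.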

Monday, July 6, 2015

Setting Up a CoreOS Cluster with Fleet and Vagrant

Prerequisites: You'll want to be familiar with setting up VirtualBox with Vagrant and have Homebrew installed. This guide was written for Mac OS X users.

Installing Dependencies

  • VirtualBox
  • fleetctl
  • Vagrant


On our Mac OS X host machine, we need to make sure we have the above dependencies installed. Since we're already familiar with VirtualBox after going through the prerequisites, we install fleetctl like so:

# install fleetctl using brew
brew update
brew install fleetctl

# or from the release archive (adjust the URL for your platform/version)
wget https://github.com/coreos/fleet/releases/download/v0.10.2/fleet-v0.10.2-darwin-amd64.zip
unzip fleet-v0.10.2-darwin-amd64.zip
sudo cp fleet-v0.10.2-darwin-amd64/fleetctl /usr/local/bin/

Introduction to CoreOS, Fleet, Etcd, and Flannel


Usually, when you get a Linux distro, it comes with the standard ISO, a package manager (yum, yast, apt) to distribute all of the individual pieces of compiled software, and other programs like netcat, httpd, etc. that just end up being extra overhead. CoreOS differs in that it distributes an entire disk image that's essentially a stripped-down Linux kernel with a few PaaS-like features. So now, rather than shipping with all those programs right out of the gate when you may never utilize them, we only pull in the ones we need and run them in an isolated, lightweight slice in the form of a Linux container (e.g. LXC, Docker). We adhere to the one process, one container principle and run these containers side by side over any number of CoreOS machines.

So within this system, we trust Docker to:

  • Be responsible for the containerization of all of your services / applications
  • Provide a mechanism for searching and transporting containerized services
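The one process, one container principle can be sketched with a couple of hedged docker commands (the image names are common examples, not specific to this guide):

```
# each service gets its own container: one process per container
docker run -d --name redis redis
docker run -d --name web -p 80:80 nginx

# list the running containers, one service in each
docker ps
```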

Before we dive into etcd, let's take a bird's eye view of the system as a whole.

"CoreOS Architecture Diagram" by User:Dsimic - A high-level illustration of the CoreOS cluster architecture. Licensed under Public Domain via Wikimedia Commons -

Intro to Systemd and Unit Files

Systemd is an init system that manages how you start, stop, and restart services, as well as how services come up when you boot a machine. Most Linux distros have been using SysVinit (and Ubuntu, until recently, Upstart), but Systemd seems to be what all of the distros are standardizing on. Systemd starts and supervises daemons, which it refers to as units. A unit is any service or resource that the system manages or controls. These may go by other names depending on the OS: Linux calls them daemons, Windows calls them processes, and so on. All units have their own start scripts.
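As a sketch, a minimal unit file for a hypothetical service might look like this (the service name and command are assumptions):

```
# /etc/systemd/system/myapp.service (hypothetical)
[Unit]
Description=My example service
After=network.target

[Service]
ExecStart=/usr/local/bin/myapp
Restart=always

[Install]
WantedBy=multi-user.target
```

You'd then enable it to come up at boot with systemctl enable myapp and start it immediately with systemctl start myapp.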

Provisioning Buildbot Master-Slave Containers with Docker

Now that we're more familiar with setting up Buildbot and the master.cfg, we can move forward with this alternative, container-based approach using Docker. I strongly urge you to go through the prerequisites in this article because you'll need to carry over your master and slave configuration. Let's start there.

Saturday, June 27, 2015

Provisioning Buildbot with Vagrant and Ansible

Because the existing Buildbot documentation already explains the manual installation process well enough, I'll just refer you to that if it's something you'd like to learn. What I provide here is an automated alternative that quickly spins up a Buildbot master and slave through Vagrant and Ansible, so we can focus more on the configuration file and less on installation. If we do anything to compromise our configuration or environment, we can quickly reprovision.

Also, take note of the master directory. It contains a master.cfg file you can modify so that you're not always starting with a configuration from scratch. Just keep in mind that the Ansible task simply copies this file over. It doesn't create a shared volume. So when you make changes to your master.cfg file in the VM and want to keep your changes, make sure to copy them to this file before you reprovision the machine.
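In practice, preserving your changes before a reprovision might look something like this (the VM name and paths are assumptions based on a typical Vagrant layout):

```
# from the host: copy the live config out of the VM into the master directory
vagrant ssh buildbot-master -c "cp ~/master/master.cfg /vagrant/master/master.cfg"

# now it's safe to reprovision; Ansible will copy the saved file back in
vagrant provision buildbot-master
```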

Sunday, June 14, 2015

MySQL Multi-Master Two-Way Replication

You should already be familiar with load balancing traffic across two servers at this point. This article is going to introduce you to the same basic concept using two MySQL servers. Since MySQL 5.1.18, we've had the ability to implement a MySQL cluster with multi-master replication, and that's what we'll be doing here.

Here's what the basic architecture looks like:
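The configuration behind that two-way replication can be sketched roughly as follows (the server IDs and offsets are illustrative; each server is configured as a slave of the other):

```
# server 1 (hypothetical /etc/mysql/my.cnf fragment)
[mysqld]
server-id                = 1
log_bin                  = mysql-bin
auto_increment_increment = 2
auto_increment_offset    = 1

# server 2
[mysqld]
server-id                = 2
log_bin                  = mysql-bin
auto_increment_increment = 2
auto_increment_offset    = 2
```

The auto_increment settings stagger generated primary keys (odds on one server, evens on the other) so concurrent inserts on the two masters never collide.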

MySQL Load Balancing with HAProxy

Today, we're going to be using HAProxy to queue and throttle connections towards a set of MySQL servers, alleviating the load. It's the same basic principle as when we used HAProxy to load balance requests to our web servers. This time, however, we're using a different connection scheme. Whereas before we used the roundrobin[1] scheme to select servers in turns, we're using leastconn to select the server with the least number of connections. Let's continue.
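A hedged sketch of the relevant haproxy.cfg section (the backend addresses are placeholders):

```
# haproxy.cfg fragment (hypothetical addresses)
listen mysql-cluster
    bind 0.0.0.0:3306
    mode tcp
    balance leastconn
    server mysql1 192.168.50.11:3306 check
    server mysql2 192.168.50.12:3306 check
```

Note mode tcp: MySQL traffic isn't HTTP, so HAProxy proxies it at the connection level, and leastconn sends each new connection to whichever server currently has the fewest.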

Dockerizing StatsD, InfluxDB, and Grafana for a ReactJS Chat App

Today, we have a straightforward guide to using Docker Compose, Docker Machine, and Gulp to provision a development environment and analytics for a ReactJS chat application. By the end of this, you'll see just how easy it is to spin up dependencies, and hopefully be inspired to make your own compose files.

A MongoDB Cluster Sharding Tutorial

This is going to be a straightforward guide to setting up a MongoDB sharded cluster[1] on your local machine.

A MongoDB Cluster with Replica Sets Tutorial

Because I've already covered the basic idea behind replica sets in the past, this is going to be a straightforward guide for implementing a replica sets into an existing MongoDB cluster. Let's get right to it.

Introduction to MongoDB Sharding and Replica Sets

The following diagram breaks down the individual components of a MongoDB sharded system.

We have the config servers, which are responsible for maintaining the metadata for your cluster. We have the MongoS query routers, which use that metadata to route queries to the appropriate shards and return the results to the client application.

Think of MongoS as the load balancer, the shards as the data store, and the config servers as the metadata store. There are always exactly three config servers with the same metadata, and they use a two-phase commit protocol to keep the metadata in sync with each other. As long as one config server is up, the cluster is alive, but you lose the ability to do splits and migrates if one or more go down. You'll learn more about these components as you continue on.
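As a rough sketch, bringing those components up locally might look like this (the ports and data paths are illustrative, and a production setup runs three config servers rather than one):

```
# a config server (metadata store)
mongod --configsvr --dbpath /data/configdb --port 27019

# a shard (data store)
mongod --shardsvr --dbpath /data/shard0 --port 27018

# the mongos query router, pointed at the config server
mongos --configdb localhost:27019 --port 27017
```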

Caching MySQL with Memcache

Since we've already gone over the idea behind Memcache in a past article, this is going to be focused more on Memcache implementation. Let's begin.

Intro to Memcache

Memcached is an in-memory, high-performance, distributed object caching system that also performs as a key-value data store. The main advantage of sitting in memory is speed over data stores that sit on disk. Data access is as simple as putting and getting values by key, and anything that can be serialized can be put into Memcache.
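For instance, assuming a memcached instance listening on the default port 11211, the put/get cycle over the raw text protocol looks like this:

```
# store the value "bar" under the key "foo" (0 flags, no expiry, 3 bytes);
# the server replies STORED
printf 'set foo 0 0 3\r\nbar\r\n' | nc localhost 11211

# read it back; the server replies with the value followed by END
printf 'get foo\r\n' | nc localhost 11211
```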

Memcache is commonly used for storing data returned from:

  • Data store query results
  • API calls
  • User authentication token and session data
  • Other computation results

Why Memcache?

So why Memcache? It's easy enough to use the data store to share data across instances, or even to cache API calls with it. Simply put, Memcache makes your app more performant at a much lower cost. Any time you hit the data store, you're paying the cost of executing the query and the CPU tied to it. Queries are also computationally expensive.

Sunday, May 31, 2015

A Development Environment Using Docker Machine, Docker Compose and Gulp

Today, we're going to set up a development environment for a ReactJS application using Docker containers. Even if you decide this isn't the best setup for your team, I strongly urge you to have at least 3 different environments for your software development efforts. It's normally some variation of local development, staging, and production. Development is for all of your experimental / bleeding edge tech. Staging must be a mirror of Production, serving as your UAT[1] platform. Given that setup, if it works in staging, it'll work in production. Before we continue, here's a diagram of the workflow we're implementing:

Thursday, May 28, 2015

Analyzing Your Applications with StatsD, InfluxDB, and Grafana

StatsD was originally developed by Flickr before being adopted by Etsy. It's a Node.js-powered network daemon that listens for statistics like counters and timers over a UDP or TCP connection, acting as a local aggregator. StatsD can then send application metrics to a backend for time-series data storage and visualization. Some of the most popular tools used in conjunction with StatsD are Whisper + Graphite and InfluxDB + Grafana.
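Since the wire format is plain text, you can sketch a metric by hand. Assuming a StatsD daemon on the default UDP port 8125 (the metric names here are made up):

```
# increment a counter named myapp.logins by 1
echo "myapp.logins:1|c" | nc -u -w0 localhost 8125

# record a timer of 320ms for myapp.render
echo "myapp.render:320|ms" | nc -u -w0 localhost 8125
```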

Setting up an ELK Stack with Kibana4, Beaver, and Docker-Compose

At this point, you should be familiar with setting up ELK both manually and through Dockerfiles. Well, here we go again; this time, with Docker Compose. Most of what we cover here relates more to workflow than to the writing of actual Docker Compose files. You'll have the files in my repository to reference if you're curious about the makeup of these.

Docker Machine and the Docker Registry

This article is going to be focused on moving away from Boot2Docker and using Docker Machine (still in beta) in its place. After going through the prerequisites, you should be familiar with some basic commands and a workflow that accommodates individual Dockerfiles and leverages Boot2Docker to provision a VM pre-baked with Docker on your local machine.

Sunday, April 26, 2015

A Distributed Task Queue Using Celery, RabbitMQ, and Redis

Celery is a distributed system for processing messages in a task queue. It focuses on real-time processing, supports task scheduling, and allows out-of-band processing that can be started periodically or triggered by web requests. Coupled with a message broker, it's reasonably good at distributed computing and high availability.

With the right components in place, you have what is essentially a buffer when your systems fail and the ability to conduct operations on that buffer through the employment of multiple, concurrent workers. Today, we're going to be demonstrating some of these capabilities by making requests and seeing how tasks get queued and executed in real-time.
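To make that concrete, a hedged sketch of bringing the pieces up (the tasks module name is hypothetical; it would hold your Celery app and task definitions):

```
# start the RabbitMQ message broker in the background
sudo rabbitmq-server -detached

# start a Celery worker pool against a hypothetical tasks.py module,
# with four concurrent worker processes consuming from the queue
celery -A tasks worker --loglevel=info --concurrency=4
```

With Redis configured as the result backend, results of queued tasks can then be fetched after the workers process them.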

Creating a REST API with Scala and Spray

In this second article in our Scala series, I'll be introducing you to the Spray framework, which will allow you to build a REST API. If you aren't already familiar with Spray, it's essentially an HTTP stack for Akka and Scala applications. It's embeddable, rather than a standalone server, and fairly easy to tie in. Spray is focused on HTTP integration layers rather than being a full-fledged web framework (see Play), allowing you to create bridges between your service and other servers, clients, and third-party APIs like Twitter through HTTP. Just define your REST API and you're good to go.