Sunday, March 1, 2015

An Introduction to Docker Containers

History and LXC

You may have heard about Docker when it began surfacing in 2013. What exactly is Docker? Simply put, it's a lightweight, operating-system-level virtualization technology. To understand the problems Docker is trying to solve, it helps to know a bit about Linux Containers (LXC). LXC partitions processes from one another at the Linux kernel level through namespaces and control groups (cgroups). Docker builds on top of this and adds tooling that simplifies building, shipping, and running distributed applications. To visualize this, think of each process running in its own bubble on the host. Each bubble can have its own environment, doesn't carry the overhead of an entire guest operating system, is isolated from the other bubbles, and is reachable through a simple port mapping interface.

Prerequisites: Make sure you have a good understanding of virtual machines and VirtualBox.

Docker consists of two major components: the Docker Engine, which runs on the host and drives the packaging functionality, and the cloud-based Docker Hub, where the community can push and distribute images.


"Docker-linux-interfaces" by User:Maklaan - Based on a Docker blog post. Licensed under Public Domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Docker-linux-interfaces.svg#mediaviewer/File:Docker-linux-interfaces.svg

As you can see in the above diagram, Docker uses several different interfaces to leverage the Linux kernel's virtualization features. Let's look in more detail at what Docker offers on top of that.

Performance

IBM recently released a research paper titled "An Updated Performance Comparison of Virtual Machines and Linux Containers" by Felter et al. that compares Docker containers, KVM, and bare metal. Overall, the authors found that Docker matches or outperforms KVM in almost every case tested and is generally only slightly slower than native (bare metal). Disk I/O is one example the paper covers.

Addresses Conflicting Runtimes

You've probably already experienced what it's like to manage multiple programs running on the same machine. Even with virtual environments to contain each program's dependencies, it can get unwieldy. Docker isolates at the process level, cleanly encapsulating each program's runtime. Access to each program goes through explicit port mapping, and there are clear conventions for what belongs inside the container and what should be abstracted away. Long story short, you can run one container with Java 7 and another container with Java 8 right next to each other on the same machine and they won't conflict.

More About Port Mapping

With Docker, each container is assigned its own IP address, which it can use to communicate with other containers on the same host. And because each container has its own network scope, multiple containers can run side by side with processes listening on the same port, and Docker maps each one to a different port at the host level. Traffic from outside the host, however, relies on this port mapping, because external clients only see the IP address of the host machine.
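As a rough illustration of the mapping described above, here is a minimal Python sketch that builds `docker run` commands for two containers whose processes both listen on port 8080, each published on a different host port. The image names, container names, and port numbers are hypothetical placeholders chosen for this example; only the `-d`, `--name`, and `-p HOST:CONTAINER` flags are standard `docker run` options.

```python
def docker_run_args(name, image, host_port, container_port):
    """Build the argument list for a detached `docker run` with one published port."""
    return [
        "docker", "run",
        "-d",                                   # run the container in the background
        "--name", name,                         # container name (hypothetical)
        "-p", f"{host_port}:{container_port}",  # publish HOST_PORT:CONTAINER_PORT
        image,
    ]


# Hypothetical services: both listen on 8080 inside their own container,
# but each one is reachable on a different port of the host.
services = [
    ("app-java7", "example/app-java7", 8081),
    ("app-java8", "example/app-java8", 8082),
]

commands = [
    docker_run_args(name, image, host_port, 8080)
    for name, image, host_port in services
]

for cmd in commands:
    print(" ".join(cmd))
```

On a machine with Docker installed and real image names substituted, each list could be passed to `subprocess.run(cmd, check=True)`. The sketch only prints the commands so that it runs anywhere.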

The Docker Community

Not only does the Docker Hub Registry provide the means to share and distribute images, it also curates these resources and gives the community an interface for searching the entire repository. With this at your disposal, it's easier to locate base images that are better suited to your service. So rather than using the ubuntu base image and running a command that installs Java 8 on top of it, you can start from a Java 8 base image and significantly cut down the provisioning time.

To show how quickly Docker has grown, we can compare it against some alternatives and look at the growth rate of contributors over time. We'll also look at the spread of commits across the top contributors so you can get a good idea of how involved the community really is.

Repo      Total Contributors   Total Days
Vagrant   506                  1883
LXC       145                  2145
Docker    802                  791

Given the data, Docker clearly has the highest rate: roughly 1.01 new contributors per day, compared with about 0.27 for Vagrant and 0.07 for LXC. And these numbers cover only the development of Docker itself, not the community that has grown around writing Dockerfiles.
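The rate comparison can be reproduced with a few lines of Python. This is a minimal sketch that simply divides total contributors by total days using the figures from the table above; treating that ratio as the "rate" is an assumption about how the comparison was made.

```python
# (total contributors, total days) copied from the table above
repos = {
    "Vagrant": (506, 1883),
    "LXC": (145, 2145),
    "Docker": (802, 791),
}

# Average number of contributors gained per day of the project's life
rates = {
    name: contributors / days
    for name, (contributors, days) in repos.items()
}

# Print the repos from the highest rate to the lowest
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<8} {rate:.2f} contributors per day")

fastest = max(rates, key=rates.get)
```

Run as written, this ranks Docker first at about 1.01 contributors per day, followed by Vagrant at about 0.27 and LXC at about 0.07.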

The commit data also shows that commits are more evenly distributed across Docker's contributors, while Vagrant has the single most active top contributor.

Security

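Additional security comes from the container's inability to access anything that hasn't been exposed to it, so there's less risk of accidentally granting over-broad privileges when managing Unix permissions. In addition, root inside a container runs with a reduced set of kernel capabilities by default. Keep in mind, though, that unless user namespaces are configured, root in the container is still the same user ID as root on the host, so containers should not be treated as a complete security boundary.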

Documented Build Steps

With build steps listed out in easy-to-ship Dockerfiles, there are fewer scenarios where a server configuration works for one party and not for another. You instruct someone to pull the Dockerfile, or even the built image itself, from the registry and they're good to go. Again, it's isolated, meaning you run into fewer conflicts with the environment that's running the service. You spend less time going through logs and server configurations trying to identify what's missing or different, and more time actually deploying.
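To give a sense of what such documented build steps look like, here is a minimal Python sketch that renders the text of a small Dockerfile. The base image tag, package name, `/app` path, and start command are assumptions chosen for illustration, not taken from a real project; `FROM`, `RUN`, `COPY`, `WORKDIR`, and `CMD` are standard Dockerfile instructions.

```python
import json


def render_dockerfile(base_image, packages, command):
    """Return the text of a minimal Dockerfile as a string."""
    lines = [f"FROM {base_image}"]  # base image to build on
    if packages:
        # Install dependencies in a single layer
        lines.append(
            "RUN apt-get update && apt-get install -y " + " ".join(packages)
        )
    lines.append("COPY . /app")    # copy the build context into the image
    lines.append("WORKDIR /app")   # run later instructions from /app
    # Exec-form CMD is written as a JSON array of strings
    lines.append("CMD " + json.dumps(command))
    return "\n".join(lines) + "\n"


# Hypothetical service: a Java application packaged as app.jar
dockerfile = render_dockerfile(
    base_image="ubuntu:14.04",
    packages=["openjdk-7-jre-headless"],
    command=["java", "-jar", "app.jar"],
)
print(dockerfile)
```

Every step that would otherwise live in someone's head or in a wiki page is written down in one short file, which is what makes the build repeatable for the next person.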

Offloads Ops Work

Containers also allow you to offload some of the ops work to the developer. If a developer's application requires some bizarre set of dependencies and modifications to the environment to run properly, at least it's contained. By implementing a workflow where the developer is in charge of the Dockerfile, the DevOps engineer can focus on other, more important matters. Keep in mind that some containers still depend on shared services, so you still have to be smart about what you allow. This becomes apparent as soon as you deal with persistent data. You're not going to run a MongoDB process inside each individual container. Instead, you run a separate MongoDB container that they can all link to.

Now that you have a better idea of Docker's strengths, if you'd like to learn more, check out this next article, which provides a hands-on approach to learning Docker's basic features while also setting up an ELK logging system. This is a much better alternative to the manual process of setting all this up.
