
Aggregating, Managing and Centralizing Docker Container Logs With Graylog

Docker containers are an amazing invention that has simplified the lives of many IT departments. Container images are lightweight, easy to standardize, and well isolated from one another. Docker is the technology of choice when you need to run several different applications, possibly requiring different (and newer) dependency versions, on the same servers.

However, nothing comes without a price. While Docker containers provide a fantastic way to package and ship programs quickly, they also introduce several new technical challenges. In particular, when it comes to collecting logs, containers do not leave a reliable trail of historical information because they are short-lived: once a container is destroyed, its logs go with it. The volume of log data also grows with the number of containers, making logs harder and more cumbersome to manage and analyze. And we haven't even mentioned the added complexity of Docker Swarm and Kubernetes yet!

If you work at enterprise scale, logs are more than just useful – when something goes wrong, they're essential to providing you with those much-needed troubleshooting answers. Once an organization runs a sufficiently large number of containers, a centralized logging mechanism becomes critical. Tools like Graylog are required for log management, aggregation, analysis, and monitoring in environments that make intensive use of Docker containers and orchestration platforms.

WHAT IS A DOCKER CONTAINER?

Linux containers are standard units of software that bundle everything an application needs to run on any system: code, runtime, system tools, system libraries, and settings. Just like a physical shipping container, a software container holds all the parts needed for an application to work while remaining isolated from the host system itself. Their purpose is to simplify the life of system administrators whenever they need to move code, by separating application dependencies from infrastructure. Most importantly, a container allows software to behave uniformly despite infrastructure differences (such as between staging and development). Docker is an open-source tool designed to simplify the container creation and deployment process.
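To make this concrete, here is a minimal sketch of that workflow using the Docker CLI; the image names and tags are just illustrative placeholders:

    # Pull a small public image and run it as an isolated, throwaway container
    docker pull alpine:3.19
    docker run --rm alpine:3.19 echo "hello from a container"

    # Package your own application the same way (assuming a Dockerfile in the
    # current directory), then run it anywhere Docker is installed
    docker build -t myapp:1.0 .
    docker run --rm myapp:1.0

The same image runs identically on a developer laptop, a staging server, or a production host, which is exactly the uniformity described above.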

Docker containers are similar to virtual machines in that both isolate resources and infrastructure. Containers, however, virtualize the operating system instead of the whole hardware stack. They ship apps on the same Linux kernel as the host they run on and are therefore much lighter and more efficient, since they require fewer resources to work. When the number of applications grows large, containerization must be organized at the cluster level: instead of manually spinning up a machine for each application, the lifecycle of all these containers is managed through orchestration systems such as Docker Swarm, Azure Container Service, and Kubernetes. Orchestration systems automate many tasks, such as container deployment, self-healing, and much more.
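As a rough illustration of what an orchestrator automates, here is a minimal Docker Swarm sketch; the service name, image, and port are placeholders:

    # Turn the current host into a single-node swarm
    docker swarm init

    # Deploy a service and ask the orchestrator to keep three replicas alive
    docker service create --name web --replicas 3 -p 8080:80 nginx:alpine

    # If a replica dies, Swarm reschedules it automatically (self-healing)
    docker service ls

Kubernetes expresses the same idea declaratively, with a Deployment and a desired replica count.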

LOG MANAGEMENT SOLUTIONS AND DOCKER CONTAINERS

Today, many enterprises have replaced their monolithic architectures with a much more agile microservice approach. Parallel services are often hosted on different nodes, and without an efficient log centralization tool, aggregating and analyzing their logs becomes complex: to trace the root cause of a problem, developers have to dig through countless nodes for even the most basic debugging operation. No matter what, logging is a pivotal part of running Docker if you need to ascertain how stable your containers are. Fortunately, there are many different approaches to tackling the changes in logging that containerization introduces.

First things first: containers are transient entities that are continuously created, destroyed, and rebuilt. A container's logs are lost once it is removed, so it is recommended to export them to a place where they can be stored, such as a directory on a local disk. Second, since a containerized infrastructure usually has several levels of logging, it is advisable to define the events you want to track up front and then correlate them later in Graylog.
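A minimal sketch of the first point, assuming a hypothetical my-container container and myapp:1.0 image:

    # Snapshot a container's stdout/stderr before the container disappears
    docker logs my-container > /var/log/myapp/my-container.log 2>&1

    # Better: write application logs to a host directory that outlives the container
    docker run -d --name my-container -v /var/log/myapp:/app/logs myapp:1.0

The bind mount keeps log files on the host even after the container is destroyed, so they can still be collected and shipped to Graylog.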

To speed things up, you can use the elegant GELF logging driver to pump logs from your Docker containers directly into Graylog. Graylog Sidecar, which ships with Graylog, makes it easy to manage a whole fleet of log collectors so all events can be tagged and forwarded to Graylog for later analysis.
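For example, here is how a single container can be attached to Graylog via GELF, assuming a GELF UDP input is already listening at graylog.example.com:12201 (the hostname, tag, and image are placeholders):

    # Send this container's stdout/stderr straight to Graylog over GELF/UDP
    docker run -d \
      --log-driver gelf \
      --log-opt gelf-address=udp://graylog.example.com:12201 \
      --log-opt tag="my-service" \
      myapp:1.0

    # To make GELF the default for every container on the host, set it in
    # /etc/docker/daemon.json and restart the Docker daemon:
    # {
    #   "log-driver": "gelf",
    #   "log-opts": { "gelf-address": "udp://graylog.example.com:12201" }
    # }

The tag option pays off later, because it lets you filter and correlate events per service inside Graylog.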

CONCLUSION

Docker containers have made developers' lives much easier, and thankfully, we have Graylog to keep log management from becoming a problem. If you still have difficulties setting up Graylog through Docker, you can find all the necessary instructions on this page. Don't forget you can always check our community as well!
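For reference, here is a minimal, non-production sketch of a Graylog stack started with plain docker run commands; the image tags, container names, and secrets are placeholders you should replace:

    # Shared network so the containers can reach each other by name
    docker network create graylog

    # MongoDB stores Graylog's configuration
    docker run -d --name mongo --network graylog mongo:6

    # OpenSearch stores the log messages (security disabled for this demo only)
    docker run -d --name opensearch --network graylog \
      -e discovery.type=single-node \
      -e DISABLE_SECURITY_PLUGIN=true \
      opensearchproject/opensearch:2

    # Graylog itself: web UI on 9000, GELF UDP input on 12201
    docker run -d --name graylog --network graylog \
      -p 9000:9000 -p 12201:12201/udp \
      -e GRAYLOG_MONGODB_URI='mongodb://mongo/graylog' \
      -e GRAYLOG_ELASTICSEARCH_HOSTS='http://opensearch:9200' \
      -e GRAYLOG_HTTP_EXTERNAL_URI='http://127.0.0.1:9000/' \
      -e GRAYLOG_PASSWORD_SECRET='replace-with-at-least-16-characters' \
      -e GRAYLOG_ROOT_PASSWORD_SHA2="$(printf '%s' 'admin' | sha256sum | cut -d' ' -f1)" \
      graylog/graylog:5.2

Once the containers are up, log in at http://localhost:9000 and create a GELF UDP input on port 12201 so the logging driver shown above has somewhere to send events.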
