The Graylog blog

Managing Centralized Data with Graylog

Central storage is vitally important in log management. Just as a sawmill cuts and processes timber into lumber in one place, a central repository makes it cheaper and more efficient to process event logs in one location. Moving between multiple locations to process logs can decrease performance. To continue the analogy, once boards are cut at a sawmill, a tool such as a wood jointer smooths out the rough edges of the boards and readies them for use in making beautiful things.

In this analogy, Graylog is both the sawmill and the jointer. It first stores all the logs in one location, then makes the log messages usable by normalizing the data into information that’s important to you. Having just a pile of logs only helps if you want firewood.

Let’s look at how Graylog serves as an essential tool to get real value from the raw material in your environment.

Gathering Your Data

Delivering log messages into Graylog is easy, and several inputs are available by default. The input system is pluggable, so you can extend it with your own input if none of the built-in ones fits and you did not find your desired transport method in the Graylog Marketplace.

Syslog is one of the most popular ways to transport log messages from the operating system, applications, and hardware. Graylog understands messages that follow either of the two leading RFCs, 3164 and 5424, including some variations made by vendors. It then creates the fields defined by the RFC, filled with the content submitted in the log message.
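To make the idea of "defined fields" concrete, here is a minimal Python sketch that splits an RFC 5424 message into its header fields. It is not Graylog's parser; the field names are chosen for illustration and may differ from the ones Graylog assigns, and the sample line follows the example in the RFC.

```python
import re

# Header layout from RFC 5424:
# <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA [MSG]
RFC5424 = re.compile(
    r"^<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app_name>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) "
    # Structured data is "-" or one or more [..] elements (kept as raw text here).
    r"(?P<structured_data>-|(?:\[(?:[^\]\\]|\\.)*\])+)"
    r"(?: (?P<message>.*))?$",
    re.DOTALL,
)


def parse_rfc5424(line):
    """Split one RFC 5424 syslog line into a dict of fields."""
    match = RFC5424.match(line)
    if match is None:
        raise ValueError("not an RFC 5424 message")
    fields = match.groupdict()
    # PRI encodes facility and severity as facility * 8 + severity.
    pri = int(fields.pop("pri"))
    fields["facility"] = pri // 8
    fields["severity"] = pri % 8
    return fields


sample = ("<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - "
          "'su root' failed for lonvick on /dev/pts/8")
parsed = parse_rfc5424(sample)
print(parsed["hostname"], parsed["app_name"], parsed["severity"])
```

Once the header is split like this, each part can be searched on its own instead of being buried in one long string.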

Graylog can also ingest CEF, which is used by many security appliances, and tries to split the messages into fields and values according to the standard. NetFlow data for analyzing network traffic can be sent directly into Graylog as well, and so can various AWS logs, including CloudTrail.
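As a rough illustration of what splitting a CEF message into fields and values involves, the following Python sketch separates the seven pipe-delimited header fields from the key=value extension. It is a simplified stand-in rather than Graylog's implementation: the header key names are made up for this example, and escaped characters inside extension values are not handled.

```python
import re

# Illustrative names for the seven CEF header fields (not Graylog's field names).
CEF_HEADER_KEYS = [
    "cef_version", "device_vendor", "device_product", "device_version",
    "event_class_id", "name", "severity",
]


def parse_cef(line):
    """Split a CEF line into header fields plus extension key/value pairs."""
    if not line.startswith("CEF:"):
        raise ValueError("not a CEF message")
    # Split on pipes that are not escaped with a backslash; the eighth part,
    # if present, is the extension.
    parts = re.split(r"(?<!\\)\|", line[4:], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    header_values = [p.replace("\\|", "|") for p in parts[:7]]
    fields = dict(zip(CEF_HEADER_KEYS, header_values))
    extension = parts[7] if len(parts) > 7 else ""
    # A value runs until the next " key=" or the end of the line,
    # so values may contain spaces.
    for key, value in re.findall(r"(\w+)=(.*?)(?=\s+\w+=|\s*$)", extension):
        fields[key] = value
    return fields


event = parse_cef(
    "CEF:0|Security|threatmanager|1.0|100|worm successfully stopped|10|"
    "src=10.0.0.1 dst=2.1.2.2 msg=worm stopped"
)
print(event["device_product"], event["src"], event["msg"])
```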

In the next release, Beats support will be extended beyond the current level. The various available Beats, including Packetbeat, Winlogbeat, and Filebeat, can send their messages to Graylog. Many other community-based Beats work too, but have not been fully tested.

Should none of the above scenarios work for your environment, Graylog can simply listen on a specific port and treat everything sent to that port as messages.
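From the sending side, this can be as small as writing a line of text to a TCP socket. The Python sketch below assumes a plain-text TCP input that treats every newline-terminated line as one message; the host name and port in the usage note are placeholders, not defaults.

```python
import socket


def send_raw_line(host, port, message, timeout=5.0):
    """Send one newline-terminated text line over TCP; return the bytes sent."""
    data = message.rstrip("\n").encode("utf-8") + b"\n"
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(data)
    return len(data)
```

A call such as send_raw_line("graylog.example.org", 5555, "disk usage at 91% on /var") would then deliver a single message, provided an input is listening on that (hypothetical) address.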

You can combine nearly all of the above with a message queue to simplify log transport in IT environments where multiple networks are connected over high-latency or unstable links.

Normalizing Your Data

After a message is ingested into Graylog and has received its first modifications based on the standard used to ingest it, message processing kicks in. Graylog enables you to define rules for what should be done with a message: extract and enrich the information it contains, and remove clutter. After that processing, the message is final and gets stored. All the fields can be searched or alerted on, or they can be used to visualize the log data in widgets on dashboards. This normalization translates hard-to-comprehend raw data into extracted information that lets more people understand and work with the log message content.
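The extract, enrich, and remove-clutter steps can be pictured as a chain of small rules applied to each message. The Python sketch below only illustrates that idea; it is not Graylog's rule syntax, and the field names, the lookup table, and the clutter fields are all invented for the example.

```python
import re

# Hypothetical lookup table used to enrich messages with a responsible team.
OWNER_BY_HOST = {"web-01": "web-team", "db-01": "data-team"}

LOGIN_PATTERN = re.compile(
    r"Failed password for (?P<user>\S+) from (?P<src_ip>\S+)"
)


def extract_login(msg):
    """Extract: pull user and source IP out of the free-text message."""
    match = LOGIN_PATTERN.search(msg.get("message", ""))
    if match:
        msg.update(match.groupdict())
        msg["event"] = "login_failed"
    return msg


def enrich_owner(msg):
    """Enrich: add external information based on the sending host."""
    owner = OWNER_BY_HOST.get(msg.get("source"))
    if owner:
        msg["owner"] = owner
    return msg


def drop_clutter(msg):
    """Remove clutter: drop fields nobody searches on."""
    for key in ("debug_id", "padding"):
        msg.pop(key, None)
    return msg


RULES = [extract_login, enrich_owner, drop_clutter]


def normalize(raw):
    """Apply every rule in order to a copy of the raw message."""
    msg = dict(raw)
    for rule in RULES:
        msg = rule(msg)
    return msg


raw = {
    "source": "web-01",
    "message": "Failed password for root from 203.0.113.7 port 22 ssh2",
    "debug_id": "x1",
}
normalized = normalize(raw)
print(normalized)
```

After normalization the message carries separate user, src_ip, event, and owner fields, which is what makes it searchable and usable on a dashboard.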

Not only does central log file management secure your environment and give IT operators a complete picture of their environment, it also lets people outside of the IT department work with data that is already present but not previously known to them.

Setting Up Graylog

To configure Graylog to collect logs, first create your inputs, based on your message source. After you get the first messages into Graylog, you can build your extraction and normalization pipeline. This step is where you will spend most of your time; it is the foundation of your data collection and management. Data extraction evolves and will need to be adjusted to meet your needs. For example, you might notice that it would be useful to have specific information available in a single field, or that adding external information to a message would let another department work with the messages too and generate more value for your organization.

Graylog can be integrated with other parts of your technology stack. Specific messages can be forwarded, and alerts can be sent to your monitoring and alerting systems. Even better, your monitoring can check whether a Graylog stream contains a specific message. That is a single check rather than one on each server in your fleet.
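A single stream-based check might look roughly like the Python sketch below. How the messages are fetched from Graylog is deliberately left out: fetch_messages is a hypothetical callable that returns recent messages from one stream as a list of dicts, and the exit codes follow the convention used by Nagios-compatible monitoring plugins (0 for OK, 2 for CRITICAL).

```python
OK, CRITICAL = 0, 2  # exit codes in the Nagios plugin convention


def check_stream(fetch_messages, needle):
    """Return (exit_code, summary) for one stream-wide check.

    fetch_messages is a hypothetical callable returning the stream's recent
    messages as dicts with "source" and "message" keys.
    """
    hits = [m for m in fetch_messages() if needle in m.get("message", "")]
    if hits:
        sources = sorted({m.get("source", "unknown") for m in hits})
        summary = (f"CRITICAL: {len(hits)} message(s) matching {needle!r} "
                   f"from {', '.join(sources)}")
        return CRITICAL, summary
    return OK, f"OK: no message matching {needle!r}"
```

Because the stream already collects messages from every server, this one check covers the whole fleet and still reports which sources were affected.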

If you need help with any of these steps, we offer training that teaches you all of the above and professional services that kick-start your environment.
