Introduction to Correlation Engine
Once you are collecting a large volume of logs, you need to reach the high-value alerts quickly to see what is happening in your environment. Graylog Enterprise’s Correlation Engine, together with event management, lets you isolate these high-value events and alerts and keep them for longer retention.
With Graylog, you no longer have to decide which logs to collect and then hope you never need data from the others. But while we make it possible to collect everything, some log messages are more critical than others. These are called events, and they usually make up a small percentage of your total log volume. Examples include “Failed Logon” or “Attack Detected”. Critical logs and message types should be part of any alerting scheme and should have extended retention for compliance, long-term auditing, and threat research.
Sometimes it takes several different events, in a particular order or timeframe, to indicate an issue that requires an analyst’s attention. Detecting such patterns is the job of the correlation engine. For example, 100 failed login events followed by a successful login from the same IP address could be a brute force attack that succeeded, even though no single one of those events would be worth alerting on by itself.
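To make the brute force example concrete, here is a minimal Python sketch of the correlation logic. It is not Graylog's implementation or API. The field names (`timestamp`, `source_ip`, `type`), the event type strings, and the 10-minute window are assumptions chosen for illustration.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def find_bruteforce_successes(events, threshold=100, window=timedelta(minutes=10)):
    """Return (ip, timestamp) pairs where a successful login follows at least
    `threshold` failed logins from the same IP within `window`.

    Each event is a dict with hypothetical keys: timestamp, source_ip, type.
    """
    failures = defaultdict(deque)  # source_ip -> timestamps of recent failures
    hits = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        ip, ts = event["source_ip"], event["timestamp"]
        recent = failures[ip]
        # Forget failures that fell out of the correlation window.
        while recent and ts - recent[0] > window:
            recent.popleft()
        if event["type"] == "login_failed":
            recent.append(ts)
        elif event["type"] == "login_success":
            if len(recent) >= threshold:
                hits.append((ip, ts))
            recent.clear()  # a success starts a fresh sequence for this IP
    return hits


# Usage: 100 failures one second apart, then a success from the same IP.
start = datetime(2024, 1, 1, 12, 0, 0)
sample = [
    {"timestamp": start + timedelta(seconds=i), "source_ip": "10.0.0.5", "type": "login_failed"}
    for i in range(100)
]
sample.append(
    {"timestamp": start + timedelta(seconds=100), "source_ip": "10.0.0.5", "type": "login_success"}
)
print(find_bruteforce_successes(sample))
```

No single event in the sample is suspicious on its own. Only the ordered combination within the window produces a hit.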
How It Works
Graylog monitors all logs as they enter the system and, based on the event and alert rules you define, takes the logs of interest and moves them from the noise into their own Elasticsearch index. With the high-value logs in their own index, you can run queries on those events to look for a pattern of activity, or a lack of activity.
Events and correlation events are stored in Elasticsearch, so they can be further filtered, aggregated, or used in compound correlation rules, letting you build very powerful alerts from your data.
Frequently Asked Questions
Can I monitor for a down service?
Yes, Graylog’s correlation engine allows you to look for the absence of logs. For example, if you see an application shut down but do not see the service start again within 3 minutes, you can create an alert.
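The absence check described above can be sketched in plain Python as follows. This only illustrates the logic and is not how Graylog evaluates rules. The keys `timestamp`, `service`, and `type`, and the event type strings `shutdown` and `start`, are hypothetical.

```python
from datetime import datetime, timedelta


def find_missing_restarts(events, now, timeout=timedelta(minutes=3)):
    """Return (service, shutdown_time) pairs for shutdowns that were not
    followed by a start of the same service within `timeout`.

    Each event is a dict with hypothetical keys: timestamp, service, type.
    `now` is the evaluation time, used for shutdowns that never saw a start.
    """
    pending = {}  # service -> time of the earliest unanswered shutdown
    alerts = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        service, ts = event["service"], event["timestamp"]
        if event["type"] == "shutdown":
            pending.setdefault(service, ts)
        elif event["type"] == "start":
            stopped_at = pending.pop(service, None)
            # The start arrived, but too late to satisfy the rule.
            if stopped_at is not None and ts - stopped_at > timeout:
                alerts.append((service, stopped_at))
    # Shutdowns that never saw a start and have exceeded the timeout.
    for service, stopped_at in pending.items():
        if now - stopped_at > timeout:
            alerts.append((service, stopped_at))
    return alerts
```

The key design point is that the alert is triggered by the passage of time rather than by an incoming message, which is why the sketch needs an explicit `now` argument.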
What kinds of use cases can be accomplished?
Some of the many ideas you can add to your alert/correlation rules:
- Brute Force Login Attack
- Critical Device not sending in logs
- Admin Login and creation of new user
- An IP address scanning your network, detected after it touches a high number of unique ports
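As an illustration of the last use case in the list above, the following Python sketch flags source addresses that touch many distinct ports. It is a simplified aggregation, not Graylog's rule syntax. The keys `source_ip` and `destination_port` and the threshold of 50 ports are assumptions, and a real rule would normally also restrict the count to a time window.

```python
from collections import defaultdict


def find_port_scanners(events, min_unique_ports=50):
    """Return the source IPs that contacted at least `min_unique_ports`
    distinct destination ports.

    Each event is a dict with hypothetical keys: source_ip, destination_port.
    """
    ports_by_ip = defaultdict(set)
    for event in events:
        ports_by_ip[event["source_ip"]].add(event["destination_port"])
    return sorted(
        ip for ip, ports in ports_by_ip.items() if len(ports) >= min_unique_ports
    )
```

Counting distinct ports with a set means repeated connections to the same port, such as normal web traffic to port 443, do not push an address over the threshold.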
Does correlation of events work across all my Graylog nodes?
Yes, the correlation of events works across all Graylog nodes, giving better visibility into all of your data.