The Graylog blog

Making data-driven decisions with log management software

Today, most enterprises rightfully think about their business strategies by leveraging available data. Data-driven decisions certainly are more solid and reliable than those based upon mere instinct, intuition or just plain mysticism. Logs, in particular, are a fantastic source of information from which a company can draw to fuel its business intelligence (BI) strategies.

However, there’s a big and sometimes unbridgeable gap between theory and practice. Despite all the big talk about data-driven insights, using the information available is something that very few enterprises do for real. A log management tool is a great instrument in the arsenal of any data analyst who wants to extract meaningful insights from raw big data. Let’s walk the walk then and jump straight across this bridge to start converting your company’s logs into an endless source of valuable business intelligence data.

WHAT IS DATA-DRIVEN DECISION MAKING?

Data-driven decision making (DDDM) covers all the processes through which strategic decisions are made by objectively analyzing actual data that has been previously collected and refined. This practice sits at the opposite end of the spectrum from making decisions based on tradition, feelings, or plain theory. Data may come from many sources, including logs, but what ultimately matters is that metrics are used to determine whether a planned strategy is bearing fruit or not. Goals can therefore be set as concrete, numerically defined milestones, and results and achievements measured in practice. To put it in W. Edwards Deming’s words, “Without data, you’re just another person with an opinion.”

Normally, every company sits in one of the five stages of data-driven culture evolution, although many shift from one to another over time (and hopefully end up at the last one).

Data Denial: The company plainly distrusts data, and thinks it’s not worth using.

Data Indifference: The organization is not particularly interested in setting up data collection and analysis processes.

Data Aware: The enterprise either collects data but doesn’t know how to use it, or hasn’t established a reliable and repeatable pipeline to collect it and fuel decisions.

Data Informed: Data is actively collected and used, but only to support or reinforce decisions and thought processes that are already in place.

Data Driven: Employees and managers base their decisions on data, which is always used as their starting point.

DIFFERENCES BETWEEN THEORY AND PRACTICE IN DDDM

All enterprise tools produce a broad range of different logs. Whether they are alert logs, event logs, troubleshooting, or system logs, they all contain precious information that can be parsed and digested. However, security data cannot be simply dumped into some magical recycling machine that processes it, adds some spicy machine learning, and then conjures some cybersecurity insights out of thin air. Several steps must be followed in a certain order before data can become the fuel that feeds the decision-making machine.

Once you’ve defined your objective (e.g., increase sales in a certain region), you must first establish a hypothesis that data can prove or disprove (e.g., more people from that region are using your tool). Then, you need to pinpoint the right source of data that can provide that info (e.g., logs that contain geolocation info). Next, you need to establish a robust process to extract that data on a recurring basis (e.g., each time a user uses your tool), refine it (e.g., remove all IPs that are not from that region), and enrich it (e.g., add info such as how many times the tool was used, the OS used, the version, etc.). The last step is to collect this data in a centralized repository so it can be accessed and analyzed. Now you finally have the actionable data you need to determine whether your initial assumptions were correct (e.g., your strategy was successful because you increased the number of users from that region).
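The refine, enrich, and analyze steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event fields (`user`, `country`, `os`, `version`), the target region, and the baseline figure are all hypothetical, and a real deployment would read parsed events from your log management tool rather than from a hard-coded list.

```python
from collections import Counter

# Hypothetical, already-parsed log events; field names are assumptions.
events = [
    {"user": "alice", "country": "DE", "os": "Linux", "version": "2.1"},
    {"user": "bob", "country": "US", "os": "macOS", "version": "2.0"},
    {"user": "alice", "country": "DE", "os": "Linux", "version": "2.1"},
    {"user": "carol", "country": "DE", "os": "Windows", "version": "2.1"},
]

TARGET_REGION = "DE"   # assumed region of interest
BASELINE_USERS = 1     # assumed user count before the strategy started


def refine(events, region):
    """Keep only events that originate from the target region."""
    return [e for e in events if e["country"] == region]


def enrich(events):
    """Attach a per-user usage count to every event."""
    usage = Counter(e["user"] for e in events)
    return [{**e, "usage_count": usage[e["user"]]} for e in events]


def unique_users(events):
    """Count distinct users in a list of events."""
    return len({e["user"] for e in events})


regional = enrich(refine(events, TARGET_REGION))
current_users = unique_users(regional)

# The hypothesis is supported if the region now has more users than before.
hypothesis_supported = current_users > BASELINE_USERS
print(current_users, hypothesis_supported)
```

Keeping each step as its own small function mirrors the order in the text, and makes it easy to swap the refinement rule or add more enrichment fields later.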

NARROWING DOWN DATA SOURCES WITH LOG CENTRALIZATION

Data gathered from different sources must be unified, too. Information extracted from a broad range of data pools such as tools, CRMs, firewalls, or internal system logs can hardly be read unless it’s aggregated and cross-referenced. More often than not, without a pre-digestion phase, all this disparate data may end up being redundant or even contradictory. Centralization is vital when you collect data from different sources, and you can use Graylog Sidecar to establish a robust network of remote log collectors. Pipelines can then be used to select only the logs you need, blacklist the stuff that you don’t, and enrich your logs so that they are refined enough to be used for BI purposes. In a nutshell, unification is needed if you want to uncover the truth that lies behind data and build a holistic rather than fragmented picture.
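To make the idea of unification concrete, here is a minimal Python sketch that maps records from two differently shaped sources onto one common schema and merges them into a single timeline. It does not use Graylog's own pipeline rules or any Graylog API; the record layouts, field names, and the `from_crm` and `from_firewall` helpers are hypothetical.

```python
from datetime import datetime, timezone


def from_crm(record):
    """Normalize a hypothetical CRM record (ISO 8601 timestamp string)."""
    return {
        "timestamp": datetime.fromisoformat(record["created_at"]),
        "source": "crm",
        "user": record["email"].lower(),
        "message": record["note"],
    }


def from_firewall(record):
    """Normalize a hypothetical firewall record (epoch seconds, UTC)."""
    return {
        "timestamp": datetime.fromtimestamp(record["ts"], tz=timezone.utc),
        "source": "firewall",
        "user": record["user"].lower(),
        "message": f'{record["action"]} {record["dst"]}',
    }


crm_records = [
    {"created_at": "2024-01-15T10:00:00+00:00",
     "email": "Alice@Example.com", "note": "requested a demo"},
]
firewall_records = [
    {"ts": 1705312740, "user": "alice@example.com",
     "action": "ALLOW", "dst": "10.0.0.5"},
]

# Merge both sources into one list and order it by time.
unified = sorted(
    [from_crm(r) for r in crm_records]
    + [from_firewall(r) for r in firewall_records],
    key=lambda e: e["timestamp"],
)

for event in unified:
    print(event["timestamp"].isoformat(), event["source"], event["message"])
```

Once every record shares the same keys and a timezone-aware timestamp, events from different systems can be cross-referenced by user or ordered on one timeline.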

One of Graylog’s strongest points is the flexibility of its processing engine. You can easily parse logs from any data source, and enrich them with all the information you need. After you have drilled down to the very core of the relevant data you’re looking for, you can save your queries to ease the analysis process later on. A robust pipeline structure will feed all this valuable information into a centralized source where you can analyze it and make the right, data-driven decisions.

USING LOGS FOR DATA ANALYSIS – USE CASES

Logging tools can be used to improve your data-driven decision making in many practical ways. Here are a few use cases:

ENHANCING UX AND CUSTOMER EXPERIENCE

You want your tool to be as easy to use as possible. You can streamline the customer experience by monitoring how customers use the tool and its various components. How many times did they click on a certain button? How many clicks do they need to access a critical dashboard? How frequently do they use a feature? All this info can be used to enhance your tool’s usability in future versions.
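As a rough illustration of how the questions above could be answered from logs, the following Python sketch counts clicks per interface element and the number of clicks each session needed to reach a dashboard. The event structure (`session`, `element`) and the element names are assumptions made for this example, and the events are assumed to be in chronological order.

```python
from collections import Counter

# Hypothetical click events, in chronological order.
click_events = [
    {"session": "s1", "element": "search_button"},
    {"session": "s1", "element": "menu"},
    {"session": "s1", "element": "dashboard"},
    {"session": "s2", "element": "dashboard"},
    {"session": "s2", "element": "search_button"},
]

# How many times was each element clicked?
clicks_per_element = Counter(e["element"] for e in click_events)


def clicks_to_reach(events, target):
    """Per session, count clicks up to and including the first click on target.

    Sessions that never reach the target are left out of the result.
    """
    counts = {}
    reached = set()
    for e in events:
        session = e["session"]
        if session in reached:
            continue  # ignore clicks made after the target was reached
        counts[session] = counts.get(session, 0) + 1
        if e["element"] == target:
            reached.add(session)
    return {s: counts[s] for s in reached}


print(clicks_per_element.most_common())
print(clicks_to_reach(click_events, "dashboard"))
```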

IMPROVING MARKETING EFFORTS

You can hook up Graylog to a marketing platform such as HubSpot or Salesforce to stream its data and monitor trends. You can then determine which industries bring the most leads, how frequently a lead is turned into a paying customer, or in which countries you may want to invest.
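The kind of breakdown described here can be sketched in Python as follows. The lead records and their fields (`industry`, `country`, `converted`) are invented for illustration and do not reflect the actual export format of HubSpot, Salesforce, or Graylog.

```python
from collections import defaultdict

# Hypothetical lead records streamed from a marketing platform.
leads = [
    {"industry": "finance", "country": "DE", "converted": True},
    {"industry": "finance", "country": "US", "converted": False},
    {"industry": "retail", "country": "US", "converted": True},
    {"industry": "finance", "country": "DE", "converted": True},
]


def conversion_by(leads, key):
    """Group leads by the given field and compute lead count and conversion rate."""
    totals = defaultdict(lambda: [0, 0])  # [number of leads, number converted]
    for lead in leads:
        bucket = totals[lead[key]]
        bucket[0] += 1
        bucket[1] += int(lead["converted"])
    return {
        group: {"leads": n, "rate": converted / n}
        for group, (n, converted) in totals.items()
    }


by_industry = conversion_by(leads, "industry")
by_country = conversion_by(leads, "country")
print(by_industry)
print(by_country)
```

Grouping by an arbitrary field keeps the same function usable for industries, countries, or any other attribute attached to a lead.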

TROUBLESHOOTING PERFORMANCE ISSUES

You may have had a hunch that a certain app is taxing your system’s resources. However, proving that your intuition is correct can be hard, and measuring the exact extent of the problem can be a chore. Log analysis can be used to gather information about processes that are consuming too many resources, pinpoint waste, troubleshoot issues, and improve the overall performance of your systems, software, and networks.
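A small Python sketch shows how such a hunch could be turned into a measurement. It assumes resource samples have already been parsed out of the logs; the `process` and `cpu_pct` fields, the process names, and the 50% threshold are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical CPU samples parsed from system logs.
samples = [
    {"process": "report-gen", "cpu_pct": 85.0},
    {"process": "report-gen", "cpu_pct": 95.0},
    {"process": "web", "cpu_pct": 12.0},
    {"process": "web", "cpu_pct": 18.0},
]


def average_cpu(samples):
    """Return the mean CPU percentage per process."""
    by_process = defaultdict(list)
    for s in samples:
        by_process[s["process"]].append(s["cpu_pct"])
    return {process: mean(values) for process, values in by_process.items()}


def heavy_processes(samples, threshold_pct):
    """List processes whose mean CPU usage exceeds the threshold, heaviest first."""
    averages = average_cpu(samples)
    heavy = [p for p, avg in averages.items() if avg > threshold_pct]
    return sorted(heavy, key=lambda p: averages[p], reverse=True)


print(average_cpu(samples))
print(heavy_processes(samples, threshold_pct=50.0))
```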

CONCLUSION

The most successful enterprises rely on data-driven decision making to establish robust strategies, reach new markets, and increase their chances of seizing business opportunities. Log data is a solid foundation upon which you can build your company and improve your agility. You just need the right log management tools to make the most of it.
