Ochsner Health System has been providing high-quality clinical and hospital patient care to Louisiana residents since 1942. Ochsner Health System is southeast Louisiana’s largest non-profit, academic, and multi-specialty healthcare delivery system. Ochsner owns, manages or is affiliated with 25 hospitals and more than 50 health centers across the region, all connected electronically to provide convenience and the best possible care for their patients. Ochsner is also a national leader in medical research, conducting more than 750 clinical research studies every year and producing 200 annual publications in medical literature.
Before Graylog, Ochsner Health System’s log messages were decentralized, and many of them were generated by their 17,000 workstation PCs. The information was difficult to analyze in any meaningful way, due primarily to three factors: 1) the message volume was too high, 2) log messages were spread across multiple files and servers, and 3) log files did not retain enough history. Additionally, flat text logs offered no visual representation of message frequency over time, which made it hard to tell whether a spike in log messages marked a meaningful event.
Prior to Graylog, Ochsner tried using a multi-tabbed tail tool. It displayed log output only in real time and only while it was running, which made managing and analyzing log messages impractical. Ochsner also tried writing a rudimentary event log parser to collect event logs from servers, send them to a database, and trigger email alerts on error event IDs. This approach was only a partial solution and still did not provide comprehensive search or visualization of log data.
With Graylog, Ochsner is able to continuously collect key application and operating system event log messages, which allows them to determine baseline counts for specific log messages. This is important because if logs are only collected when there is an issue, valuable data is lost and issues possibly unknown to the user remain undetected. Not knowing what a baseline for specific log messages looks like makes it difficult to know when message counts over a period of time look abnormal.
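To make the idea of a baseline concrete, here is a minimal sketch of how per-interval message counts could be compared against their own history. It is not Graylog's method or Ochsner's configuration; the function names, the three-standard-deviation threshold, and the sample counts are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def baseline(history):
    """Return (mean, sample standard deviation) of historical counts."""
    return mean(history), stdev(history)


def is_abnormal(count, history, threshold=3.0):
    """Flag a count that sits more than `threshold` standard deviations
    away from the historical mean. The threshold is an assumed default."""
    avg, sd = baseline(history)
    if sd == 0:
        # Perfectly flat history: any deviation at all is unusual.
        return count != avg
    return abs(count - avg) / sd > threshold


# Made-up hourly counts of one specific error message.
hourly_errors = [12, 15, 11, 14, 13, 12, 16, 14]

print(is_abnormal(13, hourly_errors))  # a typical hour
print(is_abnormal(60, hourly_errors))  # a spike well outside the baseline
```

The sketch only works when the history was collected continuously, which is the point made above: without counts from normal periods there is nothing to compare a spike against.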
Graylog’s visualization tools enable Ochsner’s Desktop Engineering Team to interpret the data, analyze trends, and obtain actionable information. “I cannot emphasize enough how powerful and effective the graphs in Graylog are,” said Drew Miranda, Ochsner’s Systems Engineer. “Even a well-written text summary is still not as accessible and cannot convey as much meaning as a visual representation of what log messages look like over time and what percentage of each type of log is contained in that timeframe. I was recently able to use Graylog’s visualization tools to expand on technical notes by another team member in a way that helped everyone on the email chain better understand the situation.”
Graylog’s data parsing tools enable Ochsner’s Desktop Engineering Team to ingest log data in any format. “We definitely take full advantage of this,” said Drew. “We effortlessly intake messages from GELF over TCP and UDP, syslog, and raw JSON sent via cURL using scripts.”
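As a rough illustration of what sending GELF over UDP and TCP can look like from a script, the sketch below builds a GELF 1.1 message and encodes it for each transport using only the Python standard library. It is not Ochsner's tooling. The helper names are hypothetical, port 12201 is assumed because it is the port conventionally used for GELF inputs, and the server address would depend on the deployment.

```python
import json
import socket
import time


def gelf_message(host, short_message, level=6, **extra):
    """Build a GELF 1.1 payload. Extra fields get the '_' prefix that
    GELF requires for additional fields."""
    msg = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity; 6 is informational
    }
    for key, value in extra.items():
        if key == "id":
            # '_id' is reserved by the GELF specification.
            raise ValueError("'id' is not allowed as an additional field")
        msg["_" + key] = value
    return msg


def encode_udp(msg):
    """Uncompressed JSON, one message per datagram."""
    return json.dumps(msg).encode("utf-8")


def encode_tcp(msg):
    """GELF over TCP delimits messages with a trailing null byte."""
    return json.dumps(msg).encode("utf-8") + b"\x00"


def send_udp(msg, server, port=12201):
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_udp(msg), (server, port))


def send_tcp(msg, server, port=12201):
    with socket.create_connection((server, port), timeout=5) as sock:
        sock.sendall(encode_tcp(msg))
```

A call such as `send_udp(gelf_message("ws-001", "disk check failed", level=3, app="inventory"), "graylog.example.org")` would send one message, where both the workstation name and the server name are placeholders. Large messages sent over UDP would need GELF chunking or compression, which this sketch leaves out.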
Ultimately, Ochsner Health System chose Graylog because of the product and service. “Graylog is very much a company that understands the importance of log management and they ‘get it’ when it comes to doing log management correctly. The product is well thought out and well executed. The company also provides amazing service and the community is very friendly and helpful. Software updates are released on a regular basis and bug fixes are addressed in a manner that is very transparent (via GitHub).”
Drew Miranda, Systems Engineer
Ochsner’s technology departments are highly customer- and service-focused. Providing great customer service is easy in an ideal world with unlimited resources, but it is considerably harder in the real world, where resources are limited. In this real-world context, Graylog has enabled Ochsner’s IT groups to be innovative in increasing efficiency, so that their work consumes less time and, as a byproduct, costs less money. According to Drew, “Graylog has been a big win by allowing us to not only quickly find the cause of issues, but to know about the issues sooner. We have been able to identify and remediate issues that affect many of our systems and servers (systems such as Device/Inventory/Configuration Management, Power Management, File Encryption, Virtual Desktops, and Authentication Federation). For example, a particular system we have will sometimes have issues generating reporting data. Rather than wait for reports from users, we receive an alert when the system has not generated data in a specified timeframe. I can go about my other tasks and not worry about this issue until I see an alert. Then I can open a case with the specific third party vendor (service provider) and get back to my other tasks.”
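The "no data in a specified timeframe" alert described in the quote comes down to a staleness check. The sketch below shows that logic in isolation. It is an assumption-laden illustration, not Graylog's alerting implementation; the function names, source names, timestamps, and the two-hour window are all hypothetical.

```python
from datetime import datetime, timedelta


def is_silent(last_seen, now, max_gap):
    """True when the newest message is older than the allowed gap."""
    return now - last_seen > max_gap


def silent_sources(last_seen_by_source, now, max_gap):
    """Return the sorted names of sources that have gone quiet."""
    return sorted(
        source
        for source, last_seen in last_seen_by_source.items()
        if is_silent(last_seen, now, max_gap)
    )


# Hypothetical systems and the time each last produced a log message.
now = datetime(2024, 1, 1, 12, 0)
last_seen = {
    "reporting-system": datetime(2024, 1, 1, 8, 30),
    "inventory-system": datetime(2024, 1, 1, 11, 55),
}

print(silent_sources(last_seen, now, timedelta(hours=2)))
```

With this sample data only the reporting system exceeds the two-hour window, so it is the one that would trigger an alert.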
Ochsner recently had a service interruption that, without Graylog, would have been difficult to diagnose effectively. Before Graylog, the IT team had no log intelligence that clearly showed when issues occurred, how many users were impacted, and how long an interruption lasted. They could not see log messages in real time, and they did not know what their baselines were or what normal log behavior should look like. With Graylog, Drew’s team was able both to backload existing log messages and to set up real-time log collection. This gave them a never-before-seen view of their log messages and made effective analysis of those messages possible. This visibility also placed Ochsner in a better position to negotiate with their other third-party vendors and to be confident in demanding better service.
“One of the things that really surprised me was how often issues were happening and how often errors were occurring that we never had any visibility into. I’ve spoken to a lot of people in the IT field, both internally and externally, and I found that a lot of IT does not actively collect or analyze log messages. Generally, logs are only looked into after an issue has occurred, which may be too late. Some people may view Graylog as an alerting and root cause analysis engine, but it is much more than that. It provides real-time data and leading indicators to address issues before there is any noticeable impact to users. This is not even something theoretical. We are doing this today.”
Drew Miranda, Systems Engineer