Larger environments especially benefit from Graylog’s fault-tolerant architecture. With Graylog Enterprise, you can expand the infrastructure to consume high log volumes and keep the system up and available for collection and analysis.
Using a load balancer, you can have multiple Graylog servers ingesting logs and providing additional interfaces for the analysts. You can configure the MongoDB and Elasticsearch databases to be redundant to ensure no data loss. Additionally, all processing pipelines use a message journal that quickly writes collected messages to disk, so no messages are lost if power fails.
Fault tolerance is also built into many of the agents we support via Sidecar. This approach allows hosts to spool their logs locally during a network outage and send them once the network connection is re-established.
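The spool-and-forward behavior described above can be sketched in a few lines. This is a minimal illustration only, not how any particular agent is implemented: the class name `SpoolingForwarder` and the `send` callback are hypothetical, and the spool is held in memory here, whereas real agents typically spool to disk.

```python
from collections import deque
from typing import Callable, Deque


class SpoolingForwarder:
    """Hypothetical sketch: keep log lines locally until they can be sent."""

    def __init__(self, send: Callable[[str], None], max_spooled: int = 10000) -> None:
        # `send` is an assumed callback that raises OSError when the network is down.
        self._send = send
        # A bounded deque silently drops the oldest line once it is full.
        self._spool: Deque[str] = deque(maxlen=max_spooled)

    def submit(self, line: str) -> None:
        """Queue a line, then try to deliver everything in order."""
        self._spool.append(line)
        self.flush()

    def flush(self) -> None:
        """Send spooled lines oldest first; stop at the first failure."""
        while self._spool:
            try:
                self._send(self._spool[0])
            except OSError:
                return  # still offline, keep the remaining lines spooled
            self._spool.popleft()

    @property
    def pending(self) -> int:
        """Number of lines waiting to be delivered."""
        return len(self._spool)
```

A line is removed from the spool only after `send` returns without error, so delivery order is preserved across an outage.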
Yes. Using our REST API, you can query the status of each host, which responds with ALIVE or DEAD. You can also change the state manually for zero-downtime upgrades.
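As a rough sketch of how a health check might consume that ALIVE or DEAD response, the snippet below polls a single node. It assumes the status is served as plain text at `/system/lbstatus` under the API base URL; verify the path against the API browser for your Graylog version. The function names and the example host are placeholders.

```python
import urllib.error
import urllib.request


def interpret_lb_status(http_status: int, body: str) -> bool:
    """Return True only when the node answers 200 with the text ALIVE."""
    return http_status == 200 and body.strip().upper() == "ALIVE"


def fetch_lb_status(base_url: str, timeout: float = 3.0) -> bool:
    """Ask one node for its load balancer status.

    base_url is a placeholder such as "http://graylog-1.example.org:9000/api".
    The "/system/lbstatus" path is an assumption; confirm it for your version.
    """
    url = base_url.rstrip("/") + "/system/lbstatus"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return interpret_lb_status(resp.status, resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # A non-2xx reply still carries a body, for example DEAD.
        return interpret_lb_status(err.code, err.read().decode("utf-8", errors="replace"))
    except (urllib.error.URLError, OSError):
        # A node that cannot be reached is treated as not alive.
        return False
```

A load balancer or monitoring job could call `fetch_lb_status` for each node and route traffic only to those that return `True`.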
You can run more than one web interface for redundancy and front access to them with a load balancer or proxy. This architecture also allows for centralized SSL/TLS termination and certificate management.
You can adjust your journal size as desired, keeping disk requirements in mind. A larger journal accommodates longer network outages or upgrade windows without losing any messages.
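A back-of-the-envelope calculation can help pick a journal size for a given outage window. The sketch below is an estimate under stated assumptions: the function name, the sample traffic figures, and the 20% headroom are illustrative, and on-disk size per message may differ from the raw message size. The result is the kind of value you would compare against the `message_journal_max_size` setting in the server configuration.

```python
import math


def journal_bytes_needed(
    msgs_per_sec: float,
    avg_msg_bytes: float,
    outage_seconds: float,
    headroom: float = 1.2,
) -> int:
    """Estimate the journal capacity needed to buffer messages during an outage.

    headroom is an assumed safety factor (1.2 means 20% extra).
    """
    if min(msgs_per_sec, avg_msg_bytes, outage_seconds) < 0 or headroom < 1.0:
        raise ValueError("rates and durations must be non-negative, headroom >= 1.0")
    return math.ceil(msgs_per_sec * avg_msg_bytes * outage_seconds * headroom)


def to_gib(num_bytes: int) -> float:
    """Convert bytes to gibibytes for easier comparison with disk sizes."""
    return num_bytes / (1024 ** 3)
```

For example, 5,000 messages per second at an average of 500 bytes each over a one-hour outage comes to about 9 GB before headroom, so the journal disk should be sized with that in mind.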
No. All configurations are replicated, including SAML/SSO role mappings.
No. In fact, when properly configured, it will both increase your ingest performance and decrease the time to return the results of a search.
Yes. There are multiple ways to provide multi-site replication using Graylog.
Yes. Graylog supports multiple notification mechanisms. In version 3.0, an alert can trigger an action that invokes a script to act outside of Graylog.
Due to Graylog’s open nature and support of other open technologies, you can leverage message queuing technologies to increase your control over the flow of data. You can provide an easier method for developers to deliver logs while increasing fault tolerance.