Fast and highly available storage is expensive. Setting long retention times in Graylog or any other log management system can quickly become a serious cost problem.
The good news is that for the majority of use cases, you only need to perform instant log searches over a relatively short period of time. Many of our users want to keep 365 days of log data but usually only search over the last 30 days.
With the new Archiving functionality in Graylog Enterprise, you can now store everything older than 30 days on slow storage and only re-import it into Graylog when you need it, for example when investigating a certain event from the past.
Graylog has a configuration that tells it how long to keep log data. The standard
behaviour is to just delete data that contains log messages older than the configured
retention period. The archiving functionality configures Graylog to automatically
write all messages of an index to flat files on disk before deleting the index.
You can then move the archive files wherever you want using any tool you feel
comfortable with. The archives are simple plain files that you can handle like
any other text files.
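As a rough illustration of moving archive files to cheaper storage, here is a minimal Python sketch that moves every file older than a given age from one directory to another. The function name `move_old_files` and both directories are hypothetical placeholders. Nothing here is part of Graylog itself, and the sketch makes no assumption about how Graylog names its archive files.

```python
import shutil
import time
from pathlib import Path
from typing import List


def move_old_files(source_dir: Path, target_dir: Path, min_age_seconds: float) -> List[Path]:
    """Move regular files last modified more than min_age_seconds ago.

    source_dir and target_dir are placeholders for wherever your archive
    files are written and wherever your slow storage is mounted.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    cutoff = time.time() - min_age_seconds
    moved = []
    for path in sorted(source_dir.iterdir()):
        # Skip subdirectories and anything modified after the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            destination = target_dir / path.name
            shutil.move(str(path), str(destination))
            moved.append(destination)
    return moved
```

You could run something like `move_old_files(Path("/path/to/archives"), Path("/path/to/slow-storage"), 24 * 3600)` from a scheduled job. Both paths are made up for this example.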
If you need to take another look at archived data, you can temporarily re-import any archive for analysis in Graylog using the web interface. After you finish, you can once again delete the imported archive data.
Because the archive files are simple plaintext files, you can store them wherever you want. Put them on tape, burn them to a DVD, move them to cheap storage, or upload them to a cloud service. You’ll be able to temporarily re-import the data back into Graylog when needed.
The archive files are GZIP compressed by default, but you can apply any other compression that you want.
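If you prefer a different compression format for long-term storage, one possible approach is to recompress the files after they are written. The sketch below streams a GZIP file into an XZ file using only the Python standard library. The function name `recompress_gzip_to_xz` is hypothetical and XZ is just one example format. Bear in mind that Graylog writes GZIP by default, so whether a recompressed file can be re-imported without converting it back first is something you would need to check.

```python
import gzip
import lzma
import shutil
from pathlib import Path


def recompress_gzip_to_xz(gz_path: Path) -> Path:
    """Stream a .gz file into a .xz file next to it and return the new path.

    The original .gz file is left untouched so you can verify the result
    before deleting anything.
    """
    xz_path = gz_path.with_suffix(".xz")
    with gzip.open(gz_path, "rb") as source, lzma.open(xz_path, "wb") as target:
        # copyfileobj works in chunks, so large files never sit fully in memory.
        shutil.copyfileobj(source, target)
    return xz_path
```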
You can apply any encryption or signature mechanism that your operating system offers to the archive files. Just run it automatically over the files when they are written.
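To give an idea of what a signature step could look like, here is a minimal sketch that computes and verifies an HMAC-SHA256 tag for a file. It is a stand-in built on the Python standard library, not an operating system tool and not something Graylog provides. The function names `sign_file` and `verify_file` are hypothetical, and key handling is assumed to be solved elsewhere.

```python
import hashlib
import hmac
from pathlib import Path


def sign_file(path: Path, key: bytes) -> str:
    """Return the hex HMAC-SHA256 tag of the file contents."""
    digest = hmac.new(key, digestmod=hashlib.sha256)
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large archive files are not loaded at once.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: Path, key: bytes, expected_hex: str) -> bool:
    """Check a previously stored tag using a constant-time comparison."""
    return hmac.compare_digest(sign_file(path, key), expected_hex)
```

Storing the tag next to each archive file lets you confirm that the file was not modified before you re-import it.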
The archiving process itself is not very IO hungry, and we do not expect any serious sizing challenges with it. We use a special method to get all messages and apply no sorting, scoring or other expensive algorithms.
However, be aware that importing large archives back into Graylog can stress your storage cluster. To circumvent this problem, you can import archives into a second, dedicated Graylog cluster, which requires no special configuration or tricks.