How to use a JSON Extractor

In this Graylog tech series video we’re going to learn how to extract valuable data from JSON responses using JSON extractors. We will briefly show you how to set them up, configure them correctly, and parse the results.

USING JSON EXTRACTORS TO SWITCH TO STRUCTURED LOGGING

JSON logs are full of interesting data, and because the format is laid out as named keys and values, that data becomes easy to access once it is extracted. You can pull out useful fields such as destination addresses, sources, and response bytes. In a nutshell, a JSON extractor is a great tool for parsing logs because it turns unstructured data into structured data.
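To make the idea of "structured data" concrete, here is a minimal Python sketch of what happens conceptually when a JSON log line is parsed into named fields. The sample line and its field names (`remote_addr`, `status`, `body_bytes_sent`) are assumptions made up for illustration; your own Nginx log format may use different keys.

```python
import json

# Hypothetical JSON access-log line; the field names are illustrative only.
raw_line = (
    '{"remote_addr": "203.0.113.7", "request": "GET /index.html HTTP/1.1", '
    '"status": 200, "body_bytes_sent": 612}'
)


def parse_json_log(line: str) -> dict:
    """Turn one JSON log line into a dict of named fields."""
    return json.loads(line)


fields = parse_json_log(raw_line)

# Each value is now addressable by name instead of by position in a string.
print(fields["remote_addr"], fields["status"], fields["body_bytes_sent"])
```

Once the values have names, searching for something like "all requests with status 500" becomes a field lookup rather than a text search.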

THE FIRST STEPS

First, we’re going to find a JSON knowledge pack in the marketplace; in this case, we’re going to use one for Nginx. Scroll through the readme until you reach what you’re looking for at the very bottom: the configuration. Copy these lines into your Nginx configuration, then restart Nginx.

CREATING AN INPUT

Once you have modified the configuration, it’s time to create a new input, a Syslog UDP input in this case. Just click on “Launch a new input” to create it. We’ll make it a global input titled “Nginx error_log” since we created an access_log input already. We’re also going to change the port to a different one than the access_log input uses.

Now we need to add a couple of static fields to the input as tags. One marks the messages as coming from Nginx, and the other marks them as error_log or access_log respectively. This way we can route them into different message streams when we need to parse them later on. Note also that the two inputs use different port numbers, based on the configuration file we set up before.
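If you want to check that the new input receives data before real traffic arrives, you can send it a hand-made message. The following is a rough sketch, not part of the video: it builds a BSD-style syslog line carrying a JSON payload and sends it over UDP. The host `127.0.0.1`, the port `12301`, the hostname `web01`, and the payload fields are all placeholders you would replace with your own values.

```python
import json
import socket
from datetime import datetime
from typing import Optional

# Placeholder address of the Syslog UDP input; use your own host and port.
GRAYLOG_HOST = "127.0.0.1"
GRAYLOG_PORT = 12301


def build_syslog_message(payload: dict, tag: str = "nginx",
                         hostname: str = "web01",
                         now: Optional[datetime] = None) -> bytes:
    """Build a BSD-style syslog line whose message body is JSON."""
    now = now or datetime.now()
    # PRI = facility * 8 + severity; local7 (23) and info (6) give 190.
    pri = 23 * 8 + 6
    # The day of month is space-padded in this timestamp style.
    timestamp = f"{now:%b} {now.day:2d} {now:%H:%M:%S}"
    line = f"<{pri}>{timestamp} {hostname} {tag}: {json.dumps(payload)}"
    return line.encode("utf-8")


def send_test_message(payload: dict) -> None:
    """Send one test message to the input over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_syslog_message(payload), (GRAYLOG_HOST, GRAYLOG_PORT))


if __name__ == "__main__":
    send_test_message({"status": 200, "request": "GET / HTTP/1.1"})
```

After running it, the message should show up under “Show Received Messages” for that input, assuming the port matches.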

STRUCTURING DATA WITH THE JSON EXTRACTOR

If you click on the “Show Received Messages” button, you will see that, even though the data is all here, the messages you received are completely unformatted, so the data is all mashed together. It’s time to go back and add a couple of extractors to solve this issue. Click on “Manage extractors” and then on the “Get started” button once the new “Add extractor” window opens up.

Click on “Load Message”, then on the “message” field choose your extractor from the “Select extractor type” menu. We will start with a “Regular expression” extractor: a regex that looks for the Nginx tag and pulls out everything it finds after it. Let’s store the result in a JSON field and give the extractor a title that explains what it does.
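The exact regex used in the video is not shown here, so the following Python sketch only illustrates the idea: match the `nginx:` tag and capture the JSON object that follows it. Both the pattern and the sample message are assumptions; test your own pattern against a loaded message before saving the extractor.

```python
import re
from typing import Optional

# Assumed pattern: find the "nginx:" tag, then capture the JSON object after it.
NGINX_JSON = re.compile(r"nginx:\s+(\{.*\})")


def extract_json_part(message: str) -> Optional[str]:
    """Return the captured JSON text, or None if the message does not match."""
    match = NGINX_JSON.search(message)
    return match.group(1) if match else None


# Hypothetical raw message as it might appear in the "message" field.
sample = 'web01 nginx: {"status": 200, "request": "GET / HTTP/1.1"}'
print(extract_json_part(sample))
```

The single capture group is the part that ends up in the new field; everything outside the parentheses is only used to locate it.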

Now we’re going to create a second extractor to take the JSON that we just extracted out of the log and parse all those fields into a readable format. Repeat the same operation and you will see that the newly loaded message now carries the JSON field. Go ahead and flatten all the structures. You can leave the default values as they are, since they work for most JSON, unless you want to customize them. Give the extractor a title, and you’re done: all the fields will be pulled out automatically.
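To picture what flattening does, here is a small Python sketch. It is not Graylog’s implementation, just an illustration of the concept: nested objects are collapsed into single-level keys and lists are joined into one string. The separators (`_` between keys, `, ` between list items) are assumptions chosen for the example; check the extractor form for the values your installation actually uses.

```python
import json


def flatten(obj: dict, parent_key: str = "", key_sep: str = "_",
            list_sep: str = ", ") -> dict:
    """Collapse nested JSON objects into a single level of key/value pairs."""
    flat = {}
    for key, value in obj.items():
        full_key = f"{parent_key}{key_sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the key prefix along.
            flat.update(flatten(value, full_key, key_sep, list_sep))
        elif isinstance(value, list):
            # Join list items into one string value.
            flat[full_key] = list_sep.join(str(item) for item in value)
        else:
            flat[full_key] = value
    return flat


# Hypothetical nested log entry.
nested = json.loads(
    '{"request": {"method": "GET", "uri": "/"}, "status": 200, "upstreams": ["a", "b"]}'
)
print(flatten(nested))
```

Each resulting key, such as `request_method`, corresponds to one searchable field on the message.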

CHECKING YOUR WORK

Let’s go back to the “Inputs” section once again to see if our changes took effect (they apply immediately). Open up one of your messages and you will see the new access logs coming in. All the fields are now parsed out nicely in a clear and understandable format!

That’s all you need to know for JSON extractors. Happy logging!