The Graylog Blog

Tapping Wires for Lean Security Monitoring: DNS Request Analysis with Open Source Software

As we continue our discussion on security monitoring, we find there are multiple ways to defend against attackers at the network perimeter and to detect intruders that have landed inside your network. The combined force of virus scanners, firewalls, intrusion detection systems (IDS), and a log management system is a great way to protect your network.

We would like to introduce an additional method of security monitoring: capturing all DNS requests that are made within your network. This post is the first of a two-part series that covers the collection and analysis of DNS requests. Our second post will focus on automatically flagging bad actors by integrating threat intelligence databases with Graylog.

Let’s dive in!

Why monitor DNS requests?

If a computer is infected, in most cases, the malware or attacker will do one of three things: try to load more code, open a backchannel to exfiltrate data, or wait for further instructions. Typically, the attacker will be using a domain name instead of a hard-coded IP because it offers more flexibility. To our advantage, this will result in a DNS lookup that can be easily spotted.

Another benefit of using this technique is that we can spot attacks instantaneously. For example, a user that clicks on a phishing link will be caught the moment the browser performs the DNS lookup to open the “fake” website. The same happens when infected machines try to call their command and control hosts.

How to capture the data?

The classic approach to collecting all DNS requests is to have the DNS servers write every request they receive to a log file and then to transfer those logs into Graylog. However, there are at least four issues with this approach:

  1. DNS server logs are notoriously hard to parse, which leads to an extremely frustrating process.
  2. If you do not have your own DNS servers, but rely on DNS servers on the internet, you cannot log requests on the server side. In addition, some small branch office DNS appliances do not allow you to log the requests they receive.
  3. If an attacker queries a DNS server on the internet directly, such as a Google DNS or OpenDNS server, the request bypasses your own DNS servers and with them any detection or blocking you have in place there.
  4. Fast-flux botnets often run their own short-lived DNS servers.

A new approach is to simply listen for all DNS requests that go through your cables. By listening to the wire data that goes to your DNS servers and that leaves your network for the internet, you will be able to spot every DNS request. A SPAN port or network tap will allow you to detect all DNS requests, regardless of their final destination or log format.

An architecture using this approach is demonstrated below:


We’ll be using Packetbeat to listen to all wire traffic with destination_port:53 and the default DNS packet decoder to parse important information out of the raw traffic.
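To make the decoding step more concrete, here is a minimal Python sketch of the kind of information a DNS decoder pulls out of a raw query payload. It is not Packetbeat's implementation; the function names `parse_dns_question` and `build_query` are hypothetical, and name compression is deliberately not handled.

```python
import struct


def parse_dns_question(payload: bytes):
    """Return (name, qtype, qclass) for the first question in a DNS message."""
    # The fixed header is 12 bytes: id, flags, and four section counters.
    _id, _flags, qdcount, _an, _ns, _ar = struct.unpack("!6H", payload[:12])
    if qdcount < 1:
        raise ValueError("message has no question section")

    labels = []
    offset = 12
    while True:
        length = payload[offset]
        offset += 1
        if length == 0:  # a zero-length label terminates the name
            break
        if length & 0xC0:
            raise ValueError("compressed names are not handled in this sketch")
        labels.append(payload[offset:offset + length].decode("ascii"))
        offset += length

    qtype, qclass = struct.unpack("!2H", payload[offset:offset + 4])
    return ".".join(labels), qtype, qclass


def build_query(name: str, qtype: int = 1) -> bytes:
    """Build a minimal DNS query (hypothetical helper for trying the parser)."""
    header = struct.pack("!6H", 0x1234, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!2H", qtype, 1)
```

Calling `parse_dns_question(build_query("example.org"))` yields the queried name together with the numeric question type and class, which is the raw material for the fields used later in this post.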

Configuring Packetbeat

This tiny Packetbeat configuration file is all you need:

interfaces:
  device: any

protocols:
  dns:
    ports: [53]
    include_authorities: true
    include_additionals: true

output:
  logstash:
    hosts: [""]

It instructs Packetbeat to …

  • Listen on all network interfaces
  • Detect DNS requests on port 53
  • Forward all logs to a Graylog Beats input, listening on port 15999.

Lastly, we need to instruct Graylog to receive and parse the messages from Packetbeat.

Configuring Graylog

First, install the Graylog Beats input plugin and start a Beats Input on the port you configured in your Packetbeat configuration.

You should see messages coming in immediately after starting the input if Packetbeat is already configured and running.

You will notice that the messages have several standard fields, most of which we will not need.

Let’s build a Processing Pipeline rule to remove all the unneeded clutter and to format the messages more nicely:

rule "rewrite raw packetbeat DNS logs"
when
  $message.packetbeat_type == "dns"
then
  // Select interesting fields and rename their keys to something more useful.
  set_field("dns_question", $message.packetbeat_dns_question_name);
  set_field("src_addr", $message.packetbeat_client_ip);
  set_field("dst_addr", $message.packetbeat_ip);
  set_field("dns_flags_authoritative", to_bool($message.packetbeat_dns_flags_authoritative));
  set_field("dns_flags_recursion_allowed", to_bool($message.packetbeat_dns_flags_recursion_allowed));
  set_field("dns_flags_recursion_desired", to_bool($message.packetbeat_dns_flags_recursion_desired));
  set_field("dns_flags_truncated_response", to_bool($message.packetbeat_dns_flags_truncated_response));
  set_field("dns_op_code", $message.packetbeat_dns_op_code);
  set_field("dns_question_class", $message.packetbeat_dns_question_class);
  set_field("dns_question_type", $message.packetbeat_dns_question_type);
  set_field("dns_response_code", $message.packetbeat_dns_response_code);
  set_field("dst_port", to_long($message.packetbeat_port));
  set_field("src_port", to_long($message.packetbeat_client_port));

  // Remove fields we don't need or want.

  // Remove trailing . if there is one
  let fix = regex("(.+?)\\.?$", to_string($message.dns_question));
  set_field("dns_question", to_string(fix["0"]));

  set_field("message", concat("DNS Query: ", to_string($message.dns_question)));
end
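If you want to try the renaming and the trailing-dot regex outside of Graylog, the following Python sketch mirrors the core of the rule on a plain dictionary. It is an approximation for experimenting, not a replacement for the pipeline rule: the function name `rewrite_dns_message` is hypothetical, and only a subset of the fields is mapped.

```python
import re

# Raw Packetbeat field names and the shorter names the rule assigns to them.
RENAMES = {
    "packetbeat_dns_question_name": "dns_question",
    "packetbeat_client_ip": "src_addr",
    "packetbeat_ip": "dst_addr",
    "packetbeat_port": "dst_port",
    "packetbeat_client_port": "src_port",
}

# Same pattern as in the rule: capture everything except one optional trailing dot.
TRAILING_DOT = re.compile(r"(.+?)\.?$")


def rewrite_dns_message(raw: dict) -> dict:
    """Return a cleaned-up copy of a DNS message; pass other messages through."""
    if raw.get("packetbeat_type") != "dns":
        return raw

    out = {new: raw[old] for old, new in RENAMES.items() if old in raw}

    # Ports are converted to integers, like to_long() in the rule.
    for port in ("dst_port", "src_port"):
        if port in out:
            out[port] = int(out[port])

    match = TRAILING_DOT.match(str(out.get("dns_question", "")))
    if match:
        out["dns_question"] = match.group(1)

    out["message"] = "DNS Query: " + str(out.get("dns_question", ""))
    return out
```

A question name of `example.org.` comes out as `example.org`, while a name without a trailing dot is left unchanged.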

Now that you have every DNS request coming into Graylog, let’s look at real-world analysis use-cases.

Analyzing the captured data

Let’s assume you want to see which DNS servers are being used in your network. Most likely, you will want to make sure that only your internal servers are being used.

Start by searching for all DNS requests:

type:dns

Now run a quick values analysis on the dst_addr field to find all destination addresses of DNS requests (remember, we searched for type:dns).

As you can see, there have been 229 requests made to DNS servers not under our control. In this case, it is the public Google DNS servers, but we should still investigate and make sure that the machines that use those servers are re-configured.

To find out what machines were making those requests, we will need to run a new query:

type:dns AND (dst_addr: OR dst_addr:

Now run a quick values analysis on the src_addr field:

This is how fast you can get the result. The machine with the IP address is using Google DNS servers instead of your own DNS servers for some requests.
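The two quick values analyses above boil down to counting field values, first per destination and then per source for the suspicious destinations. Here is a small Python sketch of that logic on exported messages. The helper names `top_values` and `clients_using` are hypothetical, and the addresses in the sample are documentation placeholders, not the ones from the screenshots.

```python
from collections import Counter


def top_values(messages, field):
    """Count how often each value of `field` occurs, most frequent first."""
    return Counter(m[field] for m in messages if field in m).most_common()


def clients_using(messages, servers):
    """Count requests per source address that went to any of `servers`."""
    hits = [m for m in messages if m.get("dst_addr") in servers]
    return top_values(hits, "src_addr")


# Made-up sample: 192.0.2.53 plays the internal DNS server,
# 198.51.100.8 an outside one.
sample = [
    {"src_addr": "192.0.2.10", "dst_addr": "192.0.2.53"},
    {"src_addr": "192.0.2.11", "dst_addr": "192.0.2.53"},
    {"src_addr": "192.0.2.12", "dst_addr": "192.0.2.53"},
    {"src_addr": "192.0.2.11", "dst_addr": "198.51.100.8"},
    {"src_addr": "192.0.2.11", "dst_addr": "198.51.100.8"},
]

print(top_values(sample, "dst_addr"))
# [('192.0.2.53', 3), ('198.51.100.8', 2)]
print(clients_using(sample, {"198.51.100.8"}))
# [('192.0.2.11', 2)]
```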

Next steps could be:

  • Search for dst_addr: in all messages to analyze what else this machine is doing. This is very powerful if you are also collecting network connection (NetFlow) logs. Check whether more suspicious connections are being made.
  • Set up an automated alert for any DNS request that is made against outside DNS servers. You could use a CIDR match in a processing pipeline rule to identify destinations outside the networks that contain your trusted DNS servers. For simplicity’s sake, you could instead keep a list of your own DNS servers.
  • Send a nightly report (cronjob + REST API) of all DNS servers that have been used over the day.
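The check behind the alerting idea in the list above can be sketched in a few lines of Python using the standard `ipaddress` module. In Graylog itself this logic would live in a pipeline rule or alert condition; the network `192.0.2.0/28` and the function names below are assumptions chosen for illustration.

```python
import ipaddress

# Hypothetical: the networks that contain your own, trusted DNS servers.
TRUSTED_DNS_NETWORKS = [ipaddress.ip_network("192.0.2.0/28")]


def is_trusted_dns_server(dst_addr, networks=TRUSTED_DNS_NETWORKS):
    """Return True if the destination address lies inside a trusted network."""
    addr = ipaddress.ip_address(dst_addr)
    return any(addr in net for net in networks)


def external_dns_requests(messages, networks=TRUSTED_DNS_NETWORKS):
    """Keep only the DNS requests that went to servers outside our control."""
    return [m for m in messages if not is_trusted_dns_server(m["dst_addr"], networks)]
```

Matching on networks rather than on a fixed list of addresses means a newly added internal DNS server does not trigger alerts, as long as it sits in one of the trusted ranges.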

This is only one example. Another useful exercise is simply to scroll through the actual hostnames that were resolved via DNS to see if there is anything suspicious going on. See the last section of this post about automatic enrichment with threat intelligence data to automatically detect machines communicating with malware command and control systems.

Legal aspects

Note that you should first consult with HR or an attorney to make sure that you are not invading privacy by collecting DNS requests and that employment agreements allow this practice.

Next steps / Enriching the data

Given the sheer volume of DNS data, sifting through it manually quickly becomes difficult and time-consuming. To improve efficiency, you will want to implement an automatic enrichment that marks requested domains as indicators of compromise. Consider this: a Windows workstation sends a DNS request for a known malware command & control host. If that request were automatically marked with threat_indicated:true, you could quickly filter for it and be alerted to the incident.
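As a rough illustration of what such an enrichment does, the Python sketch below checks the requested name and each of its parent domains against a set of known bad domains and sets threat_indicated accordingly. The set `bad_domains` and the function names are hypothetical stand-ins; how the real threat intelligence lookup is wired into Graylog is not covered here.

```python
def parent_domains(name):
    """Return the name itself and every parent domain, most specific first."""
    labels = name.rstrip(".").lower().split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def enrich(message, bad_domains):
    """Return a copy of the message with a threat_indicated flag added."""
    candidates = parent_domains(str(message.get("dns_question", "")))
    enriched = dict(message)
    enriched["threat_indicated"] = any(c in bad_domains for c in candidates)
    return enriched
```

Checking parent domains as well means that a lookup for a subdomain of a listed domain is flagged too, which matters because malware often uses changing subdomains under one registered domain.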

So how do we automate this process?…

Our next blog post will look at integrating threat databases with Graylog to automatically flag DNS requests of domains that are known bad actors.

Subscribe to our newsletter or follow us on Twitter to be notified when it is released!
