DNS Zombies

By on 4 Apr 2016

Category: Tech matters


It seems that some things just never die, and this includes DNS queries.

In a five-month experiment, encompassing the detailed analysis of some 44 billion DNS queries, we find that one-quarter of these DNS queries are zombies – queries that have no current user awaiting the response, and instead are echoes of previous queries.

What is causing these zombies? Are we seeing deranged DNS resolvers that maniacally re-query the same questions and never accept the answer? Or is this something slightly more sinister and are we seeing evidence of widespread DNS stalking and shadowing? Let’s find out.

As part of our Internet measurement work, we use a technique of embedding a small set of ‘sentinel blots’ within an online ad. When the ad is delivered to a user’s browser, the script in the ad causes the user’s browser to fetch these blots. To measure IPv6, for example, one blot is accessible in both IPv4 and IPv6, while another blot is only accessible using IPv6. End systems that can fetch this second blot fall into the category of “IPv6-capable”, and we can extrapolate from the sample measurements to estimate the extent to which IPv6 is deployed across the entire Internet.

Of course, this simple description of the measurement system glosses over a number of subtle aspects of behaviour. It is a requirement of this measurement system that the blot is fetched from one of the measurement servers, and that means that we need to bypass the various DNS and web proxy caches that are widely used in the network. The approach we use is to generate unique DNS names for each instance of the blot, so that every DNS query is a new query and cannot be served from a cache. Equally, every web fetch of the blot is a new URL, so that a proxy cannot intercept the fetch.

When we generate these unique DNS names we include in the synthetic name a time component, which is an encoding of the time that the script in the ad was executed by the user. Each DNS name is used only once, and the script is intended to run to completion immediately. So in a reasonable world, the authoritative DNS server would see one, or perhaps two, queries for each unique DNS name, and the time of the DNS query should be within a few seconds of the time that is encoded in the DNS name. The Time To Live (TTL) of the DNS name is one second, and each unique name is never reused in any other ad.
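To make the naming scheme concrete, here is a minimal sketch of how such a one-off name could be built. The label layout (`u` for a random identifier, `s` for the creation time in seconds since 1 January 1970) loosely mirrors the names visible in the query log, but the function name, the simplified layout and the `example.net` zone are assumptions made for illustration, not the experiment's actual generator.

```python
import secrets
import time


def make_unique_name(now=None, zone="example.net"):
    """Build a one-off DNS name that carries its own creation time.

    Hypothetical helper: only the 'u' (unique id) and 's' (creation time)
    labels are modelled; the real names carry further labels.
    """
    created = int(time.time() if now is None else now)
    unique = secrets.token_hex(4)  # 8 random hex digits per ad impression
    return f"u{unique}.s{created}.{zone}"


name = make_unique_name(now=1448087430)
print(name)
```

Because the random label differs on every call, a caching resolver or proxy has never seen the name before and has to pass the query through to the authoritative server.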

The Internet can be truly prodigious! If you look hard enough and for long enough on the Internet you will probably find every form of pathological behaviour that could possibly exist! And with some 24 million unique DNS names being generated each and every day from these measurement experiments, then if there are some strange DNS behaviours out there, it’s likely that we can see them!

1450151673.887 15-Dec-2015 query: z.t1000.u953a6ea5.s1448087430.i5112.vxxxx.06ca0.z.dotnxdomain.net A
1450151673.887 15-Dec-2015 query: z.t1000.uc86fd1d9.s1447672979.i5112.vxxxx.3b460.z.dotnxdomain.net A
1450151673.887 15-Dec-2015 query: z.t1000.ub46e3821.s1447703026.i5112.vxxxx.0c914.z.dotnxdomain.net A
1450151674.013 15-Dec-2015 query: z.t1000.u953a6ea5.s1448087430.i5112.vxxxx.06ca0.z.dotnxdomain.net A
1450151674.015 15-Dec-2015 query: z.t1000.ub46e3821.s1447703026.i5112.vxxxx.0c914.z.dotnxdomain.net A
1450151674.017 15-Dec-2015 query: z.t1000.uc86fd1d9.s1447672979.i5112.vxxxx.3b460.z.dotnxdomain.net A
1450151674.753 15-Dec-2015 query: z.t1000.u953a6ea5.s1448087430.i5112.vxxxx.06ca0.z.dotnxdomain.net A
1450151674.755 15-Dec-2015 query: z.t1000.uc86fd1d9.s1447672979.i5112.vxxxx.3b460.z.dotnxdomain.net A
1450151674.756 15-Dec-2015 query: z.t1000.u953a6ea5.s1448087430.i5112.vxxxx.06ca0.z.dotnxdomain.net A
1450151674.757 15-Dec-2015 query: z.t1000.ub46e3821.s1447703026.i5112.vxxxx.0c914.z.dotnxdomain.net A

Figure 1 – DNS Query Log Extract

Figure 1 shows an extract from the query log of one of the authoritative DNS servers for this domain. In this log extract, the first column is the time of the query, encoded as the number of seconds since 1 January 1970 UTC, which corresponds in this case to the 15th December 2015, at 03:54:33/03:54:34 UTC. The number in the label starting with ‘s’ is the time the experiment was executed by the user. These times correspond to the time and dates listed in Figure 2.

2015-11-21 06:30:30
2015-11-16 11:22:59
2015-11-16 19:43:46
2015-11-21 06:30:30
2015-11-16 19:43:46
2015-11-16 11:22:59
2015-11-21 06:30:30
2015-11-16 11:22:59
2015-11-21 06:30:30
2015-11-16 19:43:46

Figure 2 – DNS Name Creation Times for Queries in Figure 1
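The creation times in Figure 2 can be reproduced by decoding the `s` label of each query name. The sketch below assumes the whitespace-separated log layout shown in Figure 1 (query time, date, the word `query:`, the name, the query type); the function names are illustrative only.

```python
import re
from datetime import datetime, timezone

# Matches the "s<epoch seconds>" label seen in the Figure 1 query names.
S_LABEL = re.compile(r"\.s(\d+)\.")


def parse_log_line(line):
    """Return (query_time, name, created_time) for one Figure 1 style line."""
    fields = line.split()
    query_time = float(fields[0])
    name = fields[3]
    match = S_LABEL.search(name)
    if match is None:
        raise ValueError(f"no creation-time label in {name!r}")
    return query_time, name, int(match.group(1))


def utc(epoch):
    """Format seconds since 1 January 1970 as a UTC date and time."""
    stamp = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return stamp.strftime("%Y-%m-%d %H:%M:%S")


line = ("1450151673.887 15-Dec-2015 query: "
        "z.t1000.u953a6ea5.s1448087430.i5112.vxxxx.06ca0.z.dotnxdomain.net A")
query_time, name, created = parse_log_line(line)
age_days = (query_time - created) / 86400
print(utc(query_time), "|", utc(created), "|", f"{age_days:.1f} days old")
```

For the first log line this gives a query time of 2015-12-15 03:54:33 UTC and a creation time of 2015-11-21 06:30:30 UTC, an age of a little under 24 days.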

What this shows is that each of these queries corresponds to an experiment that was delivered and executed between 20 and 30 days earlier! There is nothing in the measurement exercise itself that could lead to these “echo” queries. These queries are zombie queries: the query is living in some strange afterlife where the single trigger event that kicked it into life is long gone!

One or two of these zombie queries per day is one thing, but the numbers appear to be far higher than that. To try and understand this a little better we can look at the age of these zombie queries for one authoritative name server for a single day. Figure 3 shows the distribution of the age of queries for all DNS zombies for a single day on a single DNS server used in this experiment. In this case, 16%, or roughly one in six, of the 18.3 million DNS queries seen at this server were zombies.

Figure 3 – Zombie Age Distribution

The age distribution of these zombie queries can also be seen in a cumulative distribution plot (Figure 4). One-half of the zombie queries occur within the first 24 hours after the original query. There is an exponential decline in zombie counts for the first 30 days, then the zombies appear to be very persistent, and the decline over time is far slower for older zombies. The high count at the 60-day point appears to map to a local peak in original DNS queries that occurred some 60 days earlier.

Figure 4 – Zombie Age Cumulative Distribution

It appears that a number of DNS resolvers are performing some form of “just in case” pre-provisioning of the DNS name resolution, and not releasing a name from this cache for some days, even months. What if we broadened our search to look for all zombies over an extended period?

The following two figures show the distribution of the age of these DNS query zombies recorded over a 5½ month period from 1 October 2015 until mid-March 2016. Some 44 billion queries were seen over all the servers (44,733,946,408), of which some 11 billion (11,274,142,797 queries) were zombie queries where the name itself was more than 1 hour “old”. The age profile of this set of zombies is shown in Figure 5, and the cumulative distribution is shown in Figure 6.
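The one-hour rule can be written down directly. The sketch below classifies (query time, creation time) pairs and reports the zombie share; the first sample record is taken from Figure 1, the other two are hypothetical values added to show both outcomes, and the function names are assumptions.

```python
ZOMBIE_AGE_SECONDS = 3600  # names more than one hour old count as zombies


def is_zombie(query_time, created_time, threshold=ZOMBIE_AGE_SECONDS):
    """True when the queried name is older than the threshold."""
    return (query_time - created_time) > threshold


def zombie_share(records):
    """records: iterable of (query_time, created_time) pairs."""
    total = zombies = 0
    for query_time, created_time in records:
        total += 1
        zombies += is_zombie(query_time, created_time)
    return zombies / total if total else 0.0


records = [
    (1450151673.887, 1448087430),  # from Figure 1: about 24 days old
    (1450151673.887, 1450151672),  # hypothetical: about 2 seconds old
    (1450151673.887, 1450148000),  # hypothetical: just over an hour old
]
print(f"{zombie_share(records):.0%} of the sample queries are zombies")
```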

Figure 5 – Zombie Age Distribution – 160-day collection

The cumulative distribution shows that one-quarter of all zombies are between one and 24 hours old. There is a long-lived tail to this distribution, and 1% of all queries are for query names that were created more than 83 days ago. Figure 6 shows that the decline in these long-lived zombies appears to be approximately linear when viewed using a log scale, which suggests some form of an exponential decline of these zombies over time.

Figure 6 – Cumulative Zombie Age Distribution – 160-day collection
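The link between "linear on a log scale" and exponential decline can be checked with a straight-line fit to the logarithm of the counts: if the count at age t behaves like N0 · exp(−λt), then ln(count) falls on a line with slope −λ. The sketch below uses synthetic counts generated from an assumed 20-day half-life purely to show the method; none of these numbers come from the measurement data.

```python
import math


def fit_exponential(ages_days, counts):
    """Least-squares line through (age, ln(count)).

    Returns (n0, decay_per_day) for the model count = n0 * exp(-decay * age).
    """
    logs = [math.log(c) for c in counts]
    n = len(ages_days)
    mean_x = sum(ages_days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in ages_days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages_days, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic illustration only: counts follow a known 20-day half-life.
true_decay = math.log(2) / 20
ages = list(range(30, 160, 10))
counts = [1_000_000 * math.exp(-true_decay * a) for a in ages]

n0, decay = fit_exponential(ages, counts)
half_life = math.log(2) / decay
print(f"fitted half-life: {half_life:.1f} days")
```

Applied to real per-day zombie counts, a poor fit would indicate that the tail is not a simple exponential.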

It’s a somewhat surprising outcome that one-quarter of all the DNS queries at these authoritative name servers are zombie queries, where there is no discernible original trigger event.

What could be causing this behaviour? One possible explanation is that this is not DNS cache refreshing at all, but web cache refreshing, coupled with a form of web caching that entails checking the validity of the embedded URLs within the page, and this would cause these zombie queries.

Do web zombies exist? And is there any correlation between the web zombie fetch distribution and the DNS zombie query distribution? Over the same period, the web servers associated with this experiment recorded 9,005,437,917 web fetches, of which just 7,055,965 appear to be aged more than one hour. So 0.08% of the web queries are zombies, and this is far lower than the 24.41% zombie rate seen in the DNS query logs.
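As a quick arithmetic check on the scale of the difference, the totals quoted in the text can be turned into rates. This is only a back-of-the-envelope sketch using those published counts; the DNS totals work out to roughly one quarter, in the same range as the rate quoted above.

```python
# Totals quoted in the text for the collection period.
dns_total, dns_zombies = 44_733_946_408, 11_274_142_797
web_total, web_zombies = 9_005_437_917, 7_055_965

dns_rate = 100 * dns_zombies / dns_total  # percent of DNS queries
web_rate = 100 * web_zombies / web_total  # percent of web fetches

print(f"DNS zombie rate: {dns_rate:.1f}%")
print(f"Web zombie rate: {web_rate:.2f}%")
print(f"DNS rate is about {dns_rate / web_rate:.0f} times the web rate")
```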

The cumulative distribution of the age of these zombies is also quite different. As shown in Figure 7, most of the DNS query zombies are less than 100 days old, while some 6% of the zombie web queries are greater than 100 days old.

Figure 7 – Cumulative Zombie Age Distribution – Web vs DNS

It appears that there is no real correlation here, and the DNS zombie query rate is largely independent of the far smaller web zombie query rate.

It seems that we are left with the DNS resolvers themselves being the cause of this zombie query pattern. The next question is the nature of the zombie activity. Is this the result of a small number of unique queries with a very large query rate, or a much larger number of unique queries that are queried at a far lower rate of around once per day or similar? The distribution of repeat queries is a ‘heavy tail’ distribution with a high number of zombie queries occurring between 1 and 12 times per day.

Figure 8: Queries per unique Query Name per day

Of the 59 million unique zombie query names, one-quarter of these names are queried once per day or less, and 9/10 of these names are queried 12 times per day or less. Most of these queries appear to be some form of local cache refreshing, with a local refresh timer set to between 2 and 24 hours. However, some 16 query names were queried in excess of 10 million times in a day!

Figure 9: Cumulative Distribution of unique Query Names per day
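The per-day counts translate directly into refresh intervals and query rates. The short sketch below does that conversion for the values mentioned in the text, assuming the repeat queries are spread evenly across the day.

```python
SECONDS_PER_DAY = 86_400


def refresh_interval_hours(queries_per_day):
    """Average gap between queries for one name, in hours."""
    return 24 / queries_per_day


def queries_per_second(queries_per_day):
    """Sustained query rate for one name."""
    return queries_per_day / SECONDS_PER_DAY


for per_day in (1, 12, 1_000_000, 10_000_000):
    print(f"{per_day:>10,} per day: one query every "
          f"{refresh_interval_hours(per_day) * 3600:,.3f} s, "
          f"{queries_per_second(per_day):,.4f} queries per second")
```

Twelve queries per day corresponds to a two-hour timer, while 10 million queries per day for a single name is a sustained rate of well over 100 queries per second.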

What is the nature of this zombie query load? To what extent is this load due to a large number of query names being held in DNS resolver caches being periodically refreshed at daily or hourly cycles? Or is this dominated by a small number of resolvers that appear to have wedged themselves into a maniacal query loop and are performing queries for the same name at a sustained query rate in excess of 100 queries per second?

The distribution of zombies according to the repeat frequency per day is shown in Figure 10.

Figure 10: Cumulative Distribution of Zombie Queries per day

What we can see from this figure is that some 30% of the zombie queries are from resolvers that query for the same query name fewer than 30 times per day. So a little under one-third of the zombie queries are from resolvers using a local cache refresh timer of the order of hours to maintain their local cache. However, some 60% of the zombie queries are from query strings that are queried around 1 million times (or more) per day. This very high query rate suggests that these queries are originating from resolver behaviours that are broken in some manner, and these resolvers have been pushed into some form of pathological high-speed query loop.

This data points to the observation that there is one set of resolvers that appear to be misbehaving by emitting a duplicate query stream at high volume, while a second set of resolvers appear to be shadowing original DNS queries.

Let’s look at this data set by counting, for each visible DNS resolver, the number of queries made for DNS names that are “current” and the number of queries that are “zombie” queries, where the timestamp in the DNS name is older than one hour.
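A minimal sketch of this per-resolver tally follows. It assumes each log record has already been reduced to a (resolver address, query time, name creation time) tuple; the addresses are documentation-range placeholders and the function names are illustrative.

```python
from collections import defaultdict

ZOMBIE_AGE_SECONDS = 3600  # the one-hour threshold used in the text


def tally_by_resolver(records):
    """records: iterable of (resolver_ip, query_time, created_time)."""
    counts = defaultdict(lambda: {"current": 0, "zombie": 0})
    for resolver, query_time, created_time in records:
        old = (query_time - created_time) > ZOMBIE_AGE_SECONDS
        counts[resolver]["zombie" if old else "current"] += 1
    return counts


def zombie_ratio(entry):
    """Zombie queries per current query; None when there are no current queries."""
    return entry["zombie"] / entry["current"] if entry["current"] else None


# Hypothetical records, purely to exercise the tally.
records = [
    ("192.0.2.1", 1000.0, 999),     # current: 1 second old
    ("192.0.2.1", 90000.0, 999),    # zombie: about a day old
    ("192.0.2.1", 180000.0, 999),   # zombie: about two days old
    ("198.51.100.7", 90000.0, 999), # zombie only, never seen "current"
]
counts = tally_by_resolver(records)
ranked = sorted(counts.items(), key=lambda kv: kv[1]["zombie"], reverse=True)
for resolver, entry in ranked:
    print(resolver, entry["current"], entry["zombie"], zombie_ratio(entry))
```

Sorting by the zombie count gives a ranking by volume, while sorting by the ratio would bring resolvers that make few or no current queries to the top.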

The first list shows the resolvers with the highest zombie query counts among the 1.2 million resolver IP addresses gathered in this exercise. The table shows the IP address of the DNS resolver that made the query to the authoritative name server, the count of “current” queries, the count of zombie queries, the ratio between the two and the network and country where the DNS resolver is located (Figure 11).

| Resolver | Current | Zombie | Ratio | ASN | CC | AS Name |
|---|---|---|---|---|---|---|
| … | …,978,931 | 4,610,444,812 | 1,158 | 14754 | GT | Telgua, Guatemala |
| … | …,124,423 | 1,006,797,893 | 71 | 35656 | JO | JUNET Jordanian Universities, Jordan |
| … | …,868,204 | 870,945,137 | 88 | 53618 | CA | ADITY-OSH – Aditya Birla Minacs, Canada |
| … | …,034,545 | 594,314,499 | 16 | 2572 | US | Missouri Research and Edu., United States |
| … | … | …,038,416 | 81,862,630 | 23028 | US | Team Cymru Inc, United States |
| … | …,486,712 | 379,724,419 | 255 | 21391 | DZ | TDA-AS,DZ Algeria |
| … | …,041,670 | 373,155,047 | 182 | 21391 | DZ | TDA-AS,DZ Algeria |
| … | …,697,987 | 255,364,280 | 44 | 35656 | JO | JUNET Jordanian Universities, Jordan |
| … | …,975,978 | 200,821,246 | 101 | 14214 | CA | MINACS – Minacs Inc, Canada |
| … | … | …,929,881 | 11,720,898 | 23028 | US | Team Cymru Inc, United States |
| … | … | …,905,028 | 54,952,514 | 23028 | US | Team Cymru Inc, United States |
| … | … | …,637,788 | 30,212,596 | 23028 | US | Team Cymru Inc, United States |
| … | … | …,436,258 | 22,478,752 | 23028 | US | Team Cymru Inc, United States |
| … | …,986 | 39,623,754 | 421 | 14868 | BR | COPEL Telecom S.A. Brazil |
| … | …,632,910 | 17,868,074 | 10 | 27026 | US | Network Maryland, US United States |
| … | … | …,637,567 | 1,356,735 | 16509 | US | AMAZON-02 – Amazon.com, United States |
| … | … | …,331,749 | 293,758 | 16509 | US | AMAZON-02 – Amazon.com, United States |
| … | …,259,591 | 12,759,627 | 4 | 14813 | BB | Columbus Telecommunications, Barbados |

Figure 11 – Resolvers with the highest zombie query count

There are two distinct behaviours visible here.

One is “wedged” resolvers that appear to be making the same query over and over again. For example, the first line shows that a resolver located in Guatemala generates on average 1,158 zombie queries for each current query, and over the entire period generated some 4.6 billion zombie queries. Clearly there is some intense query loop going on here, and the 4.6 billion zombie queries coming from this single resolver are a perverse form of high achievement in today’s Internet. It’s likely that this resolver is wedged in some strange looping state. A similar picture exists for the two resolvers in the Jordanian Universities Network, which generated some 1 billion and 250 million zombie queries respectively. Also notable are the DNS resolvers in Minacs in Canada, Missouri Research and Education in the US, Copel in Brazil, and to a lesser extent Columbus Telecommunications in Barbados. It’s likely that these resolvers are in some strange form of query loop and they probably need some intervention to calm them down.

The second form of behaviour is shown in this list by those resolvers that make a massive number of zombie queries, but few, if any, current queries. These resolvers are operated by Team Cymru and Amazon.

Let’s look at each of these behaviours in slightly more detail.

We can provide a little more detail on the apparently broken resolvers by looking at the number of unique current queries made by each resolver and the number of subsequent repeat queries, and do the same for the zombie queries (Figure 12).

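| Resolver | Current Uniques | Current Repeats | Zombie Uniques | Zombie Repeats | Zombie Repeat Ratio | ASN | CC | AS Name |
|---|---|---|---|---|---|---|---|---|
| … | …,238 | 10,501,108 | 724 | 876,780,601 | 1,211,023 | 53618 | CA | Aditya Birla Minacs Worldwide, Canada |
| … | …,495 | 35,034,545 | 572 | 600,739,995 | 1,050,244 | 2572 | US | MOREnet, United States |
| … | … | …,978,931 | 6,462 | 4,704,634,886 | 728,046 | 14754 | GT | Telgua, Guatemala |
| … | … | …,167,441 | 411 | 202,079,128 | 491,676 | 14214 | CA | MINACS – Minacs, Canada |
| … | …,201 | 14,435,262 | 3,094 | 1,019,572,525 | 329,532 | 35656 | JO | JUNET Jordanian Universities, Jordan |
| … | … | …,700 | 11 | 3,338,108 | 303,464 | 18474 | US | Aeneas Internet Services, United States |
| … | … | …,058 | 12 | 3,154,574 | 262,881 | 18101 | IN | Reliance Communications, India |
| … | … | …,986 | 218 | 40,534,251 | 185,936 | 14868 | BR | COPEL Telecom, Brazil |
| … | … | …,442 | 1 | 138,326 | 138,326 | 31418 | ES | SOGECABLE, Spain |
| … | … | …,946,242 | 6 | 668,671 | 111,445 | 7922 | US | Comcast, United States |
| … | …,830 | 37,012,512 | 1,408 | 142,438,304 | 101,163 | 37558 | LY | LITC, Libya |
| … | … | …,644 | 17 | 1,522,810 | 89,577 | 28625 | BR | Terremark do Brasil, Brazil |
| … | … | …,166 | 6 | 436,269 | 72,711 | 174 | US | Cogent Communications, United States |
| … | … | …,915 | 6 | 435,973 | 72,662 | 39742 | UA | ITM IT-MARK, Ukraine |
| … | … | …,122 | 58 | 3,729,929 | 64,309 | 27026 | US | NETWORKMARYLAND, United States |
| … | …,998 | 5,819,430 | 5,634 | 258,275,972 | 45,842 | 35656 | JO | JUNET Jordanian Universities, Jordan |
| … | … | …,886 | 39 | 1,731,390 | 44,394 | 3215 | FR | AS3215 Orange, France |
| … | … | …,215 | 39 | 1,727,063 | 44,283 | 3215 | FR | AS3215 Orange, France |
| … | …,823 | 1,634,688 | 505 | 19,286,366 | 38,190 | 27026 | US | NETWORKMARYLAND, United States |
| … | … | …,777 | 5 | 150,976 | 30,195 | 7385 | US | Integra Telecom, United States |
| … | 670 | 16,443 | 1 | 28,921 | 28,921 | 41383 | GB | WOLASN Wolseley, United Kingdom |
| … | … | …,244 | … | … | 24,680 | 8167 | BR | Brasil Telecom, Brazil |
| … | … | …,696 | 4 | 94,524 | 23,631 | 36907 | AO | TVCaboAngola, Angola |
| … | … | …,403 | 9 | 201,797 | 22,421 | 33481 | US | BELWAVE COMMUNICATIONS, United States |
| … | … | …,292 | 3 | 66,067 | 22,022 | 34397 | SA | Cyberia Riyadh, Saudi Arabia |
| … | … | …,333 | 183 | 3,690,776 | 20,168 | 6471 | CL | ENTEL, Chile |
| … | … | …,890 | 8 | 137,972 | 17,246 | 7018 | US | ATT-INTERNET4, United States |
| … | … | …,707 | 2 | 30,221 | 15,110 | 25899 | US | LS Networks, United States of America |
| … | … | …,536 | 16 | 193,635 | 12,102 | 17126 | CL | E-money, Chile |
| … | … | …,941 | 13 | 152,705 | 11,746 | 8359 | RU | MTS MTS PJSC, Russian Federation |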

Figure 12 – Resolvers with the highest zombie repeat query ratio

This is a day-by-day running total of the number of unique current queries made by each resolver, and the number of repeat queries made in the first hour. The first resolver in the list is clearly broken, insofar as it managed to generate some 10 million repeat queries in the first hour from just 3,200 initial unique queries. This appears to be a resolver that took the 1 second TTL seriously and commenced a cache refresh cycle based on this 1 second TTL.

If a resolver is going to gratuitously refresh a local cache entry, it should pass the TTL through a basic sanity check first! Or give up after 1 or 2 gratuitous refresh cycles. This is a resolver that kept on querying, and it presented the server with some 876 million subsequent queries for just 724 unique query names, an amplification factor of 1.2 million repeat queries per name. The resolver at the Missouri Research and Education network in the US shows very similar query behaviour.

All these resolvers listed in the table above have the highest zombie amplification factor. Either they are taking the original 1 second TTL literally and attempting to keep the record in a local cache by mindlessly re-querying the name every second, or there is some other pathology that is causing these resolvers to enter a very high repeat query cycle.
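The kind of sanity check suggested above could look something like the following sketch. It is not taken from any resolver implementation: the function name, the 30-second floor and the limit of two unprompted refreshes are all assumptions chosen for illustration.

```python
MIN_REFRESH_TTL = 30          # assumed floor in seconds; not from the article
MAX_GRATUITOUS_REFRESHES = 2  # "give up after 1 or 2 gratuitous refresh cycles"


def next_refresh_delay(ttl, idle_refreshes):
    """Seconds until the next proactive refresh, or None to let the entry expire.

    idle_refreshes counts refreshes already made without any client
    having asked for the name in the meantime.
    """
    if idle_refreshes >= MAX_GRATUITOUS_REFRESHES:
        return None  # nobody is asking for this name: stop re-querying
    return max(ttl, MIN_REFRESH_TTL)  # never refresh faster than the floor


# A name with a 1 second TTL that no client ever asks for again.
idle = 0
queries_sent = 0
while next_refresh_delay(ttl=1, idle_refreshes=idle) is not None:
    queries_sent += 1
    idle += 1
print(f"proactive refresh queries sent: {queries_sent}")
```

Under these assumptions the name is refreshed twice and then dropped, compared with up to 86,400 queries per day for a resolver that re-queries every second.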

Another way to look at the second category of resolver behaviour is to rank the resolvers by the ratio of zombie to current queries. Figure 13 shows the 25 resolvers with the highest zombie to current query ratio.

| Resolver | Current | Zombie | Ratio | ASN | CC | AS Name |
|---|---|---|---|---|---|---|
| … | …,978,931 | 4,610,444,812 | 1,158 | 14754 | GT | Telgua, Guatemala |
| … | … | …,038,416 | 81,862,630 | 23028 | US | Team Cymru Inc, United States |
| … | … | …,905,028 | 54,952,514 | 23028 | US | Team Cymru Inc, United States |
| … | … | …,637,788 | 30,212,596 | 23028 | US | Team Cymru Inc, United States |
| … | … | …,436,258 | 22,478,752 | 23028 | US | Team Cymru Inc, United States |
| … | … | …,929,881 | 11,720,898 | 23028 | US | Team Cymru Inc, United States |
| … | … | …,519,461 | 5,519,461 | 27471 | US | Blue Coat Systems, Inc, United States |
| … | … | …,472,109 | 2,472,109 | 6830 | NL | LGI-UPC Liberty Global Operations, Netherlands |
| … | … | …,401,930 | 2,401,930 | 6830 | NL | LGI-UPC Liberty Global Operations, Netherlands |
| … | … | …,480,634 | 1,480,634 | 16509 | US | AMAZON-02 – Amazon.com, Inc, United States |
| … | … | …,479,066 | 1,479,066 | 16509 | US | AMAZON-02 – Amazon.com, Inc, United States |
| … | … | …,423,147 | 1,423,147 | 16509 | US | AMAZON-02 – Amazon.com, Inc, United States |
| … | … | …,637,567 | 1,356,735 | 16509 | US | AMAZON-02 – Amazon.com, Inc, United States |
| … | … | …,849 | 842,849 | 16509 | US | AMAZON-02 – Amazon.com, Inc, United States |
| … | … | …,779 | 713,779 | 24151 | CN | China Internet Network Information Center, China |
| … | … | …,889 | 372,889 | 16509 | US | AMAZON-02 – Amazon.com, Inc, United States |
| … | … | …,598 | 365,598 | 16509 | US | AMAZON-02 – Amazon.com, Inc, United States |
| … | … | …,804 | 361,804 | 16509 | US | AMAZON-02 – Amazon.com, Inc, United States |
| … | … | …,474 | 361,474 | 16509 | US | AMAZON-02 – Amazon.com, Inc, United States |
| … | … | …,080 | 345,080 | 16509 | US | AMAZON-02 – Amazon.com, Inc, United States |
| … | … | …,949 | 338,949 | 16509 | US | AMAZON-02 – Amazon.com, Inc, United States |
| … | … | …,725 | 334,725 | 16509 | US | AMAZON-02 – Amazon.com, Inc, United States |
| … | … | …,208 | 326,208 | 3462 | TW | HINET Data Communication Business Group, Taiwan |
| … | … | …,403 | 323,403 | 3462 | TW | HINET Data Communication Business Group, Taiwan |
| … | … | …,396 | 321,396 | 1136 | NL | KPN, Netherlands |
| … | … | …,115 | 317,115 | 17204 | US | Nominum, Inc, United States |

Figure 13 – Resolvers with the highest zombie query ratio

It’s interesting that almost none of these resolvers made a “current” query – they appear to specialize almost exclusively in zombie queries. It may well be these particular systems are used as part of an operation to collect the URLs that users go to and then validate these URLs by resolving the names themselves. Both Team Cymru and Blue Coat apparently specialize in cybersecurity functions, so this may well be the case.

It appears that the overall 25% zombie ratio of DNS queries we are seeing here is made up of two quite different behaviours. The first is a small set of resolvers that are re-querying the same DNS query at rates that can only be described as maniacally insane! This is probably the outcome of an extended local cache retention policy and strict adherence to the provided TTL. The combination is just disastrous. The second zombie query component is a little more sinister. It seems that nothing you or I do on the Internet is a secret, and there is a large industry that actively tracks what you and I do. Now it may be that their motives are pure of heart, and they perform this intense shadowing as part of their efforts to identify and track various forms of cyber abuse and attack. However, the result is that it seems that as Internet users we are little more than goldfish in a clear glass bowl, and personal privacy is a quaint historic function.

Which has the larger zombie population? The storers, those maniacal re-queriers that hammer a small number of unique queries? Or the stalkers, those DNS snoopers that re-query a massive number of unique names, but handle each unique query in a more constrained manner?

The distribution of Zombie query ratios is shown in Figure 14.

One-fifth, or some 20%, of all zombie queries are made from resolvers that query these labels fewer than five times. It is plausible to infer that within this set of queries there is some element of online tracking and shadowing of user behaviour. Almost all visible resolvers that pose zombie queries (94%) have a zombie re-query ratio of five or less. The resolvers listed in Figure 13 appear to be part of this set of DNS trackers.

At the other end of the scale, some 60% of all zombie queries are part of a repeat query set that is 100,000 queries or greater. These zombie queries with this very high repeat rate are generated by just 11 resolvers, listed as the highest ranked resolvers in Figure 12.

Figure 14: Cumulative Distribution of Zombie Queries per day as Zombie Ratio

In the larger scheme of things, most of the DNS is behaving exactly as expected, and more than one-half of the 1.2 million visible DNS resolvers make no zombie queries whatsoever. Some 424,000 visible resolvers perform up to 5 zombie queries per unique query name, which could be seen as some modest level of local cache refresh. The remaining 15,000 DNS resolvers behave in progressively worse ways, with the re-query rates rising from 5 zombie queries per unique query to the worst case of 1.2 million zombie queries per unique query. Even if we can fix just 11 of the worst cases here we would make a substantial impact on the zombie population in the DNS.

The good news is that we now know who these zombies are.

Now all we have to do is kill them.

Zombiepedia offers some hints as to how to do that!

Editor’s note: Below is a presentation that Geoff gave on this topic at DNS OARC 24, Buenos Aires

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.
