In the Domain Name System (DNS), authoritative nameservers are charged with answering queries for the DNS zones under their control. For redundancy, each DNS zone should have multiple authoritative nameservers.
Combined, RFC 1034 and RFC 2182 state that zones must have at least two topologically and geographically distributed nameservers. Having multiple authoritative nameservers for a zone improves robustness in the face of individual machine or network failures. Therefore, many zones, including those considered critical to many enterprises, operate with more than two authoritative nameservers per zone so that they can withstand multiple simultaneous failures.
This practice leaves recursive resolvers, which send their queries to authoritative nameservers, with a choice as to which authoritative nameserver to contact when sending each DNS query.
In this post, I explore how recursive resolvers select among authoritative nameservers, and why. Answering these two questions informs decisions around authoritative nameserver deployment and can help improve recursive resolver behaviour.
The authoritative nameservers for a zone are defined in the DNS via delegation or nameserver (NS) records. Each zone has a set of these records that specify the authoritative nameservers for the zone.
The records contain a name, itself resolvable via the DNS to an IP address, which recursive resolvers use for DNS queries regarding the zone. For example, the edgekey.net zone used by the Akamai CDN has 13 NS records specifying the 13 authoritative nameservers that can answer queries for the zone.
edgekey.net. 172800 IN NS a6-65.akam.net.
edgekey.net. 172800 IN NS a12-65.akam.net.
edgekey.net. 172800 IN NS a13-65.akam.net.
edgekey.net. 172800 IN NS a16-65.akam.net.
edgekey.net. 172800 IN NS a18-65.akam.net.
edgekey.net. 172800 IN NS a28-65.akam.net.
edgekey.net. 172800 IN NS adns1.akam.net.
edgekey.net. 172800 IN NS ns1-66.akam.net.
edgekey.net. 172800 IN NS ns4-66.akam.net.
edgekey.net. 172800 IN NS ns5-66.akam.net.
edgekey.net. 172800 IN NS ns7-65.akam.net.
edgekey.net. 172800 IN NS usw6.akam.net.
Figure 1 — List of NS records for the zone edgekey.net
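As a small illustration of how such records can be handled programmatically, the sketch below parses lines in the presentation format shown in Figure 1 (owner, TTL, class, type, target). The `NSRecord` type and `parse_ns_lines` function are hypothetical names introduced here, and the sample uses only three of the records above.

```python
from typing import NamedTuple


class NSRecord(NamedTuple):
    zone: str    # owner name of the record, e.g. "edgekey.net."
    ttl: int     # time to live in seconds
    target: str  # hostname of the authoritative nameserver


def parse_ns_lines(text: str) -> list[NSRecord]:
    """Parse 'owner TTL IN NS target' lines; other lines are ignored."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[2] == "IN" and fields[3] == "NS":
            records.append(NSRecord(fields[0], int(fields[1]), fields[4]))
    return records


sample = """\
edgekey.net. 172800 IN NS a6-65.akam.net.
edgekey.net. 172800 IN NS adns1.akam.net.
edgekey.net. 172800 IN NS usw6.akam.net.
"""
records = parse_ns_lines(sample)
nameservers = [r.target for r in records]
```

Each target name still has to be resolved to an IP address before a recursive resolver can send it a query.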
To answer these questions, I used logs from Akamai’s authoritative nameservers for DNS queries from recursive resolvers. I collected 10 minutes of logs in which each entry includes the IP address of the recursive resolver.
Next, I used the authoritative nameservers to ping each recursive resolver IP address in the locally collected logs. Thus, for each recursive resolver and authoritative nameserver pair, I have the number of DNS queries the resolver sent to that nameserver and an estimate of the round-trip time (RTT) between the two.
Of the 890K recursive resolvers studied, 60% sent fewer than one query per minute. Let’s focus first on the recursive resolvers that sent many DNS queries and return to the ones with few queries at the end.
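The resulting dataset can be pictured with a minimal sketch like the one below. The input shapes are assumptions made for illustration: log entries reduced to `(resolver_ip, authority)` tuples and ping results keyed by the same pair. The real log format and tooling are not described in this post, and the IP addresses are documentation-range placeholders.

```python
from collections import Counter, defaultdict


def count_queries(log_entries):
    """Count queries per resolver and authority.

    log_entries: iterable of (resolver_ip, authority) tuples (assumed format).
    Returns {resolver_ip: Counter({authority: query_count})}.
    """
    counts = defaultdict(Counter)
    for resolver_ip, authority in log_entries:
        counts[resolver_ip][authority] += 1
    return counts


def join_with_rtt(counts, rtt_ms):
    """Attach a ping RTT to every (resolver, authority) pair.

    rtt_ms: dict mapping (resolver_ip, authority) -> RTT in milliseconds.
    Pairs without a ping result get None.
    """
    rows = []
    for resolver_ip, per_authority in counts.items():
        for authority, query_count in per_authority.items():
            rtt = rtt_ms.get((resolver_ip, authority))
            rows.append((resolver_ip, authority, query_count, rtt))
    return rows


entries = [
    ("192.0.2.1", "a6-65.akam.net."),
    ("192.0.2.1", "a6-65.akam.net."),
    ("192.0.2.1", "usw6.akam.net."),
    ("198.51.100.7", "adns1.akam.net."),
]
pings = {("192.0.2.1", "a6-65.akam.net."): 12.5}
rows = join_with_rtt(count_queries(entries), pings)
```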
Recursive resolvers prefer delegations with lower RTT
Upon examining the number of queries that recursive resolvers send to each authority, I found that very few appear to spread the load evenly.
Fewer than 7% of recursive resolvers distribute their DNS queries uniformly among all 13 authoritative nameservers or even a subset of those nameservers. This implies that most (93%) recursive resolvers prefer some authoritative nameservers over others.
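This post does not spell out how "uniform" was judged, so the following is only one plausible way to flag a resolver as spreading load evenly. It assumes a simple tolerance check over the authorities the resolver actually used (which also covers the "subset" case); `looks_uniform` and its `tolerance` parameter are hypothetical.

```python
def looks_uniform(query_counts, tolerance=0.25):
    """Return True if queries are spread roughly evenly.

    query_counts: per-authority query counts for one resolver.
    Only authorities with at least one query are considered, so an even
    spread over a subset of the authorities also counts as uniform.
    tolerance: allowed relative deviation from the even share (assumed value).
    """
    used = [n for n in query_counts if n > 0]
    if len(used) < 2:
        return False  # everything went to one authority: a clear preference
    expected = sum(used) / len(used)
    return all(abs(n - expected) <= tolerance * expected for n in used)


even_over_all = looks_uniform([10] * 13)
even_over_subset = looks_uniform([50, 48, 52] + [0] * 10)
skewed = looks_uniform([100] + [1] * 12)
```

A statistical test such as chi-squared would be more defensible for resolvers with few queries, where random noise alone can produce an uneven split.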
Previous research, covering both specific recursive resolver software tested in the lab and recursive resolvers in the wild measured with synthetic traffic loads, shows that resolvers prefer lower RTT authoritative nameservers over ones with higher RTT. The RTT to each authoritative nameserver must be discovered, however, and recursive resolver software learns it by measuring the time between sending a DNS query and receiving the DNS response.
Because RTT to different authoritative nameservers varies and can change with time, recursive resolvers must periodically send DNS queries to all authoritative nameservers to update their knowledge of RTT.
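To make the learning step concrete, here is a minimal sketch of a smoothed RTT estimator. It assumes an exponentially weighted moving average with a hypothetical smoothing factor `alpha`; it is not the algorithm of any particular resolver implementation, and real resolvers differ in how they smooth, decay, and expire their estimates.

```python
class RttEstimator:
    """Track a smoothed RTT per authoritative nameserver (illustrative only)."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha  # weight given to the newest sample (assumed value)
        self.srtt_ms = {}   # authority -> smoothed RTT in milliseconds

    def observe(self, authority, sample_ms):
        """Fold one measured query-to-response time into the estimate."""
        previous = self.srtt_ms.get(authority)
        if previous is None:
            self.srtt_ms[authority] = float(sample_ms)
        else:
            self.srtt_ms[authority] = (
                (1 - self.alpha) * previous + self.alpha * sample_ms
            )

    def best(self):
        """Return the authority with the lowest estimate, or None if empty."""
        if not self.srtt_ms:
            return None
        return min(self.srtt_ms, key=self.srtt_ms.get)


estimator = RttEstimator()
estimator.observe("a6-65.akam.net.", 100)
estimator.observe("a6-65.akam.net.", 50)
estimator.observe("usw6.akam.net.", 20)
preferred = estimator.best()
```

An estimate only changes when a query is actually sent to that authority, which is why a resolver has to keep probing nameservers it does not currently prefer.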
I confirmed these findings in our production traffic logs.
First, roughly 10% of recursive resolvers sent nearly all (>95%) of their DNS queries to a single authoritative nameserver out of the 13, which was also frequently the lowest RTT authoritative nameserver according to ping measurements.
Second, the remaining ~83% of recursive resolvers also showed a preference for lower RTT authoritative nameservers, but the preference was not as pronounced.
Consider the following computation: for each recursive resolver, sort the authoritative nameservers by RTT and divide the list in half, one half with lower RTTs (A) and one half with higher RTTs (B). If the recursive resolver prefers lower RTT authoritative nameservers, then it should send over half of its DNS queries to group A.
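The computation just described can be sketched as follows. How the odd count of 13 authorities was split is not stated, so this sketch assumes group A takes the smaller half (integer division); `fraction_to_lower_rtt_half` is a hypothetical name.

```python
def fraction_to_lower_rtt_half(queries, rtt_ms):
    """Fraction of one resolver's queries sent to the lower RTT half.

    queries: dict authority -> number of queries from this resolver.
    rtt_ms:  dict authority -> measured RTT in milliseconds.
    Returns None if the resolver sent no queries.
    """
    ranked = sorted(rtt_ms, key=rtt_ms.get)   # lowest RTT first
    group_a = ranked[: len(ranked) // 2]      # assumption: floor on odd counts
    total = sum(queries.get(a, 0) for a in ranked)
    if total == 0:
        return None
    return sum(queries.get(a, 0) for a in group_a) / total


rtts = {"ns-a": 10, "ns-b": 20, "ns-c": 30, "ns-d": 40}
sent = {"ns-a": 6, "ns-b": 2, "ns-c": 1, "ns-d": 1}
share = fraction_to_lower_rtt_half(sent, rtts)  # 8 of 10 queries went to group A
```

A value above 0.5 indicates a preference for lower RTT authoritative nameservers.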
Figure 2 shows the fraction of DNS queries sent to group A for each recursive resolver in our dataset. The green shaded area indicates a preference for lower RTT authoritative nameservers and 80% of recursive resolvers fall within the green area.
Figure 2 — Cumulative distribution function showing the fraction of DNS queries that recursive resolvers send to the half of authoritative nameservers with lower RTTs.
The preference could be stronger
The preference for many recursive resolvers is not as strong as we might expect. For example, 40% of recursive resolvers sent less than 65% of their DNS queries to the lower RTT half of authoritative nameservers. Of course, this means that for many DNS queries the time required to receive a response is inflated.
Figure 3 shows how inflated the average resolution time is for each recursive resolver. The Y-axis (zero inflation) represents an idealized recursive resolver that sends all DNS queries to the lowest RTT authoritative nameserver, while the curve shows that 15% of the recursive resolvers measured inflate the average resolution time by more than 50ms because they send DNS queries to authoritative nameservers with higher RTTs. If recursive resolvers preferred the lower RTT nameservers more strongly (like the 10% described above that send nearly all DNS queries to the lowest RTT authoritative nameserver), the curve would shift towards the left.
Figure 3 — Cumulative distribution function showing the average additional milliseconds that DNS resolutions took because recursive resolvers did not use the lowest RTT authoritative nameserver.
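One way to compute such an inflation figure for a single resolver is sketched below. It assumes the ping RTT stands in for resolution time and that inflation is the query-weighted average of each authority's RTT minus the lowest RTT; the exact method behind Figure 3 is not given here, and `average_inflation_ms` is a hypothetical name.

```python
def average_inflation_ms(queries, rtt_ms):
    """Average extra milliseconds per query versus always using the best authority.

    queries: dict authority -> number of queries from this resolver.
    rtt_ms:  dict authority -> RTT in milliseconds (must cover every key in queries).
    Returns None if the resolver sent no queries.
    """
    total = sum(queries.values())
    if total == 0:
        return None
    best = min(rtt_ms.values())
    extra = sum(count * (rtt_ms[a] - best) for a, count in queries.items())
    return extra / total


# Half of the queries go to an authority that is 100 ms slower than the best.
inflation = average_inflation_ms(
    {"ns-a": 5, "ns-b": 5}, {"ns-a": 10, "ns-b": 110}
)
```

In this toy case the resolver pays 50 ms extra per query on average, which is the threshold that 15% of measured resolvers exceed.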
Many recursive resolvers send very few DNS queries
Returning to the 60% of recursive resolvers that sent fewer than one query per minute, my analysis suggests that these resolvers likely run the same algorithms for authoritative nameserver selection as the recursive resolvers that sent more DNS queries. However, with such infrequent DNS queries, the recursive resolvers have few opportunities to measure RTT.
At one query per minute, it would take the recursive resolver 13 minutes to learn the RTT of each of the 13 authoritative nameservers. However, after 13 minutes, the RTT of the first authoritative nameserver is likely stale and needs to be refreshed. Thus, recursive resolvers with very low query rates likely do not benefit from algorithms designed to home in on the lowest RTT authoritative nameservers.
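The arithmetic can be written out as a short sketch. The notion of a maximum age after which an RTT estimate counts as stale, and the 10-minute value used below, are assumptions for illustration; actual staleness rules vary by resolver software.

```python
import math


def minutes_to_probe_all(num_authorities, queries_per_minute):
    """Time needed to send one query to every authority at a given query rate."""
    return num_authorities / queries_per_minute


def max_fresh_estimates(num_authorities, queries_per_minute, max_age_minutes):
    """Upper bound on how many RTT estimates can be fresh at the same time.

    Assumes every single query is spent on measurement, which is the best case.
    """
    probes_within_lifetime = math.floor(queries_per_minute * max_age_minutes)
    return min(num_authorities, probes_within_lifetime)


full_sweep = minutes_to_probe_all(13, 1)    # 13.0 minutes at one query per minute
fresh = max_fresh_estimates(13, 1, 10)      # 10 of 13 with a 10-minute lifetime
```

Even in this best case the resolver never holds a complete, fresh picture of all 13 authorities, and it has no queries left over to benefit from what it learned.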
Authoritative nameserver deployments and resolver behaviour
In summary, nearly all recursive resolvers (93%) attempt to prefer lower RTT authoritative nameservers when selecting where to send DNS queries. This can reduce resolution time because some authoritative nameservers have longer RTTs than others. However, the effectiveness of the technique is limited by two factors.
First, for the majority of recursive resolvers, the preference for lower RTT authoritative nameservers is not strong, leading to longer than necessary resolution times.
Second, a very large number of recursive resolvers in the wild send very few DNS queries, and techniques that estimate RTT and then home in on the lower RTT authoritative nameservers are not effective for them.
Together, I think two actions should be considered to improve and accommodate recursive resolver behaviour:
- Recursive resolver software developers should modify their software to send nearly all DNS queries to the lowest RTT authoritative nameservers and only send rare queries to update RTT estimates for all other authoritative nameservers.
- Authoritative nameserver operators should deploy their nameservers such that all authoritative nameservers are low RTT, as some recursive resolvers will be unable to identify the lower RTT ones.
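The first recommendation could look something like the sketch below: send almost every query to the authority with the lowest estimated RTT and spend a small, fixed fraction of queries probing the others. The function name, the 2% probe probability, and the rule of measuring unmeasured authorities first are all assumptions for illustration, not a description of existing resolver software.

```python
import random


def choose_authority(srtt_ms, probe_probability=0.02, rng=random):
    """Pick the authority for the next query.

    srtt_ms: dict authority -> estimated RTT in ms, or None if never measured.
    probe_probability: share of queries spent refreshing other estimates
                       (assumed value).
    rng: source of randomness; pass random.Random(seed) for repeatable runs.
    """
    unmeasured = [a for a, rtt in srtt_ms.items() if rtt is None]
    if unmeasured:
        return rng.choice(unmeasured)  # nothing known yet: measure it first
    best = min(srtt_ms, key=srtt_ms.get)
    others = [a for a in srtt_ms if a != best]
    if others and rng.random() < probe_probability:
        return rng.choice(others)      # rare probe to refresh an estimate
    return best                        # common case: lowest RTT authority


estimates = {"ns-a": 10.0, "ns-b": 50.0, "ns-c": 90.0}
seeded = random.Random(0)
picks = [choose_authority(estimates, rng=seeded) for _ in range(10_000)]
share_to_best = picks.count("ns-a") / len(picks)
```

As the low query rate discussion above suggests, a policy like this only helps resolvers that send enough queries for the rare probes to keep estimates current.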
Kyle Schomp is a Performance Engineer at Akamai Technologies in London, UK.
The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.
There is another (possibly ill-advised) method of keeping this first query time low. Upon timer expiry or first query, recursive resolvers could send the query to all authoritative servers listed, take the fastest responder as the “winner” (and subsequently order the other authoritative servers by their reply latency), and prefer this order until the cached timer expires or a forcing event occurs, such as no response or high latency from the currently preferred authoritative server. Clearly, this generates more traffic to authoritative servers, but it would be interesting to see a more in-depth study of how much traffic this would generate versus the improvement in query times. Perhaps it is an acceptable trade (perhaps not). The kneejerk reaction is probably that this is a terrible idea, but I have no data to say how terrible.
Further thinking on this, continuing with this bad idea and possibly making it slightly less awful: it would make sense to perform “sample” tests in the same way, where the preferred lowest latency server would receive the request, and a single other authoritative server would receive the same request at the same time. The fastest response would be used, and the lower latency server would be moved into the preferred position. Instead of N-1 additional queries, that would reduce the badness to 1 additional query per timeout cycle.
Hi John, there is a large space of potential optimizations to be studied. In addition to methods like you describe that run “races” to pick a winner (which I’m not sure is a terrible idea), I’ve also heard proposals to send all non-prefetching queries to the estimated lowest latency authority and then use only pre-fetching queries to update RTT estimates for other authorities (and possibly select a new lowest latency authority). Hybrid approaches that race when uncertainty is high (e.g., on the first query for a zone) and then fall back to other methods when uncertainty is lower (e.g., updating existing RTT estimates) are likely worth exploring as well.