Are homogenic nameserver names a single point of failure?

By Ondřej Surý on 3 May 2017

Category: Tech matters


The golden rule of security, stability, and resiliency of virtually anything is: “Don’t put all your eggs into one basket.” This generally applies to the DNS, and there are some recommendations to avoid having all your nameservers in one domain.

I would like to show that, in this case, spreading domains is not a silver bullet. It depends on many conditions, and using different domains in your nameserver set might make things worse rather than better.

There might be other reasons to spread domains used in the nameserver sets, so there is no definitive good or bad in this area, but there is always a need to balance the risks.

Evaluating Amazon CloudFront

Amazon CloudFront spreads the domains used for its nameservers quite a lot:

cloudfront.net. 172800 IN NS ns-666.awsdns-19.net.
cloudfront.net. 172800 IN NS ns-418.awsdns-52.com.
cloudfront.net. 172800 IN NS ns-1597.awsdns-07.co.uk.
cloudfront.net. 172800 IN NS ns-1306.awsdns-35.org.

There might be a very good reason for this, but we are going to look at it purely from the DNS Resolver point of view, and at the work that has to be done to resolve a simple domain name hosted on such a service.
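One way to see how spread out a nameserver set is: check which nameserver names sit inside the zone they serve. The following is only a minimal sketch in Python; the helper names (`normalize`, `is_in_bailiwick`, `out_of_bailiwick`) are made up for illustration, and the comparison is purely label-based, with no public suffix list involved.

```python
def normalize(name: str) -> str:
    """Lower-case a DNS name and drop the trailing root dot."""
    return name.rstrip(".").lower()


def is_in_bailiwick(ns_name: str, zone: str) -> bool:
    """Return True if ns_name is the zone itself or a name below it."""
    zone_norm = normalize(zone)
    if zone_norm == "":
        return True  # every name is below the root zone
    ns_labels = normalize(ns_name).split(".")
    zone_labels = zone_norm.split(".")
    # Compare the rightmost labels of the nameserver name with the zone.
    return ns_labels[-len(zone_labels):] == zone_labels


def out_of_bailiwick(zone: str, ns_names: list[str]) -> list[str]:
    """Return the nameserver names that live outside the zone they serve."""
    return [ns for ns in ns_names if not is_in_bailiwick(ns, zone)]


# NS set for cloudfront.net, copied from the listing above.
CLOUDFRONT_NS = [
    "ns-666.awsdns-19.net.",
    "ns-418.awsdns-52.com.",
    "ns-1597.awsdns-07.co.uk.",
    "ns-1306.awsdns-35.org.",
]

if __name__ == "__main__":
    # All four names are outside cloudfront.net, so a cold-cache resolver
    # has to find their addresses some other way (glue or extra lookups).
    for ns in out_of_bailiwick("cloudfront.net.", CLOUDFRONT_NS):
        print(ns)
```

Every name in this set is reported as out of the zone, which is exactly the situation the rest of this post walks through.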

I am going to use a (fake) domain – example.udp53.cz – to demonstrate that such a setup might lead to more DNS queries and increased latency.

Imagine a DNS Resolver with an empty cache. A DNS client asks this resolver for an AAAA record for example.udp53.cz.

$ kdig IN AAAA example.udp53.cz.

 

A very simple question, it seems. So, what would be the chain of queries the DNS Resolver must make to resolve the name?

Note: I have deliberately stripped most kdig output in the examples to make this post shorter. I am also ignoring DNSSEC for now because that would make this post even more complicated.

  • Every DNS Resolver is primed with addresses of root nameservers (well, at least one is needed), so that will be used:
    $ kdig AAAA example.udp53.cz. @2001:503:ba3e::2:30
    ;; AUTHORITY SECTION:
    cz. 172800 IN NS a.ns.nic.cz.
    […]
    ;; ADDITIONAL SECTION:
    a.ns.nic.cz. 172800 IN A 194.0.12.1
    a.ns.nic.cz. 172800 IN AAAA 2001:678:f::1
    […]

 

  • Good, there are GLUE records in the ADDITIONAL SECTION we can use, so the next step would be to ask one of the .cz nameservers:
    $ kdig AAAA example.udp53.cz. @2001:678:f::1
    ;; AUTHORITY SECTION:
    udp53.cz. 3600 IN NS trubka.network.cz.
    udp53.cz. 3600 IN NS master.dns.rocks.
    ;; ADDITIONAL SECTION:
    trubka.network.cz. 3600 IN A 81.91.84.116
    trubka.network.cz. 3600 IN AAAA 2001:1568:b:145::1
    trubka.network.cz. 3600 IN AAAA 2001:1568:b::145
    […]

 

  • And now the DNS Resolver might pick either of the two nameservers. Let’s pick the worse of the two for latency, master.dns.rocks, and we are back at the root zone:
    $ kdig AAAA master.dns.rocks. @2001:503:ba3e::2:30
    ;; AUTHORITY SECTION:
    rocks. 172800 IN NS demand.beta.aridns.net.au.
    rocks. 172800 IN NS demand.alpha.aridns.net.au.
    rocks. 172800 IN NS demand.delta.aridns.net.au.
    rocks. 172800 IN NS demand.gamma.aridns.net.au.
    ;; ADDITIONAL SECTION:
    demand.alpha.aridns.net.au. 172800 IN A 37.209.192.7
    […]

 

  • Um, wait, a .au zone? What is this madness? We could use these specific GLUE records, but there are cases where GLUE cannot be trusted (for example, if the DNS Resolver is very strict and believes only in-domain GLUE), so I am going to pretend we need to resolve the name ourselves:
    $ kdig AAAA demand.alpha.aridns.net.au. @2001:503:ba3e::2:30
    ;; AUTHORITY SECTION:
    au. 172800 IN NS a.au.
    […]
    ;; ADDITIONAL SECTION:
    a.au. 172800 IN A 58.65.254.73
    a.au. 172800 IN AAAA 2407:6e00:254:306::73
    […]

 

  • We still don’t have an address for the demand.alpha.aridns.net.au name:
    $ kdig AAAA demand.alpha.aridns.net.au. @2407:6e00:254:306::73
    ;; AUTHORITY SECTION:
    net.au. 86400 IN NS x.au.
    […]
    ;; ADDITIONAL SECTION:
    x.au. 86400 IN A 37.209.194.5
    x.au. 86400 IN AAAA 2001:dcd:2::5
    […]

 

  • And the next step:
    $ kdig AAAA demand.alpha.aridns.net.au. @2001:dcd:4::5
    ;; AUTHORITY SECTION:
    aridns.net.au. 14400 IN NS ari.alpha.aridns.net.au.
    […]
    ;; ADDITIONAL SECTION:
    ari.alpha.aridns.net.au. 14400 IN AAAA 2001:dcd:1::2
    ari.alpha.aridns.net.au. 14400 IN A 37.209.192.2
    […]

 

  • Now we have an IP address for demand.alpha.aridns.net.au!
    $ kdig IN AAAA demand.alpha.aridns.net.au. @2001:dcd:1::2
    ;; ANSWER SECTION:
    demand.alpha.aridns.net.au. 172800 IN AAAA 2001:dcd:1::7

 

  • Now we can return to resolving the master.dns.rocks DNS chain:
    $ kdig IN AAAA master.dns.rocks. @2001:dcd:1::7
    ;; AUTHORITY SECTION:
    dns.rocks. 86400 IN NS trubka.network.cz.
    dns.rocks. 86400 IN NS master.dns.rocks.
    ;; ADDITIONAL SECTION:
    master.dns.rocks. 86400 IN AAAA 2a01:5f0:c001:113:a::10
    master.dns.rocks. 86400 IN A 89.187.130.10

 

  • Guess what: the DNS Resolver can again pick one of two nameservers here. Let’s go easy this time and follow master.dns.rocks, because we have received GLUE records for that name together with the delegation. So we can finally ask for example.udp53.cz:
    $ kdig AAAA example.udp53.cz. @2a01:5f0:c001:113:a::10
    ;; ANSWER SECTION:
    example.udp53.cz. 60 IN CNAME example.udp53.cz.s3-website-us-east-1.amazonaws.com.

 

  • Lovely! We have just processed nine DNS queries and responses, only to be redirected to a name under .com. I am going to list just the queries needed to reach the final nameservers. Notice that we would be doomed on IPv6-only networks, as the nameservers for amazonaws.com are Legacy IP only.
    $ kdig AAAA example.udp53.cz.s3-website-us-east-1.amazonaws.com. @2001:503:ba3e::2:30
    ;; AUTHORITY SECTION:
    com. 172800 IN NS a.gtld-servers.net.
    […]

    $ kdig AAAA example.udp53.cz.s3-website-us-east-1.amazonaws.com. @2001:503:a83e::2:30
    ;; AUTHORITY SECTION:
    amazonaws.com. 172800 IN NS u1.amazonaws.com.
    amazonaws.com. 172800 IN NS u2.amazonaws.com.
    amazonaws.com. 172800 IN NS r1.amazonaws.com.
    amazonaws.com. 172800 IN NS r2.amazonaws.com.
    ;; ADDITIONAL SECTION:
    u1.amazonaws.com. 172800 IN A 156.154.64.10
    u2.amazonaws.com. 172800 IN A 156.154.65.10
    r1.amazonaws.com. 172800 IN A 205.251.192.27
    r2.amazonaws.com. 172800 IN A 205.251.195.199

    $ kdig AAAA example.udp53.cz.s3-website-us-east-1.amazonaws.com. @156.154.64.10
    ;; AUTHORITY SECTION:
    s3-website-us-east-1.amazonaws.com. 1800 IN NS ns-1133.awsdns-13.org.
    s3-website-us-east-1.amazonaws.com. 1800 IN NS ns-1919.awsdns-47.co.uk.
    s3-website-us-east-1.amazonaws.com. 1800 IN NS ns-490.awsdns-61.com.
    s3-website-us-east-1.amazonaws.com. 1800 IN NS ns-661.awsdns-18.net.

 

  • So, are we done yet? Not even close. As you can see, the resolution restarts yet again, and now we can pick from four TLD variants: 1) awsdns-13.org, 2) awsdns-47.co.uk, 3) awsdns-61.com, and 4) awsdns-18.net. Let’s pick .co.uk just to make life with the DNS more fun. How many queries do we need?
    $ kdig AAAA ns-1919.awsdns-47.co.uk. @2001:503:ba3e::2:30
    ;; AUTHORITY SECTION:
    uk. 172800 IN NS nsa.nic.uk.
    ;; ADDITIONAL SECTION:
    nsa.nic.uk. 172800 IN A 156.154.100.3
    nsa.nic.uk. 172800 IN AAAA 2001:502:ad09::3

    $ kdig AAAA ns-1919.awsdns-47.co.uk. @2001:502:ad09::3
    ;; AUTHORITY SECTION:
    awsdns-47.co.uk. 172800 IN NS g-ns-367.awsdns-47.co.uk.
    […]
    ;; ADDITIONAL SECTION:
    g-ns-367.awsdns-47.co.uk. 172800 IN AAAA 2600:9000:5301:6f00::1
    g-ns-367.awsdns-47.co.uk. 172800 IN A 205.251.193.111

    $ kdig IN AAAA ns-1919.awsdns-47.co.uk. @2600:9000:5301:6f00::1
    ;; ANSWER SECTION:
    ns-1919.awsdns-47.co.uk. 172800 IN AAAA 2600:9000:5307:7f00::1

 

  • Finally, here comes the final query we have been waiting for. Or not?
    $ kdig AAAA example.udp53.cz.s3-website-us-east-1.amazonaws.com. @2600:9000:5307:7f00::1
    ;; ANSWER SECTION:
    example.udp53.cz.s3-website-us-east-1.amazonaws.com. 60 IN CNAME s3-website-us-east-1.amazonaws.com.

 

  • In the worst case, the DNS Resolver would pick a nameserver that is not in the cache yet, such as ns-661.awsdns-18.net, and we are right back in the vicious cycle:
    $ kdig AAAA ns-661.awsdns-18.net. @2001:503:ba3e::2:30
    ;; AUTHORITY SECTION:
    net. 172800 IN NS a.gtld-servers.net.
    […]

    $ kdig AAAA ns-661.awsdns-18.net. @2001:503:a83e::2:30
    ;; AUTHORITY SECTION:
    awsdns-18.net. 172800 IN NS g-ns-467.awsdns-18.net.
    […]

    $ kdig AAAA ns-661.awsdns-18.net. @2600:9000:5301:d300::1
    ;; ANSWER SECTION:
    ns-661.awsdns-18.net. 172800 IN AAAA 2600:9000:5302:9500::1
  • And this is the final step:
    $ kdig AAAA s3-website-us-east-1.amazonaws.com. @2600:9000:5302:9500::1
    ;; AUTHORITY SECTION:
    s3-website-us-east-1.amazonaws.com. 900 IN SOA ns-1919.awsdns-47.co.uk. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400

    Now we have proof that example.udp53.cz can’t be reached over IPv6. But we don’t want to end in a bad mood, so let’s query for an IPv4 address and take home a nice souvenir in the form of a Legacy IP(v4) address:

    $ kdig +norec IN A s3-website-us-east-1.amazonaws.com. @2600:9000:5302:9500::1
    ;; ANSWER SECTION:
    s3-website-us-east-1.amazonaws.com. 5 IN A 52.216.17.18

    It took us 20 full DNS queries to resolve the example.udp53.cz domain name, and even if the DNS Resolver picked the optimal path at every step, we would still end up with eight queries.
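The first part of the walk above can be condensed into a toy model that only counts queries. This is a sketch under loud assumptions, not how any real resolver is implemented: it knows one nameserver per zone, it ignores CNAME chasing, DNSSEC and caching between lookups, and the names `NS`, `ancestors` and `count_queries` are invented for this illustration. The `glue` flag says whether the referral carries an address the resolver is willing to use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NS:
    name: str   # nameserver picked for the zone (toy model: one per zone)
    glue: bool  # True if the referral carries an address the resolver accepts


def ancestors(name):
    """Return name and its parent domains, TLD first."""
    labels = name.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


def count_queries(name, delegations, known=None, for_ns=False):
    """Count queries sent by a cold-cache resolver in this toy model.

    A nameserver that lives inside the zone it serves must have glue,
    otherwise the recursion below would never terminate.
    """
    known = set() if known is None else known
    queries = 1  # the first query always goes to a primed root server
    for zone in ancestors(name):
        ns = delegations.get(zone)
        if ns is None:
            continue  # no zone cut at this label
        if for_ns and ns.glue and ns.name == name:
            return queries  # the glue in the referral is the address we wanted
        if not ns.glue and ns.name not in known:
            # Restart from the root to find the nameserver's own address.
            queries += count_queries(ns.name, delegations, known, for_ns=True)
        known.add(ns.name)
        queries += 1  # ask the nameserver of the delegated zone
    return queries


# The choices made in the walkthrough, up to the first CNAME answer.
WALKTHROUGH = {
    "cz": NS("a.ns.nic.cz", True),
    "udp53.cz": NS("master.dns.rocks", False),
    "rocks": NS("demand.alpha.aridns.net.au", False),  # strict: glue not trusted
    "dns.rocks": NS("master.dns.rocks", True),
    "au": NS("a.au", True),
    "net.au": NS("x.au", True),
    "aridns.net.au": NS("ari.alpha.aridns.net.au", True),
}

# An in-domain setup without the shared-nameserver trick of the .cz registry.
IN_DOMAIN = {
    "cz": NS("a.ns.nic.cz", True),
    "nic.cz": NS("a.ns.nic.cz", True),
}

if __name__ == "__main__":
    print(count_queries("example.udp53.cz", WALKTHROUGH))  # 9
    print(count_queries("www.nic.cz", IN_DOMAIN))          # 3
```

With these inputs the model reproduces the nine queries needed before the CNAME shows up, and three queries for a name whose nameservers all come with glue. Every glueless, out-of-domain nameserver adds a full restart from the root.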

In the real world, a DNS Resolver uses some clever tricks to avoid part of this complexity, and most of the records are cached quickly, so the latency is not that bad. But there are other quirks when using domains from different TLDs.
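To illustrate why caching hides much of this cost, here is a minimal TTL cache sketch. It is not taken from any real resolver; the class name `TTLCache` and its methods are hypothetical, and the clock is injectable only so the example can be run without waiting. The TTL values are the ones seen in the responses above.

```python
import time


class TTLCache:
    """Store records until their TTL runs out (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # (name, rtype) -> (expires_at, value)

    def put(self, name, rtype, value, ttl):
        """Remember value for ttl seconds from now."""
        self._entries[(name.lower(), rtype)] = (self._clock() + ttl, value)

    def get(self, name, rtype):
        """Return the cached value, or None if it is missing or expired."""
        key = (name.lower(), rtype)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the stale record
            return None
        return value


if __name__ == "__main__":
    now = [0.0]  # fake clock, advanced by hand
    cache = TTLCache(clock=lambda: now[0])
    cache.put("cz.", "NS", "a.ns.nic.cz.", 172800)
    cache.put("example.udp53.cz.", "CNAME",
              "example.udp53.cz.s3-website-us-east-1.amazonaws.com.", 60)

    now[0] = 61.0  # a bit more than one minute later
    print(cache.get("example.udp53.cz.", "CNAME"))  # None
    print(cache.get("cz.", "NS"))                   # a.ns.nic.cz.
```

The delegation records near the top of the tree carry long TTLs and stay cached for days, while the short-lived records at the end of the chain have to be fetched again and again.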

Different TLDs mean different registries taking care of each top-level domain, and while most registries are well maintained, every additional registry brings another point of failure into the system.

Using different TLDs does not just mean different registries, but also different jurisdictions. Picking a “cool” TLD from a country with a totalitarian regime might not be so “cool” in the end, as the authorities there might be able to take down or hijack your domain name, or manipulate responses from your nameserver.

You might be asking: what’s a good setup then?

There are several approaches. The setup with the lowest latency on a cold cache would be to use in-the-domain nameservers, like this:

$ kdig AAAA www.nic.cz @2001:503:ba3e::2:30
;; AUTHORITY SECTION:
cz. 172800 IN NS a.ns.nic.cz.
cz. 172800 IN NS b.ns.nic.cz.
cz. 172800 IN NS c.ns.nic.cz.
cz. 172800 IN NS d.ns.nic.cz.
;; ADDITIONAL SECTION:
a.ns.nic.cz. 172800 IN A 194.0.12.1
b.ns.nic.cz. 172800 IN A 194.0.13.1
c.ns.nic.cz. 172800 IN A 194.0.14.1
d.ns.nic.cz. 172800 IN A 193.29.206.1
a.ns.nic.cz. 172800 IN AAAA 2001:678:f::1
b.ns.nic.cz. 172800 IN AAAA 2001:678:10::1
c.ns.nic.cz. 172800 IN AAAA 2001:678:11::1
d.ns.nic.cz. 172800 IN AAAA 2001:678:1::1

$ kdig AAAA www.nic.cz @2001:678:f::1
;; QUESTION SECTION:
;; www.nic.cz. IN AAAA
;; ANSWER SECTION:
www.nic.cz. 1800 IN AAAA 2001:1488:0:3::2

 

See? The answer arrives in just two steps. However, the .cz registry is a special case because the nameservers for cz and nic.cz are shared. But even without this neat trick, it would take us only three DNS queries to get to the result. Remember that any indirection increases the number of DNS queries needed to get the result, and with it the number of places where something can break.
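Whether a referral can be followed right away depends on the GLUE that comes with it. The sketch below reads kdig-style record lines such as the ones in this post and lists the NS targets that arrive without any address record. It is a rough illustration only: the function names are made up, and the parser handles just the simple `owner ttl IN type rdata` layout shown here.

```python
def parse_records(text):
    """Parse 'owner ttl IN type rdata' lines; ignore everything else."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[2] != "IN" or not fields[1].isdigit():
            continue  # section headers, commands, elisions, blank lines
        owner, ttl, _cls, rtype = fields[:4]
        records.append((owner.lower(), int(ttl), rtype, " ".join(fields[4:])))
    return records


def glueless_nameservers(text):
    """Return NS targets that have no A or AAAA record in the same response."""
    records = parse_records(text)
    ns_targets = [rdata.lower() for _, _, rtype, rdata in records if rtype == "NS"]
    glued = {owner for owner, _, rtype, _ in records if rtype in ("A", "AAAA")}
    return [ns for ns in ns_targets if ns not in glued]


# The referral for udp53.cz from the walkthrough earlier in this post.
UDP53_REFERRAL = """
;; AUTHORITY SECTION:
udp53.cz. 3600 IN NS trubka.network.cz.
udp53.cz. 3600 IN NS master.dns.rocks.
;; ADDITIONAL SECTION:
trubka.network.cz. 3600 IN A 81.91.84.116
trubka.network.cz. 3600 IN AAAA 2001:1568:b:145::1
trubka.network.cz. 3600 IN AAAA 2001:1568:b::145
"""

if __name__ == "__main__":
    # Only master.dns.rocks. comes without glue, which is why picking it
    # sent the resolver back to the root.
    print(glueless_nameservers(UDP53_REFERRAL))
```

Run against the cz referral shown just above, the same check returns an empty list, because every nameserver there comes with both A and AAAA records.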

This is the optimal setup for people who are deep into DNS and care deeply about latency. In fact, if you check the Alexa Top 10 domains, most of them use in-the-domain nameservers, because it’s simple and reduces latency.

What’s a good setup for normal domain holders?

My recommendation would be to split the responsibility between more entities to reduce the risk of one being under attack (or just making a mistake, because mistakes happen).

For example, pick one or two stable DNS providers that each use nameservers in a single domain, so that your nameserver set contains one to two domain names.

You’ll also not have to worry about latency, because nameservers shared among many domains will be cached very quickly.

Adapted from the original post on blog.nic.cz

Ondřej Surý is a Technical Fellow at CZ.NIC where he is responsible for Knot DNS and BIRD development.
