There are times when decisive action is the straightest path to success. Starting from 1 February 2019, the organizations behind open source DNS software implementations are going to deploy changes to their code that could break your domains. That day has been labelled DNS Flag day.
Do software developers want to intentionally break domains? Well, no. For years, those developers have had to include workarounds in their code so that a few domains would keep working: domains using DNS software that is not standards-compliant, or sitting behind network devices that don’t respect Internet standards. Those workarounds are coming to an end. If you run a domain name and want more information, please check out the DNS Flag day website, which includes an online testing tool.
As the guardians of the .nz namespace, InternetNZ sees it as our responsibility to investigate how this change will affect .nz, and we have been collecting information about DNS standard compliance across all .nz domains for a couple of months. The research involved was presented at LACNIC 30 and will be presented at DNS-OARC 29 in the coming weeks.
What do we test for?
The workarounds to be removed starting in February 2019 relate to a component of the DNS called EDNS (Extension Mechanisms for DNS). EDNS was created to extend the capabilities and usefulness of the DNS protocol. For example, there couldn’t be DNSSEC without EDNS.
ISC, the organization behind BIND, the de facto standard DNS implementation, created a test to verify if a DNS server responds correctly to a series of queries exploring different elements of the DNS standard, including EDNS. They have been collecting compliance data for the root zone and other domains for a while.
CZ.NIC, managers of the Czech Republic ccTLD, created a tool that tests a nameserver once, independently of how many domains it hosts, allowing bulk verification of a whole namespace like .cz or .nz.
We are using the CZ.NIC tool currently for .nz, and are checking for EDNS compliance. In the future, we will check full DNS compliance.
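The idea of testing each nameserver once, however many domains it hosts, can be sketched as follows. This is only an illustration and not the CZ.NIC tool's code: the function name servers_to_test and the delegation data are hypothetical.

```python
from collections import defaultdict


def servers_to_test(delegations):
    """Invert a domain -> nameservers map so each server appears once."""
    domains_by_server = defaultdict(set)
    for domain, nameservers in delegations.items():
        for ns in nameservers:
            # DNS names are case-insensitive and may carry a trailing dot,
            # so normalise before deduplicating.
            domains_by_server[ns.lower().rstrip(".")].add(domain)
    return domains_by_server


# Hypothetical delegation data, for illustration only.
delegations = {
    "example.nz": ["ns1.example.net.", "ns2.example.net."],
    "example.co.nz": ["NS1.example.net", "ns3.example.org"],
}

by_server = servers_to_test(delegations)
for server in sorted(by_server):
    print(server, "hosts", len(by_server[server]), "domain(s)")
```

Each unique server is then tested once, and its result can be applied to every domain listed under it.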
Results
In a coordinated effort with .cl, .cz, .se, .nu, .nz, and using the public results for the root zone, we can compare how different namespaces fare on the test. The figures below are not exhaustive but are the most compelling output.
Our first look is at the general nameserver distribution, as one nameserver can have multiple IPv4 and/or IPv6 addresses.
Figure 1 — General nameserver statistics (view data via Plotly).
Although different zones have different numbers of domains, the number of nameserver addresses is broadly similar across zones, with the exception of Sweden, which has over 10k more addresses than the rest.
Basic DNS test
dig soa ZONE @SERVER +noedns +noad +norec
For each nameserver, we send a plain DNS query, with EDNS disabled (+noedns), to confirm it responds. In general, most of the nameservers pass this test.
Figure 2 — Base DNS test (view data via Plotly).
The root zone has higher levels of correctness on this basic test because IANA imposes a set of technical tests on TLD operators. From now on, the root zone metric will be a baseline to compare other zones against.
The errors nosoa and noaa imply the server didn’t send the right response to the query, mostly due to misconfiguration.
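To make the two labels more concrete, here is a rough sketch of how they could be derived from the 12-byte header of a response. It assumes that noaa means the AA (authoritative answer) flag is not set and that nosoa means the answer section holds no SOA record. As a simplification it treats an empty answer section as nosoa, where a full check would parse the answer records. The function names are hypothetical and this is not the testing tool's own logic.

```python
import struct

QR_FLAG = 0x8000  # set on responses
AA_FLAG = 0x0400  # authoritative answer


def parse_header(message):
    """Unpack the fixed 12-byte DNS header (RFC 1035, section 4.1.1)."""
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        "!6H", message[:12]
    )
    return {
        "id": ident,
        "qr": bool(flags & QR_FLAG),
        "aa": bool(flags & AA_FLAG),
        "rcode": flags & 0x000F,
        "ancount": ancount,
    }


def classify_basic(header):
    """Return 'ok', 'nosoa' or 'noaa' for a response header (simplified)."""
    if header["ancount"] == 0:
        return "nosoa"  # assumption: empty answer stands in for "no SOA"
    if not header["aa"]:
        return "noaa"
    return "ok"


# Hand-built sample headers: a response with AA set and one answer,
# and a response with one answer but without AA.
good = struct.pack("!6H", 0x1234, 0x8400, 1, 1, 0, 0)
not_authoritative = struct.pack("!6H", 0x1234, 0x8000, 1, 1, 0, 0)

print(classify_basic(parse_header(good)))
print(classify_basic(parse_header(not_authoritative)))
```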
EDNS test
dig soa ZONE @SERVER +edns=0 +nocookie +noad +norec
With the baseline defined, we can start showing how increasingly complex queries start producing errors.
Figure 3 — EDNS test (view data via Plotly).
From this test, we can start seeing the first protocol violations. To activate EDNS, a DNS query includes an OPT record, and the server is then required to include an OPT record in its response. The noopt errors are servers not returning that record. The nsid errors are servers returning the NSID option when they were not asked to provide it!
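For readers curious what "including an OPT record" looks like on the wire, the following sketch builds an SOA query by hand, with or without EDNS. It follows the record layout in RFC 1035 and RFC 6891, but the function names and the advertised UDP payload size of 4096 are assumptions made for illustration. The bytes dig actually sends may differ.

```python
import struct

TYPE_SOA = 6
TYPE_OPT = 41
CLASS_IN = 1


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_soa_query(zone, edns=True, query_id=0x1234):
    """Build an SOA query; with edns=True an OPT record is appended."""
    arcount = 1 if edns else 0
    # Flags are all zero: a standard query with RD and AD clear,
    # matching the +norec and +noad options in the dig command.
    header = struct.pack("!6H", query_id, 0, 1, 0, 0, arcount)
    question = encode_name(zone) + struct.pack("!2H", TYPE_SOA, CLASS_IN)
    if not edns:
        return header + question
    # OPT pseudo-record: root name, type 41, CLASS carries the UDP payload
    # size (4096 is an assumed value), TTL carries extended RCODE, version
    # and flags (all zero for EDNS version 0), and RDLENGTH 0 means no options.
    opt = b"\x00" + struct.pack("!HHIH", TYPE_OPT, 4096, 0, 0)
    return header + question + opt


plain = build_soa_query("example.nz", edns=False)
with_edns = build_soa_query("example.nz", edns=True)
print(len(plain), len(with_edns))
```

The only differences between the two packets are the additional-record count in the header and the 11 trailing bytes of the OPT record.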
DO Test
dig soa ZONE @SERVER +edns=0 +nocookie +noad +norec +dnssec
Having working EDNS is essential for DNSSEC. The DO bit signals that a DNS client wants to receive DNSSEC-related records, like RRSIG (signatures) and DNSKEYs. While testing for DO bit support, we start to find higher levels of failure.
Figure 4 — DO test (view data via Plotly).
From the plot, we can see two different stories. The root, .se and .nu zones have nearly 100% of nameservers answering correctly, while .nz, .cz and .cl have slightly less than 80%, with the other 20% failing to set the DO bit in the response as required! There are also a few nameservers that time out on the query. If there is a signed domain behind those failing servers, DNSSEC will definitely break.
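The DO bit lives in the TTL field of the OPT record, which RFC 6891 splits into an extended RCODE byte, a version byte and 16 bits of flags, with DO as the top flag bit. A minimal sketch of packing and reading that field is shown below. The helper names are hypothetical and this is not how any particular testing tool is written.

```python
import struct

DO_BIT = 0x8000  # "DNSSEC OK": top bit of the 16-bit EDNS flags


def pack_opt_ttl(ext_rcode=0, version=0, do=False):
    """Pack the 4-byte OPT TTL field: extended RCODE, version, flags."""
    flags = DO_BIT if do else 0
    return struct.pack("!BBH", ext_rcode, version, flags)


def unpack_opt_ttl(ttl_bytes):
    """Split the 4-byte OPT TTL field back into its three parts."""
    ext_rcode, version, flags = struct.unpack("!BBH", ttl_bytes)
    return {"ext_rcode": ext_rcode, "version": version, "do": bool(flags & DO_BIT)}


def do_echoed(query_ttl, response_ttl):
    """True if the query set DO and the response set it as well."""
    return unpack_opt_ttl(query_ttl)["do"] and unpack_opt_ttl(response_ttl)["do"]


query_ttl = pack_opt_ttl(do=True)
compliant_response = pack_opt_ttl(do=True)
failing_response = pack_opt_ttl(do=False)

print(do_echoed(query_ttl, compliant_response))
print(do_echoed(query_ttl, failing_response))
```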
EDNS1 test
dig soa ZONE @SERVER +edns=1 +noednsneg +nocookie +noad +norec
The EDNS1 test is quite tricky. EDNS version 1 has not been defined yet, so the only version available is EDNS0. The test therefore verifies whether the nameserver handles the unsupported version correctly, and whether any network device doing packet inspection along the path can tell if the query is valid.
Figure 5 — EDNS1 test (view data via Plotly).
You can see that the root zone keeps its high compliance level, but the ccTLDs in our list fall behind, with roughly 50% of the nameservers passing the test. The expected response must carry a BADVERS return code, contain no SOA record, and include an OPT record signalling EDNS version 0. The noerror and soa cases represent a nameserver that didn’t validate the query properly. The noopt case is a nameserver that violated the standards by not returning the OPT record, as we saw above. The badversion case is a nameserver that actually responded indicating it supports EDNS version 1!
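The expected response and the failure labels can be written down as a small classifier. This is only one reading of the labels described above and not the logic of the actual testing tool. The function names are hypothetical, and the "other" label is an addition for return codes the article does not discuss. One detail taken from RFC 6891: BADVERS is return code 16, which does not fit in the 4-bit header RCODE, so its upper bits travel in the OPT record.

```python
NOERROR = 0
BADVERS = 16


def full_rcode(header_rcode, ext_rcode):
    """Combine the 4-bit header RCODE with the 8 upper bits from the OPT record."""
    return (ext_rcode << 4) | header_rcode


def classify_edns1(header_rcode, has_soa, opt):
    """Classify a response to an EDNS version 1 query.

    opt is None when the response has no OPT record, otherwise a
    (ext_rcode, version) tuple taken from the OPT TTL field.
    """
    problems = []
    if opt is None:
        problems.append("noopt")
        rcode = header_rcode
    else:
        ext_rcode, version = opt
        rcode = full_rcode(header_rcode, ext_rcode)
        if version != 0:
            problems.append("badversion")
    if rcode == NOERROR:
        problems.append("noerror")
    elif rcode != BADVERS:
        problems.append("other")  # label not from the article
    if has_soa:
        problems.append("soa")
    return problems or ["ok"]


print(classify_edns1(0, False, (1, 0)))  # BADVERS, no SOA, version 0
print(classify_edns1(0, True, (0, 0)))   # answered as if the query were valid
print(classify_edns1(0, False, (1, 1)))  # claims to support version 1
```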
OPTLIST test
dig soa ZONE @SERVER +edns=0 +noad +norec +nsid +subnet=0.0.0.0/0 +expire +cookie=0102030405060708
The OPTLIST test is intended to explore the adoption of newer DNS options, which have been added over the years. nsid, defined 11 years ago in RFC 5001, asks the server to reply with a server identification string, useful for anycast deployments. subnet, defined two years ago in RFC 7871, lets clients signal where the original DNS query came from, useful for CDN operators. expire is defined in RFC 7314 to query the EXPIRE timer in the SOA record. cookie is defined in RFC 7873 and provides a lightweight DNS transaction security mechanism against a variety of attacks. In simple terms, the test is a gauge of how new and fresh the DNS software used on the nameservers is.
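On the wire, each of these options is a code, a length and a payload placed inside the OPT record. The sketch below encodes the four options from the dig command and decodes them again. The option codes come from the RFCs named above, while the function names and the option order are assumptions, so the bytes may not match what dig sends.

```python
import struct

# EDNS option codes as assigned in the respective RFCs.
NSID = 3     # RFC 5001
SUBNET = 8   # RFC 7871
EXPIRE = 9   # RFC 7314
COOKIE = 10  # RFC 7873


def encode_option(code, data=b""):
    """Encode one EDNS option: 2-byte code, 2-byte length, then the payload."""
    return struct.pack("!HH", code, len(data)) + data


def optlist_rdata():
    """Build OPT RDATA carrying the four options used in the OPTLIST test."""
    # subnet 0.0.0.0/0: address family 1 (IPv4), source prefix length 0,
    # scope prefix length 0, and no address bytes because the prefix is /0.
    subnet = struct.pack("!HBB", 1, 0, 0)
    cookie = bytes.fromhex("0102030405060708")  # 8-byte client cookie
    return (
        encode_option(NSID)        # empty payload: "please send your NSID"
        + encode_option(SUBNET, subnet)
        + encode_option(EXPIRE)    # empty payload in a query
        + encode_option(COOKIE, cookie)
    )


def decode_options(rdata):
    """Split OPT RDATA back into a list of (code, payload) tuples."""
    options = []
    offset = 0
    while offset + 4 <= len(rdata):
        code, length = struct.unpack_from("!HH", rdata, offset)
        offset += 4
        options.append((code, rdata[offset:offset + length]))
        offset += length
    return options


rdata = optlist_rdata()
for code, payload in decode_options(rdata):
    print(code, payload.hex())
```

A server running older software may not recognise some of these codes. A compliant server ignores options it does not understand, whereas the formerr and timeout cases are servers that choke on them.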
Figure 6 — OPTLIST test (view data via Plotly).
The plot provides two views. The first shows which options are more commonly deployed, like nsid and subnet. The second shows the error cases: a few nameservers fail to respond to the query (timeout), and some return a DNS error (formerr) that means they didn’t understand the query, indicating the software is a few years old.
Why does it matter?
We started this article pointing out that changes will be introduced due to DNS Flag day. The deployment of these changes will cause currently functioning domains to stop working. We estimate around 1.2% of .nz domains will be broken, and we will notify those registrants and DNS operators about our findings using the Registrar Portal.
Final words
The Internet is a tool for innovation and disruption, but introducing innovation in the core DNS protocols has always proven difficult. There are constant demands to be backwards compatible and to avoid making big changes that will break existing features. Consequently, the DNS in particular is a protocol that has stayed the same for many years. We will be actively guarding and investigating the level of protocol compliance within the .nz namespace and reporting back our findings. Stay tuned.
Adapted from original post which appeared first on NZRS Blog.
Sebastian Castro is a DNS expert at .nz Registry Services.
Hello,
With a majority of bindtools installed on servers Internet wide, they do not support the flags needed to test this. If the tools do not support the flags, do the name servers support them?
For instance, Ubuntu 16.04.5 LTS has DiG 9.10.3-P4-Ubuntu installed, and this is the latest supported version via “apt”. Running “dig +nocookie +norec +noad +edns=1 +noednsneg soa zone @server” returns:
Invalid option: +nocookie
Is this really viable at this time?
Kind Regards,
Jeff
@Jeff
According to the article at https://kb.isc.org/docs/aa-01387, BIND 9.10 does not have cookies enabled by default, so +nocookie would be a no-op.
Anyway, you can use the web form at https://dnsflagday.net/ to test your authoritative servers without worrying about a particular dig version etc.
Thanks for the information.
I’m operating the authoritative DNS for one domain with an old DNS software version, and yes, I got some errors when I queried my domain.
But I believe that there is no actual impact for me because I don’t provide DNSSEC or IPv6 for my domain.
Please correct me if I’m wrong.
Thank you.
@Monty
This does not relate to DNSSEC or IPv6 specifically, and will most likely impact the domains you are hosting. (although it’s possible you won’t experience problems immediately)
You can read more details at https://dnsflagday.net/#dns-admins under the section “DNS server operators”. I’ll quote the first paragraph for you here, but please read the full details at the URL above:
***
After February 1st 2019 major public DNS resolver operators listed below will disable work arounds for standards non-compliance. This change will affect domains hosted on authoritative servers which do not comply either with original DNS standard from 1987 (RFC1035) or the newer EDNS standards from 1999 (RFC2671 and RFC6891). Non-compliant domains may become unreachable through these services.
***
Hello, Jamie Gillespie
Thank you very much for your prompt reply. It’s helpful.
Yes, I read the articles several times, but I still can’t understand exactly what will occur after Feb. 1st.
1. Are EDNS options used in general, even if we don’t use DNSSEC or IPv6? I thought that DNS Flag Day related to DNSSEC only.
2. Is there any actual DNS test environment before Feb. 1st? It is hard to know what will happen after Feb. 1st.
3. Also, we didn’t open 53/tcp. Some domains report OK, but some fail when queried from the DNS Flag Day site.
We monitored the traffic for some days, but there were no normal packets using 53/tcp. That’s why we blocked 53/tcp at the firewall. (No zone transfer issue.) Should I open 53/tcp as well because of DNS Flag Day?
Please advise.
Thank you!!
1. I’m not the biggest expert on EDNS, and I don’t know the exact errors you were seeing in your tests, but I will quote the following from dnsflagday.net which should be clear:
***
The main change is that DNS software from vendors named above will interpret timeouts as sign of a network or server problem. Starting February 1st, 2019 there will be no attempt to disable EDNS in reaction to a DNS query timeout.
This effectively means that all DNS servers which do not respond at all to EDNS queries are going to be treated as dead.
***
So if you were seeing timeouts during your tests, then that is where the problems will occur after tomorrow.
2. dnsflagday.net links to https://ednscomp.isc.org/ednscomp which performs the tests for you. If any tests fail, it is a sign that your DNS server is not following the required standards.
The source code for that tool is available at https://gitlab.isc.org/isc-projects/DNS-Compliance-Testing but it’s easier to just use the ednscomp.isc.org web interface.
3. Correct, you should open 53/TCP as not having that open may cause delays, failures, or other problems with DNS resolution. You can read more about that at https://kb.isc.org/docs/aa-01219
Without knowing the intimate details of your configuration it’s difficult to give specific advice, but as you mentioned zone transfers you should of course ensure you have configured AXFRs correctly for your servers before opening 53/TCP so you don’t open up any information disclosure vulnerabilities.
I hope this information is helpful to you.
thank you!!