Extended DNS Errors: Unlocking the Full Potential of DNS Troubleshooting

By Yevheniya Nosyk on 28 Sep 2023

The Domain Name System (DNS) has traditionally relied on response codes to signal anomalies, but these codes offer little help in pinpointing the root causes of failures. This shortcoming was addressed in RFC 8914, which introduced Extended DNS Errors (EDEs), a new mechanism to provide extra feedback on DNS resolutions. At Laboratoire LIG – Université Grenoble Alpes, we recently studied the implementation of this proposed standard and enumerated domain misconfigurations in the wild. This blog post summarizes the key findings of our paper, which was accepted at the Internet Measurement Conference (IMC).

Background

EDEs rely on EDNS(0), defined in RFC 6891, to carry data inside the OPT resource record using option code 15. As of September 2023, the EDE codes registry at IANA contains 30 entries, five of which were added after the release of the original RFC 8914. Table 1 presents them all. The codes cover different aspects of the DNS, such as DNSSEC validation (1, 2, 5-12, 25, 27), caching (3, 13, 19, 29), resolver policies (4, 15-18, 20), software operation (14, 21-23), and so forth. These EDE codes exist independently of traditional response codes, and the EDE specification does not prohibit any combination of the two. Importantly, any DNS system, whether a recursive resolver, a forwarder, or an authoritative nameserver, can generate, forward, and parse EDE codes.
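To make the wire format concrete, here is a minimal sketch of how the EDE option could be pulled out of an OPT record, assuming the layout from RFC 8914: a 16-bit INFO-CODE followed by optional UTF-8 EXTRA-TEXT. The function names are our own illustrative choices and do not come from any resolver's code base.

```python
import struct

EDE_OPTION_CODE = 15  # EDNS(0) option code assigned to Extended DNS Errors


def parse_edns_options(opt_rdata: bytes):
    """Yield (option_code, option_data) pairs from the RDATA of an OPT record."""
    offset = 0
    while offset + 4 <= len(opt_rdata):
        code, length = struct.unpack_from("!HH", opt_rdata, offset)
        offset += 4
        yield code, opt_rdata[offset:offset + length]
        offset += length


def parse_ede(option_data: bytes):
    """Split one EDE option payload into (info_code, extra_text)."""
    if len(option_data) < 2:
        raise ValueError("an EDE option must carry a 2-byte INFO-CODE")
    (info_code,) = struct.unpack_from("!H", option_data, 0)
    # EXTRA-TEXT is optional, so the remainder may be empty.
    extra_text = option_data[2:].decode("utf-8", errors="replace")
    return info_code, extra_text


def extract_edes(opt_rdata: bytes):
    """Return every EDE found in the OPT RDATA; a response may carry several."""
    return [
        parse_ede(data)
        for code, data in parse_edns_options(opt_rdata)
        if code == EDE_OPTION_CODE
    ]


# Hand-built sample: one EDE option with INFO-CODE 22 and a short text.
sample_text = b"sample extra text"
sample_rdata = struct.pack("!HHH", EDE_OPTION_CODE, 2 + len(sample_text), 22) + sample_text
print(extract_edes(sample_rdata))
```

Because the parser walks every option in the record, other EDNS(0) options sharing the same OPT record are skipped rather than misread.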

Code  Description                    Code  Description
0     Other Error                    15    Blocked
1     Unsupported DNSKEY Algorithm   16    Censored
2     Unsupported DS Digest Type     17    Filtered
3     Stale Answer                   18    Prohibited
4     Forged Answer                  19    Stale NXDOMAIN Answer
5     DNSSEC Indeterminate           20    Not Authoritative
6     DNSSEC Bogus                   21    Not Supported
7     Signature Expired              22    No Reachable Authority
8     Signature Not Yet Valid        23    Network Error
9     DNSKEY Missing                 24    Invalid Data
10    RRSIGs Missing                 25    Signature Expired before Valid
11    No Zone Key Bit Set            26    Too Early
12    NSEC Missing                   27    Unsupported NSEC3 Iterations Value
13    Cached Error                   28    Unable to conform to policy
14    Not Ready                      29    Synthesized
Table 1 — Registered EDE codes.
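
For scripting purposes, Table 1 and the grouping given in the Background section can be written down as a small lookup. This is only a sketch: the dictionary and function names are illustrative, and codes that the text does not assign to a group (0, 24, 26, 28) fall back to "other".

```python
EDE_NAMES = {
    0: "Other Error", 1: "Unsupported DNSKEY Algorithm",
    2: "Unsupported DS Digest Type", 3: "Stale Answer",
    4: "Forged Answer", 5: "DNSSEC Indeterminate", 6: "DNSSEC Bogus",
    7: "Signature Expired", 8: "Signature Not Yet Valid",
    9: "DNSKEY Missing", 10: "RRSIGs Missing", 11: "No Zone Key Bit Set",
    12: "NSEC Missing", 13: "Cached Error", 14: "Not Ready",
    15: "Blocked", 16: "Censored", 17: "Filtered", 18: "Prohibited",
    19: "Stale NXDOMAIN Answer", 20: "Not Authoritative",
    21: "Not Supported", 22: "No Reachable Authority", 23: "Network Error",
    24: "Invalid Data", 25: "Signature Expired before Valid",
    26: "Too Early", 27: "Unsupported NSEC3 Iterations Value",
    28: "Unable to conform to policy", 29: "Synthesized",
}

# Grouping as listed in the Background section of this post.
EDE_CATEGORIES = {
    "DNSSEC validation": {1, 2, 5, 6, 7, 8, 9, 10, 11, 12, 25, 27},
    "caching": {3, 13, 19, 29},
    "resolver policies": {4, 15, 16, 17, 18, 20},
    "software operation": {14, 21, 22, 23},
}


def categorize(code: int) -> str:
    """Return the group a code belongs to, or "other" if none is listed."""
    for category, codes in EDE_CATEGORIES.items():
        if code in codes:
            return category
    return "other"


def describe(code: int) -> str:
    """Render a code as 'number (name) [category]' for logs and reports."""
    name = EDE_NAMES.get(code, "Unregistered")
    return f"{code} ({name}) [{categorize(code)}]"


print(describe(6))
print(describe(22))
```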

Implementation

As of May 2023, EDEs are implemented by major resolver software vendors (BIND9, Unbound, Knot Resolver, PowerDNS Recursor) and public resolvers (Cloudflare, Quad9, OpenDNS). Note that Google DNS announced its support for RFC 8914 in July 2023, two months after our experiments.

We wanted to know which issues cause recursive resolvers to return EDE codes. To answer this question, we set up 63 domains reflecting different misconfigurations and corner cases, such as erroneous DNSSEC configurations (wrong keys, signatures, digests, very old/new algorithms), unreachable nameservers, restrictive ACLs, and so forth. Refer to https://extended-dns-errors.com for a full list of domains and feel free to use them for your own tests.
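
For readers who want to try the test domains themselves, the following is a rough standard-library sketch of one way to send a query with an EDNS(0) OPT record and read any EDEs from the reply. It is not the tooling used in our study. The helper names, the default resolver address 1.1.1.1 (Cloudflare's public resolver), the 1232-byte UDP payload size, and the fixed query ID are all assumptions made for illustration.

```python
import socket
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a DNS query (default type A) with an empty EDNS(0) OPT record."""
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QCLASS = IN
    # OPT pseudo-record: root name, TYPE 41, CLASS = UDP payload size,
    # TTL = extended RCODE and flags (all zero), RDLENGTH = 0
    opt = b"\x00" + struct.pack("!HHIH", 41, 1232, 0, 0)
    return header + question + opt


def skip_name(msg: bytes, offset: int) -> int:
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = msg[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:  # compression pointer occupies two bytes
            return offset + 2
        offset += 1 + length


def extract_edes(msg: bytes):
    """Return (info_code, extra_text) pairs found in a DNS message."""
    qdcount, ancount, nscount, arcount = struct.unpack_from("!HHHH", msg, 4)
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(msg, offset) + 4  # skip QTYPE and QCLASS
    edes = []
    for _ in range(ancount + nscount + arcount):
        offset = skip_name(msg, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack_from("!HHIH", msg, offset)
        offset += 10
        rdata = msg[offset:offset + rdlength]
        offset += rdlength
        if rtype != 41:  # only the OPT record carries EDNS(0) options
            continue
        pos = 0
        while pos + 4 <= len(rdata):
            code, length = struct.unpack_from("!HH", rdata, pos)
            pos += 4
            if code == 15 and length >= 2:  # option code 15 = EDE
                (info_code,) = struct.unpack_from("!H", rdata, pos)
                text = rdata[pos + 2:pos + length].decode("utf-8", "replace")
                edes.append((info_code, text))
            pos += length
    return edes


def query_ede(name: str, resolver: str = "1.1.1.1", timeout: float = 3.0):
    """Send one UDP query to `resolver` and return the EDEs in its response."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (resolver, 53))
        response, _ = sock.recvfrom(4096)
    return extract_edes(response)
```

Calling `query_ede("valid.extended-dns-errors.com")` would be one way to check a correctly configured test domain. The sketch does not retry over TCP when a response is truncated and does not verify the response ID, so treat it as a starting point only.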

Next, we queried Cloudflare, Quad9, and OpenDNS, as well as our own instances of BIND 9.19.9, Unbound 1.16.2, PowerDNS Recursor 4.8.2, and Knot Resolver 5.6.0. Overall, our 63 test domains generated 12 different EDE codes. Only four of the 63 test cases produced the same result across all seven tested systems: the no-ds, nsec3-iter-200, unsigned, and valid subdomains did not trigger any extended error code. The following factors contributed to the inconsistency among the remaining 94% of tests:

  • Some systems implemented a subset of EDE codes that may not cover all our test cases. For example, as a first step, Unbound focused on DNSSEC-related errors.
  • Some EDE codes depend on the individual resolver’s capabilities. For example, the Cloudflare public resolver was the only system to return the Unsupported DNSKEY Algorithm code when resolving a domain name signed with the Ed448 algorithm.
  • Some EDE codes are more specific than others, but still point to the same problem. The majority of DNSSEC-related problems were signalled with either DNSKEY Missing or DNSSEC Bogus extended error codes, depending on the software.
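
The consistency check described above can be sketched as follows. The resolver labels, the `hypothetical-broken` test case, and its codes are made up for illustration and are not our measured results; only the counts 4 and 63 come from the text.

```python
def consistent_cases(results):
    """Return the test cases for which every resolver reported the same EDE set.

    `results` maps test case -> {resolver name -> frozenset of EDE codes}.
    """
    return sorted(
        case
        for case, per_resolver in results.items()
        if len(set(per_resolver.values())) == 1
    )


# Illustrative input only (hypothetical resolvers and codes).
sample = {
    "valid": {"resolver-a": frozenset(), "resolver-b": frozenset()},
    "hypothetical-broken": {
        "resolver-a": frozenset({6}),  # DNSSEC Bogus
        "resolver-b": frozenset({9}),  # DNSKEY Missing
    },
}
print(consistent_cases(sample))

# Share of inconsistent test cases, using the counts reported in the text.
total_cases, consistent = 63, 4
inconsistent_share = (total_cases - consistent) / total_cases
print(f"{inconsistent_share:.0%}")
```

Comparing sets rather than single codes matters because a response may carry more than one EDE.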

Misconfigurations in the wild

We now set out to discover the most prevalent issues in the wild. We gathered a dataset of more than 303 million registered domains across 1,475 TLDs and requested Cloudflare public DNS to resolve their A records. Overall, 17.7 million domain names triggered 14 individual EDE codes or their combinations. 

Lame delegations were the most common issue: 14.8 million domains triggered the No Reachable Authority and/or Network Error EDEs. These codes indicate that the recursive resolver could not reach some or all of the domain’s authoritative nameservers. Cloudflare used the EXTRA-TEXT field of the EDE entry to report that some nameservers returned REFUSED or SERVFAIL response codes and therefore did not serve authoritative data.

DNSSEC misconfigurations are another prevalent problem. Expired, missing, or not-yet-valid signatures, missing keys or proofs of non-existence, DNSKEYs that do not correspond to DS records, and broken chains of trust all make domains inaccessible to end users behind validating DNS resolvers. Unsupported cryptographic algorithms, by contrast, do not cause resolution to fail; the response is instead accompanied by the Unsupported DNSKEY Algorithm or Unsupported DS Digest Type code. Finally, two debugging EDEs signalled that we were served stale answers (Stale Answer) or a previously cached SERVFAIL (Cached Error).
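
To put the reported counts in proportion, here is a quick back-of-the-envelope calculation. The dataset size is given only as "more than 303 million", so the first share is approximate, and the variable names are ours.

```python
total_domains = 303e6      # "more than 303 million" registered domains
domains_with_ede = 17.7e6  # domains that triggered at least one EDE code
lame_delegations = 14.8e6  # No Reachable Authority and/or Network Error

share_with_ede = domains_with_ede / total_domains
share_lame = lame_delegations / domains_with_ede

print(f"Domains with an EDE: about {share_with_ede:.1%} of the dataset")
print(f"Lame delegations: about {share_lame:.1%} of domains with an EDE")
```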

Interestingly, 2.47 million domain names under two European ccTLDs triggered the RRSIGs Missing EDE code without leading to DNSSEC validation failures. We reached out to one of the TLD operators, who explained that although the TLD zone was correctly configured, Cloudflare DNS signalled a problem with a so-called standby KSK, that is, a key published in the zone file in case an emergency key rollover is needed, but not actively used to establish the chain of trust. We identified another 22 public suffixes and TLDs with standby DNSSEC keys triggering the same error. We contacted Cloudflare and reported our findings. They, in turn, confirmed that this was expected behaviour and updated their documentation to state that “key rollover in-progress, stand-by key, and attacker stripping signatures” may trigger the RRSIGs Missing EDE.

Conclusions

Our measurements revealed that all the systems implementing RFC 8914 were successful in determining root causes of misconfigurations with different levels of specificity. Moreover, this standard is particularly useful to enumerate misconfigurations at scale. Therefore, we believe that EDE is a promising technique that assists DNS operators, domain owners, and end clients in identifying and resolving DNS issues.

Yevheniya Nosyk is a PhD student at Université Grenoble Alpes (France), working on DNS and network security. The blog and the paper reflect the work of Yevheniya along with her co-contributors Maciej Korczyński and Andrzej Duda, also from Université Grenoble Alpes.

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.