DNS at IETF 118

By Geoff Huston on 29 Nov 2023

The IETF met in Prague in the first week of November 2023, and, as usual, there was a flurry of activity in the DNS-related Working Groups. Here’s a roundup of those DNS topics I found to be of interest at that meeting.

Rethinking the DNS

Prior to IETF meetings, there is an opportunity for informal discussions, under the label of a ‘hackathon’. The topic in the DNS was to reconsider the DNS. A major constraint in any such discussion is that the installed base of the DNS is highly resistant to many forms of change. The namespace definition, the data model of the DNS, and the model of the collection of delegated authorities assembled in a hierarchical manner are all fixed. With so much that must be retained, such conversations about a ‘new’ DNS quickly turn to discussions about incremental evolution that preserve backward compatibility.

There are a couple of areas of useful consideration. One is the way that DNS uses the underlying transport layer, its model of a preference for using UDP for queries and responses and the visible trend of placing more information into DNS responses. The work on DNS over HTTPS (DoH) opens some interesting conversations about using a reliable, authenticated and encrypted streaming transport by default, and also opens up a conversation about push vs pull as a model of information delivery. However, this year’s hackathon headed in a different direction, namely the delegation record in the DNS.

Confusion over the nameserver (NS) record has been a significant issue for the DNS over its entire lifespan. It’s a piece of data that is held in both the child and parent zones, yet, confusingly, it’s the child that is authoritative for this data, while logically in terms of the DNS nameserver discovery process, it’s the parent’s copy of this data that is used to guide the name resolution process. So, the thought process is to consider what a compound delegation record might look like.

These days such a record may refer to multiple zone hosting providers, it may allow for suggested query transports in a similar manner to the SVCB record, and it would likely be authoritative at the parent and not be part of the child zone (similar to the existing DS record).

It’s an interesting area of thought for the DNS. It takes the view that the DNS is a provisioning protocol for accessing named applications and services, and applies that same view to the DNS resolution protocol itself, in this case to the task of locating and querying the nameservers for a zone.

The hackathon report at the DNSOP Working Group meeting made mention of a new DELEG record that was able to carry this revised form of delegation in the DNS, but the report was also quick to note that this idea would probably be developed further in the coming months, and that what might emerge as a proposal may not necessarily include a DELEG resource record per se.

The report noted that the administrative structure of the upper layers of the DNS hierarchy, with shared registry operators, multiple registrars, and a collection of zone publishers has as much influence on the evolution of DNS design as the technical issues of efficiency of name resolution within the DNS protocol. I suspect that admitting this new reality of the administrative complexity of these outsourcing relationships is a major concession within many hardcore DNS circles!

Domain Control Validation

The DNS is used for many things besides the simple translation of a DNS name to an IP address. Some services want to know if the party requesting a service for a domain name is the real controller of that name, and the most common way of doing this is to ask the requestor to place a token into the domain name to demonstrate that level of control. It seems to have become a bit of a mess!

$ dig +short TXT bbc.co.uk
"google-site-verification=ITX3CwHXxGVfkCmhF4eSwdfo8h2ZGLAZ3zRpYvZi5XA"
"google-site-verification=RaiMXJBIiFvqXHd43kv_ekzmXT2l8ibq5Xy0mulndvU"
"voUGv5zARbEV516E/S8Ugsy9/FOgDGg4n/rpmKZQRROVOj0+2tgzKw3Tk9+Ks6qVbNKU18KTrR5khxTQutDvBg=="
"miro-verification=1a94b0fef7a6d5136a272d5cb425e8dc034e8cfc"
"apple-domain-verification=jFFO0rdS9IrxgWUR"
"msfpkey=3e29l8m08bqxp19k63t73fj5b"
"dropbox-domain-verification=l5djk65wpy3z"
"atlassian-domain-verification=SQsgJ5h/FqwMTXuSG/G4Nd1Gx6uX2keREOsZSa22D5XT46EsEuyaic8Aej4cR4Tr"
"2RLXso9TrRPyhWOEhYggL0U/r1D+g8H7z9RqDBOmcJjSbj88TobGKimtkCrXZNBkDXQDj89lS4mDskNOJyWLdg=="
"MSUEdqCJpCtrI1JuH2-U"
"MS=ms10378910"
"docusign=a10ad7b6-cf7e-472d-8157-23061f5b5116"
"adobe-idp-site-verification=9b850a4a56e3fac19aea1e0ac5db302e5cefab444cd73519dce1c72ccd4db058"
"Huddle"
"docusign=50f10407-e3e4-4f6a-aae4-712d4eb31329"
"docker-verification=aab67462-78f7-4ade-a86b-358645923430"
"J0kgGm0XqA3/6pLD4DHeC5x/dAduzT809P1Iwx/PRCYvVS32rv75RIHKC2aVz47dJxKhPlxGf3h3KXiL6+dyXw=="
"_globalsign-domain-verification=AQ2dURU9RbDOuheuLrx89LSUlA_btgMS6vmFXngBtE"
"v=spf1 a ip4:212.58.224.0/19 ip4:132.185.0.0/16 ip4:78.136.53.80/28 ip4:78.136.14.192/27 ip4:89.234.10.72/29 ip4:89.234.53.236 ip4:212.111.33.181 ip4:78.137.117.8 ip4:46.37.176.74 ip4:185.184.237.181" " ip4:185.119.233.144/30 ip4:185.119.232.158 +include:sf.sis.bbc.co.uk +include:spf.messagelabs.com ~all"

The more this technique is used by various service operators, the greater the number of TXT records that are stashed at the zone apex. Of course, a querier can’t just ask for a particular TXT record. Each query gets the entire bundle of records!

There is a proposal to pull these bundles of zone apex TXT records apart by using service-specific challenge records and creating specialized records such as:

_foo-challenge.example.com.  IN   TXT  "3419sdqa32453243d206c4"
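
The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration only: the service label `foo` comes from the example above, while the helper names `challenge_name` and `pick_token` are hypothetical and not part of the proposal. It contrasts fishing one provider's token out of the apex TXT bundle with building a service-specific owner name that can be queried directly.

```python
def challenge_name(service: str, domain: str) -> str:
    """Build the service-specific owner name, e.g. _foo-challenge.example.com."""
    return f"_{service}-challenge.{domain.rstrip('.')}."


def pick_token(apex_txt: list[str], prefix: str) -> list[str]:
    """Scan an apex TXT bundle for 'prefix=value' strings and return the values."""
    marker = prefix + "="
    return [txt[len(marker):] for txt in apex_txt if txt.startswith(marker)]


# A few strings taken from the bbc.co.uk answer shown earlier.
apex_bundle = [
    "apple-domain-verification=jFFO0rdS9IrxgWUR",
    "MS=ms10378910",
    "Huddle",
    "dropbox-domain-verification=l5djk65wpy3z",
]

# Today: fetch the whole bundle, then filter for the one token of interest.
print(pick_token(apex_bundle, "dropbox-domain-verification"))

# Proposed: query a name that only ever holds this service's token.
print(challenge_name("foo", "example.com"))
```

With the apex approach the verifier has to fetch and filter every TXT string in the bundle, while the prefixed name keeps each service's token in its own RRset.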

The proposal contains a useful survey of currently used techniques in Appendix A, which is helpful, but otherwise, the proposal is unclear as to what problem it is attempting to solve and why. It should be remembered these days that the DNS is quite convoluted in terms of roles and responsibilities. The entity that is the notional zone holder (or ‘owner’) may not be the same as the zone administrator who maintains the zone content, who, in turn, may not be the zone publisher (or publishers). When we talk about ‘validation’ which of these parties is providing validation? Should a validation structure make these distinctions in various name administrative roles explicit?

The techniques we use today for name validation certainly have their idiosyncrasies, but in many ways, it’s a simple hack that works adequately. It’s unclear what additional value this proposal is bringing to the table.

Generalized notify

There are several theories about why DNSSEC is not being taken up with widespread enthusiasm.

Aside from the obvious issues related to the efficiency of name resolution, such as dealing with larger responses and addressing UDP truncation by switching to TCP, there is an additional overhead due to the extra queries involved in DNSSEC validation. Additionally, it’s worth noting that very few stub resolvers at the network’s user edge perform validation, leaving the last hop in DNS resolution essentially unprotected.

Aside from these rather obvious (and quite compelling) performance and efficiency reasons to hold off DNSSEC-signing a zone, there remains a persistent school of thought that the real showstopper for DNSSEC adoption is the lack of a standard way for a delegated zone to communicate the DS record to the parent zone.

Almost ten years ago, in September 2014, RFC 7344 proposed an automated way of doing this. This RFC defined a new pair of resource records, CDS and CDNSKEY, which are respectively a digest of the zone’s DNSKEY in the same format as a DS record, and the zone’s DNSKEY records.

Like other entries in the delegated zone, these records are signed with the zone’s key. The basic approach is that the parent regularly checks the child zone for published CDS records, and if a new value is detected then the parent lifts a copy of the records from the child zone, performs a standard DNSSEC validation on the value, and then adds them into the parent zone as a new DS record. The parent may use the CDS record and copy it to the parent zone DS record, or it may prefer to use its own hash function over the child zone key, in which case it would use the CDNSKEY and perform the hash operation on that key value.
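
The second option, where the parent hashes the child's CDNSKEY itself, can be sketched as follows. This is a rough Python illustration of the DS digest and key tag calculations described in RFC 4034, assuming SHA-256 (digest type 2). The key material is a placeholder of 64 zero bytes, not a real key, and the function names are hypothetical. The DNSSEC validation of the CDNSKEY RRset and the acceptance policy that RFC 7344 expects of the parent are left out entirely.

```python
import base64
import hashlib
import struct


def name_to_wire(name: str) -> bytes:
    """Encode a domain name in canonical (lowercase) DNS wire format."""
    wire = b""
    for label in name.rstrip(".").lower().split("."):
        if label:
            raw = label.encode("ascii")
            wire += bytes([len(raw)]) + raw
    return wire + b"\x00"


def dnskey_rdata(flags: int, protocol: int, algorithm: int, key_b64: str) -> bytes:
    """Assemble DNSKEY RDATA: flags, protocol, algorithm, public key."""
    return struct.pack("!HBB", flags, protocol, algorithm) + base64.b64decode(key_b64)


def key_tag(rdata: bytes) -> int:
    """Key tag checksum from RFC 4034, Appendix B."""
    acc = 0
    for i, byte in enumerate(rdata):
        acc += byte if i & 1 else byte << 8
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def ds_from_dnskey(owner: str, rdata: bytes) -> tuple[int, int, int, str]:
    """Return (key tag, algorithm, digest type 2, SHA-256 digest in hex)."""
    digest = hashlib.sha256(name_to_wire(owner) + rdata).hexdigest()
    return key_tag(rdata), rdata[3], 2, digest.upper()


# Placeholder key material: 64 zero bytes standing in for an ECDSA P-256 key.
fake_key = base64.b64encode(bytes(64)).decode()
rdata = dnskey_rdata(257, 3, 13, fake_key)
print(ds_from_dnskey("example.com", rdata))
```

The four values returned are the fields of the DS record that the parent would publish for the delegation.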

Several domain admins have taken up this approach over the past decade. On the positive side, it eliminates a whole new set of administrative processes to pass a value from the child zone to the parent, as the parent simply performs a regular poll against the child zone’s CDS record to detect a change in value.

But in the DNS, nothing is ever simple. A key question is how responsive should this polling system be? A polling interval of months is clearly just too infrequent, while a polling interval of fractions of a second is also clearly going too far the other way!

A highly responsive system can imply a very high polling load if the parent zone contains many delegations, while a more infrequent polling rate causes slower convergence between parent and child zones for DNSSEC key transitions. More generally, polling is a highly inefficient way of communicating change.
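
The trade-off is easy to put numbers on. The Python sketch below assumes a hypothetical parent zone with one million delegations and counts a single query per delegation per poll. A real scanner would send more than that, since it has to fetch and validate the records, possibly from several nameservers.

```python
def polling_qps(delegations: int, interval_seconds: float) -> float:
    """Average queries per second needed to poll every delegation once per interval."""
    return delegations / interval_seconds


def worst_case_lag_hours(interval_seconds: float) -> float:
    """A change made just after a poll waits one full interval to be seen."""
    return interval_seconds / 3600


# Hypothetical parent zone with one million delegations.
for label, interval in [("hourly", 3600), ("daily", 86400), ("weekly", 604800)]:
    qps = polling_qps(1_000_000, interval)
    lag = worst_case_lag_hours(interval)
    print(f"{label:>6}: {qps:8.1f} qps, up to {lag:5.0f} h lag")
```

Shortening the interval cuts the convergence delay in direct proportion, but raises the steady query load by the same factor, and almost all of those queries find nothing new.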

This line of thought leads to the concept of borrowing the DNS mechanism used by a primary nameserver to inform its secondary servers that the zone has changed, namely the NOTIFY mechanism (RFC 1996). In a more general case of extending these NOTIFY messages, the child zone could send NOTIFY(CDS) and NOTIFY(CDNSKEY) messages to the parent to trigger a poll for new data from the child zone.

This could also include NOTIFY(CSYNC) messages to allow the child to directly signal NS record changes to the parent. This generalized NOTIFY is described in a draft that was introduced to the DNSOP Working Group in this week’s meeting. This work also proposes a new DNS record type, tentatively referred to as NOTIFY record, which is used in the parent zone to publish details about where such generalized notifications should be sent for each delegation.
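
As a rough illustration of what such a message looks like on the wire, the Python sketch below builds a NOTIFY request whose question section names the child zone and the CDS type. The opcode and RR type numbers are the registered values. The zone name `child.example.` and the function names are hypothetical, and setting the AA flag follows the RFC 1996 example, which is an assumption for the generalized case. Where to send the message is exactly what the proposed record in the parent zone is meant to answer, so the sketch stops short of sending anything.

```python
import struct

OPCODE_NOTIFY = 4                             # RFC 1996
TYPE_CDS, TYPE_CDNSKEY, TYPE_CSYNC = 59, 60, 62
CLASS_IN = 1


def name_to_wire(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in the root label."""
    wire = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            wire += bytes([len(raw)]) + raw
    return wire + b"\x00"


def build_notify(zone: str, rrtype: int, msg_id: int = 0x1234) -> bytes:
    """Build a NOTIFY request with a single question for (zone, rrtype)."""
    # QR=0 (request), opcode=NOTIFY, AA set as in the RFC 1996 example.
    flags = (OPCODE_NOTIFY << 11) | 0x0400
    header = struct.pack("!HHHHHH", msg_id, flags, 1, 0, 0, 0)
    question = name_to_wire(zone) + struct.pack("!HH", rrtype, CLASS_IN)
    return header + question


msg = build_notify("child.example.", TYPE_CDS)
print(msg.hex())
```

On receipt, the parent side would treat this as a hint to go and query the child for its CDS RRset, rather than as the data itself.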

Johan Stenstam has pointed out that this could all be simplified if instead of coupling a NOTIFY message to trigger a pull, we could just use the DNS UPDATE mechanism (DNS Dynamic Updates), which could generalize this dynamic UPDATE to include NS as well as DS/DNSKEY records, without a prerequisite of a DNSSEC-signed zone.

This is not a novel idea and was described in a draft from Mark Andrews back in 2013. Having new thoughts in the DNS is harder than it looks! Johan’s point is that the original work required some effort to understand where to send the UPDATE, and for this reason, work was abandoned, yet the same problem is addressed in the generalized NOTIFY draft. NOTIFY and UPDATE are alternative methods for parent synchronization.

The key insight here is that sending the UPDATE to a DNS ‘service’ destination rather than the primary nameserver allows the parent zone admin to implement its own checks on the contents of the UPDATE before applying them. It also can work for both signed and unsigned child zones.

There is no doubt that the area of the DNS that uses shared data is one of the more operationally troubled areas of misconfiguration in the DNS. In the case of delegation NS records, the data used by resolvers is the parent’s non-authoritative copy of such records in the downward traversal of delegations to find the authoritative nameservers for the domain name, while the child’s copy of the NS records is the authoritative version of the data.

In the case of the delegation signer (DS) records, the authoritative data is published by the parent, but the data is derived from the DNSKEY key value that is authoritative in the child domain. Whenever there is replicated data in the distributed system the key question is: What should a client do when the various sources of the data disagree with each other? The DNS has no answers here.
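
Detecting the disagreement is the easy part. The Python sketch below, using made-up nameserver names and a hypothetical `compare_ns` helper, compares the parent's delegation NS set with the child's apex NS set. It only reports the mismatch and makes no attempt to answer the harder question of which copy a client should believe.

```python
def normalize(names):
    """Lowercase and strip the trailing dot so names compare cleanly."""
    return {n.rstrip(".").lower() for n in names}


def compare_ns(parent_ns, child_ns):
    """Report how the parent's delegation NS set differs from the child's apex NS set."""
    parent, child = normalize(parent_ns), normalize(child_ns)
    return {
        "consistent": parent == child,
        "only_in_parent": sorted(parent - child),
        "only_in_child": sorted(child - parent),
    }


# Made-up data: the child has replaced ns2 with a new server, the parent has not caught up.
report = compare_ns(
    ["ns1.example.net.", "ns2.example.net."],
    ["NS1.example.net.", "ns3.example.org."],
)
print(report)
```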

The DNS is getting to a point where the collection of signalling and action tools is large enough to mean that there are multiple ways to customize these existing tools to achieve a desired outcome. The parent domain might be happy using polling to detect changes in the child domain for DNSSEC-signed domains, or they might prefer that the child domain signals the change of key with a NOTIFY signal, and then leave it to the parent to pick up the new data via a query. There is also the option of the child passing the new data to the parent via an UPDATE.

Deep space DNS

When protocol development work for an ‘InterPlanetary Internet’ first started, it was a DARPA-funded project supporting work by NASA, MITRE, and others to go beyond simple point-to-point long-distance communications and examine how to support a network of communicating devices operating in the very high delay environment of deep space.

The work evolved into the more generic work of ‘delay-tolerant networks’ and the IETF published RFC 4838 and RFC 5050 as experimental individual contributions in 2007. In many ways, it’s a form of returning to the past of dial-up store and forward relay networking. The bundle approach was taken up by the Delay-Tolerant Networking Working Group revising RFC 5050 with a set of RFCs: 9171, 9172, 9173, and 9174, in January 2022.

Of course, many of the networking issues remain in such hostile environments, including naming and name resolution. Marc Blanchet has been working on the issues that lurk behind a desire to map the DNS into such deep space environments. Approaches include pre-loading local caches with resolution outcomes, a mapping of needed names and RR values into a ‘special’ version of a deep space zone, a new root zone for this namespace, or some form of split-horizon DNS.

The question posed by the presentation of DNS in deep space is whether this is a topic of interest to the DNSOP Working Group. It does seem to be a very esoteric area of application of the DNS and somehow, I just can’t see DNSOP spending its collective time and effort on this. But my track record of guessing what Working Groups choose to do and choose not to do is pretty poor, so any outcome is possible!

Compact Denial of Existence in DNSSEC

There are two fundamental performance and efficiency criticisms of DNSSEC. The first is that the inclusion of signed records in DNS responses bloats the DNS response size and this makes DNS over UDP an uncomfortable situation that requires UDP fragmentation or a rapid transition to TCP. The second is that validation of DNSSEC-signed responses can be very time-consuming when using incremental queries to assemble the validation path.

The DNS response size has been seen as a challenge, particularly in the case of signed NXDOMAIN responses. These negative responses contain the zone’s SOA record (to let the querier know how long to cache the negative answer) and its RRSIG signature; an NSEC record that spans the query name and its RRSIG signature; and an NSEC record for a wildcard, to indicate that there is no wildcard in the zone, and its RRSIG signature. That’s three resource records and three signature resource records. At this point, large responses are more likely, particularly with RSA-2048 crypto.

Work by Cloudflare in 2016 pointed out that responding in that manner to a query where the query name does not exist is perhaps more information than what was strictly asked for. The query contains a query name and a query type. A very minimal negative response to such a query would be to indicate that the requested query type does not exist for this name (NODATA). The subtle shift in this NODATA response is to indicate that all query types other than the type in the query itself exist in the zone.

This approach has been taken up in DNSOP under the name of ‘Compact Denial of Existence’. The draft defines NXNAME, a pseudo-resource record type, used in the NSEC type bit map for non-existent names.
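
To show where NXNAME sits in a response, here is a minimal Python sketch of the NSEC type bitmap encoding from RFC 4034, section 4.1.2. The type numbers for RRSIG and NSEC are the registered ones. The value 128 used for NXNAME is an assumption about the code point and should be checked against the current specification. The function names are hypothetical.

```python
TYPE_RRSIG, TYPE_NSEC = 46, 47
TYPE_NXNAME = 128   # assumed value for the NXNAME pseudo-type


def encode_type_bitmap(types):
    """Encode RR types as an NSEC type bitmap (RFC 4034, section 4.1.2)."""
    windows = {}
    for t in sorted(set(types)):
        window, offset = divmod(t, 256)
        bitmap = windows.setdefault(window, bytearray(32))
        bitmap[offset // 8] |= 0x80 >> (offset % 8)
    out = b""
    for window in sorted(windows):
        # Trailing all-zero octets are not transmitted.
        bitmap = bytes(windows[window]).rstrip(b"\x00")
        out += bytes([window, len(bitmap)]) + bitmap
    return out


def decode_type_bitmap(data):
    """Inverse of encode_type_bitmap: return the sorted list of RR types."""
    types, pos = [], 0
    while pos < len(data):
        window, length = data[pos], data[pos + 1]
        for i, byte in enumerate(data[pos + 2 : pos + 2 + length]):
            for bit in range(8):
                if byte & (0x80 >> bit):
                    types.append(window * 256 + i * 8 + bit)
        pos += 2 + length
    return types


# A compact negative answer: the name "exists" with only RRSIG, NSEC and NXNAME.
bitmap = encode_type_bitmap([TYPE_RRSIG, TYPE_NSEC, TYPE_NXNAME])
print(bitmap.hex())
print(decode_type_bitmap(bitmap))
```

A resolver that understands the pseudo-type can tell from its presence in the bitmap that the name does not really exist, even though the response is framed as NODATA.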

This approach to Denial of Existence, coupled with elliptic curve crypto, can reduce the size of DNSSEC-signed responses to the point that signed responses should fit into UDP and front-end response signers can work even without complete zone knowledge.
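
A back-of-the-envelope comparison shows the scale of the saving. The Python sketch below counts only the raw signature bytes, using 256 bytes for an RSA-2048 signature and 64 bytes for an ECDSA P-256 signature. It ignores owner names, the other RRSIG fields, and the SOA and NSEC records themselves, so the figures are a lower bound on response size, not a prediction of it.

```python
SIG_BYTES = {"RSA-2048": 256, "ECDSA P-256": 64}


def signature_bytes(rrsig_count: int, algorithm: str) -> int:
    """Bytes of raw signature material carried in a response."""
    return rrsig_count * SIG_BYTES[algorithm]


# Classic NXDOMAIN: RRSIGs over the SOA, the covering NSEC and the wildcard NSEC.
# Compact denial: RRSIGs over the SOA and a single NSEC.
for scheme, count in [("classic NXDOMAIN", 3), ("compact denial", 2)]:
    for alg in SIG_BYTES:
        print(f"{scheme:>16} / {alg:<11}: {signature_bytes(count, alg)} bytes")
```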

Now, what can we do about the time taken to perform validation?

SVCB and DANE

There’s little doubt that the DNS is the new steering protocol for the Internet (see DNS is the new BGP), and these days it appears that the two most useful records to achieve this in the DNS are the name translation record (CNAME), with its ability to transfer control of an individual name out of its original zone to a target zone that is operated by a service hosting provider, and the service binding record (SVCB), with its ability to define the parameters of a service connection without performing additional DNS queries.

There are a number of RFCs describing the use of DANE (DNS-based Authentication of Named Entities) in the DNS, including RFC 7671, which describes DANE itself, RFC 7672, which describes DANE and MX records, and RFC 7673 for DANE and SRV records. This work performs the same function for DANE and SVCB records.

The DANE specification contains the advice that a client should follow the CNAME alias chain to its ultimate target hostname and use that target name as the base name for a TLSA query, but where no TLSA records exist for that name, it should also query for a TLSA record using the initial domain name. This work suggests that this is not the desired behaviour and only the target name should be used, as in the following example:

example.com.           HTTPS 0 xyz.provider.example.
www.example.com.       CNAME xyz.provider.example. 
xyz.provider.example.  HTTPS 1 . alpn=h2,h3 ... 
xyz.provider.example.  A     192.0.2.1 

_443._tcp.xyz.provider.example.  TLSA … 
_443._quic.xyz.provider.example. TLSA …
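
The name derivation in this example can be sketched in Python as follows. The two dictionaries are toy stand-ins for the CNAME and AliasMode HTTPS records above, not a real resolver, and the function names are hypothetical. The point of the sketch is that the TLSA owner name is built from the final target name only, with no fallback to the name the client started with.

```python
# Toy zone data mirroring the example records; not a real resolver.
CNAMES = {"www.example.com.": "xyz.provider.example."}
HTTPS_ALIASES = {"example.com.": "xyz.provider.example."}  # AliasMode (priority 0)


def resolve_target(name: str, max_hops: int = 8) -> str:
    """Follow CNAME and AliasMode HTTPS records to the final service name."""
    for _ in range(max_hops):
        nxt = CNAMES.get(name) or HTTPS_ALIASES.get(name)
        if nxt is None:
            return name
        name = nxt
    raise RuntimeError("alias chain too long")


def tlsa_name(origin: str, port: int = 443, transport: str = "tcp") -> str:
    """Build the TLSA owner name from the target name only, never the origin."""
    return f"_{port}._{transport}.{resolve_target(origin)}"


print(tlsa_name("www.example.com."))
print(tlsa_name("example.com.", transport="quic"))
```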

Some protocols that can run over TLS, such as HTTP/0.9 and HTTP/1.0, do not confirm the name of the service after connecting. With DANE, these protocols are subject to an Unknown Key Share (UKS) attack, which allows an attacker to deceive one peer of a secure communication as to the identity of the remote peer.

Current focus of DNS activity

The focus points of DNS activity are a constantly moving target. The frenetic pace of work on channel encryption for the DNS appears to have calmed down and attention has shifted to the server side of the DNS, looking at the issues related to the role of the DNS in content hosting.

At the extreme end of the DNS on the client side, the stub resolvers appear to be highly resistant to change. They use DNS over UDP queries in the clear and the overwhelming majority of these queries are directed to the recursive resolver operated by the client’s ISP.

Enterprise users have a higher level of willingness to use non-local open DNS resolvers, notably Google’s Public DNS service, and in this space, there is also a higher level of use of DoH and DNS over TLS. However, the measurements do not show any dramatic movement in these patterns of use, and the efforts to add channel encryption between the recursive resolver and the authoritative name servers appear to not have gathered much momentum.

These days there is more activity on the server side of the DNS, looking at ways to make administration of delegated zones more efficient, and also looking at ways to make the initial rendezvous process more efficient by adding service profile data into the DNS.

I am interested in watching the evolution of the combination of DANE, CNAME, and SVCB constructs in the DNS as they apply to encrypting the last open peephole in TLS, the encryption of the SNI field. It strikes me that progress in this area has a pretty high barrier to surmount, given that a truly end-to-end approach that does not entail trusted third-party intermediaries will need DNSSEC validation all the way to the stub resolver.

In today’s DNS, that looks like it’s still a big ask.

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.