DNS at IETF 123

Published on 19 Aug 2025

Category: Tech matters

The Internet Engineering Task Force (IETF) meets three times a year to work on Internet Standards and related operational practice documents. In July of 2025, the IETF met in Madrid (finally, and only after several thwarted mis-starts!) with more than a thousand folk in attendance through the week.

At IETF meetings, DNS discussions span several Working Groups. In this report, I’ll focus on the material presented at the DELEG and DNSOP Working Groups. To keep things (relatively) brief, I won’t cover Adaptive DNS Discovery (ADD), Extensions for Scalable DNS Service Discovery (DNSSD), or DANE Authentication for Network Clients Everywhere (DANCE).

DELEG

Design by committee should always ring alarm bells, particularly in technology. The desire to achieve acceptable compromises between various opinions often leads to compromised technical outcomes, and it seems to me that the current work on redefining zone cuts and delegation in the DNS is leading to this same outcome.

The original model of delegation in the DNS is defined by the NS record in a zone. This resource record specifies that this label, and the DNS sub-tree that descends from this label, has been delegated. Each NS record has a single value, the name of a nameserver that serves the delegated zone. A delegated zone with multiple nameservers uses multiple NS records for the same label.

A server that contains an NS delegation may also serve the IP address of the nameserver, held as ‘glue records’ by the server. If a server is queried for a label that lies within the delegated sub-tree of the zone served by this server, then it will return a ‘referral’ response, which is the contents of these NS records. All relevant glue records are also loaded into the Additional Section of a DNS referral response. If there are no glue records in a referral response, then the task of resolving these nameserver names is left to the querying resolver (as a sub-task, before resuming the primary resolution task).
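The referral behaviour described above can be sketched in a few lines of Python. This is a minimal model, not a DNS server: the zone data, the names `DELEGATIONS`, `GLUE`, `Referral` and `build_referral` are all hypothetical, and the addresses come from documentation ranges.

```python
from dataclasses import dataclass, field


@dataclass
class Referral:
    cut: str                                          # label where the zone cut sits
    authority: list = field(default_factory=list)     # NS records for the cut
    additional: list = field(default_factory=list)    # glue held by this server


# Hypothetical data held by a server for the "com" zone.
DELEGATIONS = {"example.com": ["ns1.example.com", "ns2.example.net"]}
GLUE = {"ns1.example.com": [("A", "192.0.2.53")]}


def build_referral(qname, delegations, glue):
    """Return a Referral if qname lies at or below a delegated label, else None."""
    labels = qname.rstrip(".").lower().split(".")
    # Walk from the shortest suffix to the longest, so the cut closest
    # to the served zone's apex is found first.
    for i in range(len(labels) - 1, -1, -1):
        suffix = ".".join(labels[i:])
        if suffix in delegations:
            referral = Referral(cut=suffix)
            for ns in delegations[suffix]:
                referral.authority.append((suffix, "NS", ns))
                # Only glue that this server actually holds is added.
                for rrtype, address in glue.get(ns, []):
                    referral.additional.append((ns, rrtype, address))
            return referral
    return None
```

In this sketch a query for www.example.com yields both NS records but glue for only one of them, so the querying resolver would have to resolve ns2.example.net itself before it could use that nameserver.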

As painful experience has taught us, these glue records are not necessarily correct, particularly when the names of these nameservers lie outside of the zone being served (‘out of bailiwick’). They should be used only as a shortcut in following the delegation chain to the next nameserver, and not in any other context.

These NS records exist both in the parent (delegating) zone and in the child (delegated) zone, and they are intended to be identical in value. If they differ, then the child zone is declared to be the authority, and when DNSSEC came around it was the child zone that contained the DNSSEC-signed NS records, while the parent contained the unsigned NS records. Out-of-bailiwick nameserver glue records are, of course, never signed.

There are several shortcomings with this approach. DNSSEC-aware validating resolvers cannot validate the parent-side NS records. If they wish to validate the delegation, they must follow the potentially insecure delegation and retrieve the child-side NS records. The referral delegation method ultimately loads the querying resolver with the IP addresses of the nameservers for the delegated zone, but no other information.

No form of priority can be communicated, nameserver names cannot be aliased, and the implicit intended DNS query protocol is unencrypted DNS over UDP port 53. Resolvers may choose to probe a nameserver to see if it supports any alternative protocol (such as DNS over TLS (DoT), DNS over QUIC (DoQ), and DNS over HTTPS (DoH)), but such probes may take additional time, increasing the time needed to complete the resolution process.

The DELEG record is an effort to augment and alter the functionality of the NS record. There are two basic changes to the NS delegation model. The first is that the parent zone is defined to be authoritative for the DELEG record, rather than the child, and the second is that the value of the DELEG delegation record is modelled on the SVCB record (RFC 9460).

For many years, the DNS has struggled with trying to pack multiple elements into a single query. For example, in a dual-stack world, it seems logical to ask for both the IPv4 and IPv6 addresses in a single query. While the base protocol has a field that counts the questions carried in a DNS message (QDCOUNT), which suggests that multiple queries could be loaded into one packet, in practice a standard query carries one question, and RFC 9619 specifies that QDCOUNT must not exceed 1.
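To make the one-question constraint concrete, here is a small sketch that builds DNS query messages by hand. The helper names `build_query` and `dual_stack_queries` are illustrative only. A dual-stack lookup ends up as two separate messages, each with QDCOUNT set to 1.

```python
import struct

TYPE_A = 1      # IPv4 address record
TYPE_AAAA = 28  # IPv6 address record
CLASS_IN = 1


def build_query(qname, qtype, msg_id=0):
    """Build a minimal DNS query message carrying exactly one question."""
    flags = 0x0100  # standard query with the RD (recursion desired) bit set
    # Header fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    header = struct.pack("!HHHHHH", msg_id, flags, 1, 0, 0, 0)
    encoded_name = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    ) + b"\x00"
    question = encoded_name + struct.pack("!HH", qtype, CLASS_IN)
    return header + question


def dual_stack_queries(qname):
    """Asking for both address families takes two messages, not one."""
    return [build_query(qname, TYPE_A, 1), build_query(qname, TYPE_AAAA, 2)]
```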

Service Binding (SVCB) offers a convenient way around this by packing multiple attributes into a single response (for example, the HTTPS record allows the use of ipv4hint and ipv6hint as parameters in a single response). A DELEG response could include the name of a nameserver, the server’s IPv4 and IPv6 addresses, and the DNS protocol to use (as an alternative to unencrypted DNS over UDP port 53), and also permit indirection via SVCB alias records and even add a priority value. A referral response would contain a list of such DELEG records, one for each of the zone’s nameservers in a manner that is analogous to the set of NS records seen in a current DNS referral response.
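As a rough illustration of the kind of information such a record bundles together, the following sketch models one SVCB-style delegation entry as a Python dataclass. The class and field names are invented for this example and are not taken from the DELEG drafts, whose record format is still under discussion. The only borrowed convention is the SVCB one that a lower priority value is preferred.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelegationInfo:
    """Hypothetical model of one SVCB-style delegation entry."""
    priority: int           # lower value is preferred, as in SVCB
    target: str             # nameserver name
    alpn: tuple = ()        # transports offered, e.g. ("dot",)
    ipv4hint: tuple = ()    # IPv4 addresses of the nameserver
    ipv6hint: tuple = ()    # IPv6 addresses of the nameserver


def ordered_servers(records):
    """Return the entries in the order a resolver might try them."""
    return sorted(records, key=lambda record: record.priority)


# Illustrative referral content using documentation address ranges.
REFERRAL = [
    DelegationInfo(2, "ns2.example.net.", ipv4hint=("198.51.100.53",)),
    DelegationInfo(1, "ns1.example.com.", alpn=("dot",),
                   ipv4hint=("192.0.2.53",), ipv6hint=("2001:db8::53",)),
]
```

Each entry carries, in one place, what an NS referral spreads across the Authority and Additional sections, plus the transport and priority information that NS records cannot express.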

If that seems like a major change to the DNS, then it certainly is, and there is much to get wrong. Delegation is one of the more troublesome aspects of the DNS for operators, with deviation between parent and child delegation records at the heart of most of these issues. Placing the authoritative location of this DELEG record at the parent increases the load on whatever is used as the DNS provisioning protocol to allow the child zone administrator to control the contents of the parent’s DELEG record. There is also the issue of backward compatibility with DELEG-unaware resolvers querying across a DELEG-defined delegation.

I suspect that at the core of these issues is the challenge of trying to package three items of essentially disparate information into a single DNS response, where:

  • The parent zone controls the delegation of authority to the child.
  • The child zone controls the names of the nameservers that have been selected to serve this zone, and the suggested relative priority to query each of these nameservers.
  • The zones of the nameserver names control the details of the addresses of the nameservers, as well as the supported DNS query protocol, and whether the name is an alias to another name.

A query to the parent zone can reveal that a label lies within a delegated zone and identify the cut point. A query to the child zone can then disclose the names of the nameservers for that zone. Further queries to the zones of those nameservers reveal their IP addresses and the query protocols they support. However, the resolver cannot query the child zone for the nameservers until it already knows the IP addresses and supported protocols of those same nameservers.

So, while it might be useful to think of this as three distinct items of DNS information, a referral response must combine them into a single response to allow a resolver to progress with resolution. This functionality is achieved in the NS delegation framework with the Additional Section providing the glue records that contain the IP addresses of the nameservers. In the DELEG model, the information is packaged in a DELEG record (or records) contained in a referral response.

The differences are that the parent’s copy of the DELEG record can be DNSSEC signed by the parent, allowing the client to DNSSEC-validate this delegation response, if it so chooses, and the additional information in the DELEG response enables the use of other protocols to query the authoritative nameservers.

As an aside, I’m mystified by the desire to DNSSEC-validate a delegation record. Validation takes long enough as it is, and the higher-level semantic intent of DNSSEC is to assure a client of the authenticity (accuracy and currency) of a DNS response, irrespective of how the client learned of that response. So why should each delegation step be validated? Yes, such an action can expose certain forms of denial-of-service attacks through deliberate misdirection of delegation, but ultimately it is necessary and sufficient to DNSSEC-validate the response alone to confirm its authenticity and currency.

A misdirection in delegation is not going to lead to a response that can be validated. And there is the incremental time cost of performing DNSSEC validation of each of these delegations. If a nameserver’s name lies outside the parent zone’s bailiwick, the parent’s DNSSEC signature over that nameserver’s IP address(es) offers limited assurance of authenticity.

Strictly speaking, a resolver would need to separately query the zone containing the nameserver name for the name’s A and AAAA records and perform DNSSEC validation. Either that or the parent would need to perform this query at the outset and include the RRSIG signatures of these address records in the DELEG records.

What about the use of alternative DNS protocols, namely, protocols that use encryption? Don’t forget that recursive resolvers query authoritative servers, not stub resolvers, so the end user information is not normally contained in these queries. The potential client’s privacy concerns are, to some extent, mitigated by this level of indirection. If QNAME minimization is being used by the resolver, then any sensitive information that may be contained in the more specific (left-most) labels will be stripped out during this phase of nameserver discovery.
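The effect of QNAME minimization can be shown with a short sketch. The function name `minimised_qnames` is illustrative, and the sketch is a simplification: it lists only the names a minimizing resolver would expose at each step down the delegation hierarchy, ignoring the query types used and any limit on the number of steps.

```python
def minimised_qnames(qname):
    """List the successively longer names revealed while walking the hierarchy."""
    labels = qname.rstrip(".").split(".")
    # Start with the right-most label and add one label to the left per step.
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]


# The servers near the top of the hierarchy see only the right-most
# labels; the left-most label is exposed only at the final step.
steps = minimised_qnames("secret.www.example.com.")
```

Here `steps` begins with "com." and ends with the full name, so a server for the root or for com never sees the label "secret".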

Also, don’t forget that the overhead of turning on channel encryption can only be amortized using repeated queries. There are few opportunities for repeat queries in the recursive-to-authoritative query model. Validation of delegation and use of encrypted channels incur high cost, while the incremental benefits of occlusion through encryption are minimal at best.

Hmm. What’s the point of DELEG?

I have no answer, other than to come back to a long-standing observation of the IETF that it is close to impossible to prevent the IETF from working on an item. ‘No’ is just not a word that the IETF appears to understand!

This particular Working Group has managed to generate an awesome volume of email and inspire many debates in its Working Group sessions, but in terms of progress in solving real-world problems in the DNS, I’m just not feeling it!

DNSOP

The DNSOP Working Group is the catch-all of DNS activity in the IETF. If it concerns the DNS and it doesn’t have its own Working Group, then the default action is to pass the item to the DNSOP Working Group. It’s a very busy Working Group, currently with 13 active drafts. The two sessions at IETF 123 had a full agenda, and discussion of these proposals was necessarily quite limited.

Extended DNS errors

There is a proposal to add to the set of DNS error codes provided in Extended DNS Errors (RFC 8914). This was originally envisaged to augment the SERVFAIL error to provide more details of DNS and DNSSEC validation failures. It should be noted that Extended DNS Errors do not change the processing of response codes.
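For reference, the Extended DNS Error option defined in RFC 8914 is an EDNS option (option code 15) whose data is a 16-bit info code followed by optional UTF-8 text. The sketch below decodes that option data. The function name `parse_ede` is illustrative, and the table lists only a handful of the registered info codes.

```python
import struct

EDE_OPTION_CODE = 15  # EDNS option code for Extended DNS Errors

# A small subset of the registered info codes.
INFO_CODES = {
    0: "Other",
    6: "DNSSEC Bogus",
    7: "Signature Expired",
    15: "Blocked",
    16: "Censored",
    17: "Filtered",
    22: "No Reachable Authority",
}


def parse_ede(option_data):
    """Split EDE option data into (info code, purpose, extra text)."""
    if len(option_data) < 2:
        raise ValueError("EDE option data must contain a 2-byte info code")
    (info_code,) = struct.unpack("!H", option_data[:2])
    # The remainder is free-form text intended for human consumption.
    extra_text = option_data[2:].decode("utf-8", errors="replace")
    purpose = INFO_CODES.get(info_code, "not listed in this sketch")
    return info_code, purpose, extra_text
```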

I must admit that while the ability to provide a more detailed error diagnosis might sound helpful, such detail is of no practical use to the end user. There is nothing that a user can do to correct the problem with the data! What should an implementation do with the open text field? While RFC 8914 notes that this text field contains information intended for human consumption, any text field sent back to an end user within the DNS is about as helpful as the error code ‘Something has gone wrong with the Internet!’. A related issue is the use of this DNS Extended Error field to pass information to the user in the case where a response has been blocked by some imposed DNS filtering order. Again, it’s hard to understand how this is actionable information for the end user!

Dual-stack DNS — RFC3901bis

The IPv6 protocol is not a good match for the DNS, particularly when the DNS is using UDP and the responses start to stray into the vague class of ‘large’ responses, as there are residual reliability issues relating to packet fragmentation and IPv6. The 2004 advice in RFC 3901 was that all recursive resolvers and authoritative servers should be IPv4 only or dual-stack, and any endorsement of IPv6 in this document was somewhat muted.

IPv6 enthusiasts have criticized this advice as needlessly cautious, even reactionary, arguing that the IETF should encourage IPv6 adoption in every possible way. Others take a different view: When the DNS carries large responses, IPv6 shows a higher failure rate, and the safer course is to avoid this risk by preferring IPv4.

What should the IETF do? IPv6 packet fragments are still dropped by too many networks, which directly affects DNS operation. In that sense, the guidance in RFC 9715 fits into a broader picture — avoiding IP fragmentation in DNS over UDP is a sensible step. That said, I was hoping RFC 9715 would go further, for example, by recommending the use of more compact cryptographic options in DNSSEC (such as elliptic curve algorithms), along with compact denial-of-existence methods and QNAME minimization. Perhaps a 9715-bis draft that includes these measures would offer more practical guidance for reducing fragmentation in DNS responses.

By avoiding many of the circumstances that bloat a DNS response, the operational reliability issues associated with IPv6 fragmentation can be managed, allowing IPv6 to be handled at the same level as IPv4.

DNS over TCP

On a related note, there is a recent proposal (draft-tojens-do-not-accommodate-udp53) that, as its name suggests, advocates that DNS implementations should not specifically accommodate DNS over UDP and instead assume that DNS over TCP is universally available. Implementations should be capable of initiating a DNS query over TCP without first attempting UDP and receiving a truncated response. Given that most of the web now operates over TLS on TCP, a model that supports a transaction volume comparable to DNS, it is reasonable to assume that a DNS infrastructure capable of handling DNS over TCP could be constructed. This would avoid the entire issue of IP fragmentation.
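One practical difference between the two transports is framing. Over TCP, each DNS message is preceded by a two-byte length field, as specified in RFC 1035. The following sketch shows that framing and its inverse. The function names are illustrative, and no sockets are involved.

```python
import struct


def frame_for_tcp(message):
    """Prefix a DNS message with its 2-byte length for sending over TCP."""
    if len(message) > 0xFFFF:
        raise ValueError("a DNS message cannot exceed 65535 bytes")
    return struct.pack("!H", len(message)) + message


def split_tcp_stream(buffer):
    """Extract complete messages from a byte stream.

    Returns (messages, remainder), where remainder holds any
    incomplete trailing data that needs more bytes from the stream.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 2:
        (length,) = struct.unpack_from("!H", buffer, offset)
        if len(buffer) - offset - 2 < length:
            break  # the rest of this message has not arrived yet
        messages.append(buffer[offset + 2:offset + 2 + length])
        offset += 2 + length
    return messages, buffer[offset:]
```

Because the length field is 16 bits, a response of up to 65,535 bytes can be carried without relying on IP fragmentation, which is the attraction of TCP in this discussion.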

However, as has been pointed out in earlier studies, a server handling DNS over TCP requires approximately double the capacity of one handling DNS over UDP, and using TLS with ECC P-256 requires three times the capacity. So, who gets to pay for the additional resolver and server infrastructure required to shift the DNS over to use TCP or to use DoH or its related variants? I suspect the answer is that no one is prepared to foot the additional bill, and the incidence of large DNS responses is low enough that we are prepared to live with the occasional inconvenience of large DNS responses in a largely DNS over UDP world.

Domain Control Validation

It could be argued that the DNS is assuming the role of a universal signalling protocol, and its use in Domain Control Validation (DCV) supports such a supposition. DCV (draft-ietf-dnsop-domain-verification-techniques) allows a user to demonstrate to an Application Service Provider (ASP) that they have sufficient control over a domain to place a DNS challenge response, provided by the ASP, into the domain. The general practice of such verification tokens is to continue the long-held DNS tradition of abuse of the TXT record, but this can present some scaling issues (see the TXT record for bbc.co.uk as a good example of DCV bloat!). Some responses to this scaling issue are considered, including using subdomains to identify an ASP, as well as the use of an alias record to allow this function to be outsourced to a third-party intermediary.
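A sketch of the subdomain approach follows. The naming pattern, the provider label and the function names are assumptions made for illustration and should not be read as the draft’s exact syntax. The point is only that each ASP gets its own name to query, so the TXT record set at the zone apex does not grow with every new service.

```python
def validation_name(domain, provider_label):
    """Build a per-provider validation name (hypothetical naming pattern)."""
    return f"_{provider_label}-challenge.{domain.rstrip('.')}."


def token_present(txt_values, expected_token):
    """Check whether the ASP's challenge token appears in the TXT values."""
    return any(value == expected_token for value in txt_values)


# The ASP would look up TXT records at this name, not at the zone apex.
name = validation_name("example.com", "exampleasp")
verified = token_present(["3419...ba2c"], "3419...ba2c")
```

An ASP using this scheme would fetch the TXT record set at `name` with whatever resolver library it prefers and pass the returned strings to `token_present`.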

It’s useful to document what has so far been an informal operational practice and to highlight practices that can improve the resilience, security, and scalability of third-party validation.

Validating a DNSSEC signature on a DCV record would be helpful if you want to provide stronger assurance that the returned token is both current and authentic. As the draft (draft-sheth-identifiers-dns) points out, it does not appear to make a whole lot of sense to advance this independently through the IETF process, and including this use case as a ‘persistent identifier’ in the existing DCV draft seems like a sensible approach.

DS automation

Many years ago (11 years to be precise), the IETF published RFC 7344, which describes a mechanism for automating Delegation Signer (DS) record management. A child zone can publish its DNSKEY entry point using a CDS or CDNSKEY record, and the parent zone periodically polls for these records. If the parent can validate them, it updates its own DS record accordingly.
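The parent-side decision can be reduced to a very small sketch. It assumes that fetching the child’s CDS records and DNSSEC-validating them has happened elsewhere, and it omits the delete signal and the other policy checks a real registry would apply. The function name and the tuple layout (key tag, algorithm, digest type, digest) are illustrative.

```python
def next_ds_set(current_ds, child_cds, cds_validates):
    """Decide which DS set the parent should publish after a poll.

    current_ds    -- DS records currently in the parent zone
    child_cds     -- CDS records fetched from the child zone
    cds_validates -- True if the CDS records validated through the
                     existing chain of trust (checked elsewhere)
    """
    if not child_cds or not cds_validates:
        # Nothing published, or nothing trustworthy: leave the delegation alone.
        return set(current_ds)
    # The child's validated CDS set replaces the parent's DS set.
    return set(child_cds)


# Illustrative key roll: the child now publishes a CDS for a new key.
old_ds = {(12345, 13, 2, "AA11")}
new_cds = {(54321, 13, 2, "BB22")}
```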

The polling mechanism described in the document is a weakness in the process, and work has largely been completed to complement this CDS method with a generalized version of the DNS NOTIFY function, allowing the child to signal to the parent that there is an updated CDS value to fetch (draft-ietf-dnsop-generalized-notify). This document is in the RFC Editor queue, but the authors feel that some important additional functionality was missing in the draft that could be added.

Before the introduction of the DSYNC record for generalized notify, there was no way to signal to child domains that a scanner existed or how often it was running. Obviously, the existence of a DSYNC RRset with notification details implies the existence of a scanner, as the notification mechanism triggers the parent to perform a scan. One idea is to reuse the DSYNC record to indicate an ‘old-style’ scanner that does not support generalized notifications, by specifying a null target for notifications and reinterpreting the notification port as the scan interval (in minutes). I’m a bit sceptical about this late addition to the document, as continual scanning feels like a poor way to synchronize parent and child, compared to notifications, which directly signal when child data has changed.

In the meantime, the registry / registrar world has been deploying its own mechanisms for provisioning data into the parent zone using the Extensible Provisioning Protocol (EPP) and an associated set of controls and locks. Whenever there are two or more ways of achieving the same outcomes, the potential for confusion always arises, and this work (draft-shetho-dnsop-ds-automation) tries to define some basic rules of precedence in such situations.

DNS Quality of Service

The assumption behind a raft of various differentiated quality of service response mechanisms for the past few decades is that there is a shortfall in the resources available to service all requests, and that some form of prioritization is required to meet the more compelling or important requests first. The countervailing argument is that signalling such differential response preferences can be costly, and it’s often cheaper to augment the available resources so that no such imposition of relative priority is required.

I view the proposal to define a background priority for service records of DNS and HTTP servers (draft-gakiwate-dnsop-svcb-bg-priority-parameter) to be part of this same class of differentiated response control. For me, this proposal adds nothing more than an additional adornment to a service where the underlying problem can be more directly addressed by augmenting the server’s resources to meet the underlying demand.

DNSSEC multi-signing

In the search for improved resilience, the DNS world has turned to using multiple operators to serve a DNS zone. This can cause complications when the zone is DNSSEC-signed, and one or more of the operators choose to use a sign-on-the-fly configuration. The use cases being considered here include multi-signer configurations where each multi-signer party supports a different signing algorithm, or a ‘live’ transfer of a signed zone between DNS providers that support different signing algorithms. There are also cases for independently rolling the algorithm for a Key-Signing Key (KSK) or Zone-Signing Key (ZSK), and online signers supporting a zone with multiple algorithms.

This proposal (draft-huque-dnsop-multi-alg-rules-06) attempts to classify DNSSEC algorithms into those that are widely supported by almost all validators and recommended for use, ones that are being deprecated, and others. The basic proposal is that signers must sign with at least one widely supported algorithm, relaxing the condition that all signers must use all algorithms found in the DNSKEY set.
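The difference between the existing rule and the proposed relaxation can be sketched as two predicates. The membership of the `WIDELY_SUPPORTED` set below is an assumption made for this example and is not the draft’s classification. The numbers are DNSSEC algorithm numbers (8 is RSASHA256, 13 is ECDSAP256SHA256, 16 is ED448).

```python
# Assumed classification, for illustration only.
WIDELY_SUPPORTED = {8, 13}


def meets_strict_rule(signer_algorithms, dnskey_algorithms):
    """Existing condition: a signer uses every algorithm in the DNSKEY set."""
    return set(dnskey_algorithms) <= set(signer_algorithms)


def meets_relaxed_rule(signer_algorithms):
    """Proposed condition: a signer uses at least one widely supported algorithm."""
    return bool(set(signer_algorithms) & WIDELY_SUPPORTED)


# Two operators serving one zone whose DNSKEY set contains algorithms 13 and 16.
dnskey_set = {13, 16}
operator_a = {13}
operator_b = {13, 16}
```

In this example operator A fails the strict rule because it does not sign with algorithm 16, but passes the relaxed rule because it signs with algorithm 13.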

Can you encrypt?

RFC 9462 provides a convenient way to determine if a DNS recursive resolver can support queries over an encrypted DNS channel. The technique described in this RFC is to query for the SVCB record for the locally served domain _dns.resolver.arpa. For example, the following query illustrates the capability of the 8.8.8.8 resolver to support DNS over TLS, and DNS over HTTP/2 and HTTP/3:

$ dig SVCB _dns.resolver.arpa @8.8.8.8
;; ANSWER SECTION:
_dns.resolver.arpa.   86400  IN  SVCB   1 dns.google. alpn="dot"
_dns.resolver.arpa.   86400  IN  SVCB   2 dns.google. alpn="h2,h3" key7="/dns-query{?dns}"

;; ADDITIONAL SECTION:
dns.google.           86400  IN  A      8.8.8.8
dns.google.           86400  IN  A      8.8.4.4
dns.google.           86400  IN  AAAA   2001:4860:4860::8888
dns.google.           86400  IN  AAAA   2001:4860:4860::8844
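A client that wants to act on this response needs to pull the priority, target and parameters out of each SVCB record. The sketch below does this for the presentation format shown in the dig output above. The function name `parse_svcb_line` is illustrative, and a real client would work from the wire format using a DNS library. Note that key7 is the generic form of the dohpath parameter.

```python
import shlex

SAMPLE_LINES = [
    '_dns.resolver.arpa.   86400  IN  SVCB   1 dns.google. alpn="dot"',
    '_dns.resolver.arpa.   86400  IN  SVCB   2 dns.google. alpn="h2,h3" key7="/dns-query{?dns}"',
]


def parse_svcb_line(line):
    """Parse one SVCB record in dig's presentation format."""
    # shlex handles the quoted parameter values and strips the quotes.
    owner, ttl, rrclass, rrtype, priority, target, *params = shlex.split(line)
    if rrtype != "SVCB":
        raise ValueError("not an SVCB record")
    parsed_params = {}
    for param in params:
        key, _, value = param.partition("=")
        parsed_params[key] = value
    return {
        "owner": owner,
        "ttl": int(ttl),
        "priority": int(priority),
        "target": target,
        "alpn": parsed_params.get("alpn", "").split(",") if "alpn" in parsed_params else [],
        "params": parsed_params,
    }


records = sorted((parse_svcb_line(line) for line in SAMPLE_LINES),
                 key=lambda record: record["priority"])
```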

The draft (draft-sst-dnsop-probe-name) extends the potential use of the resolver.arpa locally served domain with the name probe.resolver.arpa. If a client queries for this DNS name and receives an NXDOMAIN response, then it’s a signal that their network and their DNS service infrastructure are working (to some extent!). To me, this proposal seems like a simple and pragmatic approach to a common DNS diagnosis issue.
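The client-side logic for such a probe is simple enough to sketch. The code below reads the response code from the header of a raw DNS response and maps it to a diagnosis. Sending the query is left out, and the function names and the wording of the verdicts are my own, not the draft’s.

```python
import struct

RCODE_NXDOMAIN = 3


def rcode_of(response):
    """Extract the 4-bit RCODE from the flags field of a DNS response header."""
    if len(response) < 12:
        raise ValueError("a DNS message header is 12 bytes long")
    (flags,) = struct.unpack_from("!H", response, 2)
    return flags & 0x000F


def probe_verdict(response):
    """Interpret the reply to a query for probe.resolver.arpa."""
    if response is None:
        return "no response: the resolver or the path to it is not working"
    if rcode_of(response) == RCODE_NXDOMAIN:
        return "NXDOMAIN: the resolver is reachable and answering"
    return "unexpected response: something is rewriting or mishandling DNS"
```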

.internal and delegation

What do you do if you want a private-use local DNS zone? The standard response is to make up your own Top-Level Domain (TLD) name and set it up to serve your local private community. If you choose a label that is not already delegated in the public DNS, then even if a query ‘leaks’ into the public namespace, it will simply return an NXDOMAIN response. But how can you be assured that your private name will not collide with a name that might be delegated in a future round of ICANN-managed expansion of the set of TLDs in the root zone? The answer is that it isn’t possible to provide any such assurance.

In response to this, the Security and Stability Advisory Committee (SSAC) of ICANN has recommended (sac-113) that the Board of ICANN reserves a TLD label as a root label for private use identifiers that anyone can use and populate with their own subdomains, confident in the knowledge that this reserved label will not be delegated as a public-use TLD now or in the future. Because of the decentralized nature of the DNS, there is no way to prevent ad hoc use of any label for a private use domain. Nevertheless, the SSAC believes that the reservation of a private string will help to reduce the potential for collisions between private ad hoc usage and the public DNS.

Without a way to signal that a zone cut exists in the private namespace, but not necessarily in the public namespace, different resolvers may return conflicting answers to the same query. A server might respond with a name error, or with a DNSSEC-signed name error indicating the name truly does not exist. Alternatively, it might return that the name does exist, either with an Answer section or with signatures that cannot be validated. What is wanted is a standard response that says: “a zone cut exists at this label, but you can’t follow the delegation right now”.

The draft zone-cut-to-nowhere proposes to publish a delegation to servers that cannot be reached. The suggested approach is to use an NS resource record with no target nameserver name.

Synchronized DNS caches

A very high-capacity recursive resolver is often implemented as a set of recursive resolver engines with a front end that distributes incoming queries across the set of resolver engines. The issue is that each engine’s cache will load differently. It would be good to have these caches synchronized so that the response obtained by any single engine can be shared across the caches of all other engines. The approach described here is for each engine that receives a resolution response to forward that response to all other engines in the set. This could be achieved by response replication or by using a local multicast set. The preliminary results based on an implementation in the Unbound resolver look promising.
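The replication idea can be modelled in a few lines. This is a toy simulation under obvious simplifications: the cache is a plain dictionary with no TTL handling, peers are updated by direct assignment in place of network replication or multicast, and the class and method names are invented for the example.

```python
class Engine:
    """Toy model of one resolver engine behind a load-balancing front end."""

    def __init__(self, name):
        self.name = name
        self.cache = {}
        self.peers = []
        self.upstream_queries = 0

    def resolve(self, qname, upstream):
        if qname in self.cache:
            return self.cache[qname]
        # Cache miss: do the recursive resolution work once.
        answer = upstream(qname)
        self.upstream_queries += 1
        self.cache[qname] = answer
        # Share the result so that no peer repeats the same resolution.
        for peer in self.peers:
            peer.cache[qname] = answer
        return answer


def make_cluster(count):
    """Create engines that each treat all the others as peers."""
    engines = [Engine(f"engine-{i}") for i in range(count)]
    for engine in engines:
        engine.peers = [peer for peer in engines if peer is not engine]
    return engines
```

With three engines, the same name queried once at each engine costs a single upstream resolution in this model, where unsynchronized caches would cost three.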

Opportunistic DNS transport signalling

There are some elements of doubt as to whether DELEG will be widely deployed, and there are some thoughts on how the delegation can signal the transport capabilities of the nameservers without incurring the cost of probing.

The draft (draft-johani-dnsop-transport-signaling) proposes to add an SVCB record (and its DNSSEC signature) to the Additional Section of an authoritative response to a query. This record would presumably contain DNS transport capabilities. The only change to DNS records is the addition of the SVCB record in the zone where the nameserver is located (and that record may be synthesized automatically if wanted).

Some closing thoughts

In 2018, Bert Hubert, who was then with PowerDNS, gave a talk to the DNSOP Working Group on the topic of ‘The DNS Camel, or, how many features can we add to this protocol before it breaks’, and also wrote a blog article on this topic. As he reported at the time, “The concept, however, of reducing at least the growth in DNS complexity was very well received.”

Alas, the IETF operates without much of a long-term memory and all such thoughts of constraint in terms of feature creep and burgeoning complexity in the DNS appear to have been abandoned in the intervening period.

As can be seen from this report of activity at the IETF, there is a large amount of activity at all levels, from the fundamentals of delegation and the various ways we can use the compound SVCB query type through to the operational aspects of generalized notifications and zone cuts across different namespaces.

It’s challenging to understand the need for encrypted transport for queries between recursive resolvers and authoritative nameservers, or why resolvers should perform DNSSEC validation of the IP addresses of the nameservers that are listed in a delegation. We appear to be working diligently once more to establish precisely how many additional DNS features it will take to break the DNS!
