The 29th DNS-OARC workshop was recently held in Amsterdam, attracting around 200 key operators, implementors and researchers (the most participants ever) to discuss the state of affairs in the modern global DNS.
Here’s my impression of some of the priority themes discussed during the two-day series of talks.
Addressing the KSK roll in the room
First and foremost, the meeting was significant because it was timed alongside the root zone KSK roll, which happened on 11 October. The roll demanded a higher than usual degree of attention from the operations staff involved in producing the root DNS zone. Rather than be spread across multiple time zones (or worse yet, in flight to the meeting), the relevant DNS operations community decided to meet before OARC to conduct the roll with all hands on deck. OARC provided meeting spaces used by some observers, and NLnet Labs hosted the ICANN staff who were involved in the roll.
In many ways, the KSK roll has played out much like the Y2K problem: hugely significant, with potential for high-risk adverse outcomes, but in practice a smooth and probably low-key change.
Geoff Huston’s work at APNIC Labs, using adverts to try to measure potential damage, strongly suggested the impacts were confined to a cohort of users whose size was below the noise threshold of the measurement technique. Prior work on the root zone query flows had provided a list of potentially affected resolvers and their origin ASes (in BGP), which identified the ISPs at risk. APNIC and the four other RIRs each ran a communication campaign to identify and reach out to entities in their region that held the addresses and originated the DNS traffic.
Read Geoff Huston’s Measuring the KSK Roll
Given that the roll has happened without any significant problems, the conversation at DNS-OARC turned to ‘what next’, an open question that comes down to three key issues, which we (the DNS operations community, and the at-large governance community in the public DNS) need to consider:
- When is the next keyroll going to happen? And, how often after that?
- Can we discuss getting ‘standby’ keys into deployment, so an unexpected keyroll has lower risks?
- Can we discuss changing to a different DNSSEC algorithm such as ECDSA?
It is highly likely that all of these topics are a subset of a wider question, under the umbrella of ‘who decides’, but it was good to have the key people in operational and development roles able to kick off a conversation.
Opinions on ‘when is the next keyroll going to happen’ varied from ‘soon’ to ‘never’ (to some extent, both said in humour, but there are reasons to promote both sides).
Discussion skipped over standby keys because they are such an obviously good idea; we just need to specify how we want this to work. Having said that, it is very likely we don’t want to alter key deployment processes in more than one way at a time. Therefore, if we consider standby keying and algorithm change as two distinct changes, we have to decide on one and then the other, which raises the question: which comes first?
Lastly, nobody seriously expects a keyroll every month, but maybe every year or three years could be possible. That said, we might be waiting between two and six years to resolve either of these changes. Thus, frequency goes directly to change: how much change do we want in the DNSSEC world and how often?
New tools make lighter work of DNS
Jerry Lundstrom is a software developer at DNS-OARC and has been working on a suite of public-domain software for DNS analysis. This has been a body of work sponsored by the community (Comcast has a community benefit fund that DNS-OARC tapped into).
Some of the tools are old favourites we all use, including DSC and DNSCAP. These are now in modern software development frameworks, and Jerry accepts pull requests and issue reports in Git, with continuous integration building as the code develops.
Some of the tools are newer, including DNSJIT and DROOL, which are analysis and replay systems. Being able to both capture DNS packets (which is what DNSCAP does) and replay them means you can use real-world DNS flows to check system behaviour by running a DNS server under a real-world load. This has been enormously useful for testing new ideas in DNS systems, and for understanding the changing environment of how DNS is served by intermediate caching systems.
Use of the newer ‘aggressive NSEC caching’ systems, for instance, massively alters traffic flows to authoritative servers. So having code to replay public query captures and see how the aggregated traffic changes between client, resolver, and authority is a huge win.
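To make the replay idea concrete, here is a minimal sketch in Python of replaying a query trace through a toy resolver, with and without aggressive NSEC caching, and counting how many queries reach the authority. Everything in it is an assumption for illustration: the `ToyAuthority` and `Resolver` classes, the `example.test` zone and the trace are hypothetical, there are no TTLs, wildcards or signature checks, and it has no connection to how DNSJIT or DROOL work internally.

```python
from bisect import bisect_right


def canonical_key(name):
    """Sort key approximating DNSSEC canonical order:
    labels compared right to left, case-insensitively."""
    return tuple(reversed(name.lower().rstrip(".").split(".")))


class ToyAuthority:
    """Hypothetical signed zone. For a missing name it returns the
    NSEC-style gap (previous existing name, next existing name)."""

    def __init__(self, names):
        self.keys = sorted(canonical_key(n) for n in names)
        self.queries = 0  # how many queries reached the authority

    def query(self, name):
        # Assumes every queried name is at or below the zone apex.
        self.queries += 1
        key = canonical_key(name)
        i = bisect_right(self.keys, key)
        if self.keys[i - 1] == key:
            return "ANSWER", None
        upper = self.keys[i] if i < len(self.keys) else None  # None = end of zone
        return "NXDOMAIN", (self.keys[i - 1], upper)


class Resolver:
    """Toy caching resolver; 'aggressive' enables reuse of cached gaps."""

    def __init__(self, authority, aggressive):
        self.authority = authority
        self.aggressive = aggressive
        self.cache = {}  # exact-name cache: key -> rcode
        self.gaps = []   # cached (lower, upper) ranges proven empty

    def covered(self, key):
        return any(lo < key and (hi is None or key < hi) for lo, hi in self.gaps)

    def resolve(self, name):
        key = canonical_key(name)
        if key in self.cache:
            return self.cache[key]
        if self.aggressive and self.covered(key):
            return "NXDOMAIN"  # synthesized locally, no upstream query
        rcode, gap = self.authority.query(name)
        self.cache[key] = rcode
        if gap is not None:
            self.gaps.append(gap)
        return rcode


def replay(trace, zone_names, aggressive):
    """Replay a list of query names and return the upstream query count."""
    authority = ToyAuthority(zone_names)
    resolver = Resolver(authority, aggressive)
    for qname in trace:
        resolver.resolve(qname)
    return authority.queries


ZONE = ["example.test", "mail.example.test", "www.example.test"]
TRACE = ["www.example.test", "aaa.example.test", "bbb.example.test",
         "ccc.example.test", "www.example.test", "zzz.example.test",
         "nnn.example.test"]

if __name__ == "__main__":
    print("plain negative caching:", replay(TRACE, ZONE, aggressive=False))
    print("aggressive NSEC caching:", replay(TRACE, ZONE, aggressive=True))
```

With the aggressive mode on, a single cached gap answers every later query that falls inside it, so the authority sees fewer queries for the same client load. A real capture with many random non-existent names would be expected to show a larger difference than this tiny trace.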
DoH and DoT
Olafur Gudmundsson, Sara Dickinson and Tony Finch provided three different perspectives on the deployment of DNS over anything but UDP.
Olafur spoke to the Cloudflare deployment of the DNS service on 1.1.1.1, which includes a DNS over HTTPS (DoH) and DNS over TLS (DoT) service point.
Since its launch on 1 April 2018, the service has been deployed in more than 150 locations worldwide. As well as offering query minimization (withholding the full DNS name from the intermediate servers consulted on the way to the final answer) and aggressive NSEC (keeping a record of what you know doesn’t exist, and not querying for things you can prove don’t exist), Cloudflare made the decision to implement this basic transport security mode to give users privacy on the wire, from their browser or operating system to the service.
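As a rough illustration of query minimization, the Python sketch below lists which name each tier of servers gets to see when a resolver walks down from the root. It is a simplification under stated assumptions: the function name `minimised_steps` is hypothetical, it pretends there is a zone cut at every label, and it ignores the query type a real resolver would use at each step (RFC 7816 describes the actual technique).

```python
def minimised_steps(qname):
    """Return (zone asked, name revealed) pairs for a minimizing resolver.

    Assumes a zone cut at every label, which real zones need not have.
    """
    labels = qname.rstrip(".").split(".")
    steps = []
    for depth in range(1, len(labels) + 1):
        # The servers for the parent zone are asked about one more label only.
        parent = labels[len(labels) - depth + 1:]
        zone_asked = ".".join(parent) + "." if parent else "."
        name_revealed = ".".join(labels[-depth:]) + "."
        steps.append((zone_asked, name_revealed))
    return steps


if __name__ == "__main__":
    for zone, name in minimised_steps("www.example.com"):
        print(f"servers for {zone!r} see a query for {name!r}")
```

Without minimization, every tier in that walk, including the root servers, would be sent the full `www.example.com.` name.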
Cloudflare is making a quite public commitment to a low-view service: they are deliberately not monetizing the query flow, not keeping logs, and not providing or sharing any information about what people ask, or where they come from, at the individual query level. Cloudflare provides limited information daily on the aggregated state of traffic, but nothing more. Given their wide-ranging presence worldwide, this is a large and welcome public service offering. However, it does raise some questions, perhaps best expressed at the microphone by Bert Hubert from PowerDNS: “Where is the regulatory oversight in this? Where is the GDPR and like legislation, or audit?” These are good questions, but the service is too new for us to understand all the long-term ramifications.
Sara Dickinson from Sinodun spoke separately to the software and service aspects, as a primary author in the field, and a software developer writing code.
To some extent, the community’s fragmentation into two privacy modes (DoH and DoT) doesn’t help, but some of it reflects different needs. For example, a bank may well want to use DoH to provide trusted name-to-address maps that secure the bank’s web services and remove some risk of phishing by reference to outside agencies (for web markup fetches, packages, or ancillary service points). That is a focused deliverable best done inside the context of binding client to server. DoT is a wider service offering, taking you to a public DNS service rooted inside a secure channel across all web browser bindings. You are still getting an outside worldview of DNS truth, but it’s now private between you and your DNS provider. As a side note, it is also absolutely vital in both cases that you use DNSSEC to validate what you are told.
Sara discussed the need for some kind of bootstrap for DoH, because right now you need to know the specific URI to bind to. This has to be built into the browser, but because browser code development is fragmented, it’s not clear what is going to emerge here. DoT already has this, and the Android ‘Pie’ implementation attempts to turn on automatically if it can detect the service is available (it can be hand-configured too). Sara’s talk also surfaced in the RIPE 77 meeting. It’s well worth following online as a slide pack or from a video feed, and I suspect we’re going to see a wider conversation about public governance issues in the DNS as a service for some time.
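To show why a DoH client needs that specific URI, here is a minimal Python sketch of how a query is packed into a GET request in the style of RFC 8484: the ordinary DNS wire-format message is base64url-encoded without padding and appended to the service URI as a `dns` parameter. The template `https://doh.example/dns-query` is a made-up placeholder, not a real service, and the helper names are hypothetical. The sketch only builds the URL and sends nothing over the network.

```python
import base64
import struct


def build_query(qname, qtype=1):
    """Build a minimal DNS query in wire format (qtype 1 = A, class IN)."""
    # Header: ID 0, flags 0x0100 (recursion desired), one question.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    question = b""
    for label in qname.rstrip(".").split("."):
        raw = label.encode("ascii")
        question += bytes([len(raw)]) + raw  # length-prefixed label
    question += b"\x00" + struct.pack("!HH", qtype, 1)  # root label, type, class
    return header + question


def doh_get_url(template, qname, qtype=1):
    """Return the GET URL carrying the query as an unpadded base64url 'dns' parameter."""
    wire = build_query(qname, qtype)
    encoded = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    return f"{template}?dns={encoded}"


if __name__ == "__main__":
    # Hypothetical service URI; a real client must be configured with the actual one.
    print(doh_get_url("https://doh.example/dns-query", "www.example.com"))
```

A client would then fetch that URL over HTTPS and expect an `application/dns-message` body in reply. The point for the bootstrap discussion is that nothing in the DNS name itself tells the client which URI template to use.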
In the lightning talks, Tony Finch from Cambridge University gave a very fast and entertaining talk on his own deployment of a service he wrote himself from the specifications, using NGINX (web software), Lua (a lightweight scripting language) and the ‘BIND’ DNS server. Tony has a campus-wide WiFi network with over 30,000 users, so providing this service meant he reached both people like him, who can hand-configure service bindings, and potentially others who have devices like Android Pie, which can be configured to work. Bert Hubert observed there are tuning issues here that go to battery life on the edge device, keeping TLS and HTTPS sessions active.
A simpler DNS
Bert Hubert spoke at IETF 101 earlier this year to the problem of the ‘DNS Camel’ which is a shorthand for the ‘last straw that breaks the camel’s back’ problem in the DNS.
For years now we have been adding standards, documents, guidelines and notes to a collection of both canonical and ephemeral work on ‘how the DNS works’, and it’s become quite hard to steer a course through this mess. Bert summarized it by noting that there are over 2,000 pages of RFCs (equivalent to a standard Chemistry formula handbook for the entire field of Chemistry as we know it) just for the DNS! This is crazy, and probably not helping us.
Read Bert Hubert’s The DNS Camel…
So, as well as complaining about the additional complexity (on average, two new pages from RFCs every IETF meeting), Bert decided to do something about it. He’s been working on a teaching and didactic framework he calls Hello-DNS: a website in the spirit of the W. Richard Stevens ‘TCP/IP Illustrated’ book, which exemplified a combination of simple, unoptimized C code, and worked examples using the ‘tcpdump’ tool to see actual TCP/IP packets off the Ethernet on your own computer. Bert has worked on a very small, standards-conforming, simple DNS system in C++ in only around 2,000 lines of code, which can be talked to, discussed, compiled, and run by anyone with a modern computer and a C++ compiler.
Insightful and entertaining session by @PowerDNS_Bert about saving the DNS camel at #OARC29. To learn more about DNS without reading the 2000 plus pages of RFCs go to https://t.co/ukKtNkuCfB pic.twitter.com/6KDGBEW0Vn
— Deteque (@deteque) October 13, 2018
The site is a fantastic step forward in documenting what really has to be known, from the ground up, to work in the DNS. It’s a real lasting contribution to the DNS, and I think it deserves wide support and propagation. Well done Bert!
There’s lots of life in DNS-OARC
In closing, I want to note that this was a remarkably effective DNS meeting, with around 200 participants, and a wealth of DNS operational experience and engagement.
DNS-OARC is well worth joining if you care about the DNS ecology in all its levels, standards, and operations. Different levels of membership exist, from individuals through to significant sponsors such as Verisign, which was a premium sponsor for this meeting, and the RIPE NCC, which facilitated the meeting as a pre-event to the RIPE 77 meeting.
Disclosure: I am a board member of DNS-OARC, now in my own right, but previously as the APNIC representative.