I was invited to participate in a session at the 2022 Internet Governance Forum (IGF 2022) that was devoted to the workings of the DNS. I’d like to share my contribution to this session with my thoughts on where the DNS is headed.
The session brief reads: “The DNS is receiving increased attention from policy-makers and standards-setting bodies for its central role in the functioning of the Internet. From the DNS4EU proposal, which seeks to create an EU-based recursive DNS service, to local and regional conversations about the potential impacts of DNS encryption, domain names infrastructure and governance have become new sources of contention. But what does the data say on these issues? And perhaps as importantly, what data is missing to develop evidence-based policies around the DNS that protect users’ trust on the Internet?”
The DNS lies in a relatively obscure part of the Internet. Unlike browsers and the World Wide Web, or social network applications, the DNS is not exactly prominent, or even visible to users. The DNS name resolution protocol operates in a manner that confounds even the end client. It is extremely challenging to trace where and why DNS queries are propagated through the DNS infrastructure, and where DNS answers come from and why.
The simple question of who gets to see your online activity, in the guise of you and your DNS queries, is often very challenging to answer. Yet, even though its inner workings are obscure to the point of impenetrability, as a protocol its task is simple: the DNS resolution system takes names and translates them into network addresses. All this might seem innocuous enough, but there are a few aspects of this function that have been used and abused by many over the years, and this lies at the heart of today’s issues with the DNS.
This particular protocol can trace its history back to the 1970s. The initial specification of the protocol was published in the RFC series some 35 years ago, in 1987, as RFC 1034 and RFC 1035. It was based on earlier work on the specification of data objects used to query name servers, which was initially published in 1978 as Internet Experimental Note 61.
The DNS followed the pattern used by many other network protocols of the time, in that it was open. That is to say, its payload, who is asking and what name they are asking about, was not encrypted. It was also trustful, in that it did not bother to authenticate whom it was talking to, and a client simply believed whatever answers were elicited by its query.
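To make the point about openness concrete, here is a minimal sketch of how a query is laid out on the wire under RFC 1035. The function name `build_query`, the fixed query ID, and the example name are my own illustrative choices, and the sketch only builds the bytes without sending anything.

```python
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Encode a minimal DNS query in the RFC 1035 wire format.

    Illustrative helper, not taken from any library.
    """
    # Header: ID, flags (only the RD bit set), QDCOUNT=1, other counts zero.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A record) and QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


packet = build_query("www.example.com")
# The labels of the queried name sit in the packet as plain ASCII.
print(packet)
```

Anyone on the path who can see this packet, together with the source address of the datagram that carries it, can read off both who is asking and what they are asking about.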
In defence of what today would be considered an obvious shortcoming, at that time we weren’t constructing the final version of a future global communications infrastructure. This was just a small-scale experiment in packet networking. The DNS protocol as it emerged 35 years ago was, in retrospect, overly trustful to the point of being naively gullible, and any determined adversary that intruded upon the DNS query traffic could observe and tamper. But this was a research project. Why would anyone ever want to do this in any case?
When the Internet started to assume a more central role in the public communications realm, the DNS came along with it and quickly became a point of vulnerability. If I could see your DNS queries and tamper with DNS answers then I could misdirect you, or I could claim that sites and servers did not exist when in fact they did. I could poison your cache with gratuitous information in DNS responses that you were prepared to believe. In all this, you would be none the wiser because, as we have already noted, the inner workings of the DNS are totally opaque to its users.
However, tampering with the DNS is not just a tool for bad actors and bad actions. Many regimes have used their regulatory and judicial powers to compel Internet Service Providers (ISPs) to actively censor the DNS by intercepting queries for certain DNS names and synthesizing a DNS response that claims that the name does not exist or misdirects the end user to a different service point. This is very widespread today. But perhaps more disturbing, at least for some members of the technical community that form the core of the IETF (RFC 7258), were the Snowden revelations of 2013, which showed that the Internet was being used by a number of national agencies, including some US agencies, to perform mass surveillance. Everything that happens online starts with a call to the DNS. Everything. If I were able to observe your DNS query stream, then there would be no secrets left for you. I really would know everything you are doing online and with whom!
The technologist’s response to the Snowden papers has been to erect a new set of protections around the DNS. DNS messages are encrypted, sources of DNS information are authenticated, DNS queries are trimmed of all extraneous information, and DNS content is verifiable. Tampered DNS responses can be recognized as such and discarded. These days we are looking at perhaps the most complete measures with two-layer obfuscation, such that no single party can correlate who is asking and the name that they are asking about. It’s not that such information is well hidden — it’s that it does not exist in any such form anymore once it leaves the application on the user’s device. What exactly is ‘DNS data’ being referred to in the session brief in this obfuscated world? Where might we find it? The answer is that there is none!
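One of these measures, the trimming of queries, is commonly known as query name minimisation. The following is a rough sketch of the idea, under the simplifying assumption that every label boundary is also a zone cut (which is not true in general). The function name `minimised_walk` is hypothetical and the code models only what each level of server gets to see, not a working resolver.

```python
def minimised_walk(name: str) -> list[tuple[str, str]]:
    """List (zone asked, name revealed) pairs for a minimised resolution.

    Simplified model: assumes a zone cut at every label boundary.
    """
    labels = name.rstrip(".").split(".")
    steps = []
    for depth in range(1, len(labels) + 1):
        # The zone being asked is the parent of the name being revealed.
        zone = ".".join(labels[-(depth - 1):]) + "." if depth > 1 else "."
        revealed = ".".join(labels[-depth:]) + "."
        steps.append((zone, revealed))
    return steps


for zone, revealed in minimised_walk("www.example.com"):
    print(f"servers for {zone!r} are asked only about {revealed!r}")
```

Without this trimming, the resolver would send the full name to the servers at every level. With it, the root and top-level domain servers learn only the part of the name that they need in order to provide a referral.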
The result is that DNS is going dark. Very dark.
It’s unclear what this means in the long run. Do bad actions and actors go undetected? Do we lose our visibility into network management? What is a ‘secure’ network and how do we secure it using traditional techniques of network perimeter traffic inspection when all the network traffic is opaque? If we can’t see inside the DNS anymore, then how can we tell if (or when) the DNS has been captured by one or two digital behemoths? How can public policymakers, market regulators and market actors assess the competitive ‘health’ of the DNS as an open and efficient market for providers and consumers when the market itself heads into deliberately dimmed obscurity?
There is much to consider about whether the reaction to the originally perceived abuse is causing its own set of issues that are commensurate with the original trigger issues that started us down this path.
Already, DNS query data is incredibly hard to find. It’s easy to talk about the provisioning part of the DNS, but extraordinarily hard to find out how the DNS is being used. I know this only too well as a researcher in this space. The privacy implications are just too great to make this data available, and obfuscating it makes it largely useless! Our efforts have had some limited success in exposing query patterns and behaviours but it’s a window that is shutting down day by day.
We are heading towards an outcome where there will be nothing left to see in the DNS — no data, nothing! And in my view, no policy or regulation can materially alter this trajectory. What we are talking about here are the actions and behaviour of applications. Trying to exercise some regulatory cost on the way that the DNS protocol behaves is about the same in my mind as attempting to regulate the fine-grained behaviour of Microsoft Word or the Chrome browser.
In many ways, it has been a convenient coincidence of motives for both the large operators in today’s Internet (Google, Apple, and so on) and (their perception of) user preference that there is a newfound regard for privacy. With the ascendency of the application level as the dominant factor in the Internet ecosystem, there is a strong aversion by applications to allow the network or platform to gain any insight at all into the behaviour of the application, or the content of the application’s transactions. The QUIC protocol is a good example of loading the entire function of transport and content drivers into the application and hiding absolutely everything from the platform and the network.
The DNS is heading in the same direction where, with tools such as resolverless DNS over HTTPS (DoH) and DNSSEC, we can remove end user DNS queries entirely and have the server pre-provision DNS information via server push. If you had thought of the DNS as a common piece of network-level infrastructure, then that view is being superseded by the view of the DNS as an application artefact.
The implications of this combination of increased opacity in the DNS and a shift from common infrastructure to application artefact will inevitably head into the consideration of splintering and fragmentation, as applications customize their view of the space of names for their individual purposes. There is the prospect, admittedly a distant one currently, of declining residual value in a common general-purpose namespace. As all this operates behind a veil of encrypted and obscured DNS traffic, it is going to be highly challenging to try and prevent such market forces of destructive entropy from forcing an inevitable outcome here on the Internet as a whole.
As for the question of what data is missing to develop evidence-based policies around the DNS that protect users’ trust on the Internet, for me the answer is not exactly encouraging. We really have no commonly available data to use for this purpose today. The pressures for ever-increasing diligence in the handling of such collected data, and the shift to more effective encryption and obfuscation in DNS queries, provide more than ample disincentives to collect and disseminate such data to policymakers in the future.