CZ.NIC labs is running a survey to determine the most useful features in a DNS resolver, and its findings are relevant to the APNIC community.
DNS resolvers are constantly adding features while not removing any, but this trend cannot continue indefinitely because the software would eventually break under its own weight. Which features are used in practice and which can be safely removed? We present preliminary results of a survey among DNS resolver administrators, and also invite readers to participate in a cross-vendor survey, which is open until Tuesday, 30 June 2020.
Why vendors need feedback
The DNS protocol has been with us for 33 years now and its complexity is daunting: its specification has grown from 132 pages in 1987 to 3000+ pages nowadays, and it keeps growing! DNS software offers several vendor-specific features as well, which adds even more complexity. All this complexity, in turn, makes manuals longer, configuration more error-prone, and software more buggy and less reliable.
In theory, vendors might decide to remove obsolete features and code, making the appearance of bugs less likely … if they only knew which features are actually used by their users. Getting rid of obsolete code would help both parties, but exactly this kind of feedback from administrators is missing. Vendors who try to be conservative keep adding options while not removing anything, which is obviously not a feasible long-term strategy.
How does this historical baggage look in practice? Let’s have a look at the documentation for the various software packages, estimate the number of options, and compare the total number of options with usage indicated by the 120 detailed survey responses received so far.
$ man named.conf | sed -e 's/ //g' | sort -u | wc -l
BIND 9.16's named.conf supports more than 400 options. Many of these can be used in several contexts (global, view, zone), and they interact with one another as well as with the authoritative DNS server that is part of BIND. The survey data shows that only 65 of these 400 options are used in practice.
$ man unbound.conf | grep '^ *[a-zA-Z0-9_-]*:' | sed -e 's/ //g' -e 's/:.*$//' | sort -u | wc -l
Unbound 1.10.0's unbound.conf supports more than 230 options, but so far the survey shows only 30 options in active use, possibly because only a few Unbound administrators have voluntarily submitted their configuration in the survey.
The candidate for the most obscure option not yet seen in the survey responses is: “dlv-anchor”.
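One way to arrive at candidates like this is to compare the documented option names with those seen in submitted configurations. The following is only a minimal sketch of that comparison, not the survey's actual tooling: the function name `unused_options` and both input files (one option name per line) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: print documented options that never appear in survey responses.
# Both arguments are hypothetical text files with one option name per line.
unused_options() {
    documented="$1"   # e.g. option names extracted from the man page
    observed="$2"     # e.g. option names collected from submitted configs
    tmp_doc=$(mktemp) || return 1
    tmp_obs=$(mktemp) || { rm -f "$tmp_doc"; return 1; }
    # comm needs sorted input, so sort and deduplicate both lists first
    LC_ALL=C sort -u "$documented" > "$tmp_doc"
    LC_ALL=C sort -u "$observed" > "$tmp_obs"
    # -2 hides lines found only in the second file, -3 hides common lines,
    # which leaves options that are documented but were never observed
    LC_ALL=C comm -23 "$tmp_doc" "$tmp_obs"
    rm -f "$tmp_doc" "$tmp_obs"
}
```

Piping the result through `wc -l` would give the number of documented options that no respondent has reported so far.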
PowerDNS Recursor estimate:
$ pdns_recursor --help | fgrep -- -- | wc -l
PowerDNS Recursor 4.3.0 comes with 152 options in its configuration file, which is a significantly lower number, but it also has a built-in Lua interpreter for configuration, making its configuration file Turing-complete.
Currently, the survey does not have enough data from PowerDNS users, but it has a candidate for the most obscure option: “distribution-pipe-buffer-size”.
Knot Resolver 5.1.1 is the newest kid on the block, but its innocent-looking configuration file is practically a Lua program with infinite possibilities. So far the survey data shows that some users actually do use Lua for scripting their own functions inside the resolver, but the majority of respondents use only pre-baked functions shipped with the software. This prompts the question of whether the Lua configuration is worth the complexity, or whether it can be replaced with something more user-friendly.
The candidate for the most obscure option not yet seen in the survey responses is: modules.unload('detect_time_jump').
As you can see, all four implementations have vast configuration possibilities. That is a lot of code to maintain and test, especially as the features often interact with each other. At the same time, our survey suggests that several options might not be used, which means it may be possible to remove historical baggage. Please participate in the survey: it will help to determine which obsolete parts should be removed to eliminate bugs and simplify configuration.
Hopefully it is now clearer why vendors need your feedback!
How bad is the lack of feedback?
To illustrate the scale of the problem let’s make a back-of-the-envelope estimate to see how many operators give feedback to DNS resolver vendors.
Guess no. 1: Number of people talking to vendors
Here we use public sources to estimate the number of people who actually talked to their vendors.
All four projects have public mailing lists, so we can download archives and use a couple of regexes to get the number of unique email addresses:
$ grep -o -h '^From [^ ]\+ at [^ ]\+ ' *-users/2019-*.txt | sort -fu | wc -l
This gives us 534 email addresses including contributors working on all four projects.
Also all four projects have public bug trackers, so we can count users who reported issues or commented on them in 2019. To make this task feasible we will simplify the analysis:
- We will not attempt to subtract interactions by vendors themselves.
- BIND and PowerDNS do not separate their repositories for the recursor and authoritative server, so we will count all communication in these two repositories.
- We will not deduplicate people between GitHub and private GitLab instances used by different vendors.
A simple script based on the GitHub API and GitLab CSV export produces 400 accounts posting at least one comment on public trackers in 2019 (certainly an overestimate).
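The counting rule described above (unique accounts within each tracker, no deduplication across trackers) can be sketched as follows. This is not the script used for the article: the function name `count_commenters` and the input format (one hypothetical text file per tracker, with one account name per comment) are assumptions, and extracting those names from the GitHub API or a GitLab CSV export is left out.

```shell
#!/bin/sh
# Sketch: count accounts with at least one comment, per tracker, then sum.
# Each argument is a hypothetical file listing one account name per comment.
count_commenters() {
    total=0
    for tracker_file in "$@"; do
        # deduplicate accounts within a single tracker only
        per_tracker=$(sort -u "$tracker_file" | wc -l | tr -d ' ')
        # an account active on two trackers is deliberately counted twice,
        # matching the "no deduplication across instances" simplification
        total=$((total + per_tracker))
    done
    echo "$total"
}
```

Because accounts are never merged across trackers and vendor employees are not subtracted, the result can only overestimate the number of distinct people.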
Lastly we also need to count customers talking to vendors in private, which is much harder to do. Luckily ISC publishes a detailed annual report, which reveals that roughly 100 more customers could be talking to ISC in private. Other vendors do not publish these numbers so let’s extrapolate the ISC numbers to all four vendors and add 400 more people talking in private.
Finally we can summarize the total number of people talking to vendors in 2019:
- Public mailing lists = 530 (including vendor employees)
- Public bug trackers = 400 (without deduplication across projects)
- Estimated number of customers talking in private = 4 * 100 = 400 (extrapolated from ISC)
The total is 1,330 people, which is almost surely an overestimate… but how does it compare with the number of DNS resolver operators?
Guess no. 2: Number of operators
It is impossible to obtain a precise number, but we can establish a range of possible values.
The very conservative lower bound could be the number of Autonomous Systems (ASes) in use on the Internet. At the moment, roughly 67,000 operators care enough about Internet infrastructure to run their own AS, and are thus likely to operate other essential Internet services like a DNS resolver.
The upper bound is much harder to establish. If we limited ourselves to recursive DNS resolvers we could base our guess on the number of unique IP addresses sending DNS queries to the DNS root over a period of one day, but the number of unique source IP addresses calculated over all root server instances is not available. We need to resort to independent statistics of each root server operator. From these we can see that L-root seems to have the highest number of unique source IP addresses seen during a day, varying at around eight million.
This gives us a very broad range, from 67,000 to 8,000,000.
What portion of operators talk to vendors?
Finally we can estimate how many operators talked to a DNS vendor in 2019:
The upper bound figure, (1,330 people talking to vendors in 2019)/(67,000 ASes), stands at roughly 2%.
The lower bound figure, (1,330 people talking to vendors in 2019)/(8,000,000 source addresses), stands at 0.017%.
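The arithmetic behind these two figures can be reproduced with a few lines of shell. This is just a sketch using the estimates quoted in this article; the variable names are made up for illustration.

```shell
#!/bin/sh
# Estimates taken from the text above (all of them rough guesses)
talking=$((530 + 400 + 4 * 100))  # mailing lists + bug trackers + private customers
as_count=67000                    # lower bound on operators: number of ASes
root_sources=8000000              # upper bound on operators: daily L-root source IPs

# awk handles the floating-point division; LC_ALL=C keeps "." as the decimal separator
upper_share=$(LC_ALL=C awk -v t="$talking" -v n="$as_count" \
    'BEGIN { printf "%.1f", 100 * t / n }')
lower_share=$(LC_ALL=C awk -v t="$talking" -v n="$root_sources" \
    'BEGIN { printf "%.3f", 100 * t / n }')

echo "people talking to vendors: $talking"
echo "share of operators: ${lower_share}% to ${upper_share}%"
# prints:
#   people talking to vendors: 1330
#   share of operators: 0.017% to 2.0%
```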
We can conclude that only 0.017% to 2% of operators talked to a DNS resolver vendor in 2019, so vendors lack feedback and are left with the following options:
- Keep maintaining all features, including unused ones, thus producing software that has more bugs and is harder to configure for everyone.
- Remove features that are not used by the small fraction of the ‘talking’ user population, possibly removing features other users depend on.
If neither of these options sounds appealing to you, participate in the survey and help us fix that!
Better late than never
The survey is open until 30 June 2020 and gives you an opportunity to tell vendors which features users need and which should therefore not be removed. It also lets you describe your wishes for further development of configuration interfaces, such as DNS-over-TLS or DNS-over-HTTPS support.
Of course, a manual survey based on a web page with forms has significant limitations, so the survey itself also touches on the possibility of built-in ‘call home’ features, which could automate future surveys.
Petr Špaček devotes his professional life to the DNS and leads the Knot Resolver project at CZ.NIC labs.