In the first post of this series, we discussed how ‘natural fragmentation’ (UDP DNS fragmentation not triggered by malicious attacks) can appear in the Internet, but it is rare and does not create problems for Domain Name System (DNS) infrastructure.
In this post, we will demonstrate how malicious actors can force UDP DNS answers to fragment so they can inject forged DNS data into DNS resolver caches and highlight what percentage of servers are at risk.
How malicious actors can force UDP DNS answers to fragment
For such an attack to succeed, a malicious actor needs to find a DNS response in the victim DNS zone that is large enough to fragment and that contains information valuable enough to make it an attractive target.
One possible target is the set of TXT records stored at the start of a DNS zone (the zone apex). These TXT records store valuable information such as the Sender Policy Framework (SPF) record used for email sender authentication, and often many security tokens used to authenticate the domain owner against cloud and web services. These TXT record sets are often large; larger than the Maximum Transmission Unit (MTU) used in Ethernet networks (1,500 bytes).
In cases where the authoritative DNS server for such a domain operates with the older (pre-2020 DNS flag day) EDNS settings, there is a risk that these responses will be fragmented. Attackers could target the fragmented DNS answer to spoof the SPF record information (to be able to send spam or phishing emails from the victim’s domain name) or to change the security authorization tokens and take over a cloud service.
During our research, we found a domain, owned by a company in the Fortune Top-50 list, that had an excessive number of Name Server (NS) records. The list of NS records is used by DNS resolvers to find the authoritative DNS servers when following the DNS delegation path. In this case (which has since been fixed), the UDP DNS answer containing the list of NS records was larger than 1,500 bytes and therefore had a high risk of being spoofed.
The NS record set is a high-value target for attackers, as spoofing it permits redirecting the whole DNS zone to a DNS server under the control of the attacker.
What if a victim’s DNS zone doesn’t have overly large DNS resource records?
Unfortunately, an attacker can still create a situation where DNS UDP fragmentation happens on normal-sized DNS records.
To understand how such an attack can take place we need to discuss how fragmentation error signalling works in the Internet.
Whenever a router in the Internet receives an IP packet that is too large to be forwarded to the destination, and the ‘don’t fragment’ bit is set on that packet, it discards the packet and responds with an Internet Control Message Protocol (ICMP) error message (Type 3, Code 4 “Fragmentation Needed and Don’t Fragment was Set” in IPv4). This ICMP error message contains the maximum permitted packet size on this network path. The sending host then lowers the MTU for this path and starts fragmenting UDP DNS messages that are larger than the MTU announced in the ICMPv4 error message.
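To make the message layout concrete, here is a minimal Python sketch that reads the announced MTU out of such an ICMPv4 error message. It assumes the RFC 1191 layout (type, code, checksum, two unused bytes, then a 16-bit next-hop MTU). The function names and the all-zero sample payload are illustrative only and are not taken from the study.

```python
import struct

ICMP_DEST_UNREACH = 3  # ICMPv4 type: Destination Unreachable
ICMP_FRAG_NEEDED = 4   # code: Fragmentation Needed and Don't Fragment was Set


def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement Internet checksum of data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_sample_frag_needed(next_hop_mtu: int, quoted: bytes) -> bytes:
    """Build sample message bytes for the parser below (nothing is sent)."""
    header = struct.pack("!BBHHH", ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
                         0, 0, next_hop_mtu)
    checksum = internet_checksum(header + quoted)
    return struct.pack("!BBHHH", ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
                       checksum, 0, next_hop_mtu) + quoted


def parse_frag_needed(message: bytes):
    """Return the announced next-hop MTU, or None for any other message."""
    if len(message) < 8:
        return None
    icmp_type, code, _checksum, _unused, mtu = struct.unpack("!BBHHH", message[:8])
    if (icmp_type, code) != (ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED):
        return None
    return mtu


# 28 zero bytes stand in for the quoted IP header plus 8 bytes of the
# original datagram that a real router would copy into the error message.
sample = build_sample_frag_needed(552, bytes(28))
print(parse_frag_needed(sample))  # 552
```

Note that nothing in this layout identifies or authenticates the router that produced the message; the receiver only sees the quoted header of the packet that supposedly triggered the error.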
Unfortunately, ICMP error messages are not authenticated. They can be sent by anyone on the Internet with any source address. If an attacker wants to lower the MTU between an authoritative DNS server and a DNS resolver, all they need to do is send a carefully crafted ICMPv4 error message to the authoritative DNS server that looks like it comes from the DNS resolver. Once the operating system (OS) of the authoritative DNS server lowers the path MTU between itself and the resolver, the DNS answers will be fragmented at the source (the authoritative server) and the attacker can launch a DNS fragmentation attack.
Is it possible to trick a modern operating system into lowering the path MTU from the outside?
To answer this question, we created the following testbed setup (Figure 1), where we used the Scapy toolkit (a Python toolkit for assembling IP packets from building blocks) to send spoofed ICMPv4 error messages toward an authoritative DNS server running on various OSes.
The authoritative DNS server had a resource record set just below 1,232 bytes. Under normal conditions, a DNS response containing this resource record set would not fragment. We confirmed that assumption and then tested again after sending the spoofed ICMPv4 and ICMPv6 error messages.
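The effect of a lowered path MTU on such a response can be worked out with a little arithmetic. The sketch below is only an illustration: it assumes a 1,230-byte DNS message as a stand-in for ‘just below 1,232 bytes’, a 20-byte IPv4 header without options, and a hypothetical helper name.

```python
from typing import List

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def ipv4_fragment_payloads(dns_message_size: int, path_mtu: int) -> List[int]:
    """Return the IP payload size of each fragment of one UDP DNS answer."""
    datagram = UDP_HEADER + dns_message_size
    if IPV4_HEADER + datagram <= path_mtu:
        return [datagram]  # fits in a single packet, no fragmentation
    # Every fragment except the last carries a multiple of 8 payload bytes,
    # because the IPv4 fragment offset is counted in 8-byte units.
    per_fragment = (path_mtu - IPV4_HEADER) // 8 * 8
    sizes = []
    remaining = datagram
    while remaining > per_fragment:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)
    return sizes


print(ipv4_fragment_payloads(1230, 1500))  # [1238]
print(ipv4_fragment_payloads(1230, 552))   # [528, 528, 182]
```

With the usual Ethernet MTU the answer travels in one packet, while a path MTU of 552 bytes splits the same answer into three fragments.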
|Operating system||minMTU IPv4||minMTU IPv4 effective||minMTU IPv6||Success in IPv4||Success in IPv6|
|Debian 6 / Kernel 2.6.32-5-amd64||552||552||1,280||x||x|
|Ubuntu 14.04.1 / Kernel 3.13.0-45-generic (12/2014)||552||552||1,280||x||x|
|Ubuntu 14.04.1 LTS / Kernel 3.13.0-170-generic (05/2019)||552||552||1,280||x||x|
|Ubuntu 16.04.6 LTS / Kernel 4.4.0-184-generic (06/2020)||552||552||1,280||x||x|
|Ubuntu 18.04.4 LTS / Kernel 4.15.0-106-generic (06/2020)||552||1,500||1,280||-||x|
|CentOS 6 / Kernel 2.6.32-504.3.3.el6.x86_64 (12/2014)||552||552||1,280||x||x|
|CentOS 7 / Kernel 3.10.0-1127.10.1.el7.x86_64 (06/2020)||552||1,500||1,280||-||x|
|CentOS 8 / Kernel 4.18.0-147.8.1.el8_1.x86_64 (04/2020)||552||1,500||1,280||-||x|
|SUSE EL 15SP1 / Kernel 4.12.14-197.45-default (06/2020)||552||1,500||1,280||-||x|
|FreeBSD 12.1 / Kernel 12.1-RELEASE r354233 GENERIC amd64||1,500||1,500||1,280||-||x|
|OpenBSD 6.7 / Kernel 6.7 GENERIC#234 i386||1,500||1,500||1,280||-||x|
|Windows Server 2008R2||1,500||1,500||1,280||-||x|
|Windows Server 2012R2||1,500||1,500||1,280||-||x|
|Windows Server 2016||1,500||1,500||1,280||-||x|
|Windows Server 2019||1,500||1,500||1,280||-||x|
Table 1 — Operating system fragmentation test results.
The result was that the path MTU between the authoritative DNS server and the DNS resolver could be lowered to the minimum permitted MTU of 1,280 bytes for IPv6 on all OSes. This is in line with the IPv6 specification (RFC 8200), which requires every link to support an MTU of at least 1,280 bytes.
On IPv4, the picture looks different for older and newer Linux systems.
On Linux systems using older Linux kernel versions (4.12 and below), it was possible to lower the path MTU to 552 bytes. More recent Linux kernel versions did report the lower path MTU in the kernel’s routing table; however, the kernel network stack did not fragment UDP messages smaller than 1,500 bytes, which mitigates this type of attack.
The then-current releases of OpenBSD and FreeBSD, as well as all Windows Server Operating Systems (2008R2 to 2019), would not lower the path MTU for IPv4 after receiving the ICMP error messages from the attacker.
The problematic older Linux kernel versions are used in Long Term Support (LTS) Linux distributions and are still supported today (some until 2024).
How many authoritative DNS servers in the Internet run on older Linux operating systems that are vulnerable to ICMP Path-MTU spoofing?
To answer this question, we again used the OpenINTEL project to query the version information of BIND 9 DNS servers (a query for a TXT record in the CHAOS class for the domain ‘version.bind’). On Linux systems, the version string returned often encodes the Linux operating system version alongside the BIND 9 version number, as shown below:
% dig ch txt version.bind @ns1.example.com. +short
"9.8.2rc1-RedHat-9.8.2-0.68.rc1.el6.11.cloudlinux.els"
In the example above, the authoritative server is running BIND 9.8.2rc1 on Red Hat Enterprise Linux 6.11.
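A fingerprinting step of this kind can be sketched in a few lines of Python. This is not the tooling used in the study; the regular expressions, the function name and the Ubuntu sample string are assumptions made for illustration, and real-world version strings vary far more than this.

```python
import re

# Leading BIND version, for example "9.8.2rc1" (stops at the first "-").
BIND_VERSION = re.compile(r"^(\d+\.\d+\.\d+[A-Za-z0-9]*)")
# Red Hat style release tag, for example ".el6" for Enterprise Linux 6.
EL_TAG = re.compile(r"\.el(\d+)")


def parse_version_bind(answer: str):
    """Return (bind_version, os_hint); either element may be None."""
    text = answer.strip().strip('"')
    version_match = BIND_VERSION.search(text)
    bind_version = version_match.group(1) if version_match else None

    os_hint = None
    el_match = EL_TAG.search(text)
    if el_match:
        os_hint = "el" + el_match.group(1)
    elif "ubuntu" in text.lower():
        os_hint = "Ubuntu"
    return bind_version, os_hint


sample = '"9.8.2rc1-RedHat-9.8.2-0.68.rc1.el6.11.cloudlinux.els"'
print(parse_version_bind(sample))  # ('9.8.2rc1', 'el6')
```

Strings that carry no recognizable version or distribution marker yield `(None, None)`, which is one reason such a measurement can only give a lower bound.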
We are well aware that careful DNS administrators often disable answers to this version query, but the measurement still gave us a lower bound on the number of potentially vulnerable DNS servers in the Internet.
We selected Red Hat Enterprise Linux and Ubuntu Linux as the two most popular long-term-supported Linux versions that are used to host authoritative DNS servers. Table 2 shows what we found when fingerprinting DNS servers for their OS and version string.
|Linux OS||Number of servers||Percent of total DNS servers seen|
Table 2 — DNS operating systems in production.
At least 14% (RedHat EL5 and EL6, Ubuntu 14.04 and 16.04) of all authoritative DNS servers seen in this study were vulnerable to ICMP-based MTU spoofing attacks, which could be misused to launch subsequent DNS fragmentation attacks.
Why isn’t there a CVE or patch?
The fact that the Linux kernel lowers the path MTU is not a security issue (it is mandated by the IPv4 Internet Standards). Only in combination with the DNS does it become a security threat. But because it is not a security threat alone, there is no CVE and no patch for these older Linux systems.
In our final post of this series, we will discuss possible mitigation tactics for DNS fragmentation attacks.
Read the full study, available for free in English.
Study contributors: Roland van Rijswijk-Deij (NLnet Labs), Patrick Koetter and myself (sys4), and Markus de Brün and Anders Kölligan (BSI).
Carsten Strotmann is a DNS/DHCP/IPv6/Linux/Unix security trainer for Linuxhotel, Men & Mice, and Internet Systems Consortium (ISC).
Please note that protection against spoofed ICMPv4 error messages is NOT the default behavior of newer Linux kernel versions. The DNS server also needs to explicitly enable a newer socket option (IP_PMTUDISC_OMIT, available in Linux 3.15 and newer) on its UDP sockets.
Recent versions of many open source DNS servers enable this option (if it is available on the running Linux platform), but not all do. You should check whether your DNS server software enables this option.
As far as I know, the DNS servers that enable IP_PMTUDISC_OMIT are:
BIND 9.9.10 and newer
NSD 4.1.27 and newer
Unbound 1.5.2 and newer
Knot DNS 2.8.2 and newer
PowerDNS 4.2.0 and newer
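For readers who want to see what enabling this option looks like, here is a minimal Python sketch. It is not taken from any of the servers listed; the helper name is hypothetical, and the numeric fallbacks (10 and 5) are the values from the Linux header linux/in.h, used in case the Python socket module does not export the constants.

```python
import socket
import sys

# Fallback values from linux/in.h (assumption: only meaningful on Linux).
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_OMIT = getattr(socket, "IP_PMTUDISC_OMIT", 5)


def udp_socket_ignoring_path_mtu():
    """Create a UDP socket and try to enable IP_PMTUDISC_OMIT on it.

    Returns (sock, enabled). enabled is False on non-Linux platforms or
    when the kernel rejects the option (for example, kernels before 3.15).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    enabled = False
    if sys.platform.startswith("linux"):
        try:
            sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_OMIT)
            enabled = True
        except OSError:
            enabled = False
    return sock, enabled


sock, enabled = udp_socket_ignoring_path_mtu()
print("IP_PMTUDISC_OMIT enabled:", enabled)
sock.close()
```

As I understand the option, a socket configured this way ignores path MTU information learned from ICMP and sizes packets by the interface MTU instead, so a spoofed error message should no longer cause answers on that socket to be fragmented at the source.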