IP fragmentation is a process that breaks large packets into smaller packets to allow them to more easily traverse a network. The process is common in the DNS, which is predominantly UDP based.
Recently, a group in the IETF proposed TCP with Path MTU Discovery (PMTUD) as an alternative to IP fragmentation. PMTUD allows TCP to adjust the size of the TCP segments so that together with the IP headers they do not exceed the maximum transmission unit (MTU).
In the wake of this alternative, there are recommendations to send DNS packets over TCP instead of UDP, as TCP is believed to be resistant to IP fragmentation attacks. (For DNS Flag Day 2020, the agreed default EDNS buffer size was 1,232 bytes, which implies that larger responses are sent over TCP.)
However, in a recent study, we at Fraunhofer SIT found that IP fragmentation attacks also apply to packets over TCP. Responses from at least 393 additional domains’ nameservers can be exploited for IP fragment misassociation attacks via source fragmentation. Worryingly, the attack surface is potentially even larger; over one thousand intermediate routers in the Internet have a small next-hop MTU, which causes packets that traverse them to get fragmented even when fragmentation is not performed by the source.
IP fragmentation misassociation
Attackers exploit fragmentation by interfering with the IP reassembly process: they send a spoofed fragment to the victim so that it is misassociated with genuine fragments. When the spoofed fragment is reassembled with the genuine fragment from the server, the result is either an invalid packet, which the client discards (causing a DoS attack against the target service), or a valid packet that carries the malicious payload from the spoofed fragment.
Forcing servers to fragment communication over TCP allows adversaries to circumvent the entropy in TCP segments, such as the Sequence Number (SN) and source port, which are otherwise not possible to guess efficiently. Figure 1 is an example of a DNS packet over TCP in two IP fragments. Both TCP entropy and DNS entropy are in the first fragment, while the payload is in the second fragment.
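To make the misassociation concrete, here is a minimal sketch of a reassembly buffer. It is a toy model, not any operating system's implementation: the `Fragment` and `ReassemblyCache` names are hypothetical, the cache assumes the first fragment seen at an offset wins, and it ignores checksums and reassembly timers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    src: str
    dst: str
    proto: int
    ipid: int
    offset: int   # byte offset of this fragment's data in the original packet
    more: bool    # True if more fragments follow (the MF flag)
    data: bytes


class ReassemblyCache:
    """Toy reassembly buffer keyed by (src, dst, protocol, IPID)."""

    def __init__(self):
        self.buffers = {}

    def receive(self, frag):
        key = (frag.src, frag.dst, frag.proto, frag.ipid)
        parts = self.buffers.setdefault(key, {})
        # Assumption: the first fragment stored at an offset is kept.
        parts.setdefault(frag.offset, frag)
        return self._try_reassemble(key)

    def _try_reassemble(self, key):
        parts = self.buffers[key]
        expected, payload = 0, b""
        for offset in sorted(parts):
            if offset != expected:
                return None            # a hole remains, keep waiting
            frag = parts[offset]
            payload += frag.data
            expected += len(frag.data)
            if not frag.more:          # last fragment reached, packet complete
                del self.buffers[key]
                return payload
        return None


def demo():
    cache = ReassemblyCache()
    flow = dict(src="nameserver", dst="resolver", proto=6, ipid=0x1234)
    # The off-path attacker plants a second fragment with a guessed IPID.
    spoofed_second = Fragment(offset=8, more=False, data=b"EVIL-RR!", **flow)
    # The genuine first fragment carries the TCP and DNS entropy.
    genuine_first = Fragment(offset=0, more=True, data=b"TCP+DNS:", **flow)
    genuine_second = Fragment(offset=8, more=False, data=b"GOOD-RR!", **flow)
    planted = cache.receive(spoofed_second)
    delivered = cache.receive(genuine_first)
    leftover = cache.receive(genuine_second)
    return planted, delivered, leftover
```

In this model the attacker never needs the sequence number or the source port: both sit in the genuine first fragment, and only the IPID in `flow` has to match.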
How to force TCP to fragment
As a first step, the adversary needs to ensure that the packets are fragmented. This can be achieved at the source or at intermediate routers.
To cause a source to fragment, the adversary can send a spoofed ICMP fragmentation needed error message (type: 3, code: 4) to the target server, instructing it to fragment packets to the destination. The ICMP packet should contain the original IP header and the first eight bytes of the payload that triggered the error message, as well as the MTU of the router that sent the ICMP error (RFC 792). These first eight bytes cover the start of the transport-layer header: the ports and sequence number for TCP, or the entire header for UDP. The adversary can also send an ICMP echo reply, which does not embed any original header.
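The message layout described above can be illustrated with a small parser. This is a sketch that assumes an IPv4 packet embedded in the error and the common layout in which the next-hop MTU occupies the last two bytes of the eight-byte ICMP header; `parse_frag_needed` and `build_sample_frag_needed` are hypothetical helper names, and the sample builder only assembles bytes in memory for testing.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum used by ICMP."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_frag_needed(icmp: bytes) -> dict:
    """Extract the fields a receiving host would look at."""
    icmp_type, code, _cksum, _unused, next_hop_mtu = struct.unpack("!BBHHH", icmp[:8])
    if (icmp_type, code) != (3, 4):
        raise ValueError("not a fragmentation needed message")
    if internet_checksum(icmp) != 0:
        raise ValueError("bad ICMP checksum")
    inner = icmp[8:]                       # original IP header + first 8 payload bytes
    ihl = (inner[0] & 0x0F) * 4            # embedded IP header length in bytes
    protocol = inner[9]                    # 6 = TCP, 17 = UDP
    l4 = inner[ihl:ihl + 8]
    src_port, dst_port = struct.unpack("!HH", l4[:4])
    fields = {"next_hop_mtu": next_hop_mtu, "protocol": protocol,
              "src_port": src_port, "dst_port": dst_port}
    if protocol == 6:                      # bytes 4..7 of a TCP header are the SN
        fields["seq"] = struct.unpack("!I", l4[4:8])[0]
    return fields


def build_sample_frag_needed(next_hop_mtu, src_port, dst_port, seq) -> bytes:
    """Assemble a sample message for tests; addresses are documentation ranges."""
    inner_ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 1500, 0x1234, 0x4000, 64, 6,
                           0,  # embedded IP checksum left at 0, not checked here
                           bytes([192, 0, 2, 1]), bytes([198, 51, 100, 7]))
    inner_tcp = struct.pack("!HHI", src_port, dst_port, seq)
    body = inner_ip + inner_tcp
    header = struct.pack("!BBHHH", 3, 4, 0, 0, next_hop_mtu)
    checksum = internet_checksum(header + body)
    return struct.pack("!BBHHH", 3, 4, checksum, 0, next_hop_mtu) + body
```

The eight embedded transport bytes are all the receiver has for matching the error to a flow, which is why the ports and the sequence number matter for TCP.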
Since UDP is stateless, UDP headers in ICMP error messages are typically not checked against existing sockets. In contrast, the operating system should check TCP headers in ICMP packets against its open sockets to identify the socket to which the packet in the error message should be demultiplexed. For instance, if the port in the TCP header in the ICMP message does not match any current socket, or if the SN is wrong, the error message is ignored. Otherwise, the MTU for that specific destination is reduced, and the size of the TCP segments in that socket is reduced accordingly. This prevents fragmentation at the IP layer.
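The check described above might look roughly like the following sketch. It is a simplified model rather than any particular operating system's code: `TcpSocket` and `handle_frag_needed` are hypothetical names, the sequence number test assumes the stack accepts only values among the unacknowledged bytes, and the 40-byte overhead assumes IPv4 and TCP headers without options.

```python
from dataclasses import dataclass

HEADER_OVERHEAD = 40  # assumption: 20-byte IPv4 header + 20-byte TCP header


@dataclass
class TcpSocket:
    local_port: int
    remote_port: int
    snd_una: int  # oldest unacknowledged sequence number
    snd_nxt: int  # next sequence number to be sent
    mss: int


def seq_in_flight(seq: int, snd_una: int, snd_nxt: int) -> bool:
    """True if snd_una <= seq < snd_nxt, using 32-bit modular arithmetic."""
    return ((seq - snd_una) & 0xFFFFFFFF) < ((snd_nxt - snd_una) & 0xFFFFFFFF)


def handle_frag_needed(sockets, src_port, dst_port, seq, next_hop_mtu) -> bool:
    """Apply an ICMP fragmentation needed error to the matching socket.

    src_port and dst_port come from the TCP header embedded in the error,
    so src_port is the server's own port. Returns False if the error is
    ignored and True if the socket's MSS was adjusted.
    """
    for sock in sockets:
        if (sock.local_port, sock.remote_port) != (src_port, dst_port):
            continue
        if not seq_in_flight(seq, sock.snd_una, sock.snd_nxt):
            return False  # wrong SN: ignore the error
        sock.mss = min(sock.mss, next_hop_mtu - HEADER_OVERHEAD)
        return True
    return False  # no socket matches the ports: ignore the error
```

When this path works as intended, the segment size shrinks together with the MTU, so the IP layer has nothing left to fragment.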
Forcing TCP to fragment via off-path attacks is therefore hard: if the adversary uses invalid parameters inside the ICMP error message, it is ignored. If the parameters are correct, the information is passed on to the TCP socket, which updates the maximum segment size (MSS) correspondingly.
The challenge is, therefore, to create a situation where the MTU at the IP layer is lower than the MSS value the TCP socket uses. In other words, the adversary must cause the IP layer to update its MTU information without passing this information on to TCP.
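A short calculation shows what this mismatch produces. The sketch below assumes IPv4 with a 20-byte header and no options, a 20-byte TCP header, an MSS of 1,460 bytes, and an illustrative lowered MTU of 552 bytes; `fragment_layout` is a hypothetical helper, not part of any library.

```python
def fragment_layout(ip_payload_len: int, mtu: int, ip_header_len: int = 20):
    """Return (offset, length) pairs for the fragments of one IP payload.

    Every fragment except the last must carry a multiple of 8 bytes,
    because the IP fragment offset field counts 8-byte units.
    """
    room = mtu - ip_header_len
    if ip_payload_len <= room:
        return [(0, ip_payload_len)]       # fits, no fragmentation
    per_fragment = room // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any payload")
    layout, offset = [], 0
    while offset < ip_payload_len:
        length = min(per_fragment, ip_payload_len - offset)
        layout.append((offset, length))
        offset += length
    return layout


TCP_HEADER = 20
stale_mss = 1460     # TCP was never told about the smaller MTU
updated_mss = 512    # what TCP would use after a correct update (552 - 40)

stale = fragment_layout(TCP_HEADER + stale_mss, mtu=552)
updated = fragment_layout(TCP_HEADER + updated_mss, mtu=552)
```

With the stale MSS, the 1,480-byte TCP segment is split into three fragments and only the first one contains the TCP header; with the updated MSS, the segment fits in a single packet.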
Off-path adversaries can often force TCP to fragment
As part of our study, we found that off-path adversaries can often force TCP to fragment in the following scenarios:
- Incorrect TCP parameters inside the ICMP error message: an adversary can use any values in the ICMP error message; this causes the IP layer to update its MTU to that destination. TCP segments are then fragmented at the IP layer.
  - We found that 429 popular Alexa domains are vulnerable to this type of attack.
- Correct TCP parameters inside the ICMP error message
  - We found that at least 445 domains fragment TCP segments at the IP layer without adjusting the MSS.
- ICMP with the UDP header
  - We found that 149 domains reduce the MTU and fragment TCP segments when receiving ICMP errors with UDP headers.
- Echo reply
  - We found that about 100 domains reduce the MTU and consequently fragment TCP segments when receiving ICMP echo reply messages.
All these can be exploited to launch fragmentation-based attacks.
Overall, we found 496 domains that can, at the source, be forced to fragment responses over TCP. Among the servers that are vulnerable to IP fragmentation attacks over TCP, we found servers (76 out of the 393 new domains) that actively avoid UDP by responding with a TC bit set, indicating to the DNS resolver to resend the DNS request over TCP. These domains follow the recommendations to move to TCP to avoid fragmentation-based attacks to which communication over UDP is known to be vulnerable!
In addition to source fragmentation, the attack surface is potentially larger due to intermediate routers with a small next-hop MTU. An analysis of CAIDA data traces between 2008 and 2016 showed that at least 1,000 intermediate routers in the Internet generate ICMP fragmentation needed packets with next-hop MTU values even below 576 bytes, and fragment IP packets carrying TCP despite path MTU discovery.
The ease with which the attack can be performed also depends on the IP identification (IPID) allocation algorithm. Table 1 shows results from a study of the Alexa Top-100K domains’ nameservers: many domains still use nameservers with globally incremental IPID counters (5,388 domains over UDP and 2,247 domains over TCP). This makes those domains highly vulnerable to fragmentation-based DNS cache poisoning attacks, since off-path adversaries can easily probe and guess the IPID.
Table 1 — IPID allocation in Alexa Top-100K domains’ nameservers.
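Why a globally incremental counter is easy to guess can be shown with a small simulation. This is a sketch under simplifying assumptions: `GlobalCounterServer` and `candidate_ipids` are hypothetical names, the counter increases by one per packet sent to any destination, and the `window` parameter stands in for other traffic the server sends between the probe and the targeted response.

```python
class GlobalCounterServer:
    """Toy nameserver that uses one 16-bit IPID counter for all destinations."""

    def __init__(self, start: int = 0):
        self.counter = start & 0xFFFF

    def send_packet(self) -> int:
        """Send one packet to anyone and return the IPID it carried."""
        self.counter = (self.counter + 1) & 0xFFFF
        return self.counter


def candidate_ipids(probe_ipid: int, window: int = 8):
    """IPID values an observer would expect on the server's next packets.

    The observer saw probe_ipid on a reply to its own probe and assumes
    at most window - 1 unrelated packets are sent before the target one.
    """
    return [(probe_ipid + step) & 0xFFFF for step in range(1, window + 1)]


server = GlobalCounterServer(start=0xFFFD)
probe = server.send_packet()        # reply to the observer's own probe
server.send_packet()                # unrelated traffic to some other host
to_victim = server.send_packet()    # the response the observer cares about
guesses = candidate_ipids(probe)
```

A handful of candidates covers the response even with wrap-around at 16 bits, whereas per-destination or random allocation would not expose the counter to a third party in this way.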
Move to TCP with caution
Our research shows that the attacks that were believed to apply only to connectionless communication, such as UDP, also apply to connection-oriented communication with TCP. The recommendations to move to TCP to avoid the fragmentation attacks that apply to UDP do not solve the problem.
The core issue is that the attacker can exploit ICMP messages with UDP datagrams as well as ICMP messages with echo reply packets to trigger fragmentation on TCP. Since these protocols allow fragmentation and do not trigger ICMP fragmentation needed error messages, you may need to adjust the reaction of the operating systems in such scenarios. In any case, we recommend that such ICMP messages are filtered in firewalls or at the servers.
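One way to picture the recommended filtering is as a predicate over already-parsed ICMP fields. This is only an illustrative sketch, not a firewall configuration: the `may_lower_tcp_path_mtu` name and the dictionary keys are assumptions, and a real deployment would express the same idea in its firewall's own rule language.

```python
ICMP_DEST_UNREACHABLE = 3   # ICMP type for destination unreachable
ICMP_FRAG_NEEDED = 4        # code for fragmentation needed
PROTO_TCP = 6
PROTO_UDP = 17


def may_lower_tcp_path_mtu(msg: dict) -> bool:
    """Decide whether an ICMP message may reduce the MTU used for TCP.

    msg is a hypothetical parsed form with the keys "type", "code" and
    "protocol" (the protocol number of the embedded original packet,
    or None when nothing is embedded).
    """
    if (msg.get("type"), msg.get("code")) != (ICMP_DEST_UNREACHABLE, ICMP_FRAG_NEEDED):
        return False   # e.g. an echo reply (type 0) must never change the MTU
    if msg.get("protocol") != PROTO_TCP:
        return False   # an embedded UDP header says nothing about a TCP flow
    return True


echo_reply = {"type": 0, "code": 0, "protocol": None}
udp_error = {"type": 3, "code": 4, "protocol": PROTO_UDP}
tcp_error = {"type": 3, "code": 4, "protocol": PROTO_TCP}
```

A message that passes this predicate would still need the port and sequence number checks against open sockets before it is acted on.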
This research is a joint work of Tianxiang Dai, Dr Haya Shulman and Professor Michael Waidner and was published at ANRW 2021. You can watch our ANRW talk.
Tianxiang Dai is a research associate in ATHENE Center and Fraunhofer SIT in Germany. His research interests are in network security including DNS security, IP security and firewall security.