Weaponizing middleboxes for TCP reflected amplification

By on 6 Oct 2021

Category: Tech matters

This post is an adaptation of an original post by researchers from the University of Maryland and the University of Colorado, Boulder. It describes their discovery of a new way an attacker could launch reflected Denial of Service (DoS) amplification attacks over TCP by abusing middleboxes and censorship infrastructure. These attacks can produce orders of magnitude more amplification than existing UDP-based attacks.

Key points:
  • This is the first reflected amplification attack over TCP that goes beyond sending SYN packets and the first HTTP-based reflected amplification attack.
  • We found multiple types of middlebox misconfiguration in the wild that can lead to technically infinite amplification for the attacker: by sending a single packet, the attacker can initiate an endless stream of packets to the victim.
  • Collectively, our results show that censorship infrastructure poses a greater threat to the broader Internet than previously understood. Even benign deployments of firewalls and intrusion prevention systems in non-censoring nation-states can be weaponized using the techniques we discovered.

Reflected amplification attacks are a powerful tool in the arsenal of a DoS attacker. An attacker spoofs a request from a victim to an open server (for example, an open DNS resolver), and the server responds to the victim. If the response is larger than the spoofed request, the server effectively amplifies the attacker’s bandwidth in the DoS attack:

GIF showing a reflected amplification attack

Most DoS amplification attacks today are UDP-based. The reason for this is that TCP requires a 3-way handshake that complicates spoofing attacks. Every TCP connection starts with the client sending a SYN packet; the server responds with a SYN+ACK, and the client completes the handshake with an ACK packet.

The 3-way handshake protects TCP applications from being amplifiers. If an attacker sends a SYN packet with a spoofed source IP address, the SYN+ACK will go to the victim, and the attacker never learns critical information contained in the SYN+ACK needed to complete the 3-way handshake. Without receiving the SYN+ACK, the attacker can’t make valid requests on behalf of the victim.
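To make the role of the handshake concrete, here is a toy Python model. It is a sketch, not real networking code: the `ToyServer` class and its method names are hypothetical, and the only behaviour borrowed from TCP is that the final ACK must acknowledge the server's unpredictable initial sequence number plus one.

```python
import secrets


class ToyServer:
    """Toy model of a TCP-compliant server's handshake check (no real sockets)."""

    def __init__(self):
        # Claimed source address -> initial sequence number (ISN) the server chose.
        self.pending = {}

    def on_syn(self, src):
        # The server picks an unpredictable 32-bit ISN and sends it in the
        # SYN+ACK to the *claimed* source address, so a spoofer never sees it.
        isn = secrets.randbits(32)
        self.pending[src] = isn
        return isn

    def on_ack(self, src, ack_number):
        # The handshake completes only if the ACK acknowledges ISN + 1.
        if src not in self.pending:
            return False
        return ack_number == (self.pending[src] + 1) % 2**32
```

In this model, a host that actually received the SYN+ACK can answer with `isn + 1` and complete the handshake, while an attacker who spoofed the victim's address has to guess a 32-bit value blindly.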

GIF showing a reflected amplification attack via UDP

The 3-way handshake is effective at preventing amplification for TCP-compliant hosts. But in this work, we discovered that a large number of network middleboxes do not conform to the TCP standard and can be abused to perform attacks. In particular, we found that many censorship middleboxes will respond to spoofed censored requests with large block pages, even if there is no valid TCP connection or handshake. These middleboxes can be weaponized to conduct DoS amplification attacks.

Middleboxes are often not TCP-compliant by design: many middleboxes attempt to handle asymmetric routing, where the middlebox can only see one direction of packets in a connection (for example, client to server). But this feature opens them to attack: if middleboxes inject content based only on one side of the connection, an attacker can spoof one side of a TCP 3-way handshake and convince the middlebox there is a valid connection.

GIF showing how middleboxes can be weaponized

This leaves us with some questions: what is the best way to trigger these middleboxes, and what kinds of amplification factors can we get from them?

Discovering amplifying middleboxes

Our goal was to discover a sequence of packets that an attacker can send to trick a middlebox into injecting a response without completing a real 3-way handshake.

Note that such a packet sequence is not compliant with TCP. We are taking advantage of weaknesses in implementations, not in the design of the TCP protocol itself. This means it’s not sufficient to study the TCP protocol alone; we must study real middlebox TCP implementations. This poses a challenge: there are too many kinds of middleboxes around the world for us to purchase, and even if we could, the middleboxes that power nation-state censorship infrastructure are usually not for sale.

Instead, we used our tool, Geneva, to study censoring middleboxes in the wild.

To find middleboxes to study, we used the public data released from CensoredPlanet’s Quack tool. Quack is a scanner that finds IP addresses with a censoring middlebox on their path. We used this data to identify 184 sample middleboxes located around the world that performed HTTP censorship by injecting block pages.

Geneva (Genetic Evasion) is a genetic algorithm we designed to automatically discover new ways to evade censorship, but at its core, Geneva is a packet-level network fuzzer. We modified Geneva’s fitness function to reward it for making the elicited response as large as possible, and then trained Geneva against all 184 middleboxes.
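The idea of rewarding larger responses can be sketched in a few lines of Python. This is not Geneva's code or API: `fitness`, `select_survivors`, and the `measure` callback are hypothetical names, and scoring by the received-to-sent ratio (rather than raw response size) is a choice made for this sketch.

```python
def fitness(bytes_sent, bytes_received):
    """Hypothetical fitness: reward larger elicited responses, punish silence."""
    if bytes_received == 0:
        return -1.0
    return bytes_received / bytes_sent


def select_survivors(candidates, measure, keep=2):
    """Keep the `keep` candidate strategies with the highest fitness.

    `measure(candidate)` is assumed to return (bytes_sent, bytes_received)
    observed when that candidate was tried against a middlebox.
    """
    ranked = sorted(candidates, key=lambda c: fitness(*measure(c)), reverse=True)
    return ranked[:keep]
```

A genetic algorithm would repeat this selection step over many generations, mutating the survivors between rounds; only the scoring and selection are shown here.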

GIF showing TCP-based amplification attack variants

We found five packet sequences that elicited amplified responses from middleboxes. Each of these contains a well-formed HTTP GET request for some domain that was forbidden by the middlebox:

  • SYN packet (with forbidden request)
  • PSH packet
  • PSH+ACK packet
  • SYN packet, followed by a PSH packet containing the forbidden request
  • SYN packet, followed by a PSH+ACK packet containing the forbidden request
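The five sequences above can be written down as plain data. The sketch below only describes the sequences and does not send anything; the `Packet` class, the `SEQUENCES` labels, and the exact request produced by `http_get` are assumptions for illustration, not the researchers' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    flags: str         # TCP flags set on the packet, e.g. "SYN" or "PSH+ACK"
    has_request: bool  # True if the packet carries the forbidden HTTP GET


# One entry per packet sequence listed above.
SEQUENCES = {
    "SYN with request": (Packet("SYN", True),),
    "PSH": (Packet("PSH", True),),
    "PSH+ACK": (Packet("PSH+ACK", True),),
    "SYN; PSH": (Packet("SYN", False), Packet("PSH", True)),
    "SYN; PSH+ACK": (Packet("SYN", False), Packet("PSH+ACK", True)),
}


def http_get(host):
    """A minimal well-formed HTTP/1.1 GET request for `host` (assumed form)."""
    return f"GET / HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
```

Every sequence carries the request exactly once, and none of them includes the ACK that would complete a real 3-way handshake.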

We also found another five modifications that increased amplification further for a small fraction of middleboxes; an attacker could use these to target specific middleboxes. See our paper for more details on these modifications.

To elicit a response from these middleboxes, we needed a domain that was censored or forbidden by each middlebox, but most censoring middleboxes use different blocklists, making it difficult to find one domain that will elicit block pages from everyone. We analysed the Quack dataset to find the five domains that elicited responses from the most middleboxes, which coincidentally spanned five different areas:

  • www.youporn.com (pornography)
  • www.roxypalace.com (gambling)
  • plus.google.com (social media)
  • www.bittorrent.com (file sharing)
  • www.survive.org.uk (sexual health/education)

We also used example.com and no domain at all as control experiments.

Finding amplifiers

We scanned the entire IPv4 Internet to measure how many IP addresses permit reflected amplification. To do this, we modified the zmap scanner to construct all five packet sequences identified by Geneva.

We scanned the entire IPv4 Internet a total of 35 times (five packet sequences × seven test domains). We measured the responses we got back to calculate the amplification factor we got from each IP address.
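The count of 35 scans is simply every packet sequence paired with every test domain. A small Python sketch of that scan plan follows; the variable names and sequence labels are made up for illustration, and `None` stands for the "no domain at all" control.

```python
from itertools import product

# Labels for the five packet sequences (hypothetical names).
PACKET_SEQUENCES = ["SYN with request", "PSH", "PSH+ACK", "SYN; PSH", "SYN; PSH+ACK"]

# Five forbidden domains plus the two controls described in the post.
TEST_DOMAINS = [
    "www.youporn.com",
    "www.roxypalace.com",
    "plus.google.com",
    "www.bittorrent.com",
    "www.survive.org.uk",
    "example.com",  # control: a domain not expected to be blocked
    None,           # control: no domain at all
]

# One full IPv4 scan per (sequence, domain) pair.
SCAN_PLAN = list(product(PACKET_SEQUENCES, TEST_DOMAINS))
```

`len(SCAN_PLAN)` is 5 × 7 = 35, matching the number of scans reported above.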

Our version of zmap is open-source and available.

What we found

Diagram showing the five different types of attacks the group found.
Figure 1 — Types of attacks we found. The thick arrows denote amplification; red ones denote packets that trigger amplification. (a) Normal TCP Reflection, in which the attacker sends a single SYN packet to elicit SYN+ACKs. (b) Middlebox reflection, in which the attacker sends a packet sequence to trigger a block page or censorship response. (c) Combined destination and middlebox reflection, in which the attacker can elicit a response from both the middlebox and end destination. (d) Routing loop reflection, in which trigger packets are trapped in a routing loop. (e) Victim-sustained reflection, in which the victim’s default response triggers additional packets from the middlebox or destination. We find that infinite amplification is caused by (d) routing loops that fail to decrement TTLs and (e) victim-sustained reflection.

Recall that we were searching for weaknesses in the TCP implementation in middleboxes, not in the TCP protocol itself. In addition, each middlebox has its own injection policies and block pages. This means that there is no one single amplification factor for this attack since each middlebox we triggered would be different!

Instead, we can look at the distribution of the response sizes to see the amount of amplification available to attackers.

In Figure 2, you can see a huge range in amplification factors — from over 100,000,000 to less than 1.

Line graph showing the maximum amplification factor we received for each IP address
Figure 2 — Graph showing the maximum amplification factor we received for each IP address across all 35 scans, sorted by amplification factor on the x-axis. On the y-axis, you can see the amplification factor that the IP address provides.
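The ranking behind a plot like Figure 2 can be sketched as follows. This is an illustrative reconstruction, not the analysis code used in the study; the function name and the `(ip, bytes_sent, bytes_received)` record layout are assumptions.

```python
from collections import defaultdict


def max_factor_per_ip(measurements):
    """Return (ip, best_factor) pairs, largest amplification factor first.

    `measurements` is an iterable of (ip, bytes_sent, bytes_received)
    records, one per probe, possibly several per IP address across scans.
    """
    best = defaultdict(float)
    for ip, sent, received in measurements:
        best[ip] = max(best[ip], received / sent)
    return sorted(best.items(), key=lambda item: item[1], reverse=True)
```

For example, an address that returned 50 bytes in one scan and 400 bytes in another for a 100-byte probe would be ranked by its best result, a factor of 4.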

Infinite amplification

Next, we examined the IP addresses at the head of Figure 2, where we saw amplification factors between 1,000,000 and 100,000,000. These IP addresses are our mega-amplifiers, offering tremendous amplification factors. In fact, these amplification factors are likely an underestimate; the numbers reflect the point at which our scan stopped collecting data, not the point at which the IP addresses stopped sending us data. What’s going on here?

This is where we find technically infinite amplification factors. The amplification factor is calculated as the number of bytes received from an amplifier divided by the number of bytes sent to it. We found amplifiers that, once triggered by a single packet sequence from the attacker, will send an endless stream of packets to the victim. In our testing, some of these packet streams lasted for days, often at the full bandwidth the amplifier’s link could supply.

We found two causes for this infinite amplification: routing loops and victim-sustained amplifiers.
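A short Python sketch shows why a measured factor is only a lower bound when the stream never stops. The function name, the units, and the constant-rate stream are assumptions of this sketch.

```python
def observed_amplification(bytes_sent, stream_bytes_per_second, window_seconds):
    """Amplification factor measured over a finite observation window.

    Assumes the amplifier streams at a constant rate for the whole window.
    """
    bytes_received = stream_bytes_per_second * window_seconds
    return bytes_received / bytes_sent
```

With a fixed trigger size, doubling the observation window doubles the measured factor, so for a stream that does not end the factor grows without bound and any reported value depends on when the measurement stopped.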

Routing loops

Routing loops occur between two IP addresses when packets get stuck traversing a loop while being routed from one IP address to the other. Routing loops that contain censoring middleboxes offer a new benefit to attackers: every time the trigger packets circle the routing loop, they re-trigger the censoring middlebox.

The number of hops a packet will survive in a network is usually regulated by the time-to-live (TTL) field in IP packets: each time a packet is passed from one router to the next, its TTL value is decremented. If the TTL hits 0, the packet is dropped. The maximum TTL value is 255. This means an attacker that can send a trigger sequence into an amplifying routing loop gets an additional ~250× amplification for free.
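The TTL arithmetic can be explored with a toy simulation. This is a sketch under stated assumptions rather than a model taken from the paper: the loop is a ring of `loop_hops` routers, the middlebox sits at one fixed position in that ring, and `max_steps` is an artificial cap standing in for "forever" when the TTL is never decremented.

```python
def middlebox_passes(ttl_on_entry, loop_hops, decrements_ttl=True, max_steps=10_000):
    """Count how often a trapped packet crosses the middlebox (toy model)."""
    passes = 0
    ttl = ttl_on_entry
    position = 0  # the middlebox sits at position 0 of the loop
    for _ in range(max_steps):
        if ttl <= 0:
            break  # TTL exhausted: the packet is dropped
        if position == 0:
            passes += 1  # the packet re-triggers the middlebox
        if decrements_ttl:
            ttl -= 1  # a well-behaved hop decrements the TTL
        position = (position + 1) % loop_hops
    return passes
```

In this model, a packet entering with a TTL of 250 crosses the middlebox 250 times if every hop re-triggers it (`loop_hops=1`) and 125 times in a two-hop loop, while a loop that never decrements the TTL is limited only by the step cap.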

Even more dangerous than normal routing loops are infinite routing loops. Infinite routing loops occur when a circular routing path does not decrement the TTL value, causing packets to circle the loop forever (or until a random packet drop occurs). We found a small number of infinite routing loops that traversed censorship infrastructure (notably in both China and Russia) that offered infinite amplification.

Victim-sustained amplifiers

The second cause we found for infinite amplification was victim-sustained loops. When a victim receives an unexpected TCP packet, the correct client response is to respond with an RST packet. We discovered a small number of amplifiers that will resend their block pages when they process any additional packet from the victim, including the RST. This creates an infinite packet storm. The attacker elicits a single block page to a victim, which causes an RST from the victim. This, in turn, causes a new block page from the amplifier, which causes an RST from the victim, and so forth.

The victim-sustained case is especially dangerous for two reasons. First, the victim’s default behaviour sustains the attack on itself. Second, the attack causes the victim to flood its own uplink with RST packets while its downlink is flooded with block pages.
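The feedback loop can be written as a small byte-counting model. The function and parameter names are hypothetical, the packet sizes are left to the caller, and `rounds` caps a loop that in the scenario described above would otherwise continue indefinitely.

```python
def victim_sustained_storm(block_page_bytes, rst_bytes, rounds):
    """Toy model of the block-page/RST loop.

    Returns (downlink_bytes, uplink_bytes) seen at the victim after `rounds`
    iterations. The attacker's only contribution is the initial trigger,
    which is not counted here.
    """
    downlink = 0
    uplink = 0
    for _ in range(rounds):
        downlink += block_page_bytes  # amplifier (re)sends its block page
        uplink += rst_bytes           # victim answers the unexpected packet with an RST
    return downlink, uplink
```

Both totals grow with every round even though the attacker sends nothing further, which is the sense in which the victim sustains the attack on itself.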

GIF showing Mega-amplifiers: victim sustained loops

Are these really middleboxes?

Yes, our results suggest so. To test this, we performed a TTL-limited experiment. We took the top 1 million amplifiers, tracerouted to them to determine how many hops away they were, and resent our probes with a reduced TTL number. This ensured that our probes wouldn’t make it to the destination host, but would likely cross the censoring middlebox, so if we still saw responses, we would know they were generated by a middlebox. We confirmed that ~83% of the top 1 million amplifiers were caused by middleboxes.
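The logic of the TTL-limited experiment can be sketched in Python. This is a simplified model, not the measurement code: the function names are hypothetical, hop counts are assumed to be stable between the traceroute and the probe, and a responder is assumed to see a probe only if the probe's TTL is at least its hop distance.

```python
def limited_ttl(hops_to_destination):
    """TTL chosen so the probe expires one hop before the destination."""
    return hops_to_destination - 1


def response_seen(ttl, responder_hop):
    """A probe elicits a response only if it survives long enough to reach the responder."""
    return ttl >= responder_hop


def is_middlebox(hops_to_destination, responder_hop):
    """True if a TTL-limited probe still gets a response, so an on-path device answered."""
    return response_seen(limited_ttl(hops_to_destination), responder_hop)
```

For a destination 10 hops away, a responder at hop 6 still answers the TTL-limited probe and is classified as a middlebox, while a response that only the destination at hop 10 could generate disappears.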

What is the effect of nation-states?

We found that nation-state censorship infrastructure for economies around the world can also be weaponized. Most of these nation-states are weak amplifiers (the Great Firewall of China only offers about 1.5x amplification, for example). Some of them offer more damaging amplifications, such as Saudi Arabia (~20x amplification).

The real challenge of nation-states is that their censorship infrastructure usually processes all traffic entering or exiting the economy. This means that unlike other amplification attacks, where the source IP address of the traffic received by the victim is the amplifier itself, every IP address behind a middlebox can appear as the source of traffic. Said another way: every IP address within an amplifying nation-state can be an amplifier.

How much damage can an attacker do with infinite amplification?

This is a case in which the ‘amplification factor’ as a metric starts to break down. Most of the time, when people ask about the amplification factor, they are asking how much damage an attacker can do with a given attack vector. If an attacker can elicit a technically infinite amplification factor, but at only 64 kbps before the link is completely saturated, the amount of damage an attacker can do is limited.
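One way to see why the factor alone is misleading is to cap the reflected traffic at the amplifier's link capacity. The sketch below is a simplification with hypothetical names; it assumes rates in bits per second and a positive attacker rate.

```python
def delivered_bandwidth(attacker_bps, amplification_factor, amplifier_link_bps):
    """Traffic reaching the victim, capped by the amplifier's link capacity.

    Assumes attacker_bps > 0; the factor may be float("inf").
    """
    return min(attacker_bps * amplification_factor, amplifier_link_bps)
```

With `amplification_factor=float("inf")` and a 64 kbps link, the result is still 64,000 bits per second, whereas a finite 20x amplifier on a fast link delivers 20 times whatever the attacker sends.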

A better question is ‘what is the maximum bandwidth an attacker can elicit through this attack?’. Unfortunately, this is the hardest question to study ethically. To measure the maximum capacity of a given amplifier, we would have to completely saturate the link for each network, which could have real negative consequences for the users of that network. For now, the true capacity available to attackers from this attack is unknown.

Defending against this attack is difficult

The incoming flood of traffic comes over TCP port 80 (normal HTTP traffic) and the responses are usually well-formed HTTP responses.

Since middleboxes are spoofing the IP address of the traffic they generate, the attacker can set the source IP address of the reflected traffic to be any IP address behind the middlebox. For some networks, this is a small number of IP addresses. But, if an attacker uses nation-state censorship infrastructure, the attacker can make the attack traffic come from any IP address within that economy. This makes it difficult for a victim to drop traffic from offending IP addresses during an attack.

Responsible disclosure

In September of 2020, we reached out and shared an advance copy of our paper with several economy-level Computer Emergency Response Teams (CERTs), DDoS mitigation services, and firewall manufacturers. We also had further meetings and ongoing communication with multiple DDoS mitigation services and US-CERT to discuss mitigation strategies.

Unfortunately, there is only so much we can do. Completely fixing this problem will require economies investing money in changes that could weaken their censorship infrastructure, something we believe is unlikely to occur.

This work was presented at USENIX Security 2021 and received a Distinguished Paper Award. You can read the paper and watch the conference talk.

Note: This is not the only way we’ve discovered that censors could be weaponized. See the post about our work in WOOT 2021 “Your Censor is My Censor: Weaponizing Censorship Infrastructure for Availability Attacks”.

Contributors: Abdulrahman Alaraj, Yair Fax, Kyle Hurley, Eric Wustrow, and Dave Levin

Adapted from the original post, which appeared on censorship.ai. Refer to the original post for a list of frequently asked questions and how you can support or learn more about this project — the group is taking suggestions on a name for this type of attack.

Kevin Bock is a PhD candidate in the computer science department at the University of Maryland, advised by Dave Levin.

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.
