Down the black hole: Dismantling operational practices of BGP blackholing at IXPs


Large Distributed Denial-of-Service (DDoS) attacks pose a major threat not only to end systems but to the Internet infrastructure as a whole.

Remote Triggered Black Hole (RTBH) filtering has been established as a tool to mitigate inter-domain DDoS attacks. The idea is simple — signal special BGP announcements to discard unwanted traffic early in the network, for example, at Internet Exchange Points (IXPs).

Little is known about how RTBH filtering is actually used, how effective it is, or whether more fine-grained filtering is needed. My research colleagues and I therefore set out to analyze how efficient RTBH is as a mitigation tool and whether mitigation is its only use case. We cooperated with one of the world’s largest IXPs and analyzed each RTBH during a measurement period of more than three months.

How does blackholing work at IXPs?

The Border Gateway Protocol (BGP) is used to exchange IP prefix reachability information between Autonomous Systems (ASes) to form the global Internet. That said, one BGP application has the opposite effect in practice: an RTBH signal sent via BGP tells a neighboring AS to discard traffic destined for an IP prefix owned by the announcing AS.

The most prominent and well-established use case for RTBH filtering is the mitigation of attack traffic towards a victim, identified by an IP prefix. Thereby, attack traffic is dropped before it reaches its final destination, alleviating the damage to the network infrastructure under attack.

Figure 1 — Blackholing drops all traffic towards a prefix. This might introduce collateral damage.

IXPs are particularly well suited for this kind of prevention, since they provide a convergence point where hundreds of ASes meet and exchange inter-domain traffic. Route servers at the IXP propagate the RTBH signals from the victim to all other IXP members, which, after accepting the signal, forward all traffic destined to the victim prefix to the blackhole. Hence, attack traffic as well as legitimate traffic is dropped, which we call collateral damage.

Blackholing acceptance and drop rates

Any BGP peer that does not accept a blackhole route from the route server will continue to forward the traffic that was intended to be filtered. Acceptance of blackhole routes is beyond the control of the triggering AS, but subject to local BGP policies of the receiving peer.
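
To make the role of local policy concrete, here is a minimal sketch of how a receiving peer's import filter might decide whether to accept a blackhole route. The function `accepts` and its parameters are hypothetical and do not come from any router implementation. The community value is the well-known BLACKHOLE community from RFC 7999; an IXP may use its own value instead. The sketch covers IPv4 only.

```python
import ipaddress

# Well-known BLACKHOLE community (RFC 7999). Assumption: the IXP tags
# blackhole routes with this value rather than a custom community.
BLACKHOLE = (65535, 666)

def accepts(prefix, communities, max_len=24, max_blackhole_len=None):
    """Return True if a hypothetical import policy would accept the route.

    max_len: longest IPv4 prefix accepted for ordinary routes.
    max_blackhole_len: longest prefix accepted for routes tagged as
    blackholes; None means blackholes get no special treatment.
    """
    net = ipaddress.ip_network(prefix)
    limit = max_len
    if max_blackhole_len is not None and BLACKHOLE in communities:
        limit = max_blackhole_len
    return net.prefixlen <= limit

route = ("203.0.113.7/32", {BLACKHOLE})
strict_peer = accepts(*route)                          # generic /24 limit only
rtbh_aware_peer = accepts(*route, max_blackhole_len=32)
print(strict_peer, rtbh_aware_peer)
```

A peer that applies only the generic /24 limit rejects the /32 blackhole and keeps forwarding traffic to the victim, while a peer with a dedicated blackhole rule accepts it.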

Figure 2 — Not every IXP member drops traffic due to BGP rejection policies.

Quantifying the RTBH acceptance rates and the triggered packet drop rates shows how effective RTBH is in the wild. Hopefully, this also helps to explain instances when RTBH is not accepted and to unveil potential misconfigurations. To find this out, we grouped RTBHs by prefix length. Then, for each blackhole in a group we calculated how much of the traffic was really dropped and plotted the cumulative distribution function (Figure 3).

Figure 3 — Blackholing announcements are rejected for more specific prefix lengths.
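
The grouping and the cumulative distribution described above can be sketched in a few lines. This is an illustration under assumptions, not our measurement pipeline: the records are invented, and the helper names `drop_rate` and `ecdf` are hypothetical.

```python
from collections import defaultdict

def drop_rate(dropped_bytes, total_bytes):
    """Share of traffic towards a blackholed prefix that was actually dropped."""
    return dropped_bytes / total_bytes if total_bytes else 0.0

def ecdf(values):
    """Sorted values paired with their cumulative rank fraction (i + 1) / n."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]

# Invented example records: (prefix length, dropped bytes, total bytes).
blackholes = [
    (24, 950, 1000),
    (24, 990, 1000),
    (32, 200, 1000),
    (32, 500, 1000),
    (32, 800, 1000),
]

# Group the per-blackhole drop rates by announced prefix length.
rates_by_length = defaultdict(list)
for length, dropped, total in blackholes:
    rates_by_length[length].append(drop_rate(dropped, total))

# One empirical CDF per prefix length, ready for plotting.
cdf_by_length = {length: ecdf(rates) for length, rates in rates_by_length.items()}
mean_32 = sum(rates_by_length[32]) / len(rates_by_length[32])
```

Each entry of `cdf_by_length` gives the points of one curve in a plot like Figure 3.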

Successful mitigation depends highly on the announced RTBH prefix length. A perfect mitigation in this plot is a vertical line at 100%. Although not perfect, we observed a significant drop rate for /24 prefixes.

However, a /24 is a large address range to blackhole, and it may cover hosts that are not under attack. The /32 prefix length has a mean drop rate of only 50%. This is disappointing because /32 is the most important prefix length for RTBHs: /32 prefixes covered 99% of the traffic that should have been blackholed.

This analysis shows that RTBHs are surprisingly ineffective, mainly due to misconfigurations: many peers do not accept RTBH announcements for prefixes more specific than /24. Such prefixes are usually rejected in global routing to prevent sub-prefix hijacks, but blackhole routes need to be exempt from that rule to take effect.

Blackholing events and DDoS traffic anomalies

Effective DDoS mitigation also requires a fast reaction to DDoS events. Looking for volumetric traffic changes right before each RTBH announcement, however, can be misleading. This is because RTBHs used for DDoS mitigation are announced and withdrawn repeatedly to check whether the attack is still ongoing.

Figure 4 — Individual blackholes have to be grouped into blackholing events.

We use this on-off pattern to identify RTBH announcements that target the same attack event and merge them into a single RTBH event. Each RTBH event reflects the mitigation process after the attack was detected. So, a traffic change is only expected before the first RTBH of an RTBH event.
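
A minimal sketch of this merging step, for the announcements of a single prefix, could look as follows. The function name `merge_into_events` and the `max_gap` threshold are assumptions made for illustration; the value 600 seconds is arbitrary and not taken from our study.

```python
def merge_into_events(blackholes, max_gap):
    """Merge (start, end) announcement intervals of one prefix into events.

    An announcement that starts within max_gap seconds of the end of the
    previous event is treated as part of that event.
    """
    events = []
    for start, end in sorted(blackholes):
        if events and start - events[-1][1] <= max_gap:
            # Same on-off pattern: extend the current event.
            events[-1][1] = max(events[-1][1], end)
        else:
            # Gap too large: a new event begins.
            events.append([start, end])
    return [tuple(e) for e in events]

# Timestamps in seconds; three quick re-announcements, then a later one.
announcements = [(0, 300), (360, 660), (700, 1000), (90000, 90300)]
events = merge_into_events(announcements, max_gap=600)
print(events)  # [(0, 1000), (90000, 90300)]
```

Only the start of each merged event, here 0 and 90000, is then compared against preceding traffic changes.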

Using a sliding-window algorithm based on an exponentially weighted moving average (EWMA), we monitored a total of five features that we expect to change during attack events, such as amplification, TCP SYN or GRE flooding attacks:

Number of packets
Number of unique destination ports
Number of flows
Number of unique source IP addresses
Number of non-TCP flows
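
To illustrate the general idea, the following sketch flags upward bursts in one feature with a generic EWMA detector. It is not the exact algorithm or parameter set from our study: `alpha`, `threshold`, `warmup` and the sample series are all assumptions chosen for the example.

```python
def ewma_anomalies(series, alpha=0.3, threshold=3.0, warmup=5):
    """Return indices whose value exceeds the EWMA mean by more than
    `threshold` EWMA standard deviations.

    Flagged samples do not update the baseline, so an ongoing burst
    cannot hide itself by inflating the mean.
    """
    mean = float(series[0])
    var = 0.0
    flagged = []
    for i, x in enumerate(series[1:], start=1):
        std = var ** 0.5
        if i >= warmup and x > mean + threshold * std:
            flagged.append(i)
            continue
        # Standard incremental update of EWMA mean and variance.
        diff = x - mean
        incr = alpha * diff
        mean += incr
        var = (1 - alpha) * (var + diff * incr)
    return flagged

# Invented packets-per-minute series with a burst in the last two bins.
packets = [100, 104, 98, 101, 103, 99, 102, 100, 950, 1200]
bursts = ewma_anomalies(packets)
```

In practice one such detector would run per feature, and an attack would typically show up in several features at once.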

The following heatmap (Figure 5) shows all anomalies that we found before the corresponding blackhole events. We observed a high number of anomalies up to 10 minutes before an RTBH event. Usually, all five features were affected.

Figure 5 — Time offset between anomalous traffic bursts and start of blackholing events.

This short reaction time indicates automatic DDoS mitigation. Hence, RTBH as DDoS mitigation is fairly effective with respect to time; most anomalous bursts are mitigated via RTBH within 5 minutes.

Other use cases for blackholing

Although we found many anomalies, a substantial share of RTBH events had no corresponding traffic anomaly: only 27% of all RTBH events showed traffic in the preceding 72 hours and a traffic anomaly in the last 10 minutes. For all other RTBH events, we observed either no anomaly or no traffic at all.

We also identified surprising practices that deviated significantly from the expected DDoS mitigation patterns. The following are two RTBH use cases we observed:

Prefix Squatting Protection: IP prefix squatting is a variant of prefix hijacking, where third parties take over address space that is assigned to another AS but not actively announced from this legitimate origin. One common mitigation technique for prefix squatting is to announce the assigned address space. To ensure the address space is not used at the same time, the same prefix is announced as an RTBH.
Content Blocking: Applying RTBH to block clients from accessing content occurs rarely but is possible. Network operators have blackholed attackers (for example, port scanners or vulnerability scanners) rather than victims, in order to stop outgoing malicious behaviour. Another motivation for deploying BGP blackholing is censorship: RTBH can be used to block traffic towards an IP address hosting undesirable content.

Collateral damage and fine-grained filtering

RTBH is a coarse-grained traffic filtering tool. Although it has the advantage of dropping DDoS traffic early in the network, it has a major drawback: an RTBH completes the attack, because the victim becomes unreachable. (Remember the IXP example and the collateral damage in the background section?)

In order to reduce collateral damage and keep the victim reachable during the attack events, we could extend the filter rules, for example, by port information. We can either whitelist legitimate traffic or blacklist attack traffic.

Figure 6 — Fine-granular blackholing based on port information.

Whitelisting does not help

We cannot whitelist client traffic, because client traffic is highly variable. That is why we categorized each blackholed DDoS victim as a server or client by looking at the regular traffic patterns (outside of RTBH events). Moreover, we correlated our classification with data from PeeringDB.

It turns out that a lot of DDoS victims are clients located in eyeball providers. Professional e-gamers, Twitch streamers and the like experience DDoS attacks frequently due to DDoS-as-a-Service websites called booters. This means that whitelisting of regular patterns, for example, HTTP traffic for a web server, is in many cases not an option. However, fine-grained blacklisting is.

Blacklisting is effective

Most volumetric attacks (90%!) that were mitigated with RTBH used only three attack vectors such as NTP and DNS. So, blocking these attack vectors — each identified by the default source port of the misused application — can effectively block the attack and prevent collateral damage. BGP Flowspec (RFC 5575) is one of the standards that offers such an exchange of fine-grained blackholing information.
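
The effect of such a port-based blacklist can be sketched on a handful of flow records. The flows below are invented, the `Flow` type and `split_traffic` helper are hypothetical, and the blocked set lists only the two vectors named above (NTP on UDP port 123 and DNS on UDP port 53) as an illustration.

```python
from collections import namedtuple

Flow = namedtuple("Flow", "proto src_port dst_port bytes")

# Source ports of the abused services: NTP (UDP 123) and DNS (UDP 53).
# Assumption: a real deployment would derive this set from observed attacks.
BLOCKED_SOURCE_PORTS = {("udp", 123), ("udp", 53)}

def split_traffic(flows, blocked=BLOCKED_SOURCE_PORTS):
    """Split flows towards the victim into dropped and forwarded lists."""
    dropped = [f for f in flows if (f.proto, f.src_port) in blocked]
    passed = [f for f in flows if (f.proto, f.src_port) not in blocked]
    return dropped, passed

# Invented traffic towards one victim during an amplification attack.
flows = [
    Flow("udp", 123, 40000, 9_000_000),   # NTP amplification
    Flow("udp", 53, 40001, 4_000_000),    # DNS amplification
    Flow("tcp", 443, 51000, 20_000),      # legitimate web traffic
    Flow("udp", 3478, 52000, 5_000),      # legitimate UDP traffic
]
dropped, passed = split_traffic(flows)
dropped_share = sum(f.bytes for f in dropped) / sum(f.bytes for f in flows)
```

In this example almost all bytes are dropped while the two legitimate flows still reach the victim. Note that legitimate DNS or NTP responses to the victim would be dropped as well, so some collateral damage remains.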

Advice for operators

We would like to sum up this article with three recommendations for operators when it comes to RTBH filtering:

Check your BGP policies: Accept more specific RTBH prefix announcements, in particular /32. It is worth noting that some IXPs provide a dedicated RTBH route server that only advertises blackholes, which eases BGP policy configuration as you need to accept /32 announcements only from this peer.
Check your routing tables for RTBH ‘zombies’: RTBH zombies are active RTBHs for which data plane activities do not justify a blackhole. Your routing tables may contain outdated RTBH entries or those that do not relate to DDoS. Contact your peers to understand their RTBH use case.
Consider fine-grained filtering: The majority of volumetric DDoS attacks are still not complex to detect in terms of traffic features. Simple port-based blacklisting (ACLs, BGP Flowspec) can be very effective and would be a cheap solution to limit collateral damage.
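
As a starting point for the zombie check, the following sketch compares active blackholes with the last time traffic towards each prefix was seen. It assumes that "no recent traffic" is a usable proxy for "not justified by data plane activity"; the helper `find_zombies`, the input format and the seven-day idle limit are all hypothetical.

```python
def find_zombies(active_blackholes, last_traffic_seen, now, max_idle):
    """Return active blackhole prefixes with no traffic in the last max_idle seconds.

    last_traffic_seen maps a prefix to the timestamp of the most recent
    traffic towards it; prefixes missing from the map never saw traffic.
    """
    zombies = []
    for prefix in active_blackholes:
        seen = last_traffic_seen.get(prefix)
        if seen is None or now - seen > max_idle:
            zombies.append(prefix)
    return sorted(zombies)

DAY = 86400
now = 100 * DAY
active = ["192.0.2.10/32", "198.51.100.0/24", "203.0.113.7/32"]
last_seen = {
    "192.0.2.10/32": now - 600,        # traffic ten minutes ago
    "203.0.113.7/32": now - 30 * DAY,  # traffic a month ago
}
zombies = find_zombies(active, last_seen, now, max_idle=7 * DAY)
```

Prefixes returned by such a check are candidates for a conversation with the announcing peer, not for automatic removal.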

This research was presented at RIPE 79 under the RACI program as well as published and presented at the ACM Internet Measurement Conference (IMC) 2019.

Watch: Marcin Nawrocki present the results of his team’s study at ACM Internet Measurement Conference (IMC) 2019.

Contributors: Jeremias Blendin, Christoph Dietzel, Thomas C. Schmidt, Matthias Wählisch

Adapted from original post which appeared on RIPE Labs.

Marcin Nawrocki is a PhD student and Research Assistant at Freie Universität Berlin.

