Yes, de-aggregation can be bad. No, it’s not ‘game over’

A paper circulating on the NANOG and SANOG mailing lists examines the risks inherent in the large-scale de-aggregation of prefixes announced into the global BGP system. Titled “Kirin: Hitting the Internet with Millions of Distributed IPv6 Announcements”, the paper by Lars Prehn, Pawel Foremski, and Oliver Gasser describes how the vast address space of IPv6 presents a ‘new’ attack surface on the Internet’s routing infrastructure.

What’s in the paper?

The paper describes a problem that really does exist. If an ISP has a large delegation of IPv6 like a /32 (which is the typical delegation for an ISP), it’s possible to deaggregate the /32 and announce it as quite a large number of more specific routes. It’s normal and common to deaggregate, within reason.

Usually, the intent is to construct routing information that gives traffic engineering or policy outcomes to ‘direct’ IPv6 traffic down specific links, or to specific places. After all, while the ISP has ‘all’ the space in BGP, it also probably has complex relationships with peers and an Internet Exchange Point (IXP) and wants to manage things in more detail than just ‘all of it’, sometimes.

This is normal, but the normal engineering outcomes here reflect the normal engineering investments. An ISP might have ten or 100 Points of Presence (PoPs) and may announce ten or 100 sub-contexts of routing, but not millions, tens of millions, or billions. Even so, the global BGP system has to allow for the fact that a delegation of this size has at least the potential to announce millions or billions of more specific routes, up to the /64 boundary and beyond (if another entity will accept and forward them).

IPv6 is big, really big

Even when limited to a /48, there are 65,536 (2^16) more specifics that can potentially be announced in a /32. And deaggregated to a /64, the address space in a /32 holds the equivalent of the whole of the IPv4 address space in routes: there are roughly 4.3 billion (2^32) prefixes lurking in there.
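The arithmetic behind those numbers is simple: every extra bit of prefix length doubles the number of possible more specifics. As a quick check, here is a minimal Python sketch using the standard library's ipaddress module. The function name count_more_specifics is made up for this example, and 2001:db8::/32 is the IPv6 documentation prefix standing in for a real ISP delegation.

```python
import ipaddress
from itertools import islice


def count_more_specifics(parent: str, target_len: int) -> int:
    """Number of distinct prefixes of length target_len inside parent."""
    network = ipaddress.ip_network(parent)
    if not network.prefixlen <= target_len <= network.max_prefixlen:
        raise ValueError("target length must lie between the parent length and the address size")
    # Each additional bit of prefix length doubles the number of prefixes.
    return 2 ** (target_len - network.prefixlen)


# Documentation prefix used as a stand-in for an ISP's /32 (illustrative only).
PARENT = "2001:db8::/32"

if __name__ == "__main__":
    print(count_more_specifics(PARENT, 48))  # 65536
    print(count_more_specifics(PARENT, 64))  # 4294967296
    # Enumerating is lazy, so peeking at the first few /48s is cheap.
    for subnet in islice(ipaddress.ip_network(PARENT).subnets(new_prefix=48), 3):
        print(subnet)
```

The /64 count is exactly 2^32, the same as the number of individual addresses in the whole of IPv4.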

The paper discusses the theoretical consequences of an entity announcing millions and billions of more specific routes, what it might look like if done in a distributed manner from more than one location into the global BGP space, and the consequences of having these routes withdrawn.

A secondary effect in BGP is the ‘update to withdraw’ sequence. This happens when a BGP speaker is told one specific path no longer works and attempts to ‘hunt to the end’ all the other alternatives. The BGP speaker would then receive the subsequent updates one by one as those alternatives confirm that they too cannot see the route.

This is not exactly a ‘combinatorial explosion’ effect, but it is a magnifier, the kind that brings every BGP speaker that learned the routes to the table. That is, each BGP speaker that learned the routes will in turn announce that it no longer has them. It takes a while for this to settle down, and it creates a huge spike in BGP traffic.
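To get a feel for the magnifier, here is a deliberately simplified toy model in Python. It is not how any real BGP implementation works: it assumes a single speaker, assumes the alternatives for a prefix are invalidated one at a time in best-first order (shortest AS path first), and ignores timers, damping, and update batching. The function names and the AS numbers (taken from the documentation range) are invented for this sketch.

```python
def path_hunt_messages(known_paths):
    """Messages one speaker emits for one prefix as its paths die off.

    known_paths: list of AS paths (tuples of AS numbers) the speaker holds.
    Assumption: the current best path is always the next one to be invalidated.
    """
    remaining = sorted(known_paths, key=len)  # shortest AS path is 'best'
    messages = []
    while remaining:
        remaining.pop(0)  # the current best path stops working
        if remaining:
            # Fall back to the next alternative and tell the neighbours.
            messages.append(("UPDATE", remaining[0]))
        else:
            # Nothing left to hunt through: finally withdraw the prefix.
            messages.append(("WITHDRAW", None))
    return messages


def total_messages(prefix_count, paths_per_prefix):
    """Worst case under this toy model: one message per lost path, per prefix."""
    return prefix_count * paths_per_prefix


if __name__ == "__main__":
    paths = [
        (64500, 64496),
        (64501, 64502, 64496),
        (64503, 64504, 64505, 64496),
    ]
    for message in path_hunt_messages(paths):
        print(message)
    print(total_messages(65536, 3))
```

Even in this crude model, one withdrawn prefix with three alternatives costs three messages from a single speaker. Multiply that by every more specific withdrawn and every speaker that learned it, and the spike becomes easy to picture.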

Does it work? In theory, yes.

The authors of the paper suggest the ‘attack’ is possible and would have effects at scale.

But here’s the thing — it’s not new. This is old news in many ways. This email from Nick Hilliard to the RIPE routing-wg makes it clear. Not only have operators known about this since the start of the IPv6 BGP story, but they know the mitigations. They’re actually pretty simple. In short, do your job.

Do your job: NOCs

So, the ‘in theory’ part above is important because if a Network Operations Centre (NOC) of an ISP is doing its job and monitoring BGP behaviour, this kind of attack is going to be detected at launch and nipped in the bud. It won’t simply run rampant across the entire BGP surface. It could potentially have bad effects — even devastating for some ISPs — but in practice, it will be seen and stopped by the BGP speakers who are keeping an eye on BGP change, the volume of change, and rate of change.
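What might ‘keeping an eye on the rate of change’ look like? As a rough sketch only, here is a sliding-window counter in Python that flags an origin AS once it generates more updates than expected in a given period. The class name, the window, and the threshold are all assumptions made up for this example; a real NOC would feed this from its own BGP monitoring and tune the numbers to its own network.

```python
from collections import defaultdict, deque


class UpdateRateMonitor:
    """Flags origin ASes whose update rate exceeds a threshold.

    window_seconds and max_updates are illustrative values, not recommendations.
    """

    def __init__(self, window_seconds=60.0, max_updates=1000):
        self.window_seconds = window_seconds
        self.max_updates = max_updates
        self._events = defaultdict(deque)  # origin ASN -> recent timestamps

    def observe(self, origin_asn, timestamp):
        """Record one update; return True if this origin is now over the limit."""
        events = self._events[origin_asn]
        events.append(timestamp)
        # Discard anything that has fallen out of the sliding window.
        while events and events[0] <= timestamp - self.window_seconds:
            events.popleft()
        return len(events) > self.max_updates


if __name__ == "__main__":
    monitor = UpdateRateMonitor(window_seconds=10.0, max_updates=3)
    for second in range(5):
        alarm = monitor.observe(64496, float(second))
        print(second, alarm)
```

The point is not the code, which is trivial, but that someone (or something) is watching the output and is able to act on it.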

The paper authors point out that if some reasonably clear logic is used to filter what the BGP speakers see and to set limits on how ‘big’ the BGP tables are expected to be, the ISP can limit its exposure. The downsides of this approach are that prescriptive BGP filter limits tend to get reduced to ‘golden rules’ that have unintended consequences, and that table-size limits can rebound in other ways if badly implemented (first-hand experience: I’ve personally caused this by misconfiguring BGP at a much earlier time, and for a smaller table).
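Both ideas, and the way a limit can rebound, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not router configuration: the class FilteredSession is invented for this example, the /48 cut-off reflects a common operator convention rather than anything mandated by the paper, and dropping the whole session when the limit is exceeded is only one of the behaviours real equipment offers.

```python
import ipaddress


class FilteredSession:
    """Toy model of one BGP session with a length filter and a prefix limit."""

    def __init__(self, max_prefixes, max_prefix_len=48):
        self.max_prefixes = max_prefixes      # assumed per-session limit
        self.max_prefix_len = max_prefix_len  # assumed longest accepted IPv6 prefix
        self.accepted = set()
        self.up = True

    def receive(self, prefix):
        """Return 'accepted', 'filtered', or 'session-down' for one announcement."""
        if not self.up:
            return "session-down"
        network = ipaddress.ip_network(prefix)
        if network.version != 6 or network.prefixlen > self.max_prefix_len:
            return "filtered"  # too specific: dropped, session unaffected
        self.accepted.add(network)
        if len(self.accepted) > self.max_prefixes:
            # The rebound: a limit set too low takes every route down with it.
            self.up = False
            self.accepted.clear()
            return "session-down"
        return "accepted"


if __name__ == "__main__":
    session = FilteredSession(max_prefixes=2)
    for prefix in ("2001:db8::/48", "2001:db8:1::/48",
                   "2001:db8:2:1::/64", "2001:db8:2::/48"):
        print(prefix, session.receive(prefix))
```

The length filter fails safe, because an over-specific route is simply ignored. The table-size limit does not: once tripped, it discards the legitimate routes too, which is exactly how a badly chosen number turns a protection into an outage.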

Don’t panic

The moral of the story is not to walk away from IPv6 or even BGP traffic engineering and policy. Instead, it’s essential to consider your NOC and the 24/7 nature of the network. Be prepared, be alert, but also don’t panic.

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.

2 Comments

Lars Prehn November 25, 2022 at 2:38 am

All three of us have been previous APNIC blog authors. We would have loved to contribute a post or provide feedback for this one if you had contacted us. While we are thankful you linked to our preprint, we believe that the operator-tailored digest from the mailing lists (that you also refer to) would have been more apt for such a post.

Half a year ago, we would have agreed with your assessment that the attack might die instantly as all bigger NOCs would immediately detect and mitigate it. However, on October 5th, AS20473 flapped more than 8k de-aggregated routes for more than 7 hours until enough NOCs reacted (more details here: https://twitter.com/Qrator_Radar/status/1577748939805278209). While 7 hours of reaction time is still considered “quick” by some operators, it would allow a KIRIN-like attack to take full effect. This event also urged us to release our findings immediately rather than in Spring 2023 (as we initially planned). As detailed in the mailing list digest, we hope that the re-raised awareness reduces the time-to-action and helps prevent these types of (intentional or unintentional) de-aggregation events.

Finally, we never claimed that the attack (surface) was novel. We were very explicit about this in Pawel’s MAPRG talk (https://youtu.be/SE3CE5vKcUY?t=468) and the paper’s section three: “The idea that routers may crash due to memory constraints is not new: many operators already reported crashed routers when the IPv4 routing table reached 512K and 768K routes. … However, it is the new context and availability of new methods that we believe re-enable a well-known attack to be successfully executed on the Internet today, by anyone, and with a limited budget.”

Best regards,
Lars

George Michaelson (post author) November 25, 2022 at 4:00 pm

Thanks for the comment Lars. I look forward to reading more from you on the Blog, or maybe we can do a podcast in PING about the measurement aspects of this problem?

I think your pointer to the route flap in October is a good reminder that operations teams need to be a lot more ‘on the ball’ because 7 hours is a long time.

We can all agree with a call to action to be more responsive to problems in BGP at scale.
