Basic routing concepts, part 3: Trust but verify BGP

As with the first and second posts in this series, this post is aimed at those in the early stages of their networking careers.

The previous entries in this series focused on two key things:

How prefixes become routes, by being associated with an Origin-AS and a path.
How the BGP ‘gossip’ system distributes everyone’s assertions about what they think they know, either because they originated it, or heard it from another BGP speaker.

This post will talk about the risks inherent in this model of the routing world. They can be summarized with one key question — why believe this gossip?

As I have stressed before, BGP ‘gossip’ should not be taken at face value. It’s important to think about it, review it and check it. This is another version of ‘trust but verify’, and in BGP the verification relies on two things: the Internet Routing Registry (IRR) and Resource Certification, also known as Resource Public Key Infrastructure (RPKI).

Using these systems helps mitigate a wide range of risks. I prefer RPKI, because its verification is backed by cryptographic checks.

Let’s explore them one-by-one.

Risk one: Being tempted by inappropriately long prefixes

One of the nice things about BGP is how incredibly simple it is. ‘Shortest path wins’ and ‘longest prefix match wins’ are the two key driving rules of BGP that were discussed in the previous blog post.

But as was discussed in that post, things can go very wrong when someone announces a longer (more specific) match to the world. This is what happened in an unfortunate incident in Pakistan, which resulted in YouTube traffic from around the world flocking there, crashing servers. Announcing a longer prefix match caused the problem, but it was also the solution. In this case, Google arranged to announce the services for YouTube from two /25 prefixes, which became the ‘longer’ longest-match for BGP speakers.
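The way a more specific announcement both causes and cures this kind of incident can be sketched in a few lines of Python. This is a minimal illustration of longest-prefix matching only; the prefixes and the ‘legitimate’ and ‘hijacker’ labels are hypothetical and are not the addresses involved in the real incident.

```python
import ipaddress

def best_route(table, destination):
    """Return the (network, label) entry with the longest prefix covering destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, label) for net, label in table if addr in net]
    # Longest-prefix match: the most specific covering prefix wins.
    return max(matches, key=lambda entry: entry[0].prefixlen, default=None)

# Hypothetical prefixes and labels, for illustration only.
table = [(ipaddress.ip_network("10.10.0.0/22"), "legitimate")]
before = best_route(table, "10.10.1.7")

# Someone else announces a more specific /24 inside the /22.
table.append((ipaddress.ip_network("10.10.1.0/24"), "hijacker"))
during = best_route(table, "10.10.1.7")

# The holder responds with two /25s, which are more specific again.
for fix in ("10.10.1.0/25", "10.10.1.128/25"):
    table.append((ipaddress.ip_network(fix), "legitimate"))
after = best_route(table, "10.10.1.7")

print(before[1], during[1], after[1])  # legitimate hijacker legitimate
```

Nothing in this lookup asks who is entitled to announce a prefix. Whoever is most specific wins, which is exactly the weakness described here.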

Although the problem was both caused and solved by using longest-match rules, the main problem here is that BGP is simple, but kind of dumb. In the current system, anyone can say “I’m the longest match!” and unless the Origin-AS is checked via some external source of information, the only information available comes from the BGP ‘gossip’. This is something RPKI can address right now, using Route Origin Authorization (ROA). It can prevent ‘longer’ longest-match routes being accepted. This problem could be addressed in the IRR with some better Routing Policy Specification Language (RPSL) route objects, but that won’t solve the problem entirely. It will just make it more visible when things go wrong, and make it possible to point to the IRR to say “I didn’t declare that intent”. It will still lack the cryptographic strength of a ROA-based response.

Unfortunately, this mechanism (called MaxLength) in the ROA comes with a sting in the tail. The maximum length needs to be used wisely. If it is set to include prefix combinations that aren’t actually routed, an attacker can announce one of these longer prefixes (longest-match wins) with the Autonomous System (AS) in question as the origin and the attacker in the path. I’ll come back to this point shortly. The point here is that you should only make ROAs for routes you actually announce.
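To make the MaxLength trade-off concrete, here is a simplified sketch of origin validation, loosely following the valid / invalid / not-found outcomes described in RFC 6811. The `Roa` tuple, the `validate` function, the documentation prefixes and the AS numbers are all assumptions made for illustration; a real validator works from cryptographically verified ROA data rather than a hand-written list.

```python
import ipaddress
from typing import NamedTuple

class Roa(NamedTuple):
    prefix: str      # prefix the ROA covers
    max_length: int  # longest prefix length the holder authorises
    origin_as: int   # AS authorised to originate it

def validate(route_prefix, origin_as, roas):
    """Classify a route as 'valid', 'invalid' or 'not-found' against a list of ROAs."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue  # this ROA says nothing about the route
        covered = True
        if route.prefixlen <= roa.max_length and origin_as == roa.origin_as:
            return "valid"
    # Covered by at least one ROA but matched by none: invalid.
    return "invalid" if covered else "not-found"

tight = [Roa("192.0.2.0/24", 24, 64500)]  # only the /24 that is really announced
loose = [Roa("192.0.2.0/24", 25, 64500)]  # also permits /25s that are never announced

forged = ("192.0.2.0/25", 64500)  # attacker announces a /25, claiming AS64500 as origin
print(validate(*forged, tight))                   # invalid
print(validate(*forged, loose))                   # valid
print(validate("192.0.2.0/24", 64499, tight))     # invalid
print(validate("198.51.100.0/24", 64500, tight))  # not-found
```

With the tight ROA, the forged /25 is rejected even though it names the right origin. With the loose ROA, the same announcement validates and then wins on longest-match.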

Risk two: Hearing lies about the Origin-AS

If somebody wants to tell lies about a prefix, aside from pushing to a longer prefix match, they can outright lie about the origination.

Let’s explore an example. Say you’re an AS over here in Asia. We’ll call that AS-A. There’s also AS-B over there in Europe, and if they say they originate all the prefixes you have, then for some parts of the BGP-speaking world, you won’t be the best path. AS-B will present as an Origin-AS on a shorter path.

As an example, this attack could try to subvert bank traffic, but probably not in the economy that the bank operates in. It would be an interception, so it would more likely target a different economy that hosts many tourists from the bank’s economy. It’s possible to trick those customers into using a hijacked web page to steal their passwords, then use the bank’s real web services to extract the money.

This is a bit like the scam in the film The Sting (1973). The scammers hear the real race announcements on one radio and pose as the race announcer to another room, where the victim is fooled into making huge bets. The defence against this is also in RPKI, because the primary purpose of ROAs (aside from dealing with the ‘size’ of the announcement) is to specify which Origin-AS is meant to be seen.

Regardless of how long somebody pushes a longest-match, if the ROA is set up to say the prefix should come from origin AS-A, attackers can’t get origin AS-B accepted by BGP speakers that perform Route Origin Validation (ROV). The IRR can also give a signal of your Origin-AS intention.

Risk three: Snooping

If RPKI (ROA and ROV) can protect against problems with both longest-matches and Origin-ASes, how can the bad guys still make things go wrong?

If an attacker can make BGP speakers think the attacker offers the shortest path (even if the Origin-AS hasn’t actually changed) then that traffic will opt to go through the attacker’s AS. This allows the attacker to ‘snoop’ on the contents of those packets. The goal here is to snoop, rather than deceive.

The problem here is that at present, RPKI can’t prevent this kind of attack. Nor can the IRR. This is because paths don’t exist in advance, but emerge in BGP as a function of each BGP speaker’s model of the world from things said and heard. The path can be secured if BGP itself changes to be BGPsec. In BGPsec, statements about the path are signed in BGP. This isn’t something ROV and ROAs protect; it’s a different cryptographic model using different keys, but is anchored in things done in RPKI, so it still relates. However, it’s a work-in-progress.
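A toy model shows why an origin check alone doesn’t stop snooping. The sketch below assumes a BGP speaker that only compares AS path length (real best-path selection weighs several other attributes first) and that checks nothing but the rightmost AS in the path, which is the origin. The AS numbers and function names are hypothetical.

```python
AUTHORISED_ORIGIN = 64500  # hypothetical AS number of the real originator

def origin_ok(as_path):
    """Origin-only check: the last (rightmost) AS in the path is the originator."""
    return as_path[-1] == AUTHORISED_ORIGIN

def select(paths):
    """Pick the shortest path among those that pass the origin check."""
    acceptable = [path for path in paths if origin_ok(path)]
    return min(acceptable, key=len, default=None)

honest = [64510, 64511, 64501, 64500]  # the path traffic should take
forged = [64499, 64500]                # attacker AS64499 falsely claims to sit next to the origin
wrong_origin = [64499]                 # attacker claims to originate the prefix outright

chosen = select([honest, forged, wrong_origin])
print(chosen)  # [64499, 64500]
```

The outright origin lie is filtered out, but the forged path keeps the correct origin, so it passes the check and wins on length. Traffic then flows through the attacker’s AS.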

There’s also an alternative under discussion in the IETF relating to a model where ‘possible’ paths are declared from the set of pairs of ASes that talk to each other. This wouldn’t entirely fix the snooping problem, but it would set limits on it.
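The idea of checking a path against declared pairs of ASes can be sketched as follows. This is a toy model and not the algorithm under discussion in the IETF: it assumes every adjacency is declared as an unordered pair, and the `declared` set, function name and AS numbers are invented for illustration.

```python
def path_plausible(as_path, adjacencies):
    """True if every hop in the path is between ASes that declared they talk to each other."""
    hops = zip(as_path, as_path[1:])
    # Skip repeated ASNs (prepending), which are not a hop between two ASes.
    return all(frozenset(hop) in adjacencies for hop in hops if hop[0] != hop[1])

# Hypothetical declarations, using documentation AS numbers.
declared = {frozenset(pair) for pair in [
    (64500, 64501), (64501, 64511), (64511, 64510), (64499, 64510),
]}

honest = [64510, 64511, 64501, 64500]
forged = [64499, 64500]                       # claims an adjacency nobody declared
detour = [64499, 64510, 64511, 64501, 64500]  # built only from declared pairs

print(path_plausible(honest, declared))  # True
print(path_plausible(forged, declared))  # False
print(path_plausible(detour, declared))  # True
```

The forged shortcut is rejected, but any path assembled from declared pairs still passes, whether or not traffic should go that way. That is the sense in which this approach limits snooping rather than eliminating it.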

Risk four: Being told about unallocated resources

A longstanding problem in BGP is that people can announce any prefix they like. This can be a lie about somebody else’s resources, but what if the attacker told a lie about a resource nobody currently thinks they have? This isn’t a hijack, because there’s no other BGP announcement being subverted. The attacker is just saying they have the authority to route something they don’t have the authority to route. This is a common mechanism for bad faith actors like spammers. They can pop up a BGP route, do something bad, and then drop it back down again.

In fact, it’s probably possible to persistently announce resources nobody else thinks are in use for quite a long time. RPKI can’t directly prevent this in the ISPs themselves.

However, the APNIC AS0 ROA system is in part designed to prevent this from happening for the unallocated resources that are part of APNIC’s management function. The AS0 ROA ensures the spaces APNIC delegates from can’t be abused this way. RPSL doesn’t directly address this problem, but there are other services in BGP that offer null route statements for these resources, which BGP speakers can subscribe to or peer with to control the risk.
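The effect of an AS0 ROA on a validating BGP speaker can be sketched like this. No announcement can ever match origin AS0, so any route covered by an AS0 ROA, and by no other ROA, ends up invalid whatever origin it claims. The sketch assumes exactly that situation (no other ROA covers the space); the prefix list, AS numbers and function name are hypothetical.

```python
import ipaddress

def drop_as0_covered(announcements, as0_prefixes):
    """Keep only announcements whose prefix is not covered by an AS0 ROA."""
    as0_nets = [ipaddress.ip_network(prefix) for prefix in as0_prefixes]
    kept = []
    for prefix, origin_as in announcements:
        route = ipaddress.ip_network(prefix)
        covered = any(
            route.version == net.version and route.subnet_of(net)
            for net in as0_nets
        )
        # Assumption: no other ROA covers this space, so 'covered' means invalid.
        if not covered:
            kept.append((prefix, origin_as))
    return kept

as0_prefixes = ["203.0.113.0/24"]  # hypothetical unallocated block
announcements = [
    ("203.0.113.0/25", 64499),  # inside the unallocated block
    ("203.0.113.0/24", 64500),  # the unallocated block itself
    ("192.0.2.0/24", 64500),    # unrelated prefix
]
kept = drop_as0_covered(announcements, as0_prefixes)
print(kept)  # [('192.0.2.0/24', 64500)]
```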

I’d also recommend reviewing the Bogon reports on the CIDR report (Geoff Huston) and the BGP Routing Report (Phil Smith) regarding visible unallocated routes in BGP.

Risk five: Conflicts when BGP doesn’t agree with RPSL and RPKI

A risk in having more than one information model about something is that the models might accidentally describe the world slightly differently. This can happen between BGP and RPKI, or BGP and RPSL, or all three. The result is that things don’t quite work as expected. There has to be some resolution process when the models of the world don’t align.

At present, that process is somewhat tenuous; for Bogon reporting, it’s managed by discussion. Individual BGP speakers have to self-police, and review things in the information systems they operate. NIST, NLnet Labs, IIJ, and others operate routing information systems that can reflect on clashes between BGP, RPKI and RPSL. APNIC is also looking into ways to add value to registry services here, with possibilities of alerting services for delegated resource holders in 2022.

Trust but verify

Why should you believe things in BGP? Believe them because you check them. You need to run RPKI to have the highest strength confidence in routing information. This is because only RPKI provides cryptographically verifiable statements about what has been said, regarding the routes you can see inside it.

Trust has to come from checking. This is another instance of ‘trust but verify’.

Things seen in BGP can be checked using filtering driven by external (outside of BGP) rules, such as statements made in RPSL format, in whois. This is an older method of verifying what is seen in BGP and it’s not ‘inside’ BGP. This allows diligent users to look at other BGP speakers via an external source and say “hmm... what have they declared in whois RPSL records that they want to do?” and make sure the two sources (BGP and RPSL) agree.
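As a rough sketch of that comparison, the snippet below pulls (prefix, origin) pairs out of RPSL route objects and flags anything seen in BGP that has no exactly matching object. The `route:` and `origin:` attributes are standard RPSL; the sample objects, the `EXAMPLE` source name, the announcements and the exact-match policy are assumptions for illustration, and real IRR-based filters are usually generated by dedicated tooling.

```python
import ipaddress

def parse_route_objects(text):
    """Extract (prefix, origin AS) pairs from RPSL route objects separated by blank lines."""
    pairs = set()
    for block in text.strip().split("\n\n"):
        attrs = {}
        for line in block.splitlines():
            key, _, value = line.partition(":")
            attrs.setdefault(key.strip().lower(), value.strip())
        if "route" in attrs and "origin" in attrs:
            prefix = str(ipaddress.ip_network(attrs["route"]))
            asn = int(attrs["origin"].upper().replace("AS", "", 1))
            pairs.add((prefix, asn))
    return pairs

# Hypothetical whois output containing two route objects.
IRR_DUMP = """
route:   192.0.2.0/24
origin:  AS64500
source:  EXAMPLE

route:   198.51.100.0/24
origin:  AS64501
source:  EXAMPLE
"""

registered = parse_route_objects(IRR_DUMP)
seen_in_bgp = [
    ("192.0.2.0/24", 64500),     # agrees with the IRR
    ("198.51.100.0/24", 64499),  # registered, but with a different origin
    ("203.0.113.0/24", 64502),   # no route object at all
]
unregistered = [route for route in seen_in_bgp if route not in registered]
print(unregistered)  # [('198.51.100.0/24', 64499), ('203.0.113.0/24', 64502)]
```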

This mechanism has low levels of protection. It’s based around secure secret passwords to update the RPSL record, and it also implicitly puts a huge obligation on the RPSL publisher (who operates the whois database) to be secure and up-to-date. Many people do this, and the worldwide IRR community probably comprises between a third and half of the global BGP speakers. Some people automate production of their entire BGP configuration from RPSL records using things like the IRRtoolkit. Other people maintain things by hand.

The higher level of trust in BGP can come from cryptography via ROAs. This is a higher standard of trust in an assertion about intent to originate, and it’s managed slightly differently to things in the IRR world. Instead of using the ROA to construct filters, the recommendation is to use a protocol called RPKI-RTR to send BGP origin statements to a system that maintains a view of the cryptographically signed ROA statements, and blesses (or disavows, or simply says “I don’t know”) things that can be seen. This mechanism works reasonably well against ‘lies’ about origins, but is not entirely sufficient to protect things seen in BGP because of the second value in a BGP update, the path.

You can still be told a valid origin, but have a false statement made about the sequence of ASNs seen to get to that origin. This is a weakness that needs some thinking about, and is still subject to the IETF working group process to try and narrow down the problems with fixing it. Part of the problem is that the path is completely different for each BGP speaker. It has to reflect how they see the Internet, and how things approach them specifically. Therefore, it’s a lot harder to pre-compute things about path, because it’s specific to each of the 70,000 BGP speakers, and where they sit and what they hear.

That sums up some of the risks inherent in BGP, and what’s being done about them. In the next post in this series, we’ll tackle NATs and why they’re both necessary and (sometimes) frustrating.

To learn more about BGP and routing, check out the range of free courses and webinars on the APNIC Academy.

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.