Where did my packet go? Measuring the impact of RPKI ROV

For a long time, we at RIPE Labs have been encouraging operators to enable Resource Public Key Infrastructure (RPKI) Route Origin Validation (ROV) in their networks. We still do: if you are not currently doing RPKI ROV, please consider doing so. However, what we have not yet fully explored is the effect ROV has on where your packet actually ends up. After all, the aim of RPKI is to ensure that packets end up at their intended location, and merely doing ROV does not guarantee that.

BGP and RPKI

Let us take a step back for a moment. The Internet, in its physical form, is essentially a large cluster of cables going all around the globe. Every source and destination must have an address, namely an IP address. These addresses are managed by Regional Internet Registries (RIRs), such as RIPE NCC and APNIC. Figuring out which route to take requires something called the Border Gateway Protocol (BGP), which enables a network to announce to other networks which IP addresses can be reached via them. These networks are also called Autonomous Systems (ASes), and they are assigned a unique number, an ASN. When another AS receives this announcement, they can propagate it further by prepending their ASN to the path. Repeat this several times, and one learns how to reach pretty much every IP address currently in use.
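As a rough illustration of the prepending step described above, here is a minimal Python sketch. The `propagate` helper and the neighbour AS64500 (an ASN reserved for documentation) are assumptions made for this example; real BGP speakers also apply policy, timers and many more attributes.

```python
def propagate(announcement, own_asn):
    """Return the announcement as own_asn would pass it on, or None.

    An announcement is modelled as (prefix, as_path), where as_path is a
    tuple with the origin AS as its last element.
    """
    prefix, as_path = announcement
    # BGP loop prevention: an AS ignores paths that already contain its ASN.
    if own_asn in as_path:
        return None
    # Prepend our own ASN before handing the announcement to neighbours.
    return (prefix, (own_asn,) + as_path)


# AS3333 originates its prefix; AS3320 and a hypothetical AS64500 pass it on.
at_origin = ("193.0.0.0/21", (3333,))
via_3320 = propagate(at_origin, 3320)
via_64500 = propagate(via_3320, 64500)
print(via_64500)
```

The last element of the path is the origin AS, which is the part that RPKI ROAs make statements about.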

This is a system that largely hinges on ‘good faith’: there is nothing stopping me from announcing AS1133 (the University of Twente) as the origin for 193.0.0.0/21 (the prefix used by the RIPE NCC). RPKI provides objects called Route Origin Authorizations (ROAs) that enable the rightful holder of an IP address prefix (in this case, the RIPE NCC) to make verifiable statements about which ASN that prefix may originate from. In this case, the RIPE NCC has a ROA for 193.0.0.0/21-21 for AS3333, the ASN assigned to the RIPE NCC. Additionally, the ROA states how specific the announcements may be (via the max length attribute), which matters if, for example, a 193.0.1.0/24 announcement for AS3333 is also expected. The /21-21 here indicates that only the /21 announcement should be valid.

With the data that ROAs provide, an operator can classify incoming BGP announcements, and group them into three states — valid, invalid, and not found. As with everything BGP, what an operator does with these states is up to their policy, but generally speaking, it is advised to reject invalids and accept valids and not founds. This best practice is called ROV.
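The classification into the three states can be sketched in a few lines of Python. This is a simplified model that uses the RIPE NCC ROA from the example above; the function name `rov_state`, the tuple layout of `ROAS` and the non-covered example announcement (192.0.2.0/24 from AS64500, both reserved for documentation) are assumptions for illustration, not the API of any particular validator.

```python
from ipaddress import ip_network

# Each ROA is modelled as (prefix, max length, authorized origin ASN).
ROAS = [(ip_network("193.0.0.0/21"), 21, 3333)]


def rov_state(prefix, origin_asn, roas=ROAS):
    """Classify an announcement as 'valid', 'invalid' or 'not-found'."""
    announced = ip_network(prefix)
    covered = False
    for roa_prefix, max_length, roa_asn in roas:
        # A ROA covers the announcement if the announced prefix lies
        # inside the ROA prefix (same address family required).
        if announced.version != roa_prefix.version:
            continue
        if not announced.subnet_of(roa_prefix):
            continue
        covered = True
        # Valid needs a matching origin and a length within max length.
        # AS0 never matches, so an AS0 ROA can only make things invalid.
        if roa_asn != 0 and roa_asn == origin_asn \
                and announced.prefixlen <= max_length:
            return "valid"
    # Covered by at least one ROA but matched by none: invalid.
    return "invalid" if covered else "not-found"


print(rov_state("193.0.0.0/21", 3333))   # the legitimate announcement
print(rov_state("193.0.0.0/21", 1133))   # wrong origin
print(rov_state("193.0.1.0/24", 3333))   # right origin, too specific
print(rov_state("192.0.2.0/24", 64500))  # no covering ROA
```

Note that an announcement only becomes invalid when some ROA covers it. Prefixes without any ROA stay not found, which is why the usual advice is to accept them.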

ROV

There have been measurements of RPKI ROV adoption that look at who is doing ROV. Unfortunately, simply doing ROV does not mean that your traffic will not end up at an origin AS that ROV marks as invalid. Even though you receive the full path via BGP, you do not specify the full path your packet should take when you send it onwards; you only hand it to the next hop.

Let us take the previous RIPE NCC example. If I announced 193.0.0.0/22 with AS1133 as the origin-AS, generally speaking, this announcement will be preferred as it’s more specific than the 193.0.0.0/21 announcement for AS3333. You’d receive two announcements: one for 193.0.0.0/22 with path AS3320 → AS1133 and one for 193.0.0.0/21 with path AS3320 → AS3333. You might do RPKI ROV, so you reject the first announcement and use the second one. However, your traffic still goes towards AS3320 (Deutsche Telekom), which does not do ROV yet. It receives your packet, sees both announcements from AS1133 and AS3333, and sends it to the most specific one. This means that even though you do ROV, your traffic still ends up in a different location than you intended. Fortunately, the inverse holds true as well. If you did not do ROV, you would accept both announcements and use the most specific one, which also goes via AS3320. If Deutsche Telekom did do ROV, your traffic might then end up at the intended location without you doing ROV explicitly.
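The two-hop scenario above can be played through with a small Python model. It is a sketch under simplifying assumptions: each network holds only the two routes from the example, the ROV state of each route is written in by hand, and the helper names `forward` and `final_origin` are made up for this illustration.

```python
from ipaddress import ip_address, ip_network

# The two routes from the example: (prefix, origin ASN, ROV state).
ROUTES = [
    (ip_network("193.0.0.0/21"), 3333, "valid"),
    (ip_network("193.0.0.0/22"), 1133, "invalid"),
]


def forward(destination, does_rov, routes=ROUTES):
    """Origin ASN picked by longest-prefix match, or None if no route."""
    address = ip_address(destination)
    # A network doing ROV drops invalid routes before they are installed.
    usable = [r for r in routes if not (does_rov and r[2] == "invalid")]
    matches = [r for r in usable if address in r[0]]
    if not matches:
        return None
    return max(matches, key=lambda r: r[0].prefixlen)[1]


def final_origin(destination, you_do_rov, upstream_does_rov):
    """Where the packet ends up when every path goes via the upstream."""
    # Your own table only decides whether you send the packet at all;
    # both of your routes point at AS3320 as the next hop.
    if forward(destination, you_do_rov) is None:
        return None
    # The upstream then makes its own, independent forwarding decision.
    return forward(destination, upstream_does_rov)


for you in (True, False):
    for upstream in (True, False):
        print(you, upstream, final_origin("193.0.0.1", you, upstream))
```

In this model the outcome depends only on the upstream's behaviour: with an upstream that does not filter, the packet reaches AS1133 whether or not you do ROV yourself.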

As we can see, the impact of ROV is not as simple as just measuring how many ASes do ROV. We wanted to explore this further and set up an experiment.

The experiment

We wanted to measure where the traffic goes. To do so, we have a ROA that authorizes AS211321 to announce 2a04:b905::/32-32 and 203.119.22.0/23-23. We announce 2a04:b905::/32 and 203.119.22.0/23 from Vultr (AS20473) in Sydney, and 2a04:b905::/33 and 203.119.22.0/24 from ColoClue (AS8283) in Amsterdam. If everyone did ROV, then the traffic for addresses inside 2a04:b905::/33 and 203.119.22.0/24 (for example, 2a04:b905::2) should go to Sydney, because the more specific announcements exceed the ROA's max length and are therefore invalid.

Our hypothesis is that if one link in the chain is missing (that is, the one AS that accepts the more specific), then the traffic will not end up in the intended location. To measure this we have set up two RPKI publication points, parent.rov.koenvanhove.nl (2a04:b905:8000::1 and 203.119.23.1) and child.rov.koenvanhove.nl (2a04:b905::2 and 203.119.22.2). All traffic will go to Sydney for the parent (because that is the only place it’s announced), and we measure where the traffic for the child ends up.

Every Internet measurement has bias. By using RPKI publication points, we can ensure a steady stream of Internet traffic that is more likely to do RPKI ROV (and drop invalids) than the average. After all, we assume that an organization that runs an RPKI validator (and thus hits our publication points) is more likely to do ROV than an organization that does not run an RPKI validator.

To check whether an organization does ROV and drops invalids, we have set up a third publication point invalid.rov.koenvanhove.nl, which is accessible on 203.119.21.1 and 2001:ddb::1. The prefixes they belong to (203.119.21.0/24 and 2001:ddb::/48) have no less specific prefix announcements associated with them, and both have an AS0 ROA associated with them. This means that those networks that do ROV and drop invalids should never reach the invalid.rov.koenvanhove.nl publication point, as there is no route to it. This third publication point is also hosted at Vultr in Sydney.
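To make the setup concrete, the sketch below classifies the six announcements of the experiment against the ROAs described in the text. Two things are assumptions: that AS211321 is the origin of every announcement (the text names Vultr and ColoClue as the places the prefixes are announced from), and that the AS0 ROAs use a max length equal to their prefix length. The `state` helper is illustrative, not a real validator.

```python
from ipaddress import ip_network

# ROAs from the text: (prefix, max length, origin ASN).
# The max lengths of the two AS0 ROAs are assumed, not stated.
ROAS = [
    ("2a04:b905::/32", 32, 211321),
    ("203.119.22.0/23", 23, 211321),
    ("203.119.21.0/24", 24, 0),
    ("2001:ddb::/48", 48, 0),
]

ORIGIN = 211321  # assumed origin AS for all announcements

# Announced prefix -> site it is announced from.
ANNOUNCEMENTS = {
    "2a04:b905::/32": "Vultr, Sydney",
    "203.119.22.0/23": "Vultr, Sydney",
    "2a04:b905::/33": "ColoClue, Amsterdam",
    "203.119.22.0/24": "ColoClue, Amsterdam",
    "2001:ddb::/48": "Vultr, Sydney",
    "203.119.21.0/24": "Vultr, Sydney",
}


def state(prefix, origin=ORIGIN):
    """Return 'valid', 'invalid' or 'not-found' for one announcement."""
    announced = ip_network(prefix)
    covering = []
    for roa_prefix, max_length, asn in ROAS:
        roa_net = ip_network(roa_prefix)
        if roa_net.version == announced.version \
                and announced.subnet_of(roa_net):
            covering.append((max_length, asn))
    if not covering:
        return "not-found"
    matched = any(asn != 0 and asn == origin
                  and announced.prefixlen <= max_length
                  for max_length, asn in covering)
    return "valid" if matched else "invalid"


STATES = {prefix: state(prefix) for prefix in ANNOUNCEMENTS}
for prefix, site in ANNOUNCEMENTS.items():
    print(f"{prefix:18} {site:20} {STATES[prefix]}")
```

Under these assumptions only the two Sydney covering prefixes are valid. The Amsterdam more specifics fail on max length, and the AS0-covered prefixes are invalid for any origin, so a network that drops invalids has no route at all to the third publication point.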

Results

Roughly a quarter (25%) of traffic ends up at ColoClue in Amsterdam. The other 75% ends up at the intended location at Vultr in Sydney. The precise numbers differ slightly depending on whether you count the number of requests, IP addresses, IP prefixes (and which size), or ASes those IP prefixes belong to. However, checks with the NLNOG ring and RIPE Atlas show a similar picture to this 75%/25% division. Interestingly enough, the physical location of the origin did not have a strong influence on where a packet ended up. There are RPKI validators in Australia that had their traffic directed to ColoClue in Amsterdam, and there are RPKI validators in Amsterdam that had their traffic directed to Vultr in Sydney. There seems to be no clustering of either group, at least geographically.

Using the check with our always-invalid publication point, we end up with the following numbers:

	Ends up at ColoClue in Amsterdam	Ends up at Vultr in Sydney
Drops invalids	304	1,650
Does not drop invalids	600	628

Table 1 — The physical location of the origin did not have a strong influence on where a packet ended up.
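As a quick worked calculation on the counts in Table 1, the following Python sketch computes the share of each row that ended up in Amsterdam. The variable and function names are made up for this example, and the shares are only as reliable as the 'drops invalids' classification itself.

```python
# Counts from Table 1: (ended up in Amsterdam, ended up in Sydney).
TABLE = {
    "drops invalids": (304, 1650),
    "does not drop invalids": (600, 628),
}


def amsterdam_share(amsterdam, sydney):
    """Fraction of a row that reached the unintended location."""
    return amsterdam / (amsterdam + sydney)


for label, (amsterdam, sydney) in TABLE.items():
    print(f"{label}: {amsterdam_share(amsterdam, sydney):.1%}")

total_amsterdam = sum(a for a, _ in TABLE.values())
total_sydney = sum(s for _, s in TABLE.values())
overall = amsterdam_share(total_amsterdam, total_sydney)
print(f"overall: {overall:.1%}")
```

This gives roughly 15.6% for the group that drops invalids, 48.9% for the group that does not, and 28.4% overall, which is in line with the rough quarter mentioned earlier. Even among those dropping invalids themselves, about one in six still reached Amsterdam.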

There is a caveat: invalid.rov.koenvanhove.nl does not seem to be as well connected as invalid.rpki.cloudflare.com. For example, while invalid.rpki.cloudflare.com is reachable from T-Mobile Thuis, invalid.rov.koenvanhove.nl is not. Some networks may therefore be counted as dropping invalids simply because they cannot reach our publication point, so the true number of ‘drops invalids’ is likely lower.

Challenges

Along the way, we stumbled upon some challenges. When we first started the experiment, over 99% of traffic went to ColoClue in Amsterdam. The only packets that did arrive in Sydney were those from validators within Vultr in Sydney. Upon further investigation, it turned out that even traffic that reached Vultr would be rerouted to Amsterdam at their edges, as Vultr itself does not do RPKI ROV. We worked around this issue by also announcing 2a04:b905::/33 from Vultr in Sydney with a BGP community that ensures the announcement is not propagated outside Vultr. It is, however, good to realize that without this workaround, nearly all our traffic would have been misdirected.

Additionally, we noticed that not all networks that handle IPv6 traffic also support IPv6 on their validators. Initially, the ROA for 2a04:b905::/32-32 was only hosted on parent.rov.koenvanhove.nl, which used to be IPv6-only. In our inspection at several looking glasses, we noticed that several networks (including large networks) would still consider the announcement for 2a04:b905::/32 unknown. We also got several reports from organizations that our IPv6-only ROA caused internal VRP inconsistencies, because some of their validators supported IPv6 and some did not.

The importance of RPKI ROV

We looked at what impact RPKI ROV has on where the packet actually ends up. We see that in a lot of cases the packet ends up in the intended location, even if the organization itself does not do RPKI ROV, by merit of the upstream doing RPKI ROV. We also see that there are cases where the organization does do RPKI ROV, but the upstream still sends the packet to the unintended destination. It highlights once again how important RPKI ROV is, but also how important it is to have a varied set of upstreams. In our initial case, Vultr acted as our only upstream, causing us to lose nearly all traffic.

Koen van Hove is a researcher at the University of Twente on the topic of network security, with a focus on DoS risks.

Willem Toorop co-authored this post.

This post was originally published on RIPE Labs.

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.