Solo effort to clean up RPKI invalids across a region

By Peter Peele on 26 Jul 2021

With the growing adoption of Resource Public Key Infrastructure (RPKI) Route Origin Validation (ROV) to make the Internet more secure and resilient, it is becoming increasingly important for IP network operators and IP prefix holders to ensure their Border Gateway Protocol (BGP) advertisements are not seen as RPKI invalid on the Internet. Recent years have seen a significant increase in the number of IP transit providers (across tiers) and IXPs implementing ROV and filtering invalid prefix-origin AS pairs.

This article describes an initiative to alert network operators in the AFRINIC service region that have RPKI invalid prefix-origin pairs, and to propose corrective measures to them.

Background

Although the concept dates back a decade, RPKI-based ROV (RFC 7115) has, in the past three years, been on the lips of every network operator that cares about the resilience of the Internet. Many tier 1 and tier 2 networks have fully implemented ROV, meaning they drop RPKI invalid prefixes.

A prefix received via BGP can be RPKI valid, NotFound, or invalid (RFC 6811). RPKI invalid prefix-origin pairs can occur due to a BGP ‘route leak’ (RFC 7908) or incorrect creation of Route Origin Authorizations (ROAs). Incorrect ROAs are, in many cases, due to a ‘fat finger’, a design oversight, or not fully understanding the implications of certain ROA configurations.

At the time of implementing this initiative, there were just above 100 RPKI invalid prefixes (IPv4) being announced that either belong to AFRINIC Autonomous System Numbers (ASNs) and are announced by non-AFRINIC ASNs or are announced incorrectly by the rightful AFRINIC ASN. The latter case occurs when a prefix is deemed RPKI invalid due to the MaxLength property of a ROA. A network owner may have created a ROA with a MaxLength that is less than the length of the prefix being advertised. This is usually an internal mistake the operator may not be aware of.

A network operator may advertise a prefix with different lengths to different upstream providers or BGP peers based on a routing policy that best suits their business. When the more specific prefix (longer length) is dropped by networks with RPKI-based ROV enabled, the intended routing behaviour becomes sub-optimal.

For example, if a network operator has multiple paths for a prefix, they could advertise the more specific prefix (for example, A.B.C.D/24) over the preferred, shorter and cheaper path and a less specific prefix (for example, A.B.C.D/16) over the secondary path. If the network operator has a ROA with a MaxLength of 16, the A.B.C.D/24 advertisement would be RPKI invalid due to the MaxLength. Upstream network operators who receive both prefixes from different sources, but have RPKI-based ROV fully enabled, would therefore only use the secondary path. This is an oversimplified example, but I hope it demonstrates the unintended cost implications for the network operator and/or the compromised end-user experience that can arise from this oversight.
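To make the MaxLength case concrete, here is a minimal Python sketch of the origin validation logic described in RFC 6811, applied to the example above. The prefix 10.10.0.0/16, AS64500, and the names `Roa` and `validate` are placeholders chosen for illustration, not data from this initiative, and a production validator handles far more detail than this.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import Iterable


@dataclass(frozen=True)
class Roa:
    """A simplified ROA entry (hypothetical structure for illustration)."""
    prefix: str
    max_length: int
    asn: int


def validate(route_prefix: str, origin_asn: int, roas: Iterable[Roa]) -> str:
    """Return 'valid', 'invalid' or 'notfound' for a prefix-origin pair."""
    route = ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # A ROA only applies if its prefix covers the announced route.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # Valid needs a matching origin AND a length within MaxLength.
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered by at least one ROA but matched by none: invalid.
    return "invalid" if covered else "notfound"


if __name__ == "__main__":
    # Placeholder ROA: a /16 with MaxLength 16, as in the example above.
    roas = [Roa(prefix="10.10.0.0/16", max_length=16, asn=64500)]
    print(validate("10.10.0.0/16", 64500, roas))  # valid
    print(validate("10.10.1.0/24", 64500, roas))  # invalid (exceeds MaxLength)
    print(validate("10.20.0.0/16", 64500, roas))  # notfound (no covering ROA)
```

The /24 is announced by the rightful ASN and is still invalid, which is exactly the situation an operator may not notice until an upstream starts dropping it.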

Consequently, there is some value to be gained in alerting network operators to their RPKI invalid prefix-origin pairs, and cleaning these up is good for the general health of the Internet.

Methodology

The procedure followed to implement this initiative can be broken down into four simple steps:

1. Collecting a list of AFRINIC RPKI invalids
2. Gathering contact details of ASNs involved
3. Contacting the network operators with diagnosis and suggested remedies
4. Monitoring operator feedback or changes to status

Collecting a list of RPKI invalids for the AFRINIC service region

Figure 1 — Screenshot of parts of the NIST RPKI Monitoring Tool menu with ‘Invalid Prefix-Origin Pairs’.

There are several platforms available to analyse RPKI statistics depending on specific research requirements. NLnet Labs maintains a useful list on their RPKI ReadTheDocs Resources page. My case required a tool that would allow me to filter a list of RPKI invalid prefix-origin pairs by Regional Internet Registry (RIR). My closest match was the US National Institute of Standards and Technology (NIST) RPKI Monitor.

When I initially wanted to start this project, I found the NIST tool’s feature for filtering by RIR was buggy. After several calls for help on various social media platforms such as Twitter and the RPKI Community on Discord, someone at NIST must have heard my cry and fixed it as part of release version 2 of the tool in May 2021.

Glad to see that NIST fixed this bug… shaves off some time from my after-hours project this month to get some of Africa's RPKI Invalids cleaned up. https://t.co/OKUxuWHIgu
— Peter Peele (@peterpeele) June 2, 2021

The NIST data used in this initiative was collected on and for 22 June 2021. A bonus of the NIST Monitoring tool is that its analysis features include the ability to expand an RPKI invalid prefix-origin pair to show the covering prefixes (as shown in Figure 2). That saved me some time digging, but as a sanity check I confirmed the existence of the associated ROA through another web-based RPKI validator, such as AFRINIC’s deployment of the Routinator validator.

Figure 2 — A detailed view of the analysis feature of each RPKI invalid prefix-origin pair on the NIST RPKI Monitoring tool.

Further details of each involved ASN and peering relationships were extracted from bgp.he.net. It’s also important to note that the above exercise was only performed for IPv4.

Collecting network operator contact details

An indirect achievement of this initiative was being able to verify whether the network operator’s PeeringDB records included a working email address. This speaks to Action 3 of the Mutually Agreed Norms for Routing Security (MANRS) for Network Operators, which requires an operator, at minimum, to have up-to-date contact information on PeeringDB.

Where a network operator had no contact information on PeeringDB, I searched through the operator’s IRR objects (RFC 2650).

Contacting network operators

With a diagnosis of the problem and some options the operator could consider to remedy it, I contacted each operator via email. The following is an example of an email I sent to an operator whose RPKI invalid was due to MaxLength:

Figure 3 — A screenshot of a typical email sent to relevant network operators. Specific details have been removed to protect the reputation of the network operators involved.

Monitoring

Monitoring consisted of regularly checking the RPKI status of the recorded prefixes on the NIST Monitoring tool.

Results

As mentioned earlier, there were 107 AFRINIC RPKI invalid prefix-origin pairs when the data was initially collected. Thirty-nine pairs were invalid due to the ROA MaxLength. These 39 were deemed low-hanging fruit as they required the least complicated solutions. In many cases, an ASN had several of these RPKI invalid prefix-origin pairs and therefore needed to be contacted only once for multiple prefixes, significantly reducing the amount of work for the 39.
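The triage described above (separating MaxLength cases from origin mismatches, then contacting each ASN once) can be sketched roughly as follows. This is only an illustration under stated assumptions: the input tuples, the sample prefixes and ASNs, and the function names are hypothetical, and grouping by the announcing ASN is one possible choice rather than a record of how the list was actually handled.

```python
from collections import defaultdict


def classify(origin_asn, covering_roa_asns):
    """Classify a pair that is ALREADY known to be RPKI invalid.

    If a covering ROA names the announcing ASN, the origin is right and the
    announcement must be longer than MaxLength; otherwise the origin differs.
    """
    return "maxlength" if origin_asn in covering_roa_asns else "origin-mismatch"


def contact_list(invalid_pairs):
    """Group invalid prefixes per announcing ASN and per reason."""
    grouped = defaultdict(lambda: defaultdict(list))
    for prefix, origin_asn, covering_roa_asns in invalid_pairs:
        reason = classify(origin_asn, covering_roa_asns)
        grouped[origin_asn][reason].append(prefix)
    return {asn: dict(reasons) for asn, reasons in grouped.items()}


if __name__ == "__main__":
    # Hypothetical sample: (invalid prefix, announcing ASN, ASNs in covering ROAs)
    sample = [
        ("10.10.1.0/24", 64500, {64500}),
        ("10.10.2.0/24", 64500, {64500}),
        ("10.30.0.0/22", 64501, {64502}),
    ]
    for asn, reasons in contact_list(sample).items():
        print(f"AS{asn}: {reasons}")
```

With the sample data, AS64500 receives one email covering both MaxLength prefixes instead of two separate ones.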

There were instances where a network operator owning multiple ASNs advertised different subnets from different ASNs but had created a single ROA covering all of those subnets. As you can guess, these are large networks where organizational complexity can also contribute to such messy situations. I wasn’t holding my breath for a response on this one, but I proposed that they undo the quick and easy shortcut of defining a ROA with MaxLength=24. This MaxLength=24 method is common because it requires less planning and maintenance. However, it can come back to bite you when your network design gets sticky, as in this case, and it also makes you vulnerable to ASN spoofing attacks.
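A rough way to see why a loose MaxLength widens the exposure is to count how many distinct announcements a single ROA authorizes. The sketch below assumes a /16 ROA purely as an example (the article does not state the prefix sizes involved), and the function name is made up for illustration.

```python
def authorized_announcements(prefix_length: int, max_length: int) -> int:
    """Count the distinct prefixes a ROA authorizes for its ASN.

    At each length L between prefix_length and max_length there are
    2 ** (L - prefix_length) sub-prefixes, so the total is a geometric sum.
    """
    if max_length < prefix_length:
        raise ValueError("MaxLength cannot be shorter than the ROA prefix length")
    return 2 ** (max_length - prefix_length + 1) - 1


if __name__ == "__main__":
    print(authorized_announcements(16, 16))  # 1: only the /16 itself
    print(authorized_announcements(16, 24))  # 511: the /16 and every subnet down to /24
```

Any of those 511 announcements validates as long as it carries the ROA's ASN as origin, including more specifics the operator never announces, which is what makes a spoofed origin attractive to an attacker.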

There were some networks without PeeringDB records. Through IRR records, I could gather some email addresses, and I asked these operators to update their PeeringDB contact details in addition to correcting their RPKI data. Only 10 networks were contacted. Some networks in the rest of the list were just not worth contacting. From studying the ASNs involved, you could detect cases of badly managed IP brokering or undesirable dealings between ASNs, and pursuing those would have strayed from the objective of this initiative.

Out of the 10 networks contacted, only two had responded to my email at the time of writing this article. One of the messages was in good faith and was from someone who also cares about the health of the Internet. Both networks promised to address the issues raised.

Update (1 July): The two network operators who responded to my email fixed their ROAs. In addition to that, 20 out of 25 RPKI invalid prefixes belonging to one ASN had been fixed even though the network operator never responded to my email. I sent them a thank you email anyway.

Update (5 July): An additional network operator heeded my call and fixed one out of the five RPKI invalid prefixes I had asked them to look into.

Discussion and conclusion

The low count of RPKI invalid prefix-origin pairs found in this exercise should not be celebrated as a sign of cleanliness in the region. RPKI adoption in the AFRINIC service region is still relatively low. For example, RPKI NotFound makes up 89% of the total unique prefix-origin pairs in the AFRINIC region compared to the global 68.9% for IPv4.

Figure 4 — AFRINIC prefix-origin pairs. Source NIST RPKI Monitor.

This exercise helped to verify whether network operators in the AFRINIC region who have implemented Action 4 of the recommended MANRS actions have done so in a way that does not harm their business.

Beyond this initiative, regularly checking the status of prefix-origin pairs in your region is necessary to further promote the adoption of RPKI-based ROV. It helps ensure that network operators don’t have self-inflicted, business-harming configurations in their networks that land them on the RPKI dark side.

Ideally, this should be a regular exercise done by upstream network operators who have ROV enabled to ensure their downstream customer networks don’t have any RPKI invalid prefix-origin pairs. I’m guessing that if an operator is contacted by another operator, they’d heed the call better than when contacted by some random Internet individual.

This post was originally published at The Peele Cryptex.

Peter Peele is a Computer Engineer.

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.

2 Comments

Musa Stephen Honlue July 27, 2021 at 5:57 pm

I really like it, let’s keep it up with more proactivity.

Doug Montgomery August 4, 2021 at 6:31 am

Sorry we missed your first “call for help”. We are developing new analysis features for the NIST RPKI monitor. If you have suggestions, use the “Feedback” link at the top of the monitor page.