Cleaning up ROAs inconsistent with the BGP state

By Nusenu on 16 Oct 2018

Category: Tech matters

Until recently, few network operators protected themselves from BGP hijacking attacks by implementing some form of Route Origin Validation (ROV).

One form of ROV is supported by the Resource Public Key Infrastructure (RPKI). It uses so-called Route Origin Authorizations (ROAs), which allow BGP prefix-origin pairs to be validated before they are accepted by the receiving router.

An Autonomous System (AS) performing ROV is less vulnerable to accepting hijacked BGP announcements than an AS not doing so. ROV increases the chance that traffic going out of the AS is routed to the correct destination AS (if a ROA for the destination prefix exists).

Figure 1 — An example of a RPKI ROA, which authorizes AS13335 to announce the prefix 1.1.1.0/24 (other ASes are not authorized to announce 1.1.1.0/24). A BGP announcement originated by any other AS would be flagged as INVALID by any router performing ROV.
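As a rough illustration of what a router performing ROV does with such a ROA, the Python sketch below classifies a prefix-origin pair in the style of RFC 6811. The `Roa` tuple and the `validate` function are names made up for this example, the maxLength of 24 for the ROA in Figure 1 is an assumption, and AS64496 and 192.0.2.0/24 are documentation values rather than real announcements.

```python
import ipaddress
from typing import Iterable, NamedTuple


class Roa(NamedTuple):
    asn: int
    prefix: str
    max_length: int


def validate(prefix: str, origin_asn: int, roas: Iterable[Roa]) -> str:
    """Classify a BGP prefix-origin pair as VALID, INVALID or NOT_FOUND."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # A ROA only matters if its prefix covers the announced prefix.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # VALID needs a matching origin AS and a length within maxLength.
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "VALID"
    return "INVALID" if covered else "NOT_FOUND"


# ROA from Figure 1; maxLength 24 is assumed for this sketch.
roas = [Roa(13335, "1.1.1.0/24", 24)]

print(validate("1.1.1.0/24", 13335, roas))    # VALID
print(validate("1.1.1.0/24", 64496, roas))    # INVALID (wrong origin AS)
print(validate("192.0.2.0/24", 64496, roas))  # NOT_FOUND (no covering ROA)
```

A prefix with no covering ROA stays NOT_FOUND (unknown) rather than INVALID, which is why ROV only helps for destinations that have a ROA.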

RPKI-based ROV is gaining traction (examples: 1, 2, 3, 4, 5, 6, 7). However, one problem is the number of broken legacy ROAs, which cause many INVALID prefix-origin pairs.

In the past, IP holders creating incorrect ROAs weren’t a problem: broken ROAs had no negative effect because no one performed validation. However, these inconsistent ROAs are now causing many RPKI INVALID BGP announcements (INVALIDs), which is, to some extent, blocking wider ROV adoption.

In this post, I’ll analyse RPKI INVALID BGP announcements to answer the following questions:

  • What does the distribution of INVALIDs look like across the Regional Internet Registries (RIRs)?
  • Which ROAs cause most INVALIDs?
  • Which IP holders can eliminate many INVALIDs by modifying just a few ROAs?
  • How many entities need to be contacted to solve all INVALIDs causing unreachable prefixes?

Answering these questions should help to efficiently contact affected parties, and set us on a course towards cleaning up and limiting the number of unreachable INVALIDs, so we can start identifying BGP hijacks more confidently.

Filtering INVALIDs to focus on unreachable IP space

As of 14 September 2018, there are more than 6,700 INVALID prefix-origin pairs (IPv4 + IPv6). But how many prefixes would actually become unreachable in a ROV environment?

To answer that question, I filtered the complete set of INVALIDs to those that actually result in unreachable IP space.

Here are a few examples of INVALIDs I excluded because they don’t result in unreachable IP space: in each case, an alternative prefix-origin pair covers for the INVALID one.

VALID/unknown less-specific available

Even though 5.1.0.0/23 is INVALID due to an ASN mismatch, it is still reachable because it is a part of 5.1.0.0/19 (which is VALID):

Figure 2 — Less-specific covers for more-specific.

VALID equally-specific available

Figure 3 — Equally-specific prefix-origin pair is available.

Multiple VALID more-specifics available

Figure 4 — Multiple more-specific cover for INVALID less-specific (100% overlap).

After filtering out these cases, which do not result in unreachable prefixes, we are down to 2,415 (IPv4: 2,323 + IPv6: 92) prefix-origin pairs (2,403 unique prefixes). This remaining set is actually unreachable in an environment that performs ROV.
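The three exclusion cases above can be sketched in Python roughly as follows. This is not the code used for the analysis, only a minimal reconstruction of the idea: the `Route` tuple and `is_unreachable` function are hypothetical names, the origin ASNs are documentation placeholders, and 198.51.100.0/24 is a documentation prefix added to show a case with no alternative.

```python
import ipaddress
from typing import List, NamedTuple


class Route(NamedTuple):
    prefix: str
    origin: int
    state: str  # "VALID", "INVALID" or "NOT_FOUND" (unknown)


def is_unreachable(invalid: Route, table: List[Route]) -> bool:
    """True if no VALID/unknown announcement fully covers the INVALID prefix."""
    target = ipaddress.ip_network(invalid.prefix)
    candidates = (ipaddress.ip_network(r.prefix) for r in table if r.state != "INVALID")
    usable = [net for net in candidates if net.version == target.version]
    # A less-specific or equally-specific alternative exists (Figures 2 and 3).
    if any(target.subnet_of(net) for net in usable):
        return False
    # More-specifics must together cover 100% of the prefix (Figure 4).
    inside = [net for net in usable if net.subnet_of(target)]
    covered = sum(net.num_addresses for net in ipaddress.collapse_addresses(inside))
    return covered < target.num_addresses


table = [
    Route("5.1.0.0/23", 64496, "INVALID"),       # origin ASN is a placeholder
    Route("5.1.0.0/19", 64497, "VALID"),         # origin ASN is a placeholder
    Route("198.51.100.0/24", 64496, "INVALID"),  # no alternative announcement
]
unreachable = [r.prefix for r in table if r.state == "INVALID" and is_unreachable(r, table)]
print(unreachable)  # ['198.51.100.0/24']
```

Because the last check requires full coverage, a prefix that is only partly covered by more-specifics still counts as unreachable in this sketch.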

There are also prefixes with partial reachability as seen in the example below. For simplicity, let’s consider the entire prefix (202.57.120.0/21) unreachable even though half of it is reachable. So this one is also part of the 2,415 prefix-origin pairs we’d consider unreachable.

Figure 5 — Example of a partially (50%) reachable prefix.
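To make the 50% figure concrete, here is a small Python sketch that computes which share of a prefix is covered by reachable more-specifics. The function name `reachable_fraction` and the single more-specific 202.57.120.0/22 are assumptions for illustration; the actual more-specifics behind Figure 5 are not listed in the text.

```python
import ipaddress
from typing import List


def reachable_fraction(prefix: str, more_specifics: List[str]) -> float:
    """Share of `prefix` covered by the given reachable more-specifics."""
    target = ipaddress.ip_network(prefix)
    nets = [ipaddress.ip_network(p) for p in more_specifics]
    inside = [n for n in nets if n.version == target.version and n.subnet_of(target)]
    # collapse_addresses merges overlapping and adjacent networks,
    # so no address is counted twice.
    covered = sum(n.num_addresses for n in ipaddress.collapse_addresses(inside))
    return covered / target.num_addresses


# Hypothetical: one /22 out of the /21 is reachable.
print(reachable_fraction("202.57.120.0/21", ["202.57.120.0/22"]))  # 0.5
```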

Distribution of INVALIDs by RIR

Most INVALIDs are in the LACNIC region. This is just by the number of prefix-origin pairs; it doesn’t say anything about the size of the affected IP space, since a single large prefix (/16) covers far more addresses than a small one (/24).

Figure 6 — RPKI INVALIDs and unreachable prefix-origin pairs by RIR.
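The difference between counting prefixes and counting address space is easy to see with a few lines of Python. This is only an arithmetic sketch; `address_count` is a made-up helper and the prefixes are private and documentation ranges, not data from the analysis.

```python
import ipaddress
from typing import List


def address_count(prefixes: List[str]) -> int:
    """Number of distinct addresses covered by a list of same-version prefixes."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    # Merge overlaps first so shared space is not counted twice.
    return sum(n.num_addresses for n in ipaddress.collapse_addresses(nets))


big = address_count(["10.0.0.0/16"])
small = address_count(["192.0.2.0/24"])
print(big, small, big // small)  # 65536 256 256
```

Each of these is a single prefix-origin pair in a count, yet the /16 covers 256 times as many addresses as the /24.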

Breakdown by reason

No. of affected prefix-origin pairs   Reason
1338                                  INVALID_ASN
1077                                  INVALID_LENGTH
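The two reasons can be told apart roughly as in the Python sketch below. The precedence used here (a covering ROA with the right AS but a too-small maxLength gives INVALID_LENGTH, otherwise INVALID_ASN) is my assumption about how a validator might label the cases, and `invalid_reason` is a hypothetical name. The ROA is one that appears in the ROA listing in this post; the announced /24 and AS64496 are made up.

```python
import ipaddress
from typing import List, Optional, Tuple

Roa = Tuple[int, str, int]  # (asn, prefix, maxLength)


def invalid_reason(prefix: str, origin: int, roas: List[Roa]) -> Optional[str]:
    """Return INVALID_ASN or INVALID_LENGTH, or None if the pair is not INVALID."""
    route = ipaddress.ip_network(prefix)
    covering = []
    for asn, roa_prefix, max_len in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        if roa_net.version == route.version and route.subnet_of(roa_net):
            covering.append((asn, max_len))
    if not covering:
        return None  # no covering ROA: NOT_FOUND
    if any(asn == origin and route.prefixlen <= max_len for asn, max_len in covering):
        return None  # VALID
    if any(asn == origin for asn, _ in covering):
        return "INVALID_LENGTH"  # right AS, but more specific than maxLength allows
    return "INVALID_ASN"


roas = [(23650, "61.160.0.0/16", 16)]
print(invalid_reason("61.160.1.0/24", 23650, roas))  # INVALID_LENGTH
print(invalid_reason("61.160.1.0/24", 64496, roas))  # INVALID_ASN
print(invalid_reason("61.160.0.0/16", 23650, roas))  # None
```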

Breakdown by RIR and reason

No. of affected prefix-origin pairs   RIR       Reason
693                                   LACNIC    INVALID_ASN
509                                   LACNIC    INVALID_LENGTH
404                                   RIPE      INVALID_ASN
334                                   APNIC     INVALID_LENGTH
203                                   RIPE      INVALID_LENGTH
193                                   APNIC     INVALID_ASN
47                                    ARIN      INVALID_ASN
31                                    ARIN      INVALID_LENGTH
1                                     AFRINIC   INVALID_ASN
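As a quick cross-check, the rows of this table can be summed per RIR and per reason in Python. The numbers are copied from the table; the variable names are just for this sketch.

```python
from collections import Counter

# (count, RIR, reason) rows copied from the table above.
rows = [
    (693, "LACNIC", "INVALID_ASN"),
    (509, "LACNIC", "INVALID_LENGTH"),
    (404, "RIPE", "INVALID_ASN"),
    (334, "APNIC", "INVALID_LENGTH"),
    (203, "RIPE", "INVALID_LENGTH"),
    (193, "APNIC", "INVALID_ASN"),
    (47, "ARIN", "INVALID_ASN"),
    (31, "ARIN", "INVALID_LENGTH"),
    (1, "AFRINIC", "INVALID_ASN"),
]

by_rir = Counter()
by_reason = Counter()
for count, rir, reason in rows:
    by_rir[rir] += count
    by_reason[reason] += count

print(by_rir.most_common())
# [('LACNIC', 1202), ('RIPE', 607), ('APNIC', 527), ('ARIN', 78), ('AFRINIC', 1)]
print(by_reason["INVALID_ASN"], by_reason["INVALID_LENGTH"])  # 1338 1077
print(sum(by_rir.values()))  # 2415
```

The per-reason sums match the previous table, and the total matches the 2,415 unreachable prefix-origin pairs.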

Breakdown by announcing AS (top 10)

No. of affected prefix-origin pairs   Announcing AS
180                                   AS14080
128                                   AS23650
111                                   AS52308
79                                    AS22080
64                                    AS35104
59                                    AS43554
52                                    AS52228
51                                    AS10299
46                                    AS264797
38                                    AS45774

In total, there are 454 unique ASes announcing unreachable INVALIDs.

Which ROAs cause most INVALIDs?

A single ROA can invalidate many BGP announcements at once. Therefore, I looked at how many BGP announcements each ROA invalidates, because fixing the ROAs that cause the most INVALIDs first is more efficient than fixing less relevant ones.

By fixing the top 10 ROAs on this list, we could solve more than 20% of INVALID prefix-origin pairs!

The listing below shows ROAs and how many INVALIDs they cause (this is a slight simplification since multiple ROAs can invalidate the same prefix).

No. of affected prefix-origin pairs   RIR      ASN as seen in ROA   Prefix as seen in ROA   maxLength as seen in ROA
91                                    LACNIC   AS60458              181.214.0.0/15          24
78                                    LACNIC   AS37692              191.96.0.0/16           24
62                                    LACNIC   AS61440              191.101.0.0/16          24
59                                    APNIC    AS23650              61.160.0.0/16           16
54                                    RIPE     AS43554              5.105.0.0/16            16
52                                    LACNIC   AS52228              152.231.128.0/17        17
41                                    APNIC    AS4809               115.168.0.0/14          14
39                                    APNIC    AS23650              61.155.0.0/16           16
37                                    LACNIC   AS22080              200.112.128.0/19        19
35                                    APNIC    AS23650              61.147.0.0/16           16
32                                    RIPE     AS43343              78.158.160.0/19         19
30                                    LACNIC   AS52308              190.108.32.0/19         19
30                                    LACNIC   AS52308              181.114.192.0/19        19
30                                    LACNIC   AS33182              179.61.128.0/17         24
30                                    APNIC    AS45774              49.213.32.0/19          19
29                                    LACNIC   AS22080              200.112.160.0/19        19
29                                    LACNIC   AS10986              190.114.96.0/19         22
27                                    LACNIC   AS52308              181.174.128.0/19        19
23                                    LACNIC   AS10620              190.147.0.0/16          24
21                                    LACNIC   AS7195               200.25.0.0/17           17
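The "more than 20%" figure can be reproduced from this listing with a short Python sketch. The counts and ASNs are copied from the rows above and the total of 2,415 comes from the filtering step. Grouping by the ASN in the ROA is only a stand-in for the IP holder, and since multiple ROAs can invalidate the same prefix, the shares are approximate.

```python
from collections import Counter

# (number of INVALIDs, ASN in ROA) for the 20 ROAs in the listing above.
rows = [
    (91, "AS60458"), (78, "AS37692"), (62, "AS61440"), (59, "AS23650"),
    (54, "AS43554"), (52, "AS52228"), (41, "AS4809"), (39, "AS23650"),
    (37, "AS22080"), (35, "AS23650"), (32, "AS43343"), (30, "AS52308"),
    (30, "AS52308"), (30, "AS33182"), (30, "AS45774"), (29, "AS22080"),
    (29, "AS10986"), (27, "AS52308"), (23, "AS10620"), (21, "AS7195"),
]
TOTAL_UNREACHABLE = 2415

# Share of all unreachable INVALIDs caused by the ten largest ROAs.
top10 = sum(sorted((count for count, _ in rows), reverse=True)[:10])
print(top10, f"{top10 / TOTAL_UNREACHABLE:.1%}")  # 548 22.7%

# ASNs that appear in several of these ROAs.
per_asn = Counter()
for count, asn in rows:
    per_asn[asn] += count
print(per_asn.most_common(3))
# [('AS23650', 133), ('AS60458', 91), ('AS52308', 87)]
```

Grouping by ASN hints at which holders could clear many INVALIDs by changing only a few ROAs.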

Notifying affected IP holders

The next step (which was my initial intention when looking at INVALIDs) is to send emails to affected IP holders and ask them to address the INVALIDs. However, I believe the impact would be greater if these emails came from a trusted entity such as the RIR relevant to the affected IP holder instead of a random entity (myself) who they have never had any contact with before.

Asking RIRs to reach out to their Members also scales better since every RIR would only have to take care of their own Members. Figure 7 provides a brief breakdown of how many Members each RIR would actually have to contact (this is a rough approximation since I use the ASN and not the actual IP holder to count Members):

Figure 7 — An approximation of how many Members each RIR would have to contact to fix INVALIDs.

Sending fewer than 500 emails could actually make a difference in reducing the number of INVALIDs.

What role do RIRs have in monitoring RPKI INVALID BGP announcements and informing Members about them?

This is a question that’s worth discussing in all regions serviced by RIRs. For me, I’d like to know whether RIRs:

  • Are currently monitoring the number of RPKI INVALID BGP announcements that result in actually unreachable IP space (in a ROV environment)?
  • Would be open to actively informing their affected Members about their affected IP prefixes?
  • Would encourage others to reach out to their affected Members (possibly in an automated way)?

In the event RIRs do undertake such outreach, it will be important to track over time how many prefixes would become unreachable when a router performing ROV discards RPKI INVALID BGP announcements, to see how effective the outreach program is. NIST’s RPKI monitor has graphs, but they take all INVALIDs into account, not just those that result in actually unreachable IP space in a ROV environment.

For this reason, I propose the following future steps:

  • Auto-generate results at a regular interval.
  • Analyse results based on IP space size (not just prefix-origin pair counts); read my follow-up post on this.
  • Reach out to NIST to suggest adding graphs about unreachable prefixes to their RPKI monitor.

Acknowledgements, disclaimers and data

I used RIPE NCC’s RPKI Validator 3.0–313 software with ARIN’s TAL enabled.

Don’t take the numbers too seriously; I made a few assumptions (RPKI validator 3 API documentation could be improved) and there might be some corner cases I didn’t take care of, but the results are sufficiently similar to other people’s results.

Adapted from the original post, which appeared on Medium on 14 September 2018.

Nusenu cares about privacy enhancing technologies and likes Internet Metrics. He came across RPKI and routing security when measuring how resilient the Tor network is against BGP hijacking attacks.

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.

One Comment

  1. Sofia

    Great article Nusenu! Thanks for your work in this area.

    I have a few comments:

    * Re: Distribution of INVALIDs amongst RIRs. I think this may have to do with the amount of IP space in the LAC region that is covered by ROAs. I think it would be interesting to consider relative figures (% of INVALIDs out of covered space) instead of absolute counts.

    * Re: Fixing ROAs to solve INVALIDs. This is assuming the ROAs are wrong. I think it’s good to keep in mind that some of these INVALIDs could be detecting actual BGP hijackings.
