Crashing the party — vulnerabilities in RPKI RP software

Donika Mirdita, Haya Schulmann, and Michael Waidner contributed to this work.

Routers use the Border Gateway Protocol (BGP) to exchange information about how to reach prefixes in their networks. BGP routers can announce any prefix, including those their networks do not legitimately own. When networks accept such bogus BGP announcements, they route traffic to the hijacking network instead of to the legitimate destination.

Prefix hijacks are common in BGP routing, whether through human error or deliberate attacks by adversarial actors. Hijacks can be used, for example, to distribute malicious code, to steal cryptocurrency, to poison caches of Domain Name System (DNS) servers, or to trick a Certificate Authority into issuing fraudulent certificates. Over the past 10 years, prefix hijacks have become more refined and targeted. Hijacks, whether caused by erroneous configurations or malicious attacks, may have devastating consequences for Internet users and services. To prevent hijacks, the IETF designed and standardized the Resource Public Key Infrastructure (RPKI).

RPKI prevents hijacks

RPKI authenticates BGP prefixes, allowing networks to filter bogus announcements and hence prevent hijacks. Public RPKI repositories, rooted at the five Regional Internet Registries (AFRINIC, APNIC, ARIN, LACNIC, RIPE), distribute information about prefix holders in cryptographically signed objects, called Route Origin Authorizations (ROAs), that bind network prefixes to the Autonomous System Number (ASN) of their owner. Networks use Relying Party (RP) software to fetch these ROAs from the public RPKI repositories and validate them. RPs thus act as middleware between the RPKI repositories and the routers, minimizing the load on routers, which only need to periodically retrieve the validated RPKI material from the RP caches and use it for making routing decisions.
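To make the routing decision concrete, here is a minimal Python sketch of how a router could classify an announcement against the validated data it gets from an RP, following the valid / invalid / not-found logic of RFC 6811. The `VRP` tuple and the `validate_origin` function are names chosen for this illustration only, and the prefixes and ASNs are documentation values, not real measurements.

```python
import ipaddress
from typing import Iterable, NamedTuple


class VRP(NamedTuple):
    """One validated ROA payload entry (illustrative structure)."""
    prefix: str      # e.g. "192.0.2.0/24"
    max_length: int  # longest prefix length the ROA authorizes
    asn: int         # AS authorized to originate the prefix


def validate_origin(route_prefix: str, origin_asn: int, vrps: Iterable[VRP]) -> str:
    """Classify a BGP announcement as 'valid', 'invalid' or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for vrp in vrps:
        vrp_net = ipaddress.ip_network(vrp.prefix)
        # A VRP covers the route if the route lies inside the VRP prefix.
        if vrp_net.version != route.version or not route.subnet_of(vrp_net):
            continue
        covered = True
        # The route matches if the origin AS agrees and it is not too specific.
        if vrp.asn == origin_asn and route.prefixlen <= vrp.max_length:
            return "valid"
    return "invalid" if covered else "not-found"


vrps = [VRP("192.0.2.0/24", 24, 64500)]
print(validate_origin("192.0.2.0/24", 64500, vrps))     # valid
print(validate_origin("192.0.2.0/24", 64501, vrps))     # invalid (wrong origin)
print(validate_origin("198.51.100.0/24", 64500, vrps))  # not-found
```

A router enforcing ROV would typically drop the "invalid" announcements and keep the other two classes, which is why missing or lost VRPs silently turn protected routes into "not-found" ones.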

In the last five years, the deployment of RPKI has gained traction. RPKI now covers almost 50% of all Internet prefixes (compared to 6.5% in 2017). Many large providers and operators are using ROAs, such as Amazon Web Services, which issued ROAs for its prefixes following the AWS Route 53 hijack in 2018. Route Origin Validation (ROV) is enforced by around 37.8% of Internet Autonomous Systems (ASes) (compared to 600 ASes in 2017), including large Internet Service Providers (ISPs) and operators, such as Level 3, Cogent, Deutsche Telekom, and Zayo.

The impact of RPKI on Internet routing is already visible. When YouTube was hijacked at the economy level in 2008, it was not reachable on the global Internet. In contrast, when Twitter prefixes were hijacked in March 2022, traffic to Twitter was not disrupted globally. Thanks to Twitter’s rather recent adoption of RPKI, the effects of the hijack were experienced only in limited parts of the Internet. The hijacked prefixes were covered by valid ROAs, so any AS that enforced RPKI validation discarded the bogus announcement and its clients remained unaffected.

On the other hand, with growing adoption, any inconsistency, vulnerability, or misconfiguration in RPKI will have a greater impact on Internet stability, since more and more networks may be affected. The potential damage can have significant consequences for the economy and society, since almost everything we do depends on the correct functioning of the Internet.

In 2023, we performed an extensive security analysis of all popular RP implementations on the market. We found severe vulnerabilities and demonstrated how network adversaries can exploit them for a range of attacks, including Denial-of-Service (DoS) kill switches, cache poisoning and path traversal attacks. We also identified inconsistencies in support of RFC standard recommendations across the implementations. All these issues directly affect the integrity and correctness of the data routers rely on.

Why troubleshooting RPs is hard

Despite their central role in the RPKI infrastructure, there had been no comprehensive evaluation of RP software before our study. Given the inner workings of RPs, this is not surprising. RP implementations are complex since they support a wide range of functionalities necessary for interacting with the RPKI repositories and validating the cryptographic RPKI material. This complexity makes it almost impossible to use traditional vulnerability analysis tools, like fuzzers.

CURE-ing the hurdles towards fuzzing RP

To address the existing gap in vulnerability analysis for RPs, our team at the National Research Center for Applied Cybersecurity ATHENE developed a novel fuzzing tool, which we named Comprehensively Usable Relying Party Evaluator (CURE). CURE combines the functionalities of a fuzzer (creating mutated objects, executing the target software, analysing the output) with the RPKI functionality of creating, signing and inter-linking RPKI objects. With CURE, we implemented our own optimized version of a fully functional RPKI repository, which allows us to construct valid RPKI repository structures around the RPKI objects we want to test and interact with the RPs.

Severe vulnerabilities and inconsistencies across the RP implementations

In our research, we found 18 severe vulnerabilities. Five CVEs have been assigned; one of them was rated critical with a score of 9.3. The vulnerabilities affect all popular RP implementations, and range from crashes and violations of standard behaviour to severe bugs that allow a network adversary to completely take over an RPKI hierarchy by injecting its own trust anchor.

Path traversal / cache poisoning

The most severe vulnerability that we discovered with CURE was a path traversal in Routinator, a software package used by 69.9% of the RPKI validating networks. RPs universally use the names of objects as storage locations on disk. In the vulnerable Routinator instances, an attacker can exploit the lack of input sanitization and choose names containing the path traversal sequence '../' to place files with arbitrary content anywhere on the disk of the server running the RP.
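The general defence against this class of bug is to resolve the final storage path and confirm it still lies inside the cache directory before writing. The Python sketch below illustrates that idea only; it is not Routinator's actual fix, and `safe_cache_path` and the example file names are hypothetical.

```python
from pathlib import Path


def safe_cache_path(cache_root: Path, object_name: str) -> Path:
    """Map a repository-supplied object name to a path inside cache_root.

    Raises ValueError if the name would escape the cache directory,
    for example through '../' components or an absolute path.
    """
    root = cache_root.resolve()
    # resolve() collapses '..' components, so the containment check
    # below sees the location the file would really be written to.
    candidate = (root / object_name).resolve()
    try:
        relative = candidate.relative_to(root)
    except ValueError:
        raise ValueError(f"object name escapes cache directory: {object_name!r}")
    if not relative.parts:
        raise ValueError("object name must not be empty")
    return candidate
```

With a cache root of `/var/cache/rp`, a name such as `repo/example.roa` would map to a file below the root, while `../../etc/tals/evil.tal` or `/etc/passwd` would be rejected instead of being written outside the cache.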

In older Routinator versions, this ability could even be exploited to place a new attacker-controlled trust anchor in the RP. Placing a new trust anchor in Routinator would allow an attacker to silently circumvent all RPKI protection and get arbitrary RPKI objects accepted, exposing the routers to poisoning of their routing table and to customized BGP hijacks. In newer Routinator versions, the path traversal vulnerability persisted, but the new handling of trust anchors neutralized the ability to insert a malicious one. After our disclosure, the path traversal problem was quickly fixed and a CVE with a critical score was assigned (CVE-2023-39916).

Relying Party crashes

The majority of the vulnerabilities that we found were RP crashes due to insufficient error handling. DoS vulnerabilities are particularly severe, as routers rely on the availability of the RPs for RPKI validation: if an RP crashes, the router will eventually flush its RPKI cache and accept all BGP announcements, including hijacking attempts. While most crashes we found were parsing errors, for example, crashing if the data of a field was longer than indicated by its length value, we also found internal processing errors that led to failure. All crashes we found were quickly fixed by the developers after our disclosure, except those in OctoRPKI.
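The length-mismatch case can be illustrated with a toy parser. The Python sketch below assumes a made-up format (one tag byte, a two-byte big-endian length, then the value), which is far simpler than the DER encoding real RPKI objects use. The point is only that a mismatch between declared and actual length becomes a handled error, so one malformed object is skipped rather than taking down the whole RP.

```python
from typing import Optional, Tuple


class ParseError(ValueError):
    """Raised for malformed input so callers can skip the object."""


def parse_field(buf: bytes) -> Tuple[int, bytes]:
    """Parse one field of the toy format: tag (1 byte), length (2 bytes), value."""
    if len(buf) < 3:
        raise ParseError("truncated header")
    tag = buf[0]
    declared = int.from_bytes(buf[1:3], "big")
    actual = len(buf) - 3
    # Reject both short and over-long data instead of trusting the length field.
    if actual != declared:
        raise ParseError(f"length mismatch: declared {declared}, got {actual}")
    return tag, buf[3:]


def try_parse(buf: bytes) -> Optional[Tuple[int, bytes]]:
    """Return the parsed field, or None if the object is malformed."""
    try:
        return parse_field(buf)
    except ParseError:
        return None


print(try_parse(b"\x01\x00\x02hi"))    # (1, b'hi')
print(try_parse(b"\x01\x00\x02hiXX"))  # None: data longer than declared
```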

RFC inconsistencies

During our analysis, we discovered several instances of implementations that do not conform to the RPKI standard. For example, OctoRPKI does not check the session_id parameter in the notification and snapshot files. Therefore it can parse files with inconsistent IDs. This error exposes the RP to replay attacks. Another example of a lack of adherence to RFC requirements in OctoRPKI is that it serves files from repositories with expired CRLs or CRLs that are missing mandatory extensions.
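As a rough illustration of the missing check, the Python sketch below compares the `session_id` attribute of an RRDP notification file with that of the snapshot it points to, as RFC 8182 requires. The function name and the sample documents are assumptions made for this example, and a real RP would also verify the serial number and the snapshot hash.

```python
import xml.etree.ElementTree as ET


def session_ids_match(notification_xml: str, snapshot_xml: str) -> bool:
    """Return True only if both files carry the same, non-empty session_id."""
    notification = ET.fromstring(notification_xml)
    snapshot = ET.fromstring(snapshot_xml)
    # session_id is a plain attribute on the root element of both files.
    expected = notification.get("session_id")
    actual = snapshot.get("session_id")
    return bool(expected) and expected == actual


NS = 'xmlns="http://www.ripe.net/rpki/rrdp" version="1"'
notification = f'<notification {NS} session_id="9df4b597-af9e-4dca-bdda-719cce2c4e28" serial="3"/>'
replayed = f'<snapshot {NS} session_id="11111111-2222-3333-4444-555555555555" serial="3"/>'

# A snapshot from a different session should be rejected, not parsed.
print(session_ids_match(notification, replayed))  # False
```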

On the other side of the spectrum, we have Fort, which enforces certificate validation much more strictly than the standard requires and, as a result, discards even valid certificates that use the issuer name type OrganisationName. Such RFC non-compliance issues can lead to a silent downgrade of RPKI protection. Every software component and communication protocol is working correctly, but the invisible processing errors silently remove ROAs from the router view.

VRP discrepancies on the Internet

We also found that the Validated ROA Payloads (VRPs) output of RPs differs across the implementations, even when the exact same repository content is served. We made sure to avoid connection discrepancies by running all four major RPs at the same time on the same network. Our data showed considerable differences in the output offered by each RP.

In June 2023, we saw the following VRP output counts: rpki-client (441,777), Routinator (441,770), Fort (435,002) and OctoRPKI (434,074). The differing VRP counts indicate processing inconsistencies across all the RPs. We ran CURE on the RPKI objects to diagnose the issues.
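A simple way to see which entries each RP drops is to compare every output against the union of all outputs. The following Python sketch assumes each VRP is represented as an (ASN, prefix, max length) tuple; the function name and the tiny sample data are illustrative and unrelated to the measured numbers above.

```python
from typing import Dict, Set, Tuple

Vrp = Tuple[int, str, int]  # (origin ASN, prefix, max length)


def missing_vrps(outputs: Dict[str, Set[Vrp]]) -> Dict[str, Set[Vrp]]:
    """For each RP, list the VRPs that some other RP produced but it did not."""
    union: Set[Vrp] = set().union(*outputs.values())
    return {name: union - vrps for name, vrps in outputs.items()}


outputs = {
    "rp-a": {(64500, "192.0.2.0/24", 24), (64501, "198.51.100.0/25", 25)},
    "rp-b": {(64500, "192.0.2.0/24", 24)},
}
print(missing_vrps(outputs)["rp-b"])  # {(64501, '198.51.100.0/25', 25)}
```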

We can draw several observations from our analysis. OctoRPKI discards all VRPs whose prefix length exceeds the maximum prefix length allowed in BGP, /24 for IPv4 and /48 for IPv6. This resulted in at least 1,744 prefixes being discarded from its VRP cache. Fort discards 6,405 VRPs issued by Amazon due to a peculiarly strict certificate validation logic. In Fort, when the certificates use the optional OrganisationName attribute field instead of CommonName or SerialNumber, the repository is discarded. Amazon updated their objects following our notification.
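The prefix-length filter described above can be sketched in a few lines of Python. This only illustrates the described behaviour and is not taken from OctoRPKI's source; the limits come from the text, and the function name is hypothetical.

```python
import ipaddress

# Maximum prefix lengths named in the text, keyed by IP version.
MAX_BGP_PREFIX_LEN = {4: 24, 6: 48}


def exceeds_bgp_limit(prefix: str) -> bool:
    """Return True if the prefix is more specific than /24 (IPv4) or /48 (IPv6)."""
    network = ipaddress.ip_network(prefix)
    return network.prefixlen > MAX_BGP_PREFIX_LEN[network.version]


print(exceeds_bgp_limit("192.0.2.0/25"))   # True: such a VRP would be dropped
print(exceeds_bgp_limit("2001:db8::/48"))  # False: such a VRP would be kept
```

An RP applying such a filter produces a smaller VRP set than one that passes all validated entries through, which accounts for part of the gap between implementations.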

Our analysis highlights that, in reality, processing inconsistencies across RPs are common, and not only as a result of connectivity failures but also due to different processing logic in each RP implementation.

Conclusion

Our research demonstrates that more research and work are needed to make RPKI implementations production-ready. We found severe vulnerabilities in major RPKI implementations, even in basic modules such as data parsing.

RPs that implemented their own object parsing, such as Routinator and OctoRPKI, exhibited more processing errors than implementations like Fort or rpki-client, which use open source and long-established libraries like OpenSSL. While a diverse landscape of different implementations in many languages is certainly desirable, custom parsing libraries must be painstakingly tested before they are incorporated in critical network security applications.

The vulnerabilities that lead to a silent downgrade of protection, or even to the complete shutdown of the RP software, and the inconsistencies across the RP implementations are detrimental to RPKI correctness and to inter-domain routing stability. The vulnerabilities we found emphasize the need for a testing tool that enables RP developers to analyse RP software for vulnerabilities systematically and automatically, and to fix them. We therefore developed CURE and made it available on GitHub for public use, to facilitate vulnerability testing of RPKI software.

It is imperative for RPKI software and all its modules to become more resilient to attacks that can compromise RPKI data completeness and correctness from the viewpoint of a router. There is also a need for consensus across the implementations on the set of the downloaded ROAs from the RPKI repositories.

Learn more in our research paper “The CURE to Vulnerabilities in RPKI Validation”, by Donika Mirdita, Haya Schulmann, Niklas Vogel, and Michael Waidner, which will appear at NDSS Symposium 2024.

Niklas Vogel is a PhD student at Goethe Universität Frankfurt / Fraunhofer SIT, interested in Internet routing and security focused on RPKI, BGP, and DNS research and measurements.

