An introspective analysis: RPKI deployment under scrutiny

23 Sep 2024


As the Internet continues to grow in complexity, the security of inter-domain routing has become a matter of national security. The Border Gateway Protocol (BGP), the backbone of the Internet’s routing infrastructure, was designed with simplicity and efficiency in mind, at the expense of security. This design flaw has turned BGP into the Achilles’ heel of the Internet; it is exposed and vulnerable to various attacks with devastating social, economic and security implications.

To address these threats, the Resource Public Key Infrastructure (RPKI) has emerged as the most promising solution. RPKI generates Route Origin Authorizations (ROAs) and serves them to routers, which use them to enforce Route Origin Validation (ROV), a process that filters out malicious BGP announcements and route leaks whose origin conflicts with the published ROAs.
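The ROV decision itself is compact enough to sketch. The following is a minimal illustration in the spirit of RFC 6811, not the code of any particular validator or router; the `VRP` type, the `validate_origin` function, and the documentation prefixes and AS numbers are assumptions made for this example, and details such as AS_SET handling are left out.

```python
import ipaddress
from typing import List, NamedTuple, Union

IPNetwork = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


class VRP(NamedTuple):
    """Validated ROA Payload: one (prefix, max length, origin AS) entry."""
    prefix: IPNetwork
    max_length: int
    asn: int


def validate_origin(prefix: str, origin_asn: int, vrps: List[VRP]) -> str:
    """Classify a BGP announcement as 'valid', 'invalid' or 'not-found'."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for vrp in vrps:
        # A VRP covers the route if the route lies inside the VRP prefix.
        if route.version == vrp.prefix.version and route.subnet_of(vrp.prefix):
            covered = True
            # It matches only if the length and the origin AS are authorized.
            if route.prefixlen <= vrp.max_length and origin_asn == vrp.asn:
                return "valid"
    # Covered but never matched means the origin is not authorized.
    return "invalid" if covered else "not-found"


# Illustrative data only (documentation prefix and AS numbers).
vrps = [VRP(ipaddress.ip_network("192.0.2.0/24"), 24, 64500)]
print(validate_origin("192.0.2.0/24", 64500, vrps))     # valid
print(validate_origin("192.0.2.0/24", 64501, vrps))     # invalid
print(validate_origin("198.51.100.0/24", 64500, vrps))  # not-found
```

Only the "invalid" state gives a router grounds to reject a route; "not-found" simply means that no ROA data covers the prefix.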

All that glitters is not gold

Despite its growing adoption and excellent track record in mitigating hijacking attacks, RPKI is not without issues. Our longitudinal study of the RPKI ecosystem highlights several problems that undermine the efficiency and security of RPKI on the path to full deployment.

Subpar RPKI repository management

RPKI relies on highly available distributed repositories to efficiently store and serve the ROA payload. However, many repositories offer unreliable service that undermines both their ability to distribute content and the performance of RPKI validators worldwide. They exhibit various errors ranging from repository server inaccessibility due to misconfigurations, bad DNS hygiene, and timeouts, to insufficient object management in the repositories. Because data needs to be collected from all the repositories for effective RPKI-based validation, these errors are the leading cause of incomplete ROA propagation and delays during the RPKI data validation process.

My fellow researchers from TU Darmstadt, Goethe University Frankfurt, and the National Research Center for Applied Cybersecurity (ATHENE) and I measured a 45% increase in daily average errors over a one-year period. Delays caused by repositories are so common that periods with little to no delay can initially appear anomalous. Persistent daily networking errors make the validation process at least 5x slower than baseline processing without such errors. Worse, when repositories malfunction, processing time can increase by at least 2.5x relative to the average error-prone processing time.

The pervasiveness of software vulnerabilities

Our measurements show that 56% of globally deployed RPKI validators are affected by at least one of many vulnerabilities. These include Denial-of-Service (DoS) attacks, silent ROV downgrades, path traversals, and RPKI cache poisonings. Notably, most deployments of the popular RPKI validator, Routinator, which holds a 70.5% market share, are affected by multiple well-documented vulnerabilities.

According to our data, only 37.2% of Routinator deployments are safe from known issues. The remaining 62.8% are vulnerable to various remotely triggered DoS attacks, 57.9% of which are additionally vulnerable to path traversal attacks, and 32.7% are further affected by RPKI cache poisoning. All OctoRPKI instances on the Internet and at least 8.8% of Fort instances are vulnerable to DoS. In addition, 26.6% of OctoRPKI instances are vulnerable to remote code execution (RCE) and at least 22.3% can be affected by out-of-memory attacks.

Over the past two years, our research group has discovered and disclosed multiple software bugs that resulted in the issuance of 11 CVEs, many of which were rated critical. The good news is that for all known vulnerabilities of actively maintained validator software, there are patches available.

The human factor

The bad news is that network operators can take years to update their deployments, as shown by active relying party versions that are years old and rife with vulnerabilities or, worse, were discontinued years ago. Our measurements show that more than half of the relying party validators deployed worldwide have still not been patched. According to our research, the majority of system security issues could be eliminated if operators maintaining the RPKI infrastructure were more attentive to their service quality and to the security advisories for the software they deploy.

The impact of RPKI failures on Internet security

The issues identified in this study have real-world implications for Internet security. We run passive measurements to analyse the RPKI network as-is without influencing content and behaviour. We observe the RPKI ecosystem natively, the same way any RPKI user experiences it. The identified software bugs, errors, delays, and ROA blackouts apply universally to all RPKI validators in the world.

The disabling of RPKI validators via DoS, or the loss of access to ROA payloads due to misconfigured object re-generation processes, directly undermines the ability of routers to enforce ROV. Because RPKI policy is generally fail-open, meaning that routes without matching ROA data are still accepted, ROV downgrades unexpectedly reintroduce the potential for BGP hijacking of prefixes whose resource owners have legitimately issued ROAs. Given the range of issues discovered, ROV downgrades are fairly easy to trigger, and sometimes they don’t even need an active malicious party, only human error.
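To make the downgrade concrete, here is a small sketch of a fail-open policy. It is an illustration under simplifying assumptions rather than a model of any real router: the exact-match lookup table, the function names `rov_state` and `fail_open_accepts`, and the documentation prefix and AS numbers are all hypothetical.

```python
def rov_state(prefix, origin_asn, vrps):
    """Return 'valid', 'invalid' or 'not-found' for one announcement.

    Simplification: vrps maps an exact prefix string to the set of AS
    numbers authorized to originate it (no covering prefixes, no max length).
    """
    authorized = vrps.get(prefix)
    if authorized is None:
        return "not-found"
    return "valid" if origin_asn in authorized else "invalid"


def fail_open_accepts(prefix, origin_asn, vrps):
    """Fail-open policy: reject only routes that are provably invalid."""
    return rov_state(prefix, origin_asn, vrps) != "invalid"


# The owner's ROA as seen by a healthy validator (illustrative values).
vrps = {"192.0.2.0/24": {64500}}
# The same prefix announced by an unauthorized origin.
hijack = ("192.0.2.0/24", 64501)

print(fail_open_accepts(*hijack, vrps))  # False: the hijack is filtered
print(fail_open_accepts(*hijack, {}))    # True: the ROA data is gone, the hijack is accepted
```

The second call is the downgrade: once the validator has lost its data, the hijacked route is no longer "invalid" but "not-found", and a fail-open policy lets it through even though the owner did publish a ROA.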

The path forward: Strengthening RPKI for global deployment

RPKI is on its way to becoming the de facto BGP security standard. However, there is still ample work to do to prepare it for full global deployment.

Implementing patch management processes for RPKI

The growing use of RPKI requires the conscientious deployment of RPKI software. If a validator is vulnerable, the integrity of the data fed to a router is not guaranteed; network operators must therefore prioritize the timely patching of vulnerabilities. This includes upgrading to the latest software versions and regularly monitoring for new security advisories. Automated update mechanisms and better awareness among operators can significantly reduce the risk posed by unpatched systems.
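Part of such a process can be automated with a simple version audit. The sketch below is one possible shape for it, not a tool described in the study: the software names, host names, and version numbers are hypothetical placeholders, and real thresholds would have to come from each project's security advisories.

```python
import re

# Hypothetical minimum patched versions. These numbers are placeholders for
# illustration only; take real values from each project's security advisories.
MINIMUM_SAFE = {"validator-a": "0.12.1", "validator-b": "1.5.4"}


def parse_version(text):
    """Turn '0.12.1' or 'v1.5.4' into a tuple of ints for comparison."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def _pad(version, width):
    """Right-pad a version tuple with zeros so '0.12' equals '0.12.0'."""
    return version + (0,) * (width - len(version))


def needs_patch(deployed, minimum_safe):
    """True if the deployed version is older than the minimum safe version."""
    a, b = parse_version(deployed), parse_version(minimum_safe)
    width = max(len(a), len(b))
    return _pad(a, width) < _pad(b, width)


def outdated(inventory):
    """Return the hosts whose validator is older than its minimum safe version."""
    return sorted(
        host
        for host, (software, version) in inventory.items()
        if needs_patch(version, MINIMUM_SAFE[software])
    )


# Hypothetical inventory of relying party deployments.
inventory = {
    "rp1.example.net": ("validator-a", "0.11.3"),
    "rp2.example.net": ("validator-a", "0.12.1"),
    "rp3.example.net": ("validator-b", "v1.4.10"),
}
print(outdated(inventory))  # ['rp1.example.net', 'rp3.example.net']
```

Versions are compared as integer tuples rather than strings, so that 0.9.0 sorts before 0.12.1; pre-release suffixes are ignored in this sketch.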

Improving repository content management practices

RPKI repository operators must fix and improve their local RPKI data management to ensure both high service availability and ROA payload stability. This includes regular audits of certificate validity, improving DNS resolution processes, and optimizing web server configurations to minimize the likelihood of connection timeouts.
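A basic reachability probe already separates the DNS failures from the timeouts mentioned above. The following is a rough sketch using only the Python standard library; the function name `probe_repository`, the status labels, and the default port and timeout are assumptions for this example, and it checks neither the repository content nor certificate validity.

```python
import socket


def probe_repository(host, port=443, timeout=5.0,
                     resolve=socket.getaddrinfo,
                     connect=socket.create_connection):
    """Return 'ok', 'dns-error', 'timeout' or 'unreachable' for one host.

    The resolve and connect parameters exist only so the probe can be
    exercised with stand-ins instead of real network access.
    """
    try:
        # Step 1: does the repository host name resolve at all?
        resolve(host, port)
    except socket.gaierror:
        return "dns-error"
    try:
        # Step 2: can a TCP connection be opened within the time budget?
        conn = connect((host, port), timeout=timeout)
    except socket.timeout:
        return "timeout"
    except OSError:
        return "unreachable"
    conn.close()
    return "ok"
```

Calling `probe_repository("rrdp.example.net")` on a schedule, where the host name is a placeholder, and alerting on anything other than "ok" would surface misconfigurations before validators worldwide run into them.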

Enhancing monitoring and response mechanisms

To mitigate the impact of stalling attacks and other network issues, the RPKI validation process should incorporate intelligent algorithms that can detect and bypass misbehaving repositories. Additionally, the implementation of robust monitoring tools can help operators quickly identify and respond to anomalies in the validation process.
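One simple way to detect and bypass a misbehaving repository is a per-repository strike counter with a cached fallback. The sketch below illustrates that idea only and is not the algorithm of any existing validator; the `RepositoryGuard` class, its thresholds, and the `fetcher` callback are hypothetical.

```python
import time


class RepositoryGuard:
    """Skip repositories that keep failing or stalling; serve cached data instead."""

    def __init__(self, max_failures=3, time_budget=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # strikes before a repository is bypassed
        self.time_budget = time_budget    # seconds one fetch is expected to take
        self.clock = clock
        self.failures = {}                # repository URI -> consecutive strikes
        self.cache = {}                   # repository URI -> last good payload

    def fetch(self, uri, fetcher):
        """Fetch via fetcher(uri); return the freshest payload known, or None."""
        # Bypass a repository that has accumulated too many strikes.
        if self.failures.get(uri, 0) >= self.max_failures:
            return self.cache.get(uri)
        start = self.clock()
        try:
            payload = fetcher(uri)
        except OSError:
            payload = None
        elapsed = self.clock() - start
        if payload is not None:
            self.cache[uri] = payload
        # A failed or overly slow fetch counts as a strike; success resets it.
        # Note: this measures time after the fact; the fetcher itself must
        # enforce a timeout, or a stalled transfer would still block here.
        if payload is None or elapsed > self.time_budget:
            self.failures[uri] = self.failures.get(uri, 0) + 1
        else:
            self.failures[uri] = 0
        return self.cache.get(uri)
```

A production version would also need a cool-down after which bypassed repositories are retried, and it would have to respect the expiry of cached objects; both are omitted here to keep the sketch short.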

Education and outreach

The success of RPKI depends not only on safe and efficient technological solutions but also on the willingness of network operators to adopt these solutions in earnest and implement them correctly. Continued education and outreach efforts are essential to raise awareness of RPKI’s benefits and the best practices for its deployment. This effort requires clear documentation, comprehensive deployment tutorials for different network topologies and scenarios, and support from the broader Internet community.

Exploring alternative infrastructures

RPKI is not secure by design. Various works have studied the susceptibility of the RPKI framework to censorship, centralization and stalling attacks that exploit the fundamental naivety of RPKI algorithms as described in the standard. To address and mitigate these issues, RPKI requires novel, alternative infrastructures.

There are already some RPKI redesign proposals. However, a major issue with most existing solutions is that they fall short on compatibility with existing infrastructure. An important aspect of developing alternatives is ensuring that new solutions are backward compatible with existing deployments. In our latest work, which will appear at ACM CCS 2024, we propose a fully backward-compatible solution to endemic issues in RPKI. We designed a Byzantine-secure relying party framework to eliminate the risk of vulnerabilities and other failures.

Conclusion

RPKI is a critical tool for securing global Internet routing, but its effectiveness hinges on the vigilance and cooperation of the entire Internet community. While RPKI software and operational quality have improved significantly in recent years, our results show that they still lack the resilience and stability desirable for deployment in production environments.

By addressing the vulnerabilities and operational challenges identified in recent studies, RPKI software developers and network operators can help ensure that RPKI becomes a safe and reliable security component for production networks. Also, operators should not lose sight of other aspects of routing and general information security, such as protecting BGP sessions, securing devices and organization perimeters, streamlining configurations, and patch management.

As RPKI continues to evolve, it will play an increasingly vital role in protecting the integrity of global BGP routing, making it a key focus area for any network operator committed to the principles of MANRS.

This article is based on the research paper “SoK: An Introspective Analysis of RPKI Security” by Donika Mirdita, Haya Schulmann, and Michael Waidner, which will appear at USENIX Security 2025.
