How RRDP was implemented for OpenBSD rpki-client

By Job Snijders on 18 May 2021

Nowadays almost everyone is aware the Resource Public Key Infrastructure (RPKI) exists and can help protect the Internet’s routing system, but fewer people know how computer systems fetch RPKI data.

This blog post discusses the technical challenges encountered when implementing RPKI Repository Delta Protocol (RRDP) in the free, functional, and secure RPKI validator software rpki-client, and how those challenges were addressed. OpenBSD invites RPKI operators to celebrate the release of rpki-client 7.0 and help with testing.

But first, let’s start with a few paragraphs on how RPKI data is transmitted between computer systems, and what role RRDP plays in this.

Transmission of RPKI objects over the Internet

The RPKI threat model allows signed objects to be transported via any means — even unsecured channels! RPKI technology is transport protocol agnostic, which permits the Internet industry to transition from one RPKI data synchronization protocol to another synchronization protocol, and then the next one after that.

In the current IETF RPKI specifications, the valid on-the-wire transportation mechanisms include UDP (HTTP/3), TCP 443 (HTTP/TLS), and even unencrypted TCP 873 (rsync). All can be used with IPv4 or IPv6.

Note that HTTP over TCP 80 is not permitted. However, some RPKI engineers have argued that any strict requirement for common WebPKI Trust Anchors between RPKI Repository and Relying Party might have unintended and detrimental consequences. Reducing the requirements for synchronizing RPKI data to just a mutual RPKI Trust Anchor and some form of Internet access would help facilitate the exchange of RPKI data at planetary scale across all geo-political boundaries to foster a truly global Internet.

Whatever Layer-3 or Layer-4 protocol ends up being used, there are fundamentally two ways to determine which files containing RPKI objects should be transported from Certification Authorities (CAs, the organizations in charge of delegating IP space to third parties) towards Internet Service Providers (“Relying Parties” in RPKI lingo): rsync or RRDP. Both are data transfer protocols capable of very efficient one-direction synchronization. RRDP and rsync each require different purpose-built client/server software, and each protocol has a different impact on where most of the operational burden lies.

An advantage of the rsync protocol is that it allows for easy manual debugging with standard UNIX utilities. An advantage of RRDP, on the other hand, is that there are more HTTPS implementations to choose from than rsync protocol implementations. The RPKI community as a whole benefits from multiple distribution mechanisms existing concurrently: if one protocol doesn’t work, perhaps fetching via another protocol yields the desired results. Also, in the author’s opinion, it would be beneficial for the long-term evolution of the RPKI to continue to encourage RPKI implementers not to tie themselves to one specific synchronization protocol.

What exactly is RRDP?

RRDP’s design borrowed from the concept of distributing Internet Routing Registry (IRR) data through the Near Real Time Mirroring (NRTM) protocol, and in turn, the operational experience with RRDP is expected to influence the next iteration of IRR NRTM v4.

RRDP is a mechanism defined in RFC 8182. A short overview of how it works: an RPKI Repository operator writes a bundle of RPKI objects into a single ‘Delta’ file (Base64-encoded DER wrapped in XML). These Delta files are referenced from a central journal called the “Update Notification” file (also XML), and both files are published as static files hosted on an HTTPS server.
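To make the Delta format a little more concrete, here is a minimal Python sketch of reading one. The element and attribute names follow RFC 8182; the sample document, the example.net URIs, and the function name parse_delta are made up for illustration, and the Base64 payload is placeholder bytes rather than a real DER object.

```python
import base64
import xml.etree.ElementTree as ET

RRDP_NS = "{http://www.ripe.net/rpki/rrdp}"  # XML namespace used by RFC 8182

# Hypothetical sample: payload and hash are placeholders, not real RPKI data.
SAMPLE_DELTA = """\
<delta xmlns="http://www.ripe.net/rpki/rrdp" version="1"
       session_id="9df4b597-af9e-4dca-bdda-719cce2c4e28" serial="3">
  <publish uri="rsync://rpki.example.net/repo/a.cer">aGVsbG8=</publish>
  <withdraw uri="rsync://rpki.example.net/repo/b.roa"
            hash="2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae"/>
</delta>
"""


def parse_delta(xml_text):
    """Return (serial, publishes, withdraws) for one RRDP Delta document."""
    root = ET.fromstring(xml_text)
    if root.tag != RRDP_NS + "delta":
        raise ValueError("not an RRDP delta document")
    publishes = {}  # object URI -> raw (still unvalidated) object bytes
    withdraws = []  # object URIs the server asks the client to remove
    for child in root:
        if child.tag == RRDP_NS + "publish":
            # The element text is the Base64 encoding of the DER object.
            publishes[child.get("uri")] = base64.b64decode(child.text.strip())
        elif child.tag == RRDP_NS + "withdraw":
            withdraws.append(child.get("uri"))
    return int(root.get("serial")), publishes, withdraws
```

Note that nothing returned here is trustworthy yet: the objects still have to pass RPKI validation before they mean anything.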

RRDP clients periodically fetch the Update Notification file (which is only a very small file), and if the notification file was changed (compared to the last fetch), the client proceeds to download only the missing Delta files. An RRDP client bootstrapping from an empty local cache can use an RRDP Snapshot file to get up to speed. A Snapshot is expected to contain the complete repository, and from then onwards the client can fetch just Delta files.
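The decision described above can be sketched in a few lines of Python. This is an illustration of the logic under stated assumptions, not rpki-client’s actual code (which is written in C): the function name plan_fetch, the sample document, and the example.net URIs are hypothetical, and the hashes are shortened placeholders.

```python
import xml.etree.ElementTree as ET

RRDP_NS = "{http://www.ripe.net/rpki/rrdp}"  # XML namespace used by RFC 8182

# Hypothetical Update Notification; real hashes are 64 hex characters.
SAMPLE_NOTIFICATION = """\
<notification xmlns="http://www.ripe.net/rpki/rrdp" version="1"
    session_id="9df4b597-af9e-4dca-bdda-719cce2c4e28" serial="5">
  <snapshot uri="https://rrdp.example.net/5/snapshot.xml" hash="aa"/>
  <delta serial="5" uri="https://rrdp.example.net/5/delta.xml" hash="bb"/>
  <delta serial="4" uri="https://rrdp.example.net/4/delta.xml" hash="cc"/>
</notification>
"""


def plan_fetch(notification_xml, local_session, local_serial):
    """Return ("nothing", []), ("deltas", [uri, ...]) or ("snapshot", [uri])."""
    root = ET.fromstring(notification_xml)
    session = root.get("session_id")
    serial = int(root.get("serial"))
    snapshot_uri = root.find(RRDP_NS + "snapshot").get("uri")

    # Empty cache or a different session: bootstrap from the Snapshot.
    if local_session is None or session != local_session:
        return "snapshot", [snapshot_uri]
    if serial == local_serial:
        return "nothing", []

    deltas = {int(d.get("serial")): d.get("uri")
              for d in root.findall(RRDP_NS + "delta")}
    needed = range(local_serial + 1, serial + 1)
    # Every missing serial must be listed; otherwise fall back to the Snapshot.
    if serial < local_serial or any(s not in deltas for s in needed):
        return "snapshot", [snapshot_uri]
    return "deltas", [deltas[s] for s in needed]
```

A client holding serial 3 of this session would fetch the Delta files for serials 4 and 5, in that order. A real client would also check each downloaded file against the hash listed in the notification before applying it.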

While ‘piecing together the delta files’ slightly increases the computational cost on the ISP side, the ease for repository operators in merely moving static (cacheable) files around is considered advantageous by many in the RPKI community.

RRDP implementation challenges

Until the advent of RRDP, RPKI Repositories were synonymous with directories accessible via rsync modules on an rsync server. Most RPKI validator implementations would simply synchronize the entire module, and in doing so fetch all CA Repositories from the rsync server in one single operation. However, in RRDP, the CA’s location is somewhat obfuscated behind a slightly more opaque “RPKI Notify URI”, and additionally, a given X.509 certificate’s “CA Repository” attribute in the SIA Extension now means multiple things: it can either literally mean the rsync location, or serve as an indicator for a relative location in the global RPKI file hierarchy.

At first glance this might seem a clever trick, but overloading of existing data structures always increases the potential for considerable confusion when implementing a protocol!

Challenge 1: What’s in a name?

OpenBSD rpki-client requires POSIX filesystem semantics to store to-be-validated and validated RPKI data. Storing RPKI objects simply as files on a file system allows for greater debuggability and offers significant advantages to users who wish to construct advanced pipelines where all RPKI X.509 artefacts are archived before and after validation.

When synchronizing through RRDP, a Relying Party downloads arbitrary “self-labeled” digital objects which reference filenames that could collide with the validated filesystem hierarchy. Because retrieved objects have not yet been validated, they cannot yet be mapped to the validated RPKI object tree stored on the local filesystem hierarchy. This presents a bit of a chicken-and-egg problem!

The rpki-client developers came up with a trick for this catch-22: to link data retrieved from an RRDP service to a hierarchical file system layout, a SHA-256 digest is calculated for the object’s RPKI Notify URI, and this digest is used as the unique directory name to store objects retrieved from the RRDP server.
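The idea can be illustrated with a short Python sketch. Only the “SHA-256 of the Notify URI as directory name” part comes from the description above; the hex encoding of the digest, the layout below that directory, the path traversal check, and the function names are assumptions made for the example, and rpki-client’s real on-disk layout may well differ.

```python
import hashlib
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def rrdp_cache_dir(cache_root, notify_uri):
    """Directory for one RRDP service, named after a digest of its Notify URI."""
    digest = hashlib.sha256(notify_uri.encode("utf-8")).hexdigest()
    return PurePosixPath(cache_root) / digest


def object_path(cache_root, notify_uri, object_uri):
    """Map a self-labeled rsync:// object URI to a path inside that directory."""
    parts = urlsplit(object_uri)
    if parts.scheme != "rsync" or not parts.hostname:
        raise ValueError("expected an rsync:// URI with a hostname")
    segments = [s for s in parts.path.split("/") if s]
    # The name comes from an unvalidated object, so refuse to climb out of
    # the per-service directory.
    if any(s in (".", "..") for s in segments):
        raise ValueError("path traversal in object URI")
    return rrdp_cache_dir(cache_root, notify_uri).joinpath(parts.hostname, *segments)
```

Two RRDP services that publish an object under the same self-chosen name end up in different directories, because their Notify URIs, and therefore their digests, differ.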

Challenge 2: The RRDP <withdraw> XML element, a burden or a feature?

An interesting aspect of RRDP is that the RRDP information stream itself is not signed (just like rsync data exchanges are not protected). This means that any intermediate relay (for example a CDN) is in a position to modify (either by accident or deliberately) any data presented to the RRDP client. To reduce the risk of a rogue rsync server instructing the client to empty its cache, rpki-client invokes openrsync without the ‘--delete’ option. Instead, rpki-client performs a garbage collection process based on whether any valid current manifest references a given file: files found in the cache directory that no manifest lists are deleted. Similarly, an RRDP client could ignore <withdraw> instructions from the RRDP server, and instead rely on cryptographically asserted file reference counting. The operational and security implications of RRDP as one of multiple synchronization channels for RPKI data represent an area of further research and study.
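A minimal Python sketch of manifest-driven garbage collection follows. It assumes the caller has already validated the manifests and collected the relative paths they list (plus the manifests themselves) into a set; how that set is built is out of scope here, and the function name collect_garbage is hypothetical rather than taken from rpki-client.

```python
from pathlib import Path


def collect_garbage(cache_dir, referenced):
    """Delete cached files that no valid manifest references.

    `referenced` is a set of POSIX-style paths relative to `cache_dir`.
    Returns the sorted list of relative paths that were removed.
    """
    cache_dir = Path(cache_dir)
    removed = []
    # Materialise the listing first so deletions do not disturb the walk.
    for path in sorted(cache_dir.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(cache_dir).as_posix()
        if rel not in referenced:
            path.unlink()
            removed.append(rel)
    return removed
```

The point of this design is that deletion is driven by signed data (the manifests) rather than by unsigned instructions from the transport, whether that is an rsync ‘--delete’ or an RRDP <withdraw>.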

Challenge 3: RRDP can suffer from publication inconsistencies… just like rsync!

During the development of rpki-client, it came to light that some Internet Registries (motivated by performance and reliability considerations) use HTTP load balancing to distribute incoming RRDP fetching requests across multiple back ends. However, when load balancing requests towards data sensitive to inconsistency (such as RPKI files!), it is of paramount importance to ensure all back ends are perfectly synchronized.

Much to our surprise, some RRDP service operators appeared unable to keep back ends perfectly synchronized. As the RRDP protocol requires multiple successful HTTP requests to perform a single synchronization, this can lead to race conditions when fetching an Update Notification from one back end which references Delta files not yet available on the other back ends. Similar cache inconsistency issues can exist in suboptimal rsync publication pipelines. A method to reduce the risk of fetching from multiple different back ends is for RRDP clients to establish a persistent HTTP connection. A proposal on how to implement HTTP Keep-Alive support in rpki-client was shared in this email thread. Another (complementary) approach is for RRDP server operators to use a “sticky” bucket assignment process. All RRDP service operators responded positively to our problem reports, and in most cases were able to improve service reliability in a matter of days.
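As an illustration of the persistent connection idea, the Python sketch below fetches the Update Notification and the Delta files over one connection using the standard library’s http.client. It is not the proposal from the email thread: the function name, the connection_factory parameter (there to make the sketch testable), and the host and paths in the usage note are assumptions.

```python
import http.client


def fetch_rrdp_files(host, paths, connection_factory=http.client.HTTPSConnection,
                     timeout=30):
    """Fetch every path from `host`, in order, over a single HTTP connection."""
    conn = connection_factory(host, timeout=timeout)
    bodies = {}
    try:
        for path in paths:
            conn.request("GET", path)
            resp = conn.getresponse()
            # The body must be read in full before the connection is reused.
            body = resp.read()
            if resp.status != 200:
                raise RuntimeError(f"GET {path} returned HTTP {resp.status}")
            bodies[path] = body
    finally:
        conn.close()
    return bodies
```

A call could look like fetch_rrdp_files("rrdp.example.net", ["/notification.xml", "/5/delta.xml"]), with placeholder host and paths. Reusing the connection only reduces the risk: a server may still close the connection between requests, and a load balancer that distributes individual requests rather than connections can still hand them to different back ends.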

At the time of writing, it appears there is no formal requirement in the RRDP specification for clients and servers to support HTTP Keep-Alive. This might be an opportunity for clarification in the next RPKI synchronization protocol specification.

RPKI operators should keep in mind that publication data inconsistencies can exist within RRDP itself, within rsync, but also between rsync and RRDP. Similar to how hosters need to monitor both IPv4 and IPv6 when offering dual-stack services, RPKI Repository operators have to monitor both RRDP and rsync. Having said that, the industry as a whole benefits from a deterministic approach on how to move forward with rsync and RRDP. Simply put, if all validator implementations prefer RRDP, and use rsync as a fallback option, eventually rsync will no longer receive synchronization requests. Rsync falling out of fashion is of course contingent on a consistently high-quality RRDP service offering!

RRDP is a protocol across external boundaries, outside the local trust domain

Another discovery was that some RRDP feeds produced XML that did not conform with IETF specifications. Given that RRDP is a conduit between distinct administrative domains, it is very desirable for validators to apply the highest level of scrutiny and to expect nothing less than a strict interpretation of the IETF standards. It was discovered that some RPKI validators were unable to handle certain malformed RRDP input without crashing. Security sensitive applications such as RPKI validators require the opposite of the Robustness Principle: be conservative in what you do, and be even more conservative in what you accept from others!
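What “strict” could look like in practice is sketched below in Python for the Update Notification file. The individual checks are one reading of RFC 8182 (version 1, a UUID-shaped session_id, a positive serial, https URIs, 64-character hex hashes), not a transcript of what rpki-client enforces, and the function and helper names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

RRDP_NS = "{http://www.ripe.net/rpki/rrdp}"  # XML namespace used by RFC 8182
UUID_RE = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")
SERIAL_RE = re.compile(r"[1-9][0-9]*")
SHA256_HEX_RE = re.compile(r"[0-9a-fA-F]{64}")


def _require(condition, message):
    if not condition:
        raise ValueError(message)


def _check_reference(elem):
    """A snapshot/delta reference needs an https URI and a SHA-256 hex hash."""
    _require((elem.get("uri") or "").startswith("https://"), "URI is not https")
    _require(SHA256_HEX_RE.fullmatch(elem.get("hash") or "") is not None,
             "hash is not 64 hex characters")


def check_notification(xml_text):
    """Return (session_id, serial) or raise ValueError; never guess or repair."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        raise ValueError(f"malformed XML: {exc}") from exc
    _require(root.tag == RRDP_NS + "notification", "unexpected root element")
    _require(root.get("version") == "1", "unsupported version")
    session = root.get("session_id") or ""
    _require(UUID_RE.fullmatch(session) is not None, "bad session_id")
    serial = root.get("serial") or ""
    _require(SERIAL_RE.fullmatch(serial) is not None, "bad serial")
    # Anything other than snapshot and delta children is rejected outright.
    allowed = (RRDP_NS + "snapshot", RRDP_NS + "delta")
    _require(all(child.tag in allowed for child in root), "unknown child element")
    snapshots = root.findall(RRDP_NS + "snapshot")
    _require(len(snapshots) == 1, "expected exactly one snapshot element")
    _check_reference(snapshots[0])
    for delta in root.findall(RRDP_NS + "delta"):
        _require(SERIAL_RE.fullmatch(delta.get("serial") or "") is not None,
                 "bad delta serial")
        _check_reference(delta)
    return session, int(serial)
```

Every failure ends in the same explicit error instead of a crash or a best-effort interpretation, which is the behaviour argued for above. For input from untrusted sources, the XML parser itself also deserves scrutiny, for instance with limits on document size.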

Security features in rpki-client

The OpenBSD project has an extensive history pushing the envelope of cyber security research. The rpki-client application architecture is such that each task of the fetch and validation process takes place in a different process context (privilege separation), and each task-specific subprocess is further restricted from potential unauthorized access to resources through pledge(2) and unveil(2) system calls. For example, the embedded asynchronous HTTP client can access the Internet, but not the local file system; while the embedded RRDP XML parser has neither access to the local file system nor any network functions, but can only communicate to the main process via imsg pipes. The XML tree is constructed with the stream-oriented XML parser libexpat. TLS connections are established and maintained with the novel OpenBSD libtls library, which aims to make it easier and safer to write TLS applications.

The rpki-client utility’s RRDP implementation follows a very restricted access pattern, which helps reduce the cybersecurity attack surface.

Try rpki-client 7.0 for yourself

The RRDP protocol offers advantages to the global network operations community, and the deployment of RRDP is an important evolution in the RPKI technology stack. However, the technology is not entirely without issues. Implementers will need to take great care to avoid scenarios in which the RRDP protocol itself can be used as an attack vector. Hopefully this implementation report contributes to the development of future IETF specifications for future RPKI synchronization protocols.

Rpki-client 7.0 (with support for RRDP as Technology Preview) was released on 15 April 2021. All RPKI Repository operators are requested to assist in testing rpki-client in relation to their RPKI publication service offering. At OpenBSD, we call upon each RIR and NIR to ensure their RPKI publication service is interoperable with rpki-client. Repository Operators can benefit from rpki-client as it can function as an early warning system, for example, as part of the preparation process before commencing maintenance. Additionally, use of rpki-client helps define what the common denominator is amongst a diverse set of RPKI validator implementations.

The rpki-client 7.0 release notes are available here, and signed release files here. Most people are expected to run rpki-client either natively on OpenBSD, or through third party software frameworks such as EPEL (CentOS, Fedora, Red Hat), or Ubuntu/Debian. Example build scripts to generate containers compatible with the Open Container Initiative (OCI) format are available here. Software defects in rpki-client itself may be reported to tech@openbsd.org. Issues found when building or running rpki-client on Linux, macOS, FreeBSD, or Windows may be filed at the rpki-client-portable project on GitHub.

OpenBSD welcomes feedback and improvements from the broader community.

RRDP support in rpki-client was primarily developed by Nils Fisher (Australia) and Claudio Jeker (Switzerland); testing and code changeset review by Theo de Raadt (Canada), Theo Buehler (Germany), Job Snijders (The Netherlands), and Sebastian Benoit (Norway). Tom Harrison and George Michaelson (APNIC) offered assistance as RRDP subject matter experts. Nils Fisher received financial support from ARIN and APNIC.

Job Snijders is an Internet Engineer at Fastly where he analyzes and architects global networks for future growth.

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.