This is the second post in a series of questions asked of me about Border Gateway Protocol (BGP) routing security, and my efforts at trying to provide an answer.
Is rsync that bad?
The use of X.509 certificates as the vehicle to convey the public keys used in the Resource Public Key Infrastructure (RPKI) led inevitably to an outcome of distributed certificate publication points. Each client, or relying party, of the RPKI now has a problem, insofar as the client has to regularly comb all these publication points, gather up the signed products, and load them into a locally maintained cache. This can be done in several ways, and about the most inefficient is to take that previous statement literally!
Yes, it’s an option for each pass through the distributed RPKI publication points to start with the trust anchors, follow the publication point links in each validated CA certificate, and collect a copy of everything found there. It’s also an incredibly inefficient option. The relying party client ends up grabbing what is, for the most part, a precise copy of what they got from the last pass through the RPKI system.
What the client really wants to know is what’s new, what’s changed and what’s been removed since the last visit. That way they can simply fetch the differences and move on.
There is one application that performs this ‘what’s changed’ function very conveniently, and that’s the ‘rsync’ application. Rsync simply takes the URL of a remote publication point and the path of a local cache, and updates the local cache to mirror the remote state, efficiently as well, I might add.
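As a minimal sketch, a relying party’s fetch step could be as simple as the following, assuming a hypothetical publication point URL:

```python
import subprocess

def sync_publication_point(remote_url: str, local_cache: str) -> None:
    # -r recurses through the repository, -t preserves timestamps so that
    # unchanged objects are skipped, and --delete removes objects that the
    # remote repository has withdrawn, keeping the cache a faithful mirror
    subprocess.run(["rsync", "-rt", "--delete", remote_url, local_cache],
                   check=True)

sync_publication_point("rsync://rpki.example.net/repository/", "./cache/example/")
```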
This seems like an ideal fit, but for one catch: rsync is not a good idea to use in the big bad environment of the Internet.
As APNIC’s George Michaelson and Byron Ellacott told us back in 2014, rsync is just not up to the task. It can be readily attacked and its synchronization algorithm can be misled. As the final slide of their presentation to the SIDR WG said: “It seems a little strange to build routing security on top of a protocol which we have demonstrated is inefficient, insecure and dangerous to run as server or client”.
Yes, rsync is that bad. Don’t use it in the Internet.
Why does the system rely on pull? What’s wrong with push?
Instead of rsync, what should we use?
Today’s answer is the RPKI Repository Delta Protocol (RRDP), described in RFC 8182. Its foundation is a more conventional HTTPS transport subsystem, and it certainly appears, so far, to be a far better match to our requirements than rsync.
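In outline, an RRDP client pass looks something like the following sketch, with a hypothetical notification URL. A real client would also verify the SHA-256 hashes listed in the notification file and fall back to the snapshot if a needed delta is missing.

```python
import urllib.request
import xml.etree.ElementTree as ET

NS = "{http://www.ripe.net/rpki/rrdp}"   # the RRDP XML namespace (RFC 8182)

def rrdp_poll(notification_url: str, have_session: str, have_serial: int):
    # Fetch and parse the notification file that anchors an RRDP repository
    with urllib.request.urlopen(notification_url) as resp:
        root = ET.fromstring(resp.read())
    session, serial = root.get("session_id"), int(root.get("serial"))
    if session != have_session:
        # The session has been reset: our state is useless, so we must
        # fall back to the full snapshot
        return ("snapshot", [root.find(f"{NS}snapshot").get("uri")])
    if serial == have_serial:
        return ("up-to-date", [])   # nothing has changed since our last poll
    # Otherwise fetch only the deltas we are missing, in serial order
    missing = sorted((int(d.get("serial")), d.get("uri"))
                     for d in root.findall(f"{NS}delta")
                     if int(d.get("serial")) > have_serial)
    return ("deltas", [uri for _, uri in missing])
```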
The channel is secure, the delta files direct the synchronization process, and neither the server nor the client is exposed any more than conventional HTTPS exposes the endpoints of a transaction. Better, yes. But is it really fit for purpose? Probably not!
It’s not RRDP itself that I believe is broken, but the data model.
The basic requirement here is highly reliable information flooding. All the data sources need to keep all the data clients up to date. When a source changes its data, all the clients need to be updated with the new data set.
What RRDP represents, and rsync too for that matter, is a ‘poll and pull’ model. It’s up to the client to constantly check with every data source to see if the source has changed anything since the last check (poll) and then retrieve these changed items (pull). The burden is placed on the client. All the source has to do is simply make the data available.
But there are a few issues with this approach. How often should clients check with each source? If sources can alter their data at any time, then clients may fall behind the sources and make decisions based on old data. Clients are therefore motivated to poll sources at a high frequency to stay up to date. But this intensive polling places a large load on servers, and as the number of RPKI publication points increases, the load imposed on the clients increases as well.
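Some back-of-the-envelope arithmetic illustrates the scaling tension; the counts of publication points and clients here are assumptions for illustration, not measurements.

```python
# Illustrative (assumed) numbers for the poll-and-pull load calculation
publication_points = 100    # repositories each client must poll
clients = 20_000            # relying parties worldwide
poll_interval_s = 120       # a two-minute polling cycle

client_fetches_per_day = publication_points * 86_400 // poll_interval_s
server_hits_per_day = clients * 86_400 // poll_interval_s   # per repository

print(client_fetches_per_day)   # -> 72000 fetches per client per day
print(server_hits_per_day)      # -> 14400000 hits per repository per day
```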
What we have today is both ends of the Goldilocks problem but no real concept of what might be ‘just right’. The two-minute polling intervals seem to be crazy fast and exacerbate scaling pressures. On the other hand, one-hour polling intervals seem to be geologically slow for a routing system. What’s the ‘right’ answer?
A similar issue was observed in the DNS between primary and secondary servers, and one response was the adoption of the NOTIFY and IXFR mechanisms. These allowed the primary source to notify the secondary clients of a change and then allow the clients to retrieve only the changes.
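As a sketch of that DNS mechanism, here is how a secondary might pull just the changes using the dnspython library; the server address, zone name, and serial are illustrative.

```python
import dns.query
import dns.rdatatype

last_serial = 2024010101   # the SOA serial of the zone copy we already hold

# Ask the primary only for the changes since last_serial (IXFR). After a
# NOTIFY arrives, this is all a secondary needs to do to catch up.
for message in dns.query.xfr("192.0.2.1", "example.com",
                             rdtype=dns.rdatatype.IXFR,
                             serial=last_serial):
    for rrset in message.answer:
        print(rrset)    # interleaved deletions and additions
```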
Would this work in the RPKI space? Probably not. The issue is that there is an unknown number of clients, so the server has no a priori knowledge of whom to notify. Perhaps there is also a deeper problem here, in that this framework of sources and clients makes no use of intermediaries. If we want to scale up the system, then perhaps we need to consider a different distribution structure.
What we need is a reliable distributed data flooding model that can propagate routing-related metadata across the realm of BGP speakers and have the same dynamic properties in terms of propagation times as BGP itself. What’s the best protocol where we have experience and knowledge to achieve this outcome? BGP itself of course!
The same mechanisms that propagate route object updates and withdrawals across the inter-domain space are equally capable of propagating any other data payload. All that’s needed is to extend BGP to allow the support of other data objects. Can we do this? Of course we can! The Open messages exchanged at the start of a BGP session carry a set of capabilities, and if both parties support a particular capability, then the peers can exchange information based on this capability. This would not be a new attribute of a route object, but an entirely new object.
Sources originate changes to their information base and pass them into BGP. Each BGP speaker integrates this new information into their local information model and then sends updates to their BGP peers based on the changes to the local information model.
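A conceptual sketch of such a flooding speaker is below. The object format and capability are hypothetical, since BGP defines no such payload today; the point is the duplicate suppression that makes flooding terminate.

```python
# A minimal sketch of BGP-style flooding for a new, hypothetical object type
class FloodingSpeaker:
    def __init__(self):
        self.info_base = {}   # key -> (serial, payload)
        self.peers = []       # peers that negotiated the new capability

    def receive(self, key, serial, payload, from_peer=None):
        current = self.info_base.get(key)
        if current is not None and current[0] >= serial:
            return  # stale or duplicate: do not re-flood
        self.info_base[key] = (serial, payload)
        # Forward the change to every capable peer except the sender
        for peer in self.peers:
            if peer is not from_peer:
                peer.receive(key, serial, payload, from_peer=self)

# Two speakers peered with each other: an update injected at one
# propagates to the other, and the serial check stops any echo
a, b = FloodingSpeaker(), FloodingSpeaker()
a.peers, b.peers = [b], [a]
a.receive("203.0.113.0/24-roa", 1, b"signed-object")
```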
Why haven’t we gone down this path? Why are we configuring RRDP in ways that try to replicate the capabilities and performance of BGP in terms of reliable and efficient information flooding? If BGP could achieve all this, then why aren’t we using it?
I suspect a bit of IETF history is relevant here. The original brief to the SIDR Working Group included the admonition: “Don’t change the BGP protocol!” So, when the working group needed a reliable flooding capability that was equivalent in so many ways to BGP, altering BGP to add a new protocol object was just not an available option for the Working Group.
In answer to the original question, it appears that there is nothing wrong with push. BGP is a push protocol and it seems to be doing just fine! The RPKI system adopted pull largely because of a constrained set of options available to the design group. Personally, I see this as an unfortunate outcome!
Why don’t we attach credentials to BGP updates?
Transport Layer Security (TLS) is an interesting protocol in many ways.
TLS uses X.509 certificate validation as a means of authenticating that the party at the other end of a connection is the party they purport to be. However, the protocol can do this without distributed repository publication points, without pull or push to maintain local caches, or any of the other mechanisms used in the RPKI system for BGP security. Yet both TLS and Route Origin Validation (ROV) are X.509 certificate systems that are used to validate digital signatures.
In the case of TLS, the difference is that within the initial exchange of information from the server to the client, the server includes the entire set of certificates that allow the client to construct a validation chain from the public key to a trust anchor. All the client needs is the trust anchor and it’s up to the server to demonstrate a chain of transitive trust from this trust anchor to the entity that is presenting a public key to the client as proof of authenticity.
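In Python, for example, the whole pattern is visible in a few lines: the client configures nothing but its local trust anchors, and the handshake fails unless the server can present a chain back to one of them.

```python
import socket
import ssl

ctx = ssl.create_default_context()   # loads only the local trust anchors
with socket.create_connection(("example.com", 443)) as sock:
    with ctx.wrap_socket(sock, server_hostname="example.com") as tls:
        # The handshake succeeds only if the certificates presented by
        # the server chain back to one of our local trust anchors
        print(tls.getpeercert()["subject"])
```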
Could we undertake an analogous operation in RPKI?
The basic answer is yes. If all we are looking for is the validation of the authority granted to an Autonomous System (AS) to originate a route for a given prefix, then it’s possible to affix the certificate chain to the route object and just propagate the digital credentials along with the object. Affixing the entire validation path to every update can lead to significant levels of duplication in the BGP exchange, but we can leverage the observation that BGP itself uses TCP and is a reliable protocol.
Once a certificate is passed to a peer within the context of a BGP session, it can be assumed that the peer possesses the certificate, and further use of this certificate need not reproduce the entire certificate but simply refer to the previously sent item. The result is similar to the approach used in the transition to 4-byte AS numbers in BGP, where the longer AS Path information was passed through the 2-byte BGP world in the form of an opaque transitive attribute (AS4_PATH).
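A sketch of this ‘send once, then refer’ idea, assuming a hypothetical message format (BGP defines no such mechanism today):

```python
import hashlib

class CertChannel:
    def __init__(self):
        self.sent = set()   # digests of certificates the peer already holds

    def encode(self, cert_der: bytes):
        digest = hashlib.sha256(cert_der).digest()
        if digest in self.sent:
            return ("ref", digest)      # peer has it: send a short reference
        self.sent.add(digest)
        return ("full", cert_der)       # first use: send the whole certificate

chan = CertChannel()
kind1, _ = chan.encode(b"...DER bytes of an EE certificate...")
kind2, _ = chan.encode(b"...DER bytes of an EE certificate...")
print(kind1, kind2)   # full ref: the second use is just a digest
```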
So why didn’t we do this?
The same IETF history referred to above is relevant here. This would change the BGP protocol, and the admonition to the SIDR Working Group was to change nothing. Yet removing the entire side-channel of attempting to pass the certificate credentials in advance of their use in BGP updates would eliminate a rather significant source of operational complexity.
As with TLS, if a BGP update message contained sufficient metadata to allow the prefix to be validated, the entire system would be far simpler to operate. Again, I see the current design as an unfortunate outcome.
Can we use ROV to perform saturation DDoS defence?
This question is a bit like asking: Can I perform delicate microsurgery with a mallet and a chisel? You can give it a try but it’s going to be a really bad idea!
The objective of a saturation Distributed Denial of Service (DDoS) defence is to push back on the incoming traffic that is overwhelming the target. One approach is to use the routing system and instruct other networks to discard all traffic destined to the victim address. The current convention in the operational community is to use BGP communities attached to a specific route to signal that you want all traffic destined to this prefix to be dropped (Remotely Triggered Black Hole, or RTBH, RFC 5635).
Given that the RTBH signal is saying ‘drop traffic to this prefix’ it’s probably a good idea to make the prefix as specific as you can, yet still allow the prefix to be propagated through BGP. The idea of using a more specific route ensures that any destinations that share the same aggregate route are not affected. The signal propagates through the BGP space as fast as BGP, which means that the signal is usually effective in the order of two or three minutes after the RTBH route is first announced.
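To make the convention concrete, here is an illustrative sketch of the shape of an RTBH announcement. The addresses come from documentation space, and the well-known BLACKHOLE community 65535:666 is defined in RFC 7999; the actual announcement would of course be made through your router or BGP toolchain.

```python
victim = "192.0.2.45"

rtbh_route = {
    "prefix": f"{victim}/32",        # most-specific, to spare neighbouring
                                     # destinations in the same aggregate
    "next_hop": "192.0.2.1",         # conventionally routed to a discard
                                     # interface (RFC 5635)
    "communities": [(65535, 666)],   # BLACKHOLE (RFC 7999): "drop this"
}
print(rtbh_route)
```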
Could you achieve the same with a deliberately invalid Route Origin Authorization (ROA)? What if you advertised a prefix with a ROA of AS0, for example?
There are a few issues here:
- The network operator can’t generate the blocking ROA. Only the prefix holder can sign a ROA. So unless the prefix holder and the network operator are the same, the network operator has to wait for the prefix holder to mint this AS0 ROA.
- The effect of the ROA will take time to propagate, not through BGP, but through the RPKI system: the ROA has to be published, fetched by relying parties, and fed to routers. It appears that it takes some 30 minutes for a blocking ROA to propagate to the point where packets are being discarded.
- The effect of the AS0 ROA is not to change the BGP next-hop attribute of the route, but to have the route withdrawn completely from the local routing table. This means that any covering aggregate route, or any default route, will take over and the traffic will still be passed onward, assuming that such aggregates or defaults exist. Now the network provider can control whether or not aggregate routes are being originated, but default routes are a local configuration in other networks, and the victim has no control over such routes.
Taken together, the result is that it’s slow to take effect, requires additional orchestration and may not have any effect on the attack traffic in any case.
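For reference, a compact sketch of the ROV decision process (RFC 6811) makes the last point concrete: an AS0 ROA renders covered routes Invalid, and Invalid routes are simply dropped from consideration rather than redirected. The helper function and sample data here are illustrative only.

```python
import ipaddress

def rov_state(prefix: str, origin_as: int, roas) -> str:
    # roas is a list of (roa_prefix, max_length, roa_asn) tuples
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_len, roa_as in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        if net.subnet_of(roa_net) and net.prefixlen <= max_len:
            covered = True
            # AS0 authorizes no one, so it can never yield a Valid match
            if origin_as == roa_as and roa_as != 0:
                return "Valid"
    return "Invalid" if covered else "NotFound"

# An AS0 ROA over 203.0.113.0/24 renders any announced origin Invalid:
print(rov_state("203.0.113.0/24", 64500, [("203.0.113.0/24", 24, 0)]))
```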
Any more questions?
I hope this post, and the series as a whole, has shed some light on the design trade-offs behind the RPKI work so far, and points to some directions for further efforts that could shift the needle with some tangible improvements in the routing space.
As always, I welcome your questions in the comments below, which I may add to and address in this series in time.
The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.