The Internet does not make any promise that packets will be delivered to their destination. Each traffic flow is at the mercy of congestion and, unfortunately, denial-of-service (DoS) attacks. Critical communications are therefore often forced to use more reliable, yet considerably more expensive, services such as private wide-area networks or leased lines.
At the Network Security Group at ETH Zurich, we believe that huge improvements in the resilience and performance of the Internet, as well as considerable savings, will come from moving beyond this ‘best-effort’ traffic delivery model. In April we wrote an APNIC Blog post about Colibri, an inter-domain bandwidth reservation system we designed with this goal in mind.
In essence, Colibri allows hosts on the Internet to reserve bandwidth on communication paths, such that their packets are prioritized and protected from external congestion. Colibri promises to be scalable and to resist a multitude of different attacks, and it is designed to be integrated with SCION, a next-generation Internet architecture that is deployed today in 12 Internet Service Providers (ISPs) globally.
As frequently happens, however, while working on the Colibri implementation our team started to envision new ways to improve the performance of the protocol. These ideas grew into a radically new design, with many key differences from the initial Colibri specification. In this post, we explain the intuition behind flyover reservations, one of the major improvements we introduced, and how they will help create vastly more efficient and practical Internet bandwidth reservation protocols.
Flyovers: Simpler reservations
In many existing bandwidth reservation systems, including Colibri, reservations must be coordinated for a whole Internet path. In a nutshell, for each end-to-end communication, all ISPs on the path must agree on the amount of bandwidth reserved for such communication. The idea of making ‘path-based’ reservations is extremely intuitive and can be implemented efficiently. However, coordinating the reservation setup across multiple decentralized entities is quite complex, and has to be carried out for each individual path. Hence the question — can we avoid this overhead by decomposing these path-wide reservations into simpler, two-party reservation contracts?
The answer is what we call ‘flyover reservations’ — per-hop reservations where each transit ISP grants the exclusive use of a portion of their internal bandwidth to a source, without the need for external coordination. Flyover reservations thus work like real-life highway flyovers, allowing cars (packets, in our case) to avoid congested intersections. The source is then responsible for composing multiple such flyover reservations to protect a flow end-to-end.
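The composition step can be pictured with a small sketch. The `Flyover` record, the `compose` and `end_to_end_bandwidth` helpers, the ISP names, and the bandwidth figures are all illustrative assumptions, not part of the actual protocol or its SCION implementation. The sketch also assumes that the rate protected end-to-end is limited by the smallest flyover on the path.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Flyover:
    """Hypothetical record of one per-hop reservation granted by a transit ISP."""
    isp: str
    bandwidth_mbps: float


def compose(path: list[str], flyovers: list[Flyover]) -> list[Optional[Flyover]]:
    """Match each hop on the path with the flyover the source holds for that ISP, if any."""
    held = {f.isp: f for f in flyovers}
    return [held.get(isp) for isp in path]


def end_to_end_bandwidth(composed: list[Optional[Flyover]]) -> float:
    """Assumption: the protected rate is bounded by the smallest flyover.

    Returns 0.0 if the path is empty or any hop lacks a flyover.
    """
    if not composed or any(f is None for f in composed):
        return 0.0
    return min(f.bandwidth_mbps for f in composed)


path = ["ISP-A", "ISP-B", "ISP-C"]
flyovers = [Flyover("ISP-A", 10.0), Flyover("ISP-B", 5.0), Flyover("ISP-C", 8.0)]
print(end_to_end_bandwidth(compose(path, flyovers)))  # 5.0
```

Each flyover here is obtained from a single ISP independently of the others; only the source sees the whole path.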
More efficient reservations
This seemingly minor change in reservation semantics has two important repercussions. First, the reservation admission algorithm, which decides whether a reservation request should be granted or not, can be much simpler. Since a flyover reservation involves only the source and a single transit ISP, no path-wide coordination is required. Second, the source is free to compose flyover reservations into end-to-end paths and can even use the same flyover to protect traffic on different (partially overlapping) paths. Therefore, flyovers reduce both the setup cost and the number of setup iterations required.
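To make the reuse argument concrete, here is a deliberately simplified counting sketch. It assumes that a hop is identified by an opaque label (what exactly identifies a hop in the real protocol is not specified here), that each distinct hop needs one flyover, and that a path-based scheme would set up every hop of every path separately. The function names and ISP labels are hypothetical.

```python
def flyovers_needed(paths: list[list[str]]) -> int:
    """Number of distinct hops across all paths: one reusable flyover per hop."""
    return len({hop for path in paths for hop in path})


def per_path_hop_setups(paths: list[list[str]]) -> int:
    """Hop-level setups if every path is reserved independently (simplified model)."""
    return sum(len(path) for path in paths)


# Two paths that share their first two hops and then diverge.
paths = [
    ["ISP-A", "ISP-B", "ISP-C"],
    ["ISP-A", "ISP-B", "ISP-D"],
]
print(flyovers_needed(paths))      # 4
print(per_path_hop_setups(paths))  # 6
```

The gap between the two counts grows with the amount of overlap between the paths a source uses.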
For the sake of brevity, we refer to our paper for the list of security and efficiency benefits of flyover reservations. However, we would like to highlight a new and exciting property that was not available to path-based protocols such as Colibri: Partial reservations. With flyovers, a source may decide to create reservations for just a subset of the hops on the path, protecting their traffic on just those hops (as illustrated in the figure). This may be useful in case there is a particularly congested set of ISPs on the path, and, more importantly, it offers an avenue for incremental deployment. Even if only a handful of ISPs were to offer flyover reservations, sources would already start to see improvements in the reliability of their connections.
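A partial reservation can be sketched as a simple selection over the hops of a path. The `plan_partial` helper, the sets of ISPs that offer flyovers or are congested, and the ISP names are hypothetical inputs chosen for illustration; how a source would actually learn about congestion or flyover availability is outside this sketch.

```python
def plan_partial(
    path: list[str], offering: set[str], congested: set[str]
) -> tuple[list[str], list[str]]:
    """Split the congested hops of a path into those that can and cannot be protected.

    Returns (reserve, exposed):
      reserve - congested hops whose ISP offers flyovers, so a reservation is possible
      exposed - congested hops whose ISP does not offer flyovers (best-effort only)
    """
    reserve = [isp for isp in path if isp in congested and isp in offering]
    exposed = [isp for isp in path if isp in congested and isp not in offering]
    return reserve, exposed


path = ["ISP-A", "ISP-B", "ISP-C", "ISP-D"]
offering = {"ISP-B", "ISP-D"}    # assumption: only these ISPs have deployed flyovers
congested = {"ISP-B", "ISP-C"}   # assumption: hops the source wants to protect

reserve, exposed = plan_partial(path, offering, congested)
print(reserve)  # ['ISP-B']
print(exposed)  # ['ISP-C']
```

As more ISPs join the `offering` set, the `exposed` list shrinks without any change on the source's side.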
Bandwidth fair-sharing
We were able to achieve further speedups by redesigning the admission algorithm, which controls how much bandwidth is granted to which source. While communicating parties in Colibri can request how much bandwidth they need for their communication, in this iteration of the flyover design we decided to take the simplest approach possible — the total available bandwidth for a flyover reservation is always equally split between all the requesting sources.
This may seem overly restrictive, but thanks to this allocation policy we could prove that there is a bounded ‘time to reservation’, that is, a source will receive its fair share within a short time limit. This instantiation of flyovers is therefore well-suited for the protection of critical traffic. We can, in fact, ensure that a small flow carrying remote command and control traffic, interbank clearing transactions, or any other high-priority communication always has timely access to reliable communication. Further, this computation is so simple and performant that it can be executed entirely in the routers’ fast path.
There is of course room for trade-offs between the demand-aware approach of Colibri and the strict fair-sharing we propose here. We are now working on extending this inflexible but speedy design and improving its flexibility in bandwidth allocation.
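The equal-split policy itself fits in a few lines. The sketch below is only an illustration of the arithmetic: the `fair_share` function, the source identifiers, and the 100 Mbps figure are assumptions, and a real router would perform the equivalent computation in its fast path rather than in Python.

```python
def fair_share(total_bandwidth_mbps: float, sources: list[str]) -> dict[str, float]:
    """Split the flyover bandwidth equally among the distinct requesting sources."""
    requesting = sorted(set(sources))  # a source counts once, however often it asks
    if not requesting:
        return {}
    share = total_bandwidth_mbps / len(requesting)
    return {source: share for source in requesting}


print(fair_share(100.0, ["src-1", "src-2", "src-3", "src-4"]))
# {'src-1': 25.0, 'src-2': 25.0, 'src-3': 25.0, 'src-4': 25.0}

# A fifth source arriving lowers every share to 20 Mbps.
print(fair_share(100.0, ["src-1", "src-2", "src-3", "src-4", "src-5"])["src-1"])  # 20.0
```

Because each share depends only on the number of requesting sources, no per-source demand needs to be tracked.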
Implementation and outlook
To fully realize the promise of flyover reservations, we created a prototype implementation based on SCION. Compared to Colibri, the reservation admission protocol is up to four orders of magnitude faster, which allows it to run on the router itself without requiring an external server for processing. On the data plane, forwarding of reservation-protected traffic is 2x faster. This large improvement is again mainly due to the simplicity of the protocol, which requires fewer operations at each stage.
Spurred by these impressive results, we are now working on merging flyover reservations into Colibri to provide a unified and efficient system for bandwidth reservations on SCION.
We anticipate that these efforts will soon culminate in a scalable and efficient Quality of Service (QoS) system we can all use across the SCION production network.
Giacomo Giuliari is a PhD student at the Network Security Group at ETH Zurich.
Marc Wyss contributed to this post.
From the title, I thought you were referencing my RFC-2549 “IP over Avian Carriers with Quality of Service”.
You seem to have re-invented fair queuing?
Y’all might want to take a look at what we’re doing with libreqos.io and cake.
Hi Dave, author here 🙂 Thanks for the comment and the pointers to the libreqos work! As far as I understand, the libreqos work is focused on intra-ISP traffic engineering. Our work instead focuses on coordinating cross-ISP (inter-domain) bandwidth reservations. One key difference in this setting is that we have to assume a strong adversary model. Therefore, we had to build cryptographic source authentication and DoS protection into the protocol. Fair queuing can then be used within an ISP to enforce the reservations agreed with our protocol. A few more details can be found in the related APNIC post (on the Colibri system), or in the papers.