Good, fast, cheap – pick two. This has been a fundamental rule of network operations since the beginning. Or so we thought.
My colleagues and I at the Technical University of Munich and the University of Vienna recently conducted a project showing that strict latency guarantees can be provided with low-cost (< $100!), low-capacity and, most importantly, software-based network equipment. What’s more, such equipment can be more efficient than more expensive options.
In our paper ‘Loko: predictable latency in small networks’, which we presented at CoNEXT ’19, we considered networks for critical environments such as airplanes, cars or industrial production sites. These environments often come with particular constraints (for example, they require time-critical control loops), and their workloads often have specific characteristics, defined by the rate and burstiness of the arriving demands (for example, network connections between actuators and controllers in industrial manufacturing sites).
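To make 'rate and burstiness' concrete: a common way to describe such a workload is a token-bucket envelope, where a flow may send at most b + r·t bits in any interval of length t. The sketch below is only an illustration of that idea and is not taken from Loko; the function name, parameters and trace format are hypothetical.

```python
def conforms_to_token_bucket(packets, rate_bps, burst_bits):
    """Check a packet trace against a token-bucket envelope (b + r * t).

    packets: list of (timestamp_seconds, size_bits), sorted by timestamp.
    rate_bps: sustained rate r in bits per second (assumed parameter).
    burst_bits: burst allowance b in bits (assumed parameter).
    """
    tokens = burst_bits  # the bucket starts full
    last_time = None
    for timestamp, size_bits in packets:
        if last_time is not None:
            # Refill at the sustained rate, never beyond the burst allowance.
            elapsed = timestamp - last_time
            tokens = min(burst_bits, tokens + rate_bps * elapsed)
        if size_bits > tokens:
            return False  # this packet exceeds the envelope
        tokens -= size_bits
        last_time = timestamp
    return True


# Hypothetical sensor flow: 1,000-bit packets every 10 ms.
periodic = [(0.00, 1000), (0.01, 1000), (0.02, 1000)]
# Hypothetical burst: ten 1,000-bit packets at the same instant.
burst = [(0.0, 1000)] * 10
```

With a rate of 100 kbps and a burst allowance of 8,000 bits, the periodic trace fits the envelope while the ten-packet burst does not.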
We argued that while such networks are usually small, they can still benefit from emerging flexible communication technologies: programmability can offer faster and more fine-grained control than proprietary solutions such as Profibus or CAN, which rely on specialized hardware to enable deterministic guarantees.
We showed that low-cost programmable devices provide great opportunities for ensuring predictable performance in these environments: not despite their simple and cheap designs, but because of them. For example, the Zodiac FX switch runs a single-threaded, OS-free packet processing loop written in C. More expensive devices, such as carrier-grade switches, typically rely on multi-core architectures and operating systems with complex schedulers and optimizations, making it difficult to devise accurate models, which are a prerequisite for predictability.
Another opportunity, besides architectural simplicity, comes from the fact that low-cost programmable switches are often based on open architectures, in contrast to high-end switches, which typically have to be treated as black boxes. As our paper shows, this allows network solution engineers to derive fundamental benchmarking dimensions.
This approach is further supported by the fact that industrial applications typically impose relaxed bandwidth requirements (up to hundreds of kilobits per second) and latency requirements on the order of milliseconds, both of which can potentially be met by low-capacity hardware.
A primary outcome of the project was the design, implementation, and evaluation of Loko, a system that builds upon this insight and provides end-to-end latency guarantees in networks based on low-cost programmable switches. It relies on a measurement-based approach to derive accurate performance models for such switches and manages the network accordingly.
To this end, Loko leverages principles of deterministic network calculus, a mathematical modelling framework for networks, and proceeds in three steps:
- Accurately and comprehensively benchmarking switching performance using a measurement campaign and leveraging knowledge of the (open) architecture of low-cost devices. Typically, as shown for the Zodiac FX in Figure 1, these switches rely on a central CPU for processing of data- and control-plane packets.
- Based on these measurement inputs, deriving a switch model that avoids traditional assumptions that are invalid for low-cost devices. In particular, as shown in Figure 2, this requires the definition of a service function shared by all the ports, rather than the per-port service definition that traditional state-of-the-art systems use.
- Extending the switch model to a network model, which forms the basis for the design of admission control and resource allocation strategies. Experiments show predictable end-to-end latencies, together with guaranteed throughput, guaranteed packet delivery and burst allowance.
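To give a feel for how a shared service function can drive admission control, here is a minimal textbook-style sketch for a single switch. It is not Loko's actual model, which is derived from measurements and is more detailed. It assumes token-bucket flows (rate r, burst b) and a rate-latency service curve (rate R, latency T) shared by all ports in FIFO order, for which deterministic network calculus gives the delay bound T + Σb / R as long as Σr ≤ R. All class names, function names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    rate_bps: float     # sustained rate r
    burst_bits: float   # burst allowance b
    deadline_s: float   # latency the flow must be guaranteed


@dataclass(frozen=True)
class SwitchModel:
    service_rate_bps: float  # R, assumed to come from benchmarking
    latency_s: float         # T, assumed to come from benchmarking


def delay_bound(flows, switch):
    """Worst-case delay for the FIFO aggregate sharing one service curve."""
    total_rate = sum(f.rate_bps for f in flows)
    total_burst = sum(f.burst_bits for f in flows)
    if total_rate > switch.service_rate_bps:
        return float("inf")  # the switch cannot keep up; no finite bound
    return switch.latency_s + total_burst / switch.service_rate_bps


def admit(new_flow, admitted, switch):
    """Admit the flow only if every flow still meets its deadline."""
    candidate = list(admitted) + [new_flow]
    bound = delay_bound(candidate, switch)
    return all(bound <= f.deadline_s for f in candidate)


# Hypothetical numbers: a 1 Mbps switch with 1 ms of processing latency.
switch = SwitchModel(service_rate_bps=1_000_000, latency_s=0.001)
control_loop = Flow(rate_bps=100_000, burst_bits=2000, deadline_s=0.006)
```

Because the service is shared, every admitted flow adds its burst to the bound seen by all the others. In this example, a second identical flow is admitted (bound of about 5 ms) but a third is rejected (bound of about 7 ms against a 6 ms deadline).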
Programmable networks that do not depend on complex switch hardware platforms may find interesting applications in other contexts as well. For example, using DPDK for deploying new apps with predictable performance constitutes an attractive low-cost alternative to the typically more expensive and less flexible programmable P4 devices. Indeed, as DPDK typically runs on a pinned core, a Loko-like approach can potentially derive very strict and precise latency/throughput models for a given application.
So go on, deploy your new predictable applications on simple, low-cost equipment such as the Zodiac FX switch, or using DPDK on commodity off-the-shelf NICs.
The results of our project are summarized on a web page, which includes source code, configuration files and instructions to reproduce all our measurements and experiments, as well as the slides used during our presentation at CoNEXT ’19.
Amaury Van Bemten is a researcher at the Chair of Communication Networks based at the Technical University of Munich. He is currently working on solutions for the provisioning of predictable latency in communication networks.