Today’s data centre networks are becoming increasingly dynamic.
Besides basic forwarding, their switches are often expected to adapt packet processing behaviour to ongoing network conditions, for example, identifying malicious flows and filtering them according to security policies, detecting failures and rerouting the affected traffic, recognizing load imbalance and redistributing elephant flows, and so on.
Meanwhile, network events in today’s data centre networks are becoming microscopic in duration (so-called microbursts), often lasting just tens of microseconds (μs).
These trends led us at the University of Pennsylvania, in collaboration with Princeton University, to ask in a recent SIGCOMM 2020 paper: is it possible to enable fine-grained reactions to ongoing network events, with both minimum latency and maximum flexibility, in order to manage increasingly dynamic networks?
This question led to the development of Mantis, an open-source framework for implementing expressive and fine-grained reactive behaviours on top of today’s programmable switches.
Mantis simplifies the process of encoding reaction loops using a simple-to-reason-about set of reaction abstractions (malleable entities and reactions) and achieves both generic support of reactive behaviours and sub-RTT level latency without penalizing line-rate packet processing.
Flexibility and latency tradeoff in existing primitives
In general, a reaction in computer networking can be defined as two steps:
- Aggregating statistics (for example, packet count or queue depth) from across packets.
- Using those statistics to influence the processing of subsequent packets (for example, by redirecting a subset of them or tagging them with a computed value).
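The two steps above can be sketched in a few lines of Python. This is only a toy model to make the definition concrete: the packet format, the threshold `HEAVY_FLOW_PACKETS`, and the port names are all hypothetical and do not come from Mantis.

```python
from collections import Counter

# Hypothetical threshold, chosen only for illustration.
HEAVY_FLOW_PACKETS = 3


def aggregate(packets):
    """Step 1: aggregate a statistic (packet count per flow) across packets."""
    counts = Counter()
    for pkt in packets:
        counts[pkt["flow"]] += 1
    return counts


def react(counts):
    """Step 2: turn the statistics into a decision for subsequent packets."""
    return {flow for flow, n in counts.items() if n >= HEAVY_FLOW_PACKETS}


def process(pkt, redirected_flows):
    """Forward a packet, redirecting it if its flow was flagged."""
    return "backup_port" if pkt["flow"] in redirected_flows else "default_port"


# Packets observed so far (flow A is the heavy one).
window = [{"flow": "A"}, {"flow": "A"}, {"flow": "B"}, {"flow": "A"}]
redirected = react(aggregate(window))

# Subsequent packets are processed using the decision made above.
decisions = [process(p, redirected) for p in [{"flow": "A"}, {"flow": "B"}]]
```

Here the later packet of flow A is redirected while flow B keeps its default path. The options discussed next differ mainly in where these two steps run and how quickly the loop between them can close.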
Today, there are a few ways to implement a reaction, as shown in Figure 1.
Traditionally, one could leverage the control plane, for example, software-defined networking (SDN) or conventional control loops, which can measure the network and reconfigure it dynamically. These approaches are flexible, but typically orders of magnitude slower than the network events they would ideally capture.
The other approach is to directly integrate reactions into the data plane hardware. Data planes are fast but, unfortunately, limited by the current hardware capabilities.
The recent emergence of programmable switches provides a promising alternative. However, P4 and protocol-independent switch architecture (PISA) switches still suffer from a well-known set of restrictions, including a limited set of operations and branching allowed in actions, an inability to update match-action tables in-band, and memory access constraints, among others.
Mantis: a system that enables fast and flexible reaction loops
To enable both flexibility and minimum latency, Mantis builds on existing programmable switch hardware, but pushes the reaction loop as close to the switch ASIC as possible and co-designs the data plane program for fine-grained malleability and ease of use. Two core abstractions underlie Mantis:
- Malleable entities: a set of primitives amenable to fine-grained reconfiguration at runtime.
- Reactions: the fine-grained reaction logic, which is packaged into a C-like function.
Figure 2 shows the end-to-end Mantis framework. First, the user writes a program in P4R, a simple extension to the P4 language (the BNF syntax for P4R can be found in our paper).
Figure 3 shows a simple P4R code snippet. The user writes a normal P4 program, but using P4R, injects into the program primitives and logic that specify the dynamic portions of the data plane.
Specifically, P4R enables the specification of primitives (tables, packet header fields, and constant values) that should be malleable, and a set of reaction functions that modify the malleable primitives at runtime.
The reaction function operates much like a normal C function. The parameters to the function can include fields from passing packets’ headers, switch counters, or any other data-plane information. The body of the function can contain arbitrary C code that can read the parameters, compute any applicable control logic, and reconfigure the malleable primitives. See our GitHub repo for examples of P4R programs and a practical tutorial on writing them.
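To illustrate the shape of a reaction, here is a rough Python model. It is not P4R syntax (see the repo for real examples): `MalleableValue`, the queue-depth readings, and the port numbers are hypothetical names invented for this sketch. The point is only that a reaction reads measurements passed in as parameters, runs ordinary control logic, and writes a malleable entity that the data plane then uses.

```python
class MalleableValue:
    """Hypothetical stand-in for a malleable constant read by the data plane."""

    def __init__(self, initial):
        self.value = initial


def reaction(queue_depths, egress_port):
    """Steer traffic to the port with the shallowest queue.

    queue_depths: hypothetical per-port queue-depth readings (port -> depth).
    egress_port:  the malleable value consulted when forwarding packets.
    """
    best_port = min(queue_depths, key=queue_depths.get)
    # Only reconfigure when the alternative is strictly better.
    if queue_depths[best_port] < queue_depths[egress_port.value]:
        egress_port.value = best_port


egress_port = MalleableValue(1)
reaction({1: 40, 2: 5, 3: 12}, egress_port)  # port 2 has the shallowest queue
```

After the call, `egress_port.value` is 2. A second call with equal queue depths on all ports would leave it unchanged, because the reaction only switches when another port is strictly less loaded.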
The Mantis compiler translates P4R (including reaction functions) into a pair of deployable artifacts (one for the data plane and one for the control plane) that support sub-RTT, serializable measurement polling and hitless reconfiguration without interrupting the data plane. While traditional data plane and control plane interactions are treated as one-off, asynchronous events, Mantis co-designs the two planes to focus on fast reactions, and it applies aggressive optimizations to support repeated accesses and updates of the reaction arguments and malleable entities. At runtime, Mantis dynamically loads the compiled reaction as a shared object and executes the user-specified reaction logic, that is, polling measurements and updating portions of the data plane at a granularity of tens of μs (roughly the PCIe latency of the underlying system).
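As a mental model of a reaction loop with hitless updates, consider the following Python sketch. It uses a generic double-buffering technique (stage changes in a shadow copy, then flip a version flag) to show how readers can avoid ever seeing a half-applied update. This is an illustrative assumption, not a description of how Mantis implements reconfiguration; `VersionedConfig`, `run_reaction_loop`, `failover`, and the counter names are all hypothetical.

```python
class VersionedConfig:
    """Two copies of the configuration plus a flag selecting the live one."""

    def __init__(self, initial):
        self._copies = [dict(initial), dict(initial)]
        self._active = 0

    def read(self):
        """What the packet-processing path sees: only the active copy."""
        return self._copies[self._active]

    def update(self, changes):
        """Stage changes in the shadow copy, then switch in a single step."""
        shadow = 1 - self._active
        self._copies[shadow] = {**self._copies[self._active], **changes}
        self._active = shadow  # the flip is the only change readers observe


def run_reaction_loop(poll, react, config, iterations):
    """Repeatedly poll measurements, run the reaction, and apply its changes."""
    for _ in range(iterations):
        measurements = poll()
        changes = react(measurements, config.read())
        if changes:
            config.update(changes)


def failover(measurements, current):
    """Hypothetical reaction: move to the backup when port 1 goes silent."""
    if measurements["rx_port1"] == 0 and current["next_hop"] == "port1":
        return {"next_hop": current["backup"]}
    return {}


# Hypothetical measurements: port 1 stops receiving traffic on the second poll.
samples = iter([{"rx_port1": 120}, {"rx_port1": 0}])
config = VersionedConfig({"next_hop": "port1", "backup": "port2"})
run_reaction_loop(lambda: next(samples), failover, config, iterations=2)
```

After the two iterations, `config.read()["next_hop"]` is `"port2"`. A real loop would run continuously against switch hardware at microsecond timescales, which plain Python cannot do; the sketch only shows the poll, react, update structure.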
What is Mantis useful for?
We implemented a Mantis prototype on a Wedge100BF-32X Tofino switch. We found that Mantis incurs low overhead, achieves reaction times in the tens of microseconds, and crucially, does not significantly impact legacy control plane and data plane modules.
There are several aspects that make Mantis useful.
First, Mantis makes it simpler for users to implement sub-RTT control algorithms. There are many examples, a few of which we show in our paper, such as failover, DoS mitigation, and even reinforcement learning algorithms.
More specifically, when expressing this control logic, the programmer is no longer constrained by the limited computational operations and doesn’t need to worry about the coordination between the data plane and the control plane.
Second, Mantis’ co-design of the control and data planes can sometimes outperform existing pure data plane or control plane alternatives. For example, in flow-size estimation, where today people might use tools like sFlow (a mostly control plane approach) or sketches (a mostly data plane approach), Mantis achieves a sampling interval of 10 μs, corresponding to 1 in 5 packets on a 10 Gbps ISP backbone link from CAIDA traces.
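As a quick sanity check on those figures, the short calculation below derives what they imply. The average packet rate is inferred from the two numbers in the text (one sample per 10 μs, 1 in 5 packets); it is not a value reported in the source, and the variable names are ours.

```python
SAMPLING_INTERVAL_US = 10   # from the text: one sample every 10 microseconds
PACKETS_PER_SAMPLE = 5      # from the text: 1 in 5 packets is sampled
LINK_RATE_BPS = 10 * 10**9  # from the text: a 10 Gbps link

# Average packet rate implied by the two figures above (derived, not measured).
implied_pps = PACKETS_PER_SAMPLE * 1_000_000 // SAMPLING_INTERVAL_US

# Bytes the link could carry in one sampling interval at full line rate.
bytes_per_interval = LINK_RATE_BPS * SAMPLING_INTERVAL_US // 1_000_000 // 8

print(implied_pps)         # 500000
print(bytes_per_interval)  # 12500
```

In other words, the trace averages roughly 500,000 packets per second, and each 10 μs interval has room for at most 12,500 bytes at line rate.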
Moreover, Mantis can flexibly express the target algorithm; with the same resources, it can outperform both alternatives. See our paper, video and GitHub repo for more details.
Liangcheng Yu is currently a second-year PhD candidate at the Department of Computer and Information Science, University of Pennsylvania, supervised by Professor Vincent Liu.
The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC.