Network Function Virtualization (NFV) is being touted as a key component of 5G technology, with its ability to offload network functions into software that runs on industry-standard hardware and can be managed from anywhere. However, running core network functions in software is complex and compute-intensive, and thus needs special attention when introducing new components such as kernel bypass stacks.
We at the Indian Institute of Technology, Bombay, are exploring the benefits and suitability of novel kernel bypass stacks as a replacement for the Linux kernel stack within more complex and realistic network functions of mobile telecommunications networks.
Network Function Virtualization
A network path between two end hosts consists of multiple network elements or network functions. These network functions include routers and switches that forward user traffic, as well as middleboxes such as firewalls and proxy servers.
Traditionally, these network functions were built as custom packet processing hardware. However, in recent times, there has been a move to re-engineer these hardware components as software applications that are designed to run on commercial-off-the-shelf servers. This process of moving away from hardware network functions towards software packet processing components is termed Network Function Virtualization (NFV).
NFV has several benefits. It is much cheaper to develop and deploy software than physical hardware appliances. Further, software running on the cloud can be elastically scaled on demand by adding more compute and storage. And it is easier to update and replace old software than it is to do the same with hardware.
However, there is a catch. Although NFV gives us all of these benefits, now with network functions running in software, it becomes much more difficult to ensure good performance that meets various service level objectives (SLOs).
While some network functions only access and modify packet headers, some other network functions are fairly complex and run at the application layer of the network stack (for example, an HTTP proxy server). Such network functions have to make use of abstractions such as network sockets that are provided by other software layers like the Linux kernel networking stack. As a result, the performance of such network functions not only depends on the network function implementation itself, but also on the underlying software on which it runs.
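To make this dependency concrete, here is a minimal sketch of an application-layer function written against the kernel's socket API. It is a toy, not one of the network functions from our study: the names `serve_one` and `demo` and the `OK ` reply format are made up for illustration, and `socketpair()` stands in for a real TCP listener so that the example runs without a network. The point is that every `recv` and `sendall` is a system call, and each one copies data between kernel and user memory.

```python
import socket

def serve_one(conn):
    """Toy application-layer function: prefix every request with b"OK "."""
    while True:
        data = conn.recv(4096)       # system call: kernel copies bytes into user memory
        if not data:                 # an empty read means the peer closed its side
            break
        conn.sendall(b"OK " + data)  # system call: user bytes are copied into the kernel

def demo():
    # socketpair() returns two connected kernel sockets without needing a network;
    # a real network function would accept connections on a listening TCP socket.
    client, server = socket.socketpair()
    with client, server:
        client.sendall(b"ping")
        client.shutdown(socket.SHUT_WR)  # signal that no more requests will follow
        serve_one(server)
        return client.recv(4096)

if __name__ == "__main__":
    print(demo())
```

None of the application code above touches TCP/IP processing directly. That work is done by whatever stack sits underneath the socket calls, which is why the choice of stack matters for performance.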
The era of kernel bypass
Previous research has identified several issues with the Linux kernel’s networking stack, especially when processing large volumes of network data. For example, the network stack copies packets multiple times, once from the device to kernel memory, and again from the kernel to user memory. Every network input/output (I/O) operation incurs overheads due to system calls and interrupt processing. Further, the network stack processing of a connection can span multiple CPU cores, causing lock contention on various kernel data structures. The network stack is also built atop the Virtual File System (VFS), leading to overheads unrelated to the network I/O altogether.
The popular argument today is that the kernel is not supposed to be on the data path of high-speed network interface cards (NICs), and the best solution is to simply bypass it. Several ‘kernel bypass’ mechanisms like netmap and Intel’s Data Plane Development Kit (DPDK) are commonly used to build high-performance software network functions. These kernel bypass mechanisms transfer packets directly between user applications and high-speed network cards, eliminating all kernel processing on the data path.
However, because these kernel bypass mechanisms bypass the TCP/IP processing of the Linux kernel completely, they deliver ‘raw’ network packets with all network headers intact to user applications. Applications that require TCP/IP processing must run separate ‘userspace’ network stacks to provide transport layer functionality if they run at the application layer.
An example of a userspace network stack is multicore TCP (mTCP). mTCP is built over a kernel bypass mechanism like DPDK. With mTCP, the network stack processing and the network function logic run on separate threads, both co-located on the same core. mTCP is optimized for better throughput via batch processing, and for multicore scalability via per-core local data structures.
IX is a new operating system design that represents a different point in the design space from mTCP. In IX, the data path of the kernel is optimized for high-speed I/O by giving applications more control over the NIC via hardware virtualization mechanisms. Unlike the batching approach of mTCP, applications in IX rely on the run-to-completion model, with the network stack processing and the network function logic running in the same thread.
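The difference between the two processing models can be sketched in a few lines. This is a single-threaded toy under stated assumptions, not code from mTCP or IX: `stack_process` and `app_process` are hypothetical stand-ins for TCP/IP processing and network function logic, the `hdr:` prefix is an invented header, and the two threads of mTCP are approximated by two phases of one loop. The returned trace records the order in which the stack and application stages run.

```python
from collections import deque

HEADER = "hdr:"  # invented header prefix, for illustration only

def stack_process(pkt):
    """Stand-in for TCP/IP processing: strip the header if present."""
    return pkt[len(HEADER):] if pkt.startswith(HEADER) else pkt

def app_process(payload):
    """Stand-in for the network function logic."""
    return payload.upper()

def batched(packets, batch_size=4):
    """Batching model: the stack stage handles a whole batch of packets,
    then the application stage consumes that batch."""
    out, trace = [], []
    rx = deque(packets)
    while rx:
        batch = [rx.popleft() for _ in range(min(batch_size, len(rx)))]
        payloads = []
        for pkt in batch:
            payloads.append(stack_process(pkt))
            trace.append("stack")
        for payload in payloads:
            out.append(app_process(payload))
            trace.append("app")
    return out, trace

def run_to_completion(packets):
    """Run-to-completion model: each packet passes through the stack and the
    application logic before the next packet is touched."""
    out, trace = [], []
    for pkt in packets:
        payload = stack_process(pkt)
        trace.append("stack")
        out.append(app_process(payload))
        trace.append("app")
    return out, trace

if __name__ == "__main__":
    pkts = ["hdr:a", "hdr:b", "hdr:c"]
    print(batched(pkts, batch_size=2)[1])
    print(run_to_completion(pkts)[1])
```

Both functions produce the same output for the same packets. Only the interleaving differs: batching amortizes per-packet costs across a batch, while run-to-completion avoids handing packets over between stages.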
TCP Acceleration as a Service (TAS) is another userspace network stack that differs from mTCP’s principle of co-locating the network stack processing and network function logic on the same CPU core for cache locality. Instead, with TAS, the network stack threads and the application threads run on separate cores and scale independently.
In our research, we are exploring questions around which network stacks are suitable for which network functions. We have worked with the above three network stacks in our research, and between them, they cover a broad range in the design spectrum. For examples of network functions, we look towards telecommunications networks that are fast adopting the paradigm of NFV.
5G mobile packet core
The packet core of a mobile telecommunication network connects the radio access network (users and base stations) to external networks. In 5G, the packet core is being standardized wholly based on the idea of NFV.
In Figure 1, the control plane of the packet core handles signalling messages from users, while the data plane forwards user traffic. We will focus on the control plane here. More specifically, we focus on the user registration procedure in the control plane, which is invoked when a user begins a mobile data session with the network. This procedure involves two network functions:
- The Access and Mobility Management Function (AMF) is the main entity that handles signalling messages from the user, and holds most of the user state.
- The Authentication Server Function (AUSF) is responsible for user authentication and security setup during user registration.
In our research, we sought to understand whether novel kernel bypass stacks will be suitable for building network functions such as these.
Previous research on such network stacks has mostly evaluated them with enterprise-grade, I/O-intensive network functions like load balancers and Network Address Translators (NATs). However, the network functions on the control plane of the 5G packet core are more complex and compute-intensive. For example, the AUSF takes a little more than a quarter-million CPU cycles to process an authentication request, which is several orders of magnitude more compute-intensive than the I/O-intensive network functions considered in prior work.
Do kernel bypass stacks deliver the same performance gains they have delivered in previous work?
The main takeaway of our work is that the Linux kernel network stack is not as bad as it has been projected in previous work when it comes to hosting compute-intensive network functions like those found in the 5G packet core.
The Linux kernel stack manages to perform within 20-30% of the best performing kernel bypass stacks in our experiments, even though past research indicated that the gap would have been much higher. The main reason for this divergence from previous work is that CPU-intensive network functions like AUSF spend most of their time in the application logic and therefore do not stress the data path of the networking stacks very much. As a result, the optimizations employed by the kernel bypass stacks do not have a major impact on performance.
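A back-of-the-envelope calculation illustrates this reasoning. In the sketch below, only the quarter-million cycles of application logic comes from the AUSF figure quoted earlier. The per-request stack costs (`KERNEL_STACK_CYCLES`, `BYPASS_STACK_CYCLES`), the 5,000-cycle I/O-bound workload, and the 2 GHz clock are hypothetical numbers chosen for illustration, not measurements from our experiments.

```python
def requests_per_second(app_cycles, stack_cycles, cpu_hz=2.0e9):
    """Single-core throughput when each request costs app + stack cycles."""
    return cpu_hz / (app_cycles + stack_cycles)

def bypass_speedup(app_cycles, kernel_stack_cycles, bypass_stack_cycles):
    """Throughput of a bypass stack relative to the kernel stack,
    for the same application logic."""
    return (app_cycles + kernel_stack_cycles) / (app_cycles + bypass_stack_cycles)

# Hypothetical per-request stack costs (assumptions, not measured values).
KERNEL_STACK_CYCLES = 50_000
BYPASS_STACK_CYCLES = 10_000

# I/O-intensive function: very little application logic (assumed 5,000 cycles).
io_bound = bypass_speedup(5_000, KERNEL_STACK_CYCLES, BYPASS_STACK_CYCLES)

# Compute-intensive function: about a quarter-million cycles, as for the AUSF.
cpu_bound = bypass_speedup(250_000, KERNEL_STACK_CYCLES, BYPASS_STACK_CYCLES)

if __name__ == "__main__":
    print(f"I/O-bound speedup from bypass: {io_bound:.2f}x")
    print(f"CPU-bound speedup from bypass: {cpu_bound:.2f}x")
```

With these made-up costs, a fivefold cheaper stack yields roughly a 3.7x gain for the I/O-bound function but only about 1.15x for the compute-bound one, because the application cycles dominate the denominator in both cases.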
Furthermore, there are other trade-offs that we have to deal with when working with the various kernel bypass stacks:
- Kernel bypass stacks take some effort to set up, while the kernel stack just works out of the box.
- The novel network stacks use lightweight TCP/IP processing and are not always RFC compliant. In some cases, we found that the API is not consistent with the kernel stack, resulting in some effort in porting network functions across network stacks.
- These networking stacks do not support protocols like SCTP, which are necessary when working with 5G control plane network functions.
When comparing the performance of the novel network stacks, we found that the design of IX, based on the run-to-completion model, outperformed all other kernel bypass stacks we considered in our work. While mTCP had overheads on switching between the network stack and application threads, the processing of TAS was inefficient when the workload had frequent TCP connection setups and teardowns.
In hindsight, our results are probably obvious. However, they point to the need to revisit the debate of kernel bypass stacks vis-a-vis the good old Linux kernel stack, especially with more complex and realistic network functions of mobile telecommunications networks. For more details, please read our paper.
Acknowledgements: Priyanka Naik, Sahil Patki, Pranav Chaudhary, Mythili Vutukuru
Ashwin Kumar is a PhD student in the Computer Science department at Indian Institute of Technology, Bombay.