Fast UDP I/O for Firefox in Rust

28 Nov 2025

Category: Tech matters



A fox. Image by Kev from Pixabay.

For decades now, the boundary of ‘where work is done’ in a computer has shifted from inside the kernel (the operating system overlord) to inside the specific user process trying to achieve some task.

Networking is no exception: much of the ‘work’ of a network protocol has traditionally happened at the kernel level, and it remains an open question to what extent that work should instead be handled in user processes.

Where computers ‘do work’

User processes delegate work to the kernel through system calls. Each call traps into the kernel: the running program is suspended while the operating system services the request and then hands control back. These crossings of the user/kernel boundary can introduce significant overhead, particularly for multi-threaded programs, where synchronization costs add to the price of each transition and slow overall execution.
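To make the cost concrete for the networking case this article is concerned with, here is a minimal Rust receive loop (standard library only; the address and buffer size are arbitrary choices for illustration). Every iteration makes one system call, and therefore one trip across the user/kernel boundary, for every single datagram it reads.

use std::net::UdpSocket;

fn main() -> std::io::Result<()> {
    // An arbitrary local address, for illustration only.
    let socket = UdpSocket::bind("127.0.0.1:9000")?;
    let mut buf = [0u8; 1500];

    // Every pass through this loop is a separate recvfrom() system call:
    // the thread traps into the kernel, waits for a datagram, and gets
    // control back with at most one packet's worth of data.
    loop {
        let (len, peer) = socket.recv_from(&mut buf)?;
        println!("got {len} bytes from {peer}");
    }
}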

Do these costs really matter?

At one level, no. There’s usually ample CPU capacity available, and most end-user systems sit idle for 99% of the time. But at another level, it still matters. Doing work efficiently, in the right place, and with the best possible security and energy-consumption outcomes remains important.

This has led to a shifting boundary in the delineation of where work happens. There have been efforts to streamline the kernel and remove surplus code from it, and almost ‘reverse’ efforts to look at work done in user-space programs and move it back inside the kernel to reduce the movement of data and points of blockage.

A case in point is the emergence of increasingly smart ‘line cards’, which now come with significant memory and high-speed Field Programmable Gate Array (FPGA) processors, designed to do nothing except process data coming off and going onto the network. This is often called ‘offload’ processing: the processing is done on the line card instead of on the host CPU.

Queues as pointers

Emerging alongside this move to high-speed firmware ‘code-on-card’ are ways to pass entire queues of data as pointers: instead of copying the data itself across the boundary, the application passes a place, or a list of places, where the data can be found. With this mechanism, a sequence of system calls, each with the blocking consequences described above, can be replaced by a single call that points to a vector describing a batch of work, such as a list of datagrams to send or a set of repeated read() operations. By taking the queue of data directly from the line card, this approach also avoids much of the cost of marshalling and moving the data.
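As a concrete illustration of the batching idea (a sketch only, not the code Firefox uses), the following Linux-specific Rust program, which assumes the libc crate, contrasts a one-system-call-per-datagram send loop with a single sendmmsg() call that hands the kernel a vector of message headers describing the whole batch. The addresses, payload sizes, and batch size are arbitrary.

use std::net::UdpSocket;
use std::os::unix::io::AsRawFd;

fn main() -> std::io::Result<()> {
    // A local receiver so the sends have somewhere to go.
    let receiver = UdpSocket::bind("127.0.0.1:0")?;
    let sender = UdpSocket::bind("127.0.0.1:0")?;
    sender.connect(receiver.local_addr()?)?;

    // Eight illustrative 1,200-byte datagrams.
    let payloads: Vec<Vec<u8>> = (0..8u8).map(|i| vec![i; 1200]).collect();

    // Naive path: one send() system call, one kernel crossing, per datagram.
    for p in &payloads {
        sender.send(p)?;
    }

    // Batched path: describe each datagram with an iovec, wrap the iovecs in
    // mmsghdr entries, and pass the whole queue to the kernel in one call.
    let mut iovecs: Vec<libc::iovec> = payloads
        .iter()
        .map(|p| libc::iovec {
            iov_base: p.as_ptr() as *mut libc::c_void,
            iov_len: p.len(),
        })
        .collect();
    let mut msgs: Vec<libc::mmsghdr> = iovecs
        .iter_mut()
        .map(|iov| {
            let mut m: libc::mmsghdr = unsafe { std::mem::zeroed() };
            m.msg_hdr.msg_iov = iov as *mut libc::iovec;
            m.msg_hdr.msg_iovlen = 1;
            m
        })
        .collect();

    let sent = unsafe {
        libc::sendmmsg(
            sender.as_raw_fd(),
            msgs.as_mut_ptr(),
            msgs.len() as libc::c_uint,
            0,
        )
    };
    if sent < 0 {
        return Err(std::io::Error::last_os_error());
    }
    println!("one sendmmsg() call queued {sent} datagrams");
    Ok(())
}

recvmmsg() is the receive-side analogue, returning a whole queue of incoming datagrams from a single call.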

Fast UDP input-output in Firefox

This write-up of Firefox’s implementation of fast User Datagram Protocol (UDP) input/output (I/O), written in the Rust programming language, discusses what this looks like in practice and what impact it has on the system call costs of running UDP.

Why does this matter?

Previously, we lived in a mostly TCP world, with HTTP, HTTPS, and TLS built directly on TCP/IP sockets. Now we’re in a ‘QUIC-often’ world, where QUIC has effectively become the new TCP — and goes well beyond it.

QUIC allows multiple independent streams of data to exist side-by-side in one connection. And because the protocol carries its own connection identifiers, it can be agile across changes at the endpoints, picking up a session on a different network path from the one on which it was initiated.

QUIC has many more advantages, including being well secured by TLS and well supported by modern web browsers and by servers such as NGINX and Caddy. The protocol itself runs over UDP, an unreliable datagram protocol that sits at the same level of the IP stack as TCP. UDP does not provide reliability, flow control, or congestion control; that functionality is layered on top of UDP by QUIC.

Firefox has implemented an improved method of doing I/O over UDP, using the new kernel models of coalesced reads and writes, efficient vectors pointing at data, and the hooks to exploit on-card processing. The result is that Firefox now uses the fastest, lowest-cost approaches to sending data for your QUIC web sessions.
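As a rough sketch of what one of these kernel hooks looks like (an illustration of the general technique, not Firefox’s actual code), the Linux-specific Rust fragment below, again assuming the libc crate and a kernel with UDP generic segmentation offload support, sets the UDP_SEGMENT socket option so that one large send is cut into wire-sized datagrams by the kernel, or by the network card itself, rather than by the application. The 1,200-byte segment size and the payload are arbitrary.

use std::net::UdpSocket;
use std::os::unix::io::AsRawFd;

fn main() -> std::io::Result<()> {
    let receiver = UdpSocket::bind("127.0.0.1:0")?;
    let sender = UdpSocket::bind("127.0.0.1:0")?;
    sender.connect(receiver.local_addr()?)?;

    // Ask the kernel to slice outgoing buffers into 1,200-byte datagrams.
    let segment_size: libc::c_int = 1200;
    let rc = unsafe {
        libc::setsockopt(
            sender.as_raw_fd(),
            libc::SOL_UDP,
            libc::UDP_SEGMENT,
            &segment_size as *const libc::c_int as *const libc::c_void,
            std::mem::size_of_val(&segment_size) as libc::socklen_t,
        )
    };
    if rc != 0 {
        return Err(std::io::Error::last_os_error());
    }

    // Four packets' worth of data handed over as a single buffer: one system
    // call from the application, four datagrams on the wire.
    let batch = vec![0u8; 1200 * 4];
    sender.send(&batch)?;

    // The receiver sees ordinary, individually delivered datagrams.
    let mut buf = [0u8; 2048];
    let first = receiver.recv(&mut buf)?;
    println!("first datagram carried {first} bytes");
    Ok(())
}

The matching UDP_GRO option works on the receive side, letting the kernel or the card coalesce a burst of incoming datagrams into one large buffer that the application reads with a single call.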

It is a significant achievement, and it puts Firefox back into the same class of service as Chrome, where much of the QUIC protocol was first exposed. That is especially impressive given that QUIC was developed at Google before being brought into the IETF standards process, which gave Chrome, to some extent, a ‘first to market’ advantage. Not so much any more.
