One of the most exciting sessions at the recent IETF 101 meeting, at least measured by the length of the queue at the microphone, was the discussion of the Latency Spin Bit in the QUIC Working Group.
The greater philosophical questions raised there have already been addressed on this blog. This article, in contrast, sets aside those questions and partial answers and digs into the technical details and what they might mean for the future of the transport protocol stack in the Internet.
The spin bit was proposed at the June 2017 interim meeting of the working group in Paris, because QUIC radiates far less information about its operation to devices on path than TCP does, as an explicit design goal. None of the information in TCP that can be used to measure round-trip-time (RTT) passively — sequence and acknowledgement numbers, as well as timestamps — is available to on-path devices with QUIC traffic. Unlike all previous transport protocols, QUIC splits the information it uses for its own operation from its wire image.
So, if passive measurability of RTT equivalent to TCP is a requirement for QUIC, then it’s necessary to add the spin bit, or a signal like it, back into this wire image.
But what is the spin bit, exactly? How does it work, how can it be used, and what is the potential for abuse?
The research we’ve been doing recently at ETH — in collaboration with a group of interested network operators, device vendors, and QUIC implementors — has looked at the details of this signal and how it works in different network conditions, and led to enhancements.
- The spin bit is a simple enhancement to QUIC that causes one bit in the header to ‘spin’, generating one edge (a transition from 0 to 1 or from 1 to 0) once per end-to-end RTT.
- Coupled with an additional two-bit signal, the Valid Edge Counter, it can provide a higher resolution of latency trouble spots in a network.
- This work represents the first step in a new way of thinking about network measurement and manageability.
The spin bit
The spin bit, as originally proposed, is a simple enhancement to QUIC that causes one bit in the header to ‘spin’, generating one edge (a transition from 0 to 1 or from 1 to 0) once per end-to-end RTT.
Any device on path can then measure the time (on its local clock) between these edges to generate one RTT sample per RTT for each flow in the general case. These RTT samples can then be aggregated by link or peer and analyzed to pinpoint latency trouble spots in a network.
Figure 1 — How the spin bit works.
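To make the observer side concrete, here is a minimal Python sketch of how an on-path device might turn the edges seen in one direction of a flow into RTT samples. The function name `rtt_samples` and the trace format, a list of (arrival time in milliseconds, spin bit) pairs, are assumptions made for illustration and are not taken from any QUIC implementation or measurement tool.

```python
from typing import Iterable, List, Optional, Tuple


def rtt_samples(packets: Iterable[Tuple[int, int]]) -> List[int]:
    """Return the time between consecutive spin edges seen in one direction.

    `packets` is a hypothetical trace of (arrival_time_ms, spin_bit) pairs,
    in the order the observer saw them on its local clock.
    """
    samples: List[int] = []
    last_spin: Optional[int] = None
    last_edge_time: Optional[int] = None
    for arrival_ms, spin in packets:
        if last_spin is not None and spin != last_spin:
            # The spin value changed: this packet carries an edge.
            if last_edge_time is not None:
                samples.append(arrival_ms - last_edge_time)
            last_edge_time = arrival_ms
        last_spin = spin
    return samples


# Hypothetical unimpaired flow with a 50 ms RTT.
trace = [(0, 0), (10, 0), (50, 1), (60, 1), (100, 0), (110, 0), (150, 1)]
samples = rtt_samples(trace)
```

For this trace the edges arrive at 50, 100 and 150 ms, so the observer obtains two samples of 50 ms each. The first edge only starts the clock; every later edge yields one sample.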
The algorithm generating this signal is quite simple:
- When a server sends a packet, it sets the spin bit to the spin bit on the last packet it received from the client.
- When a client sends a packet, it sets the spin bit to the inverse of the spin bit on the last packet it received from the server.
Figure 2 — The spin bit, unimpaired.
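The two rules above can be sketched in a few lines of Python. Everything around them is an assumption made to keep the example small: time advances in discrete ticks, each endpoint sends one packet per tick, the one-way delay is the same in both directions, the client sends a spin value of 0 until it hears from the server, and the server stays silent until the first client packet arrives. The names `client_spin`, `server_spin` and `simulate_client_spin` are hypothetical.

```python
from typing import List, Optional


def server_spin(last_from_client: int) -> int:
    """Server rule: reflect the spin bit last received from the client."""
    return last_from_client


def client_spin(last_from_server: Optional[int]) -> int:
    """Client rule: invert the spin bit last received from the server.

    Assumption: before anything has arrived, the client sends 0.
    """
    if last_from_server is None:
        return 0
    return 1 - last_from_server


def simulate_client_spin(one_way_delay: int, ticks: int) -> List[int]:
    """Spin bit values the client sends at each tick (one_way_delay >= 1)."""
    client_sent: List[int] = []
    server_sent: List[Optional[int]] = []  # None: server has not sent yet
    for t in range(ticks):
        # Packets sent at tick t - one_way_delay arrive at tick t.
        at_server = client_sent[t - one_way_delay] if t >= one_way_delay else None
        at_client = server_sent[t - one_way_delay] if t >= one_way_delay else None
        server_sent.append(None if at_server is None else server_spin(at_server))
        client_sent.append(client_spin(at_client))
    return client_sent


# One-way delay of 2 ticks, so the RTT is 4 ticks.
signal = simulate_client_spin(one_way_delay=2, ticks=12)
```

With a one-way delay of 2 ticks the client's signal is four 0s, four 1s, then four 0s: the bit changes value once every 4 ticks, which is exactly one edge per RTT.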
Like all simple algorithms, this one has some important limitations when it meets network conditions in the real world:
- Packet loss will tend to cause overestimates of RTT if a spin edge is lost.
- Reordering of a spin edge will cause drastic underestimates of RTT since it will cause multiple edges to be observed per RTT.
- Observation of the spin bit measures the fundamental frequency of a transport protocol, which is usually, but not always, the RTT. For example, when traffic is periodic with a period longer than RTT, the spin bit tends to measure this period instead.
Figure 3 — Reordering causes false RTT samples.
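The first two limitations are easy to reproduce with a naive observer that treats every change in the spin bit as an edge. The sketch below is illustrative only: the function `edge_intervals` and the three traces of (arrival time in milliseconds, spin bit) pairs are made up for this example, with a nominal RTT of 50 ms.

```python
from typing import List, Optional, Tuple


def edge_intervals(packets: List[Tuple[int, int]]) -> List[int]:
    """Naive observer: time between consecutive changes of the spin bit."""
    intervals: List[int] = []
    last_spin: Optional[int] = None
    last_edge: Optional[int] = None
    for arrival_ms, spin in packets:
        if last_spin is not None and spin != last_spin:
            if last_edge is not None:
                intervals.append(arrival_ms - last_edge)
            last_edge = arrival_ms
        last_spin = spin
    return intervals


# Unimpaired: edges arrive at 50, 100 and 150 ms.
clean = [(0, 0), (50, 1), (60, 1), (100, 0), (110, 0), (150, 1)]

# Reordering: a packet sent before the edge at 100 ms arrives late, at 105 ms,
# still carrying the old spin value.
reordered = [(0, 0), (50, 1), (60, 1), (100, 0), (105, 1), (110, 0), (150, 1)]

# Loss: the packet carrying the edge at 100 ms is dropped, so the observer
# first sees the new spin value on the next packet, at 120 ms.
lost_edge = [(0, 0), (50, 1), (60, 1), (120, 0), (150, 1)]

clean_rtts = edge_intervals(clean)
reordered_rtts = edge_intervals(reordered)
lost_rtts = edge_intervals(lost_edge)
```

The clean trace gives [50, 50]. The reordered trace gives [50, 5, 5, 40], where the two 5 ms values are the drastic underestimates described above. The trace with the lost edge gives [70, 30]: the late edge inflates one sample and shortens the next.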
The Valid Edge Counter
To address these concerns, we developed an additional signal alongside the spin bit called the Valid Edge Counter (VEC).
The VEC operates on the principle that each RTT sample is taken by subtracting the observation time of a left edge from the observation time of the right edge that follows it. Only right edges that are definitely in response to a given left edge can be used to generate valid RTT samples.
Figure 4 — The VEC.
The VEC is a two-bit counter, taking values from 0 to 3. When an impairment such as loss, reordering, or delay would cause the spin bit signal to generate an invalid RTT sample, it resets to a lower value, then counts back up to a higher value as valid edges are generated, hence the name. The algorithm for generating the VEC is the same on both client and server, and works as follows:
- The VEC is set to 0 by default.
- When an endpoint sends a packet containing a spin edge, it sets the VEC to the VEC of the last received edge plus 1, clamped to 3, with one exception:
- When an endpoint sends a packet containing a delayed spin edge, that is, a spin edge sent more than a specified timeout (for example, 1 ms) after the receipt of the corresponding spin edge from its peer, it sets the VEC to 1.
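As a rough sketch, the endpoint rules above fit in one small Python function. The function name, its parameters and the 1 ms default timeout are assumptions for illustration, and the sketch reads "set to 0 by default" as meaning that any packet that does not carry a spin edge is sent with a VEC of 0.

```python
def outgoing_vec(
    is_spin_edge: bool,
    last_received_edge_vec: int,
    delay_since_peer_edge_ms: float,
    timeout_ms: float = 1.0,
) -> int:
    """VEC value to put on an outgoing packet (hypothetical helper).

    is_spin_edge: whether this packet changes the spin bit this endpoint sends.
    last_received_edge_vec: VEC carried by the last edge received from the peer.
    delay_since_peer_edge_ms: time since that edge was received.
    """
    if not is_spin_edge:
        # Default: packets that are not edges carry a VEC of 0.
        return 0
    if delay_since_peer_edge_ms > timeout_ms:
        # Delayed edge: restart the count at 1.
        return 1
    # Normal edge: one more than the peer's edge, clamped to 3.
    return min(last_received_edge_vec + 1, 3)
```

After an impairment, the value climbs back as edges bounce between the endpoints: an edge received with a VEC of 0 is answered with 1, that one with 2, and from then on every prompt edge carries 3.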
Passive observers then use the following algorithm to generate RTT samples from the spin bit and the VEC:
- Apparent spin edges with a VEC of 0 are invalid and were caused by loss or reordering of a valid edge.
- Apparent spin edges with a VEC of 1 or 2 can be used as left edges, but not right edges.
- Apparent spin edges with a VEC of 3 can be used as left edges and right edges.
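The observer rules above can be sketched in the same way. This is a minimal illustration, not the code used in our experiments: the function `vec_rtt_samples` and the trace format, (arrival time in milliseconds, spin bit, VEC) tuples for one direction of a flow, are assumptions.

```python
from typing import Iterable, List, Optional, Tuple


def vec_rtt_samples(packets: Iterable[Tuple[int, int, int]]) -> List[int]:
    """RTT samples from (arrival_time_ms, spin_bit, vec) tuples."""
    samples: List[int] = []
    last_spin: Optional[int] = None
    left_edge_time: Optional[int] = None
    for arrival_ms, spin, vec in packets:
        apparent_edge = last_spin is not None and spin != last_spin
        last_spin = spin
        if not apparent_edge or vec == 0:
            # Not an edge, or an apparent edge marked invalid: ignore it.
            continue
        if vec == 3 and left_edge_time is not None:
            # Only a VEC of 3 may close a sample as a right edge.
            samples.append(arrival_ms - left_edge_time)
        # Any edge with a VEC of 1, 2 or 3 can open the next sample.
        left_edge_time = arrival_ms
    return samples


# Hypothetical trace with a 50 ms RTT. A reordered packet at 105 ms creates
# two apparent edges with a VEC of 0, and the edge at 150 ms carries a VEC
# of 2, so it may only be used as a left edge.
trace = [
    (0, 0, 0), (50, 1, 3), (100, 0, 3),
    (105, 1, 0), (110, 0, 0),
    (150, 1, 2), (200, 0, 3),
]
samples = vec_rtt_samples(trace)
```

For this trace the observer reports two samples of 50 ms (from 50 to 100 ms and from 150 to 200 ms). The spurious 5 ms intervals around the reordered packet are discarded, and no sample is closed by the VEC 2 edge.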
Our experimentation with the VEC in a variety of emulated network conditions shows that it handles loss, reordering, and delay as designed, even in network conditions so poor that they lead to effective connectivity failure. Details are given in Piet De Vaere’s recently published Master’s thesis.
Our current work is focused on scaling this experimentation out, working with implementers of QUIC and network measurement devices to gain experience with spin bit and VEC on real networks and production workloads.
The risk-utility tradeoff
The spin bit and VEC were designed to be a minimal-risk, maximum-utility signal fit for a single purpose: on-path measurement of end-to-end RTT, to generate RTT samples for a variety of passive latency measurement tasks. As such, we believe it to be a nearly ideal explicit path signal, meeting each of the principles for measurability in protocol design recently proposed by Allman et al. In other words, if it is at all possible to safely design passive measurability of any metric explicitly into a protocol, this signal represents how to do it.
In addition, each endpoint is in control of how much it participates in the generation of the signal. The signal requires the cooperation of both endpoints to appear: an endpoint can simply choose to always set the spin bit to zero in order to disable it. The VEC gives even finer-grained control, allowing an endpoint to probabilistically set the VEC to 0 or 1 in order to reduce the RTT sampling rate available from its flows.
Nevertheless, most of the spirited discussion of the spin bit has focused on the risk side of this equation. The risks can be divided into those inherent in passive RTT measurement, and those of unintentional information radiation from the spin bit and VEC themselves.
Privacy and operational risk in Internet latency data
In response to the first question posed to the QUIC RTT design team at the IETF 99 meeting in Prague, we looked into geoprivacy risks posed by increasing the availability of RTT data and concluded (in an expanded study just published at PAM 2018) that these were indeed negligible. The various sources of random noise in Internet RTT overwhelm propagation delay relatively rapidly, so only relatively rare, very short RTT samples are at all useful for narrowing location beyond what is available even in low-resolution freely-available GeoIP databases.
In any case, RTT between any two points in the network is not at all secret: someone controlling an endpoint near one of the endpoints of interest can merely use one of a wide and growing variety of active measurement techniques. Discussion about all of the sorts of nefarious things that network operators can get up to with your RTT data tends to miss (or ignore) this fact.
There’s a more interesting question here: what can the spin bit and VEC be used for, and abused for, beyond RTT information?
It has been pointed out that the spin bit works like an alternate marking signal, and can be used by measurement infrastructure deployed at two points along a path to measure one-way delay and loss. The VEC, by encoding information about loss and reordering experienced by spin bit edges, might also be useful for inferring loss and reordering rates for the whole flow — this is an area that we’re actively looking into at the moment.
But both of these are examples of additional information made available about the network path by the signal, not additional information about the endpoints themselves. Digging further into this question led us to reframe it: what is the difference in unintentional radiation from this explicit signal as compared to the state of the art?
As it turns out, TCP timestamps, the current best source of passive RTT information, are pretty noisy themselves. They can be used to fingerprint hosts based on the fact that they expose clock drift, a node-linked physical property. Discontinuities in the timestamp sequence can also be used to detect host or infrastructure reboots.
Simply by virtue of not exposing information derived from an internal interrupt counter, the spin bit doesn’t present these risks of unintentional radiation. In this aspect, at least, moving to a smaller explicit signal like the spin bit would appear to be a net positive for reducing unintentional radiation of endpoint information.
Though we’ve defined a purpose-built, minimal-risk signal that can provide accurate latency information to passive measurement for QUIC, we’re not done yet.
Our experience to date is limited to emulated network impairments, and while spin bit code exists for a number of QUIC implementations, wider adoption is necessary to support the kinds of large-scale experimentation that will provide us with enough information to make a risk-utility decision as a community.
This work also represents the first step in a new way of thinking about network measurement and manageability. QUIC may well replace TCP for a significant portion of Internet traffic, bringing with it the privacy, security, and flexibility benefits of a fully encrypted transport, and the opportunity to replace inference based on implicit signals (as we currently have with TCP timestamps) with explicit measurability.
More usable, less abusable signals for passive measurement have the potential to improve both Internet research and day-to-day network operations.
Acknowledgements: Thanks to Piet De Vaere for the graphics used in this article.
Brian Trammell is a Senior Researcher at ETH Zürich whose work focuses on Internet measurement and Internet architecture.