QUIC (Quick UDP Internet Connections) is a transport protocol initially proposed by Google, which implements TCP-like properties at the application layer atop a UDP transport.
Since its introduction, the protocol has undergone rapid development (currently at version 43!) and has been deployed by companies such as Google and Akamai, with more than 20 implementations in progress, including for Microsoft, Mozilla, Verizon, and Facebook.
Google has the largest QUIC deployment, and has reported that more than 85% of requests from Chrome browsers to Google servers (about 90% of Chrome bytes received) are now using QUIC, which accounts for 7% of Internet traffic!
Google-reported performance for QUIC is promising: a 3% page load time (PLT) improvement on Google search and an 18% reduction in buffer time on YouTube. However, these are aggregated statistics that others (such as ourselves) cannot reproduce. Other QUIC evaluations by independent researchers use limited tests in limited environments and networks, and do not provide root cause analysis to help us understand the performance results.
In a recent paper published at the ACM Internet Measurement Conference 2017, we worked to address these issues, and provided a comprehensive evaluation of QUIC’s performance and how it compares with TCP. This post shares some highlights from our study.
Page load time performance
In Figures 1, 2, and 3, we plot the performance difference between QUIC and TCP (in percentage), with each cell representing a different link capacity and object size. Each cell uses the same latency and loss settings (described in the caption); cells with red colours indicate that QUIC is faster than TCP and blue indicates that TCP outperforms QUIC. Darker colours indicate larger performance differences, and white cells indicate no statistically significant difference between QUIC and TCP.
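To make the meaning of a single heatmap cell concrete, here is a minimal sketch of how one cell could be computed from repeated PLT measurements. It is not the code used in the study: the function names, the significance level `alpha`, and the normal approximation used for the p-value are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def compare_plt(quic_ms, tcp_ms, alpha=0.01):
    """Return (percent_diff, significant) for one heatmap cell.

    percent_diff > 0 means QUIC loaded the page faster than TCP.
    alpha is an assumed significance level, not a value from the post.
    """
    q_mean, t_mean = mean(quic_ms), mean(tcp_ms)
    percent_diff = (t_mean - q_mean) / t_mean * 100.0
    # Welch-style standard error: the two samples may have unequal variances.
    se = sqrt(variance(quic_ms) / len(quic_ms) + variance(tcp_ms) / len(tcp_ms))
    if se == 0:
        return percent_diff, q_mean != t_mean
    z = (t_mean - q_mean) / se
    # Assumption: a normal approximation stands in for the t distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return percent_diff, p_value < alpha


def cell_colour(percent_diff, significant):
    """Map a result to the colour convention used in the figures."""
    if not significant:
        return "white"
    return "red" if percent_diff > 0 else "blue"
```

A cell would then be coloured by calling `cell_colour(*compare_plt(quic_samples, tcp_samples))`, with the magnitude of `percent_diff` setting how dark the colour is.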
Figure 1: QUIC outperforms TCP under a variety of scenarios.
We found that compared to TCP, QUIC is able to improve the PLTs under various network conditions (Figure 1). However, we observed that QUIC performs significantly worse than TCP when the network reorders packets (Figure 2).
Upon investigating the QUIC code, we found that in the presence of packet reordering, QUIC falsely infers that packets have been lost, while TCP detects packet reordering and increases its NACK threshold.
Figure 3 shows how QUIC can benefit from such a mechanism if integrated into the protocol, as QUIC begins to outperform TCP once the NACK threshold increases past 30.
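The effect of a fixed threshold under reordering can be illustrated with a toy model. The sketch below is a simplification and not QUIC's actual loss detection code: it assumes a packet is declared lost once `nack_threshold` packets with higher numbers have arrived ahead of it, and that no packet is ever really dropped, so every declared loss is a false one. The function names and the example values are hypothetical.

```python
def reordered_arrivals(n, every, depth):
    """Arrival order of n packets where every `every`-th packet is
    overtaken by the `depth` packets sent after it (assumes every > depth)."""
    def arrival_key(pkt):
        return pkt + depth + 0.5 if pkt % every == 0 else pkt
    return sorted(range(n), key=arrival_key)


def spurious_losses(arrival_order, nack_threshold):
    """Count packets declared lost even though they arrive later."""
    arrived = set()
    declared_lost = set()
    for pkt in arrival_order:
        arrived.add(pkt)
        # Check every lower-numbered packet that is still missing.
        for missing in range(pkt):
            if missing in arrived or missing in declared_lost:
                continue
            later = sum(1 for p in arrived if p > missing)
            if later >= nack_threshold:
                declared_lost.add(missing)
    return len(declared_lost)


order = reordered_arrivals(n=50, every=10, depth=5)
print(spurious_losses(order, nack_threshold=3))  # threshold below reordering depth
print(spurious_losses(order, nack_threshold=6))  # threshold above reordering depth
```

In this model, a threshold smaller than the reordering depth flags every delayed packet as lost, while a threshold larger than the depth flags none, which is the intuition behind adapting the threshold when reordering is detected.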
Figure 4 shows our testbed for QUIC evaluations, which consisted of a client running a Chrome browser that downloaded an HTTP page from our server, which supports both QUIC and TCP. We ran tests with both transport protocols multiple times back-to-back and then compared the PLTs of the two protocols for pages with different object counts and sizes. To explore the impact of network conditions, we repeated the tests under a variety of conditions emulated using Linux’s netem and tc tools.
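As a rough idea of what such emulation looks like, the sketch below builds a `tc` command line that attaches a netem qdisc with a given delay, loss rate, and rate limit. The helper name, the interface name, and the parameter values are hypothetical, and the exact commands used in the study may well have differed. The function only builds the argument list; running it requires root privileges on a Linux host.

```python
def netem_command(interface, delay_ms, loss_pct, rate_mbit):
    """Build a `tc qdisc replace ... netem` argument list (not executed here)."""
    return [
        "tc", "qdisc", "replace", "dev", interface, "root", "netem",
        "delay", f"{delay_ms}ms",
        "loss", f"{loss_pct}%",
        "rate", f"{rate_mbit}mbit",
    ]


# Illustrative values only; "eth0" is an assumed interface name.
cmd = netem_command("eth0", delay_ms=50, loss_pct=1, rate_mbit=10)
print(" ".join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would apply the settings. Sweeping `rate_mbit` across runs gives one axis of a heatmap like those above.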
Performance on mobile devices
Due to QUIC’s implementation in userspace, resource contention might negatively impact performance independent of the protocol’s optimizations for transport efficiency. To test whether this is a concern in practice, we evaluated an increasingly common resource-constrained deployment environment: smartphones.
We used the same measurement approach described above with two popular Android phones: Nexus 6 and MotoG. We found that, similar to the desktop environment, in mobile environments, QUIC outperforms TCP in most cases. Its advantages, however, diminish across the board and at times are not statistically significant.
Figure 5: QUIC performance improvements diminish or disappear on mobile devices (compared to Figure 1).
To understand why this is the case, we investigated QUIC’s congestion control code to infer QUIC’s state machine and how much time is spent in each state. We did this both for mobile and non-mobile scenarios under the same network conditions.
We found that on mobile devices, QUIC spends most of its time (58%) in the ‘Application Limited’ state, meaning that the sender paused the transfer while waiting for the receiver to process packets. In the desktop scenario, this occurs only 7% of the time. We believe the reason for this behaviour is that QUIC runs in a userspace process, whereas TCP runs in the kernel. As a result, on a phone QUIC is unable to consume received packets as quickly as on a desktop, leading to suboptimal performance, particularly when there is ample bandwidth available.
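The share of time spent in each state can be derived from a log of state transitions. The following is a minimal sketch assuming such a log is available as `(timestamp, state)` pairs; the log format and all state names other than ‘Application Limited’ are hypothetical and not taken from the QUIC code.

```python
from collections import defaultdict


def state_time_fractions(transitions, end_time):
    """Fraction of the trace spent in each state.

    transitions: list of (timestamp_seconds, state_name), sorted by time.
    end_time: timestamp at which the trace ends.
    """
    totals = defaultdict(float)
    following = transitions[1:] + [(end_time, None)]
    for (start, state), (next_start, _) in zip(transitions, following):
        totals[state] += next_start - start
    duration = end_time - transitions[0][0]
    return {state: spent / duration for state, spent in totals.items()}


# Made-up trace for illustration; these are not measurements from the study.
trace = [
    (0.0, "SlowStart"),
    (1.0, "CongestionAvoidance"),
    (3.0, "ApplicationLimited"),
    (4.0, "CongestionAvoidance"),
]
print(state_time_fractions(trace, end_time=10.0))
```

Comparing the resulting fractions for a phone and a desktop under identical network settings is what surfaces a difference like the 58% versus 7% reported above.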
Fairness
An essential property of transport-layer protocols is that they do not consume more than their fair share of bottleneck bandwidth resources. An unfair protocol may cause performance degradation for competing flows.
We expected that QUIC and TCP should be relatively fair to each other in our tests because they both use the Cubic congestion control algorithm. However, we found this is not the case (Figure 6): when QUIC competes with TCP flows, it prevents TCP from getting its fair share of the bottleneck bandwidth. In fact, our experiments showed that QUIC always consumes more than half of the bottleneck bandwidth, even as the number of competing TCP flows increases.
Figure 6: QUIC persistently consumes more than its fair share of bottleneck bandwidth, even in the presence of multiple TCP flows.
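For reference, with one QUIC flow and n TCP flows sharing a bottleneck, an equal split gives each flow 1/(n+1) of the bandwidth. The sketch below computes each flow's share along with Jain's fairness index, a standard summary metric that we add here for illustration and that does not appear in the post. The throughput numbers are invented and are not results from the study.

```python
def bandwidth_shares(throughputs_mbps):
    """Fraction of total throughput obtained by each flow."""
    total = sum(throughputs_mbps.values())
    return {flow: rate / total for flow, rate in throughputs_mbps.items()}


def jain_index(rates):
    """Jain's fairness index: 1.0 when all rates are equal, 1/n at worst."""
    rates = list(rates)
    return sum(rates) ** 2 / (len(rates) * sum(r * r for r in rates))


# Invented example: one QUIC flow competing with two TCP flows.
flows = {"quic": 3.0, "tcp1": 1.0, "tcp2": 1.0}
shares = bandwidth_shares(flows)
fair_share = 1 / len(flows)
print(shares["quic"], fair_share, jain_index(flows.values()))
```

In this invented case the QUIC flow holds 60% of the link against a fair share of about 33%, which is the kind of gap the figure above describes.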
To understand why QUIC is unfair to TCP, we examined the QUIC source code and used TCP probe to extract congestion window information. The larger the congestion window, the more bytes QUIC or TCP can have in flight, and thus the larger the sustained throughput.
We observed that when competing with TCP, QUIC is able to achieve a larger congestion window.
Taking a closer look at the congestion window changes (Figure 7), we found that while both protocols use Cubic, QUIC increases its window more aggressively (both in terms of slope, and in terms of more frequent window size increases). As a result, QUIC is able to grab available bandwidth faster than TCP does, leaving TCP unable to acquire its fair share of the bandwidth.
Figure 7: QUIC unfairness is in part derived from sustaining substantially larger congestion windows than TCP.
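To show how two Cubic senders can differ in slope, the sketch below implements the window growth function from the CUBIC specification (RFC 8312), W(t) = C(t - K)^3 + W_max with K = cbrt(W_max(1 - beta)/C), using the RFC's default constants C = 0.4 and beta = 0.7. It is only an illustration of the algorithm's shape: this post does not identify which parameters or behaviours differ between QUIC's and the kernel's Cubic, so the larger `c` used in the comparison is a purely hypothetical stand-in for "more aggressive" growth.

```python
def cubic_k(w_max, c=0.4, beta=0.7):
    """Seconds needed to grow back to w_max after a loss event."""
    return (w_max * (1.0 - beta) / c) ** (1.0 / 3.0)


def cubic_window(t, w_max, c=0.4, beta=0.7):
    """Congestion window (in segments) t seconds after a loss event."""
    k = cubic_k(w_max, c, beta)
    return c * (t - k) ** 3 + w_max


w_max = 100.0
for t in (0.0, 1.0, 2.0, 3.0):
    default = cubic_window(t, w_max)
    # Hypothetical larger scaling constant, for comparison only.
    aggressive = cubic_window(t, w_max, c=0.8)
    print(f"t={t:.0f}s default={default:.1f} aggressive={aggressive:.1f}")
```

Both curves start from beta * W_max right after the loss, but the one with the larger constant returns to W_max sooner, so it holds a larger window on average when the two compete.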
Next steps: how to further optimize QUIC
In this post, we highlighted several interesting findings from our study of the QUIC protocol. While it outperforms TCP in a wide range of scenarios, we found that it underperforms in the presence of packet reordering and on resource-constrained mobile devices.
Further, we found that QUIC consumes significantly more than its fair share of bottleneck bandwidth when competing with TCP flows, which can be detrimental to a wide range of applications.
As part of ongoing work, we are investigating the origin of this unfairness, how to address it, and how to further optimize the QUIC protocol.
Editor’s note: This research has been awarded a 2018 Applied Networking Research Prize.
Arash Molavi Kakhki is an Internet Measurement Researcher at ThousandEyes, where he analyzes a wide-ranging array of network events, long-term trends, benchmark service providers, Internet-wide outages, and overall network health. Prior to joining ThousandEyes, Arash was at Northeastern University, where he obtained his PhD focusing on performance and policy impacts of transport protocols and in-network devices. This post is based on his work while at Northeastern University.