On the interplay between TLS certificates and QUIC performance

By on 16 Jan 2023




The QUIC protocol (RFC 9000) was designed to improve web performance and reduce access latency while keeping communication confidential. A central technique is to cut the number of round trips needed for connection setup by integrating the QUIC handshake with the TLS 1.3 handshake and by coalescing multiple QUIC packets into one UDP datagram.

In a recent paper, my fellow researchers from Freie Universität Berlin, The Fraunhofer Institute for Open Communication Systems, and HAW Hamburg and I revisited QUIC connection setup performance. We analysed over 1M web domains with 272k QUIC-enabled services and found some worrying results. This post will summarize those findings and suggest some design choices for fast and secure connections to common web deployments.

The QUIC handshake process is depicted in Figure 1. The ideal QUIC handshake takes one round trip time (1-RTT). To prevent reflective amplification attacks, a server must not reply with more bytes than the QUIC anti-amplification factor allows until the client IP address is verified. RFC 9000 limits the data size from the server to 3× the bytes received in the client Initial. Common client Initial sizes are 1,250 bytes for Chromium-based browsers and 1,350 bytes for Firefox. A server response that does not fit into this budget has to wait for further client packets, which prolongs the handshake to multiple round trips (multi-RTT).

Figure 1 — In QUIC handshakes, server replies are limited to 3× the size of the client initial until the client is verified. This can lead to multi-RTT handshakes.
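The budget arithmetic described above can be written down in a few lines. This is a minimal sketch, not taken from the paper: the function names are hypothetical, and it assumes the client has sent exactly one Initial datagram of the given size (RFC 9000 actually counts all bytes received from the unverified address).

```python
AMPLIFICATION_FACTOR = 3  # anti-amplification factor from RFC 9000


def server_budget(client_initial_bytes: int) -> int:
    """Bytes a server may send before the client address is verified.

    Simplification: assumes a single client Initial datagram was received.
    """
    return AMPLIFICATION_FACTOR * client_initial_bytes


def fits_in_first_flight(client_initial_bytes: int, server_reply_bytes: int) -> bool:
    """True if the whole server reply fits into the first round trip."""
    return server_reply_bytes <= server_budget(client_initial_bytes)


# Client Initial sizes quoted in the text.
print(server_budget(1250))  # 3750 (Chromium-based browsers)
print(server_budget(1350))  # 4050 (Firefox)
```

A reply larger than the budget leaves a compliant server two options: wait for more client packets (multi-RTT) or validate the address first with a RETRY.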

QUIC handshakes in the wild are … disappointing

Figure 2 shows the absolute number of handshake types for all QUIC-reachable names from the Tranco 1M top domain list, depending on the client’s Initial size. For a size of 1,352 bytes (similar to Firefox), we found that 61% of handshakes exceed the anti-amplification limit during the first RTT, while 38% require multiple RTTs. Although QUIC has been designed for efficient handshakes, ideal 1-RTT handshakes are very rare. RETRY messages are a lightweight mechanism to verify client addresses at the cost of an additional round trip.

Figure 2 — Influence of client initial sizes on the QUIC handshake. For QUIC-reachable domains from the Tranco 1M list, we found almost no ideal handshakes (1-RTT) in different initial sizes.

Let’s look at the two main reasons for multi-RTT and amplification next.

Large non-leaf TLS certificates impede QUIC performance

During our study, we found that TLS certificate chains with large, multiple non-leaf certificates contribute substantially to the TLS data size exchanged.

Figure 3 shows the top 10 certificate chains deployed. White boxes represent the sizes of the non-leaf certificates in each chain, while yellow and orange boxes represent the median and the largest leaf certificate size that we observed for that chain, respectively. Dotted lines mark the maximum reply sizes allowed to a server for common client Initial sizes. We find that average-sized certificate chains are likely to exceed the QUIC anti-amplification limit.

Figure 3 — Certificate chain sizes, depths, and their dependency. Average-sized certificate chains are likely to exceed QUIC amplification limits.
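Operators can run a rough version of this comparison on their own certificate bundle. The sketch below is an illustration rather than the measurement method of the paper: it sums the DER sizes of all certificates in a PEM bundle and compares the total with the 3× budget. The helper names and the `overhead_bytes` parameter (a stand-in for the remaining handshake data such as ServerHello, CertificateVerify, and packet headers) are assumptions.

```python
import base64
import re
from typing import List

PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", re.DOTALL
)


def chain_der_sizes(pem_bundle: str) -> List[int]:
    """Return the DER size in bytes of each certificate in a PEM bundle."""
    sizes = []
    for body in PEM_BLOCK.findall(pem_bundle):
        # Strip line breaks, then decode the base64 payload back to DER.
        der = base64.b64decode("".join(body.split()))
        sizes.append(len(der))
    return sizes


def chain_fits(pem_bundle: str, client_initial_bytes: int, overhead_bytes: int = 0) -> bool:
    """Check the chain (plus assumed overhead) against the 3x budget."""
    budget = 3 * client_initial_bytes
    return sum(chain_der_sizes(pem_bundle)) + overhead_bytes <= budget
```

Calling `chain_fits(open("fullchain.pem").read(), 1250)` (the file name is hypothetical) gives a first indication for Chromium-sized Initials. A chain that only just fits will likely exceed the limit in practice once the other handshake messages and padding are added.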

Cloudflare’s CDN-centered deployment explains amplification

Based on TLS information and additional IP prefix mapping, we found that 96% of the amplification handshakes are completed with Cloudflare servers. This is surprising because Figure 3 shows that Cloudflare certificates stay below the anti-amplification limit. The excess bytes instead stem from not using packet coalescence at two levels:

  1. Initial packets are sent separately, leading to two UDP datagrams. The first contains the ACK and the second the ServerHello, and both are padded.
  2. We also do not observe Initial and Handshake messages being coalesced.

Cloudflare explains that it exceeds the limit to improve client performance. Specifically, the information needed to populate the ServerHello depends on certificates that may be managed separately from connection termination and may be unavailable when the client Initial arrives. The delay to fetch the certificates would otherwise inflate the client's RTT estimate. Cloudflare mitigates this by immediately responding to the client Initial with an ACK, as shown in Figure 4.

Figure 4 — QUIC handshake behaviour as observed for Cloudflare. For deployments with external certificate management, there is a trade-off between packet coalescence (and hence fewer padding bytes) and correct RTT estimation by the client (due to delay Δ).
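The cost of missing coalescence can be illustrated with a small byte count. The sketch below assumes that every UDP datagram carrying an Initial packet is padded to at least 1,200 bytes; the packet sizes in the example are made-up illustrative values, not measurements from our study, and the function name is hypothetical.

```python
MIN_INITIAL_DATAGRAM = 1200  # assumed padding target for datagrams carrying an Initial


def wire_bytes(datagrams):
    """Sum the bytes on the wire for a list of (payload_bytes, carries_initial)."""
    total = 0
    for payload, carries_initial in datagrams:
        if carries_initial:
            # Padding bytes count towards the anti-amplification limit.
            total += max(payload, MIN_INITIAL_DATAGRAM)
        else:
            total += payload
    return total


# Illustrative sizes only: 50-byte ACK, 150-byte ServerHello,
# 2,400 bytes of Handshake data (certificates etc.).
separate = [(50, True), (150, True), (1200, False), (1200, False)]
coalesced = [(50 + 150 + 1000, True), (1200, False), (200, False)]

print(wire_bytes(separate))   # 4800, above the 3,750-byte budget for a 1,250-byte Initial
print(wire_bytes(coalesced))  # 2600, well within the budget
```

Both layouts carry the same 2,600 bytes of useful data. Under these assumptions, the difference of 2,200 bytes is pure padding.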

Examining the potential of reflective amplification attacks

Our previous analysis considered the behaviour of servers when handshakes complete successfully. Now, we’ll consider the case when a client fails to react to the server’s responses, which causes the server to retransmit data. Since the client IP address is not verified, the first flight and all retransmissions together must stay within the 3× amplification limit. Such retransmissions occur, for example, when malicious actors initiate a handshake with a spoofed IP address. All of this traffic, including the retransmissions, is then reflected to the spoofed address, which belongs to a victim.

We conducted active scans with a client that sends a single QUIC Initial but does not acknowledge the response. Worryingly, we found that responses (together with retransmissions) from Meta servers lead to amplification factors of up to 28×. IP addresses related to Instagram and WhatsApp exhibited the highest amplification factors. After our responsible disclosure, Meta reduced the amplification potential to 5×, as compared in Figure 5.

Figure 5 — Mean amplification factors for Meta services observed at all points of presence. We see a significant improvement after the responsible disclosure of our results. The amplification factor is still slightly above the allowed limit.
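The amplification factor reported here is the ratio of all bytes a server sends, retransmissions included, to the bytes the silent client sent. A minimal sketch of that calculation follows; the function name and the datagram sizes in the example are hypothetical and are not data from our scans.

```python
def amplification_factor(bytes_from_client: int, server_datagram_sizes) -> float:
    """Ratio of bytes sent by the server to bytes received from the client.

    server_datagram_sizes must include every datagram the server sent,
    retransmissions and padding included.
    """
    if bytes_from_client <= 0:
        raise ValueError("client must have sent at least one byte")
    return sum(server_datagram_sizes) / bytes_from_client


# Hypothetical trace: one 1,250-byte client Initial, a first flight of
# three 1,200-byte datagrams, retransmitted twice by the server.
first_flight = [1200, 1200, 1200]
observed = first_flight * 3

factor = amplification_factor(1250, observed)
print(factor > 3)  # True: the retransmissions push the total past the 3x limit
```

In this example the first flight alone stays within the budget. Only the sum over all retransmissions reveals the violation, which is why a single-response measurement can underestimate the amplification potential.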

Guidance for stakeholders

Current deployments of QUIC struggle to meet the protocol's design goals of low latency and low amplification potential. Based on our research, we recommend the following:

  • First, carefully created TLS certificates and certificate chains can positively influence QUIC protocol performance. ECDSA certificates lead to substantially smaller certificate chains. They can’t, however, realize their potential because root certificates often use RSA algorithms, leading to large signatures of the intermediate certs. Our results show that updating these certificates can have beneficial cascading effects.
  • Second, at the QUIC server-side implementation, bytes that result from padding or retransmissions must be included in anti-amplification limit checks. We recommend enabling packet coalescence at the server to avoid padding, to free up space for TLS certificates, and thus to reduce the need for additional round trips.
  • Third, we recommend using certificate compression to compensate for large TLS certificates, which currently trigger multi-RTT handshakes.
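The second recommendation can be sketched as a per-connection byte counter. This is a simplified illustration under stated assumptions, not code from any QUIC implementation: the class and method names are hypothetical, and sizes are full UDP payloads so that padding and retransmissions are counted automatically.

```python
class AmplificationBudget:
    """Tracks how many bytes may be sent to a not yet verified client address."""

    def __init__(self, factor: int = 3):
        self.factor = factor
        self.received = 0       # bytes received from the client address
        self.sent = 0           # bytes sent, including padding and retransmissions
        self.validated = False  # set once the client address is verified

    def on_datagram_received(self, size: int) -> None:
        self.received += size

    def mark_validated(self) -> None:
        self.validated = True

    def can_send(self, size: int) -> bool:
        """True if sending a datagram of this size stays within the limit."""
        if self.validated:
            return True
        return self.sent + size <= self.factor * self.received

    def on_datagram_sent(self, size: int) -> None:
        # Call this for every datagram, whether it is new data or a retransmission.
        self.sent += size
```

A server would call `can_send` before every datagram, including retransmissions triggered by timers, and hold back data once the check fails until more client bytes arrive or the address is verified.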

Figure 6 — understanding-quic.net measures the behaviour of QUIC deployments regarding the handshake and certificate compression.

To allow for reproduction and verification by service operators we created understanding-quic.net. The tool analyses the handshake behaviour of a given QUIC endpoint during the first round trip, including the response size and the support of certificate compression. We hope that easy access to this information will support the development of a better QUIC ecosystem.
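To get a rough offline feeling for what certificate compression could save, a chain can be compressed locally. The sketch below uses zlib, one of the algorithms defined for TLS certificate compression in RFC 8879. It compresses raw bytes rather than the actual TLS Certificate message, so the result is only an approximation, and the function names are hypothetical.

```python
import zlib


def compressed_size(der_chain: bytes, level: int = 9) -> int:
    """Size in bytes of the zlib-compressed certificate data."""
    return len(zlib.compress(der_chain, level))


def savings_ratio(der_chain: bytes) -> float:
    """Fraction of bytes saved by compression (0.0 means no savings)."""
    if not der_chain:
        return 0.0
    return 1.0 - compressed_size(der_chain) / len(der_chain)
```

The actual savings depend on the certificates. Keys and signatures are close to random data and compress poorly, while repeated names, URLs, and object identifiers compress well.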

For more information, read our paper “On the Interplay between TLS Certificates and QUIC Performance”, which was presented at CoNEXT in December 2022.

Marcin Nawrocki is a PhD student and research assistant at Freie Universität Berlin.

Raphael Hiesgen and Jonas Mücke contributed to this post.


