In an earlier blog, I discussed the effect of TCP queue oscillation on shared satellite Internet links in the Pacific. Today, we’ll look in more detail at how such links affect TCP connections. But first, let me argue that we’re dealing with a somewhat interdisciplinary problem here.
Engineers versus networkers
When I give presentations on the topic, I often ask my audience what comes first to their mind when I mention the term “packet loss”. Audiences that have gone through engineering school often come back with “noise”, “bit/symbol error”, “checksum” or the like – all things that cause packets to be lost at the receiver. They see packet loss as a physical / link layer issue.
Networking people from a computing background typically respond “congestion” or “queue”. In their world, packet loss happens at the input to a congested resource, most commonly a router. They see packet loss as a network / transport layer issue: Packets arrive at a router faster than the router can forward them; at some point, the input queue reaches capacity, and the router ignores any further packet arrivals until the queue has regained capacity.
Both of these views are, of course, perfectly valid. With satellite links, there’s a clear division of labour: Engineers build the link itself, and network folks connect it to the Internet.
When we see loss in such networks, the next question is: What causes the loss, and how can we mitigate the problem? This is where approaches differ, as engineers and network nerds tend to look at different layers and places for cause and solution.
Engineers tend to look at the physical layer: weak signal, elevated noise levels, fading, misaligned antennas, insufficient link budget, etc. Once malfunction has been excluded and a weak signal cannot be remedied otherwise, the next tool in the engineer’s kit is forward error correction (FEC) on the link between the satellite ground stations. If that doesn’t work, adaptive modulation can reduce the data rate to add robustness to the signal.
Note for now that even the best FEC on the link can’t fix congestion problems, and reduction in data rates is obviously not going to help congestion either. Engineers may now relax: The congestion I’m reporting on here is fairly and squarely a networking issue.
What is a satellite link?
Dumb question, right? We all know what a satellite link is, don’t we? Again, an insight from the last couple of years is that different people bring different concepts to the table.
Satellite links are not born equal. Apart from the obvious parameters of orbit (low/medium earth orbit, LEO/MEO, vs. geostationary, GEO) and bandwidth, there are other distinguishing features that become important. One of these is how the link is used: How many hosts share the link’s capacity, what do these hosts do, and how is the sharing achieved?
Why is this important? Example 1: Television using UDP. A TV station rents a satellite channel to stream a TV programme to a remote mining town in Australia. This is easy: We know what the data rate on the stream is, so all we need to do is provide pretty much exactly that capacity on the satellite channel. We’ll get both good link utilisation and good quality of service, because there’s no competition from other traffic. That’s provided the engineers got the level of FEC right, of course. Sorted.
Example 2: A scientific research institute uses a dedicated satellite channel to retrieve data sets with TCP from an astronomical experiment in Antarctica. This is a little trickier: TCP needs to work through a bottleneck with significant latency here (the satellite), so the TCP sender needs to determine just how much capacity the bottleneck has. The only way it can find out is by ramping up transmission rates until it encounters packet loss (detected in the form of overdue ACKs from the receiver). At this point, it backs off and then slowly increases transmission rates again. This can lead to an effect called TCP queue oscillation, which I’ve discussed at length in my previous blog post, and which was first observed around 30 years ago. In TCP queue oscillation, TCP senders get the timing of the back-off wrong and end up transmitting into overflowing queues and back off at times when there is plenty of capacity available. Motorists know this effect: By the time the radio reports congestion on a road, it’s sometimes already over and avoiding the previously congested area doesn’t necessarily make the trip faster.
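To make the ramp-up and back-off cycle a little more concrete, here is a minimal sketch of a single sender doing textbook additive increase / multiplicative decrease (AIMD). It is an illustration only: the fixed capacity, the halving factor and the function name `aimd_trace` are my assumptions, and it models neither slow start nor any particular TCP stack.

```python
def aimd_trace(capacity, rounds, increase=1.0, decrease=0.5, start=1.0):
    """Return the sender's window (packets per round trip) for each round.

    capacity: packets the bottleneck can carry per round trip (assumed fixed).
    """
    window = start
    trace = []
    for _ in range(rounds):
        trace.append(window)
        if window > capacity:
            # The sender overshot: loss is noticed and it backs off sharply.
            window = max(1.0, window * decrease)
        else:
            # No loss seen: probe for more capacity, one packet per round trip.
            window += increase
    return trace


trace = aimd_trace(capacity=10, rounds=30)
```

The resulting sawtooth climbs one packet per round trip, overshoots the capacity, and then drops to about half. On a satellite link each of those round trips costs hundreds of milliseconds, so the feedback the sender acts on is always somewhat stale.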
With most TCP stacks now having evolved to something like TCP CUBIC or similar, TCP queue oscillation on such links is no longer a serious problem, as the more sophisticated stacks no longer rely solely on the arrival of ACKs to regulate the transmission rate. In most cases, Example 2 will also result in good link utilisation.
Example 3: An Internet Service Provider (ISP) in an island location connects to the wider Internet via satellite. This satellite carries all inbound and outbound IP traffic, for a significant number of users. This is where it gets really interesting. There are now a potentially large number of TCP senders, but the link generally still represents a significant bottleneck given that the rest of us now regard Gigabit Ethernet as state of the art, and upstream providers routinely handle faster rates than that. Even ISPs on MEO sats are typically only looking at hundreds of Mbps.
From the perspective of a TCP sender, that bottleneck now no longer has a fixed capacity: The actual capacity is what’s left over once the other users (TCP and UDP senders) have taken their share. So getting your rate right is a much harder problem now, especially since everyone else faces a similar round-trip-time and will react at pretty much the same time scale as you.
This post is about this last example type of satellite connection. So let’s ask the next dumb question.
What is a typical TCP connection?
Ardent networkers know all about SYNs, ACKs, FINs, sequence numbers, congestion windows, and perhaps even a bit about the differences between the various TCP flavours, such as Tahoe, Reno, New Reno, CUBIC, Compound TCP etc. The mental trap here is of course that we automatically think about TCP connections that transport large amounts of data, and consider TCP connections with smaller amounts of data simply as a shorter case of the former. After all, it’s the wait on those large downloads that bites, isn’t it?
Our satellite-connected site involves a lot of latency though, and this creates two very distinct classes of TCP connections:
- TCP connections whose senders still have data in the sending buffer when the first ACK from the receiver arrives.
- TCP connections for which the first ACK arrives at the sender after the last data byte has left the sender.
Put another way: Those in the first class are subject to TCP congestion control, those in the second class get to enjoy TCP slow start, but for them, there’s never a question of the sender having to back off. Needless to say, those in the first class are the large downloads, where a lot of data fills a large number of data packets. The second class are connections where all the data fits into a few packets, which will all be at least en route to the receiver by the time that the first ACK comes back.
And now let me ask: Which of the two classes dominates on your bog standard Pacific island satellite link? Paradoxically, both do: The vast majority of connections are small volume ones, but large volume connections contribute most of the bytes that traverse the link.
Put another way: Most bytes on the link are subject to TCP congestion control, but many connections aren’t.
Let’s put some numbers on this: On a GEO satellite, round-trip-time (RTT) is at least 500 ms, and on MEO it’s about 120 ms and up, including any on-island latency. What we can do on such links is observe island-side with netflow and record the flows associated with TCP connections (a flow is simply the set of packets running in each direction of a connection). We can look at the size of flows and at the time they take to complete. So, for a MEO link, any TCP flow that takes 120 ms or less is guaranteed to arrive at the receiver in its entirety without the TCP sender having seen any feedback in the form of ACKs.
So we looked at the TCP flows on a MEO link that we know oscillates during peak time traffic. We found that, among the flows that took less than 120 ms:
- The largest flow contained just over 100 kB and clocked in at a respectable 8.7 Mbps in 96 ms. Note that this size covers most web page downloads for viewing. Just over 96.7% of flows are smaller than this size.
- The fastest flow contained 44 kB and achieved just over 11 Mbps over 32 ms. No faster flow was observed, even for durations longer than 120 ms. Even the 11 Mbps were an outlier: Only 0.005% of the flows under 120 ms achieve over 8 Mbps.
- Just under 38% of total flows fell into this category. Note that this is an underestimate for the percentage of uncontrolled flows, because we haven’t really allowed for any flows with additional RTT beyond the satellite link – and most “real” flows are bound to have that.
- Only 3.5% of bytes sit in these flows.
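The figures in this list come down to two simple calculations per flow record: an average rate, and a split of flows and bytes at the RTT threshold. The sketch below shows one way to do this in Python. The record layout (size in bytes, duration in seconds), the function names and the sample records are assumptions for illustration, not our actual netflow tooling or measured data.

```python
def rate_mbps(size_bytes, duration_s):
    """Average rate of a flow in megabits per second."""
    return size_bytes * 8 / duration_s / 1e6


def short_flow_shares(flows, rtt_s):
    """Return (share of flows, share of bytes) for flows finishing within one RTT.

    flows: list of (size_bytes, duration_s) tuples, one per TCP flow.
    """
    short = [(size, dur) for size, dur in flows if dur <= rtt_s]
    total_bytes = sum(size for size, _ in flows)
    short_bytes = sum(size for size, _ in short)
    return len(short) / len(flows), short_bytes / total_bytes


# Made-up records for illustration only (not measured data).
sample_flows = [(3_000, 0.040), (44_000, 0.032), (12_000, 0.900), (18_000_000, 22.0)]
flow_share, byte_share = short_flow_shares(sample_flows, rtt_s=0.120)
```

Even in this toy sample, the pattern from the link shows up: half of the flows finish within one RTT, yet they carry well under 1% of the bytes. As a sanity check on the arithmetic, `rate_mbps(44_000, 0.032)` gives the 11 Mbps quoted for the fastest flow.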
Note that, because of the short time frames, the rates here need to be taken with a grain of salt: If the first packet of the flow experienced more delay in router or satellite gateway queues than the last, this pushes the observed rate up with no credit due to TCP. Still, these flows would not have been affected by TCP flow control because the queue delay would have delayed the return ACK as well.
Assuming now that any flow longer than 120 ms might see TCP flow control in action, we observed that, among such flows:
- The largest flow seen was over 800 MB, and achieved just under 3.6 Mbps over half an hour (the cut-off time for netflow). That’s less than half the rate of the largest flow under 120 ms.
- The fastest flow seen contained just under 18 MB and achieved 6.3 Mbps, but sustained this over only 22 seconds.
Even if we assume that flow control is only guaranteed to be effective in flows of 600 ms or longer (allowing for additional RTT), almost 98.7% of bytes remain subject to flow control. These are the bytes in the flows that get hit by queue oscillation.
Note that this does not mean that low-volume flows are automatically safe from queue oscillation: Nearly 45% of flows smaller than the largest flow under 120 ms take more than 600 ms to complete.
With GEO satellites, uncontrolled flows can potentially be even larger due to the much higher latency. However, in many practical cases, the link capacity is much lower. This makes it harder for flows to achieve higher rates without packet loss – meaning that
What does this mean for the end users and the ISP?
So what do you base your subjective opinion of the quality of your Internet connection on? Chances are it’s one of these:
- Reachability: Whether you can reach certain web sites. Google, Facebook, your mail provider, YouTube, etc. Real network people just ping, of course!
- Page load time: Time it takes to load a web page you commonly visit
- Smoothness of replay for streamed movies and music
- Ability and time taken to complete large downloads, e.g., software
For many islanders, streaming and large downloads can be prohibitive because of volume charging, but let’s leave that aside for the moment.
Queue oscillation doesn’t affect reachability as such. If you ping, then there’s another pitfall to be aware of: the size of the packets you use. Queue capacities tend to be specified in bytes. When the queue at the satellite gateway input gets hammered with traffic during the queue overflow phase of the oscillation, it isn’t necessarily full: More commonly, the queue is just almost full and able to accept small packets only. Like a default ping packet. So when you ping, you’re likely to see less packet loss than there is actual data loss, simply because your packets just manage to squeeze in. The solution is of course to ping with maximum size packets.
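Here is a small sketch of why a default ping can squeeze into a queue that rejects full-size data packets. It assumes a simple tail-drop queue that counts bytes, IPv4 without options, a 1500-byte MTU, and a default ping of 84 bytes on the wire (56 bytes of payload plus ICMP and IP headers). The class and function names are made up for illustration, and real gateway queues may well behave differently.

```python
class ByteLimitedQueue:
    """Tail-drop queue whose capacity is counted in bytes, not packets."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0

    def offer(self, packet_bytes):
        """Enqueue the packet if it fits; return False if it is dropped."""
        if self.used_bytes + packet_bytes > self.capacity_bytes:
            return False
        self.used_bytes += packet_bytes
        return True


def full_size_ping_payload(mtu=1500, ip_header=20, icmp_header=8):
    """ICMP payload size that makes the ping packet exactly fill the MTU."""
    return mtu - ip_header - icmp_header


queue = ByteLimitedQueue(capacity_bytes=15_000)
for _ in range(9):
    queue.offer(1500)              # nine full-size packets: 13,500 bytes queued
queue.offer(1000)                  # 14,500 bytes queued, 500 bytes left

data_accepted = queue.offer(1500)  # full-size data packet no longer fits
ping_accepted = queue.offer(84)    # default-size ping still fits
```

Under these assumptions the data packet is dropped while the small ping gets through. To probe with packets the size of real data packets, the payload would be `full_size_ping_payload()`, i.e. 1472 bytes, which on a Linux system could be passed to ping via its `-s` option.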
Page load time: The satellite RTT ensures that no page (or page element) can load in less than twice the RTT (one RTT for the connection handshake, one RTT for the HTTP request & response). However, this delay is somewhat obscured by server and rendering delays, and the fact that even non-islanders commonly have to wait for a couple of seconds to see a page load.
Moreover, most modern web sites these days don’t really have a lot of large content items on their front page: A single page load will often trigger dozens of GET requests (many of which simply track you and download basically nothing). The vast majority of them won’t download more than a few kB, and downloads of over 100 kB are downright rare.
So generally, page load time isn’t affected much either. At this point, most casual users will probably look toward the server rather than their connection when the going gets tough.
Streaming via TCP is surprisingly common, but there’s a catch here that saves the day: The sender’s TCP congestion control generally won’t accelerate beyond a certain rate as the server itself limits the data rate. Remember, we’re streaming as we watch or listen. We’re not trying to download the whole thing in one go as quickly as possible. So as long as the average rate is achievable under queue oscillation and the player is smart enough to buffer sufficient media content at the start, the only thing that will impair our experience is the data charges we’re incurring!
This leaves the large downloads. This is where queue oscillation really bites when it occurs. During the download, there is practically always data in the send buffer, so TCP congestion control gets to accelerate as quickly as the algorithm allows. All data sits in big packets, which are the ones most likely to be dropped at queues. From the end user perspective, it’s probably just “congestion” or a “slow server”, when in fact, the link may well have had plenty of spare capacity and the server could have delivered in a fraction of the time.
However, there are users who are able to compare. They include those who frequently travel between the island and better-connected places. Unsurprisingly, these people tend to be reasonably heavy Internet users. They notice that, by and large, pages don’t load quite as fast, streaming video shows the “wheel of boredom” more often, and they know the value of a USB flash drive when it comes to software upgrades. It’s not uncommon for that user group to feel unease and blame – who else – their ISP.
From the ISP’s perspective, that’s of course unfair: They do a lot of quality control (ping is popular), employ good people, regularly invest in new equipment, and know their links are up. And, I need to emphasise here, they really can’t be blamed for the behaviour of the world’s TCP/IP stacks.
There’s another factor that should matter to island ISPs, though: When queue oscillation causes significant sender back-off among the large flows, the queue only receives the aforementioned short flows. They arrive casually but are really just peanuts in terms of capacity: The queue sits empty during most of that phase, and the satellite link is idle.
Again, in terms of numbers: The average inbound bitrate of the MEO link above was only around 42% of the capacity deployed. In other words: The link sat idle for more than half of the time! Yet: TCP clearly couldn’t respond to demand, exploit the capacity and get large flows up to speed. So the ISP ends up paying for a lot of unused capacity.
If you’ve made it this far, you’ve probably understood that the flow ecosystem on a satellite link can be quite complex, even without bit errors and the like. We’d really like to be able to predict though when exactly queue oscillation will happen: under which demand, bandwidth, and latency. We’d also like to be able to find out how best to cope with it in a given scenario: performance-enhancing proxy (PEP), network coding (see my last post), or perhaps something completely different?
Usually, the first step is to try and get a good handle on the “as is” situation. So we tried simulating it in software. After a few attempts, we had to admit to ourselves that our models were too simple and the simulators way too slow: We’re looking at 2000+ parallel flows here. Sure, we could demonstrate that queue oscillation could happen, but not exactly using the parameters we would have liked to.
So we’ve built a hardware-based simulator, consisting of – at the time of writing – 84 Raspberry Pis and 10 Intel NUCs as island clients, a bunch of Gigabit Ethernet switches, and an armada of Super Micro servers to simulate satellite link equipment and TCP senders around the world (see below). We’re currently taking it through its baseline tests and are tweaking our software. Stay tuned for updates on what we find.
Ulrich Speidel is a senior lecturer in Computer Science at the University of Auckland with research interests in fundamental and applied problems in data communications, information theory, signal processing and information measurement.