GEO vs MEO: which satellite solution works best?

By Ulrich Speidel on 20 Mar 2018


The following is the third of a three-part series of posts on measuring satellite bufferbloat, as well as recommendations on how to solve it.

A considerable number of ISPs in remote locations, operators of ocean-going ships (in particular, cruise ships), and aircraft on intercontinental routes face user demand for reliable and fast Internet connectivity but can’t get fibre or radio links to terrestrial networks — either because it’s physically impossible or because it’s economically prohibitive. This leaves satellite Internet as the only option.

Until the advent of medium earth orbit (MEO) provider O3b (short for ‘Other 3 billion [users]’), geostationary (GEO) satellites were the only option for such ISPs. That meant huge dish antennas and, most of all, bandwidth prices of many hundreds of dollars per megabit per second per month (Mbps/month).

‘High throughput’ satellites with direct-to-site transmission have brought the dish size down in some areas of the world, but are generally aimed at end users, not at ISPs who will redistribute the service via their terrestrial network.

O3b are now part of a larger satellite operator, SES, which also offers a large range of GEO services. O3b’s original sales push was that it offered significantly reduced latencies (around 120 ms round-trip time (RTT) rather than 500 ms plus for GEO). Echoes of this can still be found around the net. At the time of writing this, however, a search for ‘latency’ on the SES website finds mostly blog posts from O3b’s pre-SES era, and the term certainly doesn’t feature prominently on the site.

For Pacific Island ISPs and Internet people in the region, connectivity is always a challenge. More often than not I get the message from them that they’re confused by the choices available to them when it comes to satellite service. Some have found themselves locked into long-term contracts with satellite providers that they have subsequently found didn’t meet their needs. So it’s quite understandable that there is a bit of anxiety around.

Part of the problem lies in the fact that satellite operators — the people who design, sell, and install satellite networks — tend to be communication engineers; that is, they come from an electronic engineering background. Their world revolves around bands, fade margins, antenna gains, noise, rain fade, footprints and signal levels. ISP people tend to be computer networking people. Their world revolves around subnets, gateways, routers, network address translation, traffic shaping, latencies and congestion.

I understand both of these communities well, having worked across all layers of the communication stack, from soldering around in RF componentry, working with codes, troubleshooting link layer issues and IP networking, all the way up to application layers and security. In my opinion, knowledge from one community can benefit the other.

For example, unwanted effects in the satellite networking world can have different causes. As I’ve already discussed on this blog, packet loss can be the result of links being impaired by rain fade or equipment failure. But it can also be a symptom of queue drops during congestion — a consequence of high demand but not link impairment. Similarly, link underutilization doesn’t mean that there isn’t enough demand to fill the bandwidth — it’s also a symptom of TCP queue oscillation, the same high demand effect that also causes packet losses by queue drops.

At the University of Auckland, we have built a satellite simulator that lets us gain insight into the workings of such satellite links into island-like networks. We can simulate GEO and MEO links for bandwidths up to many hundred Mbps and for up to around 4,000 simultaneously active client sockets. This corresponds roughly to the sort of user load and connectivity of an island like Rarotonga in the Cook Islands.

One of the things it lets us do is compare the expected network behaviour of MEO and GEO links. So we thought this might be of interest to some readers.

Simulating network behaviour of MEO and GEO links

Figure 1 shows a comparison between simulated GEO (blue) and MEO (red) links with 32 Mbps in each direction (brown horizontal line) and payload data flowing in the island direction only, for a range of demand levels (load). We’re looking at average total goodput rates observed on the island side (crosses), the average goodput rate observed by a single 80 MB iperf3 transfer (squares), and the average number of TCP connections (flows) seen per second (spiky squares). Each load scenario was simulated ten times; the solid lines represent the average value obtained in the ten simulations, whereas the markers show the actual values seen in each experiment.

Figure 1: At the same demand level, MEO links complete significantly more connections in the same amount of time than GEO links. However, long TCP transfers under high-demand scenarios can take longer as short flows get an unfair advantage on MEO.

Each channel represents a TCP socket on the ‘island’ that sequentially connects to a server on the ‘world’ side and downloads some data from there before being disconnected from the server. The server chooses the amount of data to send — an amount that follows an actual flow size distribution observed in Rarotonga.
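The empirical Rarotonga flow size distribution itself isn't reproduced here, but heavy-tailed flow sizes of this kind are often approximated with a Pareto distribution. A minimal sketch of such a stand-in sampler (the `alpha` and minimum-size parameters are illustrative assumptions, not the measured values):

```python
import random

def sample_flow_size(alpha=1.2, min_bytes=2000, rng=random):
    """Draw a flow size from a Pareto distribution, a common stand-in for
    the heavy-tailed flow sizes seen in real traffic. (The simulator uses
    an empirical distribution measured on Rarotonga, not this one.)"""
    return int(min_bytes * rng.paretovariate(alpha))

random.seed(1)
sizes = [sample_flow_size() for _ in range(10000)]
# Count flows of roughly 10 data packets or fewer (~1500 B each)
short = sum(s <= 15000 for s in sizes) / len(sizes)
print(f"median flow: {sorted(sizes)[5000]} B, share of short flows: {short:.0%}")
```

Even with these made-up parameters, the defining property shows through: the overwhelming majority of flows are short, while a few very large flows carry much of the volume.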

As one would expect, total goodput and number of flows increase with load for both GEO and MEO. However, both are significantly larger for MEO across the entire load range.

This is a result of the difference in RTT between the two link types. Each load channel alternates between two phases: the connection establishment phase, during which no goodput accrues; and the connected phase, which covers the actual data transfer.

On a GEO satellite, the connection establishment phase takes significantly longer than on MEO, due to the higher latency. As the vast majority of flows are small (10 data packets or less), they spend most of their time in the connection establishment phase. With MEO’s lower latency, we get faster connection turnaround, more flows, and higher total goodput.
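The scale of this effect can be sketched with back-of-envelope arithmetic, assuming a one-RTT TCP handshake, an initial window of 10 segments, and window doubling per slow-start round (the RTT figures are rough values from this article, not measurements):

```python
def flow_completion_rtts(size_pkts, init_window=10):
    """RTTs to complete a flow: 1 RTT for the TCP handshake,
    then one RTT per slow-start round (window doubles each round)."""
    rtts, window, sent = 1, init_window, 0
    while sent < size_pkts:
        sent += window
        window *= 2
        rtts += 1
    return rtts

GEO_RTT, MEO_RTT = 0.55, 0.12  # seconds, rough round-trip times
for pkts in (10, 100):
    for name, rtt in (("GEO", GEO_RTT), ("MEO", MEO_RTT)):
        secs = flow_completion_rtts(pkts) * rtt
        print(f"{pkts:4d}-packet flow on {name}: ~{secs:.2f} s")
```

Under these assumptions a 10-packet flow needs only two RTTs end to end, so it completes in roughly a quarter of the time on MEO — and a channel that finishes flows faster starts more of them per second.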

However, there is a drawback: the increase in goodput and number of flows per second on MEO accrues more or less exclusively to short flows. Most of these flows don’t live long enough to have their congestion windows adjusted by TCP congestion control (the occasional small flow that loses a packet to queue overflow at the satellite input queue does not significantly impact the statistics) — so they don’t back off under congestion, which sets in at a lower load level on MEO than on GEO.

On the other hand, even small amounts of packet loss cause large flows to back off considerably. At medium to large loads, the goodput rate observed for the 80 MB iperf3 transfer on MEO falls below that of the GEO link, despite the fact that both links have in principle spare capacity.

In practice, small flows occur primarily in applications such as simple web browsing (which includes Facebook, Instagram, Twitter) or email traffic (except for large attachments), whereas large flows are typically software downloads.

When we look at packet loss, the percentage of TCP payload bytes that doesn’t make it onto the link in the above scenario is lower for MEO than for GEO, up to a load of around 110 channels. As the load increases, MEO losses quickly outgrow GEO losses — at 200 channels, MEO loses nearly twice as much of its data (~4.8%) as GEO (~2.5%).

Assisting large flows

One way of assisting large flows is to increase the buffer size at the input to the satellite link. While this dampens TCP queue oscillation, it raises the risk of standing queues, an effect known as ‘bufferbloat’.

The amount of buffer that is appropriate is generally taken to be proportional to the bandwidth-delay product, which would suggest MEO buffer capacities of around a quarter of GEO size.

Moreover, we get more flows on MEO, and the RTT is much lower, so more flows share the buffer at any one time, which points to an even smaller buffer size.

However, in our experiments, we tried to optimize the buffer size based on observed queue behaviour — admittedly a bit of a black art — and ended up with GEO queues of 250 kB and MEO queues of 200 kB capacity. So, going by theory, our results already give the big iperf3 transfer a massive leg up on MEO, and still, it’s not as fast as on GEO under high loads.
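The bandwidth-delay product arithmetic behind the ‘quarter of GEO size’ rule of thumb can be sketched as follows, using assumed round RTT figures of 500 ms for GEO and 120 ms for MEO on the 32 Mbps link from the simulations:

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: the amount of data 'in flight'
    on a fully utilized link, converted from bits to bytes."""
    return bandwidth_bps * rtt_s / 8

LINK = 32e6                  # 32 Mbps, as in the simulations
geo = bdp_bytes(LINK, 0.50)  # assumed round GEO RTT
meo = bdp_bytes(LINK, 0.12)  # assumed round MEO RTT
print(f"GEO BDP: {geo/1e3:.0f} kB, MEO BDP: {meo/1e3:.0f} kB, "
      f"MEO/GEO ratio: {meo/geo:.2f}")
```

With these figures the BDP comes out at roughly 2 MB for GEO and 480 kB for MEO, a ratio of 0.24. Relative to its BDP, a 200 kB MEO queue is thus proportionally far larger than a 250 kB GEO queue.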

So who wins, GEO or MEO?

From a networking perspective — which is the only one covered here — our data tells us that a MEO solution is likely to make better use of the capacity and, as a result, transfer more data faster.

As long as the link utilization stays well below capacity and the demand stays relatively low, it also seems to be the better choice for large transfers.

However, under high-load scenarios, large transfers fare much worse under MEO in our experiments than under GEO.

Of course, this isn’t the only criterion. You’ll also want to look at price per Mbps/month, at technological suitability/serviceability for your remote and hard-to-reach location, cyclone wind load, rain fade, logistics and so on. But that’s for another blog!

Ulrich Speidel is a senior lecturer in Computer Science at the University of Auckland with research interests in fundamental and applied problems in data communications, information theory, signal processing and information measurement. His work was awarded ISIF Asia grants in 2014 and 2016.


The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.

5 Comments

  1. Jonathan Morton

    Have you considered applying flow isolation (aka “fair queuing” or “flow queuing”) to this problem? The link rates and RTTs you quote are in the range where this makes a lot of sense to deploy.

    I particularly recommend trying out fq_codel. This is a combination of stochastic flow isolation (using an advanced DRR++ algorithm) with one of the better-regarded AQM algorithms. For a satellite link, you should tune the Codel parameters to suit RTTs higher than those found on the general Internet.

    You can then configure a much larger queue size limit, and thus improve goodput, without the ill effects that usually result from that with a plain FIFO or AQM alone.

    If fq_codel works well for you, then I also recommend looking at Cake, which is a project I’ve been working on for a while and is gradually working towards inclusion in the Linux kernel. It builds on fq_codel’s principles and includes a lot more features, including a built-in shaper (which may simplify deployment if the link hardware doesn’t directly support flow-isolating queues), host-fairness in a flow-isolating environment, and basic Diffserv support.

    1. Ulrich Speidel

      We’re well aware of fq_codel and various other attempts at queue management. They all make a lot of sense in environments where there is a wide mix of latencies in the TCP flows that go through the queue. Where available, sojourn-time limiting schemes are a valuable alternative to fixed-size queues at satellite modems. However, they can’t solve the problem completely either: Sats are different because of the huge latency that all flows share. So the fundamental issue here is that it doesn’t really matter too much which queue management algorithm drops excess packets – as long as it takes 500 ms plus (on geostationary satellites) for the TCP senders to find out that there won’t be an ACK, we have a problem.

  2. Internet Explorer 10

    Satellite networking has no geographical boundaries. It has a wide range. It provides excellent networking. It just takes a lot of money to set it up.

    1. Ulrich Speidel

      Yes we have, albeit not for this post. PEPs (including spoofers) are often black boxes that tend to be a bit difficult to get hold of (price and in some cases availability outside an actual satellite environment), so we are restricted to the likes of PEPsal or TCPEP (or anyone kindly wanting to donate equipment), where source is available.

      Making reliable measurements in the field is very difficult because one has almost no control whatsoever over background traffic, and creating this in the lab requires some effort – which we’ve gone to, but it also condemns us to sequential experimentation, which is why we haven’t got the PEPsal data quite publishable yet. Our initial series of PEPsal experiments shows that total goodput seems almost unaffected by the PEP’s presence (chiefly due to the fact that many flows are so short that nothing aimed at managing the congestion window will have any effect), but that large flows benefit within a certain demand range. We base this measurement on a large timed flow from a dedicated TCP sender. Our initial series of experiments put this sender very close to the PEP, meaning that any benefits accrue from the PEP’s own TCP behaviour rather than from the PEP splitting the overall RTT. We have another series planned which will place that sender a bit further away, so we can see what effect the RTT split has. Spoofers also wreak havoc with respect to the end-to-end principle, but this is reasonably well understood.

      Our own pet project is a coding approach, which isn’t so much TCP acceleration in the classic sense as concealment of packet drops at the satellite uplink input queue. You could think of this as a sophisticated version of parity packets, which some WAN accelerators offer.

      We’ve certainly also seen such WAN accelerators in the field. They aren’t specifically designed for sat links, but are fairly common. The one ISP where we saw an active box didn’t have config access to it, so it was difficult to tell which options were activated and which were not (depending on where you have access, you can infer to an extent whether network memory or parity packets are in use, but most other functions are difficult to observe).

      Coding and PEPs can be thought of as both competing and complementary approaches – coding a spoofed connection is eminently possible, but neither approach relies on the presence of the other.

