The tremendous growth of the mobile Internet, with over 11 billion devices connected by 2020, and its economic implications, have motivated several reports. And yet, we still lack an understanding of the impact of cellular networks around the world.
There are a number of reasons for this. For starters, it is currently challenging to tell whether a particular IP address comes from a cellular or fixed-line user.
In much of the world, cellular users reside in networks that combine both cellular and fixed-line customers, which complicates any straightforward attempt at identification. Knowing a device type (smartphone or tablet) has limited value as most mobile devices have multiple interfaces and users tend to offload cellular traffic to WiFi when available.
And while instrumented devices or data collected from a network operator’s core could provide detailed information on cell network usage, scaling these sorts of studies has proven to be difficult.
A comprehensive understanding of cellular access has a wide range of applications for different stakeholders in the Internet. For content providers and delivery networks, identifying access technology would help to diagnose and address performance issues in the wild. Researchers and operators could better understand how networks are being used around the world and identify potential trends, while policy-makers could have a firmer statistical footing for investment decisions.
In our paper, ‘Cell Spotting: Studying The Role Of Cellular Networks In The Internet’ [PDF, 1.7MB], which we presented at the 2017 Internet Measurement Conference, we developed and validated an approach to accurately identify cellular network addresses using client browser signals, and showed its effectiveness in a range of mixed networks, that is, networks that serve both fixed-line and cellular devices.
Using this approach, we leveraged the global vantage point of one of the world’s largest CDNs to map the global cellular IP space and the Autonomous Systems (ASes) that host it, and carried out a first-of-its-kind study characterizing cellular network configuration and usage around the world.
Identifying cellular networks
Our methodology for classifying subnets as either cellular or noncellular is straightforward. We use the Network Information API to detect the presence of cellular access technology in a particular IP address block.
The Network Information API allows web applications to access information about the underlying network connection in use by the device. While not a W3C standard, it is implemented in several popular mobile browsers, most notably Android’s native WebKit, Chrome for Android (beginning in version 38), and Firefox Mobile.
This API reveals the connection type that the system is using to communicate with the network (for example cellular, Bluetooth, Ethernet, and WiFi) and supports monitoring network changes. Connectivity is obtained from the browser, which calls the underlying operating system to obtain information on active network interfaces, or to detect changes in network connectivity.
We use connectivity information collected by JavaScript beacons that are part of the CDN’s Real-User Monitoring (RUM) system. The logs also include timing and page load information obtained from browser instrumentation (the Resource Timing API [3]), and client information such as IP addresses.
Using the connectivity type from the Network Information API, we label every entry in this log as either cellular or noncellular, and use these labels to calculate a cellular ratio for every /24 (IPv4) and /48 (IPv6) CIDR block sampled. The ratio is the number of cellular hits from a given subnet divided by the total number of Network Information API-enabled hits from that subnet. We label a subnet as cellular or noncellular based on this ratio. We also extend the approach to classify networks (ASes), using a few heuristics such as eliminating non-access networks.
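The per-subnet ratio described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the paper: the function names, the sample beacons (drawn from documentation address ranges), and the 0.5 threshold are all hypothetical.

```python
import ipaddress
from collections import defaultdict

# Assumption: only the "cellular" connection type counts as a cellular hit.
CELLULAR_TYPES = {"cellular"}


def subnet_key(ip: str) -> str:
    """Map an address to its enclosing /24 (IPv4) or /48 (IPv6) prefix."""
    addr = ipaddress.ip_address(ip)
    prefix_len = 24 if addr.version == 4 else 48
    # strict=False masks the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False))


def cellular_ratios(beacons):
    """beacons: iterable of (ip, connection_type) pairs from API-enabled hits."""
    cellular = defaultdict(int)
    total = defaultdict(int)
    for ip, conn_type in beacons:
        key = subnet_key(ip)
        total[key] += 1
        if conn_type in CELLULAR_TYPES:
            cellular[key] += 1
    return {key: cellular[key] / total[key] for key in total}


def classify(ratios, threshold=0.5):
    """Label each subnet. The 0.5 threshold is a placeholder, not the paper's value."""
    return {
        key: "cellular" if ratio >= threshold else "noncellular"
        for key, ratio in ratios.items()
    }


# Hypothetical beacon log entries.
beacons = [
    ("192.0.2.10", "cellular"),
    ("192.0.2.77", "cellular"),
    ("192.0.2.200", "wifi"),
    ("198.51.100.5", "wifi"),
    ("2001:db8:1:2::5", "cellular"),
]
ratios = cellular_ratios(beacons)
labels = classify(ratios)
```

With these sample beacons, two of the three hits from 192.0.2.0/24 are cellular, so the subnet is labelled cellular under the placeholder threshold, while 198.51.100.0/24 has no cellular hits and is labelled noncellular.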
We validated our approach using data from three large mobile carriers, including a large mixed European provider and a large dedicated US provider. The analysis shows our approach can deliver very high levels of both precision (the fraction of correctly classified items over the total number of items assigned to that class) and recall (the fraction of correctly classified items over the true number of items in that class).
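To make the two metrics concrete, here is a minimal sketch that computes them for the cellular class from a set of predicted labels and a ground-truth set. The function name and the toy labels are hypothetical and are not the carrier data used for validation.

```python
def precision_recall(predicted, truth, positive="cellular"):
    """Precision and recall of `positive` labels in `predicted` against `truth`.

    Both arguments map a subnet identifier to a label.
    """
    true_pos = sum(
        1 for key, label in predicted.items()
        if label == positive and truth.get(key) == positive
    )
    predicted_pos = sum(1 for label in predicted.values() if label == positive)
    actual_pos = sum(1 for label in truth.values() if label == positive)
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


# Toy example: subnet "c" is truly cellular but was classified as noncellular.
predicted = {"a": "cellular", "b": "cellular", "c": "noncellular",
             "d": "noncellular", "e": "noncellular"}
truth = {"a": "cellular", "b": "cellular", "c": "cellular",
         "d": "noncellular", "e": "noncellular"}
precision, recall = precision_recall(predicted, truth)
```

In this toy case every subnet labelled cellular really is cellular, so precision is 1.0, but one of the three truly cellular subnets was missed, so recall is 2/3.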
After labelling subnets with this method, we used demand logs from the same CDN’s entire platform, covering all types of protocols and devices, to assign a normalized value of demand to each of these subnets. Note that this is a request demand, which we use as a proxy for traffic demand.
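One simple way to picture this step is shown below. The text does not specify how demand is normalized, so this sketch assumes each subnet's request count is divided by the total across all subnets. The function names and request counts are hypothetical.

```python
def normalize_demand(request_counts):
    """Turn raw per-subnet request counts into shares of total demand.

    Assumption: normalization divides by the sum over all subnets.
    """
    total = sum(request_counts.values())
    if total == 0:
        return {key: 0.0 for key in request_counts}
    return {key: count / total for key, count in request_counts.items()}


def cellular_share(shares, labels):
    """Sum the demand shares of subnets labelled as cellular."""
    return sum(share for key, share in shares.items()
               if labels.get(key) == "cellular")


# Hypothetical request counts and subnet labels.
request_counts = {
    "192.0.2.0/24": 300,
    "198.51.100.0/24": 600,
    "2001:db8:1::/48": 100,
}
subnet_labels = {
    "192.0.2.0/24": "cellular",
    "198.51.100.0/24": "noncellular",
    "2001:db8:1::/48": "cellular",
}
shares = normalize_demand(request_counts)
cell_fraction = cellular_share(shares, subnet_labels)
```

Here the two cellular subnets account for 400 of 1,000 requests, so the cellular share of demand comes to roughly 0.4.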
Read Cell Spotting: Studying The Role Of Cellular Networks In The Internet for additional details on our methodology.
The shape and role of cellular networks
With this approach in hand, we carried out the first survey on the composition and traffic dynamics of global cellular networks, and looked at their use around the world.
We found, for instance, that a majority of cellular networks are mixed, hosting both cellular and fixed-line broadband clients. Given their prevalence, network characterization efforts should take the technology composition of the studied networks into account.
We also found that cellular demand is concentrated in a few large networks (the top 10 cellular ASes account for 38% of global demand) and that, within these networks, cellular traffic is concentrated in a small fraction of IP addresses.
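A concentration figure like the top-10 share can be computed from per-AS demand as sketched below. The function name, AS numbers (from the documentation range), and demand values are hypothetical and do not come from the study's data.

```python
def top_k_share(demand_by_as, k=10):
    """Fraction of total demand carried by the k largest ASes."""
    total = sum(demand_by_as.values())
    if total == 0:
        return 0.0
    largest = sorted(demand_by_as.values(), reverse=True)[:k]
    return sum(largest) / total


# Hypothetical per-AS demand values.
demand_by_as = {"AS64500": 50, "AS64501": 30, "AS64502": 15, "AS64503": 5}
top2 = top_k_share(demand_by_as, k=2)
```

In this toy input, the two largest ASes carry 80 of 100 units of demand, so the top-2 share is 0.8.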
Looking into these networks’ DNS support, we show that in mixed cellular networks nearly 60% of DNS resolvers are shared between cellular and fixed-line clients. This implies that DNS resolvers alone are insufficient for identifying client types. The use of shared resolvers may also challenge client localization for common request-routing systems.
To further challenge the use of DNS for end-user mapping, we find significant public DNS usage by cellular clients outside the USA, breaking from common assumptions that cellular clients only use operator-provided DNS services.
This macroscopic view revealed the dominance, in terms of traffic, of a few markets such as the USA, and the various roles played by cellular connectivity around the globe — from a supplementary service in much of Europe to the primary means of connectivity in Asia and Africa.
While the top five economies alone account for 55.7% of global cellular traffic demand, in most of these economies, cellular connectivity provides a supplementary service. This is in clear contrast to economies such as Lao PDR and Ghana, themselves responsible for a very small fraction of global traffic, but where cellular captures 87% and 95.9% respectively of the overall economy demand.
The growing role of cellular networks in providing Internet connectivity in many places around the world makes the case for considering such networks as part of the critical infrastructure of these economies.
Although this paper presents a snapshot of cellular address characteristics, we are exploring how cellular addresses evolve over time, both in their assignment to cellular end users and in how demand shifts across the cellular address space.
Contributors: John P. Rula and Moritz Steiner
Fabián E. Bustamante is a professor of computer science at Northwestern University.