The proliferation of online services built from globally distributed microservices has security and performance implications.
For this reason, understanding the underlying physical paths connecting end points has become important and given rise to numerous approaches for inferring the location of infrastructure IP addresses.
Our recent study at SimulaMet investigated the accuracy of these approaches and the differences between them.
We found that existing geolocation approaches are inaccurate when mapping end-to-end Internet paths to physical locations: in our dataset, 77% of IPv4 and 65% of IPv6 economy-level mappings miss at least one economy along the path.
We also found that popular geolocation services rely heavily on data published by the Regional Internet Registries (RIRs), as the geolocation mappings from these services match the geolocation in the RIR delegation files. However, geolocation databases tend to erroneously geolocate IPs that belong to Autonomous Systems (ASes) with a global presence and IPs that change ownership due to mergers and acquisitions. Further, gaps in the coverage of the geo-datasets and IP-to-economy inaccuracies can add economies to, or omit them from, the economy-level end-to-end path.
Our findings highlight the sources of IP-geolocation disagreements
We evaluated the economy-level accuracy of two dedicated IP geolocation datasets (MaxMind and IP2Location) and two RTT-based geolocation methods (HLOC and RIPE’s IPmap), using end-to-end IPv4 and IPv6 paths between 30 vantage points located in seven economies (Canada, China, Germany, Netherlands, Norway, Sweden, and the USA).
MaxMind and IP2Location cover at least 80% of the IP addresses that are part of the collected paths. However, IPmap and HLOC have limited coverage, which can be explained by their dependence on the number of vantage points available for the RTT measurements. The use of HLOC is also limited by the ability to extract correct geo-hints from the DNS pointer (PTR) records of the measured IPs.
MaxMind and IP2Location map most of the IPs they geolocate to the same economy. Moreover, for a significant percentage of these IPs, this economy-level geomapping coincides with the economy where the IP space is registered (which we extracted from the RIR allocation and assignment files).
For a small percentage of IPs, the geomappings either completely or partially disagreed. Using these disagreements we identified three possible causes of erroneous IP geolocation:
- MaxMind and IP2Location appear to use information from the whois records (economy, network name) to build their IP-to-economy mappings.
- IPs owned by organizations with international presence are often geolocated incorrectly.
- Mergers and acquisitions of organizations are a key source of IP geolocation inaccuracies.
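To make the notion of complete and partial disagreement concrete, here is a minimal Python sketch of how the economy codes reported for a single IP could be compared. The labels, the function name `classify_agreement`, and the rule that "partial" means some (but not all) sources agree are assumptions made for illustration; they are not taken from the study's code.

```python
def classify_agreement(mappings):
    """Classify the economy codes that several sources report for one IP.

    mappings: dict of source name -> economy code (e.g. "NO"), or None
    when the source has no entry for the IP. Assumed rule for this sketch:
      - "agree": all sources that answered report the same economy
      - "complete_disagreement": every source reports a different economy
      - "partial_disagreement": some sources agree, at least one differs
      - "insufficient": fewer than two sources answered
    """
    codes = [code for code in mappings.values() if code is not None]
    if len(codes) < 2:
        return "insufficient"
    distinct = set(codes)
    if len(distinct) == 1:
        return "agree"
    if len(distinct) == len(codes):
        return "complete_disagreement"
    return "partial_disagreement"
```

For example, an IP that two databases place in Norway but whose address space is registered in the USA would be labelled a partial disagreement under this rule.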
Do IP geolocation disagreements affect the end-to-end IP path geomappings?
IP geolocation disagreements can falsely indicate path tromboning or path detours, and can also miss economies along the IP paths. This has security implications: end hosts that rely on popular geolocation databases might be unaware of the economies their Internet traffic traverses. We observed this among our collected paths, where a high percentage of both IPv4 and IPv6 paths appeared to be detoured from Europe through the United States.
Further, we found that about 77% of IPv4 and 65% of IPv6 economy-level mappings miss at least one economy along the path, and that this affects both paths within the same continent (short-haul paths) and paths between different continents (long-haul paths).
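As an illustration of what "missing an economy" means at the path level, the following Python sketch turns an IP-level path into an economy-level path and compares it with a reference path. The function names, the lookup tables, and the addresses (taken from the documentation ranges) are all made up for this example and do not come from our dataset.

```python
from itertools import groupby


def economy_path(ip_hops, ip_to_economy):
    """Map each hop to an economy, skip hops the dataset does not cover,
    and collapse consecutive repeats into a single entry."""
    mapped = (ip_to_economy.get(ip) for ip in ip_hops)
    covered = (eco for eco in mapped if eco is not None)
    return [eco for eco, _ in groupby(covered)]


def missing_economies(inferred, reference):
    """Economies on the reference path that never appear on the inferred path."""
    seen = set(inferred)
    return list(dict.fromkeys(eco for eco in reference if eco not in seen))


# Hypothetical path and lookup tables, for illustration only.
hops = ["192.0.2.1", "198.51.100.1", "203.0.113.1", "203.0.113.2"]
database = {"192.0.2.1": "CN", "203.0.113.1": "NO", "203.0.113.2": "NO"}
reference = dict(database, **{"198.51.100.1": "US"})

inferred_path = economy_path(hops, database)     # transit hop is not covered
reference_path = economy_path(hops, reference)
print(inferred_path, reference_path, missing_economies(inferred_path, reference_path))
```

Here the database has no entry for the transit hop, so the inferred economy-level path skips an economy that the reference path contains.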
Is there room for improvement?
We propose a novel active measurement-based approach, which hinges on a simple idea: the location of an IP address can be greatly narrowed down if it is probed from within its AS.
Using existing methods (the whois service, DNS names and geolocation approaches), we identify the owner of the IP space and a possible location of the IP address, which we then use to choose a vantage point (VP) from which to traceroute to the target IP. A VP is judged suitable if it lies within the IP holder’s AS and in close proximity to the initially guessed location.
Our current approach relies on publicly accessible looking glasses (LGs) as VPs. We consider the IP to be in the same economy as the LG if the traceroute confirms topological proximity (for example, within a few router-level hops and a latency below 20 ms). If an LG proves to be far away from the IP under test, we select another one. Figure 1 illustrates this approach.
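The selection loop described above can be sketched in Python as follows. This is a simplified illustration rather than our measurement code: the `LookingGlass` and `ProbeResult` types, the `probe` callback (which stands in for running a traceroute from an LG), and the three-hop limit are assumptions; only the sub-20 ms latency example comes from the text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class LookingGlass:
    name: str      # hypothetical label for the LG
    economy: str   # economy where the LG is hosted


@dataclass
class ProbeResult:
    hops: int      # router-level hops between the LG and the target
    rtt_ms: float  # round-trip time from the LG to the target, in ms


def is_close(result: ProbeResult, max_hops: int = 3, max_rtt_ms: float = 20.0) -> bool:
    """Topological proximity test; max_hops=3 is an assumed reading of 'a few hops'."""
    return result.hops <= max_hops and result.rtt_ms < max_rtt_ms


def locate(
    target_ip: str,
    candidates: Iterable[LookingGlass],
    probe: Callable[[LookingGlass, str], Optional[ProbeResult]],
) -> Optional[str]:
    """Try candidate LGs in order and return the economy of the first one
    that is topologically close to the target, or None if none is."""
    for lg in candidates:
        result = probe(lg, target_ip)  # caller-supplied traceroute wrapper
        if result is not None and is_close(result):
            return lg.economy
    return None
```

The candidates would be ordered so that LGs inside the IP holder's AS and near the initially guessed location come first. Passing `probe` as a parameter keeps the sketch independent of any particular LG interface, since these differ between operators.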
Figure 2 compares economy-level paths inferred by MaxMind and IP2Location (top part) to the path inferred by the LG-based approach (bottom part).
In the top part, the figure also shows the economies where the IP space is registered (Delegation line). The path goes from China to Norway and traverses three organizations: China Unicom, Cogent and Broadnet. Both approaches appear to miss economy geomapping data along the path in the transit network.
We plan to further develop our method, which narrows down the location of IPs by probing them from within the ASes that advertise them, and we welcome your input on this approach in the comment section below.
Contributors: Ahmed Elmokashfi
Ioana Livadariu is a Postdoctoral Fellow at SimulaMet in Oslo, Norway.
The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.