Geofeeds and shepherding: Where is that network?

Network operators often have to ask the question: Where is this network block being used?

You would think this divides pretty nicely into two versions of the question:

Who has the rights to originate this block, in BGP? (Routing problem)
Where in the world is the network actually being used? (Real-world question)

It turns out that both problems have many more implications than you first think.

The first question is what Resource Public Key Infrastructure (RPKI) sets out to solve with Route Origin Authorizations (ROAs), so this post will instead examine how to register the real-world, physical location of a network.

Where in the world is the network actually being used?

Sometimes, people need to know where a network is being used. It may be for Intellectual Property reasons (licensing content and contracts with regional providers), or it may be for law-enforcement (seeking the originator of bad traffic through local law enforcement, or mutual international law cooperation). It could be to find a person using a device in an emergency or to offer targeted services.

To figure this out you would typically look at sources of geolocation. RIPE NCC has a good list of them in this article. RIPE also operates what was formerly called OpenIPMAP and is now known as The RIPE IPmap.

However, the data used by those geolocation services includes data which is managed through the RIR delegation and registration processes. This includes fields related to the physical location in the whois objects for inetnum (IPv4 data), inet6num (IPv6 data), and aut-num information (ASN data).

There’s a problem, however. Those fields do not always describe where the network is used. Often they describe where the resource holder is registered, so you’re looking at a guess based on registration data.

Is that really the network, or just its registration location?

When APNIC first engages with a delegate of resources, we record where they are located, for corporate entity purposes. This could be the location of the legally incorporated entity, the holder of a business number, or in the final analysis, the person (if the resources are being delegated to a real person).

This is part of APNIC’s core registry function. We hold this data inside the registry, but we also expose it in published records. These include the delegated files and the information in Whois. This is also used to populate the economy-related fields in the Routing Policy Specification Language (RPSL) objects, with the ISO3166 economycode.

The economy code is an important field, and it is populated in the inetnum, inet6num and aut-num objects of the delegation where relevant.

These delegation records aren’t maintained ‘by’ the delegate; they are maintained by APNIC in its role as a registry. They are the public record of what APNIC was told about the delegate, and we maintain their status. So, when delegates want to use the location-related fields in Whois RPSL objects to state where they use resources, they have to rely on their mnt-lower: maintainer rights, which delegate the ability to create what we sometimes call ‘more specific’ records. More specific records are those which are ‘inside’ the delegation APNIC maintains, but which can be managed by the delegate themselves.

So, if you are lucky enough to hold a /16 of IPv4 addresses with one inetnum: over it, you can make two /17 records inside this space, and you can give the country: field a value as you see fit, as long as it’s a valid ISO3166 two-letter economy code.
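As a rough illustration of that split, the sketch below uses Python's standard ipaddress module to cut a /16 into two /17s and render a simplified inetnum-style object for each half. The prefix (taken from private address space), the netname and the helper name are all hypothetical. A real Whois object needs more attributes (contacts, maintainers, status, source), and the economy code check here only looks at the shape of the value, not at the ISO3166 list.

```python
import ipaddress


def more_specific_inetnums(parent, economies, netname="EXAMPLE-NET"):
    """Render one simplified inetnum-style object per half of `parent`.

    `netname` and the object layout are illustrative assumptions; this is
    not a complete Whois object.
    """
    network = ipaddress.ip_network(parent)
    halves = list(network.subnets(prefixlen_diff=1))  # e.g. a /16 -> two /17s
    if len(economies) != len(halves):
        raise ValueError("need exactly one economy code per half")

    objects = []
    for child, economy in zip(halves, economies):
        # Shape check only: two ASCII letters. It does not confirm the code
        # is actually assigned in ISO3166.
        if len(economy) != 2 or not (economy.isascii() and economy.isalpha()):
            raise ValueError(f"not a two-letter code: {economy!r}")
        objects.append(
            f"inetnum:  {child.network_address} - {child.broadcast_address}\n"
            f"netname:  {netname}\n"
            f"country:  {economy.upper()}\n"
        )
    return objects


# Hypothetical holder of 10.10.0.0/16 using each half in a different economy.
records = more_specific_inetnums("10.10.0.0/16", ["AU", "NZ"])
print("\n".join(records))
```

Each half carries its own country: value, which is the only lever the classic RPSL objects give a delegate for saying where a block is used.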


Ideally, we would term this field the economycode: field because it would help us identify locations for resources which aren’t, strictly speaking, a country. However, the formal definition of RPSL specifies the use of the fieldname, and we respect this in the ‘on the wire’ format of the data. In real life, we try very hard to refer to this as the economy code data.

In the delegated file, the equivalent field relates exclusively to the entity’s economy of registration, except in a very small number of special cases. For example, unallocated and unused resources withheld from distribution (such as those still held by IANA) are marked with the ‘ZZ’ economy code, which is a special-case assignment for worldwide in the ISO3166 system.
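To make the delegated file concrete, here is a minimal sketch that reads one record, assuming the pipe-separated layout used by the RIR delegated statistics files (registry, economy code, resource type, start, value, date, status). The sample line and the names used are illustrative only.

```python
from typing import NamedTuple, Optional


class Delegation(NamedTuple):
    registry: str
    economy: str   # economy code of registration, not of network use
    rtype: str     # "ipv4", "ipv6" or "asn"
    start: str
    value: int     # address count (ipv4), prefix length (ipv6), AS count (asn)
    date: str
    status: str


def parse_delegated_line(line: str) -> Optional[Delegation]:
    """Return a Delegation for a record line, or None for anything else."""
    if line.startswith("#"):
        return None  # comment
    fields = line.strip().split("|")
    # Header and summary lines do not have a resource type in the third
    # column together with at least seven fields, so they are skipped.
    if len(fields) < 7 or fields[2] not in ("ipv4", "ipv6", "asn"):
        return None
    registry, economy, rtype, start, value, date, status = fields[:7]
    return Delegation(registry, economy, rtype, start, int(value), date, status)


# Illustrative record in the assumed layout.
record = parse_delegated_line("apnic|AU|ipv4|1.0.0.0|256|20110811|assigned")
summary = parse_delegated_line("apnic|*|ipv4|*|12345|summary")
print(record)
```

The economy column here says where the holder is registered, which is exactly why it cannot answer where the addresses are in use.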

Enter geofeed:. What geofeed: says is ‘let’s move this problem somewhere else’.

What is a geofeed entry?

The geofeed: entry in RPSL is a new field type. It has been proposed by Randy Bush, Massimo Candela, Warren Kumari and Russ Housley in the IETF’s Operations and Management Area Working Group (OPSAWG).

The idea is really very simple. A mechanism proposed by Google, and published as RFC 8805, specifies a format for self-publishing geolocation information for IP prefixes. It is a simple CSV format that can go down to the city and postcode level if you want. But, for most purposes, simply publishing the economy code is going to be sufficient.
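As a sketch of what consuming such a feed could look like, the code below parses a small feed and finds the most specific entry covering an address. It assumes the RFC 8805 column order (prefix, two-letter economy code, region, city, postal code), with empty trailing fields allowed and lines starting with # treated as comments. The sample prefixes come from documentation address space and the function names are hypothetical.

```python
import csv
import io
import ipaddress

# Illustrative feed content; not a real published geofeed.
SAMPLE_FEED = """\
# prefix,economy,region,city,postal
192.0.2.0/25,AU,AU-QLD,Brisbane,
192.0.2.128/25,NZ,,,
2001:db8::/32,SG,,,
"""


def parse_geofeed(text):
    """Parse geofeed CSV text into a list of dicts, skipping comments."""
    entries = []
    for row in csv.reader(io.StringIO(text)):
        if not row or row[0].lstrip().startswith("#"):
            continue
        # Pad short rows so missing trailing fields read as empty strings.
        prefix, economy, region, city, postal = ([f.strip() for f in row] + [""] * 5)[:5]
        entries.append({
            "prefix": ipaddress.ip_network(prefix),
            "country": economy.upper(),
            "region": region,
            "city": city,
            "postal": postal,
        })
    return entries


def locate(address, entries):
    """Return the entry with the longest matching prefix, or None."""
    ip = ipaddress.ip_address(address)
    matches = [
        e for e in entries
        if e["prefix"].version == ip.version and ip in e["prefix"]
    ]
    return max(matches, key=lambda e: e["prefix"].prefixlen, default=None)


entries = parse_geofeed(SAMPLE_FEED)
print(locate("192.0.2.200", entries))
```

Note that the second and third rows publish only an economy code, which matches the point that this is often all a consumer needs.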

This CSV data is published at a URL, but the problem remains: How do people know where to find the URL? Enter geofeed:. This marker can be used in the RIR Whois system, or in RDAP, to identify (by network prefix) where to find the geolocation data for that range.
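A consumer could discover the feed with something like the sketch below, which scans the text of a Whois object for the pointer. It assumes two spellings: a dedicated geofeed: attribute, and a remarks: line of the form "Geofeed https://..." for registries that have not yet added the attribute. Treat both patterns, the sample object and the URL as assumptions to check against the draft rather than as a reference implementation.

```python
import re

# Assumed spellings of the pointer; verify against the current draft text.
GEOFEED_ATTR = re.compile(r"^geofeed:\s*(https://\S+)\s*$", re.IGNORECASE)
GEOFEED_REMARK = re.compile(r"^remarks:\s*Geofeed\s+(https://\S+)\s*$", re.IGNORECASE)


def find_geofeed_url(rpsl_object):
    """Return the first geofeed URL found in an RPSL object, or None."""
    for line in rpsl_object.splitlines():
        for pattern in (GEOFEED_ATTR, GEOFEED_REMARK):
            match = pattern.match(line.strip())
            if match:
                return match.group(1)
    return None


# Hypothetical object; the prefix is documentation space and the URL is made up.
SAMPLE_OBJECT = """\
inetnum:  192.0.2.0 - 192.0.2.255
netname:  EXAMPLE-NET
country:  AU
remarks:  Geofeed https://example.com/geofeed.csv
"""

url = find_geofeed_url(SAMPLE_OBJECT)
print(url)
```

Only https URLs are accepted in this sketch, on the assumption that a feed fetched over an unauthenticated channel would be of little value.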

Additionally, the authors show how this can be a signed statement, using mechanisms such as the Resource Tagged Attestation (RTA) method proposed by APNIC in the IETF. This would permit the geolocation information for each prefix to be checked against an RPKI signature, showing that whoever published it holds the authority to declare geolocation information for that prefix.

What has this got to do with shepherds?

Shepherding is IETF jargon for the process of getting a document from ‘working group last call’, through Draft, to Published form as an RFC. Naturally, it is defined in an RFC: RFC 4858.

It’s not an especially arduous process, but it does require some basic formality for the document to proceed, including:

Checks that the authors have resolved concerns in the working group lists.
Evidence there is no intellectual property claim over the technology.
Evidence of conformance to the IETF document process, and outcomes.
And, for any (inevitable) queries which come up in the approval and publication process, the shepherd acts as a liaison making sure both sides of the problem (the approvers and the document editors/authors) get reconciliation of the open issues.

Personally, I think it’s more like herding kittens than sheep, but we don’t have a term for a kitten herder, so shepherd will have to do.

I was asked if I could undertake this part of the process. Because my role as APNIC’s Product Manager for Registry includes Whois and RDAP, I have a shared interest in solving the geolocation problem. I’m motivated to help for the good of this work, and because it will reduce traffic through the RIR helpdesks and services by making it easier for people to find where things are on the Internet.

Can I trust a geofeed result?

This is a good question. Will geofeed: data be better or worse than existing sources of geolocation?

Geofeed: data is asserted by the actual network block operator, and they can prove it if they choose to use the RPKI signing method documented in the draft. It’s a strong assertion that they did in fact say it, which I think sets a high bar. However, we need to look at this in more detail. The question actually has two forms:

Who said it?
Is it correct?

There are times, such as when access to content is restricted by intellectual property rights (IPR), where misdirection might pay for a network operator. Without getting too specific here, I have been told that one lucky provider in the Caribbean saw a large up-tick in hotel registrations on their wireless network when it was discovered that, unlike competitor networks, it was able to access US domestic TV content. This is a good example of when it might pay for a network operator to be misleading with the geolocation information.

However, operators who pull stunts like that would have to contend with a lot of drawbacks, such as:

More users would be mis-identified as out-of-region and sent to the wrong content sources.
Banking and government service portals might think those users are coming from overseas.
Local gamer networks would de-preference them, and players would start seeing slower, longer-ping-time paths being offered up.

In the end, the risk of deception is there, but is a contained risk.

The very real side of the problem is that the existing geolocation is more about the entity and less about the network itself.

For example, there are Singapore and Hong Kong registered entities providing cloud services to entities in the USA. Not being able to formally declare the location of the network properly is a huge barrier to operators.

Of course, the RIR data is only one aspect of how geolocation is formed. MaxMind and other sources use their own methods of data collection, but are believed to use geofeed: sources when found. Google has said it uses them, and over 1,000 feeds are already registered worldwide from the pre-standard model. It would be net-beneficial to aid in the discovery of these feeds.

Other methods of geolocation include network triangulation, and many content providers with large worldwide BGP presence are able to calculate local network points for an IP range very well. In some ways, geofeed: data would help equalize for this big player benefit since the declaration can be used by providers with fewer points of presence, and weaker models of closest path to a given source address.

On balance, I think you can, and should, trust a geofeed:. I think it’s better to let the delegate of an address tell you where it is than to rely on the information registered when the address was delegated.

What do you think?

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.