Tier 1 and aspiring Tier 2 providers interconnect only in large metropolitan areas, due to commercial incentives and politics. They often won’t peer with smaller providers, because why peer with a potential customer? As a result, it’s entirely likely that traffic between two parties in, for example, Thessaloniki is sent to Frankfurt or Milan and back.
One possible antidote to this is to connect to a local Internet Exchange Point (IXP). Not all ISPs have access to large metropolitan data centres where larger IXPs have a Point of Presence (PoP), and it doesn’t help that the data centre operator is happy to charge a substantial amount of money each month, just for the privilege of having a passive fibre cross-connect to the exchange.
Many IXPs these days ask for per-month port costs and meter the traffic with policers and rate limiters, such that the total cost of peering starts to exceed what one might pay for transit, especially at low volumes, which further exacerbates the problem. Bah.
This is an unfortunate market effect (the race to the bottom), where transit providers are continuously lowering their prices to compete. While transit providers can compensate to some extent with economies of scale, at some point they are mostly all of equal size, and thus the only thing flexible is the quality of service.
The benefit of using an IXP is to reduce the portion of an ISP’s (and Content Delivery Network’s (CDN’s)) traffic that must be delivered via their upstream transit providers, thereby reducing both the average per-bit delivery cost and the end-to-end latency as seen by their users or customers.
Furthermore, the increased number of paths available through the IXP improves routing efficiency and fault tolerance, and it avoids traffic going the scenic route to a large hub like Frankfurt, London, Amsterdam, Paris, or Rome if it could very well remain local.
IPng Networks really believes in an open and affordable Internet, and I would like to do my part in ensuring the Internet stays accessible for smaller parties.
Smöl IXPs
One notable problem with small exchanges, like for example [FNC-IX] in the Paris metro, or [CHIX-CH], [Community IX] and [Free-IX] in the Zurich metropolitan area, is that they are, well, smöl (small). They may be cheaper to connect to, in some cases even free, but they don’t have a sizeable membership, which means that there is inherently less traffic flowing, which in turn makes it less appealing for prospective members to connect to.
At IPng, I have partnered with a few super cool ISPs and carriers to offer a free IX platform. Just to head the main question off at the pass: Free here actually does mean free (Gratis), a gift to the community that does not cost money. It also more philosophically wants to be ‘Free as in open, and transparent’ or [Libre].
Two examples are:
- [FreeIX: Switzerland] with PoPs at STACK GEN01 Geneva, NTT Zurich and Bancadati Lugano.
- [FreeIX: Greece] with PoPs at TISparkle in Athens and Balkan Gate in Thessaloniki.
… but there are actually quite a few out there once you start looking :).
Growing smöl IXPs
Some IXPs break through the magical 1Tbps barrier (and get a courtesy callout on X from Dr King), but many remain smöl. Perhaps it’s time to break the chicken-and-egg problem. What if there was a way to interconnect these IXPs?
Let’s take, for example, the FreeIX in Greece that was announced at the Greek Network Operators Group (GRNOG) 16 in Athens on 19 April 2024. This IXP initially targets Athens and Thessaloniki, with 2x100G between the two cities. Members can connect to either site for the cost of only a cross-connect. The 1G / 10G / 25G ports will be gratis. But I will be connecting one very special member to FreeIX Greece, AS50869.
FreeIX Remote
Here’s what I am going to build. The FreeIX Remote project offers an outreach infrastructure that connects to IXPs and Private Network Interconnects (PNIs), and allows members to benefit from that in the following way:
- FreeIX uses AS50869 to peer with any network operator who is available at public IXPs or using PNIs. It looks like a normal service provider in this regard. It will connect to IXPs and learn a bunch of routes.
- FreeIX members can join the program, after which they are granted certain propagation permissions by FreeIX at the point where they have a BGP session with AS50869. The prefixes learned during these member sessions are marked as such and will be allowed to propagate. Members will receive some or all learned prefixes from AS50869.
- FreeIX members can set fine-grained BGP communities to determine which of their prefixes are propagated and at which locations.
Members at smaller IXPs greatly benefit from this type of outreach, by receiving large portions of the public Internet directly at their preferred peering location. Similarly, the FreeIX Remote routers will carry their traffic to these remote IXPs.
Detailed design
Peer types
There are two types of BGP neighbor adjacency:
- Members: These are (IP address, AS number) tuples that FreeIX has explicitly configured. Learned prefixes are added to the as-set AS50869:AS-MEMBERS. Members receive all prefixes from FreeIX, each annotated with BGP informational communities, and members can drive certain behaviour with BGP action communities.
- Peers: These are all other entities with whom FreeIX has an adjacency at public IXPs or PNIs. Peers receive some (or all) member prefixes from FreeIX and cannot drive any behaviour with communities. For IXPs and peers, AS50869 looks like a completely normal ISP, advertising subsets of the customer AS cone from AS50869:AS-MEMBERS at each IXP.
BGP sessions with members use strict ingress filtering using bgpq4, and the prefixes learned on them will be tagged with a set of informational BGP communities, such as where the prefix was learned, and what propagation permissions it received (for example, at which IXPs it will be allowed to be announced). Of course, prefixes that are RPKI invalid will be dropped, while valid and unknown prefixes will be accepted. Members are granted permissions by FreeIX, which determines where their prefixes will be announced by AS50869. Further, members can perform optional actions using BGP communities at their ingress point, to inhibit announcements to a certain peer or at a given exchange point.
Peers, on the other hand, are not granted any permissions, and all action BGP communities will be stripped from the prefixes learned from them. Informational communities will still be tagged on learned prefixes. Two things happen when these prefixes are passed on to members. Firstly, members will be offered only those prefixes for which they have permission. In other words, I will create a configuration file that says member AS8298 may receive prefixes learned from Frys-IX. Secondly, even for those prefixes that are advertised, the member AS8298 can use the informational communities to further filter what they accept from FreeIX Remote AS50869.
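The member ingress rules above can be sketched in a few lines of Python. This is only an illustration, not the actual FreeIX implementation: the names `Rpki` and `accept_member_prefix` are made up for this example, the allowed set stands in for a prefix list generated with bgpq4, and matching is assumed to be exact (no more-specifics).

```python
from enum import Enum


class Rpki(Enum):
    """RPKI origin validation states."""
    VALID = "valid"
    UNKNOWN = "unknown"
    INVALID = "invalid"


def accept_member_prefix(prefix: str, allowed: set[str], rpki: Rpki) -> bool:
    """Return True if a prefix learned on a member session is accepted.

    `allowed` is a hypothetical stand-in for the member's strict prefix
    filter; exact-match only is an assumption of this sketch.
    """
    if prefix not in allowed:
        return False  # not in the member's filter: reject
    # RPKI invalid is dropped; valid and unknown are both accepted.
    return rpki is not Rpki.INVALID
```

For instance, with `allowed = {"192.0.2.0/24"}`, that prefix is accepted when its RPKI state is valid or unknown, and rejected when it is invalid.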
BGP classic communities
Members are allowed to set the following legacy action BGP communities for coarse-grained distribution of their prefixes through the FreeIX network.
- (50869,0): do not announce anywhere
- (50869,666) or (0,666): blackhole everywhere (can be on any more specific from the member’s AS-SET)
- (50869,3041): prepend once everywhere
- (50869,3042): prepend twice everywhere
- (50869,3043): prepend three times everywhere
Peers, on the other hand, are not allowed to set any communities, so all classic BGP communities from them are stripped on ingress.
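To make the classic communities concrete, here is a minimal Python sketch that turns the set of classic communities on a prefix into a coarse-grained decision. The function name and the returned dictionary are hypothetical, and the rule that the largest prepend wins when several prepend communities are present is my assumption; the text above does not specify it.

```python
def classic_actions(communities: set[tuple[int, int]], is_member: bool) -> dict:
    """Interpret classic (ASN, value) communities for one learned prefix."""
    if not is_member:
        # Peers may not set communities: everything is stripped on ingress.
        communities = set()

    prepends = 0
    for value, count in ((3041, 1), (3042, 2), (3043, 3)):
        if (50869, value) in communities:
            prepends = max(prepends, count)  # assumption: largest prepend wins

    return {
        "announce": (50869, 0) not in communities,
        "blackhole": bool({(50869, 666), (0, 666)} & communities),
        "prepends": prepends,
    }
```

A member sending `(50869, 3042)` would get two prepends everywhere, while the same community arriving from a peer has no effect.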
BGP large communities
FreeIX Remote will use three types of BGP large communities, which each serve a distinct purpose:
- Informational: These communities are set by the FreeIX router when learning a prefix. They cannot be set by peers or members and will be stripped on ingress. They will be sent to both members and peers, allowing operators to choose which prefixes to learn based on their origin details, like which economy or IXP they were learned at.
- Permission: These communities are also set by FreeIX operators when learning a prefix (such as on the ingress router). They cannot be set by peers or members and will be stripped on ingress. The permission communities determine where FreeIX will allow the prefix to propagate. They will be stripped on egress.
- Action: Members can further steer announcements based on the permissions by sending certain action communities to FreeIX. These actions cannot be sent by peers, but in certain cases, they can be set by FreeIX operators on ingress. Similarly to the permission communities, all action communities will be stripped on egress.
Regular peers of AS50869 at IXPs and PNIs will not be able to set any communities, so all large BGP communities from them are stripped on ingress.
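The three community types and their stripping rules can be sketched as follows. This is an illustrative Python model, not router configuration; the helper names are hypothetical. It assumes the type is encoded in the leading two digits of the second field (10XX, 20XX, 30XX), and that large communities from other ASNs on member sessions are left alone, which the text does not state.

```python
FREEIX_ASN = 50869

KINDS = {10: "informational", 20: "permission", 30: "action"}


def kind(community: tuple[int, int, int]) -> str:
    """Classify a large community as informational, permission, action or other."""
    asn, function, _value = community
    if asn != FREEIX_ASN:
        return "other"
    return KINDS.get(function // 100, "other")


def strip_on_ingress(communities: set, is_member: bool) -> set:
    """Peers lose all large communities; members lose informational and permission ones."""
    if not is_member:
        return set()
    return {c for c in communities if kind(c) not in ("informational", "permission")}


def strip_on_egress(communities: set) -> set:
    """Permission and action communities never leave the FreeIX network."""
    return {c for c in communities if kind(c) not in ("permission", "action")}
```

After ingress stripping, the FreeIX router would then add its own informational and permission communities to the prefix.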
Informational communities
When FreeIX routers learn prefixes, they will annotate them with certain communities. For example, the router at Amsterdam NIKHEF (which is router #1, economy #2), when learning a prefix at FrysIX (which is IXP #1152), will set the following BGP large communities:
- (50869,1010,1): Informational (10XX), Router (1010), vpp0.nlams0.free-ix.net (1)
- (50869,1020,2): Informational (10XX), Economy (1020), Netherlands (2)
- (50869,1030,1152): Informational (10XX), IXP (1030), PeeringDB IXP for FrysIX (1152)
When propagating these prefixes to neighbors (both members and peers), these informational communities can be used to determine local policy, for example by setting a different localpref or dropping prefixes from a certain location. Informational communities can be read, but they can’t be set by peers or members — they are always cleared by FreeIX routers when learning prefixes, and as such the only routers that will set them are the FreeIX ones.
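As an example of such a local policy on the receiving side, here is a small Python sketch. The function name, its parameters and the local preference values of 100 and 200 are all hypothetical choices for illustration; a real deployment would express this in the routing daemon's own filter language.

```python
from typing import Optional

FREEIX_ASN = 50869


def choose_local_pref(
    communities: set,
    preferred_economies: frozenset = frozenset(),
    unwanted_ixps: frozenset = frozenset(),
    default: int = 100,
) -> Optional[int]:
    """Return a local preference for a received prefix, or None to drop it."""
    for asn, function, value in communities:
        # 1030 carries the PeeringDB IXP id where the prefix was learned.
        if asn == FREEIX_ASN and function == 1030 and value in unwanted_ixps:
            return None
    for asn, function, value in communities:
        # 1020 carries the economy where the prefix was learned.
        if asn == FREEIX_ASN and function == 1020 and value in preferred_economies:
            return 200
    return default
```

With `preferred_economies={2}`, a prefix tagged `(50869,1020,2)` gets the higher preference, and adding `unwanted_ixps={1152}` would drop the FrysIX-learned prefix from the example above.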
Permission communities
FreeIX maintains a list of permissions per member. When members announce their prefixes to FreeIX routers, these permission communities are set. They determine what the member is allowed to do with FreeIX propagation, notably which routers, economies, IXPs, and PNIs the member will be allowed to propagate to.
Usually, member prefixes are allowed to propagate everywhere, so the following communities might be set by the FreeIX router on ingress:
- (50869,2010,0): Permission (20XX), Router (2010), everywhere (0)
- (50869,2020,0): Permission (20XX), Economy (2020), everywhere (0)
- (50869,2030,0): Permission (20XX), IXP (2030), everywhere (0)
- (50869,2031,0): Permission (20XX), PNI (2031), everywhere (0)
If the member prefixes are allowed to propagate only to certain places, the ‘everywhere’ communities will not be set, and instead lists of communities with finer-grained permissions can be used, for example:
- (50869,2010,2): Permission (20XX), Router (2010), vpp0.grskg0.free-ix.net (2)
- (50869,2020,3): Permission (20XX), Economy (2020), Greece (3)
- (50869,2030,60): Permission (20XX), IXP (2030), PeeringDB IXP for SwissIX (60)
- (50869,2031,8298): Permission (20XX), PNI (2031), IPng Networks GmbH (AS8298)
Permission communities can’t be set by peers, nor by members — they are always cleared by FreeIX routers when learning prefixes and are configured explicitly by FreeIX operators.
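One way to picture how permission communities could gate an announcement is the Python sketch below. How the dimensions combine is not spelled out above, so this rests on an assumption of mine: the egress point must be permitted on the router, on the economy, and on the IXP or PNI in question, where value 0 means everywhere. The `Egress` class and `permitted` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

FREEIX_ASN = 50869


@dataclass(frozen=True)
class Egress:
    """Where an announcement would leave the network (hypothetical model)."""
    router: int
    economy: int
    ixp: Optional[int] = None  # PeeringDB IXP id for a public exchange
    pni: Optional[int] = None  # neighbor ASN for a private interconnect


def permitted(communities: set, egress: Egress) -> bool:
    """Assumed rule: every applicable dimension must match, or be 0 (everywhere)."""

    def ok(function: int, value: int) -> bool:
        return ((FREEIX_ASN, function, 0) in communities
                or (FREEIX_ASN, function, value) in communities)

    if not ok(2010, egress.router) or not ok(2020, egress.economy):
        return False
    if egress.ixp is not None:
        return ok(2030, egress.ixp)
    if egress.pni is not None:
        return ok(2031, egress.pni)
    return False  # neither an IXP nor a PNI: nothing to announce to
```

Under this assumption, a prefix carrying the four 'everywhere' communities may leave at any egress point, while one carrying the finer-grained example set could not be announced at FrysIX (1152) from router #1.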
Action communities
Based on the permission communities, zero or more egress routers, economies and IXPs are eligible for AS50869 to propagate member prefixes to its peers. Members can define very fine-grained action communities to further tweak which prefixes propagate on which routers, in which economies and towards which IXPs and PNIs:
- (50869,3010,3): Inhibit Action (30XX), Router (3010), vpp0.gratt0.free-ix.net (3)
- (50869,3020,1): Inhibit Action (30XX), Economy (3020), Switzerland (1)
- (50869,3030,1308): Inhibit Action (30XX), IXP (3030), PeeringDB IXP for LS-IX (1308)
- (50869,3031,8298): Inhibit Action (30XX), PNI (3031), IPng Networks GmbH (AS8298)
Further actions can be placed on a per-remote-neighbor basis:
- (50869,3040,13030): Inhibit Action (30XX), AS (3040), Init7 (AS13030)
- (50869,3041,6939): Prepend Action (30XX), Prepend Once (3041), Hurricane Electric (AS6939)
- (50869,3042,12859): Prepend Action (30XX), Prepend Twice (3042), BIT BV (AS12859)
- (50869,3043,8283): Prepend Action (30XX), Prepend Three Times (3043), Coloclue (AS8283)
Peers cannot set these actions, as all action communities will be stripped on ingress. Members can set these action communities on their sessions with FreeIX routers; in some cases, however, they may also be set by FreeIX operators when learning prefixes.
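The action communities above could be evaluated per egress neighbor roughly as in this Python sketch. It is not the actual implementation: the function name and signature are hypothetical, and I assume that any matching inhibit community suppresses the announcement and that the largest prepend wins if several apply.

```python
from typing import Optional

FREEIX_ASN = 50869


def action_for_neighbor(
    communities: set,
    router: int,
    economy: int,
    neighbor_asn: int,
    ixp: Optional[int] = None,
    pni: Optional[int] = None,
) -> Optional[int]:
    """Return the number of prepends (0 to 3), or None if inhibited."""
    inhibit = {
        (FREEIX_ASN, 3010, router),
        (FREEIX_ASN, 3020, economy),
        (FREEIX_ASN, 3040, neighbor_asn),
    }
    if ixp is not None:
        inhibit.add((FREEIX_ASN, 3030, ixp))
    if pni is not None:
        inhibit.add((FREEIX_ASN, 3031, pni))
    if inhibit & communities:
        return None  # assumption: any matching inhibit wins

    prepends = 0
    for function, count in ((3041, 1), (3042, 2), (3043, 3)):
        if (FREEIX_ASN, function, neighbor_asn) in communities:
            prepends = max(prepends, count)  # assumption: largest prepend wins
    return prepends
```

A member tagging a prefix with `(50869,3040,13030)` and `(50869,3041,6939)` would see it withheld from Init7 and prepended once towards Hurricane Electric, with all other neighbors unaffected.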
What’s next
Perhaps this interaction between informational, permission and action BGP communities gives you an idea of how such a network may operate. It’s somewhat different to a classic Transit provider, in that AS50869 will not carry a full table. It’ll merely provide a form of partial transit from member A at IXP #1, to and from all peers that can be found at IXPs #2-#N. Makes the mind boggle? Don’t worry, we’ll figure it out together. 🙂
In an upcoming article I’ll detail the programming work that goes into implementing this complex peering policy in Bird2, driving Vector Packet Processing (VPP) routers (duh), with an Interior Gateway Protocol (IGP) that is IPv4-less, because, at this point, I may as well put my money where my mouth is.
If you’re interested in this kind of stuff, take a look at the IPng Networks AS8298 Routing Policy. Similar to that one, this one will use a combination of functional programming, templates, and clever expansions to make a customized per-member and per-peer configuration based on a YAML input file, which dictates which member and which prefix is allowed to go where.
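To give a flavour of what such an expansion might look like, here is a deliberately tiny Python sketch that turns one member entry into permission communities. The schema is purely hypothetical (the real YAML format is not described here), a plain dictionary stands in for the parsed YAML, and a missing key is assumed to mean 'everywhere'.

```python
FREEIX_ASN = 50869

# Hypothetical mapping from config keys to the permission community functions.
FUNCTIONS = {"router": 2010, "economy": 2020, "ixp": 2030, "pni": 2031}


def permission_communities(entry: dict) -> list:
    """Expand one member entry into a list of permission large communities."""
    out = []
    for key, function in FUNCTIONS.items():
        values = entry.get(key) or [0]  # assumption: absent or empty means everywhere (0)
        out.extend((FREEIX_ASN, function, value) for value in values)
    return out


# Hypothetical member entry, as it might look after parsing the YAML input.
member = {"router": [2], "economy": [3], "ixp": [60], "pni": [8298]}
for community in permission_communities(member):
    print(community)
```

An empty entry expands to the four 'everywhere' communities, which matches the usual case of member prefixes being allowed to propagate everywhere.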
First, I need to get a replacement router for the Thessaloniki router, which will run VPP of course. My buddy Antonis noticed there are CPU and/or DDR errors on that chassis, so it may need to be RMAd. But once it’s operational, I will start by deploying one instance in Amsterdam NIKHEF, and another in Thessaloniki Balkan Gate, with a 100G connection between them, graciously provided by LANCOM. Just look at that FD.io hound runnnnn!!
Pim van Pelt (PBVP1-RIPE) began his career as a network engineer in the Netherlands, where he worked for Intouch, Freeler, and BIT. He helped raise awareness for IPv6, for example by launching it at AMS-IX back in 2001. He also operated SixXS, a global IPv6 tunnel broker, from 2001 through to its sunset in 2017. Since 2006, Pim has worked as a Distinguished SRE at Google in Zurich, Switzerland.
Originally posted on IPng Networks blog.