How to be a better network service provider peer

By on 12 Jan 2024

Internet connectivity is the result of many peering sessions between different Autonomous System Numbers (ASNs). My employer, RETN, is an international Network Service Provider (NSP) that peers with many ASNs. This post shares some tips on peering evaluation and network operations.

Why peer?

There are several benefits associated with peering, especially for Internet Service Providers (ISPs) and large networks:

  1. Cost saving: Peering reduces the traffic you send over your upstream bandwidth.
  2. Latency: A direct connection to your peers can improve network quality and reduce the hop count.
  3. Control: Peering gives you direct routing control for traffic engineering. For example, your customers can use BGP communities to control which peers receive their announcements.

Evaluate your potential peers

Establishing a peering relationship calls for thorough planning and discussion beforehand. Here are some common considerations:

  • Company peering policy — Each ASN usually has its own peering policy and requirements such as traffic level, traffic ratio, port size, and so on.
  • Flow data — Flow data can hint at how much traffic you would exchange with your potential peers. With that, you can estimate how much you will benefit from forming a peering.
  • PeeringDB — PeeringDB can also provide some additional information like the number of prefixes, traffic ratios, common peering facility, peering policy type, and so on.
  • ASRank — If you want to understand your target peers better, you can also check the data at ASRank, which is an excellent evaluation tool.
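As a rough illustration of the flow data point, the sketch below sums sampled flow bytes per remote ASN and converts them to an average rate, so candidates can be ranked against a traffic threshold. The flow records, the 300-second window, and the 5 Mbit/s threshold are all made-up assumptions; real flow exports would also need sampling-rate correction, which is left out here.

```python
from collections import defaultdict

# Hypothetical flow records: (remote ASN, bytes seen in the sample window).
# The ASNs come from the range reserved for documentation.
FLOWS = [
    (64500, 1_200_000_000),
    (64501, 300_000_000),
    (64500, 800_000_000),
    (64502, 50_000_000),
]


def traffic_by_asn(flows):
    """Sum the observed bytes per remote ASN."""
    totals = defaultdict(int)
    for asn, nbytes in flows:
        totals[asn] += nbytes
    return dict(totals)


def average_mbps(total_bytes, window_seconds):
    """Convert a byte count over a time window into an average rate in Mbit/s."""
    return total_bytes * 8 / window_seconds / 1_000_000


def peering_candidates(flows, window_seconds, min_mbps):
    """Return (asn, mbps) pairs at or above the threshold, busiest first."""
    totals = traffic_by_asn(flows)
    rates = {asn: average_mbps(b, window_seconds) for asn, b in totals.items()}
    ranked = sorted(rates.items(), key=lambda item: item[1], reverse=True)
    return [(asn, rate) for asn, rate in ranked if rate >= min_mbps]


if __name__ == "__main__":
    for asn, rate in peering_candidates(FLOWS, window_seconds=300, min_mbps=5.0):
        print(f"AS{asn}: {rate:.1f} Mbit/s")
```

The resulting ranking is only a first filter. It still has to be checked against each candidate's peering policy and traffic ratio requirements.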

ASRank

ASRank is a long-term research project from the Center for Applied Internet Data Analysis (CAIDA), based at the San Diego Supercomputer Center in the US. It uses data from the RouteViews project and RIPE NCC to infer the relationships between ASNs. There, you can see information such as the number of prefixes, addresses, and ASNs observed from your ASN, and the number of ASNs inside your customer cone (Figure 1).

Figure 1 — CAIDA’s AS ranking system.

Figure 2 shows how a change in relationship affects the customer cone of your ASN. For example, when the relationship between A and B changes from provider-customer to peer, ASN A (the provider) loses the customer cone under ASN B. Whenever the relationship between ASNs changes, the size of the customer cone changes accordingly.

Figure 2 — The effects of changing the link between A and B to a peering link.
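To make the customer cone idea concrete, here is a minimal sketch that walks provider-to-customer links only and shows how the cone of A shrinks when the A-B link becomes a peering link. The topology (A, B, C, D, E) is invented for illustration, and the cone is assumed to include the ASN itself. CAIDA infers relationships from observed BGP paths, which this toy graph does not attempt to reproduce.

```python
def customer_cone(asn, customers):
    """Return every ASN reachable from `asn` over provider-to-customer links.

    `customers` maps a provider to the set of its direct customers.
    The starting ASN is counted as part of its own cone (an assumption here).
    """
    cone = set()
    stack = [asn]
    while stack:
        current = stack.pop()
        if current in cone:
            continue  # already visited, also guards against loops
        cone.add(current)
        stack.extend(customers.get(current, ()))
    return cone


# Hypothetical topology: A is a provider of B and C, and B is a provider of D and E.
before = {"A": {"B", "C"}, "B": {"D", "E"}}

# Same topology after A and B become peers: the A -> B customer link disappears.
after = {"A": {"C"}, "B": {"D", "E"}}

if __name__ == "__main__":
    print(sorted(customer_cone("A", before)))  # A still reaches B, D and E
    print(sorted(customer_cone("A", after)))   # only A and C remain
```

B's own cone is unchanged by the new relationship. Only the provider side of the former link loses ASNs.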

How can ASRank help?

The ASRank measurement provides an objective way to evaluate your peers. You can use it to quantify and compare the number of routed prefixes, observed ASNs, and total IP addresses visible through BGP, with transparency about how these figures are derived. ASRank also publishes an open dataset that you can integrate with your own systems for further analysis. The source data is updated monthly.

It is worth mentioning that the figures for content-related ASNs are usually lower because they seldom provide transit service to other ASNs.

Secure your network — RPKI

Resource Public Key Infrastructure (RPKI) is currently one of the industry's most discussed topics. With RPKI, you can validate the origin of a prefix against Route Origin Authorizations (ROAs) and filter announcements from any ASN that is not authorized to originate that prefix.
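The validation step can be sketched roughly as follows, using the three outcomes described in RFC 6811 (valid, invalid, not-found). This is a simplified illustration, not how any particular validator or router implements it. The `Vrp` tuple, the function name, and the sample prefixes and ASNs (taken from documentation ranges) are assumptions made for the example.

```python
import ipaddress
from typing import NamedTuple, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


class Vrp(NamedTuple):
    """A validated ROA payload: prefix, maximum length, authorized origin."""
    prefix: Network
    max_length: int
    origin_asn: int


def validate_origin(route_prefix, origin_asn, vrps):
    """Classify a route as 'valid', 'invalid' or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for vrp in vrps:
        if route.version != vrp.prefix.version:
            continue  # never compare IPv4 against IPv6
        if not route.subnet_of(vrp.prefix):
            continue  # this VRP does not cover the route
        covered = True
        if vrp.origin_asn == origin_asn and route.prefixlen <= vrp.max_length:
            return "valid"
    # Covered by at least one VRP but matched by none: invalid.
    return "invalid" if covered else "not-found"


# Hypothetical ROA: AS64500 may originate 203.0.113.0/24, and nothing longer.
VRPS = [Vrp(ipaddress.ip_network("203.0.113.0/24"), 24, 64500)]

if __name__ == "__main__":
    print(validate_origin("203.0.113.0/24", 64500, VRPS))  # authorized origin
    print(validate_origin("203.0.113.0/25", 64500, VRPS))  # longer than max_length
    print(validate_origin("203.0.113.0/24", 64501, VRPS))  # wrong origin ASN
    print(validate_origin("192.0.2.0/24", 64500, VRPS))    # no covering ROA
```

A network that filters on this result would typically drop the invalid routes and keep the valid and not-found ones, since most prefixes in some regions still have no ROA at all.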

Measurements from the National Institute of Standards and Technology (NIST) show RPKI uptake increasing. Currently, around 49% of prefixes under APNIC are covered by ROAs, which means roughly as many prefixes still have no ROA.

Figure 3 — IPv4 RPKI ROV history of unique prefix origin pairs in the APNIC region over time. Source: NIST.
Figure 4 — IPv4 RPKI ROV history of unique prefix origin pairs in the APNIC region over time. Source: NIST.

Let’s use the example of South Korea.

South Korea lags behind nearby economies, with only around 1.6% of its prefixes currently protected by ROAs. Additionally, there is almost no filtering based on Route Origin Validation (ROV).

Figure 5 — ROAs in South Korea compared to some of its neighbours. Source.
Figure 6 — ROV filtering in South Korea compared to some of its neighbours. Source.

The KlaySwap incident

Neglecting security best practices can lead to hijacks like the KlaySwap incident. In 2022, South Korean crypto platform KlaySwap had USD 1.9M worth of digital assets stolen within just a few hours. The attackers did not target KlaySwap's own servers. Instead, they used a classic BGP hijacking technique.

Kakao Corp (AS38099) is a small BGP network that peers with one other network and has one upstream carrier. Typically, Kakao Corp only announces a /21 prefix to the public Internet. The attackers exploited this by announcing a more specific /24 route to the public Internet from another ASN.

Routers follow a simple, well-defined rule: when several routes cover a destination, they forward along the most specific one (longest-prefix match). Zayo, Kakao Corp's upstream carrier, accepted this route and began announcing it to the public Internet. Without any ROAs for these prefixes, other carriers were not able to filter the announcement. Together, these issues meant that end-user traffic was driven to the attackers' server.

Figure 7 — KlaySwap BGP hijack overview. Source.
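The reason a more specific announcement wins can be shown with a small longest-prefix-match sketch. The prefixes below are private addresses chosen for illustration, not the ones involved in the incident, and the labels in the table are hypothetical.

```python
import ipaddress


def best_route(destination, rib):
    """Return the (prefix, label) entry with the longest prefix covering the address.

    `rib` maps networks to a label describing who announced them.
    Returns None when no route covers the destination.
    """
    address = ipaddress.ip_address(destination)
    matches = [(net, label) for net, label in rib.items() if address in net]
    if not matches:
        return None
    return max(matches, key=lambda item: item[0].prefixlen)


# Hypothetical table: the legitimate /21 plus a hijacker's more specific /24.
RIB = {
    ipaddress.ip_network("10.10.0.0/21"): "legitimate origin",
    ipaddress.ip_network("10.10.3.0/24"): "hijacker",
}

if __name__ == "__main__":
    # Inside the hijacked /24: the more specific route is selected.
    print(best_route("10.10.3.7", RIB))
    # Elsewhere in the /21: the legitimate route is still used.
    print(best_route("10.10.5.7", RIB))
```

Only destinations inside the more specific prefix are diverted, which is why such hijacks can stay unnoticed for the rest of the address block.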

My objective here is not to blame anyone who hasn't implemented these security measures but to convince everyone who hasn't implemented RPKI to take action. RETN deployed RPKI and started route filtering three years ago. Before that, we filtered prefixes based on route object entries only. At the time, the other network engineers on my team and I questioned whether these changes were necessary…

However, three years later, I can say it was the right decision. ROA and RPKI validation has really helped to minimize the risk of being hijacked and of announcing something incorrectly. It also saves us the time we would otherwise spend dealing with BGP hijacks, because no one can mitigate such problems immediately. Troubleshooting and taking the appropriate action manually takes a lot of time.

Beyond my own experience, research shows that 75% of traffic can still reach the correct destination during a BGP hijack and that the propagation of invalid routes can be reduced by half to two-thirds.

Anti-spoofing

Another useful measure is anti-spoofing. It is not mentioned often enough, but it is a good technique for operators. RFC 2827 describes how to filter traffic in the ingress direction, and around 82% of IPv4 blocks are now not spoofable (Figure 8). Put simply, network ingress filtering is a way to validate inbound packets and make sure they carry a valid source IP address.

Figure 8 — Spoofable IPv4 blocks (excluding NAT).

Major router vendors such as Juniper, Cisco, and Huawei all support unicast Reverse Path Forwarding (uRPF) filtering. For most operators there is no additional cost, and setting it up can save bandwidth and reduce the amount of malicious traffic.
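The check behind strict-mode reverse path filtering can be sketched as follows: a packet is accepted only if the best route back to its source address points out of the interface it arrived on. This is a conceptual model, not vendor configuration. The forwarding table, prefixes, and interface names are hypothetical.

```python
import ipaddress


def strict_rpf_accepts(source_ip, ingress_interface, fib):
    """Return True if the best route to the source uses the ingress interface.

    `fib` is a list of (network, outgoing_interface) tuples.
    """
    address = ipaddress.ip_address(source_ip)
    best = None
    for network, interface in fib:
        # Keep the longest prefix that covers the source address.
        if address in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, interface)
    return best is not None and best[1] == ingress_interface


# Hypothetical table: one customer prefix on a customer port, default via upstream.
FIB = [
    (ipaddress.ip_network("198.51.100.0/24"), "customer-port"),
    (ipaddress.ip_network("0.0.0.0/0"), "upstream-port"),
]

if __name__ == "__main__":
    # Customer sends from its own prefix: accepted.
    print(strict_rpf_accepts("198.51.100.10", "customer-port", FIB))
    # Customer sends with a source address it does not own: dropped.
    print(strict_rpf_accepts("203.0.113.5", "customer-port", FIB))
```

Strict mode assumes symmetric routing towards the source, so it fits single-homed customer ports better than multihomed or transit-facing ones, where a looser mode is usually the safer choice.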

Flowspec

To deal with Distributed Denial-of-Service (DDoS) attacks, we always use remotely triggered blackholes as the simplest mitigation. Of course, multiple products allow you to filter out possible attacks, but they usually cost much more.

BGP Flow Specification (Flowspec) was introduced in RFC 5575 in 2009 and is quite simple to implement. It lets you manage traffic by encoding flow-matching rules as BGP Network Layer Reachability Information (NLRI), enabling routers to share specific flow details and respond accordingly.

Figure 9 — Flowspec overview. Source.
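Conceptually, a Flowspec rule pairs a set of match components (such as destination prefix, IP protocol, and destination port) with an action such as a traffic-rate limit, where a rate of zero means discard. The sketch below models only that idea. The class names and sample rule are hypothetical, it evaluates rules in list order instead of following the RFC's ordering rules, and it does not encode or decode real BGP messages.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowRule:
    destination: ipaddress.IPv4Network
    protocol: Optional[int] = None          # IP protocol number, None matches any
    destination_port: Optional[int] = None  # None matches any port
    rate_limit_bps: float = 0.0             # 0 means discard


@dataclass(frozen=True)
class Packet:
    destination: str
    protocol: int
    destination_port: int


def matches(rule, packet):
    """A packet matches only if every component set on the rule matches."""
    if ipaddress.ip_address(packet.destination) not in rule.destination:
        return False
    if rule.protocol is not None and rule.protocol != packet.protocol:
        return False
    if rule.destination_port is not None and rule.destination_port != packet.destination_port:
        return False
    return True


def action_for(packet, rules):
    """Return 'discard', 'rate-limit' or 'forward' for a packet (first match wins)."""
    for rule in rules:
        if matches(rule, packet):
            return "discard" if rule.rate_limit_bps == 0 else "rate-limit"
    return "forward"


# Hypothetical rule: drop UDP (protocol 17) to port 123 towards one attacked host.
RULES = [FlowRule(ipaddress.ip_network("203.0.113.10/32"), protocol=17, destination_port=123)]

if __name__ == "__main__":
    print(action_for(Packet("203.0.113.10", 17, 123), RULES))  # attack traffic
    print(action_for(Packet("203.0.113.10", 6, 443), RULES))   # normal HTTPS
```

Compared with a blackhole, which drops everything towards the victim, a rule like this removes only the attack traffic and leaves the rest of the host's services reachable.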

Minor points (but no less important)

As a good peer, you should also maintain a network without packet loss so that users can exchange data with hosts properly. A good network operator should always be prepared for outages and worst-case scenarios, especially long-term outages like subsea cable cuts.

Good technical support is also important. I remember once trying to contact some of our peers about a possible network issue. They could not identify the peering connections or provide any immediate help. Highly responsive technical support is therefore key.

An up-to-date PeeringDB profile is crucial. It is a peering requirement for all major Content Distribution Networks (CDNs) and operators, because it lets them automate peering setup and choose appropriate parameters such as the prefix limit for the peering session.
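As an example of that automation, a prefix limit could be derived from the prefix counts a network publishes in its profile. The sketch below assumes a record shaped like a PeeringDB `net` object with `info_prefixes4` and `info_prefixes6` fields. The 20% headroom and the floor of 10 prefixes are arbitrary assumptions, and the sample record is invented.

```python
def prefix_limit(advertised_prefixes, headroom_percent=20, minimum=10):
    """Add headroom to the advertised prefix count, rounding up, with a floor."""
    scaled = (advertised_prefixes * (100 + headroom_percent) + 99) // 100
    return max(minimum, scaled)


def limits_from_profile(net_record):
    """Derive IPv4 and IPv6 session prefix limits from a profile record.

    Missing or empty counts are treated as zero, so the floor applies.
    """
    return {
        "ipv4": prefix_limit(net_record.get("info_prefixes4") or 0),
        "ipv6": prefix_limit(net_record.get("info_prefixes6") or 0),
    }


# Hypothetical profile record for a documentation ASN.
PROFILE = {"asn": 64500, "info_prefixes4": 1000, "info_prefixes6": 50}

if __name__ == "__main__":
    print(limits_from_profile(PROFILE))
```

If the published counts are stale, the derived limit can tear down a healthy session as soon as the peer announces more prefixes than its profile claims, which is why keeping the profile current matters.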

Conclusion

Lastly, I want to share some inspiration I heard at APNIC 56. It's a single word: 'collaboration'. Deploying RPKI on just one network will not work. The Internet comprises many ASNs, and collaboration among all of them is required to enhance the Internet and deliver added value to our customers.

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.