Do you ever wonder whether you can really trust other networks, such as your provider(s) and peers? More precisely, wouldn’t you like to be able to tell if the traffic you send always flows through the paths received in the Border Gateway Protocol (BGP)? Could it be that, for some prefixes, the forwarding path might differ?
When you think about tools that would allow you to draw conclusions on the subject, how many come to mind? Not many, right? In this article, we describe a methodology to detect BGP lies. So, if you are curious and have 10 minutes, grab a cup of coffee and enjoy the read.
What are BGP lies?
BGP is the de facto inter-domain routing protocol that allows Autonomous Systems (ASes) to exchange traffic. In the control plane, routers exchange BGP updates for reachable prefixes. These messages contain a field named AS-path that stores sequences of AS numbers (ASNs). For each prefix, the AS-path gives the theoretical path that packets should traverse towards destinations inside that prefix. The BGP policies that Internet Service Providers enforce often take the AS-paths into consideration. In the data plane, by contrast, ASes exchange the traffic itself. The forwarding path that packets actually traverse towards their destination is what we call a data path.
The way the Internet works, we implicitly assume that data paths always match AS-paths. In this article, we are interested in detecting BGP lies, cases where this underlying assumption does not hold.
As an example, Figure 1 shows that, for prefix P, the forwarding path matches the AS-path on the left figure, but not on the right. In the first case, AS B respects what it advertised to AS A; in the second, it lied.
In essence, BGP lies can result from two different phenomena. The first is malicious behaviour by an AS. For example, an AS may deliberately modify the AS-paths it advertises in order to attract traffic, and then forward that traffic through any alternative path of its choice. The second is unintended behaviour of an AS. We can think of cases where, due to technical limitations, traffic exits an AS through an unexpected border router and therefore enters a neighbouring AS different from the one indicated in BGP.
In practice, it is not easy to find tools that pinpoint BGP lies. This is because crafting such tools requires IP-to-AS mapping and co-located vantage points (VPs).
Data paths are usually collected with traceroute in the form of sequences of IP addresses, whereas AS-paths are lists of ASNs. As a consequence, an IP-to-AS mapping tool needs to be used to overcome this limitation, as highlighted in Figure 2:
The second requirement is a co-located VP from which to run traceroute, that is, a VP placed next to the BGP speaker that shares the BGP snapshots. Since different BGP speakers of the same AS may choose different best paths towards the same prefix, data paths need to be collected either by the same router from which the BGP data is gathered, or from a point in the network that ensures that packets will actually traverse this router, as shown on the left side of Figure 3. When this is not the case (right side of Figure 3), we have no guarantee that the forwarding path should match the AS-path, so comparing the two would not be a valid test of whether BGP lies occur.
The problem of finding co-located VPs
Co-located VPs can be deployed manually by someone with privileged access to a network; however, finding them on a larger scale is not easy. For example, although multiple projects share BGP routes, such as RIPE RIS, PCH, RouteViews or the Isolario Project, the exact location within the ASes where the routes are collected is not known. As a consequence, even if these ASes hosted a traceroute VP, such as a RIPE Atlas probe or an NLNOG RING node, it would not be certain that it is actually co-located. A heuristic approach assumes that if the first AS found in traceroute matches the one seen in the AS-path, then the complete data path and AS-path should match.
In reality, ensuring that VPs are truly co-located VPs depends on the measuring infrastructure that is used. In that sense, the PEERING testbed is a convenient infrastructure since it allows us to gather BGP data from border routers, while also providing transit via them. On the other hand, efforts are currently being put into deploying Scamper in RouteViews’ collectors.
The problem of IP-to-AS mapping
To map from IP addresses to ASNs, the BGP snapshots of a BGP speaker can be used. For any IP address, we first identify the longest matching prefix present in the BGP snapshot, extract the associated AS-path, and simply map the IP address to the originating AS in that path (the ASN that appears at the end of the AS-path).
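The basic mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the snapshot is represented as a list of (prefix, AS-path) pairs, the function names `build_table` and `ip_to_as` are hypothetical, and the prefixes and ASNs come from ranges reserved for documentation. A linear scan is used for clarity; a real tool would more likely use a trie.

```python
import ipaddress

def build_table(routes):
    """Parse (prefix string, AS-path list) pairs taken from a BGP snapshot."""
    return [(ipaddress.ip_network(prefix), as_path) for prefix, as_path in routes]

def ip_to_as(ip, table):
    """Return the origin ASN of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    best_net, best_path = None, None
    for net, as_path in table:
        # Keep the most specific (longest) prefix that contains the address.
        if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_path = net, as_path
    # The originating AS is the last ASN in the AS-path.
    return best_path[-1] if best_path else None

# Illustrative snapshot (made-up routes, documentation ASNs).
snapshot = build_table([
    ("192.0.0.0/16", [64496, 64502]),
    ("192.0.2.0/24", [64496, 64497, 64501]),
])
print(ip_to_as("192.0.2.10", snapshot))  # 64501, via the more specific /24
```

Addresses with no covering prefix, such as private addresses, return `None` here, which corresponds to the missing hops discussed later in the article.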
Unfortunately, the above-mentioned technique is known to be error-prone in the presence of:
- AS siblings, cases where an organization has been delegated multiple ASNs, and thus these may appear interchangeably both in data paths and AS-paths
As an example, in Figure 4, organization A manages AS A0 and AS A1. In practice, it could occur that the AS-path traverses both ASes, but the forwarding path only traverses one (left figure). The opposite may also occur, as shown on the right.
- Third-party addresses, IP addresses that map to an off-path AS, and lead data paths to wrongly include its related ASN
Figure 5 shows an example of the inconvenience that third-party addresses produce. In this scenario, the router in the middle of AS B responds to traceroute with the IP address belonging to the outgoing interface connecting AS B with AS X. If this IP address belongs to the address space of AS X, and this AS announces a prefix containing it in BGP, it is likely that after mapping, the inferred path will be AS A, AS B, AS X, AS B, AS C. So, even if the data path had actually respected the AS-path, a mismatch is wrongly inferred.
In addition, traceroute may include missing hops when IP addresses fail to be retrieved, or when they are retrieved but there exists no prefix matching them in the BGP snapshots, such as private IP addresses. In such events, wildcards are used to indicate their occurrence.
As an example, in Figure 6, the routers in AS B do not respond to traceroute, or do so with IP addresses that are contained in prefixes that are not advertised in BGP. As a consequence, we have no way of knowing the complete forwarding path. The uncertainty introduced by the wildcards, or missing hops, needs to be resolved; otherwise, the paths that include them should be discarded.
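To make the role of wildcards concrete, here is a minimal sketch of how a traceroute could be turned into an AS-level data path. It assumes that unresponsive hops are recorded as `None`, that `lookup` is any IP-to-AS function returning `None` for unmapped addresses, and that consecutive hops in the same AS are collapsed into one entry while every wildcard is kept. The names `WILDCARD` and `to_as_level_path` are hypothetical.

```python
WILDCARD = "*"

def to_as_level_path(hops, lookup):
    """Map traceroute hops to ASNs, using a wildcard for every missing hop."""
    path = []
    for hop in hops:
        asn = lookup(hop) if hop is not None else None
        token = WILDCARD if asn is None else asn
        # Collapse consecutive hops in the same AS, but keep each wildcard.
        if token != WILDCARD and path and path[-1] == token:
            continue
        path.append(token)
    return path

# Illustrative mapping: a plain dict stands in for the BGP-based lookup.
mapping = {"198.51.100.1": 64496, "203.0.113.5": 64498, "203.0.113.9": 64498}
hops = ["198.51.100.1", None, "10.0.0.1", "203.0.113.5", "203.0.113.9"]
print(to_as_level_path(hops, mapping.get))  # [64496, '*', '*', 64498]
```

In this made-up trace, one hop did not answer and one answered with a private address, so both become wildcards.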
A framework to detect BGP lies
To detect BGP lies, we propose a framework that overcomes the problems in IP-to-AS mapping. Our framework takes the BGP snapshots of a BGP speaker and the traceroutes gathered at a co-located VP, and outputs a lower bound on the number of BGP lies.
Besides some logic blocks, our methodology comprises three steps, as shown in Figure 7:
- Preparation stage, which applies the basic IP-to-AS mapping and computes which forwarding path should be compared with which AS-path.
- Mapping relaxation stage, which aims to filter the usual problems generated by AS siblings and third-party addresses.
- Wildcards correction stage, which infers the presumable values for the wildcards in the data paths (if any), and concludes whether the purged data paths and AS-paths match or not.
To overcome the limitations produced by AS siblings, we apply an additional AS-to-organization mapping. This way, AS A0 and AS A1 in Figure 4 would both end up being mapped to Org A, and thus the order in which these ASes appear would not introduce false mismatches.
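As a rough illustration of this step, the sketch below rewrites a path of ASNs as a path of organizations. It assumes an AS-to-organization dictionary is available from some external dataset; the name `to_org_path`, the label `"OrgA"` and the ASNs are made up, and an AS without a known organization is treated as its own organization.

```python
WILDCARD = "*"

def to_org_path(path, as_to_org):
    """Replace each ASN by its organization and merge adjacent siblings."""
    org_path = []
    for token in path:
        if token == WILDCARD:
            org_path.append(token)  # wildcards are left untouched
            continue
        org = as_to_org.get(token, token)  # unknown AS: keep the ASN itself
        if org_path and org_path[-1] == org:
            continue  # sibling of the previous AS, same organization
        org_path.append(org)
    return org_path

# Scenario in the spirit of Figure 4: 64500 and 64501 are siblings.
as_to_org = {64500: "OrgA", 64501: "OrgA"}
as_path = [64496, 64500, 64501, 64499]    # control plane crosses both siblings
data_path = [64496, 64500, 64499]         # data plane crosses only one
print(to_org_path(as_path, as_to_org) == to_org_path(data_path, as_to_org))  # True
```

After the rewrite, both paths read 64496, OrgA, 64499, so the sibling ASes no longer produce a mismatch.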
In addition, we identify likely third-party addresses by looking for IP addresses that map to an AS that differs from the ASes of the adjacent IP addresses. We replace the candidate third-party addresses with wildcards. This way, we would identify the IP address mapping to AS X in Figure 5 as a potential third-party address, and map it to a wildcard instead of AS X itself. That is, the data path would be AS A, AS B, ‘*’, AS B, AS C.
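One possible reading of this rule is sketched below. It works on the per-hop list of ASNs and flags a hop when both of its neighbours map to the same AS and the hop itself maps to a different one. This is an assumption about the exact criterion, made for illustration only; the name `relax_third_party` and the ASNs are hypothetical.

```python
WILDCARD = "*"

def relax_third_party(hop_asns):
    """Replace hops that look like third-party addresses with wildcards."""
    relaxed = list(hop_asns)
    for i in range(1, len(hop_asns) - 1):
        prev_as, cur_as, next_as = hop_asns[i - 1], hop_asns[i], hop_asns[i + 1]
        if WILDCARD in (prev_as, cur_as):
            continue  # nothing to decide without two mapped neighbours
        # A single off-path hop sandwiched inside one AS is a candidate.
        if prev_as == next_as and cur_as != prev_as:
            relaxed[i] = WILDCARD
    return relaxed

# Scenario in the spirit of Figure 5: A, B, X, B, C.
print(relax_third_party([64496, 64497, 64511, 64497, 64498]))
# [64496, 64497, '*', 64497, 64498]
```

A path such as A, B, C is left unchanged, since B's neighbours map to different ASes and B may simply be a one-hop transit AS.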
Finally, in the wildcards correction stage, whenever possible, we replace wildcards with the missing sub-AS-paths they might represent. For this, we allow a single wildcard to represent at most one AS. Hence, in Figure 6, we would be able to infer that the wildcards should all be mapped to AS B. In addition, recalling the case in Figure 5, after the third-party address is replaced by a wildcard in the mapping relaxation stage, we would then also be able to infer that the most likely value for this wildcard is AS B, and thus avoid a wrongly inferred mismatch.
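The comparison with wildcards can be sketched as a small alignment check. This is not the exact algorithm of the paper, only a simplified decision procedure under these assumptions: both paths start and end at the same points, AS-path prepending has already been removed, and each wildcard stands for at most one AS (possibly the same AS as a neighbouring hop). The name `data_path_matches` is hypothetical.

```python
WILDCARD = "*"

def data_path_matches(data_path, as_path):
    """Check whether a data path with wildcards is consistent with an AS-path."""
    # positions holds the AS-path indices the walk may currently be at;
    # -1 means that no element of the AS-path has been consumed yet.
    positions = {-1}
    for token in data_path:
        reachable = set()
        for j in positions:
            can_advance = j + 1 < len(as_path)
            if token == WILDCARD:
                reachable.add(j)  # wildcard hides nothing new
                if can_advance:
                    reachable.add(j + 1)  # wildcard hides the next AS
            else:
                if j >= 0 and as_path[j] == token:
                    reachable.add(j)  # another hop in the same AS
                if can_advance and as_path[j + 1] == token:
                    reachable.add(j + 1)  # hop in the next AS
        if not reachable:
            return False
        positions = reachable
    # The walk must end on the last AS of the AS-path.
    return len(as_path) - 1 in positions

A, B, C, X = 64496, 64497, 64498, 64511
print(data_path_matches([A, "*", "*", "*", C], [A, B, C]))  # True (Figure 6 case)
print(data_path_matches([A, B, "*", B, C], [A, B, C]))      # True (Figure 5 case)
print(data_path_matches([A, X, C], [A, B, C]))              # False
```

Because a wildcard can cover at most one AS, a data path such as A, ‘*’, C is not accepted against an AS-path with two ASes between A and C.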
All in all, our methodology is conservative: we tend to rewrite data paths and AS-paths replacing exact representations (ASNs) with more generic ones (organizations and wildcards), therefore reducing the likelihood of introducing false positives. Similarly, we replace as many wildcards as possible with their counterpart observed in the control plane.
We ran measurements from eight co-located VPs for 13 days, collecting full BGP snapshots every two hours and 80,000 traceroutes towards different /24 IP prefixes per day for each of them. In particular, we deployed two VPs ourselves (VP 7 and 8), and for the remaining six we relied on the PEERING testbed. To apply the basic IP-to-AS mapping for each VP, we used the BGP snapshots of its co-located BGP speaker.
Figure 8 compares the number of BGP lies found, relying only on this basic IP-to-AS mapping (red bars, upper bound) and on our framework (green bars, lower bound), per VP. The bars indicate the mean value per day, and additionally, the standard deviation over time is shown.
Our results show that, even with a conservative analysis, some BGP lies remain in all VPs. Taking a closer look, we can distinguish two patterns. First, there are VPs (1 to 5) where our framework filters out most of the mismatches that would have been wrongly inferred by relying on basic IP-to-AS mapping, and where the results are stable over time. Second, there are cases (VP 6 and 7) where, even after applying our conservative methodology, the number of BGP lies found remains high and usually shows a large daily variance.
We confirm our results in VP 6, where we have ground truth and are further able to determine that the observed BGP lies result from technical limitations in the provider of the AS that hosts the co-located VP. The provider AS uses a router with a partial Forwarding Information Base (FIB), and this leads packets to exit the AS through a neighbouring AS other than the one advertised in BGP.
Detecting BGP lies is not simple, yet we have provided a way for network operators to get a feeling of whether this issue might be affecting them. In future work, it would be interesting to determine whether the patterns seen in our results correlate with either malicious or unintended behaviours.
If you have extra time, I recommend you read our TMA 2019 paper, which provides more details on the problem and solution we have discussed in this article:
J. M. Del Fiore, P. Mérindol, V. Persico, C. Pelsser and A. Pescapè, ‘Filtering the Noise to Reveal Inter-Domain Lies,’ 2019 Network Traffic Measurement and Analysis Conference (TMA), 2019, pp. 17-24, doi: 10.23919/TMA.2019.8784618.
On the other hand, if you want to know why the use of partial-FIB routers may even produce intra-domain detours, please consider reading our TNSM 2021 paper:
J. M. Del Fiore, V. Persico, P. Mérindol, C. Pelsser and A. Pescapè, ‘The Art of Detecting Forwarding Detours,’ in IEEE Transactions on Network and Service Management, doi: 10.1109/TNSM.2021.3062151.
Lastly, I invite you to take a look at my PhD thesis that expands on both topics, and focuses rather on the more general problem of detecting hidden broken pieces of the Internet:
J. M. Del Fiore, ‘Detecting hidden broken pieces of the Internet : BGP lies, forwarding detours and failed IXPs,’ PhD thesis, Université de Strasbourg.
Julian received a PhD from the University of Strasbourg, France, in 2021. Previously, he graduated with an Electronics Engineer degree with honours from the University of Buenos Aires, Argentina.
Valerio Persico, Pascal Merindol, Cristel Pelsser and Antonio Pescape contributed to this work.