Route collectors — including the Route Views project of the University of Oregon and the Routing Information Service (RIS) of RIPE NCC — have been an invaluable source of information about the Internet inter-domain ecosystem over the last 20 years.
They typically collect routing data by establishing BGP sessions with organizations that voluntarily agree to participate, and then regularly dump it as RIB snapshots in MRT format. The sequence of BGP packets received is also dumped, at a predetermined time interval chosen by the collector.
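For readers unfamiliar with the format, every MRT record begins with a 12-byte common header defined in RFC 6396: a timestamp, a type, a subtype and the length of the message that follows. The sketch below decodes one such header in Python. The names `parse_mrt_header` and `MrtHeader` are illustrative only and do not belong to any of the tools discussed in this post; the sample bytes are synthetic.

```python
import struct
from collections import namedtuple
from datetime import datetime, timezone

# MRT common header (RFC 6396): timestamp (32 bits), type (16 bits),
# subtype (16 bits), message length (32 bits), all in network byte order.
MRT_HEADER = struct.Struct("!IHHI")

# A few of the type codes defined by RFC 6396.
MRT_TYPES = {12: "TABLE_DUMP", 13: "TABLE_DUMP_V2", 16: "BGP4MP", 17: "BGP4MP_ET"}

# Hypothetical container for the decoded fields.
MrtHeader = namedtuple("MrtHeader", "time type_name subtype length")


def parse_mrt_header(data):
    """Decode the MRT common header found at the start of `data`."""
    if len(data) < MRT_HEADER.size:
        raise ValueError("truncated MRT header")
    ts, mrt_type, subtype, length = MRT_HEADER.unpack_from(data)
    return MrtHeader(
        time=datetime.fromtimestamp(ts, tz=timezone.utc),
        type_name=MRT_TYPES.get(mrt_type, str(mrt_type)),
        subtype=subtype,
        length=length,
    )


# Synthetic header: a TABLE_DUMP_V2 record (subtype 2 is RIB_IPV4_UNICAST)
# stamped 1 July 2018 00:00:00 UTC, announcing a 64-byte message.
sample = MRT_HEADER.pack(1530403200, 13, 2, 64)
header = parse_mrt_header(sample)
```

The `length` field counts only the message that follows the header, so a reader can skip records it is not interested in without decoding them.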
Much has changed over the years that affects this system, notably the size of the Internet, which now encompasses more than 60K public ASes and a full IPv4 RIB of more than 700K different routes. BGP itself has also changed, with new extensions such as the ability to route IPv6 (RFC 4760) and to advertise multiple paths for the same prefix (ADD-PATH, RFC 7911).
All these changes, together with an increasing number of route collector participants and new collector services emerging (such as Isolario, PCH and BGPmon), have resulted in a significant increase in available routing data (Figure 1) and, in doing so, introduced a new problem: how do we analyse such a large amount of data in a reasonable time?
Figure 1 — Cumulative amount of data provided by Isolario, Route Views and RIS (2000-2018).
MRT/BGP data reader state of the art
There are a number of existing tools/libraries to analyse BGP data in MRT format (Table 1).
Most of those listed below in Table 1 have not been updated for quite some time, and some of them cannot handle basic extensions. It is important to note that none of them was written with a focus on performance.
While these limitations may be acceptable for simple utilities and analysers, they are less so for systems with performance constraints, or for systems that need to process all available data sources without a Hadoop cluster to run the computation and reduce the running time. Furthermore, most of the available tools can only dump all the data contained in an MRT file, leaving it to users to extract the BGP data they actually need to analyse.
Tool | Language | IPv6 | ADD-PATH | Filtering |
bgpdump | C | ✓ | ✓ | ✗ |
bgpparser | C++ | ✓ | ✗ | ✗ |
bgpreader | C | ✓ | ✗ | ✓ |
mabo | OCaml | ✓ | ✗ | ✗ |
mrtparse | Python | ✓ | ✗ | ✗ |
Java-MRT | Java | ✓ | ✗ | ✗ |
zebra-dump-parser | Perl | ✓ | ✗ | ✗ |
BGP Scanner and its filtering capability
BGP Scanner was developed to overcome the limitations of existing tools when analysing BGP data at large scale.
The library is multi-threaded and throughput-oriented, and focuses on avoiding superfluous copies and memory allocations to reduce the overhead from decoding MRT data. To this end, the library has been written in C, which enables low-level access and direct control over the machine, and exploits some of the features introduced in the ISO C99 and ISO C11 standards, such as dynamic stack allocations and thread local variables. These choices also allow the library to be easily wrapped into higher level languages (for example, Python or Lua) so that it can be exploited by a larger community of users.
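The idea of avoiding superfluous copies can be illustrated outside of C as well. The Python sketch below walks a buffer of MRT records and hands out each payload as a `memoryview` slice of the original buffer instead of a fresh copy. It is only an illustration of the principle under simplified assumptions, not BGP Scanner's implementation; `iter_records` is a hypothetical name and the two records are synthetic.

```python
import struct

# MRT common header (RFC 6396): timestamp, type, subtype, message length.
HEADER = struct.Struct("!IHHI")


def iter_records(buf):
    """Yield (type, subtype, payload) for each MRT record in `buf`.

    Each payload is a memoryview slice that refers to the bytes of `buf`
    directly, so no record body is copied while iterating.
    Trailing bytes too short to hold a header are ignored.
    """
    view = memoryview(buf)
    offset = 0
    while offset + HEADER.size <= len(view):
        _ts, mrt_type, subtype, length = HEADER.unpack_from(view, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(view):
            raise ValueError("truncated MRT record")
        yield mrt_type, subtype, view[start:end]
        offset = end


# Build two synthetic records with made-up payload bytes.
buf = bytearray()
for payload in (b"\x01\x02\x03", b"\x04\x05"):
    buf += HEADER.pack(0, 16, 4, len(payload)) + payload

records = list(iter_records(buf))
```

A payload is only materialized (for example with `bytes(payload)`) when a caller actually needs its own copy, which keeps allocations proportional to the data that is used rather than to the data that is read.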
What sets BGP Scanner apart from most existing software, though, is its ability to filter BGP packets by attributes, routes and announcements. This is especially useful for network administrators who want to troubleshoot a particular routing event involving a well-defined subset of routes. For example, BGP Scanner can identify packets with particular patterns in their AS path attribute, and/or select every packet containing routing information about a given subnet or supernet.
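To make the two kinds of filter concrete, here is a minimal Python sketch of the selection logic: keep a route if its prefix is a subnet or supernet of a target network, and/or if its AS path matches a pattern. This is not BGP Scanner's command-line syntax or API. `Route`, `related` and `select` are hypothetical names, the AS path pattern is expressed as a regular expression purely for illustration, and the prefixes and AS numbers come from the ranges reserved for documentation.

```python
import ipaddress
import re
from collections import namedtuple

# Hypothetical, already-decoded route: a prefix string and its AS path.
Route = namedtuple("Route", "prefix as_path")


def related(prefix, target):
    """True if `prefix` equals, is contained in, or contains `target`."""
    if prefix.version != target.version:
        return False  # never compare IPv4 with IPv6
    return prefix.subnet_of(target) or prefix.supernet_of(target)


def select(routes, target=None, path_pattern=None):
    """Yield the routes that pass every filter that was supplied."""
    target_net = ipaddress.ip_network(target) if target else None
    regex = re.compile(path_pattern) if path_pattern else None
    for route in routes:
        net = ipaddress.ip_network(route.prefix)
        if target_net is not None and not related(net, target_net):
            continue
        path_text = " ".join(str(asn) for asn in route.as_path)
        if regex is not None and not regex.search(path_text):
            continue
        yield route


routes = [
    Route("192.0.2.0/24", (64496, 64500, 64510)),
    Route("192.0.2.128/25", (64497, 64510)),
    Route("192.0.0.0/16", (64496, 64501)),
    Route("198.51.100.0/24", (64496, 64500, 64511)),
    Route("2001:db8::/32", (64498, 64500, 64510)),
]

# Routes covering or covered by 192.0.2.0/24.
by_prefix = list(select(routes, target="192.0.2.0/24"))
# Routes originated by AS64510 and reached through AS64500.
by_path = list(select(routes, path_pattern=r"\b64500 64510$"))
# Both conditions at once.
both = list(select(routes, target="192.0.2.0/24", path_pattern=r"\b64500 64510$"))
```

Applying the filters while decoding, rather than dumping everything and filtering afterwards, is what lets a tool skip the work of formatting packets nobody asked for.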
BGP Scanner outperforms other tools six-fold
BGP Scanner was compared with all the tools listed in Table 1, measuring the elapsed time and memory consumption required to analyse the first RIB available for a given collector in July 2018, followed by the sequence of update messages collected during that month.
Each file was decompressed beforehand to remove the overhead of decompression. The results below are averages over 10 runs, to smooth out spurious effects caused by external factors. Each test was performed on a machine equipped with an Intel Core i7-4790K CPU at 4.00GHz and 16GB of RAM, running Debian Stretch.
Figure 2 — Tool performances during the analysis of data collected by route-views6 (Route Views).
The first test (Figure 2) was run on the collector route-views6 of Route Views. This collector historically collects only IPv6 routes, and the amount of data collected during the month is not very large. In this scenario the RIB file size was 99MB and the sum of update files was 25.65GB, with 24 full routing tables (out of 26 tables collected).
BGP Scanner completed the analysis in about three minutes with memory consumption under 3MB, whereas every other tool required at least 20 minutes, with a peak of more than five hours.
Figure 3 — Tool performances during the analysis of data collected by Korriban (Isolario).
The second test (Figure 3) was run on the collector Korriban of the Isolario project. This collector hosts feeders with ADD-PATH capability in both IPv4 and IPv6, which dramatically increases the amount of data collected. In this scenario, the RIB file size was 5.7GB and the sum of update files was 810.64GB, with 112 IPv4 full routing tables (out of 512 IPv4 tables collected) and 126 IPv6 full routing tables (out of 407 IPv6 tables collected). This test was run only for BGP Scanner and bgpdump, the only two tools that currently support the ADD-PATH capability.
BGP Scanner carried out the analysis in less than two hours, while bgpdump required more than 11 hours. This gap will likely widen in the future as more feeders become available and more data has to be analysed.
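As a quick sanity check on the headline figure, the speedup can be estimated from the rounded times quoted above. The short Python sketch below does the arithmetic. Because the inputs are bounds ("about three minutes", "at least 20 minutes", "less than two hours", "more than 11 hours"), the results are rough estimates rather than exact measurements.

```python
def speedup(other_minutes, scanner_minutes):
    """Ratio between another tool's running time and BGP Scanner's."""
    return other_minutes / scanner_minutes


# route-views6: the fastest other tool took at least 20 minutes,
# BGP Scanner about 3 minutes.
route_views6 = speedup(20, 3)

# Korriban: bgpdump took more than 11 hours, BGP Scanner less than 2 hours,
# so this ratio is a lower bound on the real speedup.
korriban = speedup(11 * 60, 2 * 60)
```

Both ratios land close to six (roughly 6.7 and 5.5), which is consistent with the six-fold figure in the heading.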
Get BGP Scanner
BGP Scanner and its basic building block, the BGP/MRT C library, are developed within the Isolario project and released as open-source under the BSD license.
The source code is available via GitLab or the Isolario webpage and can be installed on any POSIX-compliant platform. The utility comes with a detailed man page containing an in-depth description of its features.
Contributors: Luca Sani and Alessandro Improta
Lorenzo Cogotti is a software developer and free software enthusiast, now contributing to the IIT-CNR Isolario Project. He is the co-founder of the Alpha Cogs, where he supervises high-efficiency, parallel, graphics and multimedia-related projects.