This post introduces my Google Summer of Code project at the Internet Health Report (IHR), which focuses on alarm correlation and aggregated reports through an online tool. The project is being conducted under the mentorship of Romain Fontugne and Emile Aben. To see it in action, visit the IHR Global Report page.
With Internet infrastructure evolving rapidly, effective monitoring of alarms related to BGP hijacking, BGP routing, Internet delays, and outages is critical. This project was initiated to address that need, and it offers the following benefits.
Improved network monitoring
With the comprehensive correlation and aggregation of alarm data, our project significantly enhances the capability to monitor the health and performance of Internet networks. This translates to quicker issue detection, prompt response, and minimized downtime.
Enhanced situational awareness
By collating alarms from diverse sources and visualizing them in coherent ways, our project augments the understanding of the overall Internet ecosystem. This heightened awareness empowers network administrators, policymakers, and stakeholders to grasp the landscape’s dynamics more comprehensively.
Better decision-making
The availability of consolidated and correlated alarm data equips decision-makers with more accurate and up-to-date information. In turn, this leads to informed decisions that can be swiftly implemented, ensuring network stability and minimizing potential risks.
Improved collaboration
The shared platform for alarm correlation and aggregated reports facilitates collaboration among various stakeholders. Network operators, analysts, policymakers, and researchers can all access the same comprehensive data pool, fostering cooperation, problem-solving, and the exchange of insights.
Inside the project
Our project aggregates and correlates data from multiple sources, including IHR, the Global Routing Intelligence Platform (GRIP), and Internet Outage Detection and Analysis (IODA). We then use three visualization methods (World Map, Time Series, and TreeMap) to present the insights. These visualizations can be viewed at two levels of granularity, economy and Autonomous System, and can be filtered by date and time.
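To make the aggregation and correlation step more concrete, here is a minimal sketch in Python. It is not the project's actual code: the `Alarm` record, the source labels, and the function names are all assumptions made for illustration. It counts alarms per entity and source within a time window, then lists the entities reported by more than one source.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Alarm:
    """Hypothetical normalized alarm record (not the project's real schema)."""
    source: str          # assumed labels: "ihr", "grip", "ioda"
    economy: str         # ISO 3166-1 alpha-2 code, e.g. "BR"
    asn: int             # Autonomous System number
    timestamp: datetime  # timezone-aware UTC timestamp


def aggregate_alarms(alarms, start, end, key="economy"):
    """Count alarms per (entity, source) with start <= timestamp < end.

    `key` selects the granularity: "economy" or "asn".
    """
    counts = Counter()
    for alarm in alarms:
        if start <= alarm.timestamp < end:
            entity = getattr(alarm, key)
            counts[(entity, alarm.source)] += 1
    return counts


def correlated_entities(counts, min_sources=2):
    """Return entities reported by at least `min_sources` distinct sources."""
    sources_per_entity = {}
    for entity, source in counts:
        sources_per_entity.setdefault(entity, set()).add(source)
    return sorted(
        entity
        for entity, sources in sources_per_entity.items()
        if len(sources) >= min_sources
    )
```

Switching `key` from `"economy"` to `"asn"` changes the granularity without touching the rest of the logic, which mirrors the economy and Autonomous System views described above.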
IHR alarms
IHR is dedicated to monitoring the condition of Internet networks. It facilitates better comprehension of Internet infrastructure for network operators, policymakers, and stakeholders. The following alarms are used in IHR:
- Hegemony alarms: These measure Autonomous System (AS) dependency using BGP data.
- Network delay alarms: These monitor latencies via traceroutes from the RIPE Atlas measurement platform.
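As a rough illustration of how a delay alarm can be derived from latency samples, the sketch below compares the median round-trip time (RTT) of a current window against a reference window, scaled by the median absolute deviation. This is a generic robust-statistics approach and an assumption on my part, not a description of IHR's actual detector; the function names and the threshold are hypothetical.

```python
from statistics import median


def delay_deviation(reference_rtts, current_rtts):
    """Return how far the current median RTT is from the reference median,
    measured in units of the reference median absolute deviation (MAD)."""
    ref_median = median(reference_rtts)
    mad = median([abs(rtt - ref_median) for rtt in reference_rtts])
    cur_median = median(current_rtts)
    # Guard against division by zero when the reference window is constant.
    return (cur_median - ref_median) / max(mad, 1e-9)


def is_delay_alarm(reference_rtts, current_rtts, threshold=3.0):
    """Flag an alarm when the deviation exceeds a (hypothetical) threshold."""
    return delay_deviation(reference_rtts, current_rtts) > threshold
```

Using medians rather than means keeps a few outlier traceroutes from triggering an alarm on their own.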
GRIP alarms
GRIP is focused on BGP hijacking observability, involving near-real-time monitoring and analysis of suspicious BGP routing events to identify and mitigate hijacking attempts. It raises BGP hijack alarms.
IODA alarms
IODA is a 24/7 Internet monitoring tool for detecting and visualizing large connectivity issues in real time. It raises the following alarms:
- BGP: Gathers BGP data by processing updates from Route Views and RIPE RIS collectors.
- Active probing (ping-slash24): Monitors subnet health by sending ICMP echo requests to all IP addresses within a specific subnet (usually a /24 subnet). IODA uses a custom implementation of the Trinocular technique.
- Network telescope: Observes traffic targeting the dark (unused) address space of a network. Since all traffic to these addresses is suspicious, it can reveal possible network attacks. IODA analyses traffic data from both the University of California, San Diego (UCSD) and Michigan Educational Research Information Triad (MERIT) network telescopes.
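The network telescope idea can be sketched in a few lines of Python using the standard `ipaddress` module. This is only an illustration of the principle, not how IODA or the telescopes are implemented. The dark prefix is a placeholder taken from the documentation address range, and the function names are hypothetical.

```python
import ipaddress
from collections import Counter

# Placeholder dark (unused) prefix from the documentation range (RFC 5737).
# A real telescope monitors its own unused address space.
DARK_PREFIXES = [ipaddress.ip_network("192.0.2.0/24")]


def is_dark(destination):
    """Return True if the destination address falls in a dark prefix."""
    address = ipaddress.ip_address(destination)
    return any(address in prefix for prefix in DARK_PREFIXES)


def sources_hitting_dark_space(packets):
    """Count packets per source address that target dark space.

    `packets` is an iterable of (source, destination) address strings.
    """
    hits = Counter()
    for source, destination in packets:
        if is_dark(destination):
            hits[source] += 1
    return hits
```

Because no legitimate host lives in the dark prefix, every source that appears in the result is worth a closer look.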
Real-world use case
Our project’s practicality is highlighted by recent events. On 15 August 2023, at 11:30 UTC, Brazil experienced Internet outages due to a power cut. Our project detected these outages through analysis of Hegemony Alarms, as depicted below:
Expanding the project
Feedback from users and stakeholders is invaluable for refining and expanding this project. The code for this project is open source, fostering a collaborative environment for further development and innovation. Gathering insights from network administrators, policymakers, and analysts will aid in enhancing the alarm correlation and aggregated reports tool.
Future iterations could involve incorporating additional data sources to enhance the scope and accuracy of alarms. For instance, potential data sources could include DNS query data to monitor DNS-related anomalies and HTTP request data to detect website availability issues. The open-source nature of the project welcomes contributions and suggestions for integrating new data sources.
Future development should also focus on refining the visualization techniques so that they provide more actionable insights at a glance. We believe that an open approach to development will pave the way for a more comprehensive and impactful tool.
Feel free to suggest any potential data sources you have in mind. It’s worth noting that there’s existing documentation on how to incorporate new data sources into the project, available in the IHR GitHub repository. Your input regarding possible data sources is highly appreciated, and we look forward to collaborating on the evolution of this project.
Mohamed Awnallah is a Data Engineer with a strong understanding of the data product life cycle and is passionate about contributing to open source and the Internet communities.
Adapted from the original post at RIPE Labs.