The RIPE Atlas Monitor - Network Monitoring with RIPE Atlas

A new tool joins the family of applications whose goal is to take full advantage of RIPE Atlas to monitor the availability, consistency and reachability of networks and services: the RIPE Atlas Monitor.

Introduction

The RIPE Atlas infrastructure provides a privileged observation point from which you can monitor Internet services. The wide geographical distribution of RIPE Atlas probes, and the diversity of the networks that host them, allow you to see how target services are reached under circumstances that network operators often cannot reproduce or even imagine.

Service providers (hopefully) make sure that their services meet a set of key performance indicators or metrics that reflect good user experience, contractual obligations, or cost savings, and that those services can be reached consistently, with no geographical or other constraints. The tool that I have developed uses the RIPE Atlas network to monitor these predefined indicators and verify that they are met.

How does it work?

Overview

1. First, we define the source probes and the expected values, according to the kind of measurement we want to run and the condition we want to monitor. For example:
   - An RTT threshold that must not be exceeded from a given set of countries
   - A hostname that must always and everywhere resolve to a predefined IP address
   - An AS path that must be traversed by hosts from a specific source ASN
2. A RIPE Atlas measurement is then run to gather results from the previously selected probes.
3. Finally, the RIPE Atlas Monitor is configured to process these results and verify that they match the expected values. Alerts can optionally be set as well.

Of course, the first two steps can be swapped if you prefer to derive the expected values from the results of a measurement that is already running. To help with this, the RIPE Atlas Monitor offers a measurement analyzer that builds a report from the results, aggregating the collected values using heuristics that make common patterns easier to identify.
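
To give an idea of the kind of aggregation that helps when choosing expected values, here is a minimal sketch that groups RTT samples by source country and reports the median of each group. It is not the analyzer's actual code: the result dictionaries, their field names and the sample values are assumptions made up for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical, simplified results: one dict per probe. The field names are
# illustrative only and are not the RIPE Atlas result format.
results = [
    {"probe_id": 1, "country": "FR", "rtt_ms": 18.2},
    {"probe_id": 2, "country": "FR", "rtt_ms": 22.4},
    {"probe_id": 3, "country": "FR", "rtt_ms": 95.0},
    {"probe_id": 4, "country": "IT", "rtt_ms": 31.0},
    {"probe_id": 5, "country": "IT", "rtt_ms": 35.0},
]


def median_rtt_by_country(results):
    """Group RTT samples by source country and return the median of each group."""
    samples = defaultdict(list)
    for result in results:
        samples[result["country"]].append(result["rtt_ms"])
    return {country: median(values) for country, values in samples.items()}


summary = median_rtt_by_country(results)
for country, value in sorted(summary.items()):
    print(f"{country}: median RTT {value} ms")
```

A summary like this makes it easier to spot that a single threshold may not suit every country, and that one French probe in the sample is an outlier worth a closer look.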

Configuration details

Monitors are the core of the program. Each monitor must contain the following things:

A set of rules that ties source probes together
Expected results
Actions in case the results match or don’t match.

Everything is wrapped up into a YAML file. An example is worth more than a thousand words:

descr: Check network reachability
matching_rules:
- descr: Probes from France via AS64496
  src_country: FR
  expected_results: ViaAS64496
  actions: EMailToNOC
- descr: RTT from AS64499 and AS64500 below 50ms
  src_as:
  - 64499
  - 64500
  expected_results: LowRTT
  actions: EMailToNOC
expected_results:
  ViaAS64496:
    upstream_as: 64496
  LowRTT:
    rtt: 50
actions:
  EMailToNOC:
    kind: email
    to_addr: noc@example.org
    subject: "ripe-atlas-monitor: unexpected results"
measurement-id: 123456789

Monitors can be scheduled to run periodically, evaluating measurement results and verifying that they match expectations.

Results streaming can also be used to process results in near real time while they are gathered by probes.
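
The way a monitor ties probes, expected results and outcomes together can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's implementation: the monitor is written as a plain dictionary that mirrors the YAML example, the probe and result fields (`country`, `asn`, `rtt_ms`, `upstream_as`) are hypothetical, and the sketch assumes that processing stops at the first rule that applies to a probe.

```python
# Hypothetical in-memory equivalent of the YAML monitor (actions omitted).
monitor = {
    "matching_rules": [
        {"descr": "Probes from France via AS64496",
         "src_country": ["FR"], "expected_results": "ViaAS64496"},
        {"descr": "RTT from AS64499 and AS64500 below 50ms",
         "src_as": [64499, 64500], "expected_results": "LowRTT"},
    ],
    "expected_results": {
        "ViaAS64496": {"upstream_as": 64496},
        "LowRTT": {"rtt": 50},
    },
}


def rule_applies(rule, probe):
    """A rule applies when every source criterion it declares matches the probe."""
    if "src_country" in rule and probe["country"] not in rule["src_country"]:
        return False
    if "src_as" in rule and probe["asn"] not in rule["src_as"]:
        return False
    return True


def result_matches(expected, result):
    """Check every criterion that is present in the expected result."""
    if "rtt" in expected and result["rtt_ms"] > expected["rtt"]:
        return False
    if "upstream_as" in expected and result["upstream_as"] != expected["upstream_as"]:
        return False
    return True


def evaluate(monitor, probe, result):
    """Return (rule description, matched) for the first applicable rule, else None."""
    for rule in monitor["matching_rules"]:
        if rule_applies(rule, probe):
            expected = monitor["expected_results"][rule["expected_results"]]
            return rule["descr"], result_matches(expected, result)
    return None


# A probe in AS64499 that measured 72 ms misses the LowRTT expectation.
probe = {"id": 1, "country": "DE", "asn": 64499}
print(evaluate(monitor, probe, {"rtt_ms": 72.0, "upstream_as": 64510}))
```

In the real program, the outcome of each rule then decides which of the configured actions are triggered.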

Applications

At the time of writing, the RIPE Atlas Monitor is still in beta and under active development, but it already supports the following checks (more details can be found in the official documentation):

RTT
Destination responded
Destination IP address
Destination ASN
Upstream ASN
AS path
SSL certificate fingerprints
DNS response flags
EDNS support
NSID option
DNS answers (records of type A, AAAA, NS, CNAME)

What follows is a brief list of some application scenarios in which the RIPE Atlas Monitor can be used.

Consistency-focused monitoring

With the number of Internet censorship cases increasing, it may be useful to know which country or network applies restrictions on accessing a specific domain name. DNS responses can be matched against expected IP addresses to detect whether lying resolvers are serving answers that have been tampered with:

descr: www.etha.com.tr
matching_rules:
- descr: Any
  expected_results: RealIPAddress
  actions: Log
expected_results:
  RealIPAddress:
    dns_answers:
      answers:
        - type: A
          address: 176.9.34.7
actions:
  Log:
    kind: log
measurement-id: 2905528

(The example is based on the work done by Stéphane Bortzmeyer: DNS Censorship (DNS Lies) As Seen By RIPE Atlas.)
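
The comparison behind such a check is simple, as the following sketch suggests. It assumes that a result is acceptable when at least one of the returned A records is an expected address, which may differ from the program's exact semantics. The probe IDs and the unexpected address are made up (192.0.2.0/24 is a documentation range).

```python
EXPECTED_A_RECORDS = {"176.9.34.7"}  # taken from the monitor above

# Hypothetical per-probe answers, keyed by probe ID.
answers_by_probe = {
    1001: ["176.9.34.7"],
    1002: ["192.0.2.53"],
    1003: [],
}


def probes_with_unexpected_answers(answers_by_probe, expected):
    """Return the IDs of probes whose A records include none of the expected addresses."""
    return sorted(
        probe_id
        for probe_id, addresses in answers_by_probe.items()
        if not expected.intersection(addresses)
    )


suspicious = probes_with_unexpected_answers(answers_by_probe, EXPECTED_A_RECORDS)
print(suspicious)
```

Probes that end up in this list are the ones whose resolvers deserve a closer look, whether the cause is tampering or a failed resolution.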

Fingerprints of SSL certificates seen by RIPE Atlas probes can also be checked to verify that TLS connections between probes and the target server are not subject to hijacks or man-in-the-middle attacks:

descr: www.ripe.net SSL cert
matching_rules:
- descr: Any probe
  expected_results: ValidSSLCertificate
  actions: Log
expected_results:
  ValidSSLCertificate:
    cert_fp:
    - 6A:EF:0C:82:1F:B9:8E:13:AA:74:BF:F1:93:E9:C3:84:14:03:88:4D:48:2A:93:AC:BC:94:61:57:BB:4A:80:5C
actions:
  Log:
    kind: log
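
For readers who want to compute a fingerprint in the same colon-separated format, here is a minimal sketch using only the Python standard library. It assumes the fingerprint is the SHA-256 digest of the DER-encoded certificate, which is consistent with the 32-byte value above but is not confirmed here. The helper names and the placeholder input bytes are hypothetical.

```python
import hashlib


def sha256_fingerprint(der_bytes):
    """Return the SHA-256 digest of the given bytes as colon-separated uppercase hex."""
    digest = hashlib.sha256(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fingerprint_is_expected(der_bytes, expected_fingerprints):
    """Compare case-insensitively against a list of allowed fingerprints."""
    seen = sha256_fingerprint(der_bytes)
    return seen in {fp.upper() for fp in expected_fingerprints}


# Placeholder bytes standing in for a DER-encoded certificate (not a real one).
fake_der = b"not a real certificate"
fingerprint = sha256_fingerprint(fake_der)
print(fingerprint)
print(fingerprint_is_expected(fake_der, [fingerprint.lower()]))
```

Listing more than one allowed fingerprint is handy during a certificate rollover, when both the old and the new certificate may legitimately be seen.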

Network operators

Network operators can already use existing tools to monitor their network’s health and receive alerts using RIPE Atlas, for example the RIPE NCC Status Checks, which can also be integrated with Icinga. The RIPE Atlas Monitor adds features such as traceroute analysis. You can also set different thresholds based on probe properties such as source AS, source country, or probe ID.

Traceroutes can be used, for example, to monitor direct peers whose traffic is expected to arrive either via a private interconnect or via an Internet Exchange Point:

matching_rules:
- descr: AS64496 Private Interconnect
  src_as:
  - 64499
  - 64500
  expected_results: Direct
- descr: AS64496 IXP peers
  src_as:
  - 64501
  - 64502
  expected_results: IXP
expected_results:
  Direct:
    as_path: S 64496
  IXP:
    as_path: S IX 64496

In the above example, the “S” and “IX” macros are expanded during rules processing and are used to represent the probe’s source AS and any Internet Exchange peering network respectively (info fetched from the PeeringDB 2.0 API).
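
The idea of macro expansion can be illustrated with a short sketch. It is not the program's code: it assumes that the observed path is already available as a list of ASNs in which hops on an IXP peering LAN have been replaced by the marker "IX", and the function names are hypothetical.

```python
def expand_as_path(pattern, probe_asn):
    """Replace the "S" macro with the probe's source ASN; leave "IX" untouched."""
    return [str(probe_asn) if token == "S" else token for token in pattern.split()]


def as_path_matches(pattern, probe_asn, observed_path):
    """Compare the expanded pattern with the observed path, token by token."""
    expected = expand_as_path(pattern, probe_asn)
    observed = [str(token) for token in observed_path]
    return expected == observed


# Probe in AS64499 reaching AS64496 directly.
print(as_path_matches("S 64496", 64499, [64499, 64496]))
# Probe in AS64501 reaching AS64496 across an IXP peering LAN.
print(as_path_matches("S IX 64496", 64501, [64501, "IX", 64496]))
# An unexpected transit AS in the middle breaks the expectation.
print(as_path_matches("S 64496", 64499, [64499, 64510, 64496]))
```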

As another example, a traceroute measurement can be used to verify that the GeoDNS mapping of a CDN works as expected and to spot any anomalies:

descr: Traceroute to wikipedia.org
measurement-id: 1983448
matching_rules:
- descr: Any probe
  expected_results: DestinationResponded
  actions: AddLabel-DstResponded
  process_next: True
- descr: EU probes which received a response from target
  internal_labels: DstResponded
  src_country:
  - "AD"
  - "AL"
  - "AT"
  - ...
  expected_results:
  - esams
  - ValidASPaths
  actions:
  - Log
- descr: CA and US probes which received a response from target
  internal_labels: DstResponded
  src_country:
  - "CA"
  - "US"
  expected_results:
  - ulsfo_or_eqiad
  - ValidASPaths
  actions:
  - Log
expected_results:
  DestinationResponded:
    dst_responded: True
  esams:
    descr: esams
    dst_ip:
    - 91.198.174.0/24
    - 185.15.56.0/22
    dst_as: 43821
  ulsfo_or_eqiad:
    descr: ulsfo or eqiad
    dst_ip:
    - 208.80.152.0/22
    - 198.35.26.0/23
    - 198.73.209.0/24
    dst_as: 14907
  ValidASPaths:
    as_path:
    - "13030 43821"
    - "1200 43821"
    - "1299 43821"
    - "1299 14907"
    - "2914 14907"
    - ...
actions:
    AddLabel-DstResponded:
      when: on_match
      kind: label
      op: add
      label_name: DstResponded
    Log:
      kind: syslog

(The example is based on the work done by RIPE NCC staff and engineers from the Wikimedia Foundation: How RIPE Atlas Helped Wikipedia Users.)

Here, labels are used to mark those RIPE Atlas probes that received a response from the target in order to subsequently use them for further analysis.
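
The two-stage logic can be sketched as follows. This is a simplified illustration rather than the program's implementation: the result fields are hypothetical, only the esams destination prefixes are checked, and the country set contains just the three codes shown in the example, whose list is truncated in the source.

```python
import ipaddress

# Prefixes taken from the "esams" expected result above.
ESAMS_PREFIXES = [
    ipaddress.ip_network(prefix)
    for prefix in ("91.198.174.0/24", "185.15.56.0/22")
]

# Only the codes listed in the example; the original list is truncated ("...").
EU_COUNTRIES_LISTED = {"AD", "AL", "AT"}


def process(result):
    """Apply the first rule, then the EU rule only to probes labelled by it."""
    labels = set()
    outcomes = []

    # Rule 1 (any probe): label the probe when the destination responded.
    # "process_next: True" means evaluation continues with the next rule.
    if result["dst_responded"]:
        labels.add("DstResponded")

    # Rule 2: only for labelled probes in the listed countries.
    if "DstResponded" in labels and result["country"] in EU_COUNTRIES_LISTED:
        dst = ipaddress.ip_address(result["dst_ip"])
        outcomes.append(("esams", any(dst in net for net in ESAMS_PREFIXES)))

    return labels, outcomes


print(process({"country": "AT", "dst_responded": True, "dst_ip": "91.198.174.192"}))
print(process({"country": "AT", "dst_responded": True, "dst_ip": "208.80.154.224"}))
print(process({"country": "AT", "dst_responded": False, "dst_ip": None}))
```

Probes that never reached the target are skipped by the second rule, so they do not generate misleading mismatches about the destination.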

Integration with RIPE Atlas CLI toolset (Magellan)

Among the actions that can be taken after the results have been processed, there is one that allows you to run an external program. It is particularly suited to integrate with another great tool, Magellan, the RIPE Atlas command line interface toolset:

actions:
  RunRIPEAtlasTraceroute:
    descr: Create new traceroute msm from the probe which missed expectations
    kind: run
    path: ripe-atlas
    args:
    - measure
    - traceroute
    - --target
    - www.example.com
    - --no-report
    - --from-probes
    - $ProbeID

In the above example, the ripe-atlas command is executed to create a new one-off traceroute measurement from the probe whose result did not match the expected values (the $ProbeID macro); an API key is needed for this.
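
How a macro such as $ProbeID might be turned into a concrete command line is shown in the sketch below. It is an assumption about the mechanism, not the program's code: the `build_command` helper and the probe ID are hypothetical, and the command is only built and printed, never executed.

```python
from string import Template

# Arguments copied from the action above; "$ProbeID" is the macro to expand.
ARGS = [
    "measure", "traceroute",
    "--target", "www.example.com",
    "--no-report",
    "--from-probes", "$ProbeID",
]


def build_command(path, args, probe_id):
    """Expand $ProbeID in every argument and return the full argument list."""
    return [path] + [Template(arg).safe_substitute(ProbeID=probe_id) for arg in args]


command = build_command("ripe-atlas", ARGS, 12345)  # 12345 is a made-up probe ID
print(command)
```

Keeping the command as a list of arguments, rather than a single string, means it can be handed to something like `subprocess.run` without involving a shell.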

More details

The RIPE Atlas Monitor is an open source project and is available on GitHub. Check out the full documentation for more information. At the time of writing, the program is still in beta: contributions and suggestions from the community are very welcome.

This story originally appeared on RIPE Labs

Pier Carlo Chiodi is a system and network administrator based in Italy, whose areas of interest include IPv4 exhaustion and IPv6 transition mechanisms, infrastructure security, Internet measurement and network data analysis.

