What’s your wish list for the perfect RPKI validator?

5 Aug 2021

Category: Tech matters

It wasn’t long ago that the sun set on RIPE’s RPKI validator. They ceased updating it and asked the community to stop using it. That prompted us on the APNIC Blog to put together a list of other validator options.

Looking through the list of various options had me wondering, what if I could mix and match those features? What would my ideal validator look like?

This is not to say the existing validator options are no good; I use a lot of them and I know how much time and effort goes into making them. The community, myself included, owes a debt to the people who work so hard on these, and as I’ll discuss below, these projects aren’t always funded as well as they should be.

Read: Validating RPKI validators

So this ‘ideal validator’ of mine comes with a caveat: it’s not a command or request for this to be made, just some feedback on what I, personally, would like to see in a validator.

What features would it have, and what would it have to overcome?

Oh boy. Where to begin?

Most importantly? Be correct

My strongest motivation would be for a validator I could rely on to tell me the truth. The role of a relying party (RP) in the RPKI framework is to play the ‘verify’ part in ‘trust but verify’, so I am very strongly drawn to a validator that can demonstrate it conforms to the standards. How does a validator do that?

Well, it has to do a couple of things. Firstly, it should have demonstrable test cases that drive it into both good and detected-bad states across its inputs. It should be able to detect and ignore objects that are invalidly signed, malformed, expired, or have no path to a trust anchor. Anything that is easily detected as a ‘wrong’ input should be dealt with properly.
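To make ‘demonstrable test cases’ a little more concrete, here is a minimal sketch of a table-driven check in Python. Everything in it is hypothetical: `SignedObject` is a heavily simplified stand-in for a real RPKI object, and `rejection_reasons` only models the four easy ‘wrong input’ cases named above. It is not how any existing validator is implemented.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SignedObject:
    """Hypothetical, heavily simplified stand-in for an RPKI signed object."""
    well_formed: bool
    signature_ok: bool
    not_after: datetime
    chains_to_trust_anchor: bool


def rejection_reasons(obj, now):
    """Return every reason to ignore obj; an empty list means 'accept'."""
    reasons = []
    if not obj.well_formed:
        reasons.append("malformed")
    if not obj.signature_ok:
        reasons.append("bad-signature")
    if now > obj.not_after:
        reasons.append("expired")
    if not obj.chains_to_trust_anchor:
        reasons.append("no-trust-anchor")
    return reasons


NOW = datetime(2021, 8, 5, tzinfo=timezone.utc)
GOOD = SignedObject(True, True, NOW + timedelta(days=1), True)

# Each case: (description, input object, expected rejection reasons).
CASES = [
    ("good object", GOOD, []),
    ("malformed", replace(GOOD, well_formed=False), ["malformed"]),
    ("bad signature", replace(GOOD, signature_ok=False), ["bad-signature"]),
    ("expired", replace(GOOD, not_after=NOW - timedelta(days=1)), ["expired"]),
    ("orphaned", replace(GOOD, chains_to_trust_anchor=False), ["no-trust-anchor"]),
]


def failing_cases():
    """Return the descriptions of cases whose outcome differs from expectation."""
    return [name for name, obj, expected in CASES
            if rejection_reasons(obj, NOW) != expected]
```

The point of the table is that every ‘detected-bad’ state has a named input that provokes it, so anyone can rerun the list and see the same verdicts.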

Some of this comes from interoperability testing, but also checking by other bodies. It helps a lot to have people like NIST write compliance tests, or related bodies in other economies writing formal checks against the products.

Can it be resilient as well?

After being correct, the next most important thing for me is to survive instability in the network, and on the supply side of RPKI products (the Certification Authorities (CAs) and the publication servers). Maybe somebody is having a bad hair day and missed a deadline? Is it OK for me to flag it, and show what I would do with the data in hand? If the data in hand is not provably superseded and hasn’t expired, can you let me run with it?
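The ‘run with the data in hand’ rule can be sketched as a small decision function. This is only an illustration of the policy described above, not the behaviour of any particular validator; the names `Action` and `choose_action`, and the idea that ‘superseded’ arrives as a simple boolean, are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class Action(Enum):
    USE_FRESH = "use-fresh"
    USE_CACHED_AND_FLAG = "use-cached-and-flag"
    DROP = "drop"


def choose_action(fetch_ok: bool,
                  cached_not_after: Optional[datetime],
                  superseded: bool,
                  now: datetime) -> Action:
    """Decide what to do with a publication point after a fetch attempt.

    cached_not_after is the expiry of the copy already in hand,
    or None when nothing is cached (hypothetical model).
    """
    if fetch_ok:
        return Action.USE_FRESH
    if cached_not_after is None:
        return Action.DROP          # nothing in hand to fall back on
    if superseded or now > cached_not_after:
        return Action.DROP          # provably stale, or expired
    return Action.USE_CACHED_AND_FLAG  # keep going, but tell the operator
```

For example, a failed fetch with an unexpired, unsuperseded cached copy yields `Action.USE_CACHED_AND_FLAG`: the validator keeps working and the operator still gets told something went wrong upstream.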

Keep me informed at all times

I really like the monitoring methods of modern systems like Prometheus and Grafana. There are other approaches, and I don’t mind if somebody chooses to use XML or some other output format (SNMP!), but I really do want the RP system to generate logs. Log everything. Log when you start and stop. Log when you get things, log when you remove them. Log when unusual things happen, but log when boring normal things happen too.
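As a rough sketch of ‘log everything’, the snippet below writes one JSON line per event using only the Python standard library. The event names, the field names and the `log_event` helper are all invented for illustration; a real RP would choose its own vocabulary, and could just as well expose counters to a system like Prometheus.

```python
import json
import logging
from datetime import datetime, timezone


def log_event(logger, event, **fields):
    """Emit one machine-parseable JSON line describing a single event."""
    record = {"ts": datetime.now(timezone.utc).isoformat(),
              "event": event}
    record.update(fields)
    logger.info(json.dumps(record, sort_keys=True))


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    log = logging.getLogger("rp-sketch")  # hypothetical logger name

    log_event(log, "start")
    # Boring, normal things get logged too (example URIs are made up).
    log_event(log, "object-added", uri="rsync://rpki.example.net/repo/a.roa")
    log_event(log, "object-removed", uri="rsync://rpki.example.net/repo/b.roa")
    # ... and so do the unusual ones.
    log_event(log, "fetch-failed", uri="rsync://rpki.example.net/repo/", attempt=3)
    log_event(log, "stop")
```

One line per event, with a timestamp and a stable event name, is enough for text tools on the command line and for a log shipper feeding a dashboard.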

Understand the modern deployment models

I have often said in the past “don’t trust anyone else to tell you the state of validity in RPKI”, and doing it yourself pretty much underpins the ‘verify’ in ‘trust but verify’. However, modern computing includes cloud methods that are highly available, and keep me informed of their status at all times. I like cloud. So, my dream validator probably includes cloud deployment methods like Docker and Kubernetes. Use these things; they make it easier to package. You can use Helm, Terraform, Puppet, Chef, or Ansible. I don’t mind if you use one I don’t use; using any of them makes it more likely I can do a deployment using the one I do use.

I want good visualizations

Now, here’s where we really get into the detail of preferences, where people often disagree on what’s best. Some people are Emacs, some are Vi. Some people are Chrome, some are Firefox. Some use Windows, some Linux, BSD, or OSX. We’re all glued to our ‘own’ view of how to see the world. And, usually, I am very, very glued to a command line model of interaction. I want text and tabular data.

But when I need to communicate with other people I know, there are times when a good picture does the job much better. I like visualizations. They’re useful. Having a Graphical User Interface (GUI) view, maybe as an adjunct or optional extra, is a huge win for me. I want that display of the PKI certification chain, or the BGP states and interactions, or the delegation tree. 

Be written in a language I trust (and understand, or understand HOW to understand)

I come from a Fortran-Pascal-C background. My roots are deep in a method of writing code that reflects these languages, and their style of making a computer work. But, despite having this background, I acknowledge that they are quite weak in erecting defences and boundaries around what you do with data.

They make it easy to write fast code, but the code can also be ‘fast and loose’ if you aren’t careful. Other, more modern languages impose constraints on how data is used. We call this type-checking, and although I am not a good programmer at any time, let alone the best of times, I do believe languages that implement more type-safe and type-respecting checks at compile or run time (or both) have significant advantages when it comes to writing trustable code.

I think that this drive would go a long way to meeting the requirements of ‘be correct’ and ‘be resilient’ because a useful primary outcome of this kind of programming is how strongly it can reflect a formal check. These checks cover correctness, systems behaviour and avoidance of simple coding errors that emerge at runtime. Is it perfect? No. But it’s a lot better. So while I know and love running some validators written in the old, old C, or Python, I also know how much I value the ones written in Rust or Haskell. 

Let me mix and match components

‘Sendmail’ is a monolithic email program from times gone by. The Multichannel Memorandum Distribution Facility (MMDF) consisted of a series of smaller, focused programs, one for each role, but all of those roles ultimately ended up being done by the sendmail dinosaur mothership.

My preferred mail program ‘MH’ consists of individual UNIX commands for each function of a mail client, operating on files. It’s a million miles from a mail GUI. I just like systems that are written as a series of connected, inter-related small elements; I like them more than things written as a single giant binary process doing all things. And, when you think about my wishlist for a GUI, that can be a distinct element. The suggestion of Prometheus/Grafana could be a distinct element for monitoring as well.

Let me mix and match what I need, without having unnecessary elements weighing me down.

Be documented and supported

I really like having documentation. However, it’s a burden to write and maintain, as is the base system. I don’t want fire-and-forget code; I want code with a community of users and developers around it, with evidence of robust backing, which is going to persist into the longer term. When I consider the value of routing integrity, and the amount there is to spend on RPKI code and validators, I think there is a case to be made that this budget should be a lot higher than people have been expecting to pay.

Free is a good thing, but it’s an unwise choice when facing a difficult 24/7 delivery of service. I think I should expect to have to pay for the code to exist, and be maintained, and explained to me and others. Others may disagree, but I’m willing to pay a bit more for this.

So what do I currently run? Four of them!

As I write, I have three validators running on my own laptop, and a fourth I operate in APNIC Labs. I run Routinator, rpki-client and Dragon every day on the laptop. I’m doing this deliberately to try and make sure I am not captive behind any one validator’s model of ‘correctness’ and to understand the minor variances between them in reporting on states of validity. All of them work fine, either in Docker or run locally on my Mac.
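Spotting those minor variances is mostly set arithmetic once each validator’s output has been reduced to a common form. The sketch below assumes that has already been done: each validator’s Validated ROA Payloads (VRPs) are held as a set of `(asn, prefix, max_length)` tuples. The `VRP` type, the `disagreements` function and the sample data (documentation ASNs and prefixes) are all hypothetical, and parsing each tool’s real export format is deliberately left out.

```python
from typing import Dict, NamedTuple, Set


class VRP(NamedTuple):
    """One Validated ROA Payload, reduced to three comparable fields."""
    asn: int
    prefix: str
    max_length: int


def disagreements(outputs: Dict[str, Set[VRP]]) -> Dict[str, Set[VRP]]:
    """For each validator, return the VRPs it reports that are NOT
    reported by every validator in outputs."""
    if not outputs:
        return {}
    common = set.intersection(*outputs.values())
    return {name: vrps - common for name, vrps in outputs.items()}


if __name__ == "__main__":
    a = VRP(64496, "192.0.2.0/24", 24)
    b = VRP(64497, "198.51.100.0/24", 24)
    c = VRP(64498, "2001:db8::/32", 48)

    # Made-up outputs for three validators, keyed by a label of my choosing.
    outputs = {
        "validator-1": {a, b, c},
        "validator-2": {a, b},
        "validator-3": {a, c},
    }
    for name, extra in sorted(disagreements(outputs).items()):
        print(name, sorted(extra))
```

An empty set for every validator means they all agree; anything else is a short list of VRPs worth a closer look.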

I also run Routinator and the RIPE v3 client on a bigger host in Labs, collecting a daily snapshot of the exported VRP state. The RIPE validator has a GUI I can use simply, and while I know I have other choices, it’s useful to have around. (It won’t last much longer; RIPE have formally deprecated it and we really need to move away from it.)

Dragon’s systems actually produce a really simple, parseable XML form that I view through a locally served web page, but I can also process it as plain text. It’s a little crude to share with others, but it helps me walk the space quickly when I need to. It does warn about some compliance problems that stem from its age and lack of current support, but I would be very happy if this was rectified with an investment to bring it into the modern age, and this is a distinct possibility.

Routinator is written in Rust, and so delivers on my interest in strongly typed languages, with memory models I believe in. It also has a support basis in NLnet Labs, and is actively maintained and documented. rpki-client is written in C and remorselessly policed by the OpenBSD project’s security review for potential weaknesses. It is fast, and delivers a text dump of state I can work with to compare with the other two validators I run on my Mac.

There are other code bases out there I could run, like RPSTIR and FORT, and I do not mean to malign them in choosing not to run them. It’s purely the expediency of having three, with deep roots and some understanding of how to keep them working in my own environment.

A changing world of RPKI needs funded code

The decision by the RIPE NCC to formally deprecate their v3 validator, and move off development of this code, was sensible (I think) and brave (I know!) because it is really hard to stop work on code once it’s widely deployed out in the community. It was sensible because of the depth of future cost to make it respond to changing needs, and it was brave because of the high chance of backlash from the user base.

RIPE NCC started work on a validator at a time when there were few choices. Now that we have many, the imperative to provide one reduces. However, the costs of keeping up with the emerging requirements in the community never drop. RIPE looked to their core mission, and decided to focus on the certification side of service delivery, and I think this was wise.

However, it doesn’t remove the need for other RP code development to go on, and therefore to be funded.

So, in that spirit I want to remind people that the software you run to compute validity of routing might be some of the most important code you operate, apart from the switch and router logic it’s being applied to. In that sense, depending on it being ‘free’ may be valuing your own outcomes tragically low. Code costs money to develop and maintain. The APNIC Secretariat made a decision to fund some of the free products we depend on, both for maintenance and for development for the common good. I would encourage anyone running freely available RP code to get in touch with the developers to discuss support and development options for it.


The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.
