Since publishing our post and video on APIs, I’ve talked with a few people on the topic, and one aspect that keeps coming up is the importance of security for APIs. In particular, I hear the term ‘zero trust’ being increasingly applied to APIs, which led to the idea for this post.
At the same time, I’ve also noticed what might be called a zero-trust backlash, as it becomes apparent that you can’t wave a zero-trust wand and instantly solve all your security concerns.
Zero trust has been on my radar for almost a decade, as it was part of the environment that enabled network virtualization to take off. We’ve told that story briefly in our SDN book — the rise of microsegmentation as a widespread use case was arguably the critical step that took network virtualization from a niche technology to the mainstream. The term goes back at least to 2009, when it was coined by Forrester analyst John Kindervag, and it is possible to draw a line from there back to the principle of least privilege as framed by Saltzer and Schroeder in 1975. That principle states:
“Every program and every user of the system should operate using the least set of privileges necessary to complete the job.”
While the Internet was designed following another of Saltzer’s principles — the end-to-end argument that he formulated with David Clark and David Reed — least privilege didn’t really make it into the Internet architecture. As David Clark pointed out some 20 years after the end-to-end paper, he and his coauthors assumed that end-systems were willing participants in achieving correct behaviour, an assumption that no longer holds true.
While the goal of the early Internet was to interconnect a handful of computing systems running in research labs (initially around the US), a substantial subset of the end-systems connected to the Internet today are actively trying to harm other systems: inserting malware, launching DoS attacks, extracting sensitive information, and so on. The last 20+ years of networking have seen an ever-expanding set of attempts to deal with the lack of security in the original Internet.
The easiest way to conceptualize zero trust is by considering what it is not
Perimeter-based security (as provided by perimeter firewalls, for example) is a good counterexample. The idea of a firewall is that there is an inside and an outside, with systems on the inside ‘trusted’ and those outside ‘untrusted’. This division of the world into trusted and untrusted regions fails both the principle of least privilege and the definition of zero trust. Traditionally, a device on the inside of a firewall is trusted to access many other devices that are also inside, by virtue of its location. That is a lot more privilege than it needs to do its job, and contrary to this description of zero trust provided by NIST:
“Zero trust…became the term used to describe various cybersecurity solutions that moved security away from the implied trust based on network location and instead focused on evaluating trust on a per-transaction basis.”
(As someone who has been involved in plenty of documents produced by committees, I have to say that the NIST Zero Trust Architecture is remarkably clear and well written.)
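To make the ‘per-transaction’ idea a little more concrete, here is a minimal Python sketch of the contrast (the identities, key, and checks are hypothetical, not drawn from the NIST document): in the perimeter model, trust follows from the source address being ‘inside’, while in the zero-trust model every request must carry verifiable proof of identity, and the source address grants nothing.

```python
import ipaddress
import hmac, hashlib

SIGNING_KEY = b"example-signing-key"  # stand-in for a real credential / PKI system

def perimeter_trusts(source_ip: str) -> bool:
    """Perimeter model: trust is implied by network location (being 'inside')."""
    return ipaddress.ip_address(source_ip) in ipaddress.ip_network("10.0.0.0/8")

def zero_trust_accepts(identity: str, proof: str) -> bool:
    """Zero-trust model: each request presents verifiable proof of identity;
    where the request comes from is irrelevant."""
    expected = hmac.new(SIGNING_KEY, identity.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, proof)

# A request from 'inside' the network, with no credentials at all:
print(perimeter_trusts("10.1.2.3"))            # True: trusted by location alone
print(zero_trust_accepts("billing", "bogus"))  # False: location buys nothing, proof fails

# The same caller with a valid proof of identity is accepted, wherever it is:
proof = hmac.new(SIGNING_KEY, b"billing", hashlib.sha256).hexdigest()
print(zero_trust_accepts("billing", proof))    # True
```

In a real system the proof would be a certificate or signed token issued by an identity provider, and the decision would also take in context such as device posture; the point of the sketch is only that the decision is made per request, not per network location.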
VPNs are another example of an approach to security that fails to meet this definition; even though modern VPN technology lets you connect to a corporate network from anywhere, it still creates the sense of an inside that is trusted and an outside that is not. The Colonial Pipeline ransomware attack is an example of a VPN compromise with dire consequences, because of the broad range of systems that were reachable once the attacker was ‘inside’ the VPN.
My theory about the occasional backlash that I’ve seen around zero trust has two parts:
- First, the name is an oversimplification of what’s going on. It’s not that you literally trust nothing. Rather, trust is not assumed just because of a device’s (or a user’s) location, nor does an entity gain wide access to resources just because it could authenticate itself for a single purpose. So ‘zero trust’ might be better termed ‘narrow and specific trust after authentication’, but that’s not very catchy.
- Second, there is a lot of work to be done to implement zero trust comprehensively. So while a vendor might say ‘my product/solution lets you implement zero trust’, the reality is that a comprehensive zero-trust implementation has a lot of moving parts, and is unlikely to be delivered by one or two products.
When we were developing microsegmentation as part of our network virtualization solution at VMware, we were quick to point out that it helped with zero-trust implementation by allowing fine-grained firewalling of east-west traffic. Distributed firewalls enabled us to move beyond zone-based trust (as provided by traditional firewalls) to an approach where an operator could specify precise rules for communication between any pair of VMs, and the default could be that no VM could communicate with any other VM. That default, applied to VMs even if they sat in the same zone (relative to traditional firewalls), was what enabled us to claim a ‘zero trust’ approach. While that was quite a breakthrough in 2014, the granularity of control is limited by what is visible to the distributed firewall, and so it doesn’t achieve the ‘per-transaction’ evaluation of trust described above. If communication between applications is encrypted (as it should be in many, if not most, cases), then the granularity at which the firewall would have to operate is the TCP port, with no deeper visibility into the type of transactions happening.
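As a rough illustration (the VM names and rules are hypothetical, not actual NSX configuration), a distributed-firewall policy of this kind can be modelled as an explicit allow-list of (source VM, destination VM, destination port) tuples with a default of deny. The decision is made at the granularity of a TCP port, so two very different API calls carried on the same encrypted connection look identical to it.

```python
# Hypothetical microsegmentation policy: default deny, explicit per-pair allow rules.
# The firewall matches on (src VM, dst VM, dst TCP port) -- it cannot see which
# API call is inside an encrypted connection on that port.
ALLOW_RULES = {
    ("web-vm", "app-vm", 8443),
    ("app-vm", "db-vm", 5432),
}

def firewall_permits(src_vm: str, dst_vm: str, dst_port: int) -> bool:
    """Allow only explicitly listed VM-to-VM flows; everything else is dropped."""
    return (src_vm, dst_vm, dst_port) in ALLOW_RULES

print(firewall_permits("web-vm", "app-vm", 8443))  # True: explicitly allowed
print(firewall_permits("web-vm", "db-vm", 5432))   # False: default deny, even within a 'zone'
# A harmless read and a destructive delete sent to app-vm:8443 get the same answer,
# because the firewall only sees the port:
print(firewall_permits("web-vm", "app-vm", 8443))  # True either way
```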
This brings us to securing APIs
As we discussed in our earlier post, the unit of infrastructure is no longer the server or the VM but the service (or microservice), and so the API to the service becomes the point of security enforcement. This is why we see things like API gateways and service meshes becoming increasingly important — we need new classes of tools to manage the security of APIs, providing fine-grained control over exactly which API requests can be executed by whom. This is nicely explained in the video that introduced me to both service meshes and the Cilium project.
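To see the difference in granularity, here is a sketch (with hypothetical caller identities and paths, not the policy syntax of any particular gateway or mesh) of the kind of check an API-layer enforcement point can make: because it sees the request itself after TLS termination, its policy can be written in terms of which caller may invoke which method on which path, rather than which port.

```python
import fnmatch

# Hypothetical API-level policy: for each caller identity, the (method, path pattern)
# requests it is permitted to make. Anything not listed is denied.
API_POLICY = {
    "reporting-job": [("GET", "/orders/*")],
    "order-service": [("GET", "/orders/*"), ("POST", "/orders"), ("DELETE", "/orders/*")],
}

def request_allowed(identity: str, method: str, path: str) -> bool:
    """Authorize an individual API request, not just a connection to a port."""
    for allowed_method, pattern in API_POLICY.get(identity, []):
        if method == allowed_method and fnmatch.fnmatch(path, pattern):
            return True
    return False

print(request_allowed("reporting-job", "GET", "/orders/42"))     # True
print(request_allowed("reporting-job", "DELETE", "/orders/42"))  # False: same port, different request
```

This is the ‘per-transaction’ evaluation of trust that a port-level firewall cannot provide, and it is roughly what an API gateway or a service-mesh sidecar does on every request.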
A final observation is that we are now reaping the rewards of the SDN architectural approach that combines central control with distributed data planes.
Like many networking people of a certain era, I grew up learning that the end-to-end argument was the basis for all good architecture, and I was unimpressed with the rise of firewalls and other ‘middleboxes’ because they didn’t adhere to the end-to-end principle. But over time I came to realize that firewalls were appealing because they offered a central point of control, and that was important for those operators who needed to secure the network after the fact.
What we saw with the rise of distributed firewalling, and SDN more broadly, was that we could have centralized control (with the benefits that provides for operators) and a distributed implementation that pushed the necessary security functions closer to the endpoints, where they were more effective. Service meshes are the next step in that journey — effectively SDN for a world where APIs are the primary form of communication.
Adapted from the original post which appeared on Systems Approach.
Bruce Davie is a Computer Scientist, Author and Co-Founder of Systems Approach, LLC.
The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.