OCSP is one of the two primary protocols that clients use to obtain certificate revocation information from Certificate Authorities (CAs).
In my previous post, I discussed the origins of OCSP Must-Staple, a certificate extension introduced to address the slow performance, unreliability, soft failures, and privacy issues associated with the Online Certificate Status Protocol (OCSP). For OCSP Must-Staple to be deployed successfully, each of the three major entities in the Public Key Infrastructure (PKI), namely CAs, clients (browsers), and web servers, needs to perform its role reliably and correctly.
In this post, I will look at the current reliability and accuracy of CA responders and what they need to do in order for OCSP Must-Staple to be deployed successfully.
Measuring the reliability of Certificate Authorities’ OCSP responders
To understand the current reliability of CAs’ OCSP responders, we obtained 536 unique OCSP responders from 128 million certificates and measured the responders by sending 14,634 OCSP requests every hour from 25 April 2018 to 4 September 2018. To obtain a comprehensive understanding of how responders behave, we monitored them from six vantage points around the world: Oregon (Amazon Web Services [AWS] US West), Virginia (AWS US East), São Paulo (AWS Brazil), Paris (AWS France), Sydney (AWS Australia), and Seoul (AWS South Korea).
We first focused on the portion of OCSP requests for which we were unable to interact successfully with the OCSP responder. As OCSP requests are sent over HTTP, we defined a successful request as one that resulted in the server responding with HTTP status code 200. Figure 1 shows the fraction of requests that were successful from each of the six vantage points.
Figure 1 — Fraction of requests that resulted in a successful response for the dataset, for each of our measurement clients.
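The success criterion described above (a request counts as successful only if the responder answers with HTTP status code 200) can be illustrated with a minimal sketch. This is not the measurement code used in the study; the function name `success_fraction` and the record layout are assumptions made for illustration.

```python
from collections import defaultdict

def success_fraction(records):
    """Return the fraction of successful requests per vantage point.

    `records` is an iterable of (vantage_point, http_status) pairs, where
    http_status is None when no HTTP response was received at all
    (for example, a DNS or TCP failure). This layout is hypothetical.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for vantage, status in records:
        totals[vantage] += 1
        # Only HTTP 200 counts as success, mirroring the definition in the text.
        if status == 200:
            successes[vantage] += 1
    return {v: successes[v] / totals[v] for v in totals}

# Illustrative input only; these are not values from the study.
sample = [
    ("Oregon", 200), ("Oregon", 404),
    ("Seoul", 200), ("Seoul", None), ("Seoul", 200), ("Seoul", 200),
]
fractions = success_fraction(sample)
```

Keeping the per-vantage totals separate is what allows the comparison across locations that Figure 1 presents.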
The request success rate for OCSP responders is very low
First, we observed that in none of our measurement client locations were we ever able to receive successful responses from all OCSP responders in a given hour. On average, 1.7% of requests failed; for two OCSP responders, we were never able to make a successful OCSP request from any of our six vantage points. This implies that clients served certificates that rely on these responders would never be able to check the revocation status of all certificates in the chain.
For 29 other responders, there was at least one measurement client that was never able to make a successful request. Looking into our logs, we found a variety of reasons why there were persistent failures:
- For 16 responders, we observed persistent DNS lookup failures (NXDOMAIN) from at least one client.
- For 4 additional responders, we were never able to establish a TCP connection to them from at least one client.
- For 8 more responders, we persistently received HTTP 4xx or 5xx response codes from at least one client.
- Finally, for 1 responder, at least one client was unable to connect to the HTTPS URL because it was served with an invalid certificate.
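The four failure categories listed above can be told apart programmatically. The sketch below shows one way to do this with Python's standard library, assuming requests are made with `urllib.request.urlopen` and the resulting exception is passed in; the function name and category labels are hypothetical and are not taken from the study's tooling.

```python
import socket
import ssl
import urllib.error

def classify_failure(exc):
    """Map an exception raised by urllib.request.urlopen to a failure category."""
    # HTTPError is a subclass of URLError, so it must be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return "http_error"  # the server answered with a 4xx or 5xx status
    # urllib wraps lower-level errors in URLError; inspect the underlying reason.
    reason = exc.reason if isinstance(exc, urllib.error.URLError) else exc
    if isinstance(reason, socket.gaierror):
        # Name resolution failed. Note that gaierror alone does not
        # distinguish NXDOMAIN from other resolver errors.
        return "dns_failure"
    if isinstance(reason, ssl.SSLCertVerificationError):
        return "invalid_certificate"
    if isinstance(reason, OSError):
        # Refused, reset, or timed-out connections all end up here.
        return "tcp_failure"
    return "other"
```

The order of the checks matters: `socket.gaierror` and `ssl.SSLCertVerificationError` are both subclasses of `OSError`, so the generic TCP case has to come last.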
Failure rate varies substantially across different locations
The average failure rate ranges between 2.2% (Virginia) and 5.7% (São Paulo) of requests. We found that the measurement clients in Oregon, São Paulo, Paris, and Seoul always failed to fetch OCSP responses from one, seven, one, and four responders, respectively. For example, five OCSP URLs are subdomains of digitalcertvalidation.com, all of which returned HTTP 404 errors to our measurement client in São Paulo; statush.digitalcertvalidation.com is one of those URLs.
Unfortunately, a certificate for wellsfargo.com, the website of Wells Fargo, one of the largest banks in the United States, relies on this OCSP URL; therefore, no client in São Paulo would be able to fetch the revocation status of the wellsfargo.com certificate from the OCSP servers, even if the certificate had been compromised and revoked.
Transient outages lasting hours
During our measurement period, we observed that 211 (36.8%) of the OCSP responders experienced at least one outage from at least one vantage point. For example, we noticed that all of our OCSP requests made to ocsp.comodoca.com failed for two hours starting at 19:00 (UTC) on 25 April. Interestingly, this outage was observed only at the clients in Oregon, Sydney, and Seoul.
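To make the notion of a transient outage concrete, the sketch below finds runs of consecutive failed hours in an hourly time series for one responder and one vantage point. The definition used here (an outage is a maximal run of hours in which no request succeeded) is an assumption for illustration, as is the function name `find_outages`; the study's exact criterion may differ.

```python
def find_outages(hourly_ok):
    """Return (start_index, length_in_hours) for each run of failed hours.

    `hourly_ok` holds one boolean per measurement hour: True if at least one
    request to the responder succeeded in that hour, False otherwise.
    """
    outages = []
    start = None
    for i, ok in enumerate(hourly_ok):
        if not ok and start is None:
            start = i  # a new outage begins
        elif ok and start is not None:
            outages.append((start, i - start))  # the outage ended at hour i
            start = None
    if start is not None:
        # The series ended while the responder was still unreachable.
        outages.append((start, len(hourly_ok) - start))
    return outages

# Illustrative series only: one two-hour outage, then one that is still ongoing.
example = find_outages([True, False, False, True, False])
```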
We also found that an additional 14 OCSP responders experienced outages at the same time, all of which were related to Comodo: the domain names of eight OCSP responders had CNAME records that pointed to ocsp.comodoca.com, and the domain names of the remaining six OCSP responders resolved to the same IP address as ocsp.comodoca.com.
Similarly, we found that all of our OCSP requests to the servers managed by WoSign and StartSSL failed for an hour starting at 22:00 (UTC) on 3 August across all regions.
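The Comodo-related responders described above were linked in two ways: through CNAME records pointing at a common hostname, and through resolving to the same IP address. The following sketch shows how such relationships could be derived from already-collected DNS data. The record layout, the function name, and the hostnames and addresses in the example (reserved `.example` names and documentation IP ranges) are all hypothetical.

```python
def related_responders(reference, records):
    """Find responders that share infrastructure with `reference`.

    `records` maps each hostname to a dict with two keys:
      "cname": the CNAME target, or None if the name has no CNAME record
      "ips":   the set of IP addresses the name resolves to
    Returns two sorted lists: hosts whose CNAME points at `reference`, and
    hosts that otherwise share at least one IP address with it.
    """
    ref_ips = records[reference]["ips"]
    via_cname, via_ip = [], []
    for host, rec in records.items():
        if host == reference:
            continue
        if rec["cname"] == reference:
            via_cname.append(host)
        elif rec["ips"] & ref_ips:
            via_ip.append(host)
    return sorted(via_cname), sorted(via_ip)

# Hypothetical DNS data, not taken from the study.
dns_records = {
    "ocsp.ca.example": {"cname": None, "ips": {"192.0.2.10"}},
    "ocsp.reseller-a.example": {"cname": "ocsp.ca.example", "ips": {"192.0.2.10"}},
    "ocsp.reseller-b.example": {"cname": None, "ips": {"192.0.2.10"}},
    "ocsp.unrelated.example": {"cname": None, "ips": {"198.51.100.7"}},
}
cname_linked, ip_linked = related_responders("ocsp.ca.example", dns_records)
```

Grouping responders this way helps explain why a single infrastructure failure can surface as simultaneous outages across many seemingly independent OCSP URLs.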
What CAs need to do to deploy OCSP Must-Staple
To implement OCSP Must-Staple, CAs need to improve the availability, effectiveness, and correctness of the OCSP responders that serve web servers.
Once this vital component of the PKI is amended, all that is required from CAs is to include the OCSP Must-Staple extension in the certificates that they issue, with the domain owners’ consent.
Stay tuned for my next post in which I discuss the results of our study into the current effectiveness of client and web server roles in the PKI, including recommendations as to what action these entities need to take before OCSP Must-Staple can be effectively deployed.
Taejoong (Tijay) Chung is an Assistant Professor at Rochester Institute of Technology (RIT).