Whose certificate is it anyway?

By Jan Schaumann on 28 Jun 2023


This is the third blog post on the topic of the centralization of the Internet. The first post discussed the diversity of authoritative name servers, and the second post discussed the diversity of MX records.

Remember the X.509 PKI? You know, the one that gave us such hits as ‘Oh wait, certificate revocation is basically all broken’, the one where that Dutch CA issued a fraudulent *.google.com certificate, and the latest surprise in certificate issuance, a reverse RCE in the acme.sh script. It’s great because it secures virtually all web traffic, and all you have to do is get a certificate from a Certificate Authority (CA) — anyone will do!

That’s right, no need to be picky — any CA can sign any domain name, so you can pick from literally hundreds since that is the number of trusted CA root certificates baked into your browser or included in most operating systems:

$ security find-certificate -a \
        -Z /System/Library/Keychains/SystemRootCertificates.keychain | \
        sed -n -e 's/.*alis"<blob>=//p' | wc -l
     166
$ security find-certificate -a                                         \
        -Z /System/Library/Keychains/SystemRootCertificates.keychain | \
        sed -n -e 's/.*alis"<blob>=//p' | more
"Go Daddy Root Certificate Authority - G2"
"HARICA TLS ECC Root CA 2021"
"SwissSign Platinum CA - G2"
"NAVER Global Root Certification Authority"
"chambersignroot@chambersign.org"
"OISTE WISeKey Global Root GA CA"
"KISA RootCA 1"
"Actalis Authentication Root CA"
"D-TRUST Root CA 3 2013"
"Apple Root CA - G2"
"StartCom Certification Authority G2"
"SSL.com EV Root Certification Authority ECC"
"Hellenic Academic and Research Institutions RootCA 2015"
"ePKI Root Certification Authority"
"AAA Certificate Services"
"VeriSign Class 3 Public Primary Certification Authority - G5"
"VeriSign Class 3 Public Primary Certification Authority - G3"
"Trustis FPS Root CA"
"Apple Root CA - G3"
[...]

But chances are, you really only want a very small number of CAs to do that — the ones that you have a business relationship with or that you use for free. To solve that problem, the industry has tried a few things with varying degrees of success.

Possible alternatives to Certification Authority Authorization (CAA) records

For a while, we tried to tell the browser which CAs can issue a certificate for a given domain via dynamic HTTP Public Key Pinning (HPKP), an HTTP response header (Public-Key-Pins). But like HTTP Strict Transport Security (HSTS), for example, HPKP does not address the Trust on First Use issue. In addition, it was quickly identified as a pretty big foot gun, so it was deprecated again and support for it was removed from the browsers. The exception, of course, is static HPKP: pins that are baked into the browsers remain alive (and are likely forgotten by the various companies who submitted their pins years ago).

Note: As of 16 May 2023, it looks like only Google, Facebook, the Tor Project, and Yahoo have static pins in Chrome. Considering that changing or updating your static pins requires the release and propagation of new versions of multiple browsers, no less, across all markets you care about, it might be time to deprecate that, too.

Certificate Transparency (CT) was supposed to make up for (dynamic) HPKP being deprecated, but of course that shifts the defence mechanism from prevention to detection. Monitoring all certificates in the logs for all of your domains is far from trivial. Accounting for typo- and bitflip squatting, insult domains, and reserving every language variant of your trademark in the almost 1,500 Top-Level Domains (TLDs), many large organizations end up with literally thousands of domains to keep track of. It’s no surprise that CT Monitoring as a (paid) service is now a thing.

CT is enforced in browsers nowadays, which is why the Expect-CT header, defined in RFC 9163, was pretty short-lived.

And, of course, there’s also a solution that works perfectly well but isn’t used at all because it depends on Domain Name System Security Extensions (DNSSEC) — pinning your certificate in the DNS using DNS-based Authentication of Named Entities, aka DANE, but that aside…

CAA records

The preventative mechanism that has seen at least some adoption is the use of CAA DNS Resource Records specified in RFC 8659.

Checking CAA records was made a requirement for CAs via CA/B Forum Ballot 187 in 2017. The idea here is that you specify the name of the CAs that you wish to grant authorization to issue certificates for the domain in question in the CAA records. Sounds straightforward, right?

It’s worth noting that compliance with CAA records, like Certificate Transparency and some other restrictions, is not required for root certs that you (or your organization’s IT policy) installed in your trust bundle.

Unfortunately, there are a few pitfalls to consider. First, the determination of the CAA record to use for a given fully qualified domain name (FQDN) is performed as a left-to-right first match: the full name is checked first, then each parent domain in turn, and the first name that has CAA records wins. This is useful because it allows you to have different records for sub.domain.example.com and domain.example.com, with perhaps a catch-all record set on the second-level domain (example.com). (And yes, you can have CAA records on a TLD, but as of early May 2023, no TLD has one set.)
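
The lookup order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `candidate_names`, `relevant_rrset`, `ZONE`, and `fake_lookup` are hypothetical names, the zone data is made up, and CNAME handling is left out entirely.

```python
from typing import Callable, List, Optional

def candidate_names(fqdn: str) -> List[str]:
    """Return the names to check for CAA records, most specific first."""
    labels = fqdn.rstrip(".").split(".")
    # Drop one label from the left at each step: a.b.c -> b.c -> c
    return [".".join(labels[i:]) for i in range(len(labels))]

def relevant_rrset(fqdn: str,
                   lookup: Callable[[str], List[str]]) -> Optional[List[str]]:
    """Return the first non-empty CAA record set found while climbing the tree."""
    for name in candidate_names(fqdn):
        records = lookup(name)
        if records:
            return records
    return None  # no CAA records anywhere on the path

# Hypothetical zone data standing in for real DNS answers.
ZONE = {
    "domain.example.com": ['0 issue "letsencrypt.org"'],
    "example.com": ['0 issue ";"'],
}

def fake_lookup(name: str) -> List[str]:
    """Stand-in for a real DNS query; returns [] when no CAA records exist."""
    return ZONE.get(name, [])
```

With this made-up zone, sub.domain.example.com picks up the records set on domain.example.com, while www.example.com falls through to the catch-all on example.com.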

Where this gets complicated, however, is when it comes to CNAME records. Per RFC 2181, a given label in the DNS may not have any other records if it has a CNAME record (except the associated DNSSEC records), and the CAA resolution must follow the CNAME. This gets messy quickly.

The other drawback is that you have to have your act together for all of your domains. You need to know which domains are used where and how, which may have subdomains CNAMEd to third parties, which have subdomains you delegate, which you use for internal or external use, and so on. Many large organizations are really, really bad at this.

But, it is what it is. It’s still better than allowing any CA of questionable integrity to issue certificates in your domains.

Use of CAA records

Like before for NS and MX records, I once again pulled down the various generic Top-Level Domain (gTLD) zone files and combined them with whatever country code Top-Level Domain (ccTLD) data I could get my hands on, ending up with just around 214 million domain names in almost 1,200 TLDs. In addition, I also examined the Tranco Top 1M domains list and compared results for all TLDs and the Top 1M domains.

In total, fewer than three million domains have CAA records; among the Top 1M domains, fewer than 50k do. That’s barely 1.4% of domains across all TLDs and 4.8% of the Top 1M domains, which is not that great, adoption-wise.

Pie charts showing CAA records usage across all TLDs vs the top 1M from the Tranco list (Figure 1).

Of those domains that do have CAA records, what do they look like? RFC 8659 defines the resource record to be of the format CAA <flags> <tag> <value>. The tag-value portion is called a property, and each domain may have zero, one, or more properties defined.
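
As a rough sketch of that format, the following Python snippet splits record data as printed by tools such as host (for example, 0 issue "letsencrypt.org") into its three fields. The names `CAARecord`, `parse_caa`, and `is_standard_tag` are hypothetical, and the parser is deliberately lenient rather than a faithful implementation of the RFC grammar.

```python
from typing import NamedTuple

# The three property tags defined in RFC 8659.
STANDARD_TAGS = frozenset({"issue", "issuewild", "iodef"})

class CAARecord(NamedTuple):
    flags: int
    tag: str
    value: str

def parse_caa(rdata: str) -> CAARecord:
    """Parse CAA record data such as '0 issue "letsencrypt.org"'."""
    flags, tag, value = rdata.split(None, 2)
    # Tags are lower-cased here so that 'Issue' and 'issue' compare equal.
    return CAARecord(int(flags), tag.lower(), value.strip().strip('"'))

def is_standard_tag(tag: str) -> bool:
    """True only for the three tags RFC 8659 defines."""
    return tag.lower() in STANDARD_TAGS
```

A misspelled tag such as issiue still parses, but `is_standard_tag` rejects it.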

The majority of domains that do have CAA records set appear to use a small number of CAs, commonly five or fewer. Presumably with one issue and one issuewild record per CA, that adds up to around 10 records in total, which is indeed the most frequently seen number of CAA records (Table 1).

Number of CAA records	Number of domains
10	1,060,973
8	841,310
1	420,128
2	313,908
3	65,862

Table 1 — Domains that set CAA records appear to use a small number of CAs.

Of course, there are outliers, too. Almost 900 domains have over 20 CAA records, and some domains have even more than 50 (Table 2).

Number of CAA records	Domain name
59	benemortasia.us.
59	lifelessandcalm.com.
59	unorganized.email.
57	benemortasia.com.
36	estrategiaadigital.fun.

Table 2 — 900 domains were found to have more than 20 CAA records, and some over 50.

CAA flags

The flags field should practically be exactly either 0 or 128, as no other values are currently defined. But this being an RFC, it’s of course needlessly complicated and easy to misunderstand — the Issuer Critical Flag is bit 0 of the flags field and not the value of this field. That is, to set bit 0, you have to specify a value of 128; a value of 1 still leaves that bit unset.

It’s therefore not surprising to find the top flags as shown in Table 3.

Flag	Number of records	Comment
0	20,064,100	valid, critical flag unset
1	219,928	invalid, critical flag unset
128	49,775	valid, critical flag set
10	735	invalid, critical flag unset
250	18	invalid, critical flag set

Table 3 — Top flags encountered.

There were an additional 50 other values found, ranging from 2 to 250, with no clear indication of what people thought those values might mean.
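
The bit arithmetic behind the comments in Table 3 can be checked with a short Python sketch. `ISSUER_CRITICAL` and `describe_flags` are hypothetical names, and the "valid"/"invalid" wording simply mirrors the table.

```python
# Bit 0 in the RFC's numbering is the most significant bit of the
# one-byte flags field, which corresponds to the decimal value 128.
ISSUER_CRITICAL = 0x80

def describe_flags(flags: int) -> str:
    """Describe a CAA flags value the way Table 3 does."""
    defined = flags in (0, ISSUER_CRITICAL)      # only these two values are defined
    critical = bool(flags & ISSUER_CRITICAL)     # is the top bit set?
    validity = "valid" if defined else "invalid"
    state = "set" if critical else "unset"
    return f"{validity}, critical flag {state}"
```

A value of 1 sets only the least significant bit, so the critical flag stays unset; 250 happens to include the top bit, so the flag is set even though the value as a whole is not defined.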

CAA properties

RFC 8659 defines three different properties: issue, issuewild, and iodef. That’s it. But of course, you won’t be surprised to find that across all the domains analysed, we find over 100 additional words, including different misspellings of those three properties (such as issiue, issuewld, iodev) and what seems like guesswork based on expected functionality (for example, enable). The overwhelming majority of records are, however, correct and break down across the three valid properties as shown in Figure 2.

Pie charts showing CAA records by tag type; all TLDs vs the top 1M from the Tranco list. — Figure 2 — CAA records by tag type; all TLDs vs the top 1M from the Tranco list.

Not surprisingly, the majority of organizations implementing CAA records want to restrict issuance, with most also using wildcard issuance restrictions. What is a bit surprising, perhaps, is that only a very small number of organizations appear interested in receiving reports of attempted unauthorized issue requests. That is likely explained by the fact that RFC 6844 (the predecessor of RFC 8659) makes honouring iodef optional (“…MAY report…”), and at least Let’s Encrypt has publicly stated that they do not send emails on failed issuance due to CAA.

Note: CA/B Forum Ballot SC13 and Ballot SC14 added contactemail and contactphone to allow domain owners to provide information that increasingly is hidden in whois. But these are not defined in the RFC and are very rarely used; only 741 out of all domains observed used contactemail (54 out of the Top 1M domains), and 23 used contactphone (three out of the Top 1M domains).

The number of domains using any combination of these three properties is shown in more detail in Table 4.

Domains with…	All TLDs	Top 1M domains
`issue`	2,851,746	151,046
`issuewild`	2,173,641	110,532
`iodef`	182,461	8,622
`issue`, `issuewild` and `iodef`	86,098	4,041
`issue` and `issuewild`	2,139,747	28,910
`iodef` and `issue`	178,556	8,265
`iodef` and `issuewild`	87,840	4,195
either `issue` or `issuewild` and not `iodef`	2,705,342	24,869
only `issue`	711,999	17,555
only `issuewild`	33,894	1,856
only `iodef`	2,163	99

Table 4 — Number of domains using any combination of issue, issuewild, and iodef.
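
Because the rows in Table 4 overlap, it may help to see how such counts can be derived from per-domain sets of tags. The following Python sketch uses made-up domains and a hypothetical `combination_counts` helper; it covers only a few of the table's rows and is not the tooling used for this analysis.

```python
from typing import Dict, Set

STANDARD = {"issue", "issuewild", "iodef"}

def combination_counts(tags_by_domain: Dict[str, Set[str]]) -> Dict[str, int]:
    """Count domains per (overlapping) combination of CAA property tags."""
    rows = {
        "issue": lambda t: "issue" in t,
        "issue and issuewild": lambda t: {"issue", "issuewild"} <= t,
        "issue, issuewild and iodef": lambda t: STANDARD <= t,
        "either issue or issuewild and not iodef":
            lambda t: bool(t & {"issue", "issuewild"}) and "iodef" not in t,
        "only issue": lambda t: (t & STANDARD) == {"issue"},
        "only iodef": lambda t: (t & STANDARD) == {"iodef"},
    }
    return {
        label: sum(1 for tags in tags_by_domain.values() if matches(tags))
        for label, matches in rows.items()
    }

# Hypothetical input: domain name -> set of tags seen in its CAA records.
SAMPLE = {
    "a.example": {"issue", "issuewild", "iodef"},
    "b.example": {"issue"},
    "c.example": {"issue", "issuewild"},
    "d.example": {"iodef"},
}
```

Note that a domain such as a.example is counted in several rows at once, which is why the rows of such a table do not add up to the total number of domains.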

`iodef`

RFC 8659 defines three valid URL schemes for CAs to report requests for issuance that violate the policy: mailto, http, and https. For the most part, domains get this right, and not surprisingly prefer the simpler mailto reporting mechanism, as shown in Table 5.

`iodef` method	All TLDs	Top 1M domains
`mailto`	174,230	8,248
raw email (invalid)	7,294	248
https	166	24
http	18	0

Table 5 — Most domains prefer the mailto reporting mechanism.

Most domains have a single iodef record, although some have multiple, while others clearly misunderstood the proper syntax of the Resource Record (RR), and at least one is using the record as a Log4Shell canary:

$ host -t caa elevate.services | grep iodef
elevate.services has CAA record 0 iodef "mailto:imdomains@intermedia.net"
elevate.services has CAA record 0 iodef "mailto:hostmaster@elevate.services"
elevate.services has CAA record 0 iodef "mailto:hostmaster@intermedia.net"
$ host -t caa smartroom.com | grep iodef
smartroom.com has CAA record 0 iodef "comodoca.com"
smartroom.com has CAA record 0 iodef "usertrust.com"
smartroom.com has CAA record 0 iodef "trust-provider.com"
smartroom.com has CAA record 0 iodef "mailto:domains@bmcgroup.com"
smartroom.com has CAA record 0 iodef "sectigo.com"
$ host -t caa kyhwana.org | grep iodef
kyhwana.org has CAA record 0 iodef "mailto:kyhwana@gmail.com"
kyhwana.org has CAA record 0 iodef "${jndi:ldap://baylwjkcgkp30xx2ut082owpu.canarytokens.com/a}"
$
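
The categories used in Table 5 can be approximated with a small classifier. This Python sketch is an assumption about how one might bucket the values, not the method used for the numbers above; `classify_iodef` is a hypothetical name, and the "raw email" check is a deliberately crude heuristic.

```python
VALID_SCHEMES = ("mailto", "http", "https")

def classify_iodef(value: str) -> str:
    """Bucket an iodef value by its URL scheme, roughly following Table 5."""
    scheme, separator, _rest = value.partition(":")
    if separator and scheme.lower() in VALID_SCHEMES:
        return scheme.lower()
    # No recognised scheme: a bare address like user@example.com is a
    # common mistake (the 'mailto:' prefix was forgotten).
    if "@" in value and ":" not in value:
        return "raw email (invalid)"
    return "other (invalid)"
```

Applied to the records above, the mailto: values are classified as mailto, while bare names such as comodoca.com and the Log4Shell canary string both end up in the "other (invalid)" bucket.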

The most frequently used iodef records are shown in Figures 3 and 4:

Pie chart showing CAA iodef records for all TLDs. — Figure 3 — CAA iodef records for all TLDs.

Pie chart showing CAA iodef records for the Tranco top 1M domains. — Figure 4 — CAA iodef records for the Tranco top 1M domains.

Note the dominance of security@yahoo-inc.com for the iodef records. I’m pleased to see this, as setting the right CAA policy and adding default CAA records for all of Yahoo’s (many) parked domains was something I pushed for during my time there. Yay! \o/

`issue` and `issuewild`

Ok, so now let’s see what CAs the different domains authorize. In total, I found almost 2,200 distinct issue records across all TLDs (456 for the Top 1M domains) and 878 distinct issuewild records (227 for the Top 1M domains).

The various misspellings and otherwise invalid records aside, the top 20 CAs in these records are shown in Table 6.

`issue` records (in all TLDs)	Count	`issue` records (in Top 1M domains)	Count
1. letsencrypt.org	2,769,264	1. letsencrypt.org	38,218
2. digicert.com	2,059,878	2. digicert.com	29,777
3. comodoca.com	2,010,652	3. comodoca.com	24,098
4. globalsign.com	1,901,486	4. pki.goog	19,078
5. sectigo.com	1,300,807	5. globalsign.com	9,522
6. pki.goog	384,434	6. sectigo.com	8,632
7. trust-provider.com	157,727	7. amazon.com	5,382
8. ;	79,788	8. amazonaws.com	2,545
9. amazon.com	70,065	9. amazontrust.com	2,139
10. certum.pl	32,870	10. godaddy.com	2,020
11. entrust.net	23,103	11. awstrust.com	1,998
12. godaddy.com	22,537	12. entrust.net	949
13. geotrust.com	14,587	13. certum.pl	620
14. starfieldtech.com	13,776	14. ;	417
15. ssl.com	13,484	15. quovadisglobal.com	407
16. amazonaws.com	13,051	16. geotrust.com	395
17. amazontrust.com	10,922	17. symantec.com	354
18. awstrust.com	10,500	18. trust-provider.com	338
19. rapidssl.com	9,549	19. thawte.com	318
20. comodo.com	7,968	20. comodo.com	268

Table 6 — The top 20 CAs showing the issue and issuewild records.

What you see here shows the overwhelming majority of CAA records using just a handful of CAs. (The use of ; signals that no CA is allowed to issue a certificate for the domain in question; this is used primarily for parked and otherwise unused domains.)

But recall what RFC 8659 says about the meaning of these records (emphasis mine):

If the issue Property Tag is present in the Relevant RRset for an FQDN, it
is a request that Issuers:

1. Perform CAA issue restriction processing for the FQDN, and
2. Grant authorization to issue certificates containing that FQDN to the
holder of the issuer-domain-name or a party acting under the explicit
authority of the holder of the issuer-domain-name.

Who is the ‘holder of the issuer-domain-name’ for geotrust.com, rapidssl.com, or thawte.com? That’s right: DigiCert. That is, by specifying, say, geotrust.com in your CAA record, you are implicitly also granting the various DigiCert subsidiaries authorization. So, we can collate many of the above records, which then gives us a breakdown of the most popular CAs used in CAA issue and issuewild records as shown in the following figures.
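
Collating records by the holder of the issuer domain name can be sketched as follows. The `HOLDER` mapping is illustrative and intentionally incomplete: it contains only the DigiCert-held names mentioned above, and a real analysis would need a complete, verified list. The `collate` helper is hypothetical; the sample counts are taken from the Top 1M column of Table 6.

```python
from collections import Counter
from typing import Dict

# Illustrative, incomplete mapping of issuer domain name -> holder.
HOLDER = {
    "digicert.com": "DigiCert",
    "geotrust.com": "DigiCert",
    "rapidssl.com": "DigiCert",
    "thawte.com": "DigiCert",
}

def collate(record_counts: Dict[str, int]) -> Counter:
    """Sum record counts per holder; unmapped names are kept as they are."""
    totals: Counter = Counter()
    for issuer, count in record_counts.items():
        name = issuer.lower()
        totals[HOLDER.get(name, name)] += count
    return totals

# A few rows from Table 6 (Top 1M domains).
SAMPLE = {
    "letsencrypt.org": 38218,
    "digicert.com": 29777,
    "geotrust.com": 395,
    "thawte.com": 318,
}
```

Keep in mind that this sums records, not domains: a domain that lists both digicert.com and geotrust.com contributes twice to the DigiCert total.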

Pie chart showing the top CAA issue records for all TLDs. — Figure 5 — Top CAA `issue` records for all TLDs.

Pie chart showing the top CAA issue records from the Tranco top 1M domains. — Figure 6 — Top CAA `issue` records from the Tranco top 1M domains.

Pie chart showing the top CAA issuewild records for all TLDs. — Figure 7 — Top CAA `issuewild` records for all TLDs.

Pie chart showing the top CAA issuewild records from the Tranco top 1M domains. — Figure 8 — Top CAA `issuewild` records from the Tranco top 1M domains.

Or, if you prefer Pareto charts:

Chart showing the top CAA issue records for all TLDs. — Figure 9 — Top CAA issue records for all TLDs.

Chart showing the top CAA issue records from the Tranco Top 1M domains. — Figure 10 — Top CAA issue records from the Tranco Top 1M domains.

Chart showing the top CAA issuewild records for all TLDs. — Figure 11 — Top CAA issuewild records for all TLDs.

Chart showing the top CAA issuewild records from the Tranco Top 1M domains. — Figure 12 — Top CAA issuewild records from the Tranco Top 1M domains.

Extensions

As noted above, even authorizing a given CA can still end up being rather broad, and you may well want to have much tighter restrictions, such as specifying which specific account under a given CA may request certificates for a domain, or how the CA should validate the request. For this, RFC 8657 specifies two extensions: the accounturi parameter and the validationmethods parameter.

There is also a draft on Signed HTTP Exchanges within the Web Packages group that adds another parameter: cansignhttpexchanges. As of May 2023, it looks like the only CAs supporting this parameter are digicert.com and pki.goog (see, for example, DigiCert’s documentation as well as a discussion on the Let’s Encrypt forum), although I also saw a very small number of domains setting this parameter on records authorizing letsencrypt.org, sectigo.com, amazon.com, and globalsign.com (I’m guessing those were set, but not honoured).

In addition, I encountered three more extension parameters that appear to not be well documented: policy=ev (found only in combination with comodo.com), root=g1-class3 (found only in combination with cacert.org), and account= (found only in combination with letsencrypt.org, digicert.com, cacert.org, and Amazon’s CAs). It is not clear to me whether these are actually supported by the different CAs, or if they are opportunistically or mistakenly set by the domain owner.
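
To make the shape of these parameters concrete, here is a lenient Python sketch that splits an issue or issuewild value into the issuer domain name and its parameters. `parse_issue_value` is a hypothetical helper, it does not enforce the exact grammar from the RFCs, and the account number and the `yes` value in the usage example are made up for illustration.

```python
from typing import Dict, Tuple

def parse_issue_value(value: str) -> Tuple[str, Dict[str, str]]:
    """Split an issue/issuewild value into (issuer domain name, parameters)."""
    issuer, _, rest = value.partition(";")
    params: Dict[str, str] = {}
    for part in rest.split(";"):
        part = part.strip()
        if not part:
            continue
        name, separator, param_value = part.partition("=")
        if separator:  # ignore fragments that are not name=value pairs
            params[name.strip()] = param_value.strip()
    return issuer.strip(), params

# Hypothetical record values.
EXAMPLE = ("letsencrypt.org; "
           "accounturi=https://acme-v02.api.letsencrypt.org/acme/acct/12345; "
           "validationmethods=dns-01")
issuer, params = parse_issue_value(EXAMPLE)
print(issuer, params)
```

A value of just ; yields an empty issuer name and no parameters, which matches the "no CA may issue" case.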

The use of these extension parameters broken down by the number of domains using them looks like this:

Extension parameter	Count (all TLDs)	Count (Top 1M domains)
`cansignhttpexchanges`	259,245	17,108
`validationmethods`	559	43
`accounturi`	243	29
`account`	163	29
`root`	11	0
`policy`	9	4

Table 7 — Extension parameter counts.

The validationmethods values encountered were dns-01 (dominant), http-01, and tls-alpn-01; the accounturi values were primarily under https://acme-v02.api.letsencrypt.org/, with just a handful under https://acme-v01.api.letsencrypt.org/ and https://acme-staging-v02.api.letsencrypt.org/.

Summary

Having analysed around 214M domain names, below are my main findings:

CAA records are still not widely used

Across all TLDs, only 1.4% of domains use CAA records; out of the Top 1M domains, only 4.8%. Considering that CAA records have been around since 2010 and honouring them has been mandatory for CAs since 2017, this seems like a poor adoption rate, likely because (a) the PKI threat model it addresses is poorly understood, and (b) the implementation can lead to difficulties if the use of domain names and third-party services is not clearly organized.

Most people don’t set `iodef`

Those domains that do use CAA records tend to use the issue (52% for all TLDs, 55.9% for the Top 1M domains) and issuewild (46.9% and 40.9%) records, but only a minuscule fraction (0.9% and 3.2%) set iodef. This may be a sign that organizations generally are not well prepared to handle error reports, although, even if honouring iodef is optional in the RFC, I am a bit surprised by these abysmal numbers.

Extensions are not widely used

This is not surprising, since they require subject matter expertise that, frankly, is absent in most organizations. What is surprising, to me at least, is that the non-standard cansignhttpexchanges extension is so dominant here. I suspect this is something that is being pushed by Google — hence the frequent use on pki.goog — as part of the Accelerated Mobile Pages (AMP) framework, but no industry-wide consensus seems to have developed.

A small number of CAs dominate

This is not surprising, but the concentration is still stark — seven CAs account for over 99% of all CAA issue and issuewild records (10 CAs for 99% of the Top 1M domains); three alone for over 75%: Comodo, DigiCert, and Let’s Encrypt.

Even though this only covers the small percentage of domains that do set CAA records, I would not be surprised if the overall use of CAs across all domains followed a similar distribution. (In some markets, regional players will play a bigger role; once again the inability to get access to all ccTLD zones makes this difficult to assess.)

If you’re wondering whether you really need to have over 160 different CAs in your trust bundle, I suspect the answer is ‘no’. You could likely get away with fewer than 20 and wouldn’t notice the difference. But whether that’s a good thing and whether it’s wise for the entire Internet to place all — well, >99% — of its certificates/eggs into fewer than 10 CAs/baskets seems more than questionable.

Jan Schaumann is a Distinguished Infrastructure Security Architect, and Adjunct Professor of Computer Science, with an interest in information security and the overall health of the internet, as well as the safety and privacy of its users. You can follow Jan on Twitter and Mastodon.

This post is adapted from the original at Jan’s Blog.

