Bulletproof TLS Newsletter #63
The Let’s Encrypt Certificate Authority Authorization incident
31 March 2020
Author: Hanno Böck

This issue was distributed to 53,921 email subscribers.

Bulletproof TLS Newsletter is a free periodic newsletter bringing you commentary and news surrounding SSL/TLS and Internet PKI, designed to keep you informed about the latest developments in this space.

In this issue:

  1. The Let’s Encrypt Certificate Authority Authorization incident
  2. Short news

The Let’s Encrypt Certificate Authority Authorization incident

The Let’s Encrypt certificate authority discovered a bug in the CAA record checks of its certificate issuance process, which led it to announce its intent to revoke three million certificates. However, because of the disruption this would have caused, Let’s Encrypt did not go through with that plan in full.

The incident was caused by a bug in the Boulder software that Let’s Encrypt uses on its certificate issuance servers. CAA records allow domain owners to set a DNS record that lists which certificate authorities are allowed to issue certificates for a domain. The bug was triggered when a new certificate was requested for more than one host name and those host names, including their CAA records, had already been checked within the past 30 days.
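To make the role of a CAA record more concrete, here is a minimal sketch in Python of the decision a certificate authority has to make. It is not Boulder's code. The function name and the (tag, value) input format are assumptions made for this illustration, and the sketch ignores wildcard ("issuewild") properties, the critical flag, and the lookup of parent domains.

```python
def ca_may_issue(caa_records, ca_identifier):
    """Return True if the CAA record set permits ca_identifier to issue.

    caa_records is a list of (tag, value) tuples, for example
    ("issue", "letsencrypt.org"). This input format is an assumption
    of this sketch, not a real library API.
    """
    # Collect the issuer domains from all "issue" properties.
    # Anything after a semicolon is a parameter and is ignored here.
    issuers = [
        value.split(";")[0].strip().lower()
        for tag, value in caa_records
        if tag.lower() == "issue"
    ]
    # Without any "issue" property, CAA does not restrict issuance.
    if not issuers:
        return True
    # Otherwise the CA must be listed explicitly. An "issue" property
    # with an empty issuer (value ";") therefore matches no CA at all.
    return ca_identifier.lower() in issuers
```

With this sketch, a domain that publishes only an "issue" property for another certificate authority would not permit issuance by Let’s Encrypt, while a domain without any CAA records would.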

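In such a situation, the certificate authority needs to recheck the CAA records, because the Baseline Requirements only accept a CAA check made shortly before issuance (within eight hours, in Let’s Encrypt’s case). That recheck was flawed: in certificates with multiple host names, only one of the host names was actually rechecked. A detailed technical description of the bug can be found in Mozilla’s bug tracker.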
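The effect of such a flaw can be illustrated with a simplified Python sketch. This is not the actual Boulder code, which is written in Go. The function names and the caa_allows callback are hypothetical, and the flawed variant simply checks a single name (the first, in this sketch) once per requested host name.

```python
def recheck_flawed(host_names, caa_allows):
    # Flawed variant: the loop runs once per host name, but every
    # iteration checks the same name, so the others are never rechecked.
    return all(caa_allows(host_names[0]) for _ in host_names)


def recheck_fixed(host_names, caa_allows):
    # Intended behavior: every host name in the certificate is rechecked.
    return all(caa_allows(name) for name in host_names)


# Hypothetical CAA state: issuance is allowed for a.example only.
policy = {"a.example": True, "b.example": False}
names = ["a.example", "b.example"]

flawed_result = recheck_flawed(names, policy.get)  # issuance goes ahead
fixed_result = recheck_fixed(names, policy.get)    # issuance is refused
```

In this example the flawed recheck lets the certificate through even though the CAA state of b.example forbids issuance, which is the kind of misissuance described above.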

According to the Baseline Requirements, a CA must promptly revoke certificates that have been misissued. When it announced the bug, Let’s Encrypt therefore also announced a plan to revoke around three million affected certificates on the following day, and asked affected users to renew their certificates.

Unsurprisingly, not all affected certificates were renewed in time, and because of the impact that revoking such a large number of certificates would have had, Let’s Encrypt changed its plans. It revoked only certificates that had already been replaced, plus a small number of high-risk certificates. These high-risk certificates were for domains whose current CAA records disallowed certificate issuance by Let’s Encrypt.

Although this was a violation of the Baseline Requirements, there is some precedent for tolerating such violations in order to avoid large disruptions. Mozilla has some notes in its wiki about exceptional circumstances that may justify delaying a revocation. In a previous incident related to a lack of randomness in certificate serial numbers, many certificate authorities didn’t revoke certificates promptly.

Certificates issued by Let’s Encrypt have a lifetime of 90 days, so by late May all certificates affected by this incident will have expired.

There will probably be no premature revocation of certificates still in use, but renewing affected certificates is still recommended. Let’s Encrypt provides an online check and a list of serial numbers affected. A shell script to check hosts quickly based on that list is provided by the author of this newsletter.
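Checking a certificate against such a list comes down to comparing serial numbers in a normalized form. The following Python sketch shows one way to do that. It is not the shell script mentioned above, and it assumes a hypothetical input with one hexadecimal serial number per line, which may differ from the format of the published list.

```python
def normalize_serial(serial):
    """Lower-case a hex serial and drop colons, spaces, and leading zeros."""
    cleaned = serial.replace(":", "").replace(" ", "").lower()
    return cleaned.lstrip("0") or "0"


def load_affected_serials(lines):
    # Assumption: one hexadecimal serial number per line; blank lines
    # are skipped. Any iterable of strings works, e.g. an open file.
    return {normalize_serial(line.strip()) for line in lines if line.strip()}


def is_affected(cert_serial, affected):
    # Compare in normalized form so that "03:AB:CD" and "3abcd" match.
    return normalize_serial(cert_serial) in affected
```

The serial number of a certificate can be obtained with common TLS tools and then passed to is_affected together with the set returned by load_affected_serials.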

A follow-up discussion of this situation centered on whether such incidents could be handled better through automation. Let’s Encrypt uses the ACME protocol for automatic certificate issuance, but there is currently no way for a certificate authority to ask a client to renew a certificate. A thread in the Let’s Encrypt forum discusses such a mechanism, which could be added to the ACME protocol. ACME clients could also check the revocation status of their certificates via OCSP and renew any that have been revoked. However, that might still cause some downtime, although this could be mitigated with OCSP stapling.

Short news