Remove the pain points when managing ACME certificates #117
Comments
I agree, it's a mess. The whole PKI realm concept was created to give services a standardized way to access X.509 certificates, with support for multiple Certificate Authorities as one of its ideas. ACME support was kinda-sorta bolted on there; when it works, it works, but getting it to work can be a pain sometimes.
I think that the whole concept of a PKI realm should be moved out of an Ansible role into its own separate project. Python comes to mind first, but maybe Go would be easier to handle? I'm not sure yet. Having the PKI realm as a separate project could help its development - parts of it could be split into separate plugins, one of which would be support for ACME.

@drybjed I'm wondering if making a generic pluggable system is the right thing to do. I would much prefer a system where Let's Encrypt is the first class citizen and everything else comes after. I'm afraid that, while laudable, that approach would penalize the most common setup in favor of a flexibility that only a tiny minority of users really need. Just a thought, I'm not against it, but I think the LE setup should be an unstoppable tank that Just Works(TM) every time.

@bfabio What about internal networks? Let's Encrypt certificates work well on frontend hosts, webservers, public stuff, but setting up an internal network with LE is infeasible. Not all hosts are reachable from the public Internet, but they still require encrypted communication. Also keep in mind that the Let's Encrypt CA has rate limits - I destroy and create hosts sometimes multiple times a day, and relying only on LE certificates I would hit its rate limits within hours.
During DebOps development I don't use Let's Encrypt certificates at all, because all my hosts are on a private network behind NAT, but I still require X.509 certificates to work correctly, so that when the roles are deployed in a production, public environment, they work in the same way.
It seems that Let's Encrypt became an instant hit in the webdev/HTTP community. That's great, but what about other services? SMTP, IMAP, LDAP, MQTT, AMQP, are they not first class citizens? Should they just stick to self-signed certs set up by hand?
For me all different CA models supported by […]
From the point of view of an application that uses X.509 certificates, there's no difference between internal DebOps CA certificates, Let's Encrypt certificates, or any other CA certificates. Currently […]
"Unstoppable tank that 'Just Works(TM)'" - as long as the minimum requirements are met ( […]
@bfabio, you posted a todo list for the changes you would want to see to make the ACME/Let's Encrypt support better. That's great! I'm currently working on updates to the DebOps mail stack - Postfix, OpenDKIM, SPF, OpenDMARC, perhaps rspamd a bit later. If you want to help with […]

To me the main thing that makes the ACME configuration unnecessarily complicated is the choice of default Subject/CN (Common Name) and SANs (Subject Alternative Names). The domain name is used as the CN; imho it should be the host's FQDN. That is the only thing that can with reasonable certainty be assumed to point to the host. Using the domain name as the CN falls apart as soon as there is more than one host in the domain. To make Let's Encrypt work the way I expect, I usually put the following into pki_realms:
  - name: '{{ ansible_fqdn }}'
    acme_default_subdomains: [ ]
    acme: True
    acme_ca: 'le-staging'
With this, it "just works" out-of-the-box, no matter if it is a one-server domain or a bigger cluster. I wouldn't actually even be scared to use le-live right away, but better safe than sorry. So once this looks good, I can set […]
Also, I tried to set up Let's Encrypt within the 'domain' realm for quite a while, which I eventually realised is just not gonna work out. I think the documentation could be a bit clearer about having to use a dedicated realm for Let's Encrypt. On the other hand: if the ACME integration worked as explained above (use the FQDN as CN and don't assume any subdomains), one could just configure the realm 'domain' to use ACME and all would work out of the box. I'm not completely sure though what other ramifications this would have, as it would effectively kill the internal CA iiuc.
tl;dr: Changing the default values for CN and SAN should make ACME certificates more straightforward to use. No clue about any potentially associated gremlins though.

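The naming default proposed in this comment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `proposed_acme_names` and its `extra_names` parameter are hypothetical, and nothing here reflects how the PKI role computes names today.

```python
def proposed_acme_names(fqdn, extra_names=()):
    """Return (common_name, subject_alt_names) for a certificate request.

    Hypothetical helper: the CN is the host's FQDN, and no additional
    names are assumed unless the caller lists them explicitly.
    """
    fqdn = fqdn.rstrip(".").lower()
    if "." not in fqdn:
        raise ValueError(f"not a fully qualified name: {fqdn!r}")

    # The FQDN is always the first SAN; explicit extras follow, deduplicated.
    sans = [fqdn]
    for name in extra_names:
        name = name.rstrip(".").lower()
        if name not in sans:
            sans.append(name)
    return fqdn, sans
```

With no extra names the request covers exactly one name, the host itself, which is the "more than one host in the domain" case the comment is concerned about.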
At some point I noticed that the choice of adding arbitrary subdomains to ACME certificates by default, namely […]
pki_realms:
  - name: '{{ ansible_fqdn }}'
By default, if […] The […] And it's best if you don't mess with the […] Actually, […]
So, the use case you want should already be implemented. Of course for this you need to specifically enable the '{{ ansible_fqdn }}' PKI realm, but due to the various rate limits of Let's Encrypt, and other factors mentioned earlier, I don't think that ACME support like this can be enabled by default. Maaaybe, with some more specific logic that enables the FQDN-based PKI in specific situations.

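One possible shape for the "more specific logic" mentioned above is a check for whether a host has any globally routable address before an ACME realm is even considered. The sketch below is an assumption, not existing behaviour: the function name `acme_realm_feasible` is hypothetical, and a public address alone does not prove the host is reachable for ACME challenges, nor does it say anything about rate limits.

```python
import ipaddress


def acme_realm_feasible(host_addresses):
    """Return True if at least one address is globally routable.

    Hypothetical pre-check: hosts with only private or otherwise
    non-global addresses (for example behind NAT) are skipped.
    """
    for raw in host_addresses:
        try:
            addr = ipaddress.ip_address(raw)
        except ValueError:
            # Ignore values that are not valid IP addresses.
            continue
        if addr.is_global:
            return True
    return False
```

A host with only RFC 1918 addresses would fail this check and stay on the internal CA, which matches the private-network development setup described earlier in the thread.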
Thank you for your work on this amazing project. I am currently trying to get it to work for me; however, PKI is a huge pain point (at least for me). Maybe redeveloping the PKI in Go is not even necessary, since there is already something like that: https://github.com/smallstep/certificates Maybe we can get this integrated into DebOps?

@prk0ghy I support this. We should not implement our own certificate management again. I would say it was a solid way to learn how PKI works, both for @drybjed, who implemented it, and for me, who spent one month reviewing it. Now that we do understand it, we can compare other solutions better. https://awesomeopensource.com/projects/certificate-authority seems to be a good list.

Looking at my 2017 comment from 2021 brings a totally new perspective to this issue. :-) The problem with the current PKI implementation is that it is "lopsided" and depends entirely on the remote hosts. The environment we can work with on the Ansible Controller is limited, so I did what I could back then and just relied on the remote hosts to provide initial information about the domain(s) we work with, what the CA certificate should include, etc. Today, while working on re-implementing the […]
One problem in this is finding a way to have internal CA management without the […]

"Don’t roll your own crypto". There is still #106. There has to be an existing tool we can use.

Would it be useful to compile a list of features the new PKI should have? I think it would be easier to implement if we know exactly what it should be able to do.

Here are some things I would like to address from the current […]

@ypid https://github.com/NLnetLabs/krill

Maybe it's just me, but the whole system seems really brittle and every time it breaks it makes me wish I could just run certbot and be done with it. /usr/local/lib/pki/pki-realm is a ~2500-line bash script and it's a pain to debug when something goes wrong.
This could be an umbrella bug to improve the whole experience. I think the main points to tackle are:
- […] pki-realm existence and the arguments it takes, also implement a --help switch
- […] pki-realm tells the user what it's doing, without running in background.
- […] error.log, but send it to the sysadmin by mail as well
- […] error.log presence.
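Two of the points above, a --help switch and mailing the contents of error.log, can be sketched as follows. Everything here is an assumption for illustration: pki-realm itself is a bash script, the program name `pki-realm-report`, the location of error.log inside a realm directory, and the default recipient are all hypothetical, and the message is only composed, not sent.

```python
import argparse
from email.message import EmailMessage
from pathlib import Path


def build_parser():
    """Command-line parser; argparse provides --help automatically."""
    parser = argparse.ArgumentParser(
        prog="pki-realm-report",  # hypothetical name
        description="Report errors recorded for a PKI realm.",
    )
    parser.add_argument("realm_dir", help="directory of the PKI realm")
    parser.add_argument("--mail-to", default="root@localhost",
                        help="recipient of the error report")
    return parser


def error_report(realm_dir, recipient="root@localhost"):
    """Return an EmailMessage for a non-empty error.log, else None.

    Assumes (hypothetically) that error.log sits directly in realm_dir.
    """
    log = Path(realm_dir) / "error.log"
    if not log.is_file():
        return None
    text = log.read_text(errors="replace").strip()
    if not text:
        return None

    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = f"PKI realm error in {realm_dir}"
    msg.set_content(text)
    return msg
```

Returning the message instead of sending it keeps the sketch side-effect free; a real tool would hand the result to whatever mail transport the host already uses.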