Storing sensitive values in state files #516

Open
seanherron opened this issue Oct 28, 2014 · 246 comments

Comments

@seanherron
Contributor

#309 was the first change in Terraform that I could find that moved to store sensitive values in state files, in this case the password value for Amazon RDS. This was a bit of a surprise for me, as previously I've been sharing our state files publicly. I can't do that now, and feel pretty nervous about the idea of storing state files in version control at all (and definitely can't put them on github or anything).

If Terraform is going to store secrets, then some sort of field-level encryption should be built in as well. In the meantime, I'm going to change things around to use https://github.com/AGWA/git-crypt on sensitive files in my repos.

@bitglue

bitglue commented Jan 28, 2015

See #874. I changed the RDS provider to store an SHA1 hash of the password.

That said, I'm not sure I'd agree that it's Terraform's responsibility to protect data in the state file. Things other than passwords can be sensitive: for example if I had a security group restricting SSH access to a particular set of hosts, I wouldn't want the world to know which IP they need to spoof to gain access. The state file can be protected orthogonally: you can not put it on github, you can put it in a private repo, you can use git-crypt, etc.
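
For illustration only (this is not the code from #874), keeping a digest in state instead of the password could look roughly like the sketch below. `password_digest`, `password_changed`, and the dict-shaped `state` are hypothetical names.

```python
import hashlib
import hmac


def password_digest(password: str) -> str:
    """Return the hex SHA-1 digest that would be stored instead of the password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def password_changed(stored_digest: str, configured_password: str) -> bool:
    """Report whether the configured password differs from the digest in state."""
    current = password_digest(configured_password)
    return not hmac.compare_digest(stored_digest, current)


# Hypothetical state entry: only the digest is persisted, never the password.
state = {"password": password_digest("s3cret")}
print(password_changed(state["password"], "s3cret"))   # False
print(password_changed(state["password"], "rotated"))  # True
```

An unsalted SHA-1 digest of a weak password can still be brute-forced, so this limits exposure rather than removing it.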

@kubek2k
Contributor

kubek2k commented Jan 28, 2015

related #689

@dentarg

dentarg commented Mar 17, 2015

Just want to give my opinion on this topic.

I do think Terraform should address this issue. I think it will increase the usefulness and ease of use of Terraform.

Some examples from other projects: Ansible has vaults, and on Travis CI you can encrypt information in the .travis.yml file.

@ketzacoatl
Contributor

Ansible vaults is a feature I often want in other devops tools. Protecting these details is not as easy as protecting the state file. What about using Consul or Atlas as a remote/backend store?

+1 on this

@dayer4b
Contributor

dayer4b commented May 28, 2015

I just want to point out that, according to official documentation, storing the state file in version control is a best practice:

https://www.terraform.io/intro/getting-started/build.html

Terraform also puts some state into the terraform.tfstate file by default. This state file is extremely important; it maps various resource metadata to actual resource IDs so that Terraform knows what it is managing. This file must be saved and distributed to anyone who might run Terraform. We recommend simply putting it into version control, since it generally isn't too large.

(emphasis added)

Which means we really shouldn't have to worry about secrets popping up in there...

@hobbeswalsh

👍 on this idea -- it would be enough for our case to allow configuration of server-side encryption for S3 buckets. Any thoughts on implementing that?

@apparentlymart
Contributor

At the risk of adding scope to this discussion, I think another way to think of this is that Terraform's current architecture is based on a faulty assumption: Terraform assumes that all provider configuration is sensitive and that all resource configuration isn't sensitive. That is wrong in both directions:

  • Several resources now take passwords as inputs or produce secret values as outputs. In this issue we see the RDS password as one example. The potential Vault provider discussed in #2221 is another example.
  • Several provider arguments are explicitly not sensitive, such as the AWS region name, and excluding them from the Terraform state results in Terraform having an incomplete picture of the world: it can see that there is an EC2 instance with the id i-12345 but it can't see what region that instance is in without help of the configuration. Changing the region on the AWS provider causes Terraform to lose track of all of the existing resources, because as far as the AWS provider is concerned they've all been apparently deleted.

So all of this is to say that I think overloading the provider/resource separation as a secret/non-secret separation is not the best design. Instead, it'd be nice to have a mechanism on both sides to distinguish between things that should live in the state and things that should not, so that e.g. generated secrets can be passed into provisioners but not retained in the state, and that the state can encode that a particular instance belongs to a particular AWS region and respond in a better way when the region changes.

There are of course a number of tricky cases in making this distinction, which I'd love to explore some more. Here are some to start:

  • If you don't retain something in the state then it's not safe to interpolate it anywhere because future runs will assume they can interpolate attributes from existing resources in the state.
  • Some provider config changes effectively switch all resources to an entirely new "namespace", and thus effectively force every attached resource to be destroyed and recreated in the new region. The AWS region is one example, since AWS resources are region-specific. But that's not so simple for other arguments: the AWS access_key might change what Terraform has permission to interact with, but it doesn't change the id namespace that resources live within.
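
One way to picture the separation proposed in this comment is a per-attribute flag that is independent of whether the attribute belongs to a provider or a resource. This is only a sketch under that assumption: `AttributeSchema`, `persist`, and `split_for_state` are made-up names, not Terraform APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeSchema:
    name: str
    persist: bool  # hypothetical flag: should this value be written to state?


def split_for_state(schema, values):
    """Return (persisted, transient) dicts according to each attribute's flag."""
    persisted, transient = {}, {}
    for attr in schema:
        if attr.name not in values:
            continue
        target = persisted if attr.persist else transient
        target[attr.name] = values[attr.name]
    return persisted, transient


# The provider-level region is kept; the resource-level password is not.
schema = [
    AttributeSchema("region", persist=True),
    AttributeSchema("access_key", persist=False),
    AttributeSchema("instance_id", persist=True),
    AttributeSchema("password", persist=False),
]
values = {
    "region": "us-east-1",
    "access_key": "AKIAEXAMPLE",
    "instance_id": "i-12345",
    "password": "s3cret",
}
persisted, transient = split_for_state(schema, values)
print(persisted)          # {'region': 'us-east-1', 'instance_id': 'i-12345'}
print(sorted(transient))  # ['access_key', 'password']
```

The transient values would be usable during the run (for example by provisioners), but, per the first caveat above, nothing in a later run could interpolate them from state.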

@little-arhat

Hi, any progress on that? Terraform 0.6.3 still stores raw passwords in the state file. Also, as a related issue, if you do not want to keep passwords in configuration, you can create a variable without a default value. But this will force you to pass this variable every time you run plan/apply, even if you're not going to change the resource that has this password.

I think it would be nice to separate sensitive stuff from other attributes, so that a sensitive attribute will:

  • be stored as a SHA-1 hash or something similar in the state file
  • not require a value if it already has one.

So, for configuration like:

variable db {
    password {}
}

resource ... {
    password = "${var.db.password}"
}

Terraform will require the variable on the first run, when it doesn't have anything in state, but will not require it on subsequent runs.

To change such a value, one needs to provide a different value for the password.
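
The behaviour proposed here (required on the first run, optional afterwards, changed only when a different value is supplied) can be sketched as a small decision function. This is a hypothetical illustration, not existing Terraform behaviour; `plan_secret` and the action names are invented for the example.

```python
import hashlib
from typing import Optional


def digest(value: str) -> str:
    """Hex SHA-1 digest, standing in for what the state file would hold."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()


def plan_secret(stored_digest: Optional[str], supplied: Optional[str]) -> str:
    """Decide what to do with a sensitive attribute on this run."""
    if stored_digest is None:
        # First run: nothing in state yet, so the value is mandatory.
        if supplied is None:
            raise ValueError("password is required on the first run")
        return "create"
    # Later runs: an omitted or unchanged value leaves the resource alone.
    if supplied is None or digest(supplied) == stored_digest:
        return "keep"
    return "update"


print(plan_secret(None, "s3cret"))               # create
print(plan_secret(digest("s3cret"), None))       # keep
print(plan_secret(digest("s3cret"), "s3cret"))   # keep
print(plan_secret(digest("s3cret"), "rotated"))  # update
```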

@EvanKrall
Contributor

Maybe there's a simple solution: store the state in Vault?

@mwarkentin
Contributor

A good solution for this would be useful for us as well - we're manually configuring certain things to keep them out of the tfstate file in the meantime.

@ascendantlogic

So as I slowly cobble together another clean-sheet infra with Terraform I see this problem still exists, and this issue is almost exactly 1 year old. What is the thinking in regards to solving this? The ability to mark specific attributes within a resource as sensitive and storing SHA1 or SHA2 hashes of their values in the state for comparison? I see this comment on a related ticket; does that mean that using Vault will be the prescribed way? I get that it promotes product synergy but I'd really like a quick-n-dirty hashing solution as a fallback option if I'm honest.

@ketzacoatl
Contributor

Moving secrets to Vault, and using consul-template or integration with other custom solutions you have for CM certainly helps for a lot of cases, but completely keeping secrets out of TF, or out of TF state, is not always reasonable.

@ascendantlogic

Sure, in this particular case I don't want to manually manage RDS but I don't want the PW in the state in cleartext, regardless of where I keep it. I'm sure this is a somewhat common issue. Maybe an overarching ability to mark arbitrary attributes as sensitive is shooting for the moon but a good start would be anything that is obviously sensitive, such as passwords.

@jfuechsl

Would it be feasible to open up state handling to plugins?
The standard could be to store it in files, like it is currently done.
Other options could be Vault, S3, Atlas, etc.

That way this issue can be dealt with appropriately based on the use-case.

@brikis98
Contributor

I just got tripped up by this as well, as the docs explicitly tell you to store .tfstate files in version control, which is problematic if passwords and other secrets end up in the .tfstate files. At the bare minimum, the docs should be updated with a massive warning about this. Beyond that, there seem to be a few options:

  1. Offer some way to mark variables as secret and either ensure they never get stored in .tfstate files or store them in a hashed form.
  2. Encrypt the entire .tfstate file.
  3. Remove the recommendation to store .tfstate files in version control and only recommend them to be stored in secure, preferably encrypted storage.

@ejoubaud

ejoubaud commented Jan 5, 2016

One thing to consider around this is output. When you create a resource with secrets (key pair, access keys, db password, etc.), you likely want to show the secret in question at least once (possibly in the stdout of the first run, as outputs do).

Currently outputs are also stored in plain text in the .tfstate, and can be retrieved later with terraform output.

One possible solution would be a mechanism to only show the secrets once, then not store them at all and not show them again (like AWS does), possibly using only-once output as I suggested in #4437
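
A rough sketch of the "show once, never store" idea, assuming a hypothetical `once` flag on an output (no such flag exists in Terraform; `apply_outputs` is likewise invented for illustration):

```python
def apply_outputs(outputs):
    """Return (lines_to_print, outputs_to_persist).

    Each output is a dict with a "value" and an optional, hypothetical
    "once" flag. Every output is printed on this run, but outputs flagged
    "once" are left out of what gets written to state.
    """
    lines, persisted = [], {}
    for name, out in outputs.items():
        lines.append(f"{name} = {out['value']}")
        if not out.get("once", False):
            persisted[name] = {"value": out["value"]}
    return lines, persisted


outputs = {
    "endpoint": {"value": "db.example.internal"},
    "db_password": {"value": "s3cret", "once": True},
}
lines, persisted = apply_outputs(outputs)
print("\n".join(lines))
print(persisted)  # {'endpoint': {'value': 'db.example.internal'}}
```

Because the flagged output never reaches the state, a later terraform output would have nothing to show for it.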

@pauldraper

pauldraper commented Jul 28, 2023

I'm thinking it should be possible to create a filtering proxy for the remote/cloud backend protocol, and whenever it sees a state file just replace all sensitive values with null before forwarding it

Yes, this (plus ignore_changes) is exactly the workaround I suggested: #516 (comment)

Also:

(Unfortunately, Terraform backends are not very customizable, so a workaround today is moderately difficult, as it requires creating an http server backend to strip the secrets.)

See #33007 for alternative backend.
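
As a sketch of what the filtering step in such a proxy might do, the function below nulls out sensitive values in a state document before it is forwarded. It assumes a version 4 style state layout in which outputs carry a `sensitive` flag and each resource instance lists `sensitive_attributes` as paths of `get_attr` steps; treat that layout, and the name `strip_sensitive`, as assumptions rather than a documented contract.

```python
import copy
import json


def strip_sensitive(state: dict) -> dict:
    """Return a copy of `state` with sensitive values replaced by None (JSON null)."""
    cleaned = copy.deepcopy(state)
    for output in cleaned.get("outputs", {}).values():
        if output.get("sensitive"):
            output["value"] = None
    for resource in cleaned.get("resources", []):
        for instance in resource.get("instances", []):
            attrs = instance.get("attributes", {})
            for path in instance.get("sensitive_attributes", []):
                # Only single-step attribute paths are handled in this sketch.
                if len(path) == 1 and path[0].get("type") == "get_attr":
                    name = path[0].get("value")
                    if name in attrs:
                        attrs[name] = None
    return cleaned


state = {
    "version": 4,
    "outputs": {
        "db_password": {"value": "s3cret", "type": "string", "sensitive": True},
    },
    "resources": [{
        "type": "aws_db_instance",
        "name": "main",
        "instances": [{
            "attributes": {"username": "admin", "password": "s3cret"},
            "sensitive_attributes": [[{"type": "get_attr", "value": "password"}]],
        }],
    }],
}
attrs = strip_sensitive(state)["resources"][0]["instances"][0]["attributes"]
print(json.dumps(attrs))  # {"username": "admin", "password": null}
```

The nulled attributes would then differ from the configuration on the next plan, which is why the workaround pairs this with ignore_changes.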

@RobvH

RobvH commented Aug 21, 2023

I am of the opinion that the provider should never read the values of SecureStrings from aws_ssm_parameter. I believe strongly that you should not be managing your secrets in TFC - because, while Hashicorp is super cool, everyone should be considered, by default, unqualified to hold your org's secrets until they have proven that a substantial security effort exists in their org. We keep them in AWS to provide that isolation.

This resource and data provider should never read the value. Even if using TFC - this allows anyone with access to TFC to add aws_ssm_parameters and import blocks for every secret they want, run an apply, then switch their backend to local and read them in their local state file.

Managing the existence of a parameter in IaC makes sense - other resources can depend on the namespace and name w/o relying on magical strings; but managing the value should, IMHO, always be impossible.

@mogan1

mogan1 commented Oct 31, 2023

Sensitive secrets should be masked in the state file if they cannot be removed completely.

@archvalmiki

Looks like the parallel open source project OpenTofu is working on this.

Moving sensitive values to outside the state (e.g. key mgmt system) would require deep architectural changes. A more feasible, near-term fix is state encryption: opentofu/opentofu#874

I understand that this still doesn't solve the problem that teams setting up infra would still have access to the state file with app level secrets. Saving secret hashes instead of the secrets themselves... seems to be difficult to implement (very likely because it requires architectural changes in the code of all providers): opentofu/opentofu#801 (comment)

@Jasper-Ben

A more feasible, near-term fix is state encryption

@archvalmiki depending on your remote setup this is already possible today. E.g. when using s3 Backend, you can use KMS for encrypting the s3 bucket containing your state. Or if you use kubernetes secret backend you can configure kubernetes to use a key management system to encrypt its secrets. Just to name a few options.

@archvalmiki

A more feasible, near-term fix is state encryption

@archvalmiki depending on your remote setup this is already possible today. E.g. when using s3 Backend, you can use KMS for encrypting the s3 bucket containing your state. Or if you use kubernetes secret backend you can configure kubernetes to use a key management system to encrypt its secrets. Just to name a few options.

The RFC introduces encryption directly within Terraform for state files, allowing more granular control over encryption, such as partial encryption of sensitive values and full encryption of state and plan files.

@Jasper-Ben

Jasper-Ben commented Feb 29, 2024

The RFC introduces encryption directly within Terraform for state files, allowing more granular control over encryption, such as partial encryption of sensitive values and full encryption of state and plan files.

Got that :) just sharing this information for those who are looking for a (somewhat) workable solution right now. It probably has been said somewhere in this thread before, but it's quite long by now so... 😅

@Tbohunek

Tbohunek commented Mar 2, 2024

A more feasible, near-term fix is state encryption

@archvalmiki depending on your remote setup this is already possible today. E.g. when using s3 Backend, you can use KMS for encrypting the s3 bucket containing your state. Or if you use kubernetes secret backend you can configure kubernetes to use a key management system to encrypt its secrets. Just to name a few options.

Pls note that S3 bucket encryption (and anything similar) is only at-rest. If you authenticate yourself to S3 or to Terraform to view the state, you get the value. The only solution is if the value is not in state, and fetched at runtime temporarily from actual secret stores...for TF to compare it with reality and forget it afterwards.
Still no response from Hashicorp on this one. 😞

@mr-miles

mr-miles commented Mar 2, 2024

The only solution is if the value is not in state, and fetched at runtime temporarily from actual secret stores...for TF to compare it with reality and forget it afterwards.

@Tbohunek - I don't think this is the only solution. The backend is well-placed to encrypt/decrypt sensitive values within the state at the point of serialization/deserialization (which is agnostic to the physical backend), so that providers can access the raw values as normal. If this were done with a key fetched at runtime from an actual secret store, then you have a statefile that is portable but doesn't leak sensitive values to those not authorised to see them.

I tried a simple POC here - mr-miles@3e466c1 - which turns values marked as sensitive into "REDACTED" when storing in state. It doesn't look like the changes are even that extensive.

@Tbohunek

Tbohunek commented Mar 3, 2024

@mr-miles can you explain what you use for encryption? I mean, which "key/password" and where that is stored?
I guess your POC goes beyond my code understanding, but you need some encryption key and that is stored somewhere..and when you can access it, again, boom.

By fetching secrets at runtime from AWS KMS or similar, you eliminate all possible ways of leaking the secret. You only give your TF identity access to the KMS, and TF applies it to where it should be and forgets it until next time.. No one else needs to have actual access to the secret value, and it never ever has to be known. If need be, IAM access to it can still be added in those rare cases, typically debug.

@mr-miles

mr-miles commented Mar 3, 2024

Hi @Tbohunek - thanks for the questions.

What I meant was, it is possible to store one encryption key/password in (say) AWS KMS for sensitive values, and permission the TF identity to retrieve it at runtime (as you describe), but store the encrypted sensitive values in the statefile. That way the actual values are still useless to anyone who hasn't got access to the key, but users don't have to coordinate Terraform with an independent secret store, or encrypt the statefile in its entirety.

I think there are two other advantages:

  • There are quite a lot of debugging tasks which end-users can carry out themselves, without requiring access to the encryption key
  • It doesn't need deep architectural changes since it can be implemented independently of all the providers or indeed the various backends, so it's straightforward to achieve

@Tbohunek

Tbohunek commented Mar 4, 2024

Right @mr-miles, so an explicit encryption key from KMS. This works too.
But what if I need to fetch KMS-stored secrets with data?
I guess it could encrypt it too, but this means I need to configure something, which opens up the door to "I forgot". -> Need something that works by default and is as seamless as possible.

@mr-miles

Do you have an example? I'm just wondering if there are data-fetches where the field you'd request is not marked as sensitive already. For example, the aws_kms_secrets data source returns plaintext values but they are flagged as sensitive so would be spotted and encrypted automatically.

There is a "sensitive" function to flag arbitrary values too, but AFAICS it looks like the need for the user to mark values as sensitive (and possibly forget, like you say) is vanishingly small. What do you think?

@rwblokzijl

Made a proposal that could remove many, if not all, secrets from the state: #34860. Current workarounds listed also.

@Tbohunek

@mr-miles Indeed the field could be marked as sensitive, but it can only be encrypted if encryption is configured. There's no possible encryption by default as anyone can run their Terraform whenever and wherever they can.

This is why I seek a way to not store the secrets at all in state. Value is stored in the Vault, and if any resource/provider needs secret input, Terraform could fetch it from the Vault on the fly, both plan and apply phases.

@TryTryAgain

It should be incorporated into Vault by default, maybe one day.

@dmurvihill

@omarismail are we getting close to an actionable design yet?

It has been seven years since @sarneaud first proposed a lifecycle parameter along the lines of ignore_changes and five years since his pull request was closed "for a larger design conversation". Well, the design conversation has gotten larger and larger and larger (up to almost 300 comments now) and we have yet to see someone come forward with a technical problem that would block implementation. Is it time to do this yet?

@dancorne

are we getting close to an actionable design yet?

I'm not from Hashicorp so could be wrong, but I think this is expected to be resolved with "ephemeral values" which is currently in v1.10-alpha. However the example at the start of this issue (a password attribute in a resource) would need "write-only attributes" which the PR notes will not be ready for experimentation due to changes needed in the provider protocol.
