Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load ConfigFile from Environment Variables #148

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Load ConfigFile from Environment Variables #148

wants to merge 1 commit into from

Conversation

oxzi
Copy link
Member

@oxzi oxzi commented Jan 9, 2024

Next to populating ConfigFile through default values and by a YAML configuration, environment variable support was added.

The code works by reflecting on the existing nested ConfigFile struct. The environment variable key is an underscore separated string of uppercase struct fields.

@oxzi oxzi requested a review from julianbrost January 9, 2024 17:10
@cla-bot cla-bot bot added the cla/signed CLA is signed by all contributors of a PR label Jan 9, 2024
@oxzi
Copy link
Member Author

oxzi commented Jan 9, 2024

This PR's motivation is to address this comment #144 (comment) which might be a precondition for the integration tests.

@julianbrost
Copy link
Collaborator

The code works by reflecting on the existing nested ConfigFile struct.

If it already uses reflection on the config struct, why did you choose to generate a YAML file which is then later parsed into that struct rather than setting the values using reflection?

Also, it would be great if the code would be reusable so that Icinga DB could be adapted to use this as well in the future. If I'm not missing something, this would only require making the environment variable prefix and destination type function parameters.

@oxzi
Copy link
Member Author

oxzi commented Jan 11, 2024

The code works by reflecting on the existing nested ConfigFile struct.

If it already uses reflection on the config struct, why did you choose to generate a YAML file which is then later parsed into that struct rather than setting the values using reflection?

Because I wanted to achieve the exact same behavior as by passing a YAML configuration.

This starts with detecting the right native type for each field and trying to convert the environment variable's value - which is at first always a string - to it and ends at custom parsing for everything from different *Unmarshaler interfaces to, e.g., time.Duration1. Thus, I had to re-write or import lots of already existing code to mimic a library's behavior which is already included.

With this architecture, a small amount of code translates environment variables to the already known configuration format.

Also, it would be great if the code would be reusable so that Icinga DB could be adapted to use this as well in the future. If I'm not missing something, this would only require making the environment variable prefix and destination type function parameters.

Yes, the current code only requires that yaml struct tags are available, which is afaik also the case for Icinga DB. Thus, when making the environment variable prefix configurable, it should be portable.

Footnotes

  1. https://github.com/goccy/go-yaml/blob/master/decode.go#L1099-L1125

@julianbrost
Copy link
Collaborator

The basic problem here is taking a string and putting it into any (representing some scalar type), isn't it? Unfortunately, TextUnmarshaler isn't implemented universally, but if it was, I'd say that would be the way to go.

Your code already contains some special handling for integers (probably inspired by the Icinga DB container entrypoint):

// Encode numeric values as integers as, otherwise, YAML cannot decode those as integers back.
if i, err := strconv.ParseInt(match[2], 10, 64); err == nil {
partCfg[fieldName] = i
} else {
partCfg[fieldName] = match[2]
}

Which is basically guessing based on the value, if the destination type would be taken into account, the code could make a better informed decision about which conversion would be needed.

Icinga DB already happens to need and have a string to any function function anyways, which will make it's way to icinga-go-library, could be exported and extended with the missing few types and also be used here.

Note: I couldn't test if the current code would handle bool correctly, the only ones available in the config are somewhere that uses yaml:",inline" which isn't handled by the code currently and also, the environment variable names are derived from the struct member names, not the yaml tags. Currently, the enable TLS option is (attempted to be) used from ICINGA_NOTIFICATIONS_DATABASE_TLSOPTIONS_ENABLE=true whereas in the config file YAML it's just database.tls. Also, the generated YAML doesn't work in that situation:

database:
  ",inline":
    tls: "true"

@oxzi
Copy link
Member Author

oxzi commented Jan 12, 2024

The basic problem here is taking a string and putting it into any (representing some scalar type), isn't it? Unfortunately, TextUnmarshaler isn't implemented universally, but if it was, I'd say that would be the way to go.

Exactly.

Unfortunately, even some (more or less) scalar types from the standard library do not implement the TextUnmarshaler, e.g., time.Time or time.Duration. As it seems, those two are not going to implement it - golang/go#16039.

At this point, I would like to link again to go-yaml's additional type check:
https://github.com/goccy/go-yaml/blob/0640a158e99023787bc503fd4476e472d804c3a3/decode.go#L674-L692

Furthermore, there are all possible compound data types (structs, arrays and slices, maps) and interfaces.

Your code already contains some special handling for integers (probably inspired by the Icinga DB container entrypoint):

// Encode numeric values as integers as, otherwise, YAML cannot decode those as integers back.
if i, err := strconv.ParseInt(match[2], 10, 64); err == nil {
partCfg[fieldName] = i
} else {
partCfg[fieldName] = match[2]
}

Which is basically guessing based on the value, if the destination type would be taken into account, the code could make a better informed decision about which conversion would be needed.

This is, honestly, just an ugly hack a very intelligent solution as unquoted text might still get parsed as a string - even when the string only consist of number literals -, but a quoted number cannot be parsed as a numeric value. For all values in the added tests, this was the only corner case I ran into (maybe because I ignored the , inline you mentioned later on).

Icinga DB already happens to need and have a string to any function function anyways, which will make it's way to icinga-go-library, could be exported and extended with the missing few types and also be used here.

As I stated earlier, this would require more testing to ensure that this code behaves like go-yaml's code. Furthermore, when there are changes to either the future icinga-go-library or go-yaml code, things might break.

IMO, when using a (partial) YAML configuration as an intermediate representation, the logical coupling is smaller as when try to mimic go-yaml's behavior, especially where they made decisions in the absence of standards.

Note: I couldn't test if the current code would handle bool correctly, the only ones available in the config are somewhere that uses yaml:",inline" which isn't handled by the code currently and also, the environment variable names are derived from the struct member names, not the yaml tags. Currently, the enable TLS option is (attempted to be) used from ICINGA_NOTIFICATIONS_DATABASE_TLSOPTIONS_ENABLE=true whereas in the config file YAML it's just database.tls. Also, the generated YAML doesn't work in that situation:

database:
  ",inline":
    tls: "true"

The field names were derived from the Go fields as, by naming convention, Go does not use underscores as separators but camel case. I was expecting to find underscores in the yaml struct fields as separators, but, as I just tried to verified that, there are none, but dashes.
My assumption was that an underscore in a yaml struct tag would break the environment variable separator. As long as none are going to appear, this should work. (Let's try to not mention the sword of Damocles hanging over this "parser".)

I'll try to generalize the code as mentioned in #148 (comment), make some tests for other types, e.g., booleans, derive the names from the yaml struct tag with an additional safety check, and try to address the ,inline possibility.

Next to populating ConfigFile through default values and by a YAML
configuration, environment variable support was added.

The code works by reflecting on the existing nested ConfigFile struct.
The environment variable key is an underscore separated string of
uppercase struct fields.
@oxzi
Copy link
Member Author

oxzi commented Jan 15, 2024

After multiple iterations, both the code as well as the logic has now changed quite a bit.

I reflected on my prior statement that creating a temporary YAML configuration would be a good idea and now used the available runtime type information to decode each value into a given reference to the struct directly. Furthermore, I added rudimentary support for ,inlined fields and created tests both on a green field as well as for the Database.TlsOptions.

Please feel free to take another look, @julianbrost.

@lippserd
Copy link
Member

Have you considered existing libraries such as https://github.com/caarlos0/env?

@oxzi
Copy link
Member Author

oxzi commented Jan 18, 2024

Have you considered existing libraries such as https://github.com/caarlos0/env?

Yes. IIRC, I have even tried this library or at least some other Go library for this purpose.

The problem is, however, it requires a fixed env struct tag on all fields. Since this also affects fields of types within Icinga DB, that I don't have code write access to, this turns out to be difficult. On first glance, there is no built-in mechanism to handle missing struct tags.

Furthermore, I aimed to get both the same names as well as the same decoding behavior as with the already used YAML library. As outlined in earlier comments, there is already some highly opinionated behavior and having to mimic another library's decisions - also with future or forward compatibility - just sounds tedious.

@julianbrost
Copy link
Collaborator

The problem is, however, it requires a fixed env struct tag on all fields. Since this also affects fields of types within Icinga DB, that I don't have code write access to, this turns out to be difficult.

That shouldn't be a general issue. This code would probably be destined for icinga-go-library where those database config struct will also end up. (It would probably be a problem if the library wouldn't allow a custom variable prefix per project.)

On first glance, there is no built-in mechanism to handle missing struct tags.

Is that a general issue or is it just saying that because of that, it's not possible to build a workaround that wouldn't require adding tags to the structs?

Furthermore, I aimed to get both the same names as well as the same decoding behavior as with the already used YAML library. As outlined in earlier comments, there is already some highly opinionated behavior and having to mimic another library's decisions - also with future or forward compatibility - just sounds tedious.

In general, please elaborate more on which libraries you looked into and what limitations made you rule them out in particular. We might find that some can easily be worked around, otherwise we have a good picture of why implementing it on our own is the better choice.

@oxzi
Copy link
Member Author

oxzi commented Jan 25, 2024

In general, please elaborate more on which libraries you looked into and what limitations made you rule them out in particular. We might find that some can easily be worked around, otherwise we have a good picture of why implementing it on our own is the better choice.

I have tried two different libraries.

My first attempt was using https://github.com/sethvargo/go-envconfig. Unfortunately, my testing code is gone as I have not committed it.

However, with @lippserd's suggestion I achieved more, as outlined below. (But GitHub just ate my half-written comparison of what worked and what did not.)

Then, I tried https://github.com/caarlos0/env.

  • It is possible to change the checked struct tag from env to, e.g., yaml through the Options.
    Even when names like ICINGA_NOTIFICATIONS_icingaweb2-url might not be beautiful to the eye, this generally works.
    However, as there are some additional comma-separated attributes in the go-yaml struct tags which are not known to env, the parser errors with something like "env: tag option "inline" not supported". With ignoring the errors, this will still work, but ignoring parser errors might not be the best move.
  • As the previous error shows, it is not possible to directly mimic go-yaml's struct inlineing. This would result in manual code to achieve feature or naming parity.
  • Instead of using struct tags, it is possible to let the library derive names from their field names via UseFieldNameByDefault.
    This mostly works, but in some cases the toEnvName function generates unguessable names, e.g., Icingaweb2URL will be ICINGAWEB_2U_R_L.
  • When using nested structs, the additional envPrefix struct tag becomes necessary to resolve sub-struct fields. Take a look at this example.
  • Contrary to prior assumptions, unmarshalling a time.Duration worked as expected with go-yaml.
    For example, ICINGA_NOTIFICATIONS_interval=9001h correctly populated the Logging.Interval field.

In a nutshell, with some tweaks it would be possible to use the https://github.com/caarlos0/env library.

However, in the meantime since creating this PR, the premise has changed.
First, I wanted to alter only code in icinga-notifications and thus used some (more or less hacky) workarounds to achieve this. Now, with planning to move the referenced code from icingadb to the icinga-go-library anyway, we can then also populate it with correct env struct tags.

Thus, would it be possible to (re-)define the concrete scope of this PR, which was initially created to address a comment of @julianbrost, allowing to configure Icinga Notifications through environment variables for integration tests. We have moved a considerable distance away from the initial motivation, inflating the dependencies before getting anywhere near the tests.

Now, to something completely different: password and other credentials in environment variables.

I am not quite sure if this is desirable from a security perspective. First, at least the same, or a more privileged, user can access the environment through /proc (on Linux) or ps. It seems like there are (or at least were) some systems with a setuided ps, allowing any user to inspect each process' environment. Furthermore, as listed in this blog post, the environment will be passed to child processes - which we are currently creating - and is often included in crash logs.

One can argue that currently the subprocesses are having the same privileges as the daemon anyway and we have control over logs. But, however, this seems like something which might get forgotten along the path and that traditional file permissions are an easier concept.

@Al2Klimov
Copy link
Member

Now, to something completely different: password and other credentials in environment variables.

I am not quite sure if this is desirable from a security perspective. (...) the environment will be passed to child processes - which we are currently creating - and is often included in crash logs.

Not if we unset such first. https://github.com/caarlos0/env?tab=readme-ov-file#unset-environment-variable-after-reading-it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla/signed CLA is signed by all contributors of a PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants