Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Worker: don't use yargs defaults for env var stuff #604

Closed
josephjclark opened this issue Feb 15, 2024 · 3 comments · Fixed by #695 or #699
Closed

Worker: don't use yargs defaults for env var stuff #604

josephjclark opened this issue Feb 15, 2024 · 3 comments · Fixed by #695 or #699
Assignees
Labels
good first issue Good for newcomers tech debt Non-critical, invisible code issues that ought to be addressed

Comments

@josephjclark
Copy link
Collaborator

josephjclark commented Feb 15, 2024

The worker is started with a script called start.ts, which uses the yargs library to parse inputs from command-line arguments and from the environment.

The way it's set up at the moment allows env vars to be leaked into console output. Here's what happens if you run pnpm start --help from packages/ws-worker/:

image

If you look closely at the output, you'll see that options which list env vars also list the env var value.

This isn't really harmful, but it is misleading and scary.

Requirements

  • Env vars should not be loaded in the yargs.default property
  • Env vars should still be used to set values in
  • Running node start --help should not output values in env vars

Background

The Worker is direct peer dependency of Lightning. When users trigger Workflows in Lightning, they are "claimed" by the worker (which is a separate service running on a different IP) and processed.

While the Worker executes a workflow, it sends back events to Lightning via websocket.

See the worker readme in packages/ws-worker/README.md for a bit of an overview

The actual execution is handled by the engine (engine-multi). The Worker package often mocks out the engine in tests so that it can focus on the Lighting-Worker communication.

Implementation Notes

I'm looking for a solution which finds a nice way to pull env vars off of process.env and apply them to the args object generated by yargs. Env vars should only be applied if an option was not explicitly provided. In other words, CLI arguments should be preferred to env vars, and env vars should be preferred to yargs defaults.

If the args object is populated appropriately, this change should be totally invisible.

Testing

There are currently no unit tests against start.ts. It would be nice to have some.

One way of doing this would be to separate the yargs stuff from start.ts, and just have a single function which assembles options from yargs and the environment. A cli.ts and a start.ts would be reasonable. You could then unit test cli.ts and ensure that it returns the right options.

@josephjclark josephjclark added the tech debt Non-critical, invisible code issues that ought to be addressed label Feb 15, 2024
@github-project-automation github-project-automation bot moved this to New Issues in v2 Feb 15, 2024
@josephjclark josephjclark moved this from New Issues to Ready in v2 Feb 15, 2024
@christad92 christad92 moved this from Ready to Backlog in v2 Feb 29, 2024
@josephjclark josephjclark moved this from Backlog to Icebox in v2 Apr 10, 2024
@josephjclark josephjclark added good first issue Good for newcomers more details needed This issue has insufficient requirements and needs more details before it can be started and removed more details needed This issue has insufficient requirements and needs more details before it can be started labels Apr 18, 2024
@SatyamMattoo
Copy link
Contributor

SatyamMattoo commented May 18, 2024

Hello @josephjclark , I see the problem with environment variables being exposed in the console output when running pnpm start --help. While it's not harmful, it can definitely be misleading and cause unnecessary concern.

To resolve this, I've come up with a plan to separate the environment variables from the yargs defaults. We'll refactor the argument parsing so that environment variables are not directly loaded into the yargs.default property. Instead, we'll apply environment variable values to the args object manually after parsing the command-line arguments. This will prevent the values from being exposed in the help output.

We'll move the logic for handling environment variables outside of the yargs configuration and apply them manually in the parseArgs function. Here's the updated implementation:

const parser = yargs(hideBin(argv))
    .command('server', 'Start a ws-worker server')
    .option('port', {
      alias: 'p',
      description: 'Port to run the server on.',
      type: 'number',
    })
    .option('lightning', {
      alias: ['l', 'lightning-service-url'],
      description: 'Base url to Lightning websocket endpoint, eg, ws://localhost:4000/worker. Set to "mock" to use the default mock server.',
      type: 'string',
    })

.....
const args = parser.parse() as Args;
return {
    ...args,
    port: args.port ?? (WORKER_PORT ? parseInt(WORKER_PORT) : 2222),
    lightning: args.lightning ?? WORKER_LIGHTNING_SERVICE_URL ?? 'ws://localhost:4000/worker',
    repoDir: args.repoDir ?? WORKER_REPO_DIR,
    secret: args.secret ?? WORKER_SECRET,
    lightningPublicKey: args.lightningPublicKey ?? WORKER_LIGHTNING_PUBLIC_KEY,
    log: args.log ?? (WORKER_LOG_LEVEL as LogLevel) ?? 'debug',
    backoff: args.backoff ?? WORKER_BACKOFF ?? '1/10',
    capacity: args.capacity ?? (WORKER_CAPACITY ? parseInt(WORKER_CAPACITY) : 5),
    statePropsToRemove: args.statePropsToRemove ?? (WORKER_STATE_PROPS_TO_REMOVE ? WORKER_STATE_PROPS_TO_REMOVE.split(',') : ['configuration', 'response']),
    runMemory: args.runMemory ?? (WORKER_MAX_RUN_MEMORY_MB ? parseInt(WORKER_MAX_RUN_MEMORY_MB) : 500),
    maxRunDurationSeconds: args.maxRunDurationSeconds ?? (WORKER_MAX_RUN_DURATION_SECONDS ? parseInt(WORKER_MAX_RUN_DURATION_SECONDS) : 300),
  } as Args;

To ensure these changes are effective and robust, I have added unit tests for the parseArgs function. I have included this function in a separate file cli.ts. These tests are verifying the correct handling of CLI arguments and environment variables.

@josephjclark
Copy link
Collaborator Author

Hey @SatyamMattoo sounds like you're on the right track! Can you raise a PR so I can take a proper look?

One point of early feedback: the setup of the args object is quite painful:

port: args.port ?? (WORKER_PORT ? parseInt(WORKER_PORT) : 2222),

The right-hand-side is very verbose and repettive, and hard to read. Ideally it would be nice to easily understand where a value has come from.

Every arg has a value from the command line, a value from the env, and sometimes a default. Could we have some kind of helper function like this?

port:  setArg('port', 'WORKER_PORT',  2222),

SatyamMattoo added a commit to SatyamMattoo/kit that referenced this issue May 20, 2024
@SatyamMattoo SatyamMattoo mentioned this issue May 20, 2024
3 tasks
@SatyamMattoo
Copy link
Contributor

Hey @josephjclark I have raised a PR with the suggested changes, you can have a look whenever possible and suggest more changes that might be needed.

Best regards.

josephjclark pushed a commit that referenced this issue May 27, 2024
* Fixing: #604

* added new test for statePropsToRemove

* requested changes

* completed: requested changes
josephjclark added a commit that referenced this issue May 27, 2024
* worker: restructure env vars

* Fixing: #604

* added new test for statePropsToRemove

* requested changes

* completed: requested changes

* changeset

* worker: simplify typings

* worker: adjust tests

* release: worker @1.1.10

---------

Co-authored-by: Satyam Mattoo <[email protected]>
@github-project-automation github-project-automation bot moved this from Icebox to Done in v2 May 27, 2024
josephjclark pushed a commit that referenced this issue Jun 3, 2024
* Fixing: #604

* added new test for statePropsToRemove

* runtime: adding more validation to workflow.json

* removing ws-worker changes

* removing ws-worker changes

* runtime: added new tests to validate-plan.test.ts

* log function switched to openfn/logger

* completed: requested changes

* moved validtion to CLI

* removed @ts-ignore and changed error and test messages.

* fixing workflow error message
josephjclark pushed a commit that referenced this issue Jun 3, 2024
* Fixing: #604

* added new test for statePropsToRemove

* runtime: adding more validation to workflow.json

* removing ws-worker changes

* removing ws-worker changes

* runtime: added new tests to validate-plan.test.ts

* log function switched to openfn/logger

* completed: requested changes

* moved validtion to CLI

* removed @ts-ignore and changed error and test messages.

* fixing workflow error message

relesae: [email protected]

formatting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers tech debt Non-critical, invisible code issues that ought to be addressed
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants