Worker: don't use yargs defaults for env var stuff #604

josephjclark · 2024-02-15T12:04:46Z

The worker is started with a script called start.ts, which uses the yargs library to parse inputs from command-line arguments and from the environment.

The way it's set up at the moment allows env vars to be leaked into console output. Here's what happens if you run pnpm start --help from packages/ws-worker/:

If you look closely at the output, you'll see that options which list env vars also list the env var value.

This isn't really harmful, but it is misleading and scary.

Requirements

Env vars should not be loaded in the yargs.default property
Env vars should still be used to set values in
Running node start --help should not output values in env vars

Background

The Worker is direct peer dependency of Lightning. When users trigger Workflows in Lightning, they are "claimed" by the worker (which is a separate service running on a different IP) and processed.

While the Worker executes a workflow, it sends back events to Lightning via websocket.

See the worker readme in packages/ws-worker/README.md for a bit of an overview

The actual execution is handled by the engine (engine-multi). The Worker package often mocks out the engine in tests so that it can focus on the Lighting-Worker communication.

Implementation Notes

I'm looking for a solution which finds a nice way to pull env vars off of process.env and apply them to the args object generated by yargs. Env vars should only be applied if an option was not explicitly provided. In other words, CLI arguments should be preferred to env vars, and env vars should be preferred to yargs defaults.

If the args object is populated appropriately, this change should be totally invisible.

Testing

There are currently no unit tests against start.ts. It would be nice to have some.

One way of doing this would be to separate the yargs stuff from start.ts, and just have a single function which assembles options from yargs and the environment. A cli.ts and a start.ts would be reasonable. You could then unit test cli.ts and ensure that it returns the right options.

The text was updated successfully, but these errors were encountered:

SatyamMattoo · 2024-05-18T19:21:09Z

Hello @josephjclark , I see the problem with environment variables being exposed in the console output when running pnpm start --help. While it's not harmful, it can definitely be misleading and cause unnecessary concern.

To resolve this, I've come up with a plan to separate the environment variables from the yargs defaults. We'll refactor the argument parsing so that environment variables are not directly loaded into the yargs.default property. Instead, we'll apply environment variable values to the args object manually after parsing the command-line arguments. This will prevent the values from being exposed in the help output.

We'll move the logic for handling environment variables outside of the yargs configuration and apply them manually in the parseArgs function. Here's the updated implementation:

const parser = yargs(hideBin(argv))
    .command('server', 'Start a ws-worker server')
    .option('port', {
      alias: 'p',
      description: 'Port to run the server on.',
      type: 'number',
    })
    .option('lightning', {
      alias: ['l', 'lightning-service-url'],
      description: 'Base url to Lightning websocket endpoint, eg, ws://localhost:4000/worker. Set to "mock" to use the default mock server.',
      type: 'string',
    })

.....
const args = parser.parse() as Args;
return {
    ...args,
    port: args.port ?? (WORKER_PORT ? parseInt(WORKER_PORT) : 2222),
    lightning: args.lightning ?? WORKER_LIGHTNING_SERVICE_URL ?? 'ws://localhost:4000/worker',
    repoDir: args.repoDir ?? WORKER_REPO_DIR,
    secret: args.secret ?? WORKER_SECRET,
    lightningPublicKey: args.lightningPublicKey ?? WORKER_LIGHTNING_PUBLIC_KEY,
    log: args.log ?? (WORKER_LOG_LEVEL as LogLevel) ?? 'debug',
    backoff: args.backoff ?? WORKER_BACKOFF ?? '1/10',
    capacity: args.capacity ?? (WORKER_CAPACITY ? parseInt(WORKER_CAPACITY) : 5),
    statePropsToRemove: args.statePropsToRemove ?? (WORKER_STATE_PROPS_TO_REMOVE ? WORKER_STATE_PROPS_TO_REMOVE.split(',') : ['configuration', 'response']),
    runMemory: args.runMemory ?? (WORKER_MAX_RUN_MEMORY_MB ? parseInt(WORKER_MAX_RUN_MEMORY_MB) : 500),
    maxRunDurationSeconds: args.maxRunDurationSeconds ?? (WORKER_MAX_RUN_DURATION_SECONDS ? parseInt(WORKER_MAX_RUN_DURATION_SECONDS) : 300),
  } as Args;

To ensure these changes are effective and robust, I have added unit tests for the parseArgs function. I have included this function in a separate file cli.ts. These tests are verifying the correct handling of CLI arguments and environment variables.

josephjclark · 2024-05-20T08:02:05Z

Hey @SatyamMattoo sounds like you're on the right track! Can you raise a PR so I can take a proper look?

One point of early feedback: the setup of the args object is quite painful:

port: args.port ?? (WORKER_PORT ? parseInt(WORKER_PORT) : 2222),

The right-hand-side is very verbose and repettive, and hard to read. Ideally it would be nice to easily understand where a value has come from.

Every arg has a value from the command line, a value from the env, and sometimes a default. Could we have some kind of helper function like this?

port:  setArg('port', 'WORKER_PORT',  2222),

SatyamMattoo · 2024-05-20T09:02:54Z

Hey @josephjclark I have raised a PR with the suggested changes, you can have a look whenever possible and suggest more changes that might be needed.

Best regards.

* Fixing: #604 * added new test for statePropsToRemove * requested changes * completed: requested changes

* worker: restructure env vars * Fixing: #604 * added new test for statePropsToRemove * requested changes * completed: requested changes * changeset * worker: simplify typings * worker: adjust tests * release: worker @1.1.10 --------- Co-authored-by: Satyam Mattoo <[email protected]>

@ts-ignore

* Fixing: #604 * added new test for statePropsToRemove * runtime: adding more validation to workflow.json * removing ws-worker changes * removing ws-worker changes * runtime: added new tests to validate-plan.test.ts * log function switched to openfn/logger * completed: requested changes * moved validtion to CLI * removed @ts-ignore and changed error and test messages. * fixing workflow error message

@ts-ignore

* Fixing: #604 * added new test for statePropsToRemove * runtime: adding more validation to workflow.json * removing ws-worker changes * removing ws-worker changes * runtime: added new tests to validate-plan.test.ts * log function switched to openfn/logger * completed: requested changes * moved validtion to CLI * removed @ts-ignore and changed error and test messages. * fixing workflow error message relesae: [email protected] formatting

josephjclark added the tech debt Non-critical, invisible code issues that ought to be addressed label Feb 15, 2024

taylordowns2000 added this to v2 Feb 15, 2024

github-project-automation bot moved this to New Issues in v2 Feb 15, 2024

josephjclark moved this from New Issues to Ready in v2 Feb 15, 2024

christad92 moved this from Ready to Backlog in v2 Feb 29, 2024

josephjclark moved this from Backlog to Icebox in v2 Apr 10, 2024

SatyamMattoo added a commit to SatyamMattoo/kit that referenced this issue May 20, 2024

Fixing: OpenFn#604

e7632ae

SatyamMattoo mentioned this issue May 20, 2024

Fixing: #604 #695

Merged

3 tasks

josephjclark assigned SatyamMattoo May 21, 2024

josephjclark pushed a commit that referenced this issue May 27, 2024

worker: restructure env vars

c49ae24

* Fixing: #604 * added new test for statePropsToRemove * requested changes * completed: requested changes

josephjclark mentioned this issue May 27, 2024

worker: restructure env vars #699

Merged

josephjclark closed this as completed in #699 May 27, 2024

github-project-automation bot moved this from Icebox to Done in v2 May 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Worker: don't use yargs defaults for env var stuff #604

Worker: don't use yargs defaults for env var stuff #604

josephjclark commented Feb 15, 2024 •

edited

Loading

SatyamMattoo commented May 18, 2024 •

edited

Loading

josephjclark commented May 20, 2024

SatyamMattoo commented May 20, 2024

Worker: don't use yargs defaults for env var stuff #604

Worker: don't use yargs defaults for env var stuff #604

Comments

josephjclark commented Feb 15, 2024 • edited Loading

Requirements

Background

Implementation Notes

Testing

SatyamMattoo commented May 18, 2024 • edited Loading

josephjclark commented May 20, 2024

SatyamMattoo commented May 20, 2024

josephjclark commented Feb 15, 2024 •

edited

Loading

SatyamMattoo commented May 18, 2024 •

edited

Loading