Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] Dependency chains are not respected in moon ci #1624

Closed
kikones34 opened this issue Aug 22, 2024 · 18 comments
Closed

[bug] Dependency chains are not respected in moon ci #1624

kikones34 opened this issue Aug 22, 2024 · 18 comments
Assignees
Labels
bug Something isn't working

Comments

@kikones34
Copy link

kikones34 commented Aug 22, 2024

Describe the bug

If you have a dependency chain task1 -> task2 -> task3, when task1 is touched, moon ci will only execute task2 and it will not propagate to task3.

Steps to reproduce

Configure three tasks as follows:

  dependency:
    command: echo "I'm the dependency"
    inputs:
      - dependency-input.txt
    options:
      runInCI: true

  dependency-checker:
    deps:
      - dependency
    command: echo "I'm the dependency checker"
    inputs:
      - dependency-checker-input.txt
    options:
      runInCI: true

  dependent:
    deps:
      - dependency-checker
    command: echo "I'm the dependent"
    inputs:
      - dependent-input.txt
    options:
      runInCI: true

Commit everything to the default git branch. moon ci shouldn't execute anything.
Now, modify the file dependency-input.txt and execute moon ci. The dependency-checker task is run, but the dependent is not:

▪▪▪▪ Gathering touched files
  services/trio-cloud-search/dependency-input.txt
▪▪▪▪ Gathering potential targets
  ...
  trio-cloud-search:dependency
  trio-cloud-search:dependency-checker
  trio-cloud-search:dependent
  ...
▪▪▪▪ Generating action graph
Target count: 335
Action count: 12
▪▪▪▪ Running targets
▪▪▪▪ trio-cloud-search:dependency (faf9f199)
        trio-cloud-search:dependency | I'm the dependency
▪▪▪▪ trio-cloud-search:dependency (faf9f199)
▪▪▪▪ trio-cloud-search:dependency-checker (7f147af2)
trio-cloud-search:dependency-checker | I'm the dependency checker
▪▪▪▪ trio-cloud-search:dependency-checker (7f147af2)

Expected behavior

The task dependent should also be executed since it depends on dependency-checker.

Additional context

Note that these dependency chains behave correctly regarding the cache logic (so if dependency is dirty, it will propagate the dirty state up to dependent). The issue only occurs in moon ci.

@kikones34 kikones34 added the bug Something isn't working label Aug 22, 2024
@milesj
Copy link
Collaborator

milesj commented Aug 22, 2024

@kikones34 This is by design at the moment. The "run dependents" logic only applies to direct dependents of affected tasks, and not the entire tree. It was built this way as a simple regression testing mechanism and ensuring a change in a project didn't affect it's immediate consumers.

However, we can probably add an option to control the depth.

@kikones34
Copy link
Author

kikones34 commented Aug 22, 2024

@milesj Oh, I see. The problem is that this breaks the way we're managing inter-project dependencies. In the real-world use case, what we actually have is:

  • A check-source-changed task on each project which gets dirty when the source code of the project changes.
  • A check-deps-source-changed task on each project, which has dependencies such as proj1:check-source-changed, proj2:check-source-changed, etc. It gets dirty whenever one of those project's source is changed.
  • Tasks that depend on check-deps-source-changed. For example, test only gets dirty if the project source or the source of the dependencies has changed.

This has worked perfectly as far as caching goes, but I'm finding it impossible to replicate with the CI feature.

@milesj
Copy link
Collaborator

milesj commented Aug 22, 2024

@kikones34 Yah that makes sense. Are you able to use moon run in CI now to replicate what you want?

@kikones34
Copy link
Author

Yeah for now we're using moon run, but managing the cache is tricky due to the pipelines being run by Bamboo agents, different PRs open on different branches at the same time, etc.
For now it's mostly working, although we are limiting the cache to be per-branch, otherwise many issues arise. That's why we were looking into leveraging the moon ci capabilities so that we can always run just the minimum necessary tasks given by the touched files, without having to worry so much about the cache state.

@milesj
Copy link
Collaborator

milesj commented Aug 23, 2024

I'll look into supporting deep dependents in the next release.

@milesj
Copy link
Collaborator

milesj commented Oct 7, 2024

I've made some changes in v1.29, can you give that a shot?

@kikones34
Copy link
Author

I've been checking out the new tracker logs and CLI options upstream and downstream, but I think there's still no way to pass these to moon ci, right? It only seems available for moon query projects. I also cannot really test if it would behave as expected for this issue's example, because the options can't be passed to moon query tasks either, and I get no information regarding affected tasks using the projects query.
In any case, the introduction of these parameters looks promising, and the tracker logs definitely will help debug stuff moving forwards!

@milesj
Copy link
Collaborator

milesj commented Oct 11, 2024

Yeah moon ci is hard-coded to how it works and probably won't change. I'd rather add this functionality to moon run.

@dudicoco
Copy link

dudicoco commented Nov 18, 2024

@milesj can you please elaborate on why this is the intended behavior of moon ci?
I believe most use cases require downstream dependents to run when a task changes, otherwise you have to rely solely on task inputs in order to make them affected by such a change.

We could put the outputs of a dependency as an inputs for the task like so:

# app1
tasks:
  build:
    command: build
    inputs:
    - 'src/**/*'
    outputs:
    - bin/main
# app2
tasks:
  build:
    deps:
    - app1:build
    command: build
    inputs:
    - 'src/**/*'
    - /app1/bin/main
    outputs:
    - bin/main
# app3
tasks:
  build:
    deps:
    - app2:build
    command: build
    inputs:
    - 'src/**/*'
    - /app2/bin/main
    outputs:
    - bin/main

However, moon generates the graph before the outputs are generated so this doesn't work.

So it seems that at the current state moon ci is actually not suitable to run in CI since a lot of dependent tasks will not be triggered.

@milesj
Copy link
Collaborator

milesj commented Nov 26, 2024

@dudicoco Why are the outputs of dep task used as inputs for the owning task? This shouldn't be necessary, since the dep task itself is considered an "input".

@milesj
Copy link
Collaborator

milesj commented Nov 26, 2024

@kikones34 v1.30 now properly supports tasks in the affected tracker, so moon query tasks should work better.

@dudicoco
Copy link

@milesj if my dep task produces an artifact which the next task needs to use as input for its build then this is neccessary.

If you look at how bazel works this happens everywhere basically, for example:

go_library(
    name = "lib",
    srcs = ["lib.go"],
)

go_binary(
    name = "bin",
    srcs = ["bin.go"],
    deps = [":lib"],
)

@milesj
Copy link
Collaborator

milesj commented Nov 26, 2024

@dudicoco Yes I've used Bazel many times.

In moon however, inputs are pretty much only used to generate the task hash (for caching). We do this by running each input through git object-hash. On top of that, we also include the task hash of all dependencies of a task.

Because of this, you don't need the output of another task (A) as the input of a task (B). Because if that other task (A) generates a new output (a binary in your example), its new task hash will trigger a recomputed task hash in the downstream task (B).

Here's an example of the hash manifest:

[
  {
    "command": "packemon",
    "args": [
      "--addExports",
      "--addFiles",
      "--declaration",
      "build"
    ],
    "deps": {
      "types:build": "051aee1254b3632bed9a24881ee7b9d8291fb64c301827241f442e005f2e6fe2"
    },
    "env": {
      "NODE_ENV": "production"
    },
    "inputs": {
      ".moon/tasks/node.yml": "0d12b1661063be4206079ee064ddae38ccd1c828",
      ".moon/toolchain.yml": "a22995b47346668753102ec84334da6284263140",
      ".moon/workspace.yml": "c49cd7311e1fb927f533deafb811bbce8c1b617b",
      "packages/nx-compat/package.json": "de182d2331dd91e282cd8e96ca7d71759d87b29f",
      "packages/nx-compat/src/bin.ts": "928e5165417cb29e9905c9434d035054e25f32c3",
      "packages/nx-compat/src/execute.ts": "5bbb9d24cdcc14acc9f83544c0236366f25aa9b8",
      "packages/nx-compat/src/helpers.ts": "45746a7465117d5987fe10182cd5382e123d5278",
      "packages/nx-compat/src/index.ts": "cff9946dfbb065ba98b08c87977da2efa6ccc2b6",
      "packages/nx-compat/src/moon.ts": "ac0e3496c4b6d885983b9240df2883c528278328",
      "packages/nx-compat/src/nx.ts": "247193c029bc2bf75657ec5dfe5072370b75081d",
      "packages/nx-compat/tsconfig.json": "a11ef0030ba77f737a945aa9caa4796e1656f07f",
      "packages/nx-compat/tsconfig.mjs.json": "fea3aa02bbc1d48e15c06a178279d60b35461adc",
      "tsconfig.options.json": "720b9c8f7dc0b6c38c33f4a6e26fff5fb5fc95d0"
    },
    "inputEnv": {},
    "outputs": [
      "mjs"
    ],
    "platform": "node",
    "projectDeps": [
      "types"
    ],
    "script": null,
    "target": "nx-compat:build",
    "version": "2"
  },
}

As you can see, nx-compat:build depends on types:build, so we include its task hash in the manifest, so that any changes to types:build would bust the nx-compat:build cache.

@dudicoco
Copy link

dudicoco commented Nov 26, 2024

@milesj i'll try to illustrate what I believe is the problem:

# app1
tasks:
  build:
    command: echo output1>>file.txt
    inputs:
    - file.txt
    outputs:
    - out/file.txt
# app2
tasks:
  build:
    deps:
    - app1:build
    command: echo output2>>file.txt
    inputs:
    - /app1/out/file.txt
    outputs:
    - out/file.txt
# app3
tasks:
  build:
    deps:
    - app2:build
    command: echo output3>>file.txt
    inputs:
    - /app2/out/file.txt
    outputs:
    - out/file.txt

As you can see in this example, app3 needs the output of of app2 as its input in order to produce output, app2 in turn needs the output of app1 as input in order to produce output.
And if a change to app1 doesn't trigger a change to the transitive dependents (app3) then app3 won't be rebuilt.

@milesj
Copy link
Collaborator

milesj commented Nov 26, 2024

@kikones34 This should work much better in v1.30. We have a test case that does exactly that: https://github.com/moonrepo/moon/blob/master/crates/affected/tests/affected_tracker_test.rs#L733

@dudicoco
Copy link

So this was resolved in v1.30, but it introduced another undesired behavior, see #1736

@kikones34
Copy link
Author

@milesj This looks really good now, I did a small test project and everything works as expected! However in our main big monorepo, there's an issue which causes a ton of tasks to be marked as affected, see: #1743

@milesj
Copy link
Collaborator

milesj commented Nov 29, 2024

Awesome, I'll close this for now.

@milesj milesj closed this as completed Nov 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Development

No branches or pull requests

3 participants