[CI] Divide CI into gates #13861

Merged
merged 16 commits into from
Oct 16, 2023

@dkijania dkijania commented Aug 14, 2023

Explain your changes:
The current Buildkite CI does not distinguish, at the Dhall level, between nightly and PullRequest runs. However, a closer look at the logic shows that some groundwork already exists and can be extended to control which jobs run nightly. Setting export BUILDKITE_PIPELINE_MODE=Stable signals to CI that we want to run nightly jobs, so Monorepo triage is disabled and all known jobs are run.

That alone is not sufficient, because we also want to exclude some jobs from the PullRequest stage and run them only on Stable. I therefore introduced a new attribute on the JobSpec struct, mode, which defaults to PullRequest but can be overridden to Stable where needed. The monorepo triage job then inspects this attribute and prevents Stable-only jobs from running on PullRequest.
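As a rough illustration of the rule described above, here is a minimal Python sketch. It is not the Dhall code from this PR: the `should_run` helper and the `touched_by_change` flag (standing in for whatever the monorepo triage decides) are hypothetical names, and only `JobSpec`, `mode`, `PullRequest` and `Stable` come from the description.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    PULL_REQUEST = "PullRequest"
    STABLE = "Stable"


@dataclass(frozen=True)
class JobSpec:
    name: str
    # Defaults to PullRequest, as described above; Stable-only jobs override it.
    mode: Mode = Mode.PULL_REQUEST


def should_run(job: JobSpec, pipeline_mode: Mode, touched_by_change: bool) -> bool:
    """Decide whether a job is scheduled (hypothetical helper, not the PR's code)."""
    if pipeline_mode is Mode.STABLE:
        # Stable (nightly) pipeline: triage is disabled, every known job runs.
        return True
    # PullRequest pipeline: skip Stable-only jobs, otherwise defer to triage.
    return job.mode is Mode.PULL_REQUEST and touched_by_change
```

Under these assumptions, a job declared with `mode=Mode.STABLE` is never scheduled on a PullRequest pipeline, even when the triage would otherwise select it.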

Stable (a.k.a. Nightly) logic is unchanged.

I then went a bit further into the CI configuration and experimented with splitting the pipeline into 2-3 gates:

  • Gate 1 runs only fast jobs.
  • Gate 2 runs the remaining jobs, and only if Gate 1 is green.
  • Gate 3 runs only if Gate 2 is green. It contains only closing jobs, such as gathering coverage.

As a result, I came up with a tagging mechanism that makes it easy to choose which subset of jobs should run. Suppose we want to run only fast and lint jobs. In this PR I already tagged jobs whose duration is less than 10 minutes (tag: Fast) and marked all linting jobs with tag: Lint. I then introduced a Filter module that maps a single value to a collection of tags (for user convenience, a single value such as PIPELINE_FILTER: FastAndLints is better than PIPELINE_TAGS: [Fast, Lint]). A user can now run fast and lint jobs with the setup below:

steps:
  - commands:
      - "dhall-to-yaml --quoted <<< './buildkite/src/Prepare.dhall' | buildkite-agent pipeline upload"
    label: ":pipeline: Fast"
    agents:
       size: "generic"
    plugins:
      "docker#v3.5.0":
        environment:
          - BUILDKITE_AGENT_ACCESS_TOKEN
          - "BUILDKITE_PIPELINE_MODE=Stable"
          - "BUILDKITE_PIPELINE_FILTER=FastAndLints"
        image: codaprotocol/ci-toolchain-base:v3
        mount-buildkite-agent: false
        propagate-environment: true
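The mapping from a single filter value to a set of tags can be sketched in Python as follows. This is an illustration under stated assumptions, not the Filter module from the PR: the `Long` and `TearDown` tag names, the `select_jobs` helper, the "any matching tag" rule, and the behaviour when no filter is set are all guesses made for the example. Only `FastAndLints`, `LongOnly`, `TearDownOnly`, `Fast`, `Lint` and `BUILDKITE_PIPELINE_FILTER` appear in the description.

```python
from dataclasses import dataclass
from typing import Iterable, List, Mapping

# Filter name -> tags it expands to. Only FastAndLints is spelled out in the
# PR description; the other two expansions are assumptions.
FILTER_TAGS = {
    "FastAndLints": frozenset({"Fast", "Lint"}),
    "LongOnly": frozenset({"Long"}),
    "TearDownOnly": frozenset({"TearDown"}),
}


@dataclass(frozen=True)
class Job:
    name: str
    tags: frozenset


def select_jobs(jobs: Iterable[Job], env: Mapping[str, str]) -> List[Job]:
    """Keep jobs carrying at least one tag of the active filter (assumed rule)."""
    filter_name = env.get("BUILDKITE_PIPELINE_FILTER")
    if filter_name is None:
        # Assumption: with no filter set, nothing is excluded.
        return list(jobs)
    wanted = FILTER_TAGS[filter_name]
    return [job for job in jobs if job.tags & wanted]
```

Passing the environment as a plain mapping keeps the sketch easy to exercise, for example `select_jobs(jobs, {"BUILDKITE_PIPELINE_FILTER": "FastAndLints"})`.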

But that is not all. We can define multiple stages using filters, with a wait command between them. Suppose we want three groups of jobs in our pipeline:

  • Fast (Lints and fast jobs)
  • Long (Unit tests/Building dockers etc)
  • TearDown (Coverage gathering)

We can achieve this with a setup like the one below (I omitted unimportant parameters):

steps:
  - commands:
      - "dhall-to-yaml ..."
        environment:
          - "BUILDKITE_PIPELINE_MODE=Stable"
          - "BUILDKITE_PIPELINE_FILTER=Fast"
  - wait
  - commands:
      - "dhall-to-yaml ..."
          - "BUILDKITE_PIPELINE_MODE=Stable"
          - "BUILDKITE_PIPELINE_FILTER=LongOnly"
  - wait
  - commands:
      - "dhall-to-yaml ..."
          - "BUILDKITE_PIPELINE_MODE=Stable"
          - "BUILDKITE_PIPELINE_FILTER=TearDownOnly"

The above setup can be used for PR verification. It first runs the fast jobs and lints. If any of them is red, the build fails and the long jobs are not run, saving time and cost.
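The gating behaviour can be modelled with a short Python sketch. It only simulates the effect of the wait steps (a later gate starts only if every job in the earlier gate passed). The `run_gates` helper, the job names and the `run_job` callback are hypothetical and do not come from the PR.

```python
from typing import Callable, List, Sequence, Tuple


def run_gates(
    gates: Sequence[Tuple[str, Sequence[str]]],
    run_job: Callable[[str], bool],
) -> Tuple[bool, List[str]]:
    """Run gates in order and stop after the first gate with a failed job.

    Returns (build_is_green, names_of_jobs_that_were_started).
    """
    started: List[str] = []
    for _gate_name, jobs in gates:
        results = []
        for job in jobs:
            # Jobs inside one gate are independent, so all of them are started.
            started.append(job)
            results.append(run_job(job))
        if not all(results):
            # Mirrors the wait step: later gates never start after a failure.
            return False, started
    return True, started
```

For example, with gates `Fast`, `Long` and `TearDown`, a failing lint job in the first gate means the docker build and coverage jobs are never started.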

Explain how you tested your changes:

I tested it on the fuzzy zkapp tests job, which uses the $NIGHTLY parameter to control, at the script level, whether the job is empty but always green on PullRequest or performs an actual test run on Stable. I removed that logic from the script and set Stable as the job mode in Dhall, then verified that the job is not run.

Checklist:

  • Dependency versions are unchanged
    • Notify Velocity team if dependencies must change in CI
  • Modified the current draft of release notes with details on what is completed or incomplete within this project
  • Document code purpose, how to use it
    • Mention expected invariants, implicit constraints
  • Tests were added for the new behavior
    • Document test purpose, significance of failures
    • Test names should reflect their purpose
  • All tests pass (CI will check this if you didn't)
  • Serialized types are in stable-versioned modules
  • Does this close issues? List them

@dkijania dkijania changed the title Dkijania/move te to stage 2 Divide CI into gates Aug 14, 2023
@dkijania dkijania self-assigned this Aug 14, 2023
@dkijania dkijania changed the base branch from develop to berkeley August 14, 2023 17:12
@dkijania dkijania force-pushed the dkijania/move_te_to_stage_2 branch from 0119012 to c8f5ceb Compare August 14, 2023 17:17
@dkijania dkijania changed the title Divide CI into gates [CI] Divide CI into gates Aug 14, 2023
@dkijania dkijania marked this pull request as ready for review August 15, 2023 13:21
@dkijania dkijania requested a review from a team as a code owner August 15, 2023 13:21
@dkijania
Member Author

The effect of my changes can be observed on a separate pipeline:

https://buildkite.com/o-1-labs-2/tmp-teardownjob/builds/89

@dkijania dkijania force-pushed the dkijania/move_te_to_stage_2 branch 2 times, most recently from 537b554 to ca7280e Compare September 6, 2023 19:53
@dkijania dkijania force-pushed the dkijania/move_te_to_stage_2 branch from 882751e to 67753c3 Compare September 12, 2023 20:53
@dkijania dkijania force-pushed the dkijania/move_te_to_stage_2 branch from ac4aac4 to 2cfe5a1 Compare September 21, 2023 19:09
@dkijania
Member Author

!ci-build-me

@dkijania dkijania force-pushed the dkijania/move_te_to_stage_2 branch from 99bced3 to fb3024d Compare September 25, 2023 19:21
@dkijania dkijania force-pushed the dkijania/move_te_to_stage_2 branch from 19b2f5d to 139763c Compare September 25, 2023 20:51
@dkijania dkijania force-pushed the dkijania/move_te_to_stage_2 branch from 1837496 to 38712a8 Compare September 27, 2023 17:10

Contributor

@stevenplatt stevenplatt left a comment

Looks good. I cannot think of any conflicts or concerns.

One item worth noting is that the Tag and Mode have adjacent functionality to controls available in webhooks settings within the repo, but I don't think they overlap in current form.

@dkijania dkijania force-pushed the dkijania/move_te_to_stage_2 branch from a4475b7 to c425cb1 Compare October 6, 2023 17:21
@dkijania
Member Author

dkijania commented Oct 6, 2023

!ci-build-me

@dkijania
Member Author

dkijania commented Oct 9, 2023

!ci-build-me

@deepthiskumar
Member

Approving since the changes are only in the buildkite/CI related files

@deepthiskumar
Member

!approved-for-mainnet

@dkijania dkijania merged commit 4703298 into berkeley Oct 16, 2023
1 of 2 checks passed
@dkijania dkijania deleted the dkijania/move_te_to_stage_2 branch October 16, 2023 21:44
Successfully merging this pull request may close these issues.

Improved buildkite nightly/stable pipeline configuration