Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLDSRV-574 implement KMS health check #5697

Merged
merged 1 commit into from
Nov 22, 2024

Conversation

nicolas2bert
Copy link
Contributor

Context:
Cloudserver deep healthcheck is used in S3C by

S3 Frontend (Nginx) to manage the distribution of incoming S3 API requests based on backend server health. It sends a deep health check call every 2 seconds. Based on the deep health check results, Nginx dynamically adjusts the distribution of incoming S3 API requests. Healthy servers receive traffic, while unhealthy ones are temporarily removed from the pool to prevent failed requests.

Prometheus to monitor the health of bucketd. It sends an cloudserver deep health check call every 30 seconds. Prometheus requests blackbox that retrieves the cloudserver health, which has the bucketd health.

Current behavior:

Currently, KMS (KMIP/AWS KMS) is not part of the cloudserver deep health check. This prevents the cloudserver health check from failing if KMS is down, which is expected because only requests to encrypted buckets should fail. KMS being down should not prevent the server from starting up or run. (bugfix: https://scality.atlassian.net/browse/S3C-4833 )

Expected behavior:

We want KMS to be part of the cloudserver deep health check but still prevent the cloudserver health check from failing if KMS is down. Since only requests to encrypted buckets should fail when KMS is down, the server should continue to start up and run without issues.

Why do we need to check for the KMS health?

we want to be alerted if KMS is down to take appropriate action.

as a side effect, including KMS in the health check will help keep the connection alive. Some KMS providers require maintaining an active connection because, for example, when cloudserver connection to KMS remains idle for more than 24 hours, the internal Thales token expires. (ref: https://scality.atlassian.net/browse/S3C-8464 )

NOTEs:

Based on how frequently the cloudserver deep health check is performed, successful KMS requests should be cached with a cache expiration time set to less than 24 hours to keep the KMS connection alive.Caching will lower the number of requests made to the KMS service → less system resources used. Also, if KMS requests have costs, it will save some.

This approach is not a proper design for handling connection maintenance but rather a good side effect of integrating KMS into the health check. To handle connection “upkeep” properly, we should clearly separate both concerns: health checks vs connection maintenance.

@bert-e
Copy link
Contributor

bert-e commented Nov 12, 2024

Hello nicolas2bert,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options
name description privileged authored
/after_pull_request Wait for the given pull request id to be merged before continuing with the current one.
/bypass_author_approval Bypass the pull request author's approval
/bypass_build_status Bypass the build and test status
/bypass_commit_size Bypass the check on the size of the changeset TBA
/bypass_incompatible_branch Bypass the check on the source branch prefix
/bypass_jira_check Bypass the Jira issue check
/bypass_peer_approval Bypass the pull request peers' approval
/bypass_leader_approval Bypass the pull request leaders' approval
/approve Instruct Bert-E that the author has approved the pull request. ✍️
/create_pull_requests Allow the creation of integration pull requests.
/create_integration_branches Allow the creation of integration branches.
/no_octopus Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead
/unanimity Change review acceptance criteria from one reviewer at least to all reviewers
/wait Instruct Bert-E not to run until further notice.
Available commands
name description privileged
/help Print Bert-E's manual in the pull request.
/status Print Bert-E's current status in the pull request TBA
/clear Remove all comments from Bert-E from the history TBA
/retry Re-start a fresh build TBA
/build Re-start a fresh build TBA
/force_reset Delete integration branches & pull requests, and restart merge process from the beginning.
/reset Try to remove integration branches unless there are commits on them which do not appear on the source branch.

Status report is not available.

@bert-e
Copy link
Contributor

bert-e commented Nov 12, 2024

Incorrect fix version

The Fix Version/s in issue CLDSRV-574 contains:

  • None

Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:

  • 7.70.57

  • 8.6.29

  • 8.7.50

  • 8.8.36

Please check the Fix Version/s of CLDSRV-574, or the target
branch of this pull request.

@nicolas2bert
Copy link
Contributor Author

ping

@bert-e
Copy link
Contributor

bert-e commented Nov 12, 2024

Request integration branches

Waiting for integration branch creation to be requested by the user.

To request integration branches, please comment on this pull request with the following command:

/create_integration_branches

Alternatively, the /approve and /create_pull_requests commands will automatically
create the integration branches.

@nicolas2bert nicolas2bert force-pushed the improvement/CLDSRV-574/kms-healthcheck branch from 07179dd to e3fcce6 Compare November 12, 2024 11:00
@nicolas2bert
Copy link
Contributor Author

/create_integration_branches

@bert-e
Copy link
Contributor

bert-e commented Nov 13, 2024

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/7.10
  • development/7.4

You can set option create_pull_requests if you need me to create
integration pull requests in addition to integration branches, with:

@bert-e create_pull_requests

The following options are set: create_integration_branches

@bert-e
Copy link
Contributor

bert-e commented Nov 13, 2024

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • 2 peers

The following options are set: create_integration_branches

Copy link
Contributor

@anurag4DSB anurag4DSB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that its not gonna bring CS down :)

@bert-e
Copy link
Contributor

bert-e commented Nov 20, 2024

Incorrect fix version

The Fix Version/s in issue CLDSRV-574 contains:

  • 7.70.57

  • 8.6.29

  • 8.7.50

  • 8.8.36

Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:

  • 7.70.58

  • 8.6.29

  • 8.7.50

  • 8.8.37

Please check the Fix Version/s of CLDSRV-574, or the target
branch of this pull request.

The following options are set: create_integration_branches

tests/unit/encryption/healthCheckCache.js Outdated Show resolved Hide resolved
lib/kms/wrapper.js Outdated Show resolved Hide resolved
lib/kms/wrapper.js Outdated Show resolved Hide resolved
lib/kms/Cache.js Outdated Show resolved Hide resolved
tests/unit/encryption/checkHealth.js Outdated Show resolved Hide resolved
@nicolas2bert nicolas2bert force-pushed the improvement/CLDSRV-574/kms-healthcheck branch 2 times, most recently from d18b7df to 86149db Compare November 21, 2024 12:47
@nicolas2bert
Copy link
Contributor Author

@bert-e approve

@bert-e
Copy link
Contributor

bert-e commented Nov 21, 2024

Incorrect fix version

The Fix Version/s in issue CLDSRV-574 contains:

  • 7.70.57

  • 8.6.29

  • 8.7.50

  • 8.8.36

Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:

  • 7.70.58

  • 8.6.29

  • 8.7.50

  • 8.8.37

Please check the Fix Version/s of CLDSRV-574, or the target
branch of this pull request.

The following options are set: approve, create_integration_branches

@nicolas2bert
Copy link
Contributor Author

ping

@bert-e
Copy link
Contributor

bert-e commented Nov 21, 2024

History mismatch

Merge commit #e3fcce6e3426bbbafb595646c9bfbc7dc833fb71 on the integration branch
w/8.6/improvement/CLDSRV-574/kms-healthcheck is merging a branch which is neither the current
branch improvement/CLDSRV-574/kms-healthcheck nor the development branch
development/8.6.

It is likely due to a rebase of the branch improvement/CLDSRV-574/kms-healthcheck and the
merge is not possible until all related w/* branches are deleted or updated.

Please use the reset command to have me reinitialize these branches.

The following options are set: approve, create_integration_branches

@nicolas2bert
Copy link
Contributor Author

@bert-e reset

@bert-e
Copy link
Contributor

bert-e commented Nov 21, 2024

Reset complete

I have successfully deleted this pull request's integration branches.

The following options are set: approve, create_integration_branches

@bert-e
Copy link
Contributor

bert-e commented Nov 21, 2024

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/7.10
  • development/7.4

You can set option create_pull_requests if you need me to create
integration pull requests in addition to integration branches, with:

@bert-e create_pull_requests

The following options are set: approve, create_integration_branches

@nicolas2bert nicolas2bert force-pushed the improvement/CLDSRV-574/kms-healthcheck branch from 86149db to e23e224 Compare November 21, 2024 14:01
@bert-e
Copy link
Contributor

bert-e commented Nov 21, 2024

Branches have diverged

This pull request's source branch improvement/CLDSRV-574/kms-healthcheck has diverged from
development/7.70 by more than 50 commits.

To avoid any integration risks, please re-synchronize them using one of the
following solutions:

  • Merge origin/development/7.70 into improvement/CLDSRV-574/kms-healthcheck
  • Rebase improvement/CLDSRV-574/kms-healthcheck onto origin/development/7.70

Note: If you choose to rebase, you may have to ask me to rebuild
integration branches using the reset command.

The following options are set: approve, create_integration_branches

@nicolas2bert nicolas2bert changed the base branch from development/7.70 to development/7.10 November 21, 2024 14:01
@bert-e
Copy link
Contributor

bert-e commented Nov 21, 2024

Incorrect fix version

The Fix Version/s in issue CLDSRV-574 contains:

  • 7.70.58

  • 8.6.29

  • 8.7.50

  • 8.8.37

Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:

  • 7.10.51

  • 7.70.58

  • 8.6.29

  • 8.7.50

  • 8.8.37

Please check the Fix Version/s of CLDSRV-574, or the target
branch of this pull request.

The following options are set: approve, create_integration_branches

@nicolas2bert
Copy link
Contributor Author

ping

@bert-e
Copy link
Contributor

bert-e commented Nov 21, 2024

History mismatch

Merge commit #86149db5ac99a615297fe2889dd65ac2705cb41d on the integration branch
w/8.6/improvement/CLDSRV-574/kms-healthcheck is merging a branch which is neither the current
branch improvement/CLDSRV-574/kms-healthcheck nor the development branch
development/8.6.

It is likely due to a rebase of the branch improvement/CLDSRV-574/kms-healthcheck and the
merge is not possible until all related w/* branches are deleted or updated.

Please use the reset command to have me reinitialize these branches.

The following options are set: approve, create_integration_branches

@nicolas2bert
Copy link
Contributor Author

@bert-e reset

@bert-e
Copy link
Contributor

bert-e commented Nov 21, 2024

Reset complete

I have successfully deleted this pull request's integration branches.

The following options are set: approve, create_integration_branches

@bert-e
Copy link
Contributor

bert-e commented Nov 21, 2024

Conflict

A conflict has been raised during the creation of
integration branch w/7.70/improvement/CLDSRV-574/kms-healthcheck with contents from improvement/CLDSRV-574/kms-healthcheck
and development/7.70.

I have not created the integration branch.

Here are the steps to resolve this conflict:

 $ git fetch
 $ git checkout -B w/7.70/improvement/CLDSRV-574/kms-healthcheck origin/development/7.70
 $ git merge origin/improvement/CLDSRV-574/kms-healthcheck
 $ # <intense conflict resolution>
 $ git commit
 $ git push -u origin w/7.70/improvement/CLDSRV-574/kms-healthcheck

The following options are set: approve, create_integration_branches

@nicolas2bert nicolas2bert force-pushed the improvement/CLDSRV-574/kms-healthcheck branch from e23e224 to c320582 Compare November 21, 2024 14:44
@bert-e
Copy link
Contributor

bert-e commented Nov 21, 2024

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/7.4

You can set option create_pull_requests if you need me to create
integration pull requests in addition to integration branches, with:

@bert-e create_pull_requests

The following options are set: approve, create_integration_branches

@bert-e
Copy link
Contributor

bert-e commented Nov 21, 2024

Build failed

The build for commit did not succeed in branch w/8.6/improvement/CLDSRV-574/kms-healthcheck.

The following options are set: approve, create_integration_branches

@nicolas2bert nicolas2bert force-pushed the improvement/CLDSRV-574/kms-healthcheck branch from c320582 to 4352e97 Compare November 22, 2024 11:20
@bert-e
Copy link
Contributor

bert-e commented Nov 22, 2024

History mismatch

Merge commit #c32058209f34877e57ec0b35a7b02a9419f3a49b on the integration branch
w/7.70/improvement/CLDSRV-574/kms-healthcheck is merging a branch which is neither the current
branch improvement/CLDSRV-574/kms-healthcheck nor the development branch
development/7.70.

It is likely due to a rebase of the branch improvement/CLDSRV-574/kms-healthcheck and the
merge is not possible until all related w/* branches are deleted or updated.

Please use the reset command to have me reinitialize these branches.

The following options are set: approve, create_integration_branches

@nicolas2bert
Copy link
Contributor Author

@bert-e reset

@bert-e
Copy link
Contributor

bert-e commented Nov 22, 2024

Reset complete

I have successfully deleted this pull request's integration branches.

The following options are set: approve, create_integration_branches

@bert-e
Copy link
Contributor

bert-e commented Nov 22, 2024

Conflict

A conflict has been raised during the creation of
integration branch w/7.70/improvement/CLDSRV-574/kms-healthcheck with contents from improvement/CLDSRV-574/kms-healthcheck
and development/7.70.

I have not created the integration branch.

Here are the steps to resolve this conflict:

 $ git fetch
 $ git checkout -B w/7.70/improvement/CLDSRV-574/kms-healthcheck origin/development/7.70
 $ git merge origin/improvement/CLDSRV-574/kms-healthcheck
 $ # <intense conflict resolution>
 $ git commit
 $ git push -u origin w/7.70/improvement/CLDSRV-574/kms-healthcheck

The following options are set: approve, create_integration_branches

@bert-e
Copy link
Contributor

bert-e commented Nov 22, 2024

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/7.4

You can set option create_pull_requests if you need me to create
integration pull requests in addition to integration branches, with:

@bert-e create_pull_requests

The following options are set: approve, create_integration_branches

@bert-e
Copy link
Contributor

bert-e commented Nov 22, 2024

I have successfully merged the changeset of this pull request
into targetted development branches:

  • ✔️ development/7.10

  • ✔️ development/7.70

  • ✔️ development/8.6

  • ✔️ development/8.7

  • ✔️ development/8.8

The following branches have NOT changed:

  • development/7.4

Please check the status of the associated issue CLDSRV-574.

Goodbye nicolas2bert.

The following options are set: approve, create_integration_branches

@bert-e bert-e merged commit 4352e97 into development/7.10 Nov 22, 2024
10 checks passed
@bert-e bert-e deleted the improvement/CLDSRV-574/kms-healthcheck branch November 22, 2024 11:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants