CLDSRV-574 implement KMS health check #5697
Hello nicolas2bert,
My role is to assist you with the merge of this
Available options
Available commands
Status report is not available.

Incorrect fix version
The
Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:
Please check the

ping

Request integration branches
Waiting for integration branch creation to be requested by the user. To request integration branches, please comment on this pull request with the following command:
Alternatively, the
07179dd
to
e3fcce6
Compare
/create_integration_branches

Integration data created
I have created the integration data for the additional destination branches.
The following branches will NOT be impacted:
You can set option
The following options are set: create_integration_branches

Waiting for approval
The following approvals are needed before I can proceed with the merge:
The following options are set: create_integration_branches
I like that it's not gonna bring CS down :)
Incorrect fix version
The
Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:
Please check the
The following options are set: create_integration_branches
d18b7df
to
86149db
Compare
@bert-e approve

Incorrect fix version
The
Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:
Please check the
The following options are set: approve, create_integration_branches

ping

History mismatch
Merge commit #e3fcce6e3426bbbafb595646c9bfbc7dc833fb71 on the integration branch
It is likely due to a rebase of the branch
Please use the
The following options are set: approve, create_integration_branches

@bert-e reset

Reset complete
I have successfully deleted this pull request's integration branches.
The following options are set: approve, create_integration_branches

Integration data created
I have created the integration data for the additional destination branches.
The following branches will NOT be impacted:
You can set option
The following options are set: approve, create_integration_branches
86149db
to
e23e224
Compare
Branches have diverged
This pull request's source branch
To avoid any integration risks, please re-synchronize them using one of the
Note: If you choose to rebase, you may have to ask me to rebuild
The following options are set: approve, create_integration_branches

Incorrect fix version
The
Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:
Please check the
The following options are set: approve, create_integration_branches

ping

History mismatch
Merge commit #86149db5ac99a615297fe2889dd65ac2705cb41d on the integration branch
It is likely due to a rebase of the branch
Please use the
The following options are set: approve, create_integration_branches

@bert-e reset

Reset complete
I have successfully deleted this pull request's integration branches.
The following options are set: approve, create_integration_branches

Conflict
A conflict has been raised during the creation of
I have not created the integration branch.
Here are the steps to resolve this conflict:
$ git fetch
$ git checkout -B w/7.70/improvement/CLDSRV-574/kms-healthcheck origin/development/7.70
$ git merge origin/improvement/CLDSRV-574/kms-healthcheck
$ # <intense conflict resolution>
$ git commit
$ git push -u origin w/7.70/improvement/CLDSRV-574/kms-healthcheck
The following options are set: approve, create_integration_branches
e23e224
to
c320582
Compare
Integration data created
I have created the integration data for the additional destination branches.
The following branches will NOT be impacted:
You can set option
The following options are set: approve, create_integration_branches

Build failed
The build for commit did not succeed in branch w/8.6/improvement/CLDSRV-574/kms-healthcheck.
The following options are set: approve, create_integration_branches
c320582
to
4352e97
Compare
History mismatch
Merge commit #c32058209f34877e57ec0b35a7b02a9419f3a49b on the integration branch
It is likely due to a rebase of the branch
Please use the
The following options are set: approve, create_integration_branches

@bert-e reset

Reset complete
I have successfully deleted this pull request's integration branches.
The following options are set: approve, create_integration_branches

Conflict
A conflict has been raised during the creation of
I have not created the integration branch.
Here are the steps to resolve this conflict:
$ git fetch
$ git checkout -B w/7.70/improvement/CLDSRV-574/kms-healthcheck origin/development/7.70
$ git merge origin/improvement/CLDSRV-574/kms-healthcheck
$ # <intense conflict resolution>
$ git commit
$ git push -u origin w/7.70/improvement/CLDSRV-574/kms-healthcheck
The following options are set: approve, create_integration_branches

Integration data created
I have created the integration data for the additional destination branches.
The following branches will NOT be impacted:
You can set option
The following options are set: approve, create_integration_branches

I have successfully merged the changeset of this pull request
The following branches have NOT changed:
Please check the status of the associated issue CLDSRV-574.
Goodbye nicolas2bert.
The following options are set: approve, create_integration_branches
Context:
The cloudserver deep health check is used in S3C by:
- S3 Frontend (Nginx), to manage the distribution of incoming S3 API requests based on backend server health. It sends a deep health check call every 2 seconds and adjusts the distribution dynamically based on the results: healthy servers receive traffic, while unhealthy ones are temporarily removed from the pool to prevent failed requests.
- Prometheus, to monitor the health of bucketd. It sends a cloudserver deep health check call every 30 seconds: Prometheus queries blackbox, which retrieves the cloudserver health, which in turn includes the bucketd health.
Current behavior:
Currently, KMS (KMIP/AWS KMS) is not part of the cloudserver deep health check. As a result, the cloudserver health check does not fail when KMS is down, which is the intended behavior because only requests to encrypted buckets should fail. KMS being down should not prevent the server from starting up or running. (bugfix: https://scality.atlassian.net/browse/S3C-4833 )
Expected behavior:
We want KMS to be part of the cloudserver deep health check, but a KMS outage must still not cause the cloudserver health check to fail. Since only requests to encrypted buckets should fail when KMS is down, the server should continue to start up and run without issues.
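The expected behavior can be illustrated with a minimal sketch, written in Python for brevity and not taken from the cloudserver code. It assumes that each dependency exposes a probe that raises on failure and that the overall check answers 200 or 500; the names `deep_health_check` and `NON_CRITICAL` are hypothetical.

```python
from typing import Callable, Dict

# Assumption: components listed here are reported in the health body,
# but their failure never turns the overall check unhealthy.
NON_CRITICAL = {"kms"}


def deep_health_check(probes: Dict[str, Callable[[], None]]) -> dict:
    """Run every probe; a probe signals failure by raising an exception."""
    components = {}
    healthy = True
    for name, probe in probes.items():
        try:
            probe()
            components[name] = {"status": "ok"}
        except Exception as err:
            components[name] = {"status": "error", "message": str(err)}
            if name not in NON_CRITICAL:
                healthy = False
    # Hypothetical status codes: 200 when all critical components are up.
    return {"status_code": 200 if healthy else 500, "components": components}


def bucketd_ok() -> None:
    """Stand-in probe for a healthy bucketd."""


def kms_down() -> None:
    """Stand-in probe for an unreachable KMS."""
    raise ConnectionError("KMS unreachable")


report = deep_health_check({"bucketd": bucketd_ok, "kms": kms_down})
```

In this sketch the KMS failure is visible in `report["components"]["kms"]`, so monitoring can alert on it, while the overall status stays healthy and the server keeps receiving traffic.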
Why do we need to check for the KMS health?
- We want to be alerted if KMS is down so that we can take appropriate action.
- As a side effect, including KMS in the health check helps keep the connection alive. Some KMS providers require maintaining an active connection: for example, when the cloudserver connection to KMS remains idle for more than 24 hours, the internal Thales token expires. (ref: https://scality.atlassian.net/browse/S3C-8464 )
Notes:
- Given how frequently the cloudserver deep health check is performed, successful KMS requests should be cached, with a cache expiration time of less than 24 hours so that the KMS connection stays alive. Caching lowers the number of requests made to the KMS service, so fewer system resources are used. If KMS requests have a cost, it also saves some.
- This approach is not a proper design for handling connection maintenance but rather a good side effect of integrating KMS into the health check. To handle connection “upkeep” properly, we should clearly separate both concerns: health checks vs connection maintenance.
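The caching note above could look roughly like the following sketch, again in Python and not the actual cloudserver implementation. It assumes only successful results are cached and that a failed probe is retried on the next call; the class name `CachedKmsHealth`, the one-hour expiration, and the injectable `clock` are hypothetical choices for illustration.

```python
import time
from typing import Callable, Optional


class CachedKmsHealth:
    """Caches the last successful KMS probe for ttl_seconds."""

    def __init__(self, probe: Callable[[], None], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._probe = probe
        self._ttl = ttl_seconds
        self._clock = clock
        self._last_success: Optional[float] = None

    def check(self) -> dict:
        now = self._clock()
        fresh = (self._last_success is not None
                 and now - self._last_success < self._ttl)
        if fresh:
            # Still within the expiration window: no request is sent to KMS.
            return {"status": "ok", "cached": True}
        try:
            self._probe()
        except Exception as err:
            # Failures are not cached, so the next call probes KMS again.
            return {"status": "error", "message": str(err), "cached": False}
        self._last_success = now
        return {"status": "ok", "cached": False}


# Hypothetical expiration of one hour, well under the 24-hour idle limit.
kms_health = CachedKmsHealth(probe=lambda: None, ttl_seconds=3600)
first = kms_health.check()
second = kms_health.check()
```

With a health check arriving every 2 seconds, this sends at most one real KMS request per expiration window while KMS is healthy, which is still frequent enough to keep the connection from sitting idle for 24 hours.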