I noticed a "double restart" on the search head cluster last night, and the reason is obvious now: the .tgz files in the S3 bucket were updated with slightly different timestamps.
For example, 09:08:17 and 09:08:20 (these were caught by deployment 1); unfortunately, the 100MB tgz file was 09:08:35, 09:08:36, 09:08:37 (caught by deployment 2).
Expected behavior
Ideally, the appframework would run, download any updates, and perhaps have a 2nd check 30-60 seconds later for any "additional" changes before applying the bundle.
Instead, in my case, all 4 nodes of the SHC restarted, then the operator detected that there were more updates and re-applied the bundle, resulting in a 2nd rolling restart.
Splunk setup on K8S
K8s 1.28
Reproduction/Testing steps
This would be a little tricky to reproduce. You would need to get the files into S3 and have the operator start deploying, then upload more files seconds later or in the middle of the existing deployment.
K8s environment
K8s 1.28, mostly vanilla setup.
Proposed changes(optional)
Could I specify a delay before the bundle push occurs to allow a "2nd check" that no other apps are required? Otherwise I risk a "double restart" which involves some level of outage in an SHC.
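To illustrate the kind of "2nd check" I mean, here is a rough sketch in Python. It is only an illustration of the idea, not how the operator is implemented: `wait_until_settled`, `list_remote_apps`, `settle_seconds`, and `max_checks` are all hypothetical names I made up, not existing operator settings. The idea is to re-list the app repo after a delay and only push the bundle once two consecutive listings match.

```python
import time
from typing import Callable, Dict

# Hypothetical shape: object key of each app archive -> a change marker
# such as its last-modified timestamp or ETag.
Listing = Dict[str, str]


def wait_until_settled(
    list_remote_apps: Callable[[], Listing],  # hypothetical callback that lists the repo
    settle_seconds: float = 45.0,             # hypothetical delay between checks
    max_checks: int = 5,                      # upper bound so a busy repo cannot block forever
    sleep: Callable[[float], None] = time.sleep,
) -> Listing:
    """Re-list the app repo until two consecutive listings are identical.

    Returns the last listing seen. If the repo is still changing after
    max_checks re-checks, the most recent listing is returned anyway.
    """
    previous = list_remote_apps()
    for _ in range(max_checks):
        sleep(settle_seconds)
        current = list_remote_apps()
        if current == previous:
            return current      # nothing changed during the settle window
        previous = current      # more uploads arrived, wait another window
    return previous
```

With the timestamps from my case, the first listing would contain only the small apps, the re-check 45 seconds later would pick up the 100MB tgz, and a further re-check would confirm nothing else changed, so the bundle would be pushed once with everything in it.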
Additional context(optional)
This is more of an enhancement as it's not going to be the most common case...
For now I'll just increase appsRepoPollIntervalSeconds to reduce the chance of this happening.
Hello @gjanders,
We will investigate this issue and attempt to reproduce it on our end. In the meantime, you can perform a manual app update, which allows you to customize when the update occurs. Additionally, we are working on enhancing the manual update feature to enable app updates for specific CR types using a ConfigMap.
Thank you for your patience and understanding.