feat: Use actual deployment replica count instead of env variable for accuracy #20041
base: master
85f0376 to 53cabce
controller/sharding/sharding.go
Outdated
return nil, fmt.Errorf("(dynamic cluster distribution) failed to get app controller deployment: %w", err)
}
applicationControllerName := env.StringFromEnv(common.EnvAppControllerName, common.DefaultApplicationControllerName)
appControllerDeployment, err := kubeClient.AppsV1().Deployments(settingsMgr.GetNamespace()).Get(context.Background(), applicationControllerName, metav1.GetOptions{})
The controller can be a Deployment or a StatefulSet. I am not sure whether enableDynamicClusterDistribution must be true when using a Deployment. If it is not possible, I don't think we should perform 2 k8s API calls for a warning log to help detect a misconfiguration, considering that 1 call will always fail.
My issue was that I attempted sharding with enableDynamicClusterDistribution set to false while the actual Deployment replica count was smaller than the replica count in the sharding configuration. This caused a problem that was difficult to debug, since nothing was logged. Because the Argo CD application controller typically runs as a Deployment, I believe it would be more accurate to retrieve the actual replica count from the Kubernetes API, even when enableDynamicClusterDistribution is set to false. This would help prevent discrepancies between the sharding configuration and the actual replica count, and make debugging easier. Would this be a suitable implementation? I would appreciate any feedback or suggestions.
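To illustrate why such a mismatch is harmful, here is a minimal sketch. It assumes a simplified model in which replica i serves shard i; the function name `unservedShards` is hypothetical and is not part of the Argo CD code base.

```go
package main

import "fmt"

// unservedShards returns the shard numbers that no running replica covers.
// Assumption for this sketch: replica i serves shard i, so every shard in
// the range [running, configured) has no controller working on it.
func unservedShards(configured, running int) []int {
	var orphaned []int
	for shard := running; shard < configured; shard++ {
		orphaned = append(orphaned, shard)
	}
	return orphaned
}

func main() {
	// Sharding is configured for 3 replicas, but only 2 are actually running.
	configured, running := 3, 2
	if orphaned := unservedShards(configured, running); len(orphaned) > 0 {
		fmt.Printf("warning: sharding expects %d replicas but %d are running; unserved shards: %v\n",
			configured, running, orphaned)
	}
}
```

Under that assumption, any cluster assigned to an unserved shard would not be reconciled, which is why a warning at startup would have made the problem easier to find.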
I encountered an issue when attempting sharding with enableDynamicClusterDistribution set to false. The root cause was my incorrect assumption that the Argo CD application controller always runs as a Deployment; I overlooked that it can also run as a StatefulSet.
In this scenario, even with enableDynamicClusterDistribution disabled, an error occurred due to a mismatch between the actual replica count and the sharding configuration. To prevent such discrepancies and make debugging easier, I have revised the code to account for both StatefulSet and Deployment replica counts. This change should help avoid issues arising from misconfiguration.
Let me know what you think.
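The idea of checking both workload kinds can be sketched as follows. This is not the code from the PR: the `replicaLookup` type, `errNotFound`, and `controllerReplicas` are hypothetical stand-ins for the real Kubernetes client calls, so that the example runs without a cluster.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a Kubernetes "not found" API error (hypothetical).
var errNotFound = errors.New("not found")

// replicaLookup is a hypothetical abstraction over an API call that returns
// the configured replica count of a named workload.
type replicaLookup func(name string) (int, error)

// controllerReplicas tries the StatefulSet first and falls back to the
// Deployment only when the StatefulSet does not exist. Any other error is
// returned to the caller instead of being hidden.
func controllerReplicas(name string, statefulSet, deployment replicaLookup) (int, string, error) {
	n, err := statefulSet(name)
	if err == nil {
		return n, "StatefulSet", nil
	}
	if !errors.Is(err, errNotFound) {
		return 0, "", fmt.Errorf("failed to get StatefulSet %q: %w", name, err)
	}
	n, err = deployment(name)
	if err != nil {
		return 0, "", fmt.Errorf("failed to get Deployment %q: %w", name, err)
	}
	return n, "Deployment", nil
}

func main() {
	// Fake lookups: here the controller runs as a Deployment with 3 replicas.
	statefulSet := func(string) (int, error) { return 0, errNotFound }
	deployment := func(string) (int, error) { return 3, nil }

	// The workload name is illustrative only.
	n, kind, err := controllerReplicas("argocd-application-controller", statefulSet, deployment)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s has %d replicas\n", kind, n)
}
```

Note that this still costs two API calls whenever the controller runs as a Deployment, which is the trade-off raised earlier in the review.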
bb33596 to 8b8db9f
Signed-off-by: nueavv <nuguni@kakao.com>
appControllerStatefulSet, err := kubeClient.AppsV1().StatefulSets(namespace).Get(context.Background(), applicationControllerName, metav1.GetOptions{})
if err != nil {
replicasCount = 1
log.Warnf("Failed to retrieve StatefulSet '%s'. Defaulting replicasCount to 1.", applicationControllerName)
This seems like it won't work with the code below, e.g. if the number of replicas is > 1 and the call fails.
if err != nil {
replicasCount = 1
log.Warnf("Failed to retrieve StatefulSet '%s'. Defaulting replicasCount to 1.", applicationControllerName)
As discussed in the issue, if the call fails, we should default to the environment variable. We cannot assume that the call will always work, and defaulting to 1 is not accurate.
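A minimal sketch of the suggested fallback is shown below. It is not the PR's implementation: `resolveReplicas` and `errLookupFailed` are hypothetical names, the lookup is passed in as a plain function so the example runs without a cluster, and the environment variable name is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
)

// envControllerReplicas is the variable name assumed for this sketch.
const envControllerReplicas = "ARGOCD_CONTROLLER_REPLICAS"

// errLookupFailed simulates a failed Kubernetes API call (hypothetical).
var errLookupFailed = errors.New("lookup failed")

// resolveReplicas prefers the replica count reported by lookup. If the lookup
// fails, it falls back to the value taken from the environment, and only uses
// 1 when that value is missing or invalid.
func resolveReplicas(lookup func() (int, error), envValue string) int {
	n, err := lookup()
	if err == nil {
		return n
	}
	fmt.Printf("warning: replica lookup failed (%v); falling back to environment value %q\n", err, envValue)

	fallback, convErr := strconv.Atoi(envValue)
	if convErr != nil || fallback < 1 {
		return 1
	}
	return fallback
}

func main() {
	failing := func() (int, error) { return 0, errLookupFailed }
	fmt.Println(resolveReplicas(failing, os.Getenv(envControllerReplicas)))
}
```

With this ordering, a transient API failure on a multi-replica installation keeps the configured count instead of silently collapsing to a single shard.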
fixes #19928
Checklist: