-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(appset): Fix infinite loop caused by not ignoring statuses updates #19676
base: master
Are you sure you want to change the base?
fix(appset): Fix infinite loop caused by not ignoring statuses updates #19676
Conversation
🔴 Preview Environment stopped on BunnyshellSee: Environment Details | Pipeline Logs Available commands (reply to this comment):
|
✅ Preview Environment created on Bunnyshell but will not be auto-deployedSee: Environment Details Available commands (reply to this comment):
|
This commit fixes a bug where the appset controller updating the status of the resource itself, was causing it be notified for a new reconcile loop. This happened, because, altough the individual resource (app) statuses were not changing, the order of the list was, as such, resulting in an updated appset resource. The fix, adds a GenerationPredicate to the Watch on the appSet to guarantee that if no actual change to the AppSet change was done, then it can be ignored. Signed-off-by: João Oliveirinha <joliveirinha@gmail.com>
a1c0196
to
70c5e4b
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #19676 +/- ##
=========================================
Coverage ? 55.91%
=========================================
Files ? 316
Lines ? 43794
Branches ? 0
=========================================
Hits ? 24487
Misses ? 16762
Partials ? 2545 ☔ View full report in Codecov by Sentry. |
Thank you for the detailed description and the PR, @joliveirinha. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change looks good, a test would ensure we don't regress
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add tests and it looks 🔥
Sorry for the late response. Only had time to look at this now. It doesn't seem straight forward to implement a regression test for this. It seems the only possibility is to use envtest instead. I maybe missing something since this is my first time looking at this. |
Another option is to ensure order of the |
Hmm I have mixed feelings about that.
For 2) the current fix here proposed, gives all the flexibility for the implementation to decide what is the best requeue time. If we go via the direction you are suggesting, then a status update that changes the actual status content, will always trigger right away the notification from the runtime, which means you have less flexibility. You may argue, that it will only happen once because in the next iteration the status is probably the same, still.... To summarize, what I suggest, since the fix is small and relatively obvious, is for us to merge this PR and eventually when the project has integrations tests with the envtest, then it can cover this predicate and the rest that already exist to validate the runtime scenarios. Also, because this PR most probably also fix other related issues related to the progressive sync. |
But if you are using SCM Provider and someone creates a new repo, then it should reconcile if a new application is added? |
Yes. The fix doesn't change that. normal requeueing based on the configured requeue time will still happen. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many cases where the behavior of the appset controller may depend on status updates adding back the object to the queue. One that comes to my mind is the progressive delivery. There will be less update events, so less calls to shouldRequeueApplicationSet
. What are the implication of that? hard to know.
These kinds of scenario are usually coverer by the E2E test suite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @agaudreault . Argo CD application controller logic relies on reconciliation after changing the status. It is not ideal, but I'm afraid the AppicationSet controller is the same. We might open a can of worms by disabling reconciliation on status change.
I would implement this suggestion instead.
This commit fixes a bug where the appset controller updating the status of the resource itself, was causing it be notified for a new reconcile loop.
This happened, because, altough the individual resource (app) statuses were not changing, the order of the list was, as such, resulting in an updated appset resource.
The fix, adds a GenerationPredicate to the Watch on the appSet to guarantee that if no actual change to the AppSet change was done, then it can be ignored.
Fixes [#19675]
Checklist: