ARO-3258: propagate errors of ARO PullSecret controller to ARO operator #3947

Open
wants to merge 4 commits into
base: master

Collaborator

@lbossis lbossis commented Nov 8, 2024

Which issue this PR addresses:

Fixes https://issues.redhat.com/browse/ARO-3258

What this PR does / why we need it:

Errors from the ARO PullSecret controller need to be propagated to the ARO operator so that they are visible at the operator level and all controller errors can be seen in one place.

Test plan for issue:

[x] Unit Test Cases
[x] Local Cluster Creation
[] CI
[] E2E

Are there unit tests?

Yes, https://github.com/Azure/ARO-RP/blob/master/pkg/operator/controllers/pullsecret/pullsecret_controller_test.go

How do you know this will function as expected in production?

By running all unit tests locally and verifying the test output.

Is there any documentation that needs to be updated for this PR?

N/A

Collaborator Author

lbossis commented Nov 8, 2024

@microsoft-github-policy-service agree company="Red Hat"

Collaborator

@cadenmarchese cadenmarchese left a comment

first pass looks good, just a few suggestions. thanks!

return reconcile.Result{}, err
}

// fix pull secret if its broken to have at least the ARO pull secret
userSecret, err = r.ensureGlobalPullSecret(ctx, operatorSecret, userSecret)
if err != nil {
r.Log.Error(err)
Collaborator

We probably want to SetDegraded here if we can't ensure the global pull secret. Here's an example: https://github.com/Azure/ARO-RP/pull/3177/files#diff-17756e345e334809c4f7ea3663119196d4de8e8085aca1b0da6a96e708d5d251R84

Collaborator

Resolving: we agreed in the Jira ticket not to set Degraded as part of this PR, since it could impact users' ability to rotate their pull secret.

Collaborator

We can safely SetDegraded here without impacting any functionality. The controller will still continue to execute as it would normally, and the controller Degraded status will (as of right now) not be cascaded to the ARO Cluster Operator itself, so this controller being degraded will not e.g. block cluster upgrades.

@@ -285,11 +287,11 @@ func TestPullSecretReconciler(t *testing.T) {
ctx := context.Background()

clientFake := ctrlfake.NewClientBuilder().WithObjects(tt.instance).WithObjects(tt.secrets...).Build()
assert.NotNil(t, clientFake)
Collaborator

Do we need to add these extra assert calls here? The test works as-is without them added.

Collaborator Author

These assertions surface a nil object wherever that object is supposed to be non-nil, so the test case fails early.

@cadenmarchese cadenmarchese added the chainsaw label (Pull requests or issues owned by Team Chainsaw) Nov 11, 2024
fahlmant previously approved these changes Nov 12, 2024
Collaborator

@fahlmant fahlmant left a comment

lgtm

slawande2 previously approved these changes Nov 13, 2024
Collaborator

@tsatam tsatam left a comment

Few fixups and additions to our test cases, otherwise LGTM!

return reconcile.Result{}, err
}

-err = r.client.Update(ctx, instance)
+err = r.Client.Update(ctx, instance)
return reconcile.Result{}, err
Collaborator

We should check whether err is nil here, and:

  • if err is nil, call the ClearConditions function to reset the controller back to Available: True, Progressing: False, Degraded: False
  • if err is not nil, call SetDegraded to set the degraded status.
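The two branches described above can be sketched roughly as follows. This is a minimal illustration only: the `Condition` and `controllerStatus` types, the `finishReconcile` helper, and the `errUpdateFailed` value are hypothetical stand-ins, and the `ClearConditions` and `SetDegraded` methods here are simplified versions that do not reproduce the signatures of the real ARO-RP functions of the same name.

```go
package main

import "errors"

// ConditionStatus and Condition are simplified, hypothetical stand-ins
// for the operator's real status-condition types.
type ConditionStatus string

const (
	ConditionTrue  ConditionStatus = "True"
	ConditionFalse ConditionStatus = "False"
)

type Condition struct {
	Type    string
	Status  ConditionStatus
	Message string
}

// controllerStatus holds the three conditions a controller reports.
type controllerStatus struct {
	Available   Condition
	Progressing Condition
	Degraded    Condition
}

// errUpdateFailed is a hypothetical error used only for illustration.
var errUpdateFailed = errors.New("update failed")

// ClearConditions resets the controller to
// Available: True, Progressing: False, Degraded: False.
func (s *controllerStatus) ClearConditions() {
	s.Available = Condition{Type: "Available", Status: ConditionTrue}
	s.Progressing = Condition{Type: "Progressing", Status: ConditionFalse}
	s.Degraded = Condition{Type: "Degraded", Status: ConditionFalse}
}

// SetDegraded records the error on the Degraded condition and leaves
// the other conditions untouched.
func (s *controllerStatus) SetDegraded(err error) {
	s.Degraded = Condition{Type: "Degraded", Status: ConditionTrue, Message: err.Error()}
}

// finishReconcile applies the rule from the review comment: clear the
// conditions on success, set Degraded on failure. The original error is
// returned unchanged, so the caller's retry behaviour is not affected.
func finishReconcile(s *controllerStatus, err error) error {
	if err != nil {
		s.SetDegraded(err)
		return err
	}
	s.ClearConditions()
	return nil
}
```

In this sketch, a reconciler would end with something like `return reconcile.Result{}, finishReconcile(status, err)`, so the status update is a side effect and the returned error stays the same.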

@@ -285,11 +286,11 @@ func TestPullSecretReconciler(t *testing.T) {
ctx := context.Background()

clientFake := ctrlfake.NewClientBuilder().WithObjects(tt.instance).WithObjects(tt.secrets...).Build()
Collaborator

We should also add validation in these tests to ensure that the conditions on the controller are what we expect (e.g. Degraded:True when an error occurs). You can see an example of this in the dnsmasq machine config controller tests: https://github.com/Azure/ARO-RP/blob/master/pkg/operator/controllers/dnsmasq/machineconfig_controller_test.go#L36
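One way such a check could look, as a sketch under stated assumptions rather than the pattern the linked dnsmasq test actually uses: every name below (`Condition`, `fakeReconcile`, `checkConditions`, `testCase`, `exampleCases`, `runCases`) is hypothetical, and the reconciler is replaced by a stub so the example stays self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// Condition is a simplified, hypothetical stand-in for the operator's
// status condition type.
type Condition struct {
	Type   string
	Status string
}

// fakeReconcile stands in for the reconciler under test (hypothetical).
// It returns the conditions the controller would report after one pass.
func fakeReconcile(fail bool) ([]Condition, error) {
	if fail {
		return []Condition{{"Available", "True"}, {"Progressing", "False"}, {"Degraded", "True"}},
			errors.New("secret update failed")
	}
	return []Condition{{"Available", "True"}, {"Progressing", "False"}, {"Degraded", "False"}}, nil
}

// checkConditions reports the first expected condition that is missing
// from got or has a different status.
func checkConditions(got, want []Condition) error {
	byType := make(map[string]string, len(got))
	for _, c := range got {
		byType[c.Type] = c.Status
	}
	for _, w := range want {
		status, ok := byType[w.Type]
		if !ok {
			return fmt.Errorf("condition %q not found", w.Type)
		}
		if status != w.Status {
			return fmt.Errorf("condition %q: got %q, want %q", w.Type, status, w.Status)
		}
	}
	return nil
}

// testCase pairs an input scenario with the conditions expected afterwards.
type testCase struct {
	name           string
	fail           bool
	wantErr        bool
	wantConditions []Condition
}

// exampleCases covers the success path and the error path.
func exampleCases() []testCase {
	return []testCase{
		{name: "success clears Degraded", wantConditions: []Condition{{"Degraded", "False"}}},
		{name: "error sets Degraded", fail: true, wantErr: true, wantConditions: []Condition{{"Degraded", "True"}}},
	}
}

// runCases runs each case and returns the first failure, if any.
func runCases(cases []testCase) error {
	for _, tc := range cases {
		got, err := fakeReconcile(tc.fail)
		if (err != nil) != tc.wantErr {
			return fmt.Errorf("%s: unexpected error state: %v", tc.name, err)
		}
		if err := checkConditions(got, tc.wantConditions); err != nil {
			return fmt.Errorf("%s: %w", tc.name, err)
		}
	}
	return nil
}
```

In a real test, the loop body would sit inside `t.Run(tt.name, ...)` and read the conditions back from the fake client after calling `Reconcile`, instead of from a stub.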

@lbossis lbossis dismissed stale reviews from slawande2 and fahlmant via f2991ab November 15, 2024 20:04
Labels
chainsaw Pull requests or issues owned by Team Chainsaw
5 participants