Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the issue of hiding errors #435

Merged
merged 2 commits into from
Sep 11, 2024
Merged

Conversation

turkenf
Copy link
Collaborator

@turkenf turkenf commented Sep 9, 2024

Description of your changes

We encountered an issue where error messages were hidden/not visible in recent provider releases. When we try to create a resource with incorrect or incomplete configuration, it gets stuck at creating false. See the discussion here.

Since recoverIfPanic is executed first, we pass ph.err as nil to the finishing operations, which causes the error to be hidden.

This PR removes ph.err = nil from the recoverIfPanic func.

I have:

  • Read and followed Upjet's contribution process.
  • Run make reviewable to ensure this PR is ready for review.
  • Added backport release-x.y labels to auto-backport this PR if necessary.

How has this code been tested

After running make run in the main branch of crossplane-contrib/provider-upjet-gcp apply the following resource below:

apiVersion: cloudplatform.gcp.upbound.io/v1beta1
kind: ProjectIAMMember
metadata:
  name: project-iam-member2
spec:
  forProvider:
    member: <redacted>
    role: "roles/viewer"
    project: xxx
  • Before this fix, the creating status gets stuck at false
  conditions:
  - lastTransitionTime: "2024-09-09T17:11:57Z"
    reason: Creating
    status: "False"
    type: Ready
  - lastTransitionTime: "2024-09-09T17:11:57Z"
    reason: ReconcileSuccess
    status: "True"
    type: Synced
  - lastTransitionTime: "2024-09-09T17:12:01Z"
    reason: Success
    status: "True"
    type: LastAsyncOperation
  • After go mod edit -replace=github.com/crossplane/upjet=github.com/turkenf/upjet@fix-nil-err
  conditions:
  - lastTransitionTime: "2024-09-09T17:22:07Z"
    reason: Creating
    status: "False"
    type: Ready
  - lastTransitionTime: "2024-09-09T17:23:56Z"
    message: 'create failed: async create failed: failed to create the resource: [{0
      Request `Create IAM Members roles/viewer serviceAccount:<redacted>
      for project "xxx"` returned error: Error retrieving IAM policy for project "xxx":
      googleapi: Error 403: The caller does not have permission, forbidden  []}]'
    reason: ReconcileError
    status: "False"
    type: Synced
  - lastTransitionTime: "2024-09-09T17:23:56Z"
    message: 'async create failed: failed to create the resource: [{0 Request `Create
      IAM Members roles/viewer serviceAccount:<redacted>
      for project "xxx"` returned error: Error retrieving IAM policy for project "xxx":
      googleapi: Error 403: The caller does not have permission, forbidden  []}]'
    reason: AsyncCreateFailure
    status: "False"
    type: LastAsyncOperation

Signed-off-by: Fatih Türken <[email protected]>
@turkenf turkenf marked this pull request as ready for review September 9, 2024 17:28
Copy link
Member

@mergenci mergenci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @turkenf for discovering and addressing this critical issue. Explaining how the error behaves in the documentation might be helpful. Unfortunately, GitHub doesn't allow suggesting edits to arbitrary sections of code. So I'm copying the full updated recoverIfPanic() documentation below (only the second and the third sentences are new):

// recoverIfPanic recovers from panics, if any. Upon recovery, the
// error is set to a recovery message. Otherwise, the error is left
// unmodified. Calls to this function should be defferred directly:
// `defer ph.recoverIfPanic()`. Panic recovery won't work if the call
// is wrapped in another function call, such as `defer func() {
// ph.recoverIfPanic() }()`. On recovery, API machinery panic handlers
// run. The implementation follows the outline of panic recovery
// mechanism in controller-runtime:
// https://github.com/kubernetes-sigs/controller-runtime/blob/v0.17.3/pkg/internal/controller/controller.go#L105-L112

Copy link
Member

@mergenci mergenci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @turkenf. LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants