fix: Shut down zombie goroutine in chronicleexporter #2029

Merged
merged 3 commits into release/v1.67.0 from fix/chronicle-zombie-goroutine
Dec 9, 2024

mrsillydog
Contributor

Proposed Change

While testing another fix by spinning up a chronicle exporter, sending a log to Chronicle, and then shutting down the exporter, I encountered this error:

[image: screenshot of the error]

I was understandably confused, so I asked Dan Jaglowski for help. He figured it out and came up with this solution after hours, along with some other refactors. The explanation and fix are all his; I'm just opening the PR.

To fix this, the Shutdown function actually needs to be:

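func (ce *chronicleExporter) Shutdown(context.Context) error {
	if ce.cfg.Protocol == protocolHTTPS {
		// The client built by oauth2.NewClient wraps its transport in an *oauth2.Transport.
		t := ce.httpClient.Transport.(*oauth2.Transport)
		if t.Base != nil {
			t.Base.(*http.Transport).CloseIdleConnections()
		} else {
			// A nil Base means requests went through http.DefaultTransport.
			http.DefaultTransport.(*http.Transport).CloseIdleConnections()
		}
		return nil
	}

	// gRPC: cancel the exporter's context and wait on its WaitGroup
	// before closing the connection.
	ce.cancel()
	ce.wg.Wait()
	if ce.grpcConn != nil {
		if err := ce.grpcConn.Close(); err != nil {
			return fmt.Errorf("connection close: %s", err)
		}
	}
	return nil
}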
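
For readers unfamiliar with the cancel-then-wait idiom used in the gRPC branch, here is a minimal standalone sketch of it. The worker type and its fields are hypothetical and are not the exporter's actual code; the point is only that Shutdown cancels the context and then blocks on the WaitGroup, so the background goroutine has exited by the time Shutdown returns.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// worker is a hypothetical component that owns one background goroutine.
type worker struct {
	cancel context.CancelFunc
	wg     sync.WaitGroup
	exited bool // written by the goroutine, read only after wg.Wait
}

// Start launches the background loop and remembers how to stop it.
func (w *worker) Start() {
	ctx, cancel := context.WithCancel(context.Background())
	w.cancel = cancel
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		ticker := time.NewTicker(time.Millisecond)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				// Cancellation is the only way out of the loop.
				w.exited = true
				return
			case <-ticker.C:
				// Periodic work would go here.
			}
		}
	}()
}

// Shutdown signals the goroutine to stop and waits until it has returned.
func (w *worker) Shutdown() {
	w.cancel()
	w.wg.Wait()
}

func main() {
	w := &worker{}
	w.Start()
	w.Shutdown()
	fmt.Println("goroutine exited:", w.exited) // prints "goroutine exited: true"
}
```

Skipping either step leaves a gap: without cancel the loop never ends, and without Wait the caller may return while the goroutine is still running.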

Larger explanation:

In Start, when we instantiate the httpClient with ce.httpClient = oauth2.NewClient(context.Background(), creds.TokenSource), it doesn't actually matter what context we pass in. It isn't used for cancellation.

What we get back is always an *http.Client whose Transport field is of type *oauth2.Transport. That transport in turn always has a nil Base, which means it will use http.DefaultTransport. The thing with http.DefaultTransport (as well as many other transports) is that it reuses connections by keeping them in a "keep alive" state. The only way to clean these up is to call CloseIdleConnections() on the Transport. However, because we're getting an *oauth2.Transport, which doesn't itself have a CloseIdleConnections() method, we have to access http.DefaultTransport directly and call CloseIdleConnections() on it.
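
The unwrap-and-fall-back logic described above can be sketched with the standard library alone. In this sketch, wrappingTransport is a hypothetical stand-in for a wrapper like *oauth2.Transport (a Base field where nil means http.DefaultTransport), and closeIdle and countingTransport are illustrative names, not part of the exporter. Unlike the Shutdown code in this PR, it uses checked interface assertions rather than asserting *http.Transport directly.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// idleCloser is satisfied by *http.Transport and by any other transport
// that can drop its keep-alive connections.
type idleCloser interface {
	CloseIdleConnections()
}

// wrappingTransport is a hypothetical stand-in for a wrapper transport:
// it delegates to Base, and a nil Base means http.DefaultTransport.
type wrappingTransport struct {
	Base http.RoundTripper
}

func (w *wrappingTransport) base() http.RoundTripper {
	if w.Base != nil {
		return w.Base
	}
	return http.DefaultTransport
}

func (w *wrappingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	return w.base().RoundTrip(req)
}

// closeIdle unwraps rt if it is a wrappingTransport, then closes idle
// connections on the underlying transport. It reports whether the
// underlying transport supported the operation.
func closeIdle(rt http.RoundTripper) bool {
	if w, ok := rt.(*wrappingTransport); ok {
		rt = w.base()
	}
	c, ok := rt.(idleCloser)
	if !ok {
		return false
	}
	c.CloseIdleConnections()
	return true
}

// countingTransport is a fake used only to observe the call.
type countingTransport struct{ closed int }

func (c *countingTransport) RoundTrip(*http.Request) (*http.Response, error) {
	return nil, errors.New("countingTransport: not a real transport")
}

func (c *countingTransport) CloseIdleConnections() { c.closed++ }

func main() {
	fake := &countingTransport{}
	fmt.Println(closeIdle(&wrappingTransport{Base: fake}), fake.closed) // prints "true 1"
	// With a nil Base the call falls through to http.DefaultTransport.
	fmt.Println(closeIdle(&wrappingTransport{})) // prints "true"
}
```

The checked assertion trades a panic for a boolean result, which a caller could log if the underlying transport turns out not to support closing idle connections.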

If you would like a test case to reproduce this, please contact me for one - not sure we have one that doesn't involve actually sending data to Chronicle.

Checklist
  • Changes are tested
  • CI has passed

@mrsillydog mrsillydog requested review from dpaasman00 and a team as code owners December 5, 2024 14:40
@tbm48813

tbm48813 commented Dec 5, 2024

Tested both gRPC and HTTPS. Installed the adapter fresh and collected successfully on both. Restarted several times. No errors were logged on the collector; everything looks great.

@mrsillydog
Contributor Author

mrsillydog commented Dec 5, 2024

While testing gRPC, I discovered that it had the same issue. We need an integration test or two around this to ensure it doesn't crop up again, but it should be fixed now. Much credit to Dan again.

@mrsillydog mrsillydog merged commit ffe69f0 into release/v1.67.0 Dec 9, 2024
15 checks passed
@mrsillydog mrsillydog deleted the fix/chronicle-zombie-goroutine branch December 9, 2024 15:01