[BUG] Agent falls back to TCP log submission even if the site does not support it #31014

ollien · 2024-11-12T21:28:08Z

Agent Environment

Agent 7.55.1 - Commit: 8ec9dff - Serialization version: v5.0.119 - Go version: go1.21.11

Describe what happened:

In production, we observed a number of our agent instances failing to resolve agent-intake.logs.us5.datadoghq.com. That host does not exist, since US5 does not support TCP log submission. As a result, our logs were not ingested.

2024-11-12 14:32:20 UTC | CORE | WARN | (pkg/logs/client/tcp/connection_manager.go:108 in NewConnection) | dial tcp: lookup agent-intake.logs.us5.datadoghq.com: no such host

After some investigation, it seems that our agents failed the HTTP health check at startup and fell back to TCP:

2024-11-12 14:32:20 UTC | CORE | WARN | (pkg/logs/client/http/destination.go:442 in CheckConnectivity) | HTTP connectivity failure: Post "https://agent-http-intake.logs.us5.datadoghq.com/api/v2/logs": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2024-11-12 14:32:20 UTC | CORE | WARN | (comp/logs/agent/config/config.go:120 in BuildEndpointsWithConfig) | You are currently sending Logs to Datadog through TCP (either because logs_config.force_use_tcp or logs_config.socks5_proxy_address is set or the HTTP connectivity test has failed) To benefit from increased reliability and better network performances, we strongly encourage switching over to compressed HTTPS which is now the default protocol.

Describe what you expected:

I would not expect the agent to fall back to a TCP endpoint that does not exist. If US5 does not support TCP, then the agent should act as if force_http is enabled, or fail loudly in some other way.

Steps to reproduce the issue:
I don't have explicit steps to reproduce this, but if you can make the HTTP probe fail in some way (perhaps with an iptables rule that drops the traffic so the probe times out), the agent ends up in this state.

Additional environment details (Operating System, Cloud provider, etc): The container image used is gcr.io/datadoghq/agent:7.55.1

ollien added the team/triage label Nov 12, 2024
