Releases: DataDog/datadog-agent

7.38.2

10 Aug 16:09
ba442fd

Prelude

Release on: 2022-08-10

Bug Fixes

  • Fixes a bug that caused the Agent to create many zombie (defunct) processes. The bug occurred only with the 7.38.x Docker images when the containerized Agent was launched without hostPID: true.

7.38.1

03 Aug 08:58
7dad1dd

Prelude

Release on: 2022-08-02

Bug Fixes

  • Fixes CWS rules with the 'process.file.name != ""' expression.

Datadog Cluster Agent 1.22.0

26 Jul 10:42
37c7984

Prelude

Released on: 2022-07-26
Pinned to datadog-agent v7.38.0: CHANGELOG

New Features

  • Enable collection of Ingresses by default in the orchestrator check.

7.38.0

25 Jul 14:17

Prelude

Release on: 2022-07-25

New Features

  • Add a NetFlow feature that listens for NetFlow traffic and forwards it to Datadog.
  • The CWS agent now supports filtering events depending on whether they are performed by a thread. A process is considered a thread if it's a child process that hasn't executed another program.
  • Adds a diagnose datadog-connectivity command that displays information about connectivity issues between the Agent and Datadog intake.
  • Adds support for tailing modes in the journald logs tailer.
  • The CWS agent now supports writing rules on process termination.
  • Add support for new types of CI Visibility payloads to the Trace Agent, so features that until now were Agentless-only are available as well when using the Agent.

Enhancement Notes

  • Tags configured with DD_TAGS or DD_EXTRA_TAGS in an EKS Fargate environment are now attached to OTLP metrics.
  • Add NetFlow static enrichments (TCP flags, IP Protocol, EtherType, and more).
  • Report lines matched by auto multiline detection as metrics and show on the status page.
  • Add a containerd_exclude_namespaces configuration option for the Agent to ignore containers from specific containerd namespaces.
  • The log_level of the agent is now appended to the flare archive name upon its creation.
  • The metrics reported by KSM core now include the tags "kube_app_name", "kube_app_instance", and so on, if they're related to a Kubernetes entity that has a standard label like "app.kubernetes.io/name", "app.kubernetes.io/instance", etc.
  • The Kubernetes State Metrics Core check now collects two ingress metrics: kubernetes_state.ingress.count and kubernetes_state.ingress.path.
  • Move process chunking code to util package to avoid cycle import when using it in orchestrator check.
  • APM: Add support for PostgreSQL JSON operators in the SQL obfuscate package.
  • The OTLP ingest endpoint now supports the same settings and protocol as the OpenTelemetry Collector OTLP receiver v0.54.0 (OTLP v0.18.0).
  • The Agent now embeds Python-3.8.13, an upgrade from Python-3.8.11.
  • APM: Updated Rare Sampler default configuration values to sample traces more uniformly across environments and services.
  • The OTLP ingest endpoint now supports Exponential Histograms with delta aggregation temporality.
  • The Windows installer now supports grouped Managed Service Accounts.
  • Enable https monitoring on arm64 with kernel >= 5.5.0.
  • Add otlp_config.debug.loglevel to determine log level when the OTLP Agent receives metrics/traces for debugging use cases.
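
The label-to-tag behaviour described for KSM core in the list above can be pictured with a small Python sketch. It is illustrative only and not the Agent's implementation: the function name tags_from_labels is hypothetical, and the mapping contains only the two label/tag pairs named in these notes.

```python
# Only the two pairs named in the release notes are listed here (assumption:
# the real check handles more standard labels than these).
STANDARD_LABEL_TAGS = {
    "app.kubernetes.io/name": "kube_app_name",
    "app.kubernetes.io/instance": "kube_app_instance",
}


def tags_from_labels(labels: dict[str, str]) -> list[str]:
    """Return sorted "tag:value" strings for labels that map to a known tag."""
    return sorted(
        f"{STANDARD_LABEL_TAGS[key]}:{value}"
        for key, value in labels.items()
        if key in STANDARD_LABEL_TAGS
    )


if __name__ == "__main__":
    # The "team" label is ignored because it is not a standard label.
    print(tags_from_labels({"app.kubernetes.io/name": "checkout", "team": "payments"}))
```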

Deprecation Notes

  • Deprecate otlp_config.metrics.instrumentation_library_metadata_as_tags in favor of otlp_config.metrics.instrumentation_scope_metadata_as_tags.

Bug Fixes

  • When enable_payloads.series or enable_payloads.sketches are set to false, don't log the error Cannot append a metric in a closed buffered channel.
  • Restrict permissions for the entrypoint executables of the Dockerfiles.
  • Revert docker.mem.in_use calculation to use RSS Memory instead of total memory.
  • Add missing telemetry metrics for HTTP log bytes sent.
  • Fix panic in container, containerd, and docker when container stats are temporarily not available
  • Fix prometheus check Metrics parsing by not enforcing a list of strings.
  • Fix potential deadlock when shutting down an Agent with a log TCP listener.
  • APM: Fixed trace rare sampler's oversampling behavior. With this fix, the rare sampler will sample rare traces more accurately.
  • Fix journald byte count on the status page.
  • APM: Fixes an issue where certain (#> and #>>) PostgreSQL JSON operators were being interpreted as comments and removed by the obfuscate package.
  • Scrubs HTTP Bearer tokens out of log output
  • Fixed the triggered "svType != tvType; key=containerd_namespace, st=[]interface {}, tt=[]string, sv=[], tv=[]" error when using a secret backend reader.
  • Fixed an issue that caused the container check to show an error in the "agent status" output when it was working properly but no containers were deployed.
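
As a rough picture of what scrubbing HTTP Bearer tokens out of log output means, here is a minimal Python sketch. It is not the Agent's scrubber: the regular expression, the replacement mask, and the function name scrub_bearer are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: the word "Bearer", whitespace, then a run of
# characters commonly allowed in bearer tokens, with optional "=" padding.
_BEARER = re.compile(r"(Bearer\s+)[A-Za-z0-9\-._~+/]+=*")


def scrub_bearer(line: str) -> str:
    """Replace the token after "Bearer" with a fixed mask, keeping the prefix."""
    return _BEARER.sub(r"\1********", line)


if __name__ == "__main__":
    print(scrub_bearer("GET /api Authorization: Bearer abc.def-123"))
```

Lines without a Bearer token pass through unchanged.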

Datadog Cluster Agent 1.21.0

20 Jul 14:02
5777664

Prelude

Released on: 2022-06-28
Pinned to datadog-agent v7.37.0: CHANGELOG

Enhancement Notes

  • The Cluster Agent followers now forward queries to the Cluster Agent leaders themselves. This allows a reduction in the overall number of connections to the Cluster Agent and better spreads the load between leader and forwarders.

  • Make the name of the ConfigMap used by the Cluster Agent for its leader election configurable.

  • The Datadog Cluster Agent exposes a new metric endpoint_checks_configs_dispatched.

Bug Fixes

  • Fix a panic occurring during the invocation of the check command on the
    Cluster Agent if the Orchestrator Explorer feature is enabled.

  • Fix the node count reported for Kubernetes clusters.

Datadog Cluster Agent 1.20.0

19 Jul 16:03
d42cb11

Prelude

Released on: 2022-05-22
Pinned to datadog-agent v7.36.0: CHANGELOG

New Features

  • The Datadog Admission Controller supports multiple configuration injection
    modes through the admission_controller.inject_config.mode parameter
    or the DD_ADMISSION_CONTROLLER_INJECT_CONFIG_MODE environment variable:

    • hostip: Inject the host IP. (default)
    • service: Inject Datadog's local-service DNS name.
    • socket: Inject the Datadog socket path.
  • Collect ResourceRequirements for jobs and cronjobs for kubernetes live containers.
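
The three injection modes and their default, as listed above, can be summarised in a short Python sketch. It only shows how a tool might read and validate the DD_ADMISSION_CONTROLLER_INJECT_CONFIG_MODE environment variable; the function name inject_config_mode and the error handling are hypothetical and do not reflect how the Cluster Agent itself is written.

```python
import os

# Modes and default taken from the release note above.
VALID_MODES = ("hostip", "service", "socket")
DEFAULT_MODE = "hostip"


def inject_config_mode(environ=None) -> str:
    """Return the configured injection mode, falling back to the default."""
    environ = os.environ if environ is None else environ
    mode = environ.get("DD_ADMISSION_CONTROLLER_INJECT_CONFIG_MODE", DEFAULT_MODE)
    if mode not in VALID_MODES:
        raise ValueError(f"unknown injection mode: {mode!r}")
    return mode


if __name__ == "__main__":
    print(inject_config_mode({"DD_ADMISSION_CONTROLLER_INJECT_CONFIG_MODE": "socket"}))
```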

Enhancement Notes

  • Added a configuration option to admission controller to allow
    configuration of the failure policy. Defaults to Ignore which
    was the previous default. The default of Ignore means that pods
    will still be admitted even if the webhook is unavailable to
    inject them. Setting to Fail will require the admission controller
    to be present and pods to be injected before they are allowed to run.

  • The admission controller's reinvocation policy is now set to IfNeeded by default.
    It can be changed using the admission_controller.reinvocation_policy parameter.

  • The Datadog Cluster Agent now supports internal profiling.

  • KSM core check: add a new kubernetes_state.cronjob.complete
    service check that returns the status of the most recent job for
    a cronjob.

Security Notes

  • The Cluster Agent API (only used by Node Agents) is now only served with TLS >= 1.3 by default. Setting "cluster_agent.allow_legacy_tls" to true allows falling back to TLS 1.0.

Bug Fixes

  • Fix the node count reported for Kubernetes clusters.

  • Fixed an issue that created lots of log messages when the DCA admission controller was enabled on AKS.

  • Time-based metrics (for example, kubernetes_state.pod.age, kubernetes_state.pod.uptime) are now comparable in the Kubernetes state core check.

  • Fix a risk of panic when multiple KSM Core check instances run concurrently.

  • Remove noisy Kubernetes API deprecation warnings in the Cluster Agent logs.

Other Notes

  • Change the default value of the external metrics provider port from 443 to 8443.
    This allows the Cluster Agent to run as a non-root user for better security.
    This was already the default value in the Helm chart and in the datadog operator.

7.37.1

28 Jun 13:09
3c29612

Prelude

Release on: 2022-06-28

Bug Fixes

  • Fixes issue where proxy config was ignored by the trace-agent.

7.37.0

27 Jun 14:20
1124d66

Prelude

Release on: 2022-06-27

Upgrade Notes

  • OTLP ingest: Support for the deprecated experimental.otlp section and the DD_OTLP_GRPC_PORT and DD_OTLP_HTTP_PORT environment variables has been removed. Use the otlp_config section or the DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT and DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_HTTP_ENDPOINT environment variables instead.
  • OTLP: Deprecated settings otlp_config.metrics.report_quantiles and otlp_config.metrics.send_monotonic_counter have been removed in favor of otlp_config.metrics.summaries.mode and otlp_config.metrics.sums.cumulative_monotonic_mode respectively.

New Features

  • Adds User-level service unit filtering support for Journald log collection via include_user_units and exclude_user_units.
  • A wildcard (*) can be used in either exclude_units or exclude_user_units if only a particular type of Journald log is desired.
  • A new troubleshooting section has been added to the Agent CLI. This section will hold helpers for understanding the Agent's behavior. For now, the section only has two commands, which print the different metadata payloads sent by the Agent (v5 and inventory).
  • APM: Incoming OTLP traces are now allowed to set their own sampling priority.
  • Enable NPM NAT gateway lookup by default.
  • Partial support of IPv6 on EKS clusters
    • Fix the kubelet client when the IP of the host is IPv6.
    • Fix the substitution of %%host%% patterns inside the auto-discovery annotations: if the pod concerned has an IPv6 address and the %%host%% pattern appears inside a URL context, the IPv6 address is surrounded by square brackets.
  • OTLP ingest now supports the same settings and protocol version as the OpenTelemetry Collector OTLP receiver v0.50.0.
  • The Cloud Workload Security agent can now monitor and evaluate rules on bind syscall.
  • [corechecks/snmp] add scale factor option to metric configurations
  • Evaluate memory.usage metrics based on collected metrics.

Enhancement Notes

  • APM: DD_APM_FILTER_TAGS_REQUIRE and DD_APM_FILTER_TAGS_REJECT can now be a literal JSON array, for example ["someKey:someValue"]. This allows matching tag values that contain the space character.
  • SNMP Traps are now sent to a dedicated intake via the epforwarder.
  • Update SNMP traps database to include integer enumerations.
  • The Agent now supports a single com.datadoghq.ad.checks label in Docker, containerd, and Podman containers. It merges the contents of the existing check_names, init_configs (now optional), and instances annotations into a single JSON value.
  • Add a new Agent telemetry metric autodiscovery_poll_duration (histogram) to monitor configuration poll duration in Autodiscovery.
  • APM: Added a /config/set endpoint in trace-agent to change configuration settings at runtime. Supports changing the log level (log_level).
  • APM: When the X-Datadog-Trace-Count contains an invalid value, an error will be issued.
  • Upgrade to Docker client 20.10, reducing the duration of docker check on Windows (requires Docker >= 20.10 on the host).
  • The Agent maintains scheduled cluster and endpoint checks when the Cluster Agent is unavailable.
  • The Cluster Agent followers now forward queries to the Cluster Agent leaders themselves. This allows a reduction in the overall number of connections to the Cluster Agent and better spreads the load between leader and forwarders.
  • The kube_namespace tag is now included in all metrics, events, and service checks generated by the Helm check.
  • Include install_info in version-history.json
  • Allow nightly builds install on non-prod repos
  • Add a kubernetes_node_annotations_as_tags parameter to use Kubernetes node annotations as host tags.
  • Add more detailed logging around leadership status failures.
  • Move the experimental SNMP Traps Listener configuration under network_devices.
  • Add support for the DNS Monitoring feature of NPM to Linux kernels older than 4.1.
  • Adds segment_name and segment_id tags to PCF containers that belong to an isolation segment.
  • Make logs agent additional_endpoints reliable by default. This can be disabled by setting is_reliable: false on the additional endpoint.
  • On Windows, if a datadog.yaml file is found during an installation or upgrade, the dialogs collecting the API Key and Site are skipped.
  • Resolve SNMP trap variables with integer enumerations to their string representation.
  • [corechecks/snmp] Add profile static_tags config
  • Report telemetry metrics about the retry queue capacity: datadog.agent.retry_queue_duration.capacity_secs, datadog.agent.retry_queue_duration.bytes_per_sec and datadog.agent.retry_queue_duration.capacity_bytes
  • Updated cloud providers to add the Instance ID as a host alias for EC2 instances, matching what other cloud providers do. This should help with correctly identifying hosts where the customer has changed the hostname to be different from the Instance ID.
  • NTP check: Include /etc/ntpd.conf and /etc/openntpd/ntpd.conf for use_local_defined_servers.
  • Kubernetes pods with short-lived containers no longer have log lines duplicated with both container tags (the stopped one and the running one) when logs are collected. This feature is enabled by default; set logs_config.validate_pod_container_id to false to disable it.
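
To show why a literal JSON array helps for DD_APM_FILTER_TAGS_REQUIRE and DD_APM_FILTER_TAGS_REJECT, the following Python sketch parses both forms. It is not the trace-agent's parser: the function name parse_filter_tags is hypothetical, and treating the non-JSON form as a whitespace-separated list is an assumption.

```python
import json


def parse_filter_tags(raw: str) -> list[str]:
    """Parse a filter-tags value given as a JSON array or a plain list."""
    raw = raw.strip()
    if raw.startswith("["):
        parsed = json.loads(raw)
        if not all(isinstance(item, str) for item in parsed):
            raise ValueError("expected a JSON array of strings")
        return parsed
    # Assumed legacy form: whitespace-separated, so values cannot hold spaces.
    return raw.split()


if __name__ == "__main__":
    print(parse_filter_tags('["someKey:some value"]'))
    print(parse_filter_tags("env:prod team:core"))
```

With the JSON form, a value such as "some value" survives intact; in the whitespace-separated form it would be split into two entries.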

Security Notes

  • The Agent is built with Go 1.17.11.

Bug Fixes

  • Updates defaults for the port and binding host of the experimental traps listener.
  • APM: The Agent is now performing rare span detection on all spans, as opposed to only dropped spans. This change will slightly reduce the number of rare spans kept unnecessarily.
  • APM OTLP: The ingest now standardizes certain attribute keys to their correct Datadog tag counterparts, such as container tags, "operation.name", "service.name", etc.
  • APM: Fix a bug where the APM section of the GUI would not show up in older Internet Explorer versions on Windows.
  • Support dynamic Auth Tokens in Kubernetes v1.22+ (Bound Service Account Token Volume).
  • The %%host%% autodiscovery tag now works properly when using containerd, but only on Linux and when using IP v4 addresses.
  • Enhanced the coverage of pause-containers filtering on Containerd.
  • APM: Fix the loss of trace metric container information when large payloads need to be split.
  • Fix cri check producing no metrics when running on OpenShift / cri-o.
  • Fix missing health status from Docker containers in Live Container View.
  • Fix Agent startup failure when running as a non-privileged user (for instance, when running on OpenShift with restricted SCC).
  • Fix missing container metrics (container, containerd checks and live container view) on AWS Bottlerocket.
  • APM: Fixed an issue where "CPU threshold exceeded" logs would show the wrong user CPU usage by a factor of 100.
  • Ensures that when kubernetes_namespace_labels_as_tags is set, the namespace labels are always attached to metrics and logs, even when the pod is not ready yet.
  • Add missing support for UDPv6 receive path to NPM.
  • The agent workload-list --verbose command and the workload-list.log file in the flare no longer show containers' environment variables, except for DD_SERVICE, DD_ENV and DD_VERSION.
  • Fixed a potential deadlock in the Python check runner during agent shutdown.
  • Fixes issue where trace-agent would not report any version info.
  • The DCA and the cluster runners no longer write warning logs to /tmp.
  • Fixes an issue where the Agent would panic when trying to inspect Docker containers while the Docker daemon was unavailable or taking too long to respond.
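
The workload-list change above amounts to an allowlist on container environment variables. A minimal Python sketch of that idea follows; the function name visible_env is hypothetical, and the code only illustrates the filtering rule, not how the Agent stores or prints workload data.

```python
# The three variables the release note says remain visible.
VISIBLE_ENV_VARS = {"DD_SERVICE", "DD_ENV", "DD_VERSION"}


def visible_env(env: dict[str, str]) -> dict[str, str]:
    """Keep only the environment variables that are allowed to be shown."""
    return {key: value for key, value in env.items() if key in VISIBLE_ENV_VARS}


if __name__ == "__main__":
    container_env = {"DD_ENV": "prod", "DB_PASSWORD": "hunter2", "DD_SERVICE": "web"}
    print(visible_env(container_env))
```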

Other Notes

  • Exclude teradata on Mac agents.

7.36.1

31 May 19:50

Prelude

Release on: 2022-05-31

Bug Fixes

  • Fixes issue where proxy config was ignored by the trace-agent.
  • This fixes a regression introduced in 7.36.0 where some logs sources attached to a container/pod would not be unscheduled on container/pod stop if multiple logs configs were attached to the container/pod. This could lead to duplicate log entries being created on container/pod restart as there would be more than one tailer tailing the targeted source.

7.36.0

24 May 11:15
7.36.0
c1180f8

Prelude

Release on: 2022-05-24

Upgrade Notes

  • Debian packages are now built on Debian 8. Newly built DEBs are supported on Debian >= 8 and Ubuntu >= 14.
  • The OTLP endpoint will no longer enable the legacy OTLP/HTTP endpoint 0.0.0.0:55681 by default. To keep using the legacy endpoint, explicitly declare it via the otlp_config.receiver.protocols.http.endpoint configuration setting or its associated environment variable, DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_HTTP_ENDPOINT.
  • Package signing keys were rotated:
    • DEB packages are now signed with key AD9589B7, a signing subkey of key F14F620E
    • RPM packages are now signed with key FD4BF915

New Features

  • Adds support for IBM Cloud. The Agent now detects when it is running on IBM Cloud and collects host aliases (VM name and ID).
  • Added event collection in the Helm check. The feature is disabled by default. To enable it, set the collect_events option to true.
  • Adds a service check for the Helm check. The check fails for a release when its latest revision is in "failed" state.
  • Adds a kube_qos (quality of service) tag to metrics associated with kubernetes pods and their containers.
  • CWS can now track network devices creation and load TC classifiers dynamically.
  • CWS can now track network namespaces.
  • The DNS event type was added to CWS.
  • The OTLP ingest endpoint is now considered GA for metrics.

Enhancement Notes

  • Traps OIDs are now resolved to names using user-provided 'traps db' files in snmp.d/traps_db/.
  • The Agent now supports a single ad.datadoghq.com/$IDENTIFIER.checks annotation in Kubernetes Pods and Services to configure Autodiscovery checks. It merges the contents of the existing "check_names", init_configs (now optional), and instances annotations into a single JSON value.
  • The DD_URL environment variable can now be used to set the Datadog intake URL, just like DD_DD_URL. If both DD_DD_URL and DD_URL are set, DD_DD_URL is used to avoid a breaking change.
  • Added a process-agent version command, and made the output mimic the core agent.
  • Windows: Add Datadog registry to Flare.
  • Add --service flag to stream-logs command to filter streamed logs in detail.
  • Support a simple date pattern for automatic multiline detection
  • APM: The OTLP ingest stringification of non-standard Datadog values such as Arrays and KeyValues is now consistent with OpenTelemetry attribute stringification.
  • APM: Connections to upload profiles to the Datadog intake are now closed after 47 seconds of idleness. Common tracer setups send one profile every 60 seconds, which coincides with the intake's connection timeout and would occasionally lead to errors.
  • The Cluster Agent now exposes a new metric cluster_checks_configs_info. It exposes the node and the check ID as tags.
  • KSM core check: add a new kubernetes_state.cronjob.complete service check that returns the status of the most recent job for a cronjob.
  • Retry more HTTP status codes for the logs agent HTTP destination.
  • COPYRIGHT-3rdparty.csv now contains each copyright statement exactly as it is shown on the original component.
  • Adds sidecar_present and sidecar_count tags on Cloud Foundry containers that run apps with sidecar processes.
  • Agent flare now includes output from the process and container checks.
  • Add the --cfgpath parameter in the Process Agent replacing --config.
  • Add the check subcommand in the Process Agent replacing --check (-check). Only warn once if the -version flag is used.
  • Adds human readable output of process and container data in the check command for the Process Agent.
  • The Agent flare command now collects Process Agent performance profile data in the flare bundle when the --profile flag is used.
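
The single ad.datadoghq.com/$IDENTIFIER.checks annotation mentioned above folds three legacy annotations into one JSON value. The Python sketch below shows one way to do that conversion. It is not taken from the Agent: the function name merge_checks is hypothetical, and the output shape (each check name mapped to an object with "init_config" and "instances") is an assumption that should be checked against the Autodiscovery documentation.

```python
import json


def merge_checks(check_names, instances, init_configs=None) -> str:
    """Fold legacy check_names/init_configs/instances lists into one JSON value."""
    if init_configs is None:
        # init_configs is optional in the merged form, so default to empty ones.
        init_configs = [{}] * len(check_names)
    if not (len(check_names) == len(instances) == len(init_configs)):
        raise ValueError("legacy annotation lists must have the same length")
    merged = {}
    for name, init_config, instance in zip(check_names, init_configs, instances):
        # Assumed shape: {"<check>": {"init_config": {...}, "instances": [...]}}
        entry = merged.setdefault(name, {"init_config": init_config, "instances": []})
        entry["instances"].append(instance)
    return json.dumps(merged, sort_keys=True)


if __name__ == "__main__":
    print(merge_checks(["redisdb"], [{"host": "%%host%%", "port": 6379}]))
```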

Deprecation Notes

  • Deprecated process-agent --version in favor of process-agent version.
  • The logs configuration use_http and use_tcp flags have been deprecated in favor of force_use_http and force_use_tcp.
  • OTLP ingest: metrics.send_monotonic_counter has been deprecated in favor of metrics.sums.cumulative_monotonic_mode. metrics.send_monotonic_counter will be removed in v7.37.
  • OTLP ingest: metrics.report_quantiles has been deprecated in favor of metrics.summaries.mode. metrics.report_quantiles will be removed in v7.37 / v6.37.
  • Remove the unused --ddconfig (-ddconfig) parameter. Deprecate the --config (-config) parameter (show warning on usage).
  • Deprecate the --check (-check) parameter (show warning on usage).

Bug Fixes

  • Bump GoSNMP to fix incomplete support of SNMP v3 INFORMs.
  • APM: OTLP: Fixes an issue where attributes from different spans were merged leading to spans containing incorrect attributes.
  • APM: OTLP: Fixed an inconsistency where the error message was left empty in cases where the "exception" event was not found. Now, the span status message is used as a fallback.
  • Fixes an issue where some data coming from the Agent when running in ECS Fargate did not have task_*, ecs_cluster_name, region, and availability_zone tags.
  • Collect the "0" value for resourceRequirements if it has been set
  • Fix a bug introduced in 7.33 that could prevent the auto-discovery variable %%port_<name>%% from being resolved properly.
  • Fix a panic in the Docker check when a failure happens early (when listing containers)
  • Fix missing docker.memory.limit (and docker.memory.in_use) on Windows
  • Fixes a conflict preventing NPM/USM and the TCP Queue Length check from being enabled at the same time.
  • Fix permission of "/readsecret.sh" script in the agent Dockerfile when executing with dd-agent user (for cluster check runners)
  • For Windows, fixes a problem during upgrade in which the NPM driver was not automatically started by system probe.
  • Fix Gohai not being able to fetch network information when running on non-English Windows (when the output of commands like ipconfig was not in English). Gohai no longer relies on system commands but uses the Golang net package instead (same as on Linux hosts). This bug had the side effect of preventing network monitoring data from being linked back to the host.
  • Time-based metrics (for example, kubernetes_state.pod.age, kubernetes_state.pod.uptime) are now comparable in the Kubernetes state core check.
  • Fix a risk of panic when multiple KSM Core check instances run concurrently.
  • For Windows, includes NPM driver 1.3.2, which has a fix for a BSOD on system probe shutdown.
  • Adds new --json flag to check. process-agent check --json now outputs valid json.
  • On Windows, includes NPM driver update which fixes performance problem when host is under high connection load.
  • Previously, the Agent could not log the start or end of a check properly after the first five check runs. The Agent now can log the start and end of a check correctly.

Other Notes

  • Include pre-generated trap db file in the conf.d/snmp.d/traps_db/ folder.
  • The Gohai dependency has been upgraded. This brings a newer version of gopsutil and a fix for fetching network information on non-English Windows (see the Bug Fixes section).
  • If users are using strict firewall rules, they should also exclude the new port 6162 from their firewall.