Kubernetes APM - Trace Collection

This page describes how to set up and configure Application Performance Monitoring (APM) for your Kubernetes application.

The APM troubleshooting pipeline: the tracer sends trace and metrics data from the application pod to the Agent pod, which forwards the data to the Datadog backend for display in the Datadog UI.

You can send traces over a Unix Domain Socket (UDS), TCP (IP:Port), or a Kubernetes service. Datadog recommends UDS, but you can use all three at the same time if necessary.

Setup

  1. If you haven’t already, install the Datadog Agent in your Kubernetes environment.
  2. Configure the Datadog Agent to collect traces.
  3. Configure application pods to submit traces to the Datadog Agent.

Configure the Datadog Agent to collect traces

The instructions in this section configure the Datadog Agent to receive traces over UDS. To use TCP, see the additional configuration section. To use a Kubernetes service, see Setting up APM with Kubernetes Service.

If you installed the Datadog Agent with the Datadog Operator, edit your datadog-agent.yaml to set features.apm.enabled to true.

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>

  features:
    apm:
      enabled: true
      unixDomainSocketConfig:
        path: /var/run/datadog/apm.socket # default

When APM is enabled, the default configuration creates a directory on the host and mounts it within the Agent. The Agent then creates a socket file at /var/run/datadog/apm.socket and listens on it. Application pods can mount the same volume and write to the same socket. You can change the socket path with the features.apm.unixDomainSocketConfig.path configuration value.

After making your changes, apply the new configuration by using the following command:

kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml

Note: On minikube, you may receive an "Unable to detect the kubelet URL automatically" error. In this case, set global.kubelet.tlsVerify to false.
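For the minikube case described in the note, the setting sits under spec.global in the DatadogAgent resource. The following is a minimal sketch that reuses the resource name and credentials placeholder from the earlier example; only the kubelet block is new.

```yaml
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
    kubelet:
      # Skips TLS verification when the Agent talks to the kubelet.
      # Intended for local clusters such as minikube.
      tlsVerify: false
```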

If you used Helm to install the Datadog Agent, APM is enabled by default over UDS or Windows named pipe.

To verify, ensure that datadog.apm.socketEnabled is set to true in your datadog-values.yaml.

datadog:
  apm:
    socketEnabled: true    

The default configuration creates a directory on the host and mounts it within the Agent. The Agent then creates and listens on a socket file /var/run/datadog/apm.socket. The application pods can then similarly mount this volume and write to this same socket. You can modify the path and socket with the datadog.apm.hostSocketPath and datadog.apm.socketPath configuration values.

datadog:
  apm:
    # the following values are default:
    socketEnabled: true
    hostSocketPath: /var/run/datadog/
    socketPath: /var/run/datadog/apm.socket

To disable APM, set datadog.apm.socketEnabled to false.

After making your changes, upgrade your Datadog Helm chart using the following command:

helm upgrade -f datadog-values.yaml <RELEASE_NAME> datadog/datadog

Note: On minikube, you may receive an "Unable to detect the kubelet URL automatically" error. In this case, set datadog.kubelet.tlsVerify to false.
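In datadog-values.yaml, the setting from the note would look roughly like this. It is a sketch of that single key only; merge it with the rest of your values.

```yaml
datadog:
  kubelet:
    # Skips TLS verification when the Agent talks to the kubelet.
    # Intended for local clusters such as minikube.
    tlsVerify: false
```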

Configure your application pods to submit traces to the Datadog Agent

The Datadog Admission Controller is a component of the Datadog Cluster Agent that simplifies your application pod configuration. Learn more by reading the Datadog Admission Controller documentation.

Use the Datadog Admission Controller to inject environment variables and mount the necessary volumes on new application pods, which automatically configures trace communication between the pods and the Agent. To learn how to automatically configure your application to submit traces to the Datadog Agent, read the Injecting Libraries Using Admission Controller documentation.

If you are sending traces to the Agent by using UDS, mount the host directory that contains the socket (the directory the Agent created) into the application container, and specify the path to the socket with DD_TRACE_AGENT_URL:

apiVersion: apps/v1
kind: Deployment
#(...)
    spec:
      containers:
      - name: "<CONTAINER_NAME>"
        image: "<CONTAINER_IMAGE>/<TAG>"
        env:
        - name: DD_TRACE_AGENT_URL
          value: 'unix:///var/run/datadog/apm.socket'
        volumeMounts:
        - name: apmsocketpath
          mountPath: /var/run/datadog
        #(...)
      volumes:
        - hostPath:
            path: /var/run/datadog/
          name: apmsocketpath
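For reference, here is a minimal sketch of a complete Deployment that uses the UDS settings above. The name my-app, the app label, the replica count, and the image reference are hypothetical placeholders and do not come from Datadog. The socket path assumes the Agent's default configuration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                      # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app                   # hypothetical label
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0.0   # hypothetical image
          env:
            # Tells the tracer to reach the Agent through the socket.
            - name: DD_TRACE_AGENT_URL
              value: "unix:///var/run/datadog/apm.socket"
          volumeMounts:
            # Same directory the Agent mounts from the host.
            - name: apmsocketpath
              mountPath: /var/run/datadog
      volumes:
        - name: apmsocketpath
          hostPath:
            path: /var/run/datadog/
```

If you changed the socket path on the Agent side, the hostPath, mountPath, and DD_TRACE_AGENT_URL values here need to change to match.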

Configure your application tracers to emit traces:

After you configure the Datadog Agent to collect traces and tell your application pods where to send them, install the Datadog tracer into your applications to emit the traces. Once this is done, the tracer sends the traces to the endpoint set in DD_TRACE_AGENT_URL.

If you are sending traces to the Agent by using TCP (<IP_ADDRESS>:8126), supply this IP address to your application pods, either automatically with the Datadog Admission Controller or manually by using the downward API to pull the host IP. The application container needs a DD_AGENT_HOST environment variable that points to status.hostIP:

apiVersion: apps/v1
kind: Deployment
#(...)
    spec:
      containers:
      - name: "<CONTAINER_NAME>"
        image: "<CONTAINER_IMAGE>/<TAG>"
        env:
          - name: DD_AGENT_HOST
            valueFrom:
              fieldRef:
                fieldPath: status.hostIP

Note: This configuration requires the Agent to be configured to accept traces over TCP.
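Some teams prefer to give the tracer one explicit URL instead of a host and an implied port. The sketch below shows one way to do that with Kubernetes dependent environment variables. It assumes that the tracer in use also reads DD_TRACE_AGENT_URL for TCP endpoints (this page only shows that variable for UDS) and that the Agent listens on the default port 8126.

```yaml
env:
  - name: DD_AGENT_HOST
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  # Must be listed after DD_AGENT_HOST: Kubernetes only expands
  # $(VAR) references to variables defined earlier in the list.
  - name: DD_TRACE_AGENT_URL
    value: "http://$(DD_AGENT_HOST):8126"
```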

Configure your application tracers to emit traces:

After you configure the Datadog Agent to collect traces and tell your application pods where to send them, install the Datadog tracer into your applications to emit the traces. Once this is done, the tracer automatically sends the traces to the endpoint set in DD_AGENT_HOST.

Refer to the language-specific APM instrumentation docs for more examples.

Additional configuration

Configure the Datadog Agent to accept traces over TCP

If you use the Datadog Operator, update your datadog-agent.yaml with the following:

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>

  features:
    apm:
      enabled: true
      hostPortConfig:
        enabled: true
        hostPort: 8126 # default

After making your changes, apply the new configuration by using the following command:

kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml

Warning: The hostPort parameter opens a port on your host. Make sure your firewall only allows access from your applications or trusted sources. If your network plugin doesn’t support hostPorts, add hostNetwork: true in your Agent pod specifications. This shares the network namespace of your host with the Datadog Agent. This also means that all ports opened on the container are opened on the host. If a port is used both on the host and in your container, they conflict (since they share the same network namespace) and the pod does not start. Some Kubernetes installations do not allow this.
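Because UDS and TCP can be used at the same time, both receivers can be declared in one DatadogAgent resource. The following is a sketch that combines the two Operator examples from this page, using only the keys and default values shown in them.

```yaml
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
  features:
    apm:
      enabled: true
      # UDS receiver (recommended transport)
      unixDomainSocketConfig:
        path: /var/run/datadog/apm.socket
      # TCP receiver, exposed on the host
      hostPortConfig:
        enabled: true
        hostPort: 8126
```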

If you use Helm, update your datadog-values.yaml file with the following APM configuration:

datadog:
  apm:
    portEnabled: true
    port: 8126 # default

After making your changes, upgrade your Datadog Helm chart using the following command:

helm upgrade -f datadog-values.yaml <RELEASE_NAME> datadog/datadog

Warning: The datadog.apm.portEnabled parameter opens a port on your host. Make sure your firewall only allows access from your applications or trusted sources. If your network plugin doesn’t support hostPorts, add hostNetwork: true in your Agent pod specifications. This shares the network namespace of your host with the Datadog Agent. This also means that all ports opened on the container are opened on the host. If a port is used both on the host and in your container, they conflict (since they share the same network namespace) and the pod does not start. Some Kubernetes installations do not allow this.
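The Helm equivalent of running both transports side by side would look roughly like the following. It is a sketch that merges the two Helm snippets from this page, with their default values.

```yaml
datadog:
  apm:
    # UDS receiver (recommended transport)
    socketEnabled: true
    socketPath: /var/run/datadog/apm.socket
    # TCP receiver, exposed on the host
    portEnabled: true
    port: 8126
```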

APM environment variables

With the Datadog Operator, set additional APM environment variables under override.nodeAgent.containers.trace-agent.env:

datadog-agent.yaml

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  override:
    nodeAgent:
      containers:
        trace-agent:
          env:
            - name: <ENV_VAR_NAME>
              value: <ENV_VAR_VALUE>
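As a concrete illustration, the placeholders could be filled in as follows. The resource filter reuses the example pattern from the DD_APM_IGNORE_RESOURCES entry in the variable list on this page. The debug log level is an arbitrary choice for the sketch, not a recommendation.

```yaml
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  override:
    nodeAgent:
      containers:
        trace-agent:
          env:
            # Comma-separated regular expressions; quoted so YAML
            # treats the whole list as one string.
            - name: DD_APM_IGNORE_RESOURCES
              value: "GET /ignore-me,(GET|POST) /and-also-me"
            - name: DD_LOG_LEVEL
              value: "debug"
```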

With Helm, set additional APM environment variables under agents.containers.traceAgent.env:

datadog-values.yaml

agents:
  containers:
    traceAgent:
      env:
        - name: <ENV_VAR_NAME>
          value: <ENV_VAR_VALUE>
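One detail that is easy to miss: Kubernetes requires environment variable values to be strings, so numeric settings need quotes. The values below are illustrative only and are not tuning advice.

```yaml
agents:
  containers:
    traceAgent:
      env:
        - name: DD_APM_TARGET_TPS
          value: "5"     # illustrative; the default is 10
        - name: DD_APM_MAX_EPS
          value: "100"   # illustrative; the default is 200
```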

If you manage the Agent manifests yourself, add environment variables to the DaemonSet, or to the Deployment for the Datadog Cluster Agent.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: datadog
spec:
  template:
    spec:
      containers:
        - name: agent
          #(...)
          env:
            - name: <ENV_VAR_NAME>
              value: <ENV_VAR_VALUE>

List of environment variables available for configuring APM:

Environment variable
    Description

DD_APM_ENABLED
    When set to true, the Datadog Agent accepts trace metrics.
    Default: true (Agent 7.18+)

DD_APM_ENV
    Sets the env: tag on collected traces.

DD_APM_RECEIVER_SOCKET
    For tracing over UDS. When set, must point to a valid socket file.

DD_APM_RECEIVER_PORT
    For tracing over TCP, the port that the Datadog Agent's trace receiver listens on.
    Default: 8126

DD_APM_NON_LOCAL_TRAFFIC
    Allows non-local traffic when tracing from other containers.
    Default: true (Agent 7.18+)

DD_APM_DD_URL
    The Datadog API endpoint where your traces are sent.
    Default: https://trace.agent.datadoghq.com

DD_APM_TARGET_TPS
    The target traces per second to sample.
    Default: 10

DD_APM_ERROR_TPS
    The target error trace chunks to receive per second.
    Default: 10

DD_APM_MAX_EPS
    Maximum number of APM events per second to sample.
    Default: 200

DD_APM_MAX_MEMORY
    The amount of memory that the Datadog Agent aims to use. If surpassed, the API rate limits incoming requests.
    Default: 500000000

DD_APM_MAX_CPU_PERCENT
    The CPU percentage that the Datadog Agent aims to use. If surpassed, the API rate limits incoming requests.
    Default: 50

DD_APM_FILTER_TAGS_REQUIRE
    Collects only traces that have root spans with an exact match for the specified span tags and values.
    See Ignoring unwanted resources in APM.

DD_APM_FILTER_TAGS_REJECT
    Rejects traces that have root spans with an exact match for the specified span tags and values.
    See Ignoring unwanted resources in APM.

DD_APM_REPLACE_TAGS
    Scrubs sensitive data from your span's tags.

DD_APM_IGNORE_RESOURCES
    Configures resources for the Agent to ignore. Use a comma-separated list of regular expressions.
    For example: GET /ignore-me,(GET|POST) /and-also-me

DD_APM_LOG_FILE
    Path to the file where APM logs are written.

DD_APM_CONNECTION_LIMIT
    Maximum connection limit for a 30-second time window.
    Default: 2000

DD_APM_ADDITIONAL_ENDPOINTS
    Sends data to multiple endpoints and/or with multiple API keys.
    See Dual Shipping.

DD_APM_DEBUG_PORT
    Port for the debug endpoints for the Trace Agent. Set to 0 to disable the server.
    Default: 5012

DD_BIND_HOST
    Sets the StatsD and receiver hostname.

DD_DOGSTATSD_PORT
    For tracing over TCP, sets the DogStatsD port.

DD_ENV
    Sets the global env for all data emitted by the Agent. If env is not present in your trace data, this variable is used.

DD_HOSTNAME
    Manually sets the hostname to use for metrics if autodetection fails, or when running the Datadog Cluster Agent.

DD_LOG_LEVEL
    Sets the logging level.
    Values: trace, debug, info, warn, error, critical, off

DD_PROXY_HTTPS
    Sets the URL for the proxy to use.
