Containers View

In Datadog, the Containers page provides real-time visibility into all containers across your environment.

Taking inspiration from bedrock tools like htop, ctop, and kubectl, the Containers page gives you complete coverage of your container infrastructure in a continuously updated table with resource metrics at two-second resolution, faceted search, and streaming container logs.

Coupled with Docker, Kubernetes, ECS, and other container technologies, plus built-in tagging of dynamic components, the Containers page provides a detailed overview of your containers’ health, resource consumption, logs, and deployment in real time:

Live containers with summaries

Setup

To display data on the Containers view, enable the Process Agent.

For Docker, set the DD_PROCESS_AGENT_ENABLED environment variable to true when you start the Agent container.

For example:

-v /etc/passwd:/etc/passwd:ro
-e DD_PROCESS_AGENT_ENABLED=true

The Datadog Operator enables the Process Agent by default.

To verify, ensure that features.liveContainerCollection.enabled is set to true in your datadog-agent.yaml:

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
      appKey: <DATADOG_APP_KEY>
  features:
    liveContainerCollection:
      enabled: true

If you are using the official Helm chart, enable the processAgent.enabled parameter in your values.yaml file:

datadog:
  # (...)
  processAgent:
    enabled: true

Then, upgrade your Helm chart.

In some setups, the Process Agent and Cluster Agent cannot automatically detect a Kubernetes cluster name. If this happens, the feature does not start, and the following warning displays in the Cluster Agent log: "Orchestrator explorer enabled but no cluster name set: disabling". In this case, you must set datadog.clusterName to your cluster name in values.yaml:

datadog:
  #(...)
  clusterName: <YOUR_CLUSTER_NAME>
  #(...)
  processAgent:
    enabled: true

For Amazon ECS, update your task definitions with the following environment variable:

{
  "name": "DD_PROCESS_AGENT_ENABLED",
  "value": "true"
}

Configuration

For configuration options, like filtering containers and scrubbing sensitive information, see Configure Containers View. To set up this page for older Agent versions (Datadog Agent v7.21.1 - v7.27.0 and Cluster Agent v1.9.0 - v1.11.0), see Live Containers legacy configuration.

Kubernetes Orchestrator Explorer

In the Select Resources box at the top left of the Containers page, you can expand the Kubernetes heading to look at pods, clusters, namespaces, and other resources in the Kubernetes Orchestrator Explorer. For more information, see the Orchestrator Explorer documentation.

You can also use the Kubernetes page to see an overview of your Kubernetes resources.

Searching, filtering, and pivoting

Containers are, by their nature, extremely high-cardinality objects. Datadog’s flexible string search matches substrings in the container name, ID, or image fields.

To combine multiple string searches into a complex query, you can use any of the following Boolean operators:

  • AND: Intersection. Both terms are present in the selected containers. If no operator is given between two terms, AND is used by default. Example: java AND elasticsearch
  • OR: Union. Either term is present in the selected containers. Example: java OR python
  • NOT / !: Exclusion. The term that follows is not present in the selected containers. The word NOT and the ! character perform the same operation. Example: java NOT elasticsearch (equivalently, java !elasticsearch)

Use parentheses to group operators together. For example, (NOT (elasticsearch OR kafka) java) OR python.
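The grouping rules can be easier to follow when spelled out as plain Boolean logic. The following Python sketch is only an illustration of the semantics described above, not Datadog's implementation: the Container class, the has helper, and the sample data are hypothetical, and case handling is not modeled. It evaluates the example query (NOT (elasticsearch OR kafka) java) OR python against a few containers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    # Hypothetical record holding the three searchable fields.
    name: str
    container_id: str
    image: str


def has(container: Container, term: str) -> bool:
    """Return True if term is a substring of the name, ID, or image field."""
    fields = (container.name, container.container_id, container.image)
    return any(term in field for field in fields)


def example_query(c: Container) -> bool:
    # (NOT (elasticsearch OR kafka) java) OR python
    # Adjacent terms with no operator between them are joined by an implicit AND.
    excluded = has(c, "elasticsearch") or has(c, "kafka")
    return ((not excluded) and has(c, "java")) or has(c, "python")


containers = [
    Container("java-api", "a1b2c3", "openjdk:17"),
    Container("java-elasticsearch", "d4e5f6", "elasticsearch:8"),
    Container("worker", "0718aa", "python:3.12"),
    Container("kafka-broker", "9c9c9c", "kafka:3"),
]

selected = [c.name for c in containers if example_query(c)]
print(selected)  # ['java-api', 'worker']
```

The java-elasticsearch container is excluded by the NOT group even though it matches java, while worker is selected through the OR python branch alone.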

Filtering and pivoting

The screenshot below displays a system that has been filtered down to a Kubernetes cluster of 25 nodes. RSS and CPU utilization on containers are reported relative to the provisioned limits on the containers, when they exist. Here, it is apparent that the containers in this cluster are over-provisioned. You could use tighter limits and bin packing to achieve better utilization of resources.

A system that has been filtered down to a Kubernetes cluster of 25 nodes
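To make "utilization compared to the provisioned limits" concrete, the sketch below computes memory utilization as a percentage of the limit and flags containers that sit far below it. This is a minimal illustration, not how the Containers page computes its values: the field names, sample numbers, and the 25% threshold are all assumptions chosen for the example.

```python
from typing import Optional


def utilization_pct(usage: float, limit: Optional[float]) -> Optional[float]:
    """Return usage as a percentage of limit, or None when no limit is set."""
    if limit is None or limit <= 0:
        return None
    return 100.0 * usage / limit


MIB = 2**20

# Hypothetical sample data: RSS and memory limit per container, in bytes.
containers = [
    {"name": "web", "rss_bytes": 128 * MIB, "mem_limit_bytes": 1024 * MIB},
    {"name": "cache", "rss_bytes": 700 * MIB, "mem_limit_bytes": 1024 * MIB},
    {"name": "batch", "rss_bytes": 300 * MIB, "mem_limit_bytes": None},
]

# Assumed threshold: below this share of the limit, treat as over-provisioned.
OVERPROVISIONED_BELOW_PCT = 25.0

report = {
    c["name"]: utilization_pct(c["rss_bytes"], c["mem_limit_bytes"])
    for c in containers
}
overprovisioned = [
    name
    for name, pct in report.items()
    if pct is not None and pct < OVERPROVISIONED_BELOW_PCT
]

print(report["web"])      # 12.5
print(overprovisioned)    # ['web']
```

Containers without a limit, such as batch here, produce no percentage at all, which mirrors the "when they exist" caveat above.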

Container environments are dynamic and can be hard to follow. The following screenshot displays a view that has been pivoted by kube_service and host and, to reduce system noise, filtered to kube_namespace:default. You can see what services are running where, and how saturated key metrics are:

Host x services

You could pivot by ECS ecs_task_name and ecs_task_version to understand changes to resource utilization between updates.

Tasks x version

Tagging

Containers are tagged with all existing host-level tags, as well as with metadata associated with individual containers.

All containers are tagged by image_name. Integrations with popular orchestrators, such as ECS and Kubernetes, provide further container-level tags. Additionally, each container is decorated with a Docker, ECS, or Kubernetes icon so you can tell at a glance which containers are being orchestrated.

ECS containers are tagged by:

  • task_name
  • task_version
  • ecs_cluster

Kubernetes containers are tagged by:

  • pod_name
  • kube_pod_ip
  • kube_service
  • kube_namespace
  • kube_replica_set
  • kube_daemon_set
  • kube_job
  • kube_deployment
  • kube_cluster

If you have a configuration for Unified Service Tagging in place, Datadog automatically picks up env, service, and version tags. Having these tags available lets you tie together APM, logs, metrics, and container data.

Views

The Containers page includes Scatter Plot and Timeseries views, and a table to better organize your container data by fields such as container name, status, and start time.

Scatter plot

Use the scatter plot analytic to compare two metrics with one another in order to better understand the performance of your containers.

You can switch between the “Scatter Plot” and “Timeseries” tabs in the collapsible Summary Graphs section in the Containers page:

Scatter plot selection

By default, the graph groups by the short_image tag key. The size of each dot represents the number of containers in that group, and clicking on a dot displays the individual containers and hosts that contribute to the group.

The query at the top of the scatter plot analytic lets you control the following options:

  • Selection of metrics to display.
  • Selection of the aggregation method for both metrics.
  • Selection of the scale of both the X and Y axes (Linear/Log).

Scatter plot
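The grouping behavior described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Datadog's implementation: the scatter_points function, the metric field names (cpu_pct, rss_mb), and the sample rows are hypothetical. Each group becomes one point whose coordinates are the aggregated metrics and whose size is the number of containers in the group.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; only short_image is a tag key named in the text.
containers = [
    {"short_image": "nginx", "host": "host-a", "cpu_pct": 10.0, "rss_mb": 50.0},
    {"short_image": "nginx", "host": "host-b", "cpu_pct": 30.0, "rss_mb": 70.0},
    {"short_image": "redis", "host": "host-a", "cpu_pct": 5.0, "rss_mb": 200.0},
]


def scatter_points(rows, group_key="short_image", x="cpu_pct", y="rss_mb", agg=mean):
    """Group rows by a tag key and aggregate two metrics per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    return {
        key: {
            "x": agg([r[x] for r in members]),
            "y": agg([r[y] for r in members]),
            "size": len(members),  # dot size: number of containers in the group
        }
        for key, members in groups.items()
    }


points = scatter_points(containers)
print(points["nginx"])  # {'x': 20.0, 'y': 60.0, 'size': 2}
```

Passing a different function as agg (for example, max) corresponds to choosing another aggregation method for both metrics.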

Real-time monitoring

While you are actively working with the Containers page, metrics are collected at 2-second resolution. This is important for volatile metrics such as CPU. In the background, for historical context, metrics are collected at 10-second resolution.
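A short sketch shows why resolution matters for volatile metrics. It assumes, purely for illustration, that coarser data behaves like an average of the finer samples; the actual collection and rollup method is not described here, and the sample values are made up.

```python
def downsample(samples, factor):
    """Average consecutive groups of `factor` samples."""
    buckets = [samples[i:i + factor] for i in range(0, len(samples), factor)]
    return [sum(bucket) / len(bucket) for bucket in buckets]


# Hypothetical CPU percentages sampled every 2 seconds (20 seconds in total).
cpu_2s = [5, 5, 95, 5, 5, 10, 10, 10, 10, 10]

# Five 2-second samples per 10-second bucket.
cpu_10s = downsample(cpu_2s, 5)

print(max(cpu_2s))   # 95
print(cpu_10s)       # [23.0, 10.0]
```

Under this assumption, the short spike to 95% is clearly visible at 2-second resolution but appears as only 23% once averaged into a 10-second bucket.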

Container logs

View streaming logs for any container in Datadog, much as you would with docker logs -f or kubectl logs -f. Click any container in the table to inspect it. Click the Logs tab to see real-time data from live tail or indexed logs for any time in the past.

Live tail

With live tail, all container logs are streamed. Pausing the stream helps you read logs that are quickly being written; unpause to continue streaming.

Streaming logs can be searched with simple string matching. See Live Tail for more details.

Note: Streaming logs are not persisted, and entering a new search or refreshing the page clears the stream.

Indexed logs

You can see logs that you have chosen to index and persist by selecting a corresponding timeframe. Indexing allows you to filter your logs using tags and facets. For example, to search for logs with an Error status, type status:error into the search box. Autocompletion can help you locate the particular tag that you want. Key attributes about your logs are already stored in tags, which enables you to search, filter, and aggregate as needed.

Preview Logs Side panel

Notes and known issues

  • Real-time (2s) data collection is turned off after 30 minutes. To resume real-time collection, refresh the page.
  • RBAC settings can restrict Kubernetes metadata collection. See the RBAC entities for the Datadog Agent.
  • In Kubernetes, the health value is the container’s readiness probe, not its liveness probe.
