The Datadog Agent and Cluster Agent can retrieve Kubernetes resources for the Orchestrator Explorer. This feature allows you to monitor the state of pods, deployments, and other Kubernetes concepts in a specific namespace or availability zone, view resource specifications for failed pods within a deployment, correlate node activity with related logs, and more.
Orchestrator Explorer requires Agent version >= 7.27.0 and Cluster Agent version >= 1.11.0.
Note: For Kubernetes version 1.25 and above, the minimum required Cluster Agent version is 7.40.0.
Ensure that you have enabled the Process Agent. If you are using Datadog Operator or the official Helm chart, the Orchestrator Explorer is enabled by default.
Toggle among the Pods, Clusters, Namespaces, and other Kubernetes resources in the Select Resources dropdown menu in the top left corner of the page.
Each of these views includes a data table to help you organize your data by fields such as status, name, and Kubernetes labels, and a detailed Cluster Map to give you a bigger picture of your pods and Kubernetes clusters.
Group pods by tags, Kubernetes labels, or Kubernetes annotations to get an aggregated view that helps you find information more quickly. To group resources, use the “Group by” bar at the top right of the page, or click a particular tag or label and select the group-by function in the context menu.
You can also use facets on the left-hand side of the page to group resources or filter for the resources you care most about, such as pods with a CrashLoopBackOff pod status.
A cluster map gives you a bigger picture of your pods and Kubernetes clusters. You can see all of your resources together on one screen with customized groups and filters, and choose which metric sets the fill color of the nodes.
Examine resources from cluster maps by clicking on any circle or group to populate a detailed panel.
Click on any row in the table or on any object in a Cluster Map to view information about a specific resource in a side panel.
The side panel’s YAML tab shows the full resource definition. Starting in Agent version 7.44.0, it also includes seven days of definition history. You can compare what changed over time and across different versions. The time indicated is approximately when the changes were applied to the resource.
To prevent displaying a large number of irrelevant changes, updates affecting only the following fields are ignored:
metadata.resourceVersion
metadata.managedFields
metadata.generation
status
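The filtering rule above can be illustrated with a short Python sketch. It is not Datadog's implementation. It only shows the idea of dropping the ignored fields before comparing two versions of a definition. The helper names strip_ignored and is_relevant_change are hypothetical, and resource definitions are assumed to be plain dictionaries parsed from YAML.

```python
import copy

# Fields whose changes are ignored, as listed above.
IGNORED_PATHS = [
    ("metadata", "resourceVersion"),
    ("metadata", "managedFields"),
    ("metadata", "generation"),
    ("status",),
]


def strip_ignored(definition: dict) -> dict:
    """Return a deep copy of a resource definition without the ignored fields."""
    cleaned = copy.deepcopy(definition)
    for path in IGNORED_PATHS:
        parent = cleaned
        # Walk down to the dictionary that holds the final key.
        for key in path[:-1]:
            parent = parent.get(key)
            if not isinstance(parent, dict):
                break  # The parent is missing, so there is nothing to remove.
        else:
            parent.pop(path[-1], None)
    return cleaned


def is_relevant_change(old: dict, new: dict) -> bool:
    """True when two definitions differ outside the ignored fields."""
    return strip_ignored(old) != strip_ignored(new)
```

With this sketch, two definitions that differ only in metadata.resourceVersion or status compare as unchanged, while a change to spec is reported.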
The other tabs show more information for troubleshooting the selected resource:
Logs: View logs from your container or resource. Click on any log to view related logs in the Log Explorer.
APM: View traces from your container or resource, including the date, service, duration, method, and status code of a trace.
Metrics: View live metrics for your container or resource. You can view any graph full screen, share a snapshot of it, or export it from this tab.
Processes: View all processes running in the container of this resource.
Network: View a container or resource’s network performance, including source, destination, sent and received volume, and throughput fields. Use the Destination field to search by tags like DNS or ip_type, or use the Group by filter in this view to group network data by tags, like pod_name or service.
Events: View all Kubernetes events for your resource.
Monitors: View monitors tagged, scoped, or grouped for this resource.
For a detailed dashboard of this resource, click View Dashboard in the top right corner of this panel.
Tags: Attached to resources by the Agent collecting them. There are also additional tags that Datadog generates for Kubernetes resources.
datacenter:staging tag#datacenter:staging (the tag# is optional)
Labels: Extracted from a resource’s metadata. They are typically used to organize your cluster and target specific resources with selectors.
label#chart_version:2.1.0
Annotations: Extracted from a resource’s metadata. They are generally used to support tooling that aids in cluster management.
annotation#checksum/configmap:a1bc23d4
Metrics: Added to workload resources (pods, deployments, etc.). You can find resources based on their utilization. To see what metrics are supported, see Resource Utilization Filters.
metric#cpu_usage_pct_limits_avg15:>80%
String matching: Supported by specific resource attributes, listed below. Note: String matching does not use the key-value format, and you cannot specify the attribute to match on.
Note: You might find the same key-value pair as both a tag and a label (or annotation), depending on how your cluster is configured.
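As a rough illustration of the key-value formats above, the following Python sketch assembles a single search term from its parts. The function query_term and its parameters are hypothetical and not part of any Datadog library. The only facts it encodes are the prefixes shown in the examples and that tag# is optional.

```python
# Prefixes taken from the examples above; "tag#" may be omitted for tags.
PREFIXES = {
    "tag": "tag#",
    "label": "label#",
    "annotation": "annotation#",
    "metric": "metric#",
}


def query_term(kind: str, key: str, value: str, explicit_tag_prefix: bool = False) -> str:
    """Build one key-value search term, for example label#chart_version:2.1.0."""
    if kind not in PREFIXES:
        raise ValueError(f"unsupported term kind: {kind}")
    prefix = PREFIXES[kind]
    if kind == "tag" and not explicit_tag_prefix:
        prefix = ""  # The tag# prefix is optional.
    return f"{prefix}{key}:{value}"


print(query_term("tag", "datacenter", "staging"))
print(query_term("label", "chart_version", "2.1.0"))
print(query_term("metric", "cpu_usage_pct_limits_avg15", ">80%"))
```

String matching is left out of the sketch on purpose, because it does not use the key-value format.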
The following resource attributes are supported in arbitrary String Matching:
metadata.name
metadata.uid
IP Addresses found in:
Pods
Nodes (internal and external)
Services (cluster, external, and load balancer IPs)
You do not need to specify a key to search for a resource by name or IP. Quotes are not required unless your string search includes certain special characters.
In addition to the tags you have configured within your Datadog Agent, Datadog injects generated tags based on resource attributes to help with searching and grouping. These tags are added to resources conditionally, when they are relevant.
All resources have the kube_cluster_name tag and all namespaced resources have the kube_namespace tag added to them.
Additionally, resources contain a kube_<api_kind>:<metadata.name> tag. For example, a deployment named web-server-2 would have the kube_deployment:web-server-2 tag automatically added to it.
Note: There are some exceptions to this pattern:
Pods use pod_name instead.
VPAs: verticalpodautoscaler.
HPAs: horizontalpodautoscaler.
Persistent Volume Claims: persistentvolumeclaim.
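The naming pattern and its exceptions can be summarized in a small Python sketch. This is an illustration only. It reads each exception above as the full tag key, which is an interpretation of the list, and the function name_tag is hypothetical. The kind argument is expected exactly as it appears in the tag (for example, deployment); how other kinds are spelled is not covered here, so check autocomplete for the actual keys.

```python
# Tag keys that do not follow the kube_<api_kind> pattern, per the list above.
EXCEPTION_KEYS = {
    "pod": "pod_name",
    "verticalpodautoscaler": "verticalpodautoscaler",
    "horizontalpodautoscaler": "horizontalpodautoscaler",
    "persistentvolumeclaim": "persistentvolumeclaim",
}


def name_tag(tag_kind: str, name: str) -> str:
    """Return the generated name tag for a resource, such as kube_deployment:web-server-2."""
    key = EXCEPTION_KEYS.get(tag_kind, f"kube_{tag_kind}")
    return f"{key}:{name}"


print(name_tag("deployment", "web-server-2"))
print(name_tag("pod", "web-server-2-abcde"))
```

The pod name web-server-2-abcde is made up for the example.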
Based on the labels attached to the resource, the following tags will also be extracted:
Related Resources will be tagged with each other. Some examples:
A pod that is part of the “XYZ” deployment will have a kube_deployment:xyz tag.
An ingress that points at service “A” will have a kube_service:a tag.
Resources that are spawned from “parent” resources will have the kube_ownerref_kind and kube_ownerref_name tags (such as pods and jobs).
Tip: Use the filter query autocomplete feature to discover which related resource tags are available. Type kube_ and review the suggested results.
Workload resources (pods, deployments, stateful sets, etc.) will have the following tags, indicating their support within the Resources Utilization page:
Some conditions, for some resources, are extracted as tags. For example, you can find the kube_condition_available tag on deployments. The tag format is always kube_condition_<name> with a true or false value.
Tip: Use the autocomplete feature to discover what conditions are available on a given resource type by entering kube_condition and reviewing the results.
Some resources have specific tags that are extracted based on your cluster’s environment. The following tags are available in addition to the shared tags above.
The following workload resources are enriched with resource utilization metrics:
Clusters
Daemonsets
Deployments
Nodes
Pods
Replica Sets
Stateful Sets
These metrics are calculated at the time of collection, based on the average values over the last 15 minutes. You can filter by metric values like so: metric#<metric_name><comparator><numeric_value>.
Percents (*_pct_*) are stored as floats, where 0.0 is 0% and 1.0 is 100%. The value is the ratio of the two indicated metrics; for example, cpu_usage_pct_limits_avg15 is the value of usage / limits. Metric values can be above 100%, such as Percentage CPU Usage of Requests.
Data is updated automatically at regular intervals.
In clusters with 1000+ Deployments or ReplicaSets, you may notice elevated CPU usage from the Cluster Agent. There is an option to disable container scrubbing in the Helm chart. See the Helm Chart repo for more details.
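To make the ratio concrete, here is a minimal Python sketch of how a percent metric relates to the two values it is derived from. The function names and the sample numbers (CPU in millicores) are assumptions for illustration. Datadog computes these values for you at collection time.

```python
def utilization_ratio(usage: float, reference: float) -> float:
    """Return usage / reference as a float, where 1.0 means 100%.

    The reference is the limit or request that the metric name indicates.
    """
    if reference <= 0:
        raise ValueError("reference must be positive")
    return usage / reference


def as_percent(ratio: float) -> str:
    """Format a stored ratio the way it reads as a percentage."""
    return f"{ratio * 100:.0f}%"


# A hypothetical pod using 400 millicores against a 500 millicore limit.
print(as_percent(utilization_ratio(400, 500)))  # prints 80%

# Usage can exceed requests, so the ratio can be above 1.0.
print(as_percent(utilization_ratio(750, 500)))  # prints 150%
```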