Kubernetes State

Supported OS: Linux, Mac OS, Windows

Integration version: 10.1.0

Overview

Get metrics from the kubernetes_state service in real time to:

  • Visualize and monitor the state of your Kubernetes objects.
  • Be notified about kubernetes_state failovers and events.

Setup

Installation

The Kubernetes-State check is included in the Datadog Agent package, so you don’t need to install anything else on your Kubernetes servers.

Configuration

Edit the kubernetes_state.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory. See the sample kubernetes_state.d/conf.yaml for all available configuration options.
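
As a rough illustration of the step above, the sketch below builds the path to the file and renders a minimal configuration. The `kube_state_url` option name, the example URL, and the `/etc/datadog-agent` directory are assumptions not stated on this page; check them against the sample kubernetes_state.d/conf.yaml before use.

```python
from pathlib import Path


def conf_path(agent_config_dir: str) -> Path:
    """Path of the check's config file under the Agent's configuration directory."""
    return Path(agent_config_dir) / "conf.d" / "kubernetes_state.d" / "conf.yaml"


def render_conf(kube_state_url: str) -> str:
    """Render a minimal config file body.

    ASSUMPTION: `kube_state_url` is the instance option that points the check
    at a kube-state-metrics endpoint. Verify against the sample conf.yaml.
    """
    return (
        "init_config:\n"
        "\n"
        "instances:\n"
        f"  - kube_state_url: {kube_state_url}\n"
    )


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(conf_path("/etc/datadog-agent"))
    print(render_conf("http://example.com:8080/metrics"))
```

The sketch only prints the result; writing the file and restarting the Agent are left to whatever procedure your installation already uses.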

Validation

Run the Agent’s status subcommand and look for kubernetes_state under the Checks section.
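
The validation step can be scripted. The sketch below is a minimal example that assumes the Agent binary is available as `datadog-agent` on the PATH and that the status output contains a heading with the word "Checks" followed by one line per check. Both points are assumptions, and the helper names are hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def check_is_listed(status_output: str, check_name: str = "kubernetes_state") -> bool:
    """Return True if a line starting with check_name follows a 'Checks' heading."""
    in_checks = False
    for line in status_output.splitlines():
        stripped = line.strip()
        if not in_checks:
            # ASSUMPTION: the section heading contains the word "Checks".
            in_checks = "Checks" in stripped
            continue
        if stripped.startswith(check_name):
            return True
    return False


def agent_status_text() -> Optional[str]:
    """Run the Agent's status subcommand, or return None if the binary is missing."""
    if shutil.which("datadog-agent") is None:
        return None
    result = subprocess.run(
        ["datadog-agent", "status"], capture_output=True, text=True, check=False
    )
    return result.stdout
```

A caller would pass the text returned by `agent_status_text()` to `check_is_listed()` and treat a `False` result as a prompt to inspect the status output by hand, since the parsing here is deliberately loose.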

Data Collected

Metrics

kubernetes_state.container.ready
(gauge)
Whether the container's readiness check succeeded
kubernetes_state.container.running
(gauge)
Whether the container is currently in running state
kubernetes_state.container.terminated
(gauge)
Whether the container is currently in terminated state
kubernetes_state.container.status_report.count.terminated
(gauge)
Count of the containers currently reporting a terminated state, with the reason as a tag
kubernetes_state.container.waiting
(gauge)
Whether the container is currently in waiting state
kubernetes_state.container.status_report.count.waiting
(gauge)
Count of the containers currently reporting a waiting state, with the reason as a tag
kubernetes_state.container.gpu.request
(gauge)
The number of requested gpu devices by a container
kubernetes_state.container.gpu.limit
(gauge)
The limit on gpu devices to be used by a container
kubernetes_state.container.restarts
(gauge)
The number of restarts per container
kubernetes_state.container.cpu_requested
(gauge)
The number of requested cpu cores by a container
Shown as cpu
kubernetes_state.container.memory_requested
(gauge)
The number of requested memory bytes by a container
Shown as byte
kubernetes_state.container.cpu_limit
(gauge)
The limit on cpu cores to be used by a container
Shown as cpu
kubernetes_state.container.memory_limit
(gauge)
The limit on memory to be used by a container
Shown as byte
kubernetes_state.daemonset.scheduled
(gauge)
The number of nodes that are running at least one daemon pod and are supposed to
kubernetes_state.daemonset.misscheduled
(gauge)
The number of nodes that are running a daemon pod but are not supposed to
kubernetes_state.daemonset.desired
(gauge)
The number of nodes that should be running the daemon pod
kubernetes_state.daemonset.ready
(gauge)
The number of nodes that should be running the daemon pod and have one or more of the daemon pod running and ready
kubernetes_state.daemonset.updated
(gauge)
The number of nodes that run the updated daemon pod spec
kubernetes_state.deployment.count
(gauge)
The number of deployments
kubernetes_state.deployment.replicas
(gauge)
The number of replicas per deployment
kubernetes_state.deployment.replicas_available
(gauge)
The number of available replicas per deployment
kubernetes_state.deployment.replicas_unavailable
(gauge)
The number of unavailable replicas per deployment
kubernetes_state.deployment.replicas_updated
(gauge)
The number of updated replicas per deployment
kubernetes_state.deployment.replicas_desired
(gauge)
The number of desired replicas per deployment
kubernetes_state.deployment.paused
(gauge)
Whether a deployment is paused
kubernetes_state.deployment.rollingupdate.max_unavailable
(gauge)
Maximum number of unavailable replicas during a rolling update
kubernetes_state.endpoint.address_available
(gauge)
Number of addresses available in endpoint
kubernetes_state.endpoint.address_not_ready
(gauge)
Number of addresses not ready in endpoint
kubernetes_state.endpoint.created
(gauge)
Unix creation timestamp
kubernetes_state.job.count
(gauge)
The number of jobs
kubernetes_state.job.failed
(count)
Observed number of failed pods in a job
kubernetes_state.job.succeeded
(count)
Observed number of succeeded pods in a job
kubernetes_state.limitrange.cpu.min
(gauge)
Minimum CPU request for this type
kubernetes_state.limitrange.cpu.max
(gauge)
Maximum CPU limit for this type
kubernetes_state.limitrange.cpu.default
(gauge)
Default CPU limit if not specified
kubernetes_state.limitrange.cpu.default_request
(gauge)
Default CPU request if not specified
kubernetes_state.limitrange.cpu.max_limit_request_ratio
(gauge)
Maximum CPU limit / request ratio
kubernetes_state.limitrange.memory.min
(gauge)
Minimum memory request for this type
kubernetes_state.limitrange.memory.max
(gauge)
Maximum memory limit for this type
kubernetes_state.limitrange.memory.default
(gauge)
Default memory limit if not specified
kubernetes_state.limitrange.memory.default_request
(gauge)
Default memory request if not specified
kubernetes_state.limitrange.memory.max_limit_request_ratio
(gauge)
Maximum memory limit / request ratio
kubernetes_state.node.count
(count)
The number of nodes
Shown as node
kubernetes_state.node.cpu_capacity
(gauge)
The total CPU resources of the node
Shown as cpu
kubernetes_state.node.memory_capacity
(gauge)
The total memory resources of the node
Shown as byte
kubernetes_state.node.pods_capacity
(gauge)
The total pod resources of the node
kubernetes_state.node.gpu.cards_allocatable
(gauge)
The GPU resources of a node that are available for scheduling
kubernetes_state.node.gpu.cards_capacity
(gauge)
The total GPU resources of the node
kubernetes_state.persistentvolumeclaim.status
(gauge)
The phase the persistent volume claim is currently in
kubernetes_state.persistentvolumeclaim.request_storage
(gauge)
Storage space request for a given pvc
Shown as byte
kubernetes_state.persistentvolumes.by_phase
(gauge)
Number of persistent volumes; sum by phase and storageclass
kubernetes_state.namespace.count
(gauge)
The number of namespaces
kubernetes_state.node.cpu_allocatable
(gauge)
The CPU resources of a node that are available for scheduling
Shown as cpu
kubernetes_state.node.memory_allocatable
(gauge)
The memory resources of a node that are available for scheduling
Shown as byte
kubernetes_state.node.pods_allocatable
(gauge)
The pod resources of a node that are available for scheduling
kubernetes_state.node.status
(gauge)
Submitted with a value of 1 for each node and tagged either ‘status:schedulable’ or ‘status:unschedulable’. Sum this metric by status to get the number of nodes in each status.
kubernetes_state.node.by_condition
(gauge)
The condition of a cluster node
kubernetes_state.nodes.by_condition
(gauge)
Sum by condition and status to get the number of nodes in a given condition
kubernetes_state.hpa.min_replicas
(gauge)
Lower limit for the number of pods that can be set by the autoscaler
kubernetes_state.hpa.max_replicas
(gauge)
Upper limit for the number of pods that can be set by the autoscaler
kubernetes_state.hpa.desired_replicas
(gauge)
Desired number of replicas of pods managed by this autoscaler
kubernetes_state.hpa.condition
(gauge)
Observed condition of autoscalers; sum by condition and status
kubernetes_state.pdb.pods_desired
(gauge)
Minimum desired number of healthy pods
kubernetes_state.pdb.disruptions_allowed
(gauge)
Number of pod disruptions that are currently allowed
kubernetes_state.pdb.pods_healthy
(gauge)
Current number of healthy pods
kubernetes_state.pdb.pods_total
(gauge)
Total number of pods counted by this disruption budget
kubernetes_state.pod.ready
(gauge)
In association with the condition tag, whether the pod is ready to serve requests, e.g. condition:true keeps the pods that are in a ready state
kubernetes_state.pod.scheduled
(gauge)
Reports the status of the scheduling process for the pod with its tags
kubernetes_state.pod.unschedulable
(gauge)
Reports pods that the Kubernetes scheduler cannot schedule on any node
kubernetes_state.pod.status_phase
(gauge)
Sum by phase to get the number of pods in a given phase, and by namespace to break this down by namespace
kubernetes_state.replicaset.count
(gauge)
The number of replicasets
kubernetes_state.replicaset.replicas
(gauge)
The number of replicas per ReplicaSet
kubernetes_state.replicaset.fully_labeled_replicas
(gauge)
The number of fully labeled replicas per ReplicaSet
kubernetes_state.replicaset.replicas_ready
(gauge)
The number of ready replicas per ReplicaSet
kubernetes_state.replicaset.replicas_desired
(gauge)
Number of desired pods for a ReplicaSet
kubernetes_state.replicationcontroller.replicas
(gauge)
The number of replicas per ReplicationController
kubernetes_state.replicationcontroller.fully_labeled_replicas
(gauge)
The number of fully labeled replicas per ReplicationController
kubernetes_state.replicationcontroller.replicas_ready
(gauge)
The number of ready replicas per ReplicationController
kubernetes_state.replicationcontroller.replicas_desired
(gauge)
Number of desired replicas for a ReplicationController
kubernetes_state.replicationcontroller.replicas_available
(gauge)
The number of available replicas per ReplicationController
kubernetes_state.resourcequota.pods.used
(gauge)
Observed number of pods used for a resource quota
kubernetes_state.resourcequota.services.used
(gauge)
Observed number of services used for a resource quota
kubernetes_state.resourcequota.persistentvolumeclaims.used
(gauge)
Observed number of persistent volume claims used for a resource quota
kubernetes_state.resourcequota.services.nodeports.used
(gauge)
Observed number of node ports used for a resource quota
kubernetes_state.resourcequota.services.loadbalancers.used
(gauge)
Observed number of loadbalancers used for a resource quota
kubernetes_state.resourcequota.requests.cpu.used
(gauge)
Observed sum of CPU cores requested for a resource quota
Shown as cpu
kubernetes_state.resourcequota.requests.memory.used
(gauge)
Observed sum of memory bytes requested for a resource quota
Shown as byte
kubernetes_state.resourcequota.requests.storage.used
(gauge)
Observed sum of storage bytes requested for a resource quota
Shown as byte
kubernetes_state.resourcequota.limits.cpu.used
(gauge)
Observed sum of limits for CPU cores for a resource quota
Shown as cpu
kubernetes_state.resourcequota.limits.memory.used
(gauge)
Observed sum of limits for memory bytes for a resource quota
Shown as byte
kubernetes_state.resourcequota.pods.limit
(gauge)
Hard limit of the number of pods for a resource quota
kubernetes_state.resourcequota.services.limit
(gauge)
Hard limit of the number of services for a resource quota
kubernetes_state.resourcequota.persistentvolumeclaims.limit
(gauge)
Hard limit of the number of persistent volume claims for a resource quota
kubernetes_state.resourcequota.services.nodeports.limit
(gauge)
Hard limit of the number of node ports for a resource quota
kubernetes_state.resourcequota.services.loadbalancers.limit
(gauge)
Hard limit of the number of loadbalancers for a resource quota
kubernetes_state.resourcequota.requests.cpu.limit
(gauge)
Hard limit on the total of CPU cores requested for a resource quota
Shown as cpu
kubernetes_state.resourcequota.requests.memory.limit
(gauge)
Hard limit on the total of memory bytes requested for a resource quota
Shown as byte
kubernetes_state.resourcequota.requests.storage.limit
(gauge)
Hard limit on the total of storage bytes requested for a resource quota
Shown as byte
kubernetes_state.resourcequota.limits.cpu.limit
(gauge)
Hard limit on the sum of CPU core limits for a resource quota
Shown as cpu
kubernetes_state.resourcequota.limits.memory.limit
(gauge)
Hard limit on the sum of memory bytes limits for a resource quota
Shown as byte
kubernetes_state.service.count
(gauge)
Sum by namespace and type to count active services
kubernetes_state.statefulset.count
(gauge)
The number of statefulsets
kubernetes_state.statefulset.replicas
(gauge)
The number of replicas per statefulset
kubernetes_state.statefulset.replicas_desired
(gauge)
The number of desired replicas per statefulset
kubernetes_state.statefulset.replicas_current
(gauge)
The number of current replicas per StatefulSet
kubernetes_state.statefulset.replicas_ready
(gauge)
The number of ready replicas per StatefulSet
kubernetes_state.statefulset.replicas_updated
(gauge)
The number of updated replicas per StatefulSet
kubernetes_state.telemetry.payload.size
(gauge)
The message size received from kube-state-metrics
Shown as byte
kubernetes_state.telemetry.metrics.processed.count
(count)
The number of metrics processed
kubernetes_state.telemetry.metrics.input.count
(count)
The number of metrics received
kubernetes_state.telemetry.metrics.blacklist.count
(count)
The number of metrics blacklisted by the check
kubernetes_state.telemetry.metrics.ignored.count
(count)
The number of metrics ignored by the check
kubernetes_state.telemetry.collector.metrics.count
(count)
The number of metrics per collector (Kubernetes object kind) and per Kubernetes namespace
kubernetes_state.vpa.lower_bound
(gauge)
The vpa lower bound recommendation
kubernetes_state.vpa.target
(gauge)
The vpa target recommendation
kubernetes_state.vpa.uncapped_target
(gauge)
The vpa uncapped target recommendation
kubernetes_state.vpa.upperbound
(gauge)
The vpa upper bound recommendation
kubernetes_state.vpa.update_mode
(gauge)
The vpa update mode
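
Several descriptions in the list above (for example kubernetes_state.node.status and kubernetes_state.pod.status_phase) say to sum the metric by one or more tags. The sketch below shows what that aggregation means on a handful of hypothetical samples. The sample data, the `node` and `pod` tag names, and the `sum_by` helper are illustrative assumptions, not part of the integration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# One sample is (value, tags). The data below is made up for illustration.
Sample = Tuple[float, Dict[str, str]]


def sum_by(samples: Iterable[Sample], *keys: str) -> Dict[Tuple[str, ...], float]:
    """Sum sample values, grouped by the values of the given tag keys."""
    totals: Dict[Tuple[str, ...], float] = defaultdict(float)
    for value, tags in samples:
        group = tuple(tags.get(key, "") for key in keys)
        totals[group] += value
    return dict(totals)


# kubernetes_state.node.status: value 1 per node, tagged with its status.
node_status = [
    (1, {"node": "node-a", "status": "schedulable"}),
    (1, {"node": "node-b", "status": "schedulable"}),
    (1, {"node": "node-c", "status": "unschedulable"}),
]

# kubernetes_state.pod.status_phase: grouped by phase, then by namespace.
pod_phase = [
    (1, {"pod": "web-1", "phase": "Running", "namespace": "default"}),
    (1, {"pod": "web-2", "phase": "Running", "namespace": "default"}),
    (1, {"pod": "job-1", "phase": "Pending", "namespace": "batch"}),
]

if __name__ == "__main__":
    print(sum_by(node_status, "status"))
    # {('schedulable',): 2.0, ('unschedulable',): 1.0}
    print(sum_by(pod_phase, "phase", "namespace"))
    # {('Running', 'default'): 2.0, ('Pending', 'batch'): 1.0}
```

In practice this grouping is done by the query in your dashboard or monitor; the code only makes the arithmetic explicit.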

Events

The Kubernetes-State check does not include any events.

Service Checks

See ../kubernetes/assets/service_checks.json for a list of service checks provided by this integration.

Troubleshooting

Need help? Contact Datadog support.
