Kubernetes Metrics Server

Supported OS: Linux, Windows, macOS

Integration version: 5.0.0

Overview

This check monitors Kubernetes Metrics Server v0.3.0+, a component used by the Kubernetes control plane.

Setup

Installation

The kube_metrics_server check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

Host

To configure this check for an Agent running on a host:

  1. Edit the kube_metrics_server.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to start collecting your kube_metrics_server performance data. See the sample kube_metrics_server.d/conf.yaml for all available configuration options.

  2. Restart the Agent.
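
As a rough illustration of step 1, the snippet below renders a minimal configuration. It assumes the host-based check accepts the same prometheus_url option shown for containerized environments, and the localhost URL is only a placeholder. The sample kube_metrics_server.d/conf.yaml remains the authoritative list of options. The output is JSON, which YAML parsers generally accept.

```python
import json


def render_conf(prometheus_url):
    """Return text for a minimal conf.yaml, written as JSON."""
    conf = {
        "init_config": {},
        "instances": [{"prometheus_url": prometheus_url}],
    }
    return json.dumps(conf, indent=2)


if __name__ == "__main__":
    # Placeholder URL (assumption): replace with your Metrics Server endpoint.
    print(render_conf("https://localhost:443/metrics"))
```

Generating the file from a small function like this is optional; editing conf.yaml by hand works equally well.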

Containerized

For containerized environments, see the Kubernetes Autodiscovery Integration Templates for guidance on applying the parameters below.

Parameter             Value
<INTEGRATION_NAME>    kube_metrics_server
<INIT_CONFIG>         blank or {}
<INSTANCE_CONFIG>     {"prometheus_url": "https://%%host%%:443/metrics"}
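
The parameters above can be sketched as pod annotations. This example assumes the annotation-based form of Autodiscovery (keys prefixed with ad.datadoghq.com/ followed by the container name) and a container named metrics-server. Both are assumptions to confirm against the Kubernetes Autodiscovery Integration Templates documentation and your own pod spec.

```python
import json


def autodiscovery_annotations(container_name):
    """Build Autodiscovery pod annotations from the three template parameters."""
    prefix = f"ad.datadoghq.com/{container_name}"
    instance = {"prometheus_url": "https://%%host%%:443/metrics"}
    return {
        f"{prefix}.check_names": json.dumps(["kube_metrics_server"]),
        f"{prefix}.init_configs": json.dumps([{}]),
        f"{prefix}.instances": json.dumps([instance]),
    }


if __name__ == "__main__":
    # "metrics-server" is an assumed container name; use the one from your pod spec.
    for key, value in autodiscovery_annotations("metrics-server").items():
        print(f"{key}: '{value}'")
```

Each annotation value is a JSON list, so one pod can carry several checks by adding entries at the same index in all three lists.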

SSL

If your endpoint is secured, additional configuration is required:

  1. Identify the certificate used for securing the metric endpoint.

  2. Mount the related certificate file in the Agent pod.

  3. Apply your SSL configuration. See the default configuration file for more information.
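
A minimal sketch of step 3 follows. The option names ssl_ca_cert and ssl_verify are assumptions based on options commonly exposed by Datadog's OpenMetrics-based checks, and the certificate path is hypothetical. Confirm the exact option names in the default configuration file for your Agent version before using them.

```python
import json


def secured_instance(prometheus_url, ca_cert_path):
    """Return an instance config that points the check at a mounted CA certificate."""
    return {
        "prometheus_url": prometheus_url,
        # Assumed option names; verify them in the default configuration file.
        "ssl_ca_cert": ca_cert_path,
        "ssl_verify": True,
    }


if __name__ == "__main__":
    # Hypothetical path where the certificate is mounted in the Agent pod.
    instance = secured_instance(
        "https://%%host%%:443/metrics", "/etc/metrics-server/ca.crt"
    )
    print(json.dumps([instance]))
```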

Validation

Run the Agent’s status subcommand and look for kube_metrics_server under the Checks section.
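
For scripted validation, a small helper along these lines could scan the status output for the check name. It is only a sketch: the helper names are hypothetical, the search covers the whole output rather than just the Checks section, and the datadog-agent command name is an assumption that varies by platform and install method.

```python
import subprocess


def check_is_listed(status_text, check_name="kube_metrics_server"):
    """Return True if check_name appears on any line of the status output."""
    return any(check_name in line for line in status_text.splitlines())


def agent_status_text(command=("datadog-agent", "status")):
    """Run the Agent's status subcommand and return its standard output.

    The default command is an assumption; adjust it for your environment.
    """
    result = subprocess.run(list(command), capture_output=True, text=True)
    return result.stdout
```

Calling check_is_listed(agent_status_text()) returns True when the check name appears in the output.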

Data Collected

Metrics

kube_metrics_server.authenticated_user.requests
(count)
Counter of authenticated requests broken out by username
kube_metrics_server.go.gc_duration_seconds.count
(gauge)
Number of GC invocations
kube_metrics_server.go.gc_duration_seconds.quantile
(gauge)
Quantiles of GC invocation durations
kube_metrics_server.go.gc_duration_seconds.sum
(gauge)
Sum of GC invocation durations
kube_metrics_server.go.goroutines
(gauge)
Number of goroutines that currently exist
kube_metrics_server.kubelet_summary_request_duration.count
(gauge)
Number of Kubelet summary requests
kube_metrics_server.kubelet_summary_request_duration.sum
(gauge)
Sum of Kubelet summary request latencies
kube_metrics_server.kubelet_summary_scrapes_total
(count)
Total number of attempted Summary API scrapes done by Metrics Server
kube_metrics_server.manager_tick_duration.count
(gauge)
Number of observations of the time spent collecting and storing metrics
kube_metrics_server.manager_tick_duration.sum
(gauge)
The total time spent collecting and storing metrics
kube_metrics_server.process.max_fds
(gauge)
Maximum number of open file descriptors
kube_metrics_server.process.open_fds
(gauge)
Number of open file descriptors
kube_metrics_server.scraper_duration.count
(gauge)
Number of observations of the time spent scraping sources
kube_metrics_server.scraper_duration.sum
(gauge)
Time spent scraping sources
kube_metrics_server.scraper_last_time
(gauge)
Time of the last scrape performed by metrics-server, measured since the Unix epoch
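
The paired .sum and .count metrics can be combined into an average duration per observation. The sketch below assumes both values are cumulative, as is usual for Prometheus summaries and histograms, and divides the change in .sum by the change in .count between two samples. The function name and the sample values are illustrative only.

```python
def average_duration(sum_before, count_before, sum_after, count_after):
    """Average duration per observation between two samples of .sum and .count."""
    observations = count_after - count_before
    if observations <= 0:
        # No new observations, or the cumulative values were reset.
        return None
    return (sum_after - sum_before) / observations


if __name__ == "__main__":
    # Illustrative values, not real measurements: 3.0 extra time units over 30 requests.
    print(average_duration(12.0, 100, 15.0, 130))
```

The same calculation applies to any of the .sum and .count pairs listed above, such as kube_metrics_server.kubelet_summary_request_duration.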

Events

kube_metrics_server does not include any events.

Service Checks

kube_metrics_server.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint.
Statuses: ok, critical

kube_metrics_server.up
Returns CRITICAL if Kubernetes Metrics Server is not healthy.
Statuses: ok, critical

Troubleshooting

Need help? Contact Datadog support.
