Kubernetes Metrics Server
Overview
This check monitors Kubernetes Metrics Server v0.3.0+, a component used by the Kubernetes control plane.
Setup
Installation
The Kubernetes Metrics Server check is included in the Datadog Agent package. No additional installation is needed on your server.
Configuration
Host
To configure this check for an Agent running on a host:
Edit the kube_metrics_server.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to start collecting your Kubernetes Metrics Server performance data. See the sample kube_metrics_server.d/conf.yaml for all available configuration options.
Restart the Agent.
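As a rough illustration of the host setup above, a minimal kube_metrics_server.d/conf.yaml might look like the sketch below. The prometheus_url key is the one used in the containerized parameters of this page; <METRICS_SERVER_HOST> is a placeholder for your own endpoint, not a value from the integration. The sample configuration file remains the reference for the full option list.

```yaml
# kube_metrics_server.d/conf.yaml (minimal sketch, not the full sample file)
init_config:

instances:
    # Assumption: <METRICS_SERVER_HOST> is a placeholder for the address
    # where your Metrics Server exposes its metrics endpoint.
  - prometheus_url: "https://<METRICS_SERVER_HOST>:443/metrics"
```

Restart the Agent after saving the file so the change is picked up.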
Containerized
For containerized environments, see the Kubernetes Autodiscovery Integration Templates for guidance on applying the parameters below.
Parameter | Value |
--- | --- |
<INTEGRATION_NAME> | kube_metrics_server |
<INIT_CONFIG> | blank or {} |
<INSTANCE_CONFIG> | {"prometheus_url": "https://%%host%%:443/metrics"} |
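One way to apply these parameters is through Autodiscovery pod annotations on the Metrics Server pod. The fragment below is a sketch under assumptions: the container is assumed to be named metrics-server (the annotation key must match your actual container name), and only the metadata portion of the pod template is shown.

```yaml
# Pod template metadata fragment (sketch).
# Assumption: the Metrics Server container is named "metrics-server";
# replace that segment of each annotation key with your container name.
metadata:
  annotations:
    ad.datadoghq.com/metrics-server.check_names: '["kube_metrics_server"]'
    ad.datadoghq.com/metrics-server.init_configs: '[{}]'
    ad.datadoghq.com/metrics-server.instances: '[{"prometheus_url": "https://%%host%%:443/metrics"}]'
```

The %%host%% template variable is resolved by the Agent at runtime to the address of the discovered container.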
SSL
If your endpoint is secured, additional configuration is required:
Identify the certificate used for securing the metric endpoint.
Mount the related certificate file in the Agent pod.
Apply your SSL configuration. See the default configuration file for more information.
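The SSL steps above could translate into an instance configuration along the lines of the sketch below. The option names ssl_ca_cert and ssl_verify are the ones commonly exposed by the Agent's Prometheus/OpenMetrics-based checks and are assumed to apply here; confirm them against the default configuration file. The certificate path is hypothetical and depends on where you mount the file in the Agent pod.

```yaml
# Sketch of an SSL-enabled instance. Option names are assumed;
# verify them against the default configuration file.
init_config:

instances:
  - prometheus_url: "https://%%host%%:443/metrics"
    # Hypothetical mount path of the CA certificate inside the Agent pod.
    ssl_ca_cert: /etc/datadog-agent/certs/metrics-server-ca.crt
    # Keep certificate verification enabled when a CA certificate is supplied.
    ssl_verify: true
```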
Validation
Run the Agent’s status subcommand and look for kube_metrics_server under the Checks section.
Data Collected
Metrics
Metric (type) | Description |
--- | --- |
kube_metrics_server.authenticated_user.requests (count) | Counter of authenticated requests, broken out by username |
kube_metrics_server.go.gc_duration_seconds.count (gauge) | Number of GC invocations |
kube_metrics_server.go.gc_duration_seconds.quantile (gauge) | Quantiles of GC invocation durations |
kube_metrics_server.go.gc_duration_seconds.sum (gauge) | Sum of GC invocation durations |
kube_metrics_server.go.goroutines (gauge) | Number of goroutines that currently exist |
kube_metrics_server.kubelet_summary_request_duration.count (gauge) | Number of Kubelet summary requests |
kube_metrics_server.kubelet_summary_request_duration.sum (gauge) | Sum of Kubelet summary request latencies |
kube_metrics_server.kubelet_summary_scrapes_total (count) | Total number of attempted Summary API scrapes done by Metrics Server |
kube_metrics_server.manager_tick_duration.count (gauge) | Number of observations of the time spent collecting and storing metrics |
kube_metrics_server.manager_tick_duration.sum (gauge) | Total time spent collecting and storing metrics |
kube_metrics_server.process.max_fds (gauge) | Maximum number of open file descriptors |
kube_metrics_server.process.open_fds (gauge) | Number of open file descriptors |
kube_metrics_server.scraper_duration.count (gauge) | Number of observations of the time spent scraping sources |
kube_metrics_server.scraper_duration.sum (gauge) | Total time spent scraping sources |
kube_metrics_server.scraper_last_time (gauge) | Time of the last scrape performed by metrics-server, expressed as time since the Unix epoch |
Events
The Kubernetes Metrics Server check does not include any events.
Service Checks
kube_metrics_server.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint.
Statuses: ok, critical
kube_metrics_server.up
Returns CRITICAL if Kubernetes Metrics Server is not healthy.
Statuses: ok, critical
Troubleshooting
Need help? Contact Datadog support.