This page dives into the OpenMetricsBaseCheck interface for more advanced usage, including an example of a simple check that collects timing metrics and status events from Kube DNS. For details on configuring a basic OpenMetrics check, see Kubernetes Prometheus and OpenMetrics metrics collection.
If you have more advanced needs than the generic check, such as metrics preprocessing, you can write a custom OpenMetricsBaseCheck. It's the base class of the generic check, and it provides a structure and some helpers to collect metrics, events, and service checks exposed with Prometheus. The minimal configuration for checks based on this class includes:
- A default instance with a namespace and a metrics mapping.
- An implementation of the check() method, and/or
- A method named after the OpenMetrics metric it handles (see self.prometheus_metric_name).
This is a simple example of writing a Kube DNS check to illustrate usage of the OpenMetricsBaseCheck class. The example below replicates the functionality of the following generic Prometheus check:
instances:
  - prometheus_url: http://localhost:10055/metrics
    namespace: "kubedns"
    metrics:
      - kubedns_kubedns_dns_response_size_bytes: response_size.bytes
      - kubedns_kubedns_dns_request_duration_seconds: request_duration.seconds
      - kubedns_kubedns_dns_request_count_total: request_count
      - kubedns_kubedns_dns_error_count_total: error_count
      - kubedns_kubedns_dns_cachemiss_count_total: cachemiss_count
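To make the effect of the namespace and metrics mapping concrete, here is a minimal, library-free sketch of the naming rule: a mapped Prometheus metric name becomes a Datadog metric name prefixed with the namespace. The helper `to_datadog_name` is hypothetical and exists only for illustration; it is not part of the Agent API, and the real check may append further suffixes for some metric types.

```python
# Illustrative only: `to_datadog_name` is a hypothetical helper, not an Agent API.
NAMESPACE = "kubedns"

METRICS_MAP = {
    "kubedns_kubedns_dns_response_size_bytes": "response_size.bytes",
    "kubedns_kubedns_dns_request_duration_seconds": "request_duration.seconds",
    "kubedns_kubedns_dns_request_count_total": "request_count",
    "kubedns_kubedns_dns_error_count_total": "error_count",
    "kubedns_kubedns_dns_cachemiss_count_total": "cachemiss_count",
}


def to_datadog_name(prometheus_name, namespace=NAMESPACE, mapping=METRICS_MAP):
    """Return the namespaced Datadog name, or None if the metric is not mapped."""
    mapped = mapping.get(prometheus_name)
    if mapped is None:
        return None  # in this sketch, unmapped metrics are simply not collected
    return "{}.{}".format(namespace, mapped)


print(to_datadog_name("kubedns_kubedns_dns_request_count_total"))
```

Under these assumptions, the request counter would be reported as kubedns.request_count.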
The names of the check file and the configuration file must match: if your check is named mycheck.py, your configuration file must be named mycheck.yaml.
Configuration for a Prometheus check is almost the same as for a regular Agent check. The main difference is that you include the variable prometheus_url in your check.yaml file. This goes into conf.d/kube_dns.yaml:
init_config:

instances:
    # URL of the metrics endpoint of Prometheus
  - prometheus_url: http://localhost:10055/metrics
All OpenMetrics checks inherit from the OpenMetricsBaseCheck class:
from datadog_checks.base import OpenMetricsBaseCheck

class KubeDNSCheck(OpenMetricsBaseCheck):
    pass  # placeholder body; the constructor and check() are filled in by the following steps
Next, define a metrics mapping in the constructor:
from datadog_checks.base import OpenMetricsBaseCheck

class KubeDNSCheck(OpenMetricsBaseCheck):
    def __init__(self, name, init_config, instances=None):
        # Maps Prometheus metric names to Datadog metric names
        METRICS_MAP = {
            # metrics have been renamed to kubedns in Kubernetes 1.6.0
            'kubedns_kubedns_dns_response_size_bytes': 'response_size.bytes',
            'kubedns_kubedns_dns_request_duration_seconds': 'request_duration.seconds',
            'kubedns_kubedns_dns_request_count_total': 'request_count',
            'kubedns_kubedns_dns_error_count_total': 'error_count',
            'kubedns_kubedns_dns_cachemiss_count_total': 'cachemiss_count'
        }
A default instance is the basic configuration used for the check. The default instance should override namespace, metrics, and prometheus_url.
Note: This example overrides the default values of some OpenMetricsBaseCheck config options, so that metric behavior correlates more closely between Prometheus and Datadog metric types.
from datadog_checks.base import OpenMetricsBaseCheck

class KubeDNSCheck(OpenMetricsBaseCheck):
    def __init__(self, name, init_config, instances=None):
        METRICS_MAP = {
            # metrics have been renamed to kubedns in Kubernetes 1.6.0
            'kubedns_kubedns_dns_response_size_bytes': 'response_size.bytes',
            'kubedns_kubedns_dns_request_duration_seconds': 'request_duration.seconds',
            'kubedns_kubedns_dns_request_count_total': 'request_count',
            'kubedns_kubedns_dns_error_count_total': 'error_count',
            'kubedns_kubedns_dns_cachemiss_count_total': 'cachemiss_count'
        }
        super(KubeDNSCheck, self).__init__(
            name,
            init_config,
            instances,
            default_instances={
                'kubedns': {
                    'prometheus_url': 'http://localhost:8404/metrics',
                    'namespace': 'kubedns',
                    # Must match the name defined above (METRICS_MAP)
                    'metrics': [METRICS_MAP],
                    'send_histograms_buckets': True,
                    'send_distribution_counts_as_monotonic': True,
                    'send_distribution_sums_as_monotonic': True,
                }
            },
            default_namespace='kubedns',
        )
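As a rough mental model, and assuming that values set in an instance's configuration file take precedence over the default instance, the effective configuration can be pictured as a dictionary merge. The helper `effective_config` below is hypothetical and runs without the Agent; the base class performs its own merging, which may differ in detail (for example, in how the metrics list is combined).

```python
# Illustrative only: `effective_config` is a hypothetical helper, not an Agent API.
DEFAULT_INSTANCE = {
    "prometheus_url": "http://localhost:8404/metrics",
    "namespace": "kubedns",
    "send_histograms_buckets": True,
}


def effective_config(instance, default_instance=DEFAULT_INSTANCE):
    """Start from the defaults, then let instance values override them."""
    merged = dict(default_instance)  # copy, so the defaults are never mutated
    merged.update(instance)
    return merged


# Assumed user configuration, mirroring conf.d/kube_dns.yaml
user_instance = {"prometheus_url": "http://localhost:10055/metrics"}
config = effective_config(user_instance)

print(config["prometheus_url"])  # value taken from the instance
print(config["namespace"])       # value taken from the default instance
```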
If you want to implement additional features, override the check() function.
From instance, read prometheus_url into endpoint, which is the Prometheus or OpenMetrics metrics endpoint to poll metrics from:
def check(self, instance):
    endpoint = instance.get('prometheus_url')
If a check cannot run because of improper configuration, a programming error, or because it could not collect any metrics, it should raise a meaningful exception. This exception is logged and is shown in the Agent status command for debugging. For example:
$ sudo /etc/init.d/datadog-agent info

  Checks
  ======

    my_custom_check
    ---------------
      - instance #0 [ERROR]: Unable to find prometheus_url in config file.
      - Collected 0 metrics & 0 events
Improve your check() method with ConfigurationError:
from datadog_checks.base import ConfigurationError

def check(self, instance):
    endpoint = instance.get('prometheus_url')
    if endpoint is None:
        raise ConfigurationError("Unable to find prometheus_url in config file.")
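The validation pattern can be tried outside the Agent with a small stand-alone sketch. The `ConfigurationError` class defined here is a local stand-in for the one imported from datadog_checks.base, and `get_endpoint` is a hypothetical helper introduced only for this illustration.

```python
class ConfigurationError(Exception):
    """Local stand-in for datadog_checks.base.ConfigurationError (assumption)."""


def get_endpoint(instance):
    """Return the configured endpoint, or raise a meaningful error if it is missing."""
    endpoint = instance.get("prometheus_url")
    if endpoint is None:
        raise ConfigurationError("Unable to find prometheus_url in config file.")
    return endpoint


# One valid instance and one missing the required key
for instance in ({"prometheus_url": "http://localhost:10055/metrics"}, {}):
    try:
        print("polling", get_endpoint(instance))
    except ConfigurationError as exc:
        print("configuration error:", exc)
```

Raising early with a specific message keeps the failure visible in the status output instead of silently collecting nothing.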
Then as soon as you have data available, flush:
from datadog_checks.base import ConfigurationError

def check(self, instance):
    endpoint = instance.get('prometheus_url')
    if endpoint is None:
        raise ConfigurationError("Unable to find prometheus_url in config file.")
    # Poll the endpoint and submit the collected metrics
    self.process(instance)
Putting it all together:
from datadog_checks.base import ConfigurationError, OpenMetricsBaseCheck

class KubeDNSCheck(OpenMetricsBaseCheck):
    """
    Collect kube-dns metrics from Prometheus endpoint
    """
    def __init__(self, name, init_config, instances=None):
        METRICS_MAP = {
            # metrics have been renamed to kubedns in Kubernetes 1.6.0
            'kubedns_kubedns_dns_response_size_bytes': 'response_size.bytes',
            'kubedns_kubedns_dns_request_duration_seconds': 'request_duration.seconds',
            'kubedns_kubedns_dns_request_count_total': 'request_count',
            'kubedns_kubedns_dns_error_count_total': 'error_count',
            'kubedns_kubedns_dns_cachemiss_count_total': 'cachemiss_count'
        }
        super(KubeDNSCheck, self).__init__(
            name,
            init_config,
            instances,
            default_instances={
                'kubedns': {
                    'prometheus_url': 'http://localhost:8404/metrics',
                    'namespace': 'kubedns',
                    # Must match the name defined above (METRICS_MAP)
                    'metrics': [METRICS_MAP],
                    'send_histograms_buckets': True,
                    'send_distribution_counts_as_monotonic': True,
                    'send_distribution_sums_as_monotonic': True,
                }
            },
            default_namespace='kubedns',
        )

    def check(self, instance):
        endpoint = instance.get('prometheus_url')
        if endpoint is None:
            raise ConfigurationError("Unable to find prometheus_url in config file.")
        # Poll the endpoint and submit the collected metrics
        self.process(instance)
To read more about Prometheus and OpenMetrics base integrations, see the integrations developer docs.
You can improve your OpenMetrics check by including default values for additional configuration options:
- ignore_metrics: Metrics included in this list are silently skipped, without an "Unable to handle metric" debug line in the logs.
- labels_mapper: If the labels_mapper dictionary is provided, the metric labels listed in labels_mapper use the corresponding value as the tag name when sending the gauges.
- exclude_labels: An array of labels to exclude. Those labels are not added as tags when submitting the metric.
- type_overrides: A dictionary where the keys are Prometheus or OpenMetrics metric names, and the values are a metric type (name as string) to use instead of the one listed in the payload. This can be used to force a type on untyped metrics. Available types are: counter, gauge, summary, untyped, and histogram.
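The combined effect of these four options on a single scraped sample can be sketched as follows. This is a simplified, library-free illustration: the option values, the sample metric, and the `transform` helper are all hypothetical, metric names are matched exactly, and the base class's own processing may differ.

```python
# Hypothetical option values, for illustration only.
IGNORE_METRICS = ["go_gc_duration_seconds"]
LABELS_MAPPER = {"system": "dns_system"}
EXCLUDE_LABELS = ["instance"]
TYPE_OVERRIDES = {"kubedns_kubedns_dns_error_count_total": "counter"}


def transform(name, metric_type, labels):
    """Return (name, type, tags) after applying the options, or None if ignored."""
    if name in IGNORE_METRICS:
        return None  # skipped silently
    # Use the overridden type when one is configured for this metric
    metric_type = TYPE_OVERRIDES.get(name, metric_type)
    tags = []
    for label, value in sorted(labels.items()):
        if label in EXCLUDE_LABELS:
            continue  # excluded labels never become tags
        tag_name = LABELS_MAPPER.get(label, label)  # rename the label if mapped
        tags.append("{}:{}".format(tag_name, value))
    return name, metric_type, tags


sample = transform(
    "kubedns_kubedns_dns_error_count_total",
    "untyped",
    {"system": "reverse", "instance": "10.0.0.1:10055"},
)
print(sample)
```

In this sketch the untyped sample is forced to a counter, the instance label is dropped, and the system label is submitted as the dns_system tag.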