CoreDNS

Supported OS Linux

통합 버전3.2.2

개요

실시간으로 CoreDNS에서 메트릭을 받아 DNS 실패와 캐시 적중 및 캐시 누락을 가시화 및 모니터링하세요.

설정

버전 1.11.0부터 본 OpenMetrics 기반 통합에는 최신 모드(openmetrics_endpoint가 대상 엔드포인트를 가리키도록 설정하여 활성화됨)와 레거시 모드(prometheus_url를 대신 설정하여 활성화됨)가 있습니다. Datadog에서는 최신 모드를 활성화해 최신 기능을 사용할 것을 권장합니다. 자세한 내용을 확인하려면 OpenMetrics 기반 통합의 최신 및 레거시 버전 관리를 참고하세요.

CoreDNS 점검 최신 모드를 사용하려면 Python 3이 필요하며 .bucket 메트릭을 제출하고 .sum.count 히스토그램 샘플을 단조 카운트 유형으로 제출합니다. 이전에는 이 메트릭이 레거시 모드에서 gauge 유형으로 제출되었습니다. 각 모드에서 사용할 수 있는 메트릭 목록은 metadata.csv 파일을 참고하세요.

Python 3을 사용할 수 없는 호스트나 이 통합 모드를 전에 구현한 적이 있으면 legacy 모드 구성 예시를 참고하세요. coredns.d/auto_conf.yaml 파일을 사용하는 자동탐지 사용자의 경우, 이 파일은 기본적으로 점검 legacy 모드에서 prometheus_url 옵션을 활성화합니다. 기본 구성 옵션을 보려면 coredns.d/auto_conf.yaml 샘플을 보고 사용할 수 있는 구성 옵션 목록 전체를 보려면 coredns.d/conf.yaml.example 샘플을 보세요.

설치

Datadog 에이전트 패키지에 CoreDNS 점검이 포함되어 있기 때문에 서버에 새로운 설치를 할 필요가 없습니다.

설정

Docker

컨테이너에서 실행 중인 에이전트에 이 점검을 구성하는 방법:

메트릭 수집

애플리케이션 컨테이너에서 자동탐지 통합 템플릿을 Docker 레이블로 설정합니다.

LABEL "com.datadoghq.ad.check_names"='["coredns"]'
LABEL "com.datadoghq.ad.init_configs"='[{}]'
LABEL "com.datadoghq.ad.instances"='[{"openmetrics_endpoint":"http://%%host%%:9153/metrics", "tags":["dns-pod:%%host%%"]}]'

이 OpenMetrics 기반 점검의 레거시 모드를 활성화하려면 openmetrics_endpointprometheus_url로 교체하세요.

LABEL "com.datadoghq.ad.instances"='[{"prometheus_url":"http://%%host%%:9153/metrics", "tags":["dns-pod:%%host%%"]}]' 

참고:

  • 전달된 coredns.d/auto_conf.yaml 파일은 기본적으로 레거시 모드에서 prometheus_url 옵션을 활성화합니다.
  • dns-pod 태그가 대상 DNS 파드 IP를 추적합니다. 다른 태그는 서비스 탐지를 사용해 정보를 폴링하는 Datadog 에이전트와 관련되어 있습니다.
  • 파드에서 서비스 탐지 주석을 해야 합니다. 배포하고 싶을 경우 템플릿 사양 메타데이터에 주석을 추가하세요. 외부 사양 수준에 추가하지 마세요.

로그 수집

기본적으로 로그 수집은 Datadog 에이전트에서 비활성화되어 있습니다. 활성화하려면 Docker 로그 수집을 참고하세요.

그런 다음 Docker 레이블로 로그 통합을 설정하세요.

LABEL "com.datadoghq.ad.logs"='[{"source":"coredns","service":"<SERVICE_NAME>"}]'

쿠버네티스(Kubernetes)

쿠버네티스에서 실행 중인 에이전트에 이 점검을 구성하는 방법:

메트릭 수집

애플리케이션 컨테이너에 자동탐지 통합 템플릿을 파드 주석으로 설정하세요. 또는 파일, configmap, 또는 키-값 저장소로 구성할 수도 있습니다.

주석 v1(Datadog 에이전트 v7.36 이하용)

apiVersion: v1
kind: Pod
metadata:
  name: coredns
  annotations:
    ad.datadoghq.com/coredns.check_names: '["coredns"]'
    ad.datadoghq.com/coredns.init_configs: '[{}]'
    ad.datadoghq.com/coredns.instances: |
      [
        {
          "openmetrics_endpoint": "http://%%host%%:9153/metrics", 
          "tags": ["dns-pod:%%host%%"]
        }
      ]      
  labels:
    name: coredns
spec:
  containers:
    - name: coredns

주석 v2(for Datadog 에이전트 v7.36 이상)

apiVersion: v1
kind: Pod
metadata:
  name: coredns
  annotations:
    ad.datadoghq.com/coredns.checks: |
      {
        "coredns": {
          "init_config": {},
          "instances": [
            {
              "openmetrics_endpoint": "http://%%host%%:9153/metrics", 
              "tags": ["dns-pod:%%host%%"]
            }
          ]
        }
      }      
  labels:
    name: coredns
spec:
  containers:
    - name: coredns

이 OpenMetrics 기반 점검의 레거시 모드를 활성화하려면 openmetrics_endpointprometheus_url로 교체하세요.

주석 v1 (Datadog 에이전트 v7.36 이하용)

    ad.datadoghq.com/coredns.instances: |
      [
        {
          "prometheus_url": "http://%%host%%:9153/metrics", 
          "tags": ["dns-pod:%%host%%"]
        }
      ]      

주석 v2(for Datadog 에이전트 v7.36 이상)

          "instances": [
            {
              "prometheus_url": "http://%%host%%:9153/metrics", 
              "tags": ["dns-pod:%%host%%"]
            }
          ]

참고:

  • 전달된 coredns.d/auto_conf.yaml 파일은 기본적으로 레거시 모드에서 prometheus_url 옵션을 활성화합니다.
  • dns-pod 태그가 대상 DNS 파드 IP를 추적합니다. 다른 태그는 서비스 탐지를 사용해 정보를 폴링하는 Datadog 에이전트와 관련되어 있습니다.
  • 파드에서 서비스 탐지 주석을 해야 합니다. 배포를 하고 싶을 경우 템플릿 사양 메타데이터에 주석을 추가하세요. 외부 사양 수준에 추가하지 마세요.

로그 수집

Datadog 에이전트에서 기본적으로 로그 수집이 비활성화되어 있습니다. 활성화하려면 [쿠버네티스 로그 수집]을 확인하세요.

그리고 로그 통합을 파드 주석으로 설정하세요. 또는 파일, configmap, 또는 키-쌍 저장소로 구성할 수도 있습니다.

주석 v1/v2

apiVersion: v1
kind: Pod
metadata:
  name: coredns
  annotations:
    ad.datadoghq.com/coredns.logs: '[{"source": "coredns", "service": "<SERVICE_NAME>"}]'
  labels:
    name: coredns

ECS

ECS에서 실행 중인 에이전트에 이 점검을 구성하는 방법:

메트릭 수집

애플리케이션 컨테이너에 자동탐지 통합 템플릿을 Docker 레이블로 설정하세요.

{
  "containerDefinitions": [{
    "name": "coredns",
    "image": "coredns:latest",
    "dockerLabels": {
      "com.datadoghq.ad.check_names": "[\"coredns\"]",
      "com.datadoghq.ad.init_configs": "[{}]",
      "com.datadoghq.ad.instances": "[{\"openmetrics_endpoint\":\"http://%%host%%:9153/metrics\", \"tags\":[\"dns-pod:%%host%%\"]}]"
    }
  }]
}

이 OpenMetrics 기반 점검의 레거시 모드를 활성화하려면 openmetrics_endpointprometheus_url로 교체하세요.

      "com.datadoghq.ad.instances": "[{\"prometheus_url\":\"http://%%host%%:9153/metrics\", \"tags\":[\"dns-pod:%%host%%\"]}]"

참고:

  • 전달된 coredns.d/auto_conf.yaml 파일은 기본적으로 레거시 모드에서 prometheus_url 옵션을 활성화합니다.
  • dns-pod 태그가 대상 DNS 파드 IP를 추적합니다. 다른 태그는 서비스 탐지를 사용해 정보를 폴링하는 Datadog 에이전트와 관련되어 있습니다.
  • 파드에서 서비스 탐지 주석을 해야 합니다. 배포를 하고 싶을 경우 템플릿 사양 메타데이터에 주석을 추가하세요. 외부 사양 수준에 추가하지 마세요.
로그 수집

기본적으로 로그 수집은 Datadog 에이전트에서 비활성화되어 있습니다. 활성화하려면 ECS 로그 수집을 참조하세요.

그런 다음 Docker 레이블로 로그 통합을 설정하세요.

{
  "containerDefinitions": [{
    "name": "coredns",
    "image": "coredns:latest",
    "dockerLabels": {
      "com.datadoghq.ad.logs": "[{\"source\":\"coredns\",\"service\":\"<SERVICE_NAME>\"}]"
    }
  }]
}

검증

에이전트의 status 하위 명령어를 실행하고 Checks 섹션 아래에서 coredns를 찾으세요.

수집한 데이터

메트릭

coredns.acl.allowed_requests
(count)
[OpenMetrics V1] Counter of DNS requests being allowed.
Shown as request
coredns.acl.allowed_requests.count
(count)
[OpenMetrics V2] Counter of DNS requests being allowed.
Shown as request
coredns.acl.blocked_requests
(count)
[OpenMetrics V1] Counter of DNS requests being blocked.
Shown as request
coredns.acl.blocked_requests.count
(count)
[OpenMetrics V2] Counter of DNS requests being blocked.
Shown as request
coredns.autopath.success_count
(count)
[OpenMetrics V1] Counter of requests that did autopath.
Shown as request
coredns.autopath.success_count.count
(count)
[OpenMetrics V2] Counter of requests that did autopath.
Shown as request
coredns.build_info
(gauge)
[OpenMetrics V1 and V2] A metric with a constant '1' value labeled by version, revision, and goversion from which CoreDNS was built.
coredns.cache_drops_count
(count)
[OpenMetrics V1] Counter of responses excluded from the cache due to request/response question name mismatch.
Shown as response
coredns.cache_drops_count.count
(count)
[OpenMetrics V2] Counter of responses excluded from the cache due to request/response question name mismatch.
Shown as response
coredns.cache_hits_count
(count)
[OpenMetrics V1] Counter of cache hits by cache type
Shown as hit
coredns.cache_hits_count.count
(count)
[OpenMetrics V2] Counter of cache hits by cache type
Shown as hit
coredns.cache_misses_count
(count)
[OpenMetrics V1] Counter of cache misses.
Shown as miss
coredns.cache_misses_count.count
(count)
[OpenMetrics V2] Counter of cache misses.
Shown as miss
coredns.cache_prefetch_count
(count)
[OpenMetrics V1] The number of time the cache has prefetched a cached item.
coredns.cache_prefetch_count.count
(count)
[OpenMetrics V2] The number of time the cache has prefetched a cached item.
coredns.cache_request_count
(count)
[OpenMetrics V1] Counter of cache requests.
Shown as request
coredns.cache_request_count.count
(count)
[OpenMetrics V2] Counter of cache requests.
Shown as request
coredns.cache_size.count
(gauge)
[OpenMetrics V1 and V2]
Shown as entry
coredns.cache_stale_count
(count)
[OpenMetrics V1] Counter of requests served from stale cache entries.
Shown as request
coredns.cache_stale_count.count
(count)
[OpenMetrics V2] Counter of requests served from stale cache entries.
Shown as request
coredns.dnssec.cache_hits
(count)
[OpenMetrics V1] Counter of cache hits.
Shown as hit
coredns.dnssec.cache_hits.count
(count)
[OpenMetrics V2] Counter of cache hits.
Shown as hit
coredns.dnssec.cache_misses
(count)
[OpenMetrics V1] Counter of cache misses.
Shown as miss
coredns.dnssec.cache_misses.count
(count)
[OpenMetrics V2] Counter of cache misses.
Shown as miss
coredns.dnssec.cache_size
(gauge)
[OpenMetrics V1 and V2] Total elements in the cache, type is signature.
coredns.forward_healthcheck_broken_count
(count)
[OpenMetrics V1] counter of when all upstreams are unhealthy
Shown as entry
coredns.forward_healthcheck_broken_count.count
(count)
[OpenMetrics V2] counter of when all upstreams are unhealthy
Shown as entry
coredns.forward_healthcheck_failure_count
(count)
[OpenMetrics V1] number of failed health checks per upstream
Shown as entry
coredns.forward_healthcheck_failure_count.count
(count)
[OpenMetrics V2] number of failed health checks per upstream
Shown as entry
coredns.forward_max_concurrent_rejects
(count)
[OpenMetrics V1] Counter of the number of queries rejected because the concurrent queries were at maximum.
Shown as query
coredns.forward_max_concurrent_rejects.count
(count)
[OpenMetrics V2] Counter of the number of queries rejected because the concurrent queries were at maximum.
Shown as query
coredns.forward_request_count
(count)
[OpenMetrics V1] query count per upstream
Shown as request
coredns.forward_request_count.count
(count)
[OpenMetrics V2] query count per upstream
Shown as request
coredns.forward_request_duration.seconds.bucket
(count)
[OpenMetrics V2] duration per upstream interaction
Shown as second
coredns.forward_request_duration.seconds.count
(count)
[OpenMetrics V1 and V2] duration per upstream interaction
Shown as second
coredns.forward_request_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] duration per upstream interaction
Shown as second
coredns.forward_response_rcode_count
(count)
[OpenMetrics V1] count of RCODEs per upstream
Shown as response
coredns.forward_response_rcode_count.count
(count)
[OpenMetrics V2] count of RCODEs per upstream
Shown as response
coredns.forward_sockets_open
(gauge)
[OpenMetrics V1 and V2] number of sockets open per upstream
Shown as connection
coredns.go.gc_duration_seconds.count
(count)
[OpenMetrics V1 and V2] Count of the GC invocation durations.
Shown as second
coredns.go.gc_duration_seconds.quantile
(gauge)
[OpenMetrics V1 and V2] Quantiles of the GC invocation durations.
Shown as second
coredns.go.gc_duration_seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of the GC invocation durations.
Shown as second
coredns.go.goroutines
(gauge)
[OpenMetrics V1 and V2] Number of goroutines that currently exist.
Shown as thread
coredns.go.info
(gauge)
[OpenMetrics V1 and V2] Information about the Go environment.
coredns.go.memstats.alloc_bytes
(gauge)
[OpenMetrics V1 and V2] Number of bytes allocated and still in use.
Shown as byte
coredns.go.memstats.alloc_bytes_total
(count)
[OpenMetrics V1] Total number of bytes allocated even if freed.
Shown as byte
coredns.go.memstats.buck_hash_sys_bytes
(gauge)
[OpenMetrics V1 and V2] Number of bytes used by the profiling bucket hash table.
Shown as byte
coredns.go.memstats.frees_total
(count)
[OpenMetrics V1] Total number of frees.
coredns.go.memstats.frees_total.count
(count)
[OpenMetrics V2] Total number of frees.
coredns.go.memstats.gc_cpu_fraction
(gauge)
[OpenMetrics V1 and V2] CPU taken up by GC
Shown as percent
coredns.go.memstats.gc_sys_bytes
(gauge)
[OpenMetrics V1 and V2] Number of bytes used for garbage collection system metadata.
Shown as byte
coredns.go.memstats.heap_alloc_bytes
(gauge)
[OpenMetrics V1 and V2] Bytes allocated to the heap
Shown as byte
coredns.go.memstats.heap_idle_bytes
(gauge)
[OpenMetrics V1 and V2] Number of idle bytes in the heap
Shown as byte
coredns.go.memstats.heap_inuse_bytes
(gauge)
[OpenMetrics V1 and V2] Number of Bytes in the heap
Shown as byte
coredns.go.memstats.heap_objects
(gauge)
[OpenMetrics V1 and V2] Number of objects in the heap
Shown as object
coredns.go.memstats.heap_released_bytes
(gauge)
[OpenMetrics V1 and V2] Number of bytes released to the system in the last gc
Shown as byte
coredns.go.memstats.heap_released_bytes.count
(count)
[OpenMetrics V2] Count of bytes released to the system in the last gc
Shown as byte
coredns.go.memstats.heap_sys_bytes
(gauge)
[OpenMetrics V1 and V2] Number of bytes used by the heap
Shown as byte
coredns.go.memstats.last_gc_time_seconds
(gauge)
[OpenMetrics V1 and V2] Length of last GC
Shown as second
coredns.go.memstats.lookups_total
(count)
[OpenMetrics V1] Number of lookups
Shown as operation
coredns.go.memstats.lookups_total.count
(count)
[OpenMetrics V2] Number of lookups
Shown as operation
coredns.go.memstats.mallocs_total
(count)
[OpenMetrics V1] Number of mallocs
Shown as operation
coredns.go.memstats.mallocs_total.count
(count)
[OpenMetrics V2] Number of mallocs
Shown as operation
coredns.go.memstats.mcache_inuse_bytes
(gauge)
[OpenMetrics V1 and V2] Number of bytes in use by mcache structures.
Shown as byte
coredns.go.memstats.mcache_sys_bytes
(gauge)
[OpenMetrics V1 and V2] Number of bytes used for mcache structures obtained from system.
Shown as byte
coredns.go.memstats.mspan_inuse_bytes
(gauge)
[OpenMetrics V1 and V2] Number of bytes in use by mspan structures.
Shown as byte
coredns.go.memstats.mspan_sys_bytes
(gauge)
[OpenMetrics V1 and V2] Number of bytes used for mspan structures obtained from system.
Shown as byte
coredns.go.memstats.next_gc_bytes
(gauge)
[OpenMetrics V1 and V2] Number of heap bytes when next garbage collection will take place
Shown as byte
coredns.go.memstats.other_sys_bytes
(gauge)
[OpenMetrics V1 and V2] Number of bytes used for other system allocations
Shown as byte
coredns.go.memstats.stack_inuse_bytes
(gauge)
[OpenMetrics V1 and V2] Number of bytes in use by the stack allocator
Shown as byte
coredns.go.memstats.stack_sys_bytes
(gauge)
[OpenMetrics V1 and V2] Number of bytes obtained from system for stack allocator
Shown as byte
coredns.go.memstats.sys_bytes
(gauge)
[OpenMetrics V1 and V2] Number of bytes obtained from system
Shown as byte
coredns.go.threads
(gauge)
[OpenMetrics V1 and V2] Number of OS threads created.
Shown as thread
coredns.grpc.request_count
(count)
[OpenMetrics V1] Query count per upstream.
coredns.grpc.request_count.count
(count)
[OpenMetrics V2] Query count per upstream.
coredns.grpc.response_rcode_count
(count)
[OpenMetrics V1] Count of RCODEs per upstream. and we are randomly (this always uses the random policy) spraying to an upstream.
coredns.grpc.response_rcode_count.count
(count)
[OpenMetrics V2] Count of RCODEs per upstream. and we are randomly (this always uses the random policy) spraying to an upstream.
coredns.health_request_duration.bucket
(count)
[OpenMetrics V2] Sample for the histogram of the time (in seconds) each request took.
coredns.health_request_duration.count
(count)
[OpenMetrics V1 and V2] Count for the histogram of the time (in seconds) each request took.
coredns.health_request_duration.sum
(count)
[OpenMetrics V1 and V2] Sum for the histogram of the time (in seconds) each request took.
coredns.hosts.entries_count
(gauge)
[OpenMetrics V1 and V2] The combined number of entries in hosts and Corefile.
coredns.hosts.reload_timestamp
(gauge)
[OpenMetrics V1 and V2] The timestamp of the last reload of hosts file.
Shown as second
coredns.panic_count.count
(count)
[OpenMetrics V1 and V2]
Shown as entry
coredns.plugin_enabled
(gauge)
[OpenMetrics V1 and V2] A metric that indicates whether a plugin is enabled on per server and zone basis.
coredns.process.cpu_seconds_total
(count)
[OpenMetrics V1 and V2] Total user and system CPU time spent in seconds.
Shown as second
coredns.process.cpu_seconds_total.count
(count)
[OpenMetrics V2] Count of user and system CPU time spent in seconds.
Shown as second
coredns.process.max_fds
(gauge)
[OpenMetrics V1 and V2] Maximum number of open file descriptors.
Shown as file
coredns.process.open_fds
(gauge)
[OpenMetrics V1 and V2] Number of open file descriptors.
Shown as file
coredns.process.resident_memory_bytes
(gauge)
[OpenMetrics V1 and V2] Resident memory size in bytes.
Shown as byte
coredns.process.start_time_seconds
(gauge)
[OpenMetrics V1 and V2] Start time of the process since unix epoch in seconds.
Shown as second
coredns.process.virtual_memory_bytes
(gauge)
[OpenMetrics V1 and V2] Virtual memory size in bytes.
Shown as byte
coredns.proxy_request_count
(count)
[OpenMetrics V1] query count per upstream.
Shown as request
coredns.proxy_request_count.count
(count)
[OpenMetrics V2] query count per upstream.
Shown as request
coredns.proxy_request_duration.seconds.bucket
(count)
[OpenMetrics V2] sample of duration per upstream interaction
Shown as second
coredns.proxy_request_duration.seconds.count
(count)
[OpenMetrics V1 and V2] duration per upstream interaction
Shown as second
coredns.proxy_request_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] duration per upstream interaction
Shown as second
coredns.reload.failed_count
(count)
[OpenMetrics V1] Counts the number of failed reload attempts.
coredns.reload.failed_count.count
(count)
[OpenMetrics V2] Counts the number of failed reload attempts.
coredns.request_count
(count)
[OpenMetrics V1] total query count.
Shown as request
coredns.request_count.count
(count)
[OpenMetrics V2] total query count.
Shown as request
coredns.request_duration.seconds.bucket
(count)
[OpenMetrics V2] sample duration to process each query
Shown as second
coredns.request_duration.seconds.count
(count)
[OpenMetrics V1 and V2] duration to process each query
Shown as second
coredns.request_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] duration to process each query
Shown as second
coredns.request_size.bytes.bucket
(count)
[OpenMetrics V2] sample size of the request in bytes
Shown as byte
coredns.request_size.bytes.count
(count)
[OpenMetrics V1 and V2] size of the request in bytes
Shown as byte
coredns.request_size.bytes.sum
(count)
[OpenMetrics V1 and V2] size of the request in bytes
Shown as byte
coredns.request_type_count
(count)
[OpenMetrics V1] counter of queries per zone and type
coredns.request_type_count.count
(count)
[OpenMetrics V2] counter of queries per zone and type
coredns.response_code_count
(count)
[OpenMetrics V1] number of responses per zone and rcode
coredns.response_code_count.count
(count)
[OpenMetrics V2] number of responses per zone and rcode
coredns.response_size.bytes.bucket
(count)
[OpenMetrics V2] sample size of the request in bytes
Shown as byte
coredns.response_size.bytes.count
(count)
[OpenMetrics V1 and V2] size of the request in bytes
Shown as byte
coredns.response_size.bytes.sum
(count)
[OpenMetrics V1 and V2] size of the request in bytes
Shown as byte
coredns.template.failures_count
(count)
[OpenMetrics V1] The number of times the Go templating failed.
Shown as error
coredns.template.failures_count.count
(count)
[OpenMetrics V2] The number of times the Go templating failed.
Shown as error
coredns.template.matches_count
(count)
[OpenMetrics V1] The total number of matched requests by regex.
coredns.template.matches_count.count
(count)
[OpenMetrics V2] The total number of matched requests by regex.
coredns.template.rr_failures_count
(count)
[OpenMetrics V1] The number of times the templated resource record was invalid and could not be parsed.
Shown as error
coredns.template.rr_failures_count.count
(count)
[OpenMetrics V2] The number of times the templated resource record was invalid and could not be parsed.
Shown as error

이벤트

CoreDNS 점검에는 이벤트가 포함되어 있지 않습니다.

서비스 점검

coredns.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint. Returns OK otherwise.
Statuses: ok, critical

트러블슈팅

도움이 필요하신가요? Datadog 지원팀에 문의하세요.

참고 자료

기타 유용한 문서, 링크 및 기사:

PREVIEWING: rtrieu/product-analytics-ui-changes