This check monitors the Datadog Cluster Agent through the Datadog Agent.
Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.
The Datadog Cluster Agent check is included in the Datadog Agent package. No additional installation is needed on your server.
The Datadog Cluster Agent check uses Autodiscovery to automatically configure itself in most scenarios. The check runs in the Datadog Agent pod on the same node as the Cluster Agent pod. It will not run in the Cluster Agent itself.
If you need to further configure the check:
Edit the datadog_cluster_agent.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to start collecting your datadog_cluster_agent performance data. See the sample datadog_cluster_agent.d/conf.yaml for all available configuration options.
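As a rough illustration of what a minimal configuration file might contain, the sketch below renders and writes one. It assumes the check is OpenMetrics-based and accepts a `prometheus_url` instance option, and that the Cluster Agent serves its metrics at a URL such as `http://localhost:5000/metrics`. Both are assumptions to confirm against the sample datadog_cluster_agent.d/conf.yaml. The helper names `render_conf` and `write_conf` are hypothetical.

```python
from pathlib import Path


def render_conf(prometheus_url: str) -> str:
    """Return minimal YAML text for datadog_cluster_agent.d/conf.yaml.

    `prometheus_url` is an assumed option name; confirm it in the sample file.
    """
    return (
        "init_config:\n"
        "\n"
        "instances:\n"
        "  # Assumed option name; confirm it in the sample conf.yaml.\n"
        f"  - prometheus_url: {prometheus_url}\n"
    )


def write_conf(conf_d: Path, prometheus_url: str) -> Path:
    """Write the rendered file under <conf_d>/datadog_cluster_agent.d/conf.yaml."""
    target = conf_d / "datadog_cluster_agent.d" / "conf.yaml"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(render_conf(prometheus_url))
    return target
```

For example, `write_conf(Path("/etc/datadog-agent/conf.d"), "http://localhost:5000/metrics")` would write the file on a typical Linux host install. The path varies by platform.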
Run the Agent’s status subcommand and look for datadog_cluster_agent under the Checks section.
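The validation step can be scripted. The sketch below assumes a host install where the status subcommand is invoked as `datadog-agent status` (it may need elevated privileges). It does a plain substring match and does not parse the Checks section itself. The helper names are hypothetical.

```python
import subprocess


def check_is_listed(status_output: str, check_name: str = "datadog_cluster_agent") -> bool:
    """Return True if any line of the status text mentions the check name."""
    return any(check_name in line for line in status_output.splitlines())


def agent_status() -> str:
    """Run the Agent's status subcommand and return its standard output."""
    result = subprocess.run(
        ["datadog-agent", "status"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return result.stdout


# Usage on a machine where the Agent is installed:
#     print(check_is_listed(agent_status()))
```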
datadog.cluster_agent.admission_webhooks.certificate_expiry (gauge) | Time left before the certificate expires; shown as hour |
datadog.cluster_agent.admission_webhooks.cws_exec_mutation_attempts (count) | Number of CWS exec mutation attempts by reason and status |
datadog.cluster_agent.admission_webhooks.cws_pod_mutation_attempts (count) | Number of CWS pod mutation attempts by reason and status |
datadog.cluster_agent.admission_webhooks.cws_response_duration.count (count) | CWS mutating webhook response duration count |
datadog.cluster_agent.admission_webhooks.cws_response_duration.sum (count) | CWS mutating webhook response duration sum |
datadog.cluster_agent.admission_webhooks.library_injection_attempts (count) | Number of library injection attempts by language |
datadog.cluster_agent.admission_webhooks.library_injection_errors (count) | Number of library injection failures by language |
datadog.cluster_agent.admission_webhooks.mutation_attempts (gauge) | Number of pod mutation attempts by mutation type |
datadog.cluster_agent.admission_webhooks.mutation_errors (gauge) | Number of mutation failures by mutation type |
datadog.cluster_agent.admission_webhooks.patcher.attempts (count) | Number of patch attempts |
datadog.cluster_agent.admission_webhooks.patcher.completed (count) | Number of completed patch attempts |
datadog.cluster_agent.admission_webhooks.patcher.errors (count) | Number of patch errors |
datadog.cluster_agent.admission_webhooks.rc_provider.configs (gauge) | Number of valid remote configurations |
datadog.cluster_agent.admission_webhooks.rc_provider.invalid_configs (gauge) | Number of invalid remote configurations |
datadog.cluster_agent.admission_webhooks.reconcile_errors (gauge) | Number of reconcile errors per controller |
datadog.cluster_agent.admission_webhooks.reconcile_success (gauge) | Number of reconcile successes per controller; shown as success |
datadog.cluster_agent.admission_webhooks.response_duration.count (count) | Webhook response duration count |
datadog.cluster_agent.admission_webhooks.response_duration.sum (count) | Webhook response duration sum; shown as second |
datadog.cluster_agent.admission_webhooks.validation_attempts (gauge) | Number of pod validation attempts by validation type |
datadog.cluster_agent.admission_webhooks.webhooks_received (gauge) | Number of webhook requests received |
datadog.cluster_agent.aggregator.flush (count) | Number of metrics/service checks/events flushed by (data_type, state) |
datadog.cluster_agent.aggregator.processed (count) | Number of metrics/service checks/events processed by the aggregator by data_type |
datadog.cluster_agent.api_requests (count) | Requests made to the cluster agent API by (handler, status); shown as request |
datadog.cluster_agent.autodiscovery.errors (gauge) | Number of Autodiscovery errors |
datadog.cluster_agent.autodiscovery.poll_duration.count (count) | Autodiscovery poll duration count |
datadog.cluster_agent.autodiscovery.poll_duration.sum (count) | Autodiscovery poll duration sum; shown as second |
datadog.cluster_agent.autodiscovery.watched_resources (gauge) | Number of watched resources (Services and Endpoints) |
datadog.cluster_agent.cluster_checks.busyness (gauge) | Busyness of a node per the number of metrics submitted and average duration of all checks run |
datadog.cluster_agent.cluster_checks.configs_dangling (gauge) | Number of check configurations not dispatched |
datadog.cluster_agent.cluster_checks.configs_dispatched (gauge) | Number of check configurations dispatched by node |
datadog.cluster_agent.cluster_checks.configs_info (gauge) | Information about check configurations dispatched (node and check ID) |
datadog.cluster_agent.cluster_checks.failed_stats_collection (count) | Total number of unsuccessful stats collection attempts |
datadog.cluster_agent.cluster_checks.nodes_reporting (gauge) | Number of node agents reporting |
datadog.cluster_agent.cluster_checks.rebalancing_decisions (count) | Total number of check rebalancing decisions |
datadog.cluster_agent.cluster_checks.rebalancing_duration_seconds (gauge) | Duration of the last execution of the check rebalancing algorithm; shown as second |
datadog.cluster_agent.cluster_checks.successful_rebalancing_moves (count) | Total number of successful check rebalancing decisions; shown as check |
datadog.cluster_agent.cluster_checks.unscheduled_check (gauge) | Number of check configurations not scheduled |
datadog.cluster_agent.cluster_checks.updating_stats_duration_seconds (gauge) | Duration of collecting stats from check runners and updating cache; shown as second |
datadog.cluster_agent.datadog.rate_limit_queries.limit (gauge) | Maximum number of queries to the Datadog API allowed in the period by endpoint; shown as query |
datadog.cluster_agent.datadog.rate_limit_queries.period (gauge) | Period of rate limiting for the Datadog API by endpoint; shown as second |
datadog.cluster_agent.datadog.rate_limit_queries.remaining (gauge) | Number of queries to the Datadog API remaining before next reset by endpoint; shown as query |
datadog.cluster_agent.datadog.rate_limit_queries.remaining_min (gauge) | Minimum number of queries remaining before next reset observed during an expiration interval of 2*refresh period; shown as query |
datadog.cluster_agent.datadog.rate_limit_queries.reset (gauge) | Number of seconds before next reset applied to the Datadog API by endpoint; shown as second |
datadog.cluster_agent.datadog.requests (count) | Requests made to Datadog by status; shown as request |
datadog.cluster_agent.endpoint_checks.configs_dispatched (gauge) | Number of endpoint-check configurations dispatched by node |
datadog.cluster_agent.external_metrics (gauge) | Number of external metrics tagged |
datadog.cluster_agent.external_metrics.api_elapsed.count (count) | Count of API Requests received |
datadog.cluster_agent.external_metrics.api_elapsed.sum (count) | Count of API Requests received |
datadog.cluster_agent.external_metrics.api_requests (gauge) | Count of API Requests received |
datadog.cluster_agent.external_metrics.datadog_metrics (gauge) | The label valid is true if the DatadogMetric CR is valid, false otherwise |
datadog.cluster_agent.external_metrics.delay_seconds (gauge) | Freshness of the metric evaluated from querying Datadog; shown as second |
datadog.cluster_agent.external_metrics.processed_value (gauge) | Value processed from querying Datadog by metric |
datadog.cluster_agent.go.goroutines (gauge) | Number of goroutines that currently exist |
datadog.cluster_agent.go.memstats.alloc_bytes (gauge) | Number of bytes allocated and still in use; shown as byte |
datadog.cluster_agent.go.threads (gauge) | Number of OS threads created; shown as thread |
datadog.cluster_agent.kubernetes_apiserver.emitted_events (count) | Datadog events emitted by the kubernetes_apiserver check |
datadog.cluster_agent.kubernetes_apiserver.kube_events (count) | Kubernetes events processed by the kubernetes_apiserver check |
datadog.cluster_agent.language_detection_dca_handler.processed_requests (count) | The number of process language detection requests processed by the handler |
datadog.cluster_agent.language_detection_patcher.patches (count) | The number of patch requests sent by the patcher to the kube api server |
datadog.cluster_agent.secret_backend.elapsed (gauge) | The elapsed time of secret backend invocation; shown as millisecond |
datadog.cluster_agent.tagger.stored_entities (gauge) | Number of entities stored in the tagger |
datadog.cluster_agent.tagger.updated_entities (count) | Number of updates made to entities in the tagger |
datadog.cluster_agent.workloadmeta.events_received (count) | Number of events received by workloadmeta |
datadog.cluster_agent.workloadmeta.notifications_sent (count) | Number of notifications sent by workloadmeta to its subscribers |
datadog.cluster_agent.workloadmeta.stored_entities (gauge) | Number of entities stored in workloadmeta |
datadog.cluster_agent.workloadmeta.subscribers (gauge) | Number of workloadmeta subscribers |
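Several metrics above come in `.sum` and `.count` pairs, for example `admission_webhooks.response_duration.sum` and `admission_webhooks.response_duration.count`. A common way to read such a pair is to divide the increase in the sum by the increase in the count over the same interval, which gives a mean duration per observation. The sketch below shows that arithmetic. The input values in the usage note are made up, and `average_duration` is a hypothetical helper, not part of the integration.

```python
from typing import Optional


def average_duration(sum_delta: float, count_delta: float) -> Optional[float]:
    """Mean duration per observation over an interval.

    sum_delta:   increase of the `.sum` metric over the interval (seconds)
    count_delta: increase of the `.count` metric over the same interval
    Returns None when nothing was observed, to avoid dividing by zero.
    """
    if count_delta <= 0:
        return None
    return sum_delta / count_delta
```

With illustrative values, `average_duration(1.5, 3)` gives 0.5, that is, three webhook responses totalling 1.5 seconds averaged half a second each.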
The Datadog-Cluster-Agent integration does not include any events.
datadog.cluster_agent.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint. Returns OK otherwise.
Statuses: ok, critical
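The rule behind this service check can be approximated outside the Agent when debugging connectivity. The sketch below is not the Agent's implementation. It reports `critical` when a request to the metrics endpoint fails and `ok` otherwise. The function name, the `fetch` parameter (present only so the probe can be swapped out in tests), and any URL you pass in are assumptions.

```python
import urllib.request


def prometheus_health(url: str, timeout: float = 2.0, fetch=urllib.request.urlopen) -> str:
    """Return "ok" if the metrics endpoint can be reached, else "critical".

    urllib raises URLError/HTTPError (both subclasses of OSError) on
    connection failures and on HTTP error statuses; timeouts are OSError too.
    """
    try:
        with fetch(url, timeout=timeout):
            return "ok"
    except OSError:
        return "critical"
```

A call such as `prometheus_health("http://localhost:5000/metrics")` would probe a locally reachable endpoint, assuming that is where the Cluster Agent exposes its metrics in your setup.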
Need help? Contact Datadog support.