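This check monitors the Kubernetes Controller Manager, which is part of the Kubernetes control plane.
Note: This check does not collect data for Amazon EKS clusters, because those services are not exposed.
The Kubernetes Controller Manager check is included in the Datadog Agent package, so you do not need to install anything else on your server.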
Edit the kube_controller_manager.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, to start collecting your kube_controller_manager performance data. See the sample kube_controller_manager.d/conf.yaml for all available configuration options.
This integration requires access to the controller manager's metrics endpoint. To access the metrics endpoint, you must have get RBAC permissions for it. (The default Datadog Helm chart already adds the right RBAC roles and bindings for this.)
Run the Agent's status subcommand and look for kube_controller_manager under the Checks section.
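When the check reports that it cannot reach the endpoint, it can help to confirm access by hand from the same pod. The following is a minimal sketch, not the Agent's own implementation. The URL (localhost on port 10257) is an assumption and must be adjusted to your cluster; the token path is the standard location where Kubernetes mounts a pod's service account token. The helper names are hypothetical.

```python
import urllib.error
import urllib.request

# Standard location of a pod's mounted service account token.
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"

# Assumed endpoint: adjust the host and port to match your cluster.
METRICS_URL = "https://localhost:10257/metrics"


def build_metrics_request(url: str, token: str) -> urllib.request.Request:
    """Build a GET request that presents the token as a bearer credential."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token.strip()}"},
        method="GET",
    )


def probe(url: str = METRICS_URL, token_path: str = TOKEN_PATH) -> str:
    """Try the endpoint once and return a short, human-readable result."""
    with open(token_path, encoding="utf-8") as fh:
        request = build_metrics_request(url, fh.read())
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return f"reachable (HTTP {response.status})"
    except urllib.error.HTTPError as err:
        # 401 or 403 usually points to missing or insufficient RBAC permissions.
        return f"rejected (HTTP {err.code})"
    except urllib.error.URLError as err:
        # Connection refused, TLS verification failure, unknown host, and so on.
        return f"unreachable ({err.reason})"
```

Calling `probe()` from inside the Agent pod returns one of the three messages. A rejected result suggests reviewing the RBAC role and binding, while an unreachable result suggests a network, port, or certificate problem.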
kube_controller_manager.goroutines (gauge) | Number of goroutines that currently exist |
kube_controller_manager.job_controller.terminated_pods_tracking_finalizer (count) | Used to monitor whether the job controller is removing Pod finalizers from terminated Pods after accounting them in Job status |
kube_controller_manager.leader_election.lease_duration (gauge) | Duration of the leadership lease |
kube_controller_manager.leader_election.transitions (count) | Number of leadership transitions observed |
kube_controller_manager.max_fds (gauge) | Maximum allowed open file descriptors |
kube_controller_manager.nodes.count (gauge) | Number of registered nodes, per zone |
kube_controller_manager.nodes.evictions (count) | Count of node eviction events, per zone |
kube_controller_manager.nodes.unhealthy (gauge) | Number of unhealthy nodes, per zone |
kube_controller_manager.open_fds (gauge) | Number of open file descriptors |
kube_controller_manager.queue.adds (count) | Elements added, by queue |
kube_controller_manager.queue.depth (gauge) | Current depth, by queue |
kube_controller_manager.queue.latency.count (gauge) | Processing latency count, by queue (deprecated in Kubernetes v1.14) |
kube_controller_manager.queue.latency.quantile (gauge) | Processing latency quantiles, by queue (deprecated in Kubernetes v1.14). Shown as microsecond |
kube_controller_manager.queue.latency.sum (gauge) | Processing latency sum, by queue (deprecated in Kubernetes v1.14). Shown as microsecond |
kube_controller_manager.queue.process_duration.count (gauge) | How long processing an item from the workqueue takes, by queue |
kube_controller_manager.queue.process_duration.sum (gauge) | Total workqueue processing time, by queue. Shown as second |
kube_controller_manager.queue.queue_duration.count (gauge) | How long an item stays in a queue before being requested, by queue |
kube_controller_manager.queue.queue_duration.sum (gauge) | Total time items stay in a queue before being requested, by queue. Shown as second |
kube_controller_manager.queue.retries (count) | Retries handled, by queue |
kube_controller_manager.queue.work_duration.count (gauge) | Work duration, by queue (deprecated in Kubernetes v1.14) |
kube_controller_manager.queue.work_duration.quantile (gauge) | Work duration quantiles, by queue (deprecated in Kubernetes v1.14). Shown as microsecond |
kube_controller_manager.queue.work_duration.sum (gauge) | Work duration sum, by queue (deprecated in Kubernetes v1.14). Shown as microsecond |
kube_controller_manager.queue.work_longest_duration (gauge) | How many seconds the longest-running processor has been running, by queue. Shown as second |
kube_controller_manager.queue.work_unfinished_duration (gauge) | How many seconds of work are in progress and have not yet been observed by process_duration, by queue. Shown as second |
kube_controller_manager.rate_limiter.use (gauge) | Usage of the rate limiter, by limiter |
kube_controller_manager.slis.kubernetes_healthcheck (gauge) | Result of a single controller manager healthcheck (alpha; requires k8s v1.26+) |
kube_controller_manager.slis.kubernetes_healthcheck_total (count) | Cumulative results of all controller manager healthchecks (alpha; requires k8s v1.26+) |
kube_controller_manager.threads (gauge) | Number of OS threads created |
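The paired .sum and .count metrics above are most useful together. As a rough sketch, assuming both values are cumulative totals (as the "Total ..." descriptions suggest), the mean duration per item over an interval is the change in the sum divided by the change in the count. The function name and the sample numbers below are illustrative only.

```python
from typing import Optional, Tuple

# One observation of a paired metric: (sum in seconds, count of items).
Sample = Tuple[float, float]


def mean_duration(previous: Sample, current: Sample) -> Optional[float]:
    """Mean seconds per item between two observations of the same queue.

    Returns None when no new items were counted in the interval, or when
    the counter went backwards (for example after a restart).
    """
    delta_sum = current[0] - previous[0]
    delta_count = current[1] - previous[1]
    if delta_count <= 0:
        return None
    return delta_sum / delta_count


# Made-up values for queue.process_duration.sum and .count on one queue.
earlier = (10.0, 100.0)
later = (16.0, 130.0)
print(mean_duration(earlier, later))
```

The same calculation applies to queue.queue_duration.sum and queue.queue_duration.count, which gives the mean time an item waited before being picked up.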
The Kubernetes Controller Manager check does not include any events.
kube_controller_manager.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint.
Statuses: ok, critical
kube_controller_manager.leader_election.status
Returns CRITICAL if no replica is currently set as leader.
Statuses: ok, critical
kube_controller_manager.up
Returns CRITICAL if Kube Controller Manager is not healthy.
Statuses: ok, critical
Need help? Contact Datadog support.