This check collects metrics for Mesos masters. For Mesos slave metrics, see the Mesos Slave integration.
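This check collects metrics from Mesos masters for cluster resources (CPUs, GPUs, memory, and disk), slaves (registered, active, inactive, connected, and disconnected), tasks (failed, finished, staging, running, and so on), frameworks (active, inactive, connected, and disconnected), and many more.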
The installation is the same on Mesos with and without DC/OS. Run the datadog-agent container on each of your Mesos master nodes:
docker run -d --name datadog-agent \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v /proc/:/host/proc/:ro \
  -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
  -e DD_API_KEY=<YOUR_DATADOG_API_KEY> \
  -e MESOS_MASTER=true \
  -e MARATHON_URL=http://leader.mesos:8080 \
  datadog/agent:latest
Substitute your Datadog API key and Mesos Master’s API URL into the command above.
If you passed the correct Master URL when starting datadog-agent, the Agent is already using a default mesos_master.d/conf.yaml to collect metrics from your masters. See the sample mesos_master.d/conf.yaml for all available configuration options.
The exception is when your masters’ API uses a self-signed certificate. In that case, set disable_ssl_validation: true in mesos_master.d/conf.yaml.
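As a rough sketch of the self-signed certificate case, the instance block in mesos_master.d/conf.yaml might look like the following. The url value is a placeholder chosen for illustration, and placing disable_ssl_validation under the instance entry is an assumption. Confirm both against the sample mesos_master.d/conf.yaml for your Agent version.

```yaml
init_config:

instances:
  # Placeholder master API URL (assumption); replace with your master's address.
  - url: https://localhost:5050
    # Skip certificate validation. Intended only for masters whose API
    # presents a self-signed certificate.
    disable_ssl_validation: true
```

Restart the Agent after editing the file so the change takes effect.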
Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:
logs_enabled: true
Add this configuration block to your mesos_master.d/conf.yaml file to start collecting your Mesos logs:
logs:
  - type: file
    path: /var/log/mesos/*
    source: mesos
Change the path parameter value based on your environment, or use the default docker stdout:
logs:
  - type: docker
    source: mesos
See the sample mesos_master.d/conf.yaml for all available configuration options.
To enable logs for Kubernetes environments, see Kubernetes Log Collection.
In Datadog, search for mesos.cluster in the Metrics Explorer.
mesos.cluster.cpus_percent (gauge) | Percentage of allocated CPUs. Shown as percent.
mesos.cluster.cpus_total (gauge) | Number of CPUs.
mesos.cluster.cpus_used (gauge) | Number of allocated CPUs.
mesos.cluster.disk_percent (gauge) | Percentage of allocated disk space. Shown as percent.
mesos.cluster.disk_total (gauge) | Disk space. Shown as mebibyte.
mesos.cluster.disk_used (gauge) | Allocated disk space. Shown as mebibyte.
mesos.cluster.dropped_messages (gauge) | Number of dropped messages. Shown as message.
mesos.cluster.event_queue_dispatches (gauge) | Number of dispatches in the event queue.
mesos.cluster.event_queue_http_requests (gauge) | Number of HTTP requests in the event queue. Shown as request.
mesos.cluster.event_queue_messages (gauge) | Number of messages in the event queue. Shown as message.
mesos.cluster.frameworks_active (gauge) | Number of active frameworks.
mesos.cluster.frameworks_connected (gauge) | Number of connected frameworks.
mesos.cluster.frameworks_disconnected (gauge) | Number of disconnected frameworks.
mesos.cluster.frameworks_inactive (gauge) | Number of inactive frameworks.
mesos.cluster.gpus_percent (gauge) | Percentage of allocated GPUs. Shown as percent.
mesos.cluster.gpus_total (gauge) | Number of GPUs.
mesos.cluster.gpus_used (gauge) | Number of allocated GPUs.
mesos.cluster.invalid_framework_to_executor_messages (gauge) | Number of invalid framework messages. Shown as message.
mesos.cluster.invalid_status_update_acknowledgements (gauge) | Number of invalid status update acknowledgements.
mesos.cluster.invalid_status_updates (gauge) | Number of invalid status updates.
mesos.cluster.mem_percent (gauge) | Percentage of allocated memory. Shown as percent.
mesos.cluster.mem_total (gauge) | Total memory. Shown as mebibyte.
mesos.cluster.mem_used (gauge) | Allocated memory. Shown as mebibyte.
mesos.cluster.outstanding_offers (gauge) | Number of outstanding resource offers.
mesos.cluster.slave_registrations (gauge) | Number of slaves that were able to cleanly re-join the cluster and connect back to the master after the master is disconnected.
mesos.cluster.slave_removals (gauge) | Number of slaves removed for various reasons, including maintenance.
mesos.cluster.slave_reregistrations (gauge) | Number of slave re-registrations.
mesos.cluster.slave_shutdowns_canceled (gauge) | Number of canceled slave shutdowns.
mesos.cluster.slave_shutdowns_scheduled (gauge) | Number of slaves that have failed their health check and are scheduled to be removed.
mesos.cluster.slaves_active (gauge) | Number of active slaves.
mesos.cluster.slaves_connected (gauge) | Number of connected slaves.
mesos.cluster.slaves_disconnected (gauge) | Number of disconnected slaves.
mesos.cluster.slaves_inactive (gauge) | Number of inactive slaves.
mesos.cluster.tasks_error (gauge) | Number of tasks that were invalid. Shown as task.
mesos.cluster.tasks_failed (count) | Number of failed tasks. Shown as task.
mesos.cluster.tasks_finished (count) | Number of finished tasks. Shown as task.
mesos.cluster.tasks_killed (count) | Number of killed tasks. Shown as task.
mesos.cluster.tasks_lost (count) | Number of lost tasks. Shown as task.
mesos.cluster.tasks_running (gauge) | Number of running tasks. Shown as task.
mesos.cluster.tasks_staging (gauge) | Number of staging tasks. Shown as task.
mesos.cluster.tasks_starting (gauge) | Number of starting tasks. Shown as task.
mesos.cluster.valid_framework_to_executor_messages (gauge) | Number of valid framework messages. Shown as message.
mesos.cluster.valid_status_update_acknowledgements (gauge) | Number of valid status update acknowledgements.
mesos.cluster.valid_status_updates (gauge) | Number of valid status updates.
mesos.framework.cpu (gauge) | Framework CPU.
mesos.framework.disk (gauge) | Framework disk. Shown as mebibyte.
mesos.framework.mem (gauge) | Framework memory. Shown as mebibyte.
mesos.registrar.log.recovered (gauge) | Registrar log recovered.
mesos.registrar.queued_operations (gauge) | Number of queued operations.
mesos.registrar.registry_size_bytes (gauge) | Registry size. Shown as byte.
mesos.registrar.state_fetch_ms (gauge) | Registry read latency. Shown as millisecond.
mesos.registrar.state_store_ms (gauge) | Registry write latency. Shown as millisecond.
mesos.registrar.state_store_ms.count (gauge) | Registry write count.
mesos.registrar.state_store_ms.max (gauge) | Maximum registry write latency. Shown as millisecond.
mesos.registrar.state_store_ms.min (gauge) | Minimum registry write latency. Shown as millisecond.
mesos.registrar.state_store_ms.p50 (gauge) | Median registry write latency. Shown as millisecond.
mesos.registrar.state_store_ms.p90 (gauge) | 90th percentile registry write latency. Shown as millisecond.
mesos.registrar.state_store_ms.p95 (gauge) | 95th percentile registry write latency. Shown as millisecond.
mesos.registrar.state_store_ms.p99 (gauge) | 99th percentile registry write latency. Shown as millisecond.
mesos.registrar.state_store_ms.p999 (gauge) | 99.9th percentile registry write latency. Shown as millisecond.
mesos.registrar.state_store_ms.p9999 (gauge) | 99.99th percentile registry write latency. Shown as millisecond.
mesos.role.cpu (gauge) | Role CPU.
mesos.role.disk (gauge) | Role disk. Shown as mebibyte.
mesos.role.mem (gauge) | Role memory. Shown as mebibyte.
mesos.stats.elected (gauge) | Whether this is the elected master.
mesos.stats.registered (gauge) | Whether this slave is registered with a master.
mesos.stats.system.cpus_total (gauge) | Number of CPUs available.
mesos.stats.system.load_15min (gauge) | Load average for the past 15 minutes.
mesos.stats.system.load_1min (gauge) | Load average for the past minute.
mesos.stats.system.load_5min (gauge) | Load average for the past 5 minutes.
mesos.stats.system.mem_free_bytes (gauge) | Free memory. Shown as byte.
mesos.stats.system.mem_total_bytes (gauge) | Total memory. Shown as byte.
mesos.stats.uptime_secs (gauge) | Uptime. Shown as second.
The Mesos-master check does not include any events.
mesos_master.can_connect
Returns CRITICAL if the Agent cannot connect to the Mesos Master API to collect metrics, UNKNOWN if the master is not detected as the leader, otherwise OK.
Statuses: ok, critical, unknown
Need help? Contact Datadog support.