This check collects metrics from your YARN ResourceManager, including (but not limited to) cluster-wide, per-application, per-node, and per-queue metrics.
The yarn.apps.<METRIC> metrics are deprecated in favor of the yarn.apps.<METRIC>_gauge metrics because the yarn.apps.<METRIC> metrics are incorrectly reported as a RATE instead of a GAUGE.
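When moving dashboards or monitors off the deprecated names, the rename described above is mechanical: append `_gauge`. A minimal sketch, assuming a hypothetical helper named `to_gauge_name` (it is not part of the Agent or of any Datadog library):

```python
DEPRECATED_PREFIX = "yarn.apps."
GAUGE_SUFFIX = "_gauge"


def to_gauge_name(metric: str) -> str:
    """Return the gauge replacement for a deprecated yarn.apps.<METRIC> name.

    Names outside the yarn.apps namespace, or names that already end in
    _gauge, are returned unchanged.
    """
    if metric.startswith(DEPRECATED_PREFIX) and not metric.endswith(GAUGE_SUFFIX):
        return metric + GAUGE_SUFFIX
    return metric


if __name__ == "__main__":
    for name in (
        "yarn.apps.allocated_mb",    # deprecated rate metric
        "yarn.apps.progress_gauge",  # already a gauge
        "yarn.metrics.total_mb",     # not a yarn.apps metric
    ):
        print(name, "->", to_gauge_name(name))
```

Only the yarn.apps namespace is rewritten; the yarn.metrics, yarn.node, and yarn.queue metrics are already gauges.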
The YARN check is included in the Datadog Agent package, so you don’t need to install anything else on your YARN ResourceManager.
To configure this check for an Agent running on a host:
Edit the yarn.d/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory.
init_config:

instances:

    ## @param resourcemanager_uri - string - required
    ## The YARN check retrieves metrics from YARN's ResourceManager. This
    ## check must be run from the Master Node and the ResourceManager URI must
    ## be specified below. The ResourceManager URI is composed of the
    ## ResourceManager's hostname and port.
    ## The ResourceManager hostname can be found in the yarn-site.xml conf file
    ## under the property yarn.resourcemanager.address
    ##
    ## The ResourceManager port can be found in the yarn-site.xml conf file under
    ## the property yarn.resourcemanager.webapp.address
    #
  - resourcemanager_uri: http://localhost:8088

    ## @param cluster_name - string - required - default: default_cluster
    ## A friendly name for the cluster.
    #
    cluster_name: default_cluster
See the example check configuration for a comprehensive list and description of all check options.
Restart the Agent to start sending YARN metrics to Datadog.
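The resourcemanager_uri value can be derived from yarn-site.xml, as the configuration comments describe. The following is a rough sketch, not part of the Agent: the helper name `resourcemanager_uri` and the sample file content are hypothetical, and it assumes the yarn.resourcemanager.webapp.address property holds a literal `host:port` value.

```python
import xml.etree.ElementTree as ET

# Hypothetical yarn-site.xml content, inlined so the sketch is self-contained.
SAMPLE_YARN_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>rm.example.internal:8088</value>
  </property>
</configuration>
"""

WEBAPP_PROPERTY = "yarn.resourcemanager.webapp.address"


def resourcemanager_uri(yarn_site_xml: str, scheme: str = "http") -> str:
    """Build a ResourceManager URI from the webapp address property.

    Assumes the property value is a literal host:port pair; Hadoop variable
    references such as ${yarn.resourcemanager.hostname} are not expanded.
    """
    root = ET.fromstring(yarn_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == WEBAPP_PROPERTY:
            address = (prop.findtext("value") or "").strip()
            if address:
                return f"{scheme}://{address}"
    raise KeyError(f"{WEBAPP_PROPERTY} is not set")


if __name__ == "__main__":
    print(resourcemanager_uri(SAMPLE_YARN_SITE))
```

If the property is absent from your yarn-site.xml, the cluster is using Hadoop's default value and the sketch raises `KeyError` rather than guessing.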
For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.
Parameter | Value |
---|---|
<INTEGRATION_NAME> | yarn |
<INIT_CONFIG> | blank or {} |
<INSTANCE_CONFIG> | {"resourcemanager_uri": "http://%%host%%:%%port%%", "cluster_name": "<CLUSTER_NAME>"} |
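As an illustration of how the three parameters above fit together, the sketch below serializes them into Docker labels. It assumes the `com.datadoghq.ad.*` label keys described in Datadog's Autodiscovery documentation; the function name `yarn_autodiscovery_labels` and the cluster name passed to it are hypothetical.

```python
import json


def yarn_autodiscovery_labels(cluster_name: str) -> dict:
    """Return Docker labels carrying the YARN Autodiscovery template.

    Each label value is a JSON-encoded list, one entry per check.
    """
    instance = {
        # %%host%% and %%port%% are template variables resolved by the Agent.
        "resourcemanager_uri": "http://%%host%%:%%port%%",
        "cluster_name": cluster_name,
    }
    return {
        "com.datadoghq.ad.check_names": json.dumps(["yarn"]),  # <INTEGRATION_NAME>
        "com.datadoghq.ad.init_configs": json.dumps([{}]),     # <INIT_CONFIG>
        "com.datadoghq.ad.instances": json.dumps([instance]),  # <INSTANCE_CONFIG>
    }


if __name__ == "__main__":
    for key, value in yarn_autodiscovery_labels("my_cluster").items():
        print(f"{key}={value}")
```

Other container platforms use different carriers for the same template (for example, pod annotations), so treat the label keys as one possible target rather than the only one.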
Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:
logs_enabled: true
Uncomment and edit the logs configuration block in your yarn.d/conf.yaml file. Change the type, path, and service parameter values based on your environment. See the sample yarn.d/conf.yaml for all available configuration options.
logs:
  - type: file
    path: <LOG_FILE_PATH>
    source: yarn
    service: <SERVICE_NAME>
    # To handle multi-line logs whose first line starts with yyyy-mm-dd, use the following pattern:
    # log_processing_rules:
    #   - type: multi_line
    #     pattern: \d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2},\d{3}
    #     name: new_log_start_with_date
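To see what the multi_line pattern in the commented block above does, the sketch below applies the same regular expression in Python and folds continuation lines into the preceding entry. It only illustrates the grouping idea: the Agent's own implementation may differ, and the sample log lines and the `group_multiline` helper are invented for this example.

```python
import re

# Same pattern as in the commented log_processing_rules block.
NEW_LOG_START = re.compile(r"\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def group_multiline(lines):
    """Merge lines that do not start with a timestamp into the previous event."""
    events = []
    for line in lines:
        if NEW_LOG_START.match(line) or not events:
            events.append(line)          # a timestamp opens a new event
        else:
            events[-1] += "\n" + line    # continuation, e.g. a stack trace frame
    return events


# Hypothetical log lines, invented for illustration only.
SAMPLE = [
    "2024-01-15 10:32:07,123 ERROR ResourceManager: request failed",
    "java.lang.IllegalStateException: example",
    "\tat com.example.Foo.bar(Foo.java:42)",
    "2024-01-15 10:32:08,456 INFO ResourceManager: recovered",
]

if __name__ == "__main__":
    for event in group_multiline(SAMPLE):
        print(repr(event))
```

Without the rule, each stack trace line would be treated as a separate log entry.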
To enable logs for Docker environments, see Docker Log Collection.
Run the Agent’s status subcommand and look for yarn under the Checks section.
yarn.apps.allocated_mb (rate) | Deprecated; use yarn.apps.allocated_mb_gauge instead Shown as mebibyte |
yarn.apps.allocated_mb_gauge (gauge) | The sum of memory in MB allocated to the application's running containers Shown as mebibyte |
yarn.apps.allocated_vcores (rate) | Deprecated; use yarn.apps.allocated_vcores_gauge instead Shown as core |
yarn.apps.allocated_vcores_gauge (gauge) | The sum of virtual cores allocated to the application's running containers Shown as core |
yarn.apps.elapsed_time (rate) | Deprecated; use yarn.apps.elapsed_time_gauge instead Shown as second |
yarn.apps.elapsed_time_gauge (gauge) | The elapsed time since the application started (in ms) Shown as millisecond |
yarn.apps.finished_time (rate) | Deprecated; use yarn.apps.finished_time_gauge instead Shown as second |
yarn.apps.finished_time_gauge (gauge) | The time at which the application finished (in ms since epoch) Shown as millisecond |
yarn.apps.memory_seconds (rate) | Deprecated; use yarn.apps.memory_seconds_gauge instead Shown as second |
yarn.apps.memory_seconds_gauge (gauge) | The amount of memory the application has allocated (megabyte-seconds) Shown as mebibyte |
yarn.apps.progress (rate) | Deprecated; use yarn.apps.progress_gauge instead Shown as percent |
yarn.apps.progress_gauge (gauge) | The progress of the application, displayed as 0, 10, & 100, which represent the 3 states: hasn't started, in progress, & completed Shown as percent |
yarn.apps.running_containers (rate) | Deprecated; use yarn.apps.running_containers_gauge instead |
yarn.apps.running_containers_gauge (gauge) | The number of containers currently running for the application Shown as container |
yarn.apps.started_time (rate) | Deprecated; use yarn.apps.started_time_gauge instead Shown as second |
yarn.apps.started_time_gauge (gauge) | The time at which the application started (in ms since epoch) Shown as millisecond |
yarn.apps.vcore_seconds (rate) | Deprecated; use yarn.apps.vcore_seconds_gauge instead Shown as second |
yarn.apps.vcore_seconds_gauge (gauge) | The amount of CPU resources the application has allocated (virtual core-seconds) Shown as core |
yarn.metrics.active_nodes (gauge) | The number of active nodes Shown as node |
yarn.metrics.allocated_mb (gauge) | The amount of allocated memory Shown as mebibyte |
yarn.metrics.allocated_virtual_cores (gauge) | The number of allocated virtual cores Shown as core |
yarn.metrics.apps_completed (gauge) | The number of completed apps Shown as task |
yarn.metrics.apps_failed (gauge) | The number of failed apps Shown as task |
yarn.metrics.apps_killed (gauge) | The number of killed apps Shown as task |
yarn.metrics.apps_pending (gauge) | The number of pending apps Shown as task |
yarn.metrics.apps_running (gauge) | The number of running apps Shown as task |
yarn.metrics.apps_submitted (gauge) | The number of submitted apps Shown as task |
yarn.metrics.available_mb (gauge) | The amount of available memory Shown as mebibyte |
yarn.metrics.available_virtual_cores (gauge) | The number of available virtual cores Shown as core |
yarn.metrics.containers_allocated (gauge) | The number of containers allocated |
yarn.metrics.containers_pending (gauge) | The number of containers pending |
yarn.metrics.containers_reserved (gauge) | The number of containers reserved |
yarn.metrics.decommissioned_nodes (gauge) | The number of decommissioned nodes Shown as node |
yarn.metrics.decommissioning_nodes (gauge) | The number of decommissioning nodes Shown as node |
yarn.metrics.lost_nodes (gauge) | The number of lost nodes Shown as node |
yarn.metrics.rebooted_nodes (gauge) | The number of rebooted nodes Shown as node |
yarn.metrics.reserved_mb (gauge) | The size of reserved memory Shown as mebibyte |
yarn.metrics.reserved_virtual_cores (gauge) | The number of reserved virtual cores Shown as core |
yarn.metrics.total_mb (gauge) | The amount of total memory Shown as mebibyte |
yarn.metrics.total_nodes (gauge) | The total number of nodes Shown as node |
yarn.metrics.total_virtual_cores (gauge) | The total number of virtual cores Shown as core |
yarn.metrics.unhealthy_nodes (gauge) | The number of unhealthy nodes Shown as node |
yarn.node.avail_memory_mb (gauge) | The total amount of memory currently available on the node (in MB) Shown as mebibyte |
yarn.node.available_virtual_cores (gauge) | The total number of vCores available on the node Shown as core |
yarn.node.last_health_update (gauge) | The last time the node reported its health (in ms since epoch) Shown as millisecond |
yarn.node.num_containers (gauge) | The total number of containers currently running on the node |
yarn.node.used_memory_mb (gauge) | The total amount of memory currently used on the node (in MB) Shown as mebibyte |
yarn.node.used_virtual_cores (gauge) | The total number of vCores currently used on the node Shown as core |
yarn.queue.absolute_capacity (gauge) | The absolute capacity percentage this queue can use of the entire cluster Shown as percent |
yarn.queue.absolute_max_capacity (gauge) | The absolute maximum capacity percentage this queue can use of the entire cluster Shown as percent |
yarn.queue.absolute_used_capacity (gauge) | The absolute used capacity percentage this queue is using of the entire cluster Shown as percent |
yarn.queue.am_resource_limit.memory (gauge) | The maximum memory resources this queue can use for Application Masters (in MB) Shown as mebibyte |
yarn.queue.am_resource_limit.vcores (gauge) | The maximum vCPUs this queue can use for Application Masters Shown as core |
yarn.queue.capacity (gauge) | The configured queue capacity in percentage relative to its parent queue Shown as percent |
yarn.queue.max_active_applications (gauge) | The maximum number of active applications this queue can have Shown as task |
yarn.queue.max_active_applications_per_user (gauge) | The maximum number of active applications per user this queue can have Shown as task |
yarn.queue.max_applications (gauge) | The maximum number of applications this queue can have Shown as task |
yarn.queue.max_applications_per_user (gauge) | The maximum number of applications per user this queue can have Shown as task |
yarn.queue.max_capacity (gauge) | The configured maximum queue capacity in percentage relative to its parent queue Shown as percent |
yarn.queue.num_active_applications (gauge) | The number of active applications in this queue Shown as task |
yarn.queue.num_applications (gauge) | The number of applications currently in the queue Shown as task |
yarn.queue.num_containers (gauge) | The number of containers being used |
yarn.queue.num_pending_applications (gauge) | The number of pending applications in this queue Shown as task |
yarn.queue.resources_used.memory (gauge) | The total memory resources this queue is using (in MB) Shown as mebibyte |
yarn.queue.resources_used.vcores (gauge) | The total vCPUs this queue is using Shown as core |
yarn.queue.root.capacity (gauge) | The configured queue capacity in percentage for root queue Shown as percent |
yarn.queue.root.max_capacity (gauge) | The configured maximum queue capacity in percentage for root queue Shown as percent |
yarn.queue.root.used_capacity (gauge) | The used queue capacity in percentage for root queue Shown as percent |
yarn.queue.used_am_resource.memory (gauge) | The memory resources used for Application Masters (in MB) Shown as mebibyte |
yarn.queue.used_am_resource.vcores (gauge) | The vCPUs used for Application Masters Shown as core |
yarn.queue.used_capacity (gauge) | The used queue capacity in percentage Shown as percent |
yarn.queue.user_am_resource_limit.memory (gauge) | The maximum memory resources a user can use for Application Masters (in MB) Shown as mebibyte |
yarn.queue.user_am_resource_limit.vcores (gauge) | The maximum vCPUs a user can use for Application Masters Shown as core |
yarn.queue.user_limit (gauge) | The minimum user limit percent set in the configuration |
yarn.queue.user_limit_factor (gauge) | The user limit factor set in the configuration |
The YARN check does not include any events.
yarn.can_connect
Returns CRITICAL if the Agent cannot connect to the ResourceManager URI to collect metrics, otherwise OK.
Statuses: ok, critical
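The connectivity rule above can be sketched as a small probe. This is not the Agent's implementation. `/ws/v1/cluster/info` is an endpoint of the YARN ResourceManager REST API, but using it as the probe target is this sketch's assumption, and the names `http_status` and `can_connect_status` are hypothetical.

```python
import urllib.request


def http_status(url, timeout):
    """Return the HTTP status code for a GET request to url."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def can_connect_status(resourcemanager_uri, get_status=http_status, timeout=5):
    """Return "ok" if the ResourceManager answers with HTTP 200, else "critical".

    get_status is injectable so the logic can be exercised without a cluster.
    """
    url = resourcemanager_uri.rstrip("/") + "/ws/v1/cluster/info"
    try:
        return "ok" if get_status(url, timeout) == 200 else "critical"
    except OSError:
        # Covers refused connections, timeouts, DNS failures, and the
        # urllib.error.URLError / HTTPError family (all subclasses of OSError).
        return "critical"


if __name__ == "__main__":
    print(can_connect_status("http://localhost:8088"))
```

Running it against the same resourcemanager_uri you put in yarn.d/conf.yaml is a quick way to rule out network problems before inspecting the Agent itself.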
yarn.application.status
By default, returns OK if the YARN application state is NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, or FINISHED; UNKNOWN if the application state is ALL; and CRITICAL if the YARN application state is FAILED or KILLED.
Statuses: ok, unknown, critical
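The default state-to-status mapping described above can be written out as a lookup table. This is a minimal sketch rather than the check's actual code: the `application_status` helper is hypothetical, its `mapping` parameter stands in for whatever override mechanism "by default" implies, and returning "unknown" for unlisted states is an assumption of the sketch.

```python
# Default mapping as described for the yarn.application.status service check.
DEFAULT_STATUS_BY_STATE = {
    "NEW": "ok",
    "NEW_SAVING": "ok",
    "SUBMITTED": "ok",
    "ACCEPTED": "ok",
    "RUNNING": "ok",
    "FINISHED": "ok",
    "ALL": "unknown",
    "FAILED": "critical",
    "KILLED": "critical",
}


def application_status(state, mapping=DEFAULT_STATUS_BY_STATE):
    """Translate a YARN application state into a service check status.

    States missing from the mapping fall back to "unknown" (an assumption
    made by this sketch, not documented behavior).
    """
    return mapping.get(state.upper(), "unknown")


if __name__ == "__main__":
    for state in ("RUNNING", "ALL", "KILLED"):
        print(state, "->", application_status(state))
```

Passing a modified copy of the dictionary as `mapping` shows how a stricter policy, such as treating KILLED as OK for batch jobs that are routinely cancelled, would change the resulting statuses.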
Need help? Contact Datadog support.