Presto

Supported OS: Linux, macOS, Windows

Integration version: 3.1.0

Overview

This check collects Presto metrics, for example:

  • Overall activity metrics: completed/failed queries, data input/output size, execution time.
  • Performance metrics: cluster memory, input CPU, execution CPU time.

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.

Installation

The Presto check is included in the Datadog Agent package. No additional installation is needed on your server. Install the Agent on each Coordinator and Worker node from which you wish to collect usage and performance metrics.

Configuration

  1. Edit the presto.d/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory to start collecting your Presto performance data. See the sample presto.d/conf.yaml for all available configuration options.

    This check has a limit of 350 metrics per instance. The number of returned metrics is indicated on the status page. You can specify the metrics you are interested in by editing the configuration. To learn how to customize the metrics to collect, see the JMX Checks documentation. If you need to monitor more metrics, contact Datadog support.

  2. Restart the Agent.
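
As a rough sketch of the metric filtering mentioned in step 1, a JMX check instance can carry a conf list whose include entries select specific beans and attributes. The host, port, bean name, and attribute names here are assumptions for illustration only; confirm them against your own Presto JMX tree and the JMX Checks documentation before relying on them.

```yaml
init_config:
  is_jmx: true

instances:
  - host: localhost   # assumption: Presto JMX is reachable on this host
    port: 9999        # assumption: replace with your actual JMX port
    conf:
      - include:
          # Assumed bean name; verify it with a JMX browser such as jconsole.
          bean: com.facebook.presto.execution:name=QueryManager
          attribute:
            # Assumed attribute names, mapped to metric names from this page.
            RunningQueries:
              alias: presto.execution.running_queries
              metric_type: gauge
            FailedQueries.OneMinute.Count:
              alias: presto.execution.failed_queries.one_minute.count
              metric_type: gauge
```

Narrowing the include list this way is one option for staying under the 350-metric limit when only a few metrics matter.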

Metric collection

Use the default configuration of your presto.d/conf.yaml file to activate the collection of your Presto metrics. See the sample presto.d/conf.yaml for all available configuration options.
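
For orientation, a minimal instance definition might look like the sketch below. The host and port values are placeholders (assumptions), and the init_config keys follow the usual layout of JMX-based checks; the sample presto.d/conf.yaml remains the authoritative list of options.

```yaml
init_config:
  is_jmx: true                    # Presto metrics are collected over JMX
  collect_default_metrics: true   # use the metric list shipped with the check

instances:
  - host: localhost   # placeholder: the node whose JMX endpoint is queried
    port: 9999        # placeholder: the JMX port exposed by Presto
```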

Log collection

Available for Agent versions >6.0

  1. Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
    
  2. Add this configuration block to your presto.d/conf.yaml file to start collecting your Presto logs:

    logs:
      - type: file
        path: /var/log/presto/*.log
        source: presto
        service: "<SERVICE_NAME>"
    

    Change the path and service parameter values to match your environment. See the sample presto.d/conf.yaml for all available configuration options.

  3. Restart the Agent.
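
If your Presto logs contain multi-line entries such as Java stack traces, a multi-line processing rule can keep each entry together. The following is a sketch only: it assumes that every new log entry starts with a date in YYYY-MM-DD form, and the rule name is a hypothetical label. Check the pattern against your actual log format before using it.

```yaml
logs:
  - type: file
    path: /var/log/presto/*.log
    source: presto
    service: "<SERVICE_NAME>"
    log_processing_rules:
      - type: multi_line
        name: new_log_start_with_date   # hypothetical rule name
        # Assumption: each entry begins with a date such as 2019-06-05
        pattern: \d{4}-\d{2}-\d{2}
```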

Validation

Run the Agent’s status subcommand (for example, sudo datadog-agent status on Linux) and look for presto under the Checks section.

Data Collected

Metrics

presto.execution.abandoned_queries.one_minute.count
(gauge)
Abandoned queries - one minute count.
Shown as query
presto.execution.abandoned_queries.one_minute.rate
(gauge)
Abandoned queries - one minute rate.
Shown as query
presto.execution.abandoned_queries.total_count
(gauge)
Abandoned queries - total count.
Shown as query
presto.execution.canceled_queries.one_minute.count
(gauge)
Canceled queries - one minute count.
Shown as query
presto.execution.canceled_queries.one_minute.rate
(gauge)
Canceled queries - one minute queries per second.
Shown as query
presto.execution.canceled_queries.total_count
(gauge)
Canceled queries - total count.
Shown as query
presto.execution.completed_queries.one_minute.count
(gauge)
Completed queries - one minute count.
Shown as query
presto.execution.completed_queries.one_minute.rate
(gauge)
Completed queries - one minute queries per second.
Shown as query
presto.execution.completed_queries.total_count
(gauge)
Completed queries - total count.
Shown as query
presto.execution.consumed_cpu_time_secs.one_minute.count
(gauge)
CPU (processing) time consumed - one minute count (seconds).
Shown as second
presto.execution.consumed_cpu_time_secs.one_minute.rate
(gauge)
CPU (processing) time consumed - one minute rate.
Shown as second
presto.execution.consumed_cpu_time_secs.total_count
(gauge)
CPU (processing) time consumed - total count (seconds).
Shown as second
presto.execution.cpu_input_byte_rate.all_time.avg
(gauge)
Distribution of query input data rates (cpu) - all time average bytes per second.
Shown as byte
presto.execution.cpu_input_byte_rate.all_time.p75
(gauge)
Distribution of query input data rates (cpu) - all time bytes per second - p75.
Shown as byte
presto.execution.cpu_input_byte_rate.all_time.p95
(gauge)
Distribution of query input data rates (cpu) - all time bytes per second - p95.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.avg
(gauge)
Distribution of query input data rates (cpu) - one minute average bytes per second.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.count
(gauge)
Distribution of query input data rates (cpu) - one minute count.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.max
(gauge)
Distribution of query input data rates (cpu) - one minute max bytes per second.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.min
(gauge)
Distribution of query input data rates (cpu) - one minute min bytes per second.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.p75
(gauge)
Distribution of query input data rates (cpu) - one minute bytes per second - p75.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.p95
(gauge)
Distribution of query input data rates (cpu) - one minute bytes per second - p95.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.total
(gauge)
Distribution of query input data rates (cpu) - one minute total bytes per second.
Shown as byte
presto.execution.execution_time.all_time.avg
(gauge)
Query execution time (millisecond) - all time average.
Shown as millisecond
presto.execution.execution_time.all_time.count
(gauge)
Query execution time (millisecond) - all time count.
Shown as millisecond
presto.execution.execution_time.all_time.max
(gauge)
Query execution time (millisecond) - all time max.
Shown as millisecond
presto.execution.execution_time.all_time.min
(gauge)
Query execution time (millisecond) - all time min.
Shown as millisecond
presto.execution.execution_time.all_time.p75
(gauge)
Query execution time (millisecond) - all time - p75.
Shown as millisecond
presto.execution.execution_time.all_time.p95
(gauge)
Query execution time (millisecond) - all time - p95.
Shown as millisecond
presto.execution.execution_time.one_minute.avg
(gauge)
Query execution time (millisecond) - one minute average.
Shown as millisecond
presto.execution.execution_time.one_minute.max
(gauge)
Query execution time (millisecond) - one minute max.
Shown as millisecond
presto.execution.execution_time.one_minute.min
(gauge)
Query execution time (millisecond) - one minute min.
Shown as millisecond
presto.execution.execution_time.one_minute.p75
(gauge)
Query execution time (millisecond) - one minute p75.
Shown as millisecond
presto.execution.execution_time.one_minute.p95
(gauge)
Query execution time (millisecond) - one minute p95.
Shown as millisecond
presto.execution.executor.active_count
(gauge)
presto.execution.executor.blocked_splits
(gauge)
Blocked splits count.
Shown as split
presto.execution.executor.completed_task_count
(gauge)
Shown as task
presto.execution.executor.core_pool_size
(gauge)
presto.execution.executor.pool_size
(gauge)
presto.execution.executor.processor_executor.queued_task_count
(gauge)
Queued task count.
Shown as task
presto.execution.executor.queued_task_count
(gauge)
presto.execution.executor.running_splits
(gauge)
Running splits count.
Shown as split
presto.execution.executor.task_count
(gauge)
Shown as task
presto.execution.executor.total_splits
(gauge)
Total splits count.
Shown as split
presto.execution.executor.waiting_splits
(gauge)
Waiting splits count.
Shown as split
presto.execution.external_failures.one_minute.count
(gauge)
Failed queries (external) - one minute count.
Shown as query
presto.execution.external_failures.one_minute.rate
(gauge)
Failed queries (external) - one minute failures per second.
Shown as query
presto.execution.external_failures.total_count
(gauge)
Failed queries (external) - total count.
Shown as query
presto.execution.failed_queries.one_minute.count
(gauge)
Failed queries - one minute count.
Shown as query
presto.execution.failed_queries.one_minute.rate
(gauge)
Failed queries - one minute queries per second.
Shown as query
presto.execution.failed_queries.total_count
(gauge)
Failed queries - total count.
Shown as query
presto.execution.input_data_size.one_minute.count
(gauge)
Input data (bytes) - one minute count.
Shown as byte
presto.execution.input_data_size.one_minute.rate
(gauge)
Input data (bytes) - one minute bytes per second.
Shown as byte
presto.execution.input_data_size.total_count
(gauge)
Input data (bytes) - total count.
Shown as byte
presto.execution.input_positions.one_minute.count
(gauge)
Input positions (rows) - one minute count.
Shown as row
presto.execution.input_positions.one_minute.rate
(gauge)
Input positions (rows) - one minute rows per second.
Shown as row
presto.execution.input_positions.total_count
(gauge)
Input positions (rows) - total count.
Shown as row
presto.execution.insufficient_resources_failures.one_minute.count
(gauge)
Insufficient resources failures - one minute count.
presto.execution.insufficient_resources_failures.one_minute.rate
(gauge)
Insufficient resources failures - one minute failures per second.
presto.execution.insufficient_resources_failures.total_count
(gauge)
Insufficient resources failures - total count.
presto.execution.internal_failures.one_minute.count
(gauge)
Failed queries (internal) - one minute count.
Shown as query
presto.execution.internal_failures.one_minute.rate
(gauge)
Failed queries (internal) - one minute queries per second.
Shown as query
presto.execution.internal_failures.total_count
(gauge)
Failed queries (internal) - total count.
Shown as query
presto.execution.management_executor.active_count
(gauge)
presto.execution.management_executor.completed_task_count
(gauge)
Shown as task
presto.execution.management_executor.queued_task_count
(gauge)
Shown as task
presto.execution.output_data_size.one_minute.count
(gauge)
Output data (bytes) - one minute count.
Shown as byte
presto.execution.output_data_size.one_minute.rate
(gauge)
Output data (bytes) - one minute bytes per second.
Shown as byte
presto.execution.output_data_size.total_count
(gauge)
Output data (bytes) - total count.
Shown as byte
presto.execution.output_positions.one_minute.count
(gauge)
Output positions (rows) - one minute count.
Shown as row
presto.execution.output_positions.one_minute.rate
(gauge)
Output positions (rows) - one minute rows per second.
Shown as row
presto.execution.output_positions.total_count
(gauge)
Output positions (rows) - total count.
Shown as row
presto.execution.running_queries
(gauge)
Active queries.
Shown as query
presto.execution.started_queries.one_minute.count
(gauge)
Queries started - one minute count.
Shown as query
presto.execution.started_queries.one_minute.rate
(gauge)
Queries started - one minute queries per second.
Shown as query
presto.execution.started_queries.total_count
(gauge)
Queries started - total count.
Shown as query
presto.execution.task_notification_executor.active_count
(gauge)
presto.execution.task_notification_executor.completed_task_count
(gauge)
Shown as task
presto.execution.task_notification_executor.pool_size
(gauge)
presto.execution.task_notification_executor.queued_task_count
(gauge)
Shown as task
presto.execution.user_error_failures.one_minute.count
(gauge)
Failed queries (user error) - one minute count.
Shown as query
presto.execution.user_error_failures.one_minute.rate
(gauge)
Failed queries (user error) - one minute queries per second.
Shown as query
presto.execution.user_error_failures.total_count
(gauge)
Failed queries (user error) - total count.
Shown as query
presto.execution.wall_input_bytes_rate.one_minute.avg
(gauge)
Input data rate (bytes) - one minute average.
Shown as byte
presto.execution.wall_input_bytes_rate.one_minute.max
(gauge)
Input data rate (bytes) - one minute max.
Shown as byte
presto.execution.wall_input_bytes_rate.one_minute.min
(gauge)
Input data rate (bytes) - one minute min.
Shown as byte
presto.execution.wall_input_bytes_rate.one_minute.p75
(gauge)
Input data rate (bytes) - one minute p75.
Shown as byte
presto.execution.wall_input_bytes_rate.one_minute.p95
(gauge)
Input data rate (bytes) - one minute p95.
Shown as byte
presto.failure_detector.active_count
(gauge)
Active node count.
Shown as node
presto.memory.assigned_queries
(gauge)
Memory (assigned queries).
Shown as byte
presto.memory.blocked_nodes
(gauge)
Memory (blocked nodes).
Shown as byte
presto.memory.cluster_memory_bytes
(gauge)
Cluster memory (bytes).
Shown as byte
presto.memory.free_bytes
(gauge)
Memory (free bytes).
Shown as byte
presto.memory.free_distributed_bytes
(gauge)
Memory (free distributed bytes).
Shown as byte
presto.memory.max_bytes
(gauge)
Memory (max bytes).
Shown as byte
presto.memory.nodes
(gauge)
Memory (nodes).
Shown as byte
presto.memory.reserved_bytes
(gauge)
Memory (reserved bytes).
Shown as byte
presto.memory.reserved_distributed_bytes
(gauge)
Memory (reserved distributed bytes).
Shown as byte
presto.memory.reserved_revocable_bytes
(gauge)
Memory (reserved revocable bytes).
Shown as byte
presto.memory.reserved_revocable_distributed_bytes
(gauge)
Memory (reserved revocable distributed bytes).
Shown as byte
presto.memory.total_distributed_bytes
(gauge)
Memory (total distributed bytes).
Shown as byte

Events

Presto does not include any events.

Service Checks

presto.can_connect
Returns CRITICAL if the Agent is unable to connect to and collect metrics from the monitored Presto instance, WARNING if no metrics are collected, and OK otherwise.
Statuses: ok, critical, warning

Troubleshooting

Need help? Contact Datadog support.
