Riak MDC Replication

Supported OS: Linux, Mac OS, Windows

Integration version: 1.0.1

Overview

This check monitors Riak replication (riak-repl).

Setup

The Riak-Repl check is not included in the Datadog Agent package, so you need to install it.

Installation

For Agent v7.21+ / v6.21+, follow the instructions below to install the Riak-Repl check on your host. See Use Community Integrations to install with the Docker Agent or earlier versions of the Agent.

  1. Run the following command to install the Agent integration:

    datadog-agent integration install -t datadog-riak_repl==<INTEGRATION_VERSION>
    
  2. Configure your integration the same way you would configure a core integration.
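
As a sketch of step 1 with the placeholder filled in, the snippet below pins the install to 1.0.1, the integration version listed at the top of this page. That choice is an assumption, so substitute the version you need. PACKAGE_SPEC is a hypothetical variable name, and the guard only avoids a confusing error on hosts where datadog-agent is not on the PATH.

```shell
# Assumption: 1.0.1 is the version you want (taken from the top of this page).
INTEGRATION_VERSION="1.0.1"

# Hypothetical variable holding the package spec passed to the installer.
PACKAGE_SPEC="datadog-riak_repl==${INTEGRATION_VERSION}"

# Run the install command from step 1 only if the Agent binary is available.
if command -v datadog-agent >/dev/null 2>&1; then
  datadog-agent integration install -t "$PACKAGE_SPEC"
else
  echo "datadog-agent not found on PATH" >&2
fi
```

Depending on how the Agent was installed, the command may need to be run as the user that owns the Agent installation.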

Configuration

  1. Edit the riak_repl.d/conf.yaml file, located in the conf.d/ folder at the root of your Agent’s configuration directory, to start collecting your riak_repl performance data. See the sample riak_repl.d/conf.yaml for all available configuration options.

  2. Restart the Agent.

Validation

Run the Agent’s status subcommand and look for riak_repl under the Checks section.
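
The check above can be narrowed to the relevant lines with a small filter. This is a minimal sketch: show_riak_repl_status is a hypothetical helper name, the 10 lines of trailing context are an arbitrary choice, and the exact layout of the status output depends on your Agent version.

```shell
# Hypothetical helper: reads Agent status text on stdin and prints the
# riak_repl entry plus the lines that follow it. Exits non-zero if the
# check name does not appear at all.
show_riak_repl_status() {
  grep -A 10 'riak_repl'
}

# Only query the Agent when the binary is available on PATH.
if command -v datadog-agent >/dev/null 2>&1; then
  datadog-agent status | show_riak_repl_status \
    || echo "riak_repl not listed in the Agent status output" >&2
fi
```

If nothing is printed, confirm that riak_repl.d/conf.yaml exists and that the Agent was restarted after you edited it.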

Data Collected

Metrics

riak_repl.server_bytes_sent
(gauge)
Total number of bytes the primary has sent
Shown as byte
riak_repl.server_bytes_recv
(gauge)
Total number of bytes the primary has received
Shown as byte
riak_repl.server_connects
(gauge)
Number of times the primary connects to the client sink
Shown as connection
riak_repl.server_connect_errors
(gauge)
The number of listener to site connection errors
Shown as error
riak_repl.server_fullsyncs
(gauge)
Number of fullsync operations since the server was started
Shown as occurrence
riak_repl.client_bytes_sent
(gauge)
Total number of bytes sent to all connected secondaries
Shown as byte
riak_repl.client_bytes_recv
(gauge)
Total number of bytes the client has received since the server has been started
Shown as byte
riak_repl.client_connects
(gauge)
Total number of sink connections made to this node
Shown as connection
riak_repl.client_connect_errors
(gauge)
Total number of sink connection errors to this node
Shown as connection
riak_repl.client_redirect
(gauge)
Count of client connects to a non-leader node that are redirected to a leader node
Shown as connection
riak_repl.objects_dropped_no_clients
(gauge)
Total number of objects dropped from a full realtime queue
Shown as object
riak_repl.objects_dropped_no_leader
(gauge)
Total number of objects dropped by a sink with no leader
Shown as object
riak_repl.objects_sent
(gauge)
Total number of objects sent via realtime replication
Shown as object
riak_repl.objects_forwarded
(gauge)
Total number of objects forwarded to the leader
Shown as object
riak_repl.elections_elected
(gauge)
Total number of times a new leader has been elected
Shown as occurrence
riak_repl.elections_leader_changed
(gauge)
Total number of times a Riak node has surrendered leadership
Shown as occurrence
riak_repl.rt_source_errors
(gauge)
Total number of source errors detected on the source node
Shown as error
riak_repl.rt_sink_errors
(gauge)
Total number of sink errors detected on the source node
Shown as error
riak_repl.rt_dirty
(gauge)
Number of errors detected that can prevent objects from being replicated via realtime
Shown as error
riak_repl.realtime_send_kbps
(gauge)
Total number of bytes realtime has sent
Shown as kibibyte
riak_repl.realtime_recv_kbps
(gauge)
Total number of bytes realtime has received
Shown as kibibyte
riak_repl.fullsync_send_kbps
(gauge)
Total number of bytes fullsync has sent
Shown as kibibyte
riak_repl.fullsync_recv_kbps
(gauge)
Total number of bytes fullsync has received
Shown as kibibyte
riak_repl.realtime_queue_stats.percent_bytes_used
(gauge)
Percentage of realtime queue used (bytes/max_bytes)
Shown as percent
riak_repl.realtime_queue_stats.bytes
(gauge)
Size in bytes of all objects currently in the realtime queue
Shown as byte
riak_repl.realtime_queue_stats.max_bytes
(gauge)
Size in bytes of the realtime queue
Shown as byte
riak_repl.realtime_queue_stats.overload_drops
(gauge)
Number of put transfers dropped due to an overload of the message queue of the Erlang process responsible for processing outgoing transfers
Shown as occurrence
riak_repl.realtime_queue_stats_consumers.pending
(gauge)
Total number of objects waiting to be sent to the sink cluster
Shown as occurrence
riak_repl.realtime_queue_stats_consumers.unacked
(gauge)
Total number of objects waiting to be acknowledged by a queue consumer
Shown as occurrence
riak_repl.realtime_queue_stats_consumers.drops
(gauge)
Total number of objects dropped from the realtime queue as the result of the queue being full or other errors
Shown as occurrence
riak_repl.realtime_queue_stats_consumers.errs
(gauge)
Total number of errors while pushing/popping from the realtime queue
Shown as occurrence
riak_repl.realtime_source_conn.hb_rtt
(gauge)
Realtime replication heartbeat round-trip time in milliseconds, recorded on the replication source
Shown as millisecond
riak_repl.realtime_source_conn.sent_seq
(gauge)
The last realtime queue sequence number that has been transmitted
Shown as item
riak_repl.realtime_source_conn.objects
(gauge)
Total number of realtime replication objects that have been successfully transmitted to the sink cluster
Shown as object
riak_repl.realtime_sink_conn.deactivated
(gauge)
Total number of realtime replication objects that have been deactivated
Shown as object
riak_repl.realtime_sink_conn.source_drops
(gauge)
Total number of dropped put transfers from the perspective of the sink cluster
Shown as object
riak_repl.realtime_sink_conn.expected_seq
(gauge)
The next realtime queue sequence number that is expected
Shown as item
riak_repl.realtime_sink_conn.acked_seq
(gauge)
The last realtime queue sequence number that has been acknowledged
Shown as item
riak_repl.realtime_sink_conn.pending
(gauge)
Total number of objects waiting to be sent to the sink cluster
Shown as object
riak_repl.fullsync_coordinator.queued
(gauge)
Total number of partitions that are waiting for an available process
Shown as occurrence
riak_repl.fullsync_coordinator.in_progress
(gauge)
Total number of partitions that are being synced
Shown as occurrence
riak_repl.fullsync_coordinator.waiting_for_retry
(gauge)
Total number of partitions waiting for retry
Shown as occurrence
riak_repl.fullsync_coordinator.starting
(gauge)
Total number of partitions connecting to remote cluster
Shown as occurrence
riak_repl.fullsync_coordinator.successful_exits
(gauge)
Total number of partitions successfully synced
Shown as occurrence
riak_repl.fullsync_coordinator.error_exits
(gauge)
Total number of partitions for which sync failed or was aborted
Shown as occurrence
riak_repl.fullsync_coordinator.retry_exits
(gauge)
Total number of partitions successfully synced via retry
Shown as occurrence
riak_repl.fullsync_coordinator.soft_retry_exits
(gauge)
description
Shown as occurrence
riak_repl.fullsync_coordinator.busy_nodes
(gauge)
description
Shown as node
riak_repl.fullsync_coordinator.fullsyncs_completed
(gauge)
Total number of fullsyncs that have been completed to the specified sink cluster
Shown as occurrence
riak_repl.fullsync_coordinator.last_fullsync_duration
(gauge)
The duration (in seconds) of the last completed fullsync
Shown as second

Service Checks

The Riak-Repl integration does not include any service checks.

Events

The Riak-Repl integration does not include any events.

Troubleshooting

Need help? Contact Datadog support.
