Portworx

Supported OS Linux

Integration version 1.1.0

Overview

Get metrics from the Portworx service in real time to:

  • Monitor health and performance of your Portworx Cluster
  • Track disk usage, latency and throughput for Portworx volumes

Setup

The Portworx check is not included in the Datadog Agent package, so you need to install it.

Installation

For Agent v7.21+ / v6.21+, follow the instructions below to install the Portworx check on your host. See Use Community Integrations to install with the Docker Agent or earlier versions of the Agent.

  1. Run the following command to install the Agent integration:

    datadog-agent integration install -t datadog-portworx==<INTEGRATION_VERSION>
    
  2. Configure your integration the same way as a core integration.

Configuration

  1. Edit the portworx.d/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory to start collecting your Portworx metrics. See the sample portworx.d/conf.yaml for all available configuration options.

    init_config:
    
    instances:
     # url of the metrics endpoint of prometheus
     - prometheus_endpoint: http://localhost:9001/metrics
    
  2. Restart the Agent.
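
Before restarting the Agent, it can help to confirm that the URL given in prometheus_endpoint really serves Prometheus text. The following is a minimal sketch, not part of the integration. It assumes the endpoint from the sample configuration above and assumes that the raw Portworx metric names start with px_ (adjust the prefix if your deployment differs). The helper names metric_names and fetch_metric_names are hypothetical.

```python
import urllib.request

# Assumption: the same endpoint as in the sample conf.yaml above.
ENDPOINT = "http://localhost:9001/metrics"


def metric_names(exposition_text, prefix="px_"):
    """Return the sorted, unique metric names found in Prometheus text
    format that start with `prefix` (assumed here to be "px_")."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like `name{label="v"} 1.0` or `name 1.0`.
        head = line.split("{", 1)[0].split()
        if head and head[0].startswith(prefix):
            names.add(head[0])
    return sorted(names)


def fetch_metric_names(url=ENDPOINT, timeout=5):
    """Download the endpoint and list the metric names it exposes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return metric_names(resp.read().decode("utf-8"))
```

Calling fetch_metric_names() on the Agent host should return a non-empty list if the endpoint is correct. An empty list or a connection error suggests the URL or the prefix needs another look.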

Validation

Run the Agent’s info subcommand and check that the output includes a portworx section.

Compatibility

The Portworx check is compatible with Portworx 1.4.0 and possibly earlier versions.

Data Collected

Metrics

portworx.cluster.cpu_percent
(gauge)
Node CPU Percentage
Shown as percent
portworx.cluster.disk_available_bytes
(gauge)
Node Available disk space
Shown as byte
portworx.cluster.disk_total_bytes
(gauge)
Node total bytes
Shown as byte
portworx.cluster.disk_utilized_bytes
(gauge)
Node Utilized bytes
Shown as byte
portworx.cluster.memory_utilized_percent
(gauge)
Node Memory Utilization Percentage
Shown as percent
portworx.cluster.pendingio
(gauge)
Node Pending IOs
portworx.cluster.status_cluster.quorum
(gauge)
Cluster Quorum
portworx.cluster.status_cluster.size
(gauge)
Cluster Size
portworx.cluster.status_nodes_offline
(gauge)
Cluster Number of Offline nodes
portworx.cluster.status_nodes_online
(gauge)
Cluster Number of Online nodes
portworx.cluster.status_nodes_storage_down
(gauge)
Cluster Number of nodes with storage down
portworx.cluster.status_storage_nodes_offline
(gauge)
Cluster Number of nodes with storage offline
portworx.cluster.status_storage_nodes_online
(gauge)
Cluster Number of nodes with storage online
portworx.alerts.pool_expand_successful
(gauge)
Triggered when a pool expand operation succeeds
portworx.alerts.pool_expand_failed
(gauge)
Triggered when a pool expand operation fails
portworx.cluster.status.cluster_quorum
(gauge)
Indicates if the cluster is in quorum
portworx.cluster.status.cluster_size
(gauge)
Node count for your Portworx cluster
portworx.cluster.status.nodes_offline
(gauge)
Number of offline nodes in the cluster (includes storage and storageless)
portworx.cluster.status.nodes_online
(gauge)
Number of online nodes in the cluster (includes storage and storageless)
portworx.cluster.status.nodes_storage_down
(gauge)
Number of nodes where the storage is full or down
portworx.cluster.status.storage_nodes_offline
(gauge)
Number of storage nodes that are offline
portworx.cluster.status.storage_nodes_online
(gauge)
Number of storage nodes that are online
portworx.disk_stats.interval_seconds
(gauge)
Disk stats for interval seconds
Shown as second
portworx.disk_stats.io_seconds
(gauge)
Disk stats for IOs per second
Shown as second
portworx.disk_stats.progress_io
(gauge)
Disk stats for IOs in progress
Shown as second
portworx.disk_stats.read_bytes
(gauge)
Disk stats for number of read bytes
Shown as byte
portworx.disk_stats.read_latency_seconds
(gauge)
Disk stats for read latency in seconds
Shown as second
portworx.disk_stats.read_seconds
(gauge)
Disk stats for reads per second
Shown as second
portworx.disk_stats.reads
(gauge)
Disk stats for number of reads
portworx.disk_stats.used_bytes
(gauge)
Disk stats for used bytes
Shown as byte
portworx.disk_stats.write_bytes
(gauge)
Disk stats for number of written bytes
Shown as byte
portworx.disk_stats.write_latency_seconds
(gauge)
Disk stats for write latency in seconds
Shown as second
portworx.disk_stats.write_seconds
(gauge)
Disk stats for writes per second
Shown as second
portworx.disk_stats.writes
(gauge)
Disk stats for number of writes
portworx.disk_stats.num_reads
(gauge)
Total number of read operations completed successfully for this disk
portworx.disk_stats.num_writes
(gauge)
Total number of write operations completed successfully for this disk
portworx.network_io.bytessent
(gauge)
Network stats for bytes sent
Shown as byte
portworx.network_io.received_bytes
(gauge)
Network stats for bytes received
Shown as byte
portworx.node_status.status
(gauge)
Status of this node
portworx.node_status.licence_expiry
(gauge)
Number of days until License (or License lease) expires (<0 means Expired)
portworx.pool_stats.pool_flushed_bytes
(gauge)
Pool stats for flushed bytes
Shown as byte
portworx.pool_stats.pool_flushms
(gauge)
Pool stats for flush latency
Shown as millisecond
portworx.pool_stats.pool_num_flushes
(gauge)
Pool stats for number of flushes
portworx.pool_stats.pool_write_latency_seconds
(gauge)
Pool stats for write latency
Shown as second
portworx.pool_stats.pool_writethroughput
(gauge)
Pool stats for write throughput
Shown as byte
portworx.pool_stats.pool_written_bytes
(gauge)
Pool stats for written bytes
Shown as byte
portworx.proc_stats.cpu_percenttime
(gauge)
Proc stats for CPU percent time
portworx.proc_stats.res
(gauge)
Proc stats for resident memory size
portworx.proc_stats.virt
(gauge)
Proc stats for virtual memory size
portworx.pool_stats.status
(gauge)
Status of this pool (0=Offline, 1=Online, 2=Full, 3=NotFound, 4=Maintenance)
portworx.pool_stats.write_latency_seconds.main
(gauge)
Average time spent per write operation for this pool
Shown as second
portworx.pool_stats.write_throughput.main
(gauge)
Average number of bytes written per second for this pool
portworx.volume.capacity_bytes
(gauge)
Volume stats for capacity bytes
Shown as byte
portworx.volume.currhalevel
(gauge)
Volume stats for the current HA level
portworx.volume.depth_io
(gauge)
Volume stats for io depth
portworx.volume.dev_depth_io
(gauge)
Volume Device stats for io depth
portworx.volume.dev_read_latency_seconds
(gauge)
Volume Device stats for read latency
Shown as second
portworx.volume.dev_readthroughput
(gauge)
Volume Device stats for read throughput
Shown as byte
portworx.volume.dev_write_latency_seconds
(gauge)
Volume Device stats for write latency
Shown as second
portworx.volume.dev_writethroughput
(gauge)
Volume Device stats for write throughput
Shown as byte
portworx.volume.halevel
(gauge)
Volume stats for HA Level
portworx.volume.iopriority
(gauge)
Volume stats for IO Priority
portworx.volume.iops
(gauge)
Volume stats for IOPS
portworx.volume.num_long_flushes
(gauge)
Volume stats for number of long flushes
portworx.volume.num_long_reads
(gauge)
Volume stats for number of long reads
portworx.volume.num_long_writes
(gauge)
Volume stats for number of long writes
portworx.volume.readthroughput
(gauge)
Volume stats for read throughput
Shown as byte
portworx.volume.usage_bytes
(gauge)
Volume stats for used bytes
Shown as byte
portworx.volume.vol_read_latency_seconds
(gauge)
Volume stats for read latency
Shown as second
portworx.volume.vol_write_latency_seconds
(gauge)
Volume stats for write latency
Shown as second
portworx.volume.writethroughput
(gauge)
Volume stats for write throughput
Shown as byte
portworx.volume.written_bytes
(gauge)
Volume stats for written bytes
Shown as byte
portworx.volume.replication_status
(gauge)
Replication status for this volume (0:up, 1:not in quorum, 2:resync state, 3:degraded, 4:detached, 5:restore)
portworx.volume.state
(gauge)
State for this volume
portworx.volume.attached_state
(gauge)
Attached state for this volume (valid only if volume is attached)
portworx.volume.read_iops
(gauge)
Average number of completed read operations per second for this volume
portworx.volume.read_bytes
(gauge)
Number of successfully read bytes during this interval for this volume
Shown as byte
portworx.volume.read_latency_seconds
(gauge)
Average time spent per successfully completed read operation in seconds for this volume
Shown as second
portworx.volume.write_iops
(gauge)
Average number of completed write operations per second for this volume
portworx.volume.write_latency_seconds
(gauge)
Average time spent per successfully completed write operation in seconds for this volume
Shown as second
portworx.volume.discard_ops
(gauge)
Number of discard operations for this volume
portworx.volume.discarded_bytes
(gauge)
Number of discarded bytes for this volume
Shown as byte
portworx.fs.usage_bytes
(gauge)
Filesystem stats for used bytes
Shown as byte
portworx.fs.capacity_bytes
(gauge)
Filesystem stats for total bytes
Shown as byte
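
Several of the gauges above report a numeric code rather than a measurement. The sketch below shows one way to turn those codes into readable labels, for example in a script that post-processes exported values. The code-to-label tables are copied from the descriptions of portworx.pool_stats.status and portworx.volume.replication_status above. The helper names decode and license_expired are hypothetical and not part of the integration.

```python
# Code-to-label tables copied from the metric descriptions above.
POOL_STATUS = {
    0: "Offline",
    1: "Online",
    2: "Full",
    3: "NotFound",
    4: "Maintenance",
}
REPLICATION_STATUS = {
    0: "up",
    1: "not in quorum",
    2: "resync state",
    3: "degraded",
    4: "detached",
    5: "restore",
}


def decode(value, table):
    """Map a gauge value to its label; unknown codes are reported as such."""
    return table.get(int(value), f"unknown ({value})")


def license_expired(days_left):
    """portworx.node_status.licence_expiry: a value below 0 means expired."""
    return days_left < 0
```

For instance, decode(2, POOL_STATUS) gives "Full", and a code that is missing from the table comes back as "unknown (...)" instead of raising an error.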

Events

The Portworx check does not include any events.

Troubleshooting

Agent cannot connect

    portworx
    -------
      - instance #0 [ERROR]: "('Connection aborted.', error(111, 'Connection refused'))"
      - Collected 0 metrics, 0 events & 0 service check

Check that the URL set for prometheus_endpoint in portworx.d/conf.yaml is correct.
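
A "Connection refused" error usually means nothing is listening on the host and port in the configured URL. The sketch below is one way to test that from the Agent host using only the Python standard library. It assumes the URL from the sample configuration; host_port and endpoint_reachable are hypothetical helper names, not part of the Agent or the integration.

```python
import socket
from urllib.parse import urlparse


def host_port(url):
    """Split a URL into (host, port), falling back to the scheme default."""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def endpoint_reachable(url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

If endpoint_reachable("http://localhost:9001/metrics") returns False, check that Portworx is running on that host and that the port in the URL matches the one its metrics endpoint listens on.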
