Overview
This check monitors Boundary through the Datadog Agent. The minimum supported version of Boundary is 0.8.0.
Setup
Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.
Installation
The Boundary check is included in the Datadog Agent package.
No additional installation is needed on your server.
Configuration
Listener
A listener with an ops purpose must be set up in the config.hcl file to enable metrics collection. Here’s an example listener stanza:
controller {
  name = "boundary-controller"
  database {
    url = "postgresql://<username>:<password>@10.0.0.1:5432/<database_name>"
  }
}
listener "tcp" {
  purpose = "api"
  tls_disable = true
}
listener "tcp" {
  purpose = "ops"
  tls_disable = true
}
The boundary.controller.health service check submits as WARNING when the controller is shutting down. To enable this shutdown grace period, update the controller block with a defined wait duration:
controller {
  name = "boundary-controller"
  database {
    url = "env://BOUNDARY_PG_URL"
  }
  graceful_shutdown_wait_duration = "10s"
}
Datadog Agent
Edit the boundary.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to start collecting your Boundary performance data. See the sample boundary.d/conf.yaml for all available configuration options.
Restart the Agent.
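As an illustration of the first step, a minimal boundary.d/conf.yaml might look like the sketch below. It assumes the ops listener is reachable at localhost:9203 (Boundary's default ops port) and that the check reads the health_endpoint and openmetrics_endpoint options; confirm both option names and the URLs against the sample boundary.d/conf.yaml shipped with your Agent version.

```yaml
init_config:

instances:
    # Assumed address of the ops listener; replace host and port with your own.
  - health_endpoint: http://localhost:9203/health
    # Assumed metrics path served by the same ops listener.
    openmetrics_endpoint: http://localhost:9203/metrics
```

Both endpoints are served by the listener configured with the ops purpose, so they normally share one host and port.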
Validation
Run the Agent’s status subcommand and look for boundary under the Checks section.
Data Collected
Metrics
Metric (type) | Description | Unit
boundary.cluster.client.grpc.request_duration_seconds.bucket (count) | Histogram of latencies for gRPC requests between the cluster and any of its clients. | second
boundary.cluster.client.grpc.request_duration_seconds.count (count) | Histogram of latencies for gRPC requests between the cluster and any of its clients. | second
boundary.cluster.client.grpc.request_duration_seconds.sum (count) | Histogram of latencies for gRPC requests between the cluster and any of its clients. | second
boundary.controller.api.http.request_duration_seconds.bucket (count) | Histogram of latencies for HTTP requests. | second
boundary.controller.api.http.request_duration_seconds.count (count) | Histogram of latencies for HTTP requests. | second
boundary.controller.api.http.request_duration_seconds.sum (count) | Histogram of latencies for HTTP requests. | second
boundary.controller.api.http.request_size_bytes.bucket (count) | Histogram of request sizes for HTTP requests. | byte
boundary.controller.api.http.request_size_bytes.count (count) | Histogram of request sizes for HTTP requests. | byte
boundary.controller.api.http.request_size_bytes.sum (count) | Histogram of request sizes for HTTP requests. | byte
boundary.controller.api.http.response_size_bytes.bucket (count) | Histogram of response sizes for HTTP responses. | byte
boundary.controller.api.http.response_size_bytes.count (count) | Histogram of response sizes for HTTP responses. | byte
boundary.controller.api.http.response_size_bytes.sum (count) | Histogram of response sizes for HTTP responses. | byte
boundary.controller.cluster.grpc.request_duration_seconds.bucket (count) | Histogram of latencies for gRPC requests. | second
boundary.controller.cluster.grpc.request_duration_seconds.count (count) | Histogram of latencies for gRPC requests. | second
boundary.controller.cluster.grpc.request_duration_seconds.sum (count) | Histogram of latencies for gRPC requests. | second
boundary.worker.proxy.http.write_header_duration_seconds.bucket (count) | Histogram of time elapsed after the TLS connection is established to when the first http header is written back from the server. | second
boundary.worker.proxy.http.write_header_duration_seconds.count (count) | Histogram of time elapsed after the TLS connection is established to when the first http header is written back from the server. | second
boundary.worker.proxy.http.write_header_duration_seconds.sum (count) | Histogram of time elapsed after the TLS connection is established to when the first http header is written back from the server. | second
boundary.worker.proxy.websocket.active_connections (gauge) | Count of open websocket proxy connections (to Boundary workers). | connection
boundary.worker.proxy.websocket.received_bytes.count (count) | Count of received bytes for Worker proxy websocket connections. | byte
boundary.worker.proxy.websocket.sent_bytes.count (count) | Count of sent bytes for Worker proxy websocket connections. | byte
Events
The Boundary integration does not include any events.
Service Checks
boundary.openmetrics.health
Returns CRITICAL if the Agent is unable to connect to the OpenMetrics endpoint, otherwise returns OK.
Statuses: ok, critical
boundary.controller.health
Returns CRITICAL if the Agent is unable to connect to the controller’s health endpoint, WARNING if the controller received a shutdown signal, otherwise returns OK.
Statuses: ok, warning, critical
Log collection
Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:
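The setting in question is the Agent-wide logs_enabled flag. A minimal sketch of the relevant line in datadog.yaml, assuming the rest of the file is left unchanged:

```yaml
# datadog.yaml: turn on log collection for the whole Agent
logs_enabled: true
```

Restart the Agent after changing datadog.yaml so the new value takes effect.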
To start collecting your Boundary logs, add this configuration block to your boundary.d/conf.yaml file:
logs:
  - type: file
    source: boundary
    path: /var/log/boundary/events.ndjson
Change the path parameter value based on your environment. See the sample boundary.d/conf.yaml file for all available configuration options.
Troubleshooting
Need help? Contact Datadog support.