Linkerd is a light but powerful open-source service mesh with CNCF graduated status. It provides the tools you need to write secure, reliable, observable cloud-native applications. With minimal configuration and no application changes, Linkerd:
This integration sends your Linkerd metrics to Datadog, including application success rates, latency, and saturation.
This OpenMetrics-based integration has a latest mode (enabled by setting `openmetrics_endpoint` to point to the target endpoint) and a legacy mode (enabled by setting `prometheus_url` instead). To get all the most up-to-date features, Datadog recommends enabling the latest mode. For more information, see Latest and Legacy Versioning For OpenMetrics-based Integrations.
Metrics marked as `[OpenMetrics V1]` or `[OpenMetrics V2]` are only available using the corresponding mode of the Linkerd integration. Metrics not marked are collected by all modes.
The Linkerd check is included in the Datadog Agent package, so you don’t need to install anything else on your server.
To configure this check for an Agent running on a host:
Edit the `linkerd.d/conf.yaml` file, in the `conf.d/` folder at the root of your Agent’s configuration directory. See the sample `linkerd.d/conf.yaml` for all available configuration options using the latest OpenMetrics check example. If you previously implemented this integration, see the legacy example.
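As a minimal sketch of the host configuration described above, a latest-mode `linkerd.d/conf.yaml` might look like the following. The `localhost` address is an assumption, and the port `4191` and path `/metrics` are borrowed from one of the Autodiscovery examples on this page; substitute the endpoint your Linkerd instance actually exposes.

```yaml
# linkerd.d/conf.yaml (sketch; the endpoint value below is an assumption)
init_config:

instances:
    # Latest mode: point openmetrics_endpoint at the Linkerd metrics endpoint.
  - openmetrics_endpoint: http://localhost:4191/metrics
    # Legacy mode would set prometheus_url instead of openmetrics_endpoint:
    # prometheus_url: http://localhost:4191/metrics
```

After editing the file, restart the Agent so that the change takes effect.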
For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.
For Linkerd v1:

Parameter | Value |
---|---|
<INTEGRATION_NAME> | linkerd |
<INIT_CONFIG> | blank or {} |
<INSTANCE_CONFIG> | {"openmetrics_endpoint": "http://%%host%%:9990/admin/metrics/prometheus"} |
Note: This is a new default OpenMetrics check example. If you previously implemented this integration, see the legacy example.
For Linkerd v2:

Parameter | Value |
---|---|
<INTEGRATION_NAME> | linkerd |
<INIT_CONFIG> | blank or {} |
<INSTANCE_CONFIG> | {"openmetrics_endpoint": "http://%%host%%:4191/metrics"} |
Note: This is a new default OpenMetrics check example. If you previously implemented this integration, see the legacy example.
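On Kubernetes, one way to apply these parameters is through Autodiscovery pod annotations. The fragment below is a sketch that uses the values from the second table above (port `4191`). The container name `linkerd-proxy` is an assumption; replace it with the name of the container that exposes the metrics endpoint in your pods.

```yaml
# Pod metadata fragment (sketch). "linkerd-proxy" is an assumed container name.
metadata:
  annotations:
    # <INTEGRATION_NAME>
    ad.datadoghq.com/linkerd-proxy.check_names: '["linkerd"]'
    # <INIT_CONFIG>
    ad.datadoghq.com/linkerd-proxy.init_configs: '[{}]'
    # <INSTANCE_CONFIG>; %%host%% is resolved by Autodiscovery at runtime.
    ad.datadoghq.com/linkerd-proxy.instances: |
      [{"openmetrics_endpoint": "http://%%host%%:4191/metrics"}]
```

For the endpoint in the first table, the same annotations would apply with the port `9990` and path `/admin/metrics/prometheus` substituted in the instance configuration.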
Collecting logs is disabled by default in the Datadog Agent. To enable it, see Kubernetes log collection.
Parameter | Value |
---|---|
<LOG_CONFIG> | {"source": "linkerd", "service": "<SERVICE_NAME>"} |
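The log configuration above can likewise be supplied as a pod annotation once log collection is enabled. This is a sketch only: the container name `linkerd-proxy` is an assumption, and `<SERVICE_NAME>` is the placeholder from the table, to be replaced with your own service name.

```yaml
# Pod metadata fragment (sketch). "linkerd-proxy" is an assumed container name.
metadata:
  annotations:
    # <LOG_CONFIG>: tags the container's logs with the linkerd source.
    ad.datadoghq.com/linkerd-proxy.logs: '[{"source": "linkerd", "service": "<SERVICE_NAME>"}]'
```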
To increase the verbosity of the data plane logs, see Modifying the Proxy Log Level.
Run the Agent’s status subcommand and look for `linkerd` under the Checks section.
linkerd.control.request.count (count) | [OpenMetrics V2] Total count of control HTTP requests. Shown as request |
linkerd.control.request_total (count) | [OpenMetrics V1] Total count of control HTTP requests. Shown as request |
linkerd.control.response.count (count) | [OpenMetrics V2] Total count of control HTTP responses. Shown as response |
linkerd.control.response_latency.count (gauge) | Number of control responses on which the linkerd.control.response_latency.sum is evaluated. Shown as response |
linkerd.control.response_latency.sum (gauge) | Elapsed times between a control request’s headers being received and its response stream completing. Shown as millisecond |
linkerd.control.response_total (count) | [OpenMetrics V1] Total count of control HTTP responses. Shown as response |
linkerd.control.retry_skipped.count (count) | [OpenMetrics V2] Total count of retryable control HTTP responses that were not retried. Shown as response |
linkerd.control.retry_skipped_total (count) | [OpenMetrics V1] Total count of retryable control HTTP responses that were not retried. Shown as response |
linkerd.jvm.fd_count (gauge) | (only available on Unix-based OS) A gauge of the number of open file descriptors (Linkerd v1 only). Shown as unit |
linkerd.jvm.gc.ConcurrentMarkSweep.cycles (gauge) | A gauge for ConcurrentMarkSweep of the total number of collections that have occurred (Linkerd v1 only). Shown as unit |
linkerd.jvm.gc.ConcurrentMarkSweep.msec (gauge) | A gauge for ConcurrentMarkSweep of the total elapsed time the garbage collection pool spent doing collections, in milliseconds (Linkerd v1 only). Shown as millisecond |
linkerd.jvm.gc.ParNew.cycles (gauge) | A gauge for ParNew of the total number of collections that have occurred (Linkerd v1 only). Shown as unit |
linkerd.jvm.gc.ParNew.msec (gauge) | A gauge for ParNew of the total elapsed time the garbage collection pool spent doing collections, in milliseconds (Linkerd v1 only). Shown as millisecond |
linkerd.jvm.gc.cycles (gauge) | A gauge of the total number of collections that have occurred (Linkerd v1 only). Shown as unit |
linkerd.jvm.gc.eden.pause_msec.quantile (gauge) | Stats of the durations, in milliseconds, of the eden collection pauses (Linkerd v1 only). Shown as millisecond |
linkerd.jvm.gc.msec (gauge) | A gauge of the total elapsed time doing collections, in milliseconds (Linkerd v1 only). Shown as millisecond |
linkerd.jvm.heap.committed (gauge) | For the heap used for object allocation, a gauge of the amount of memory, in bytes, committed for the JVM to use (Linkerd v1 only). Shown as byte |
linkerd.jvm.heap.max (gauge) | For the heap used for object allocation, a gauge of the maximum amount of memory, in bytes, that can be used by the JVM (Linkerd v1 only). Shown as byte |
linkerd.jvm.heap.used (gauge) | For the heap used for object allocation, a gauge of the current amount of memory used, in bytes (Linkerd v1 only). Shown as byte |
linkerd.jvm.mem.current.CMS_Old_Gen.used (gauge) | A gauge of the current memory used, in bytes, for the CMS_Old_Gen memory pool (Linkerd v1 only). Shown as byte |
linkerd.jvm.mem.current.Par_Eden_Space.used (gauge) | A gauge of the current memory used, in bytes, for the Par_Eden_Space memory pool (Linkerd v1 only). Shown as byte |
linkerd.jvm.mem.current.Par_Survivor_Space.used (gauge) | A gauge of the current memory used, in bytes, for the Par_Survivor_Space memory pool (Linkerd v1 only). Shown as byte |
linkerd.jvm.nonheap.committed (gauge) | For the non-heap memory, a gauge of the amount of memory, in bytes, committed for the JVM to use (Linkerd v1 only). Shown as byte |
linkerd.jvm.nonheap.max (gauge) | For the non-heap memory, a gauge of the maximum amount of memory, in bytes, that can be used by the JVM (Linkerd v1 only). Shown as byte |
linkerd.jvm.nonheap.used (gauge) | For the non-heap memory, a gauge of the current amount of memory used, in bytes (Linkerd v1 only). Shown as byte |
linkerd.jvm.num_cpus (gauge) | A gauge of the number of processors available to the JVM (Linkerd v1 only). Shown as core |
linkerd.jvm.start_time (gauge) | A gauge of the start time of the Java virtual machine in milliseconds since the epoch (Linkerd v1 only). Shown as millisecond |
linkerd.jvm.thread.count (gauge) | A gauge of the number of live threads including both daemon and non-daemon threads (Linkerd v1 only). Shown as thread |
linkerd.jvm.uptime (gauge) | A gauge of the uptime of the Java virtual machine in milliseconds (Linkerd v1 only). Shown as millisecond |
linkerd.openmetrics.health (gauge) | [OpenMetrics V2] Whether the check is able to connect to the metrics endpoint. |
linkerd.process.cpu_seconds.count (count) | [OpenMetrics V2] Total user and system CPU time spent in seconds. Shown as second |
linkerd.process.cpu_seconds_total (count) | [OpenMetrics V1] Total user and system CPU time spent in seconds. Shown as second |
linkerd.process.max_fds (gauge) | Maximum number of open file descriptors. Shown as file |
linkerd.process.open_fds (gauge) | Number of open file descriptors. Shown as file |
linkerd.process.resident_memory (gauge) | Resident memory size in bytes. Shown as byte |
linkerd.process.start_time (gauge) | Time that the process started (in seconds since the UNIX epoch). Shown as second |
linkerd.process.virtual_memory (gauge) | Virtual memory size in bytes. Shown as byte |
linkerd.prometheus.health (gauge) | Whether the check is able to connect to the metrics endpoint. |
linkerd.request.count (count) | [OpenMetrics V2] Total count of HTTP requests. Shown as request |
linkerd.request_total (count) | [OpenMetrics V1] Total count of HTTP requests. Shown as request |
linkerd.response.count (count) | [OpenMetrics V2] Total count of HTTP responses. Shown as response |
linkerd.response_latency.count (gauge) | Number of responses on which the linkerd.response_latency.sum metric is evaluated. Shown as response |
linkerd.response_latency.sum (gauge) | Elapsed times between a request’s headers being received and its response stream completing. Shown as millisecond |
linkerd.response_total (count) | [OpenMetrics V1] Total count of HTTP responses. Shown as response |
linkerd.retry_skipped.count (count) | [OpenMetrics V2] Total count of retryable HTTP responses that were not retried. Shown as response |
linkerd.retry_skipped_total (count) | [OpenMetrics V1] Total count of retryable HTTP responses that were not retried. Shown as response |
linkerd.route.actual_request.count (count) | [OpenMetrics V2] Total count of actual route HTTP requests. Shown as request |
linkerd.route.actual_request_total (count) | [OpenMetrics V1] Total count of actual route HTTP requests. Shown as request |
linkerd.route.actual_response.count (count) | [OpenMetrics V2] Total count of actual route HTTP responses. Shown as response |
linkerd.route.actual_response_latency.count (gauge) | Number of responses on which the linkerd.route.actual_response_latency.sum is evaluated. Shown as millisecond |
linkerd.route.actual_response_latency.sum (gauge) | Elapsed times between an actual route request’s headers being received and its response stream completing. Shown as millisecond |
linkerd.route.actual_response_total (count) | [OpenMetrics V1] Total count of actual route HTTP responses. Shown as response |
linkerd.route.actual_retry_skipped.count (count) | [OpenMetrics V2] Total count of retryable actual route HTTP responses that were not retried. Shown as response |
linkerd.route.actual_retry_skipped_total (count) | [OpenMetrics V1] Total count of retryable actual route HTTP responses that were not retried. Shown as response |
linkerd.route.request.count (count) | [OpenMetrics V2] Total count of route HTTP requests. Shown as request |
linkerd.route.request_total (count) | [OpenMetrics V1] Total count of route HTTP requests. Shown as request |
linkerd.route.response.count (count) | [OpenMetrics V2] Total count of route HTTP responses. Shown as response |
linkerd.route.response_latency.count (gauge) | Number of responses on which the linkerd.route.response_latency.sum metric is evaluated. Shown as response |
linkerd.route.response_latency.sum (gauge) | Elapsed times between a route request’s headers being received and its response stream completing. Shown as millisecond |
linkerd.route.response_total (count) | [OpenMetrics V1] Total count of route HTTP responses. Shown as response |
linkerd.route.retry_skipped.count (count) | [OpenMetrics V2] Total count of retryable route HTTP responses that were not retried. Shown as response |
linkerd.route.retry_skipped_total (count) | [OpenMetrics V1] Total count of retryable route HTTP responses that were not retried. Shown as response |
linkerd.rt.client.connections (rate) | Number of active connections for the client (Linkerd v1 only). Shown as connection |
linkerd.rt.client.connects_s (rate) | Number of connections per second for the client (Linkerd v1 only). Shown as connection |
linkerd.rt.client.pool_cached (gauge) | A gauge of the number of connections cached for the client (Linkerd v1 only). Shown as connection |
linkerd.rt.client.pool_num_too_many_waiters (gauge) | A counter of the number of times there were no connections immediately available and there were already too many waiters (Linkerd v1 only). Shown as unit |
linkerd.rt.client.pool_num_waited (gauge) | A counter of the number of times there were no connections immediately available and the client waited for a connection (Linkerd v1 only). Shown as unit |
linkerd.rt.client.pool_size (gauge) | A gauge of the number of connections that are currently alive, either in use or not (Linkerd v1 only). Shown as connection |
linkerd.rt.client.pool_waiters (gauge) | A gauge of the number of clients waiting on connections (Linkerd v1 only). Shown as unit |
linkerd.rt.client.request_latency_ms.quantile (gauge) | Stats of the latency of requests in milliseconds for the client (Linkerd v1 only). Shown as millisecond |
linkerd.rt.client.requests_s (rate) | Number of requests per second received by the client (Linkerd v1 only). |
linkerd.rt.client.status.1XX_s (rate) | Number of requests per second returning a 1XX status code for the client (Linkerd v1 only). Shown as unit |
linkerd.rt.client.status.2XX_s (rate) | Number of requests per second returning a 2XX status code for the client (Linkerd v1 only). Shown as unit |
linkerd.rt.client.status.3XX_s (rate) | Number of requests per second returning a 3XX status code for the client (Linkerd v1 only). Shown as unit |
linkerd.rt.client.status.4XX_s (rate) | Number of requests per second returning a 4XX status code for the client (Linkerd v1 only). Shown as unit |
linkerd.rt.client.status.5XX_s (rate) | Number of requests per second returning a 5XX status code for the client (Linkerd v1 only). Shown as unit |
linkerd.rt.client.success_s (rate) | Number of successes per second for the client (Linkerd v1 only). |
linkerd.rt.server.connections (gauge) | Number of active connections for the server (Linkerd v1 only). Shown as connection |
linkerd.rt.server.connects_s (rate) | Number of connections per second for the server (Linkerd v1 only). Shown as connection |
linkerd.rt.server.request_latency_ms.quantile (gauge) | Stats of the latency of requests in milliseconds for the server (Linkerd v1 only). Shown as millisecond |
linkerd.tcp.close.count (count) | [OpenMetrics V2] Total count of closed connections. Shown as connection |
linkerd.tcp.close_total (count) | [OpenMetrics V1] Total count of closed connections. Shown as connection |
linkerd.tcp.connection_duration.count (gauge) | Number of connections on which the linkerd.tcp.connection_duration.sum metric is evaluated. Shown as connection |
linkerd.tcp.connection_duration.sum (gauge) | Connection lifetimes. Shown as millisecond |
linkerd.tcp.open.count (count) | [OpenMetrics V2] Total count of opened connections. Shown as connection |
linkerd.tcp.open_connections (gauge) | Number of currently-open connections. Shown as connection |
linkerd.tcp.open_total (count) | [OpenMetrics V1] Total count of opened connections. Shown as connection |
linkerd.tcp.read_bytes.count (count) | [OpenMetrics V2] Total count of bytes read from peers. Shown as byte |
linkerd.tcp.read_bytes_total (count) | [OpenMetrics V1] Total count of bytes read from peers. Shown as byte |
linkerd.tcp.write_bytes.count (count) | [OpenMetrics V2] Total count of bytes written to peers. Shown as byte |
linkerd.tcp.write_bytes_total (count) | [OpenMetrics V1] Total count of bytes written to peers. Shown as byte |
For Linkerd v1, see the finagle metrics guide for metric descriptions and this gist for an example of metrics exposed by Linkerd.
Linkerd is a Prometheus-based integration. Depending on your Linkerd configuration, some metrics might not be exposed by Linkerd. If a metric is not present in the output of the cURL command below, the Datadog Agent cannot collect it.
To list the metrics exposed by your current configuration, run:
curl <linkerd_prometheus_endpoint>
Where `linkerd_prometheus_endpoint` is the Linkerd Prometheus endpoint. Use the same value as the `prometheus_url` config key in your `linkerd.yaml` (or the `openmetrics_endpoint` key if you use the latest mode).
If you need to use a metric that is not provided by default, you can add an entry to `linkerd.yaml`.
For more information, see the examples in the default configuration.
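As a hedged illustration of adding such an entry in the latest mode, the sketch below uses the `extra_metrics` option of OpenMetrics-based checks. The endpoint value and both metric names are hypothetical placeholders, not metrics that Linkerd is known to expose; confirm the option and its syntax against the sample configuration file before relying on it.

```yaml
# linkerd.yaml (sketch; endpoint and metric names are hypothetical)
init_config:

instances:
  - openmetrics_endpoint: http://localhost:4191/metrics
    # extra_metrics adds metrics on top of the ones collected by default.
    extra_metrics:
        # <name as exposed by Linkerd>: <name to report in Datadog>
      - example_raw_metric: example.metric
```

Only metrics that appear in the cURL output of the endpoint can be collected this way.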
linkerd.prometheus.health
Returns `CRITICAL` if the Agent fails to connect to the Prometheus endpoint, otherwise `OK`.
Statuses: ok, critical
Need help? Contact Datadog support.