Wayfinder

Supported OS: Linux, Windows

Integration version: 1.0.0

Overview

Wayfinder is an infrastructure management platform that enables developer self-service through a centralized configuration. This check monitors key Wayfinder management components through the Datadog Agent.

The integration collects key metrics from the Wayfinder API server, controller, and webhook components. These metrics are intended to surface issues in managed workspaces.

Setup

Follow the instructions below to install the integration in the Wayfinder Kubernetes management cluster.

Installation

For containerized environments, the best way to use this integration with the Docker Agent is to build the Agent with the Wayfinder integration installed.

Prerequisites:

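A network policy must be configured to allow the Datadog Agent to connect to Wayfinder components on TCP port 9090. The network policy below assumes that Datadog is deployed to the datadog namespace and Wayfinder is deployed to the wayfinder namespace. Its selectors match on labels, so it also assumes that the datadog namespace carries the label name: datadog and that the Agent pods carry the label app: datadog-agent.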

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: datadog-agent
  namespace: wayfinder
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: datadog
      podSelector:
        matchLabels:
          app: datadog-agent
    ports:
    - port: 9090
      protocol: TCP
  podSelector:
    matchExpressions:
    - key: name
      operator: In
      values:
      - wayfinder-controllers
      - wayfinder-apiserver
      - wayfinder-webhooks
  policyTypes:
  - Ingress
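The policy above selects the Agent's namespace by the label name: datadog. Kubernetes does not set that label on its own (it only adds kubernetes.io/metadata.name), so on clusters where the label is missing the policy would not match. The following is a minimal sketch of one way to add the label and apply the policy. The function name and the manifest file name are hypothetical; only the namespace, label, and policy names come from the manifest above.

```shell
# Sketch: make the datadog namespace match the policy's namespaceSelector,
# then apply the policy. $1 is the path to the saved NetworkPolicy manifest
# (a hypothetical file name chosen by you).
apply_wayfinder_policy() {
  # The NetworkPolicy selects namespaces labeled name=datadog.
  kubectl label namespace datadog name=datadog --overwrite || return 1
  # Create or update the NetworkPolicy from the manifest.
  kubectl apply -f "$1" || return 1
  # Confirm the policy now exists in the wayfinder namespace.
  kubectl get networkpolicy datadog-agent -n wayfinder
}
```

For example, if the manifest was saved as wayfinder-networkpolicy.yaml, run `apply_wayfinder_policy wayfinder-networkpolicy.yaml`.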

To build an updated version of the Agent:

  1. Use the following Dockerfile:

    FROM gcr.io/datadoghq/agent:latest
    
    ARG INTEGRATION_VERSION=1.0.0
    
    RUN agent integration install -r -t datadog-wayfinder==${INTEGRATION_VERSION}
    
  2. Build the image and push it to your private Docker registry.

  3. Upgrade the Datadog Agent container image. If you are using a Helm chart, modify the agents.image section in the values.yaml file to replace the default agent image:

    agents:
      enabled: true
      image:
        tag: <NEW_TAG>
        repository: <YOUR_PRIVATE_REPOSITORY>/<AGENT_NAME>
    
  4. Use the new values.yaml file to upgrade the Agent:

    helm upgrade -f values.yaml <RELEASE_NAME> datadog/datadog
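
Step 2 above can be sketched as a small shell helper. This is an illustration rather than an official procedure: the function name is hypothetical, and the image name and tag are whatever you later put in values.yaml. It assumes the Dockerfile from step 1 is in the current directory.

```shell
# Sketch: build the Agent image with the Wayfinder integration and push it.
# $1 = image repository, e.g. <YOUR_PRIVATE_REPOSITORY>/<AGENT_NAME>
# $2 = image tag, e.g. <NEW_TAG>
# $3 = integration version (optional, defaults to 1.0.0 as in the Dockerfile)
build_and_push_agent() {
  image="$1"
  tag="$2"
  version="${3:-1.0.0}"
  # INTEGRATION_VERSION overrides the ARG declared in the Dockerfile.
  docker build --build-arg INTEGRATION_VERSION="$version" -t "$image:$tag" . || return 1
  # Push to the private registry so the cluster can pull the image.
  docker push "$image:$tag"
}
```

Call it with the same repository and tag that you set under agents.image in step 3, so the Helm upgrade pulls the image you just pushed.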
    

Configuration

  1. To start collecting your Wayfinder data, edit the wayfinder/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory. See the sample wayfinder/conf.yaml for all available configuration options.

  2. Restart the Agent.

Validation

Run the Agent’s status subcommand and look for wayfinder under the Checks section.
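
In a Kubernetes deployment, the status subcommand is typically run inside an Agent pod. The sketch below assumes the Agent runs in the datadog namespace (as in the network policy above). The function name is hypothetical, and the pod name must be looked up in your own cluster.

```shell
# Sketch: run `agent status` inside an Agent pod and show the lines
# around the wayfinder check.
# $1 = name of a Datadog Agent pod
# $2 = namespace (optional, defaults to datadog)
check_wayfinder_status() {
  pod="$1"
  namespace="${2:-datadog}"
  # grep exits non-zero if wayfinder does not appear in the status output.
  kubectl exec -n "$namespace" "$pod" -- agent status | grep -i -A 10 "wayfinder"
}
```

A non-zero exit status means the string wayfinder was not found in the status output, which suggests that the check is not loaded.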

Data Collected

Metrics

wayfinder.controller_runtime.active_workers
(gauge)
Number of currently used workers per controller.
wayfinder.controller_runtime.max_concurrent_reconciles
(gauge)
Maximum number of concurrent reconciles per controller.
wayfinder.controller_runtime.reconcile_errors_total.count
(count)
Total number of reconciliation errors per controller.
wayfinder.controller_runtime.reconcile_time_seconds.bucket
(count)
Bucket of length of time per reconciliation per controller.
wayfinder.controller_runtime.reconcile_time_seconds.count
(count)
Count of length of time per reconciliation per controller.
wayfinder.controller_runtime.reconcile_time_seconds.sum
(count)
Sum of length of time per reconciliation per controller.
wayfinder.controller_runtime.reconcile_total.count
(count)
Total number of reconciliations per controller.
wayfinder.workqueue.adds_total.count
(count)
Total number of adds handled by workqueue.
wayfinder.workqueue.depth
(gauge)
Current depth of workqueue.
wayfinder.workqueue.queue_duration_seconds.bucket
(count)
Bucket of length of time in seconds an item stays in workqueue before being requested.
wayfinder.workqueue.queue_duration_seconds.count
(count)
Count of time in seconds an item stays in workqueue before being requested.
wayfinder.workqueue.queue_duration_seconds.sum
(count)
Sum of time in seconds an item stays in workqueue before being requested.
wayfinder.workqueue.retries.count
(count)
Total number of retries handled by workqueue.
wayfinder.workqueue.unfinished_work_seconds
(gauge)
Seconds of work that is in progress and has not yet been observed by work_duration. Large values indicate stuck threads. The number of stuck threads can be deduced from the rate at which this value increases.
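
As a worked example of the last point: each stuck thread adds roughly one second of unfinished work per second, so the increase in wayfinder.workqueue.unfinished_work_seconds divided by the elapsed time approximates the number of stuck threads. The sketch below uses made-up sample values and a hypothetical function name, and assumes no work items start or finish between the two samples.

```shell
# Sketch: estimate stuck threads from two samples of
# wayfinder.workqueue.unfinished_work_seconds.
# $1 = first sample, $2 = second sample, $3 = seconds between the samples
estimate_stuck_workers() {
  awk -v a="$1" -v b="$2" -v dt="$3" 'BEGIN { printf "%.1f\n", (b - a) / dt }'
}

# Illustrative values: the gauge rose from 120 to 210 over 30 seconds.
estimate_stuck_workers 120 210 30   # (210 - 120) / 30, prints 3.0
```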

Service Checks

Wayfinder does not include any service checks.

Events

Wayfinder does not include any events.

Troubleshooting

Need help? Contact Datadog support.
