Data Streams Monitoring instruments Kafka clients (consumers/producers). If you can instrument your client infrastructure, you can use Data Streams Monitoring.
For installation instructions and lists of supported technologies, see the setup page for your language.
Data Streams Monitoring provides an out-of-the-box topology map, so that you can visualize data flow across your pipelines and identify producer/consumer services, queue dependencies, service ownership, and key health metrics.
Depending on how events traverse your system, different paths can lead to increased latency. With the Measure tab, you can select a start service and an end service to view end-to-end latency between them, identify bottlenecks, and optimize performance. You can also create a monitor for that pathway or export it to a dashboard.
Alternatively, click a service to open a detailed side panel and view the Pathways tab for latency between the service and upstream services.
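To make the idea of pathway latency concrete, the following is a minimal sketch and not how Data Streams Monitoring computes its metrics. It assumes you already have a per-hop latency figure for each edge on a pathway; the service names, queue names, and values are hypothetical. The sketch sums the hops to get end-to-end latency and reports the slowest hop as the likely bottleneck.

```python
# Hypothetical per-hop latencies in milliseconds (illustrative values only,
# not data exported by Data Streams Monitoring).
HOP_LATENCY_MS = {
    ("checkout-service", "orders-topic"): 40,
    ("orders-topic", "fraud-service"): 900,
    ("fraud-service", "payments-queue"): 60,
}


def pathway_latency(path, hop_latency_ms):
    """Return (total latency, slowest hop) for an ordered list of nodes."""
    # Pair each node with its successor to get the hops along the pathway.
    hops = list(zip(path, path[1:]))
    total = sum(hop_latency_ms[hop] for hop in hops)
    slowest = max(hops, key=lambda hop: hop_latency_ms[hop])
    return total, slowest


PATH = ["checkout-service", "orders-topic", "fraud-service", "payments-queue"]
total_ms, slowest_hop = pathway_latency(PATH, HOP_LATENCY_MS)
print(f"end-to-end: {total_ms} ms, slowest hop: {slowest_hop[0]} -> {slowest_hop[1]}")
```

In this made-up pathway, most of the end-to-end latency comes from a single hop, which is the kind of pattern the start-to-end view is meant to surface.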
Slowdowns caused by high consumer lag or stale messages can lead to cascading failures and increase downtime. With out-of-the-box alerts, you can pinpoint where bottlenecks occur in your pipelines and respond to them right away. For supplementary metrics, Datadog provides additional integrations for message queue technologies like Kafka and SQS.
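As background on the consumer lag metric mentioned above: for a Kafka partition, offset lag is commonly defined as the latest offset in the log minus the offset the consumer group has committed. The sketch below illustrates that arithmetic in plain Python with hypothetical offsets. It does not call Kafka or Datadog, and treating a partition with no committed offset as starting from 0 is an assumption made for the example.

```python
def consumer_lag(latest_offsets, committed_offsets):
    """Compute per-partition offset lag for one consumer group."""
    lag = {}
    for partition, latest in latest_offsets.items():
        # Assumption: a partition with no committed offset counts from 0.
        committed = committed_offsets.get(partition, 0)
        # Clamp at zero in case the two readings were taken at different times.
        lag[partition] = max(latest - committed, 0)
    return lag


# Hypothetical offsets keyed by partition number.
latest = {0: 1500, 1: 980, 2: 2040}
committed = {0: 1500, 1: 930, 2: 1240}

lag = consumer_lag(latest, committed)
worst = max(lag, key=lag.get)
print(f"lag by partition: {lag}, most delayed partition: {worst}")
```

A partition whose lag keeps growing while the others stay flat usually points to a slow or stuck consumer for that partition rather than a cluster-wide slowdown.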
Through Data Streams Monitoring’s out-of-the-box monitor templates, you can set up monitors on metrics like consumer lag, throughput, and latency in one click.
Click 'Add Monitors and Synthetic Tests' to view monitor templates.
High lag on a consuming service, increased resource use on a Kafka broker, and increased RabbitMQ or Amazon SQS queue size are frequently explained by changes in the way adjacent services are producing to or consuming from these entities.
Click on the Throughput tab on any service or queue in Data Streams Monitoring to quickly detect changes in throughput, and which upstream or downstream service these changes originate from. Once the Software Catalog is configured, you can immediately pivot to the corresponding team’s Slack channel or on-call engineer.
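The reasoning behind attributing a throughput change to an adjacent service can be sketched as a comparison of per-service rates across two time windows. The example below is only an illustration under stated assumptions: the service names and message rates are hypothetical, and the function is not part of any Datadog API.

```python
def throughput_changes(before, after):
    """Rank services by absolute change in throughput between two windows."""
    services = set(before) | set(after)
    # A service missing from one window is treated as producing 0 msg/s there.
    deltas = {s: after.get(s, 0) - before.get(s, 0) for s in services}
    # Largest absolute change first; service name breaks ties deterministically.
    return sorted(deltas.items(), key=lambda kv: (-abs(kv[1]), kv[0]))


# Hypothetical messages per second produced to one topic, by upstream service.
before = {"checkout-service": 120, "returns-service": 30, "batch-importer": 5}
after = {"checkout-service": 125, "returns-service": 28, "batch-importer": 410}

ranked = throughput_changes(before, after)
for service, delta in ranked:
    print(f"{service}: {delta:+d} msg/s")
```

Here the spike on the topic traces back to one upstream producer, which is the service whose owning team you would contact first.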
By filtering to a single Kafka, RabbitMQ, or Amazon SQS cluster, you can detect changes in incoming or outgoing traffic for all detected topics or queues running on that cluster.
Datadog automatically links the infrastructure powering your services and related logs through Unified Service Tagging, so you can quickly localize bottlenecks. Click the Infra, Logs, or Traces tabs to troubleshoot why pathway latency or consumer lag has increased.
Datadog can automatically detect your managed Confluent Cloud connectors and visualize them in the Data Streams Monitoring topology map. Install and configure the Confluent Cloud integration to collect information from your Confluent Cloud connectors, including throughput, status, and topic dependencies.