Error Tracking for Backend Services

Overview

Figure: The details of an issue in the Error Tracking Explorer

It is critical for your system’s health to consistently monitor the errors collected by Datadog. When there are many individual error events, it becomes hard to prioritize errors for troubleshooting.

Error Tracking simplifies debugging by grouping thousands of similar errors into a single issue. An issue is an aggregation of error data that provides insights such as:

  • How many users have been impacted
  • When the error first occurred
  • Which commit probably caused the error

Error Tracking enables you to:

  • Track, triage, and debug fatal errors
  • Group similar errors into issues, so that you can more easily identify important errors and reduce noise
  • Set monitors on error tracking events, such as high error volume or new issues
  • Follow issues over time to know when they first started, if they are still ongoing, and how often they occur

Setup

Error Tracking is available for all the languages supported by APM and does not require using a different SDK.

Optionally, to see code snippets in your stack traces, set up the GitHub integration.

Figure: An inline code snippet in a stack trace

To get started with configuring your repository, see the Source Code Integration documentation.

Use span tags to track error spans

The Datadog tracers collect errors through integrations and the manual instrumentation of your backend services’ source code. Error spans within a trace are processed by Error Tracking if the error is located in a service entry span (the uppermost service span). This span must also contain the error.stack, error.message, and error.type span tags to be tracked.
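As a rough illustration of the three tags named above, the following sketch derives them from a caught Python exception and checks the two conditions described in this section. It uses only the standard library. The helper names (`error_tags_from_exception`, `is_trackable`) and the dictionary-based span shape with an `is_service_entry` key are assumptions made for this example, not part of any Datadog tracer API. How the tags are attached to a real span depends on your tracer.

```python
import traceback
from typing import Dict

# The three span tags the text says a span must carry to be tracked.
REQUIRED_ERROR_TAGS = ("error.type", "error.message", "error.stack")


def error_tags_from_exception(exc: BaseException) -> Dict[str, str]:
    """Build the three error span tags from a caught exception (illustrative helper)."""
    return {
        "error.type": type(exc).__name__,
        "error.message": str(exc),
        "error.stack": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }


def is_trackable(span: dict) -> bool:
    """Hypothetical check: service entry span AND all three error tags are non-empty.

    The `is_service_entry` and `tags` keys are an assumed span shape for this sketch.
    """
    tags = span.get("tags", {})
    has_all_tags = all(tags.get(name) for name in REQUIRED_ERROR_TAGS)
    return bool(span.get("is_service_entry")) and has_all_tags


# Usage: capture an error and build the tags for the enclosing span.
try:
    int("not a number")
except ValueError as exc:
    example_tags = error_tags_from_exception(exc)

entry_span = {"is_service_entry": True, "tags": example_tags}
child_span = {"is_service_entry": False, "tags": example_tags}
```

In this sketch, `is_trackable(entry_span)` is true, while `is_trackable(child_span)` is false because the error is not on the service entry span.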

Figure: Flame graph with errors

Error Tracking computes a fingerprint for each error span it processes using the error type, the error message, and the frames that form the stack trace. Errors with the same fingerprint are grouped together and belong to the same issue. For more information, see the Trace Explorer documentation.
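The grouping idea can be sketched in a few lines. The example below hashes the error type, the error message, and the stack frames into a fingerprint and groups errors that share one. It is only an illustration of the concept: the exact algorithm Error Tracking uses is not described here, and the function names, the SHA-256 hash, and the dictionary shape of each error are assumptions made for this example.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Sequence


def fingerprint(error_type: str, message: str, frames: Sequence[str]) -> str:
    """Hash the error type, message, and stack frames into a short identifier.

    Illustrative only; the real fingerprinting algorithm may normalize its inputs.
    """
    digest = hashlib.sha256()
    for part in (error_type, message, *frames):
        digest.update(part.encode("utf-8"))
        digest.update(b"\x00")  # separator so ("ab", "c") differs from ("a", "bc")
    return digest.hexdigest()[:16]


def group_into_issues(errors: List[dict]) -> Dict[str, List[dict]]:
    """Group errors with the same fingerprint into one issue (assumed error shape)."""
    issues = defaultdict(list)
    for err in errors:
        key = fingerprint(err["type"], err["message"], err["frames"])
        issues[key].append(err)
    return dict(issues)


# Usage: two identical errors collapse into one issue; a different one stays separate.
frames = ["app.py:handle_request", "db.py:fetch_user"]
errors = [
    {"type": "KeyError", "message": "'user_id'", "frames": frames},
    {"type": "KeyError", "message": "'user_id'", "frames": frames},
    {"type": "TimeoutError", "message": "query timed out", "frames": frames},
]
issues = group_into_issues(errors)
```

Exact matching on the message, as done here, would split errors whose messages contain variable data such as IDs, so a production fingerprint would likely need some normalization step.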

Examine issues to start troubleshooting or debugging

Error Tracking automatically groups the errors collected from your backend services into issues, which you can examine in the Error Tracking Explorer. See the Error Tracking Explorer documentation for a tour of key features.

Issues created from APM include the distribution of impacted spans, the latest and most relevant stack trace, span tags, host tags, container tags, and metrics.
