Once CI Visibility is enabled for your organization, you can create a CI Pipeline or CI Test monitor.
CI monitors allow you to visualize CI data and set up alerts on it. For example, create a CI Pipeline monitor to receive alerts on a failed pipeline or a job. Create a CI Test monitor to receive alerts on failed or slow tests.
To create a CI monitor in Datadog, use the main navigation: Monitors > New Monitor > CI.
Choose between a Pipelines or a Tests monitor:
For a Pipelines monitor, the query can evaluate:
- A facet: the monitor alerts over the `Unique value count` of the facet.
- A measure: the monitor alerts over an aggregation of the measure (`min`, `avg`, `sum`, `median`, `pc75`, `pc90`, `pc95`, `pc98`, `pc99`, or `max`).

When the query has a `group by`, multi alerts apply the alert to each source according to your group parameters. An alerting event is generated for each group that meets the set conditions. For example, you could group a query by `@ci.pipeline.name` to receive a separate alert for each CI Pipeline name when the number of errors is high.
You can create CI Pipeline monitors using formulas and functions. This can be used to create monitors on the rate of an event happening, such as the rate of a pipeline failing (error rate).
For example, a pipeline error rate monitor can use a formula that calculates the ratio of “number of failed pipeline events” (`ci.status=error`) over “number of total pipeline events” (no filter), grouped by `ci.pipeline.name` (to be alerted once per pipeline). To learn more, see the Functions Overview.
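The error rate formula described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how Datadog evaluates monitors: the event dictionaries, the `error_rate_by_pipeline` helper, and the `0.4` threshold are all hypothetical.

```python
from collections import defaultdict

def error_rate_by_pipeline(events):
    """Return {pipeline_name: failed / total} for a list of pipeline events.

    Each event is a plain dict with hypothetical keys
    "ci.pipeline.name" and "ci.status".
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for event in events:
        name = event["ci.pipeline.name"]
        totals[name] += 1            # denominator: all pipeline events (no filter)
        if event["ci.status"] == "error":
            failures[name] += 1      # numerator: failed pipeline events
    return {name: failures[name] / totals[name] for name in totals}

events = [
    {"ci.pipeline.name": "build", "ci.status": "error"},
    {"ci.pipeline.name": "build", "ci.status": "success"},
    {"ci.pipeline.name": "deploy", "ci.status": "success"},
]
rates = error_rate_by_pipeline(events)

# With a group by on the pipeline name, each group is compared to the
# threshold separately (0.4 is an arbitrary illustrative value).
alerting = [name for name, rate in rates.items() if rate > 0.4]
print(alerting)
```

Because the ratio is computed per pipeline name, one noisy pipeline does not hide or trigger alerts for the others.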
For example, a Tests monitor can watch failed tests on the `main` branch of the `myapp` test service using the following query: `@test.status:fail @git.branch:main @test.service:myapp`.
For a Tests monitor, the query can evaluate:
- A facet: the monitor alerts over the `Unique value count` of the facet.
- A measure: the monitor alerts over an aggregation of the measure (`min`, `avg`, `sum`, `median`, `pc75`, `pc90`, `pc95`, `pc98`, `pc99`, or `max`).

When the query has a `group by`, an alert is sent for every source according to the group parameters. An alerting event is generated for each group that meets the set conditions. For example, you could group a query by `@test.full_name` to receive a separate alert for each CI Test full name when the number of errors is high. Test full name is a combination of a test suite and test name, for example: `MySuite.myTest`. In Swift, test full name is a combination of a test bundle, suite, and name, for example: `MyBundle.MySuite.myTest`.
Use `@test.fingerprint` in the monitor `group by` when you have tests with the same test full name, but different test parameters or configurations. This way, alerts trigger for test runs with specific test parameters or configurations. Using `@test.fingerprint` provides the same granularity level as the Test Stats, Failed, and Flaky Tests section on the Commit Overview page.
For example, if a test with the same full name failed on Chrome, but passed on Firefox, then using the fingerprint only triggers the alert on the Chrome test run.
Using `@test.full_name` in this case triggers the alert, even though the test passed on Firefox.
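The difference between the two groupings can be illustrated with a small Python sketch. The test run dictionaries, the fingerprint values `fp-chrome` and `fp-firefox`, and the `failures_per_group` helper are hypothetical and only model the Chrome/Firefox scenario described above.

```python
def failures_per_group(test_runs, group_key):
    """Return {group_value: (failed_runs, total_runs)} for the given key."""
    groups = {}
    for run in test_runs:
        failed, total = groups.get(run[group_key], (0, 0))
        if run["test.status"] == "fail":
            failed += 1
        groups[run[group_key]] = (failed, total + 1)
    return groups

# Same test full name, two configurations (hypothetical fingerprints).
runs = [
    {"test.full_name": "MySuite.myTest", "test.fingerprint": "fp-chrome", "test.status": "fail"},
    {"test.full_name": "MySuite.myTest", "test.fingerprint": "fp-firefox", "test.status": "pass"},
]

by_full_name = failures_per_group(runs, "test.full_name")
by_fingerprint = failures_per_group(runs, "test.fingerprint")

print(by_full_name)    # one group that mixes the Chrome and Firefox runs
print(by_fingerprint)  # one group per configuration
```

Grouping by full name yields a single group that contains a failure, so the alert covers both browsers. Grouping by fingerprint isolates the failure to the Chrome configuration.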
You can create CI Test monitors using formulas and functions. This can be used to create monitors on the rate of an event happening, such as the rate of a test failing (error rate).
For example, a test error rate monitor can use a formula that calculates the ratio of “number of failed test events” (`@test.status:fail`) over “number of total test events” (no filter), grouped by `@test.full_name` (to be alerted once per test). To learn more, see the Functions Overview.
You can send the notification to different teams using the `CODEOWNERS` information available in the test event.
The example below configures the notification with the following logic:
- If the test code owner is `MyOrg/my-team`, then send the notification to the `my-team-channel` Slack channel.
- If the test code owner is `MyOrg/my-other-team`, then send the notification to the `my-other-team-channel` Slack channel.

```text
{{#is_match "citest.attributes.test.codeowners" "MyOrg/my-team"}}
@slack-my-team-channel
{{/is_match}}
{{#is_match "citest.attributes.test.codeowners" "MyOrg/my-other-team"}}
@slack-my-other-team-channel
{{/is_match}}
```

In the Notification message section of your monitor, add text similar to the code snippet above to configure monitor notifications. You can add as many `is_match` clauses as you need. For more information on Notification variables, see Monitors Conditional Variables.
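When many teams need their own clause, the notification text can be generated rather than typed by hand. The sketch below is one possible approach: `build_notification` and the `routes` mapping are hypothetical names, and the output simply reproduces the `is_match` pattern shown above for pasting into the monitor message.

```python
def build_notification(routes, attribute="citest.attributes.test.codeowners"):
    """Build one is_match clause per (code owner, Slack channel) pair."""
    blocks = []
    for team, channel in routes.items():
        blocks.append(
            # Doubled braces in an f-string emit literal "{{" and "}}".
            f'{{{{#is_match "{attribute}" "{team}"}}}}\n'
            f"@slack-{channel}\n"
            "{{/is_match}}"
        )
    return "\n".join(blocks)

routes = {
    "MyOrg/my-team": "my-team-channel",
    "MyOrg/my-other-team": "my-other-team-channel",
}
template = build_notification(routes)
print(template)
```

Keeping the team-to-channel mapping in one place makes it easier to review when code ownership changes.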
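Set the alert conditions:
- Trigger when the value is `above`, `above or equal to`, `below`, or `below or equal to` the threshold.
- Evaluate over the last `5 minutes`, `15 minutes`, `1 hour`, or `custom` to set a value between `1 minute` and `2 days`.
- Alert threshold: `<NUMBER>`
- Warning threshold: `<NUMBER>`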
For detailed instructions on the advanced alert options (such as evaluation delay), see the Monitor configuration page.
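As a rough mental model of the alert conditions, the sketch below compares an evaluated value against thresholds using the four comparison options and checks that a custom window falls between 1 minute and 2 days. It assumes the two `<NUMBER>` fields are an alert threshold and an optional warning threshold. The function names and the `ALERT`/`WARN`/`OK` labels are illustrative, not Datadog's implementation.

```python
import operator

# The four comparison options offered in the alert conditions.
COMPARATORS = {
    "above": operator.gt,
    "above or equal to": operator.ge,
    "below": operator.lt,
    "below or equal to": operator.le,
}

MIN_WINDOW_SECONDS = 60                 # 1 minute
MAX_WINDOW_SECONDS = 2 * 24 * 60 * 60   # 2 days

def is_valid_window(seconds):
    """Check that a custom evaluation window is within the allowed range."""
    return MIN_WINDOW_SECONDS <= seconds <= MAX_WINDOW_SECONDS

def evaluate(value, comparison, alert_threshold, warning_threshold=None):
    """Return an illustrative state for one evaluated value."""
    compare = COMPARATORS[comparison]
    if compare(value, alert_threshold):
        return "ALERT"
    if warning_threshold is not None and compare(value, warning_threshold):
        return "WARN"
    return "OK"

print(evaluate(7, "above", alert_threshold=5, warning_threshold=3))
print(evaluate(4, "above", alert_threshold=5, warning_threshold=3))
print(is_valid_window(15 * 60))
```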
For detailed instructions on the Configure notifications and automations section, see the Notifications page.
When a CI Test or Pipeline monitor is triggered, samples or values can be added to the notification message.
Monitor Setup | Can be added to notification message |
---|---|
Ungrouped Simple-Alert count | Up to 10 samples. |
Grouped Simple-Alert count | Up to 10 facet or measure values. |
Grouped Multi-Alert count | Up to 10 samples. |
Ungrouped Simple-Alert measure | Up to 10 samples. |
Grouped Simple-Alert measure | Up to 10 facet or measure values. |
Grouped Multi-Alert measure | Up to 10 facet or measure values. |
These are available for notifications sent to Slack, Jira, webhooks, Microsoft Teams, PagerDuty, and email. Note: Samples are not displayed for recovery notifications.
To disable samples, uncheck the box at the bottom of the Say what’s happening section. The text next to the box is based on your monitor’s grouping (as stated above).
Include a table of CI Test 10 samples in the alert notification:
Include a table of CI Pipeline 10 samples in the alert notification:
A monitor that uses an event count for its evaluation query will resolve after the specified evaluation period with no data, triggering a notification. For example, a monitor configured to alert on the number of pipeline errors with an evaluation window of five minutes will automatically resolve after five minutes without any pipeline executions.
As an alternative, Datadog recommends using rate formulas. For example, instead of using a monitor on the number of pipeline failures (count), use a monitor on the rate of pipeline failures (formula), such as `(number of pipeline failures)/(number of all pipeline executions)`. In this case, when there’s no data, the denominator `(number of all pipeline executions)` will be `0`, making the division `x/0` impossible to evaluate. The monitor will keep the previous known state instead of evaluating it to `0`.
This way, if the monitor triggers because there’s a burst of pipeline failures that makes the error rate go above the monitor threshold, it will not clear until the error rate goes below the threshold, which can be at any time afterwards.
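The contrast between count-based and rate-based monitors can be modeled with a toy simulation. This is a simplified sketch of the behavior described above, not Datadog's evaluation engine: the function names, thresholds, and the sequence of evaluation windows are hypothetical.

```python
def count_monitor_state(failure_count, threshold):
    """A count monitor sees 0 when there is no data, so it resolves."""
    return "ALERT" if failure_count > threshold else "OK"

def rate_monitor_state(failures, total, threshold, previous_state):
    """A rate monitor cannot evaluate x/0, so it keeps the previous state."""
    if total == 0:
        return previous_state
    return "ALERT" if failures / total > threshold else "OK"

# Each tuple is (failed executions, all executions) in one evaluation window:
# a burst of failures, then a window with no executions, then healthy runs.
windows = [(3, 4), (0, 0), (0, 5)]

count_states, rate_states = [], []
rate_state = "OK"
for failures, total in windows:
    count_states.append(count_monitor_state(failures, threshold=2))
    rate_state = rate_monitor_state(failures, total, 0.5, rate_state)
    rate_states.append(rate_state)

print(count_states)
print(rate_states)
```

In the second window the count monitor resolves simply because nothing ran, while the rate monitor stays in its alerting state until real executions bring the error rate back under the threshold.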
Common monitor use cases are outlined below. Monitor queries can be modified to filter for specific branches, authors, or any other in-app facet.
The `duration` metric can be used to identify pipeline and test performance regressions for any branch. Alerting on this metric can prevent performance regressions from being introduced into your codebase.
Test monitors have the `New Flaky Test`, `Test Failures`, and `Test Performance` common monitor types for simple monitor setup. The `New Flaky Test` monitor sends alerts when new flaky tests are added to your codebase. The query is grouped by `Test Full Name` so you don’t get alerted on the same new flaky test more than once.
A test run is marked as `flaky` if it exhibits flakiness within the same commit after some retries. If it exhibits flakiness multiple times (because multiple retries were executed), the `is_flaky` tag is added to the first test run that is detected as flaky.
A test run is marked as `new flaky` if that particular test has not been detected to be flaky within the same branch or default branch. Only the first test run that is detected as new flaky is marked with the `is_new_flaky` tag (regardless of the number of retries).
For more information, see Search and Manage CI Tests.
Custom metrics, such as code coverage percentage, can be created and used within monitors. For example, a monitor can send alerts when code coverage dips below a certain percentage, which can help with maintaining test performance over time.
For more information, see Code Coverage.