Datadog Log Management collects, processes, archives, explores, and monitors your logs, so that you have visibility into your system's issues. However, it can be hard to get the right level of visibility from your logs, and log throughput can vary widely, creating unexpected resource usage.

This guide walks you through Log Management best practices and account configurations that give you flexibility in governance, usage attribution, and budget control. It also covers how to monitor your log usage.
If you want to transform your logs or redact sensitive data in your logs before they leave your environment, see how to aggregate, process, and transform your log data with Observability Pipelines.
Set up multiple indexes if you want to segment your logs for different retention periods or daily quotas, usage monitoring, and billing.
For example, if you have logs that only need to be retained for 7 days, while other logs need to be retained for 30 days, use multiple indexes to separate out the logs by the two retention periods.
See the Indexes documentation for how to set up multiple indexes.
Setting daily quotas on your indexes can help prevent billing overages when new log sources are added or if a developer unintentionally changes the logging levels to debug mode. See Alert on indexes reaching their daily quota on how to set up a monitor to alert when a percentage of the daily quota is reached within the past 24 hours.
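If you manage many indexes, the same setup can be scripted against the Logs Indexes API. The sketch below is a minimal example, assuming the v1 `logs/config/indexes` endpoint and exported `DD_API_KEY`/`DD_APP_KEY`; the index name, filter query, retention, and quota values are all illustrative:

```shell
# Sketch: create a 7-day index with a daily quota via the Logs Indexes API.
# The index name, filter query, retention, and quota are illustrative.
payload=$(cat <<'EOF'
{
  "name": "retention-7d",
  "filter": { "query": "team:web" },
  "num_retention_days": 7,
  "daily_limit": 10000000
}
EOF
)

# Validate the JSON locally before sending it.
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload OK"

# Send the request only when credentials are available.
if [ -n "${DD_API_KEY:-}" ] && [ -n "${DD_APP_KEY:-}" ]; then
  curl -s -X POST "https://api.datadoghq.com/api/v1/logs/config/indexes" \
    -H "Content-Type: application/json" \
    -H "DD-API-KEY: ${DD_API_KEY}" \
    -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" \
    -d "$payload"
else
  echo "DD_API_KEY / DD_APP_KEY not set; skipping API call"
fi
```

A second index with `num_retention_days: 30` (and its own filter) is created the same way. Logs enter the first index whose filter they match, so order your index filters accordingly.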
If you want to retain logs for an extended time while maintaining querying speeds similar to Standard Indexing, configure Flex Logs. This tier is best suited for logs that require longer retention and occasionally need to be queried urgently. Flex Logs decouples storage from compute costs so you can cost effectively retain more logs for longer without sacrificing visibility. Logs that need to be frequently queried should be stored in standard indexes.
If you want to store your logs for longer periods of time, set up Log Archives to send your logs to a storage-optimized system, such as Amazon S3, Azure Storage, or Google Cloud Storage. When you want to use Datadog to analyze those logs, use Log Rehydration™ to capture those logs back in Datadog. With multiple archives, you can both segment logs for compliance reasons and keep rehydration costs under control.
Set a limit on the volume of logs that can be rehydrated at one time. When setting up an archive, you can define the maximum volume of log data that can be scanned for Rehydration. See Define maximum scan size for more information.
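Archive creation, including the rehydration scan cap, can also be scripted. The following is a sketch assuming the v2 `logs/config/archives` endpoint and an S3 destination; the archive name, bucket, path, AWS account, role, and scan-size value are illustrative:

```shell
# Sketch: create an S3 log archive with a maximum rehydration scan size.
# All names and values below are illustrative.
payload=$(cat <<'EOF'
{
  "data": {
    "type": "archives",
    "attributes": {
      "name": "compliance-archive",
      "query": "env:prod",
      "destination": {
        "type": "s3",
        "bucket": "my-log-archive-bucket",
        "path": "/datadog/logs",
        "integration": {
          "account_id": "123456789012",
          "role_name": "datadog-archive-role"
        }
      },
      "rehydration_max_scan_size_in_gb": 100
    }
  }
}
EOF
)

# Validate the JSON locally before sending it.
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload OK"

if [ -n "${DD_API_KEY:-}" ] && [ -n "${DD_APP_KEY:-}" ]; then
  curl -s -X POST "https://api.datadoghq.com/api/v2/logs/config/archives" \
    -H "Content-Type: application/json" \
    -H "DD-API-KEY: ${DD_API_KEY}" \
    -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" \
    -d "$payload"
else
  echo "DD_API_KEY / DD_APP_KEY not set; skipping API call"
fi
```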
There are three default Datadog roles: Admin, Standard, and Read-only. You can also create custom roles with unique permission sets. For example, you can create a role that restricts users from modifying index retention policies to avoid unintended cost spikes. Similarly, you can restrict who can modify log parsing configurations to avoid unwanted changes to well-defined log structures and formats.
To set up custom roles with permissions, see How to Set Up RBAC for Logs for a step-by-step guide on setting up and assigning a role with specific permissions for an example use case.
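Role creation can also be scripted through the v2 Roles API. A minimal sketch follows; the role name is illustrative, and permissions still have to be attached to the new role afterward through the role's permissions relationship:

```shell
# Sketch: create a custom role via the v2 Roles API.
# The role name is illustrative; attach permissions to it in a follow-up call.
payload=$(cat <<'EOF'
{
  "data": {
    "type": "roles",
    "attributes": { "name": "logs-config-restricted" }
  }
}
EOF
)

# Validate the JSON locally.
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload OK"
# POST the payload to https://api.datadoghq.com/api/v2/roles
# with DD-API-KEY and DD-APPLICATION-KEY headers.
```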
You can monitor your log usage by setting up the following:
By default, log usage metrics are available to track the number of ingested logs, ingested bytes, and indexed logs. These metrics are free and kept for 15 months:

- `datadog.estimated_usage.logs.ingested_bytes`
- `datadog.estimated_usage.logs.ingested_events`
See Anomaly detection monitors for steps on how to create anomaly monitors with the usage metrics.
Note: Datadog recommends setting the unit to `byte` for the `datadog.estimated_usage.logs.ingested_bytes` metric in the metric summary page.
Create an anomaly detection monitor to alert on any unexpected log indexing spikes:

1. Select the `datadog.estimated_usage.logs.ingested_events` metric.
2. Add the `datadog_is_excluded:false` tag to monitor indexed logs and not ingested ones.
3. Group by the `service` and `datadog_index` tags, so that you are notified if a specific service spikes or stops sending logs in any index.
4. Set a notification message, for example:

   An unexpected amount of logs has been indexed in the index: {{datadog_index.name}}
   1. [Check Log patterns for this service](https://app.datadoghq.com/logs/patterns?from_ts=1582549794112&live=true&to_ts=1582550694112&query=service%3A{{service.name}})
   2. [Add an exclusion filter on the noisy pattern](https://app.datadoghq.com/logs/pipelines/indexes)
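An anomaly monitor like this can also be created through the Monitors API. The sketch below assumes a `query alert` monitor wrapping the usage metric in the `anomalies()` function; the algorithm (`agile`), bounds, and evaluation window are illustrative choices to adapt:

```shell
# Sketch: create an anomaly monitor on indexed log volume via the Monitors API.
# Algorithm, bounds, and window below are illustrative choices.
payload=$(cat <<'EOF'
{
  "type": "query alert",
  "name": "Unexpected log indexing spike",
  "query": "avg(last_4h):anomalies(sum:datadog.estimated_usage.logs.ingested_events{datadog_is_excluded:false} by {service,datadog_index}.as_count(), 'agile', 2) >= 1",
  "message": "An unexpected amount of logs has been indexed in the index: {{datadog_index.name}}",
  "options": { "thresholds": { "critical": 1 } }
}
EOF
)

# Validate the JSON locally before sending it.
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload OK"

if [ -n "${DD_API_KEY:-}" ] && [ -n "${DD_APP_KEY:-}" ]; then
  curl -s -X POST "https://api.datadoghq.com/api/v1/monitor" \
    -H "Content-Type: application/json" \
    -H "DD-API-KEY: ${DD_API_KEY}" \
    -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" \
    -d "$payload"
else
  echo "DD_API_KEY / DD_APP_KEY not set; skipping API call"
fi
```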
Set up a monitor to alert if the indexed log volume in any scope of your infrastructure (for example, `service`, `availability-zone`, and so forth) is growing unexpectedly.

1. Construct a search query (for example, `index:main`) to capture the log volume you want to monitor.
2. Add the scope (for example, `host`, `service`, and so on) to the group by field.
3. Set a notification message, for example:

   Unexpected spike on indexed logs for service {{service.name}}. The volume on this service exceeded the threshold. Define an additional exclusion filter or increase the sampling rate to reduce the volume.
Leverage the `datadog.estimated_usage.logs.ingested_events` metric filtered on `datadog_is_excluded:false` to count only indexed logs, and use the metric monitor cumulative window to track the count since the beginning of the month.
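As a sketch, such a month-to-date monitor could be defined with a cumulative evaluation window. The `current_1mo` window syntax and the threshold below are assumptions to verify and adapt to your own volumes:

```shell
# Sketch: month-to-date indexed volume monitor (cumulative window).
# The current_1mo window syntax and threshold are assumptions.
payload=$(cat <<'EOF'
{
  "type": "query alert",
  "name": "Month-to-date indexed log volume",
  "query": "sum(current_1mo):sum:datadog.estimated_usage.logs.ingested_events{datadog_is_excluded:false}.as_count() > 500000000",
  "message": "Indexed log volume for the month exceeded the threshold.",
  "options": { "thresholds": { "critical": 500000000 } }
}
EOF
)

# Validate the JSON locally.
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload OK"
# POST the payload to https://api.datadoghq.com/api/v1/monitor
# with DD-API-KEY and DD-APPLICATION-KEY headers.
```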
Set up a daily quota on indexes to prevent indexing more than a given number of logs per day. If an index has a daily quota, Datadog recommends that you set the monitor that notifies on that index’s volume to alert when 80% of this quota is reached within the past 24 hours.
An event is generated when the daily quota is reached. By default, these events have the `datadog_index` tag with the index name. Once this event has been generated, you can create a facet on the `datadog_index` tag so that you can use `datadog_index` in the group by step when setting up a multi-alert monitor.
To set up a monitor to alert when the daily quota is reached for an index:
1. Enter `source:datadog "daily quota reached"` in the Define the search query section.
2. Add `datadog_index` to the group by field. It automatically updates to `datadog_index(datadog_index)`. The `datadog_index(datadog_index)` tag is only available when an event has already been generated.
3. Select `above or equal to` and enter `1` for the Alert threshold.
4. Set up a multi alert so that a separate notification is triggered for each `datadog_index(datadog_index)`.

This is an example of what the notification looks like in Slack:
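The daily-quota event monitor can also be sketched as a Monitors API payload. The `event-v2 alert` type and the exact event query syntax shown here are assumptions; verify them against the Monitors API reference before use:

```shell
# Sketch: event monitor that fires when an index's daily quota is reached.
# The "event-v2 alert" type and query syntax are assumptions to verify.
payload=$(cat <<'EOF'
{
  "type": "event-v2 alert",
  "name": "Daily quota reached",
  "query": "events(\"source:datadog \\\"daily quota reached\\\"\").rollup(\"count\").by(\"datadog_index(datadog_index)\").last(\"1d\") >= 1",
  "message": "Daily quota reached for index {{datadog_index.name}}.",
  "options": { "thresholds": { "critical": 1 } }
}
EOF
)

# Validate the JSON locally.
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload OK"
# POST the payload to https://api.datadoghq.com/api/v1/monitor
# with DD-API-KEY and DD-APPLICATION-KEY headers.
```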
Once you begin ingesting logs, an out-of-the-box dashboard summarizing your log usage metrics is automatically installed in your account.
Note: The metrics used in this dashboard are estimates and may differ from official billing numbers.
To find this dashboard, go to Dashboards > Dashboards List and search for Log Management - Estimated Usage.
When your usage monitors alert, you can set up exclusion filters and increase the sampling rate to reduce the volume. See Exclusion Filters for how to set them up. You can also use Log Patterns to group and identify high-volume logs. Then, in the log pattern's side panel, click Add Exclusion Filter to stop indexing those logs.
Even if you use exclusion filters, you can still visualize trends and anomalies over all of your log data using log-based metrics. See Generate Metrics from Ingested Logs for more information.
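As a sketch, a log-based metric can be declared through the v2 logs metrics endpoint; the metric name and filter query below are illustrative:

```shell
# Sketch: generate a count metric from ingested logs via the v2 API.
# The metric name and filter query are illustrative.
payload=$(cat <<'EOF'
{
  "data": {
    "type": "logs_metrics",
    "id": "logs.noisy_service.count",
    "attributes": {
      "compute": { "aggregation_type": "count" },
      "filter": { "query": "service:noisy-service status:error" }
    }
  }
}
EOF
)

# Validate the JSON locally.
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload OK"
# POST the payload to https://api.datadoghq.com/api/v2/logs/config/metrics
# with DD-API-KEY and DD-APPLICATION-KEY headers.
```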
If you want to prevent data leaks and limit non-compliance risks, use Sensitive Data Scanner to identify, tag, and optionally redact or hash sensitive data. For example, you can scan for credit card numbers, bank routing numbers, and API keys in your logs, APM spans, and RUM events. See Sensitive Data Scanner for how to set up scanning rules that determine what data to scan.
Note: Sensitive Data Scanner is a separate billable product.
If you want to see user activities, such as who changed the retention of an index or who modified an exclusion filter, enable Audit Trail to see these events. See Audit Trail Events for a list of platform and product-specific events that are available. To enable and configure Audit Trail, follow the steps in the Audit Trail documentation.
Note: Audit Trail is a separate billable product.