Getting started
Navigate to the SLO Manage page.
Start thinking from the perspective of your user:
- How are your users interacting with your application?
- What is their journey through the application?
- Which parts of your infrastructure do these journeys interact with?
- What are they expecting from your systems and what are they hoping to accomplish?
Select the relevant SLI
STEP 1
Response/Request
| Type of SLI | Description |
|---|---|
| Availability | Could the server respond to the request successfully? |
| Latency | How long did it take for the server to respond to the request? |
| Throughput | How many requests can be handled? |
Storage
| Type of SLI | Description |
|---|---|
| Availability | Can the data be accessed on demand? |
| Latency | How long does it take to read or write data? |
| Durability | Is the data still there when it is needed? |
Pipeline
| Type of SLI | Description |
|---|---|
| Correctness | Was the right data returned? |
| Freshness | How long does it take for new data or processed results to appear? |
STEP 2
Best practices for choosing an SLO Type
- Whenever possible, use metric-based SLOs. It’s best practice to have SLOs where the error budget reflects the number of bad events you have left before you breach your SLO. Your SLO calculations will also be volume weighted based on the number of events.
- If, instead, you want an SLO that tracks uptime and uses a time-based SLI calculation, use time slice SLOs. Unlike monitor-based SLOs, time slice SLOs don’t require you to maintain an underlying monitor for your SLO.
- Finally, consider monitor-based SLOs for use cases that are not covered by time slice SLOs, which include SLOs based on non-metric monitors or multiple monitors.
For a detailed comparison of the SLO types, see the SLO Type Comparison guide.
Do you require an SLI calculation that is time-based or count-based?
The following SLO types are available in Datadog:
Metric-based SLOs
Example: 99% of requests should complete in less than 250 ms over a 30-day window.
- Count-based SLI calculation
- SLI is calculated as the sum of good events divided by the sum of total events
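The count-based calculation above can be sketched in a few lines. This is a minimal illustration, not Datadog's implementation: the function names and the sample event counts are hypothetical, and the "bad events left" figure assumes the total event volume for the window is already known.

```python
def count_based_sli(good_events: int, total_events: int) -> float:
    """Return the SLI as a percentage: good events divided by total events."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return 100.0 * good_events / total_events


def remaining_bad_events(good_events: int, total_events: int, target_pct: float) -> float:
    """Return how many more bad events are allowed before the target is breached."""
    allowed_bad = total_events * (1 - target_pct / 100.0)
    actual_bad = total_events - good_events
    return allowed_bad - actual_bad


# Hypothetical numbers: 1,000,000 requests, 995,000 of them under 250 ms.
sli = count_based_sli(good_events=995_000, total_events=1_000_000)
budget_left = remaining_bad_events(995_000, 1_000_000, target_pct=99.0)
print(f"SLI: {sli:.2f}%")
print(f"Bad events left in the budget: {budget_left:.0f}")
```

Because every event counts equally, a busy hour influences the result more than a quiet one, which is the volume weighting mentioned in the best practices.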
Monitor-based SLOs
Example: the latency of all user requests should be less than 250 ms 99% of the time in any 30-day window.
- Time-based SLI calculation
- SLI calculated based on the underlying Monitor’s uptime
- You can select a single monitor, multiple monitors (up to 20), or a single multi alert monitor with groups
If you need to create a new monitor, go to the Monitor create page.
Time Slice SLOs
Example: the latency of all user requests should be less than 250 ms 99% of the time in any 30-day window.
- Time-based SLI calculation
- SLI calculated based on your custom uptime definition using a metric query
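A time-based calculation can be sketched as follows. This is a simplified illustration under stated assumptions: the window is divided into equal-length slices, each slice carries one latency value from a metric query, and a slice counts as "up" when that value is below the threshold. The function name, the slice values, and the 250 ms default are hypothetical and do not describe how Datadog evaluates slices internally.

```python
from typing import Sequence


def time_slice_sli(slice_latencies_ms: Sequence[float], threshold_ms: float = 250.0) -> float:
    """Return the percentage of slices whose latency is below the threshold."""
    if not slice_latencies_ms:
        raise ValueError("at least one slice is required")
    good_slices = sum(1 for latency in slice_latencies_ms if latency < threshold_ms)
    return 100.0 * good_slices / len(slice_latencies_ms)


# Hypothetical per-slice latency values in milliseconds.
latencies = [120, 180, 300, 90, 240, 260, 110, 130]
print(f"Uptime: {time_slice_sli(latencies):.1f}%")
```

Unlike the count-based calculation, each slice has the same weight regardless of how many requests it contained.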
Implement your SLIs
- Custom metrics (for example, counters)
- Integration metrics (for example, load balancer, http requests)
- Datadog APM (for example, errors, latency on services and resources)
- Datadog Logs (for example, metrics generated from logs for a count of particular occurrence)
Set your target objective and time window
- Select your target: 99%, 99.5%, 99.9%, 99.95%, or any other target value that makes sense for your requirements.
- Select your time window: over the last rolling 7, 30, or 90 days.
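To compare target and window combinations, it can help to convert each pair into an allowed amount of downtime. The sketch below assumes a time-based SLI, where the error budget is the fraction of the window that may be "bad"; the function name is hypothetical. For instance, a 99.9% target over 30 days leaves roughly 43.2 minutes.

```python
from datetime import timedelta


def error_budget(target_pct: float, window_days: int) -> timedelta:
    """Return the downtime allowed by a target over a rolling window."""
    if not 0 < target_pct < 100:
        raise ValueError("target_pct must be between 0 and 100 (exclusive)")
    return timedelta(days=window_days) * (1 - target_pct / 100.0)


# Print the budget for each target and window listed above.
for target in (99.0, 99.5, 99.9, 99.95):
    for days in (7, 30, 90):
        minutes = error_budget(target, days).total_seconds() / 60
        print(f"{target}% over {days} days: {minutes:.1f} minutes")
```

A tighter target or a shorter window shrinks the budget, so pick the pair that matches what your users actually notice.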
Name, describe, and tag your SLOs
- Name your SLO.
- Add a description: describe what the SLO is tracking and why it is important for your end user experience. You can also add links to dashboards for reference.
- Add tags: tagging by team and service is a common practice.
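The naming, description, and tagging steps can be captured in a small record so that SLO definitions stay consistent across teams. The sketch below is illustrative only: the dictionary layout, the function name, and the sample values are assumptions, not a confirmed Datadog API schema, and nothing here is sent to Datadog.

```python
import json

# Rolling windows listed in this guide, written as short labels (assumed format).
ALLOWED_WINDOWS = {"7d", "30d", "90d"}


def build_slo_definition(name: str, description: str, team: str, service: str,
                         target_pct: float, window: str) -> dict:
    """Assemble a plain SLO record with team and service tags."""
    if window not in ALLOWED_WINDOWS:
        raise ValueError(f"window must be one of {sorted(ALLOWED_WINDOWS)}")
    if not 0 < target_pct < 100:
        raise ValueError("target_pct must be between 0 and 100 (exclusive)")
    return {
        "name": name,
        "description": description,
        "tags": [f"team:{team}", f"service:{service}"],
        "target": target_pct,
        "window": window,
    }


# Hypothetical SLO for a checkout service.
slo = build_slo_definition(
    name="Checkout request latency",
    description="99% of checkout requests complete in under 250 ms.",
    team="checkout",
    service="payments-api",
    target_pct=99.0,
    window="30d",
)
print(json.dumps(slo, indent=2))
```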
View and search
Use tags to search for your SLOs from the SLO list view.
Further Reading
Additional helpful documentation, links, and articles: