Monitor Flex Compute Usage
Overview
Monitor Flex compute usage through the graphs on the Flex Logs Controls page. Use this data to weigh cost-performance tradeoffs and balance operational success with financial efficiency.
Flex compute is limited by two factors:
- The number of concurrent queries
- The maximum number of logs that can be scanned per query
Query slowdowns occur when the concurrent query limit is reached and a query has to retry until a slot becomes available. If no slot becomes available, the query does not run, and Datadog displays an error message advising you to retry the query later.
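The slot behavior described above can be pictured with a small toy model. This is a minimal sketch, not Datadog's implementation: the `QuerySlots` class, its `run` method, and the `max_attempts`, `wait`, and `delay` parameters are hypothetical names chosen for illustration, and the real retry policy is not documented here.

```python
import threading
import time


class QuerySlots:
    """Toy model of a fixed pool of concurrent query slots.

    Hypothetical illustration only; not Datadog's implementation.
    """

    def __init__(self, limit):
        # One permit per concurrent query slot.
        self._slots = threading.BoundedSemaphore(limit)

    def run(self, query, max_attempts=3, wait=time.sleep, delay=0.01):
        """Try to run `query`, retrying while every slot is busy."""
        for attempt in range(1, max_attempts + 1):
            # Non-blocking acquire: succeeds only if a slot is free right now.
            if self._slots.acquire(blocking=False):
                try:
                    return {"status": "ok", "attempts": attempt, "result": query()}
                finally:
                    self._slots.release()
            # No free slot: this wait is what a user perceives as a slowdown.
            if attempt < max_attempts:
                wait(delay)
        # Every attempt found the pool full, so the query never ran.
        return {"status": "error", "attempts": max_attempts, "result": None}
```

In this model, a result with `attempts` greater than 1 corresponds to a query slowdown, and a result with `status` set to `"error"` corresponds to the retry-later message.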
Available metrics
The Flex Logs Controls page provides visualizations so you can assess how often query slowdowns are occurring and where they are happening most frequently. The following metrics are available:
- Query slowdowns
- Top sources of query slowdowns
- Top users experiencing slowdowns
- Top dashboards experiencing slowdowns
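The "top" views in this list are rankings by slowdown count. The page computes them for you; the sketch below only illustrates the idea. The record shape, the field names `user` and `dashboard`, the sample values, and the `top_n` helper are all made up for this example and do not reflect an actual Datadog export format.

```python
from collections import Counter

# Hypothetical slowdown records; the fields and values are illustrative only.
slowdowns = [
    {"user": "alice", "dashboard": "checkout"},
    {"user": "alice", "dashboard": "checkout"},
    {"user": "alice", "dashboard": "latency"},
    {"user": "bob", "dashboard": "checkout"},
    {"user": "bob", "dashboard": "errors"},
    {"user": "carol", "dashboard": "checkout"},
]


def top_n(records, field, n=3):
    """Return the n most frequent values of `field` with their counts."""
    counts = Counter(record[field] for record in records)
    return counts.most_common(n)


top_users = top_n(slowdowns, "user", n=2)
top_dashboards = top_n(slowdowns, "dashboard", n=1)
```

A ranking like this is a starting point for the outreach and dashboard changes discussed in the recommendations.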
Optimization recommendations
Use these metrics to optimize your Flex compute usage:
- Reach out to outlier users to:
  - Discuss their querying needs
  - Understand if there are logs they query frequently that should be stored in Standard Indexing instead
- Improve dashboards experiencing slowdowns by:
  - Evaluating if logs used to power widgets can be converted into metrics to reduce the heavy Flex compute usage
  - Breaking them down into smaller dashboards to spread the load
  - Reducing the number of concurrent queries
- Consider upgrading your Flex compute size to increase the concurrent query limit if you notice sustained query slowdowns.
To learn more about compute sizes, see the Flex Logs documentation.