Datadog Kubernetes Autoscaling continuously monitors your Kubernetes resources to provide immediate scaling recommendations and multidimensional autoscaling of your Kubernetes workloads. You can deploy autoscaling through the Datadog web interface, or with a `DatadogPodAutoscaler` custom resource.
## How it works
Datadog uses real-time and historical utilization metrics and event signals from your existing Datadog Agents to make recommendations. You can then examine these recommendations and choose to deploy them.
By default, Datadog Kubernetes Autoscaling uses estimated CPU and memory cost values to show savings opportunities and impact estimates. You can also use Kubernetes Autoscaling alongside Cloud Cost Management to get reporting based on your exact instance type costs.
Automated workload scaling is powered by a `DatadogPodAutoscaler` custom resource that defines scaling behavior at the per-workload level. The Datadog Cluster Agent acts as the controller for this custom resource.
Each cluster can have a maximum of 1000 workloads optimized with Datadog Kubernetes Autoscaling.
## Compatibility
- Distributions: This feature is compatible with all of Datadog’s supported Kubernetes distributions.
- Workload autoscaling: This feature is an alternative to Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA). Datadog recommends that you remove any HPAs or VPAs from a workload before you use Datadog Kubernetes Autoscaling to optimize it. Workloads that still have an HPA or VPA attached are flagged in the application on your behalf.
## Requirements
- Remote Configuration must be enabled for your organization. See Enabling Remote Configuration.
- Helm, for updating your Datadog Agent.
- (For Datadog Operator users) The `kubectl` CLI, for updating the Datadog Agent.
- The following user permissions:
  - Org Management (required for Remote Configuration)
  - API Keys Write (required for Remote Configuration)
  - Workload Scaling Read
  - Workload Scaling Write
  - Autoscaling Manage
## Setup
If you deploy the Agent with the Datadog Operator:

1. Ensure you are using Datadog Operator v1.8.0+. To upgrade your Datadog Operator:

   ```shell
   helm upgrade datadog-operator datadog/datadog-operator
   ```

2. Add the following to your `datadog-agent.yaml` configuration file:

   ```yaml
   spec:
     features:
       orchestratorExplorer:
         customResources:
           - datadoghq.com/v1alpha1/datadogpodautoscalers
       autoscaling:
         workload:
           enabled: true
       eventCollection:
         unbundleEvents: true
     override:
       clusterAgent:
         image:
           tag: 7.58.1
       nodeAgent:
         image:
           tag: 7.58.1 # or 7.58.1-jmx
       clusterChecksRunner:
         image:
           tag: 7.58.1 # or 7.58.1-jmx
   ```

3. Admission Controller is enabled by default with the Datadog Operator. If you disabled it, re-enable it by adding the following lines to `datadog-agent.yaml`:

   ```yaml
   spec:
     features:
       admissionController:
         enabled: true
   ```

4. Apply the updated `datadog-agent.yaml` configuration:

   ```shell
   kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml
   ```
If you deploy the Agent with Helm:

1. Add the following to your `datadog-values.yaml` configuration file:

   ```yaml
   datadog:
     orchestratorExplorer:
       customResources:
         - datadoghq.com/v1alpha1/datadogpodautoscalers
     autoscaling:
       workload:
         enabled: true
     kubernetesEvents:
       unbundleEvents: true
   clusterAgent:
     image:
       tag: 7.58.1
   agents:
     image:
       tag: 7.58.1 # or 7.58.1-jmx
   clusterChecksRunner:
     image:
       tag: 7.58.1 # or 7.58.1-jmx
   ```

2. Admission Controller is enabled by default in the Datadog Helm chart. If you disabled it, re-enable it by adding the following lines to `datadog-values.yaml`:

   ```yaml
   clusterAgent:
     image:
       tag: 7.58.1
     admissionController:
       enabled: true
   ```

3. Update your Helm version.

4. Redeploy the Datadog Agent with your updated `datadog-values.yaml`:

   ```shell
   helm upgrade -f datadog-values.yaml <RELEASE_NAME> datadog/datadog
   ```
## Ingest cost data with Cloud Cost Management
By default, Datadog Kubernetes Autoscaling shows idle cost and savings estimates using the following fixed values:
- $0.0295 per CPU core hour
- $0.0053 per memory GB hour
Fixed cost values are subject to refinement over time.
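These fixed rates make idle-cost estimates easy to reproduce by hand. The sketch below assumes a 730-hour month and a hypothetical workload with 2 idle CPU cores and 4 GB of idle memory; both are example values, not Datadog's method verbatim:

```shell
# Estimated monthly idle cost at the fixed rates above, assuming ~730 hours/month.
# The idle amounts (2 cores, 4 GB) are hypothetical example values.
awk 'BEGIN {
  cpu = 2 * 0.0295 * 730   # idle cores * $/core-hour * hours
  mem = 4 * 0.0053 * 730   # idle GB   * $/GB-hour   * hours
  printf "$%.2f/month\n", cpu + mem
}'
# prints "$58.55/month"
```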
When Cloud Cost Management is enabled within an org, Datadog Kubernetes Autoscaling shows idle cost and savings estimates based on the exact billed cost of the underlying monitored instances.
See Cloud Cost setup instructions for AWS, Azure, or Google Cloud.
Cost data enhances Kubernetes Autoscaling, but it is not required. All of Datadog’s workload recommendations and autoscaling decisions are valid and functional without cost data.
## Usage
### Identify resources to rightsize
The Autoscaling Summary page provides a starting point for platform teams to understand the total Kubernetes resource savings opportunities across an organization and to filter down to key clusters and namespaces. The Cluster Scaling view provides per-cluster information about total idle CPU, total idle memory, and costs. Click a cluster for detailed information and a table of the cluster’s workloads. If you are an individual application or service owner, you can also filter by your team or service name directly from the Workload Scaling list view.
Click Optimize on any workload to see its scaling recommendation.
### Deploy Autoscaling
After you identify a workload to optimize, Datadog recommends inspecting its Scaling Recommendation. You can also click Configure Recommendation to add constraints or adjust target utilization levels.
When you are ready to enable Autoscaling for a workload, you have two options for deployment:

1. Click Enable Autoscaling. (Requires the Workload Scaling Write permission.) Datadog automatically installs and configures autoscaling for this workload on your behalf.
2. Deploy a `DatadogPodAutoscaler` custom resource. Use your existing deploy process to target and configure Autoscaling for your workload. Click Export Recommendation to see a suggested manifest configuration.
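If you take the custom resource route, a manifest might look like the minimal sketch below. The workload name, namespace, and constraint values are hypothetical, and field names can vary between CRD versions; treat the manifest generated by Export Recommendation as the source of truth.

```yaml
# Illustrative sketch only — use Export Recommendation for the exact manifest.
# "web", "production", and the replica constraints are hypothetical values.
apiVersion: datadoghq.com/v1alpha1
kind: DatadogPodAutoscaler
metadata:
  name: web
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  constraints:
    minReplicas: 2
    maxReplicas: 10
```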
### Deploy recommendations manually
As an alternative to Autoscaling, you can also deploy Datadog’s scaling recommendations manually. When you configure resources for your Kubernetes deployments, use the values suggested in the scaling recommendations. You can also click Export Recommendation to see a generated `kubectl patch` command.
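Such a command might resemble the following sketch. The Deployment name, namespace, and resource values here are hypothetical; the authoritative command is the one generated for your workload by Export Recommendation.

```shell
# Illustrative only — use the command from Export Recommendation.
# Sets recommended requests on a hypothetical "web" Deployment in "production".
kubectl patch deployment web -n production --type='json' -p='[
  {"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "250m"},
  {"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/memory", "value": "512Mi"}
]'
```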
## Further reading
Additional helpful documentation, links, and articles: