Datadog Kubernetes Autoscaling
Join the Preview!
Datadog Kubernetes Autoscaling is in Preview.
Datadog Kubernetes Autoscaling automates the scaling of your Kubernetes environments based on utilization metrics. This feature enables you to make changes to your Kubernetes environments from within Datadog.
How it works
Datadog Kubernetes Autoscaling provides observability into cluster scaling, along with workload scaling recommendations and automation. Datadog uses real-time and historical utilization metrics to make recommendations. With data from Cloud Cost Management, Datadog can also make recommendations based on costs.
Automated workload scaling is powered by a DatadogPodAutoscaler custom resource that defines scaling behavior on a per-workload level.
Each cluster can have a maximum of 100 workloads optimized with Datadog Kubernetes Autoscaling.
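To see how close a cluster is to this limit, one rough approach is to count its DatadogPodAutoscaler objects. The sketch below assumes that each optimized workload is backed by exactly one such object, which this page does not state explicitly. `dpa_count` is a hypothetical helper name, not part of any Datadog tooling.

```shell
# Hypothetical helper: count DatadogPodAutoscaler objects across all namespaces
# in the cluster selected by the current kubectl context.
# Assumption: one DatadogPodAutoscaler object per optimized workload.
dpa_count() {
  kubectl get datadogpodautoscalers --all-namespaces --no-headers 2>/dev/null \
    | wc -l | tr -d ' '
}

# Example: warn when the count approaches the 100-workload limit.
# [ "$(dpa_count)" -ge 90 ] && echo "approaching the per-cluster limit"
```

The threshold of 90 in the commented example is arbitrary; pick whatever margin suits your cluster.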
Compatibility
- Distributions: This feature is compatible with all of Datadog’s supported Kubernetes distributions.
- Cluster autoscaling: This feature works alongside cluster autoscaling solutions, such as Karpenter and Cluster Autoscaler.
- Workload autoscaling: This feature is an alternative to Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA). Datadog recommends that you remove any HPAs or VPAs from a workload before you use Datadog Kubernetes Autoscaling to optimize it.
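Before optimizing a workload, it can help to check whether an HPA already targets it. The following is a minimal sketch that uses only standard `kubectl` output formatting; `hpas_for_deployment` is a hypothetical helper name. It covers HPAs only, since VPA is a separately installed custom resource and may not exist in your cluster.

```shell
# Hypothetical helper: print the names of HPAs in a namespace whose
# scaleTargetRef points at the given Deployment.
# Usage: hpas_for_deployment <namespace> <deployment-name>
hpas_for_deployment() {
  ns="$1"; deploy="$2"
  kubectl get hpa -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.scaleTargetRef.kind}{"/"}{.spec.scaleTargetRef.name}{"\n"}{end}' \
    | awk -v target="Deployment/$deploy" '$2 == target { print $1 }'
}

# Example: review what would need to be removed for the "web" Deployment.
# hpas_for_deployment default web
```

Any HPA this prints is a candidate for removal with `kubectl delete hpa <name> -n <namespace>`, once you have confirmed it is safe to do so.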
Requirements
- Remote Configuration must be enabled for your organization. See Enabling Remote Configuration.
- Helm, for updating your Datadog Agent
- (For Datadog Operator users) kubectl CLI, for updating the Datadog Agent
- The following user permissions:
  - Org Management (org_management)
  - API Keys Write (api_keys_write)
  - Workload Scaling Write (orchestration_workload_scaling_write)
During the Preview period, Preview users are granted access at an organization level...
Setup
If you installed the Datadog Agent with the Datadog Operator:
- Ensure you are using Datadog Operator v1.8.0+. To upgrade your Datadog Operator:
helm upgrade datadog-operator datadog/datadog-operator
- Add the following to your datadog-agent.yaml configuration file:
  spec:
    features:
      orchestratorExplorer:
        customResources:
        - datadoghq.com/v1alpha1/datadogpodautoscalers
      autoscaling:
        workload:
          enabled: true
      eventCollection:
        unbundleEvents: true
    override:
      clusterAgent:
        image:
          tag: 7.58.1
      nodeAgent:
        image:
          tag: 7.58.1 # or 7.58.1-jmx
      clusterChecksRunner:
        image:
          tag: 7.58.1 # or 7.58.1-jmx
- Admission Controller is enabled by default with the Datadog Operator. If you disabled it, re-enable it by adding the following admissionController lines to datadog-agent.yaml:
  ...
  spec:
    features:
      admissionController:
        enabled: true
  ...
- Apply the updated datadog-agent.yaml configuration:
  kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml
If you installed the Datadog Agent with Helm:
- Add the following to your datadog-values.yaml configuration file:
  datadog:
    orchestratorExplorer:
      customResources:
      - datadoghq.com/v1alpha1/datadogpodautoscalers
    autoscaling:
      workload:
        enabled: true
    kubernetesEvents:
      unbundleEvents: true
  clusterAgent:
    image:
      tag: 7.58.1
  agents:
    image:
      tag: 7.58.1 # or 7.58.1-jmx
  clusterChecksRunner:
    image:
      tag: 7.58.1 # or 7.58.1-jmx
- Admission Controller is enabled by default in the Datadog Helm chart. If you disabled it, re-enable it by adding the following admissionController lines under clusterAgent in datadog-values.yaml:
  ...
  clusterAgent:
    image:
      tag: 7.58.1
    admissionController:
      enabled: true
  ...
- Update your local Helm chart repositories:
  helm repo update
- Redeploy the Datadog Agent with your updated datadog-values.yaml:
  helm upgrade -f datadog-values.yaml <RELEASE_NAME> datadog/datadog
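After either setup path, a quick sanity check can confirm that the cluster is ready. The sketch below is one possible check, not an official verification procedure. It assumes the custom resource definition follows the standard Kubernetes `<plural>.<group>` naming (`datadogpodautoscalers.datadoghq.com`), and it takes the Cluster Agent Deployment name as an argument because that name depends on your release. `verify_autoscaling_setup` is a hypothetical helper name.

```shell
# Hypothetical helper: check that the DatadogPodAutoscaler CRD is registered
# and that the Cluster Agent Deployment has finished rolling out.
# Usage: verify_autoscaling_setup <namespace> <cluster-agent-deployment-name>
verify_autoscaling_setup() {
  ns="$1"; cluster_agent_deploy="$2"
  # Assumption: CRD name follows the <plural>.<group> convention.
  kubectl get crd datadogpodautoscalers.datadoghq.com >/dev/null || return 1
  kubectl rollout status -n "$ns" "deployment/$cluster_agent_deploy" --timeout=120s
}

# Example (Deployment name is illustrative; look yours up with
# `kubectl get deployments -n "$DD_NAMESPACE"`):
# verify_autoscaling_setup "$DD_NAMESPACE" datadog-cluster-agent
```

A non-zero exit status means either the CRD is missing or the rollout did not complete within the timeout.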
Ingest cost data with Cloud Cost Management
Datadog’s Kubernetes Autoscaling can work with Cloud Cost Management to make workload scaling recommendations based on cost data…
Kubernetes Autoscaling Preview users are granted limited access to Cloud Cost Management during the Preview period. To coordinate this trial access, contact your customer success manager and CC kubernetes-beta@datadoghq.com.
If you are already using Cloud Cost Management, no action is required.
See Cloud Cost setup instructions for AWS, Azure, or Google Cloud.
If you do not enable Cloud Cost Management, all workload recommendations and autoscaling decisions are still valid and functional.
Usage
Identifying resources to scale
In Datadog, navigate to Containers > Kubernetes Explorer and select the Autoscaling tab. Use the Cluster Scaling view to see a list of your clusters, sortable by total idle CPU or total idle memory. If you enabled Cloud Cost Management, you can also see cost information and a trailing 30-day cost breakdown.
Click Optimize cluster to open a detailed view of the selected cluster, including a table of this cluster’s workloads.
You can also use the Workload Scaling view to see a filterable list of all workloads across all clusters.
Select a workload and click Optimize to see its Scaling Recommendations. You can inspect the metrics backing the recommendation for each container within the deployment.
Deploying recommendations
You can deploy scaling recommendations:
- Automatically, with Datadog Kubernetes Autoscaling. Select Enable Autoscaling to automatically apply your recommendations.
- Manually, with kubectl patch. Select Apply to see a generated kubectl patch command.
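For orientation, the sketch below shows the general shape of a `kubectl patch` command that changes a container's resource requests. It is an illustration only: the command Datadog generates may differ in patch type, fields, and values. `build_requests_patch` is a hypothetical helper, and the workload name, container name, and resource values are made up.

```shell
# Hypothetical helper: build a strategic merge patch that sets the CPU and
# memory requests of one container in a Deployment's pod template.
# Usage: build_requests_patch <container-name> <cpu> <memory>
build_requests_patch() {
  container="$1"; cpu="$2"; memory="$3"
  printf '{"spec":{"template":{"spec":{"containers":[{"name":"%s","resources":{"requests":{"cpu":"%s","memory":"%s"}}}]}}}}' \
    "$container" "$cpu" "$memory"
}

# Example with illustrative values:
# kubectl patch deployment web -n default --patch "$(build_requests_patch web 250m 256Mi)"
```

Because strategic merge patches match entries in the containers list by name, only the named container is modified and other containers in the pod template are left untouched.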
Autoscale a workload with a custom resource
You can also deploy a DatadogPodAutoscaler custom resource to enable autoscaling for a workload. This custom resource targets a deployment.
For example:
  apiVersion: datadoghq.com/v1alpha1
  kind: DatadogPodAutoscaler
  metadata:
    name: <name>
    # usually the same as your deployment object name
  spec:
    constraints:
      # Adjust constraints as safeguards
      maxReplicas: 50
      minReplicas: 1
    owner: Local
    policy: All
    # Values: All, None
    # All - Allows automated recommendations to be applied. Default.
    # None - Computes recommendations without applying them (dry run).
    targetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: <your Deployment name>
    targets:
      # Currently, the recommendation is to use a single target: CPU utilization of the pod's main container.
      - type: ContainerResource
        containerResource:
          container: <main-container-name>
          name: cpu
          value:
            type: Utilization
            utilization: 75
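A cautious way to try the manifest above is to start with `policy: None`, so that recommendations are computed but not applied, and switch to `All` later. The sketch below renders the same manifest from a small shell function. `render_dpa` is a hypothetical helper, and it assumes the autoscaler is named after its target Deployment, which the example manifest describes as usual but not required.

```shell
# Hypothetical helper: print a DatadogPodAutoscaler manifest for a Deployment.
# Usage: render_dpa <deployment-name> <main-container-name> [policy]
# policy defaults to None (dry run); pass All to let recommendations be applied.
render_dpa() {
  name="$1"; container="$2"; policy="${3:-None}"
  cat <<EOF
apiVersion: datadoghq.com/v1alpha1
kind: DatadogPodAutoscaler
metadata:
  name: ${name}
spec:
  constraints:
    maxReplicas: 50
    minReplicas: 1
  owner: Local
  policy: ${policy}
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ${name}
  targets:
    - type: ContainerResource
      containerResource:
        container: ${container}
        name: cpu
        value:
          type: Utilization
          utilization: 75
EOF
}

# Example: create the resource in dry-run mode, then inspect it.
# Assumption: it is applied in the same namespace as the target Deployment.
# render_dpa web web | kubectl apply -n default -f -
# kubectl get datadogpodautoscalers web -n default -o yaml
```

Re-running the pipeline with `render_dpa web web All` would update the same object in place to allow recommendations to be applied.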