Advanced setup for Datadog Operator
The Datadog Operator is a way to deploy the Datadog Agent on Kubernetes and OpenShift. It reports deployment status, health, and errors in its Custom Resource status, and it limits the risk of misconfiguration thanks to higher-level configuration options.
Prerequisites
Using the Datadog Operator requires the following prerequisites:
- Kubernetes Cluster version >= v1.20.X: Tests were done on versions >= 1.20.0. Still, it should work on versions >= v1.11.0. For earlier versions, because of limited CRD support, the Operator may not work as expected.
- Helm for deploying the datadog-operator.
- Kubectl CLI for installing the datadog-agent.
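The version requirement above can be checked with a small helper. This is a minimal sketch, not part of the Operator tooling: the function name `version_ge` is hypothetical, the server version is hard-coded for illustration, and the comparison assumes a `sort` that supports `-V` (GNU coreutils and recent BSD versions do).

```shell
#!/usr/bin/env bash
# Returns 0 when version $1 is greater than or equal to version $2.
# A leading "v" (as in v1.20.0) is stripped before comparing.
version_ge() {
  local have="${1#v}" want="${2#v}"
  # sort -V orders version strings numerically; if the smallest of the
  # two is the wanted minimum, then "have" is at least "want".
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Illustrative value only; in practice it would come from `kubectl version`.
server_version="v1.27.3"
if version_ge "$server_version" "1.20.0"; then
  echo "Kubernetes $server_version is within the tested range"
else
  echo "Kubernetes $server_version is older than the tested range"
fi
```

Using a version-aware sort avoids the usual pitfall of string comparison, where 1.9.0 would wrongly rank above 1.20.0.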
Deploy the Datadog Operator
To use the Datadog Operator, deploy it in your Kubernetes cluster. Then create a DatadogAgent Kubernetes resource that contains the Datadog deployment configuration:
- Add the Datadog Helm repo:
helm repo add datadog https://helm.datadoghq.com
- Install the Datadog Operator:
helm install my-datadog-operator datadog/datadog-operator
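The commands on this page refer to a `$DD_NAMESPACE` variable. As a hedged sketch, the two steps above can be wrapped so that the Operator is installed into that namespace. The namespace name `datadog`, the function name, and the `HELM` override variable are assumptions made for illustration; `--namespace` and `--create-namespace` are standard Helm 3 flags.

```shell
#!/usr/bin/env bash
# HELM can be overridden (for example with "echo") to preview the commands.
HELM="${HELM:-helm}"
# Assumed namespace name; adjust to match your cluster conventions.
DD_NAMESPACE="${DD_NAMESPACE:-datadog}"

# Adds the Datadog chart repository, refreshes the local index, and
# installs the Operator into $DD_NAMESPACE, creating it if needed.
install_datadog_operator() {
  "$HELM" repo add datadog https://helm.datadoghq.com &&
  "$HELM" repo update &&
  "$HELM" install my-datadog-operator datadog/datadog-operator \
    --namespace "$DD_NAMESPACE" --create-namespace
}
```

Calling `install_datadog_operator` runs the three Helm commands in order and stops at the first failure.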
Deploy the Datadog Agents with the Operator
After deploying the Datadog Operator, create the DatadogAgent resource that triggers the Datadog Agent’s deployment in your Kubernetes cluster. By creating this resource in the Datadog-Operator namespace, the Agent is deployed as a DaemonSet on every Node of your cluster.
Create the datadog-agent.yaml manifest out of one of the following templates:
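As a minimal sketch of what such a manifest could contain, the snippet below writes a bare-bones `datadog-agent.yaml`. It assumes the `datadoghq.com/v2alpha1` schema used in the Tolerations example on this page and includes only the credentials block; real templates typically enable additional features.

```shell
#!/usr/bin/env bash
# Write a minimal DatadogAgent manifest to datadog-agent.yaml.
# The quoted 'EOF' keeps the <...> placeholders literal so they can be
# replaced by hand afterwards.
cat > datadog-agent.yaml <<'EOF'
kind: DatadogAgent
apiVersion: datadoghq.com/v2alpha1
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
      appKey: <DATADOG_APP_KEY>
EOF
```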
Replace <DATADOG_API_KEY> and <DATADOG_APP_KEY> with your Datadog API and application keys, then trigger the Agent installation with the following command:
$ kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml
datadogagent.datadoghq.com/datadog created
You can check the state of the DatadogAgent resource with:
$ kubectl get -n $DD_NAMESPACE dd datadog
NAME ACTIVE AGENT CLUSTER-AGENT CLUSTER-CHECKS-RUNNER AGE
datadog-agent True Running (2/2/2) 110m
In a cluster with two worker nodes, you should see one Agent pod created on each node.
$ kubectl get -n $DD_NAMESPACE daemonset
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
datadog-agent 2 2 2 2 2 <none> 5m30s
$ kubectl get -n $DD_NAMESPACE pod -owide
NAME READY STATUS RESTARTS AGE IP NODE
agent-datadog-operator-d897fc9b-7wbsf 1/1 Running 0 1h 10.244.2.11 kind-worker
datadog-agent-k26tp 1/1 Running 0 5m59s 10.244.2.13 kind-worker
datadog-agent-zcxx7 1/1 Running 0 5m59s 10.244.1.7 kind-worker2
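The DESIRED and READY columns shown above can also be compared from a script. The following is a sketch only: the function name and the `KUBECTL` override variable are hypothetical, and the DaemonSet name `datadog-agent` is taken from the sample output. The `desiredNumberScheduled` and `numberReady` fields are part of the standard DaemonSet status.

```shell
#!/usr/bin/env bash
# KUBECTL can be overridden to point at a wrapper or a test double.
KUBECTL="${KUBECTL:-kubectl}"

# Prints a short status line and returns 0 only when every desired
# Agent pod of the DaemonSet reports ready.
agent_daemonset_ready() {
  local namespace="$1" name="${2:-datadog-agent}"
  local status desired ready
  status="$("$KUBECTL" get daemonset "$name" -n "$namespace" \
    -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}')" || return 1
  desired="${status%% *}"   # text before the first space
  ready="${status##* }"     # text after the last space
  if [ -n "$desired" ] && [ "$desired" = "$ready" ]; then
    echo "ready ($ready/$desired)"
  else
    echo "not ready (${ready:-0}/${desired:-0})"
    return 1
  fi
}
```

For example, `agent_daemonset_ready "$DD_NAMESPACE"` could be polled in a loop until it succeeds.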
Cleanup
The following commands delete all the Kubernetes resources created by the above instructions:
kubectl delete -n $DD_NAMESPACE datadogagent datadog
helm delete my-datadog-operator
It is important to delete the DatadogAgent resource first and let the Operator perform its cleanup. When the DatadogAgent resource is created in a cluster, the Operator adds a finalizer to prevent deletion until it finishes cleaning up the resources it created. If the Operator is uninstalled first, attempts to delete the DatadogAgent resource are blocked indefinitely, which also blocks namespace deletion. A workaround in this situation is to remove the metadata.finalizers value from the DatadogAgent resource.
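One way to apply that workaround is a merge patch that empties the finalizer list. This is a hedged sketch: the function name and the `KUBECTL` override variable are hypothetical, and it should only be used once the Operator is already gone, since anything the Operator would have cleaned up may be left behind.

```shell
#!/usr/bin/env bash
# KUBECTL can be overridden (for example with "echo") to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

# Clears metadata.finalizers on a DatadogAgent so a blocked deletion
# can complete. Arguments: resource name, namespace.
remove_datadogagent_finalizers() {
  local name="$1" namespace="$2"
  "$KUBECTL" patch datadogagent "$name" -n "$namespace" \
    --type merge -p '{"metadata":{"finalizers":[]}}'
}
```

For the resource created on this page, the call would be `remove_datadogagent_finalizers datadog "$DD_NAMESPACE"`.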
Tolerations
Update your datadog-agent.yaml file with the following configuration to add the toleration in the Daemonset.spec.template of your DaemonSet:
kind: DatadogAgent
apiVersion: datadoghq.com/v2alpha1
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
      appKey: <DATADOG_APP_KEY>
  override:
    nodeAgent:
      image:
        name: gcr.io/datadoghq/agent:latest
      tolerations:
        - operator: Exists
Apply this new configuration:
$ kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml
datadogagent.datadoghq.com/datadog updated
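Because the toleration lets the Agent schedule onto tainted nodes, the DaemonSet's desired pod count increases (from 2 to 3 in this example). You can validate the update by checking the new value: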
$ kubectl get -n $DD_NAMESPACE daemonset
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
datadog-agent 3 3 3 3 3 <none> 7m31s
$ kubectl get -n $DD_NAMESPACE pod
NAME READY STATUS RESTARTS AGE
agent-datadog-operator-d897fc9b-7wbsf 1/1 Running 0 15h
datadog-agent-5ctrq 1/1 Running 0 7m43s
datadog-agent-lkfqt 0/1 Running 0 15s
datadog-agent-zvdbw 1/1 Running 0 8m1s
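A toleration with `operator: Exists` and no `key` matches every taint, so the configuration above lets the Agent schedule onto any tainted node. If that is broader than intended, a narrower variant can be sketched as follows. It assumes the extra node is a control-plane node carrying the standard `node-role.kubernetes.io/control-plane` taint with the `NoSchedule` effect; your cluster's taints may differ.

```shell
#!/usr/bin/env bash
# Write a DatadogAgent manifest whose node Agent tolerates only the
# control-plane taint (an assumption about the cluster's taints).
cat > datadog-agent.yaml <<'EOF'
kind: DatadogAgent
apiVersion: datadoghq.com/v2alpha1
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
      appKey: <DATADOG_APP_KEY>
  override:
    nodeAgent:
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
EOF
```

The taints actually present on a node can be listed with `kubectl describe node <NODE_NAME>` before choosing which ones to tolerate.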