Data Jobs Monitoring for Spark on Kubernetes

Data Jobs Monitoring gives visibility into the performance and reliability of Apache Spark applications on Kubernetes.

Setup

Data Jobs Monitoring requires Datadog Agent version 7.55.0 or later, and Java tracer version 1.38.0 or later.

Follow these steps to enable Data Jobs Monitoring for Spark on Kubernetes.

  1. Install the Datadog Agent on your Kubernetes cluster.
  2. Inject Spark instrumentation.

Install the Datadog Agent on your Kubernetes cluster

If you have already installed the Datadog Agent on your Kubernetes cluster, ensure that you have enabled the Datadog Admission Controller. You can then go to the next step, Inject Spark instrumentation.
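
To confirm that the Admission Controller is enabled, you can look for its mutating webhook in the cluster. A minimal sketch; datadog-webhook is the default webhook name and may differ in your installation:

# List the Datadog Admission Controller's mutating webhook
# ("datadog-webhook" is the default name; adjust if yours differs)
kubectl get mutatingwebhookconfigurations datadog-webhook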

You can install the Datadog Agent using the Datadog Operator or Helm.

Prerequisites

  • A running Kubernetes cluster
  • Helm, to deploy the Datadog Operator
  • The kubectl CLI, to install the Datadog Agent

Installation

Datadog Operator

  1. Install the Datadog Operator by running the following commands:

    helm repo add datadog https://helm.datadoghq.com
    helm install my-datadog-operator datadog/datadog-operator
    
  2. Create a Kubernetes Secret to store your Datadog API key and application key.

    kubectl create secret generic datadog-secret --from-literal api-key=<DATADOG_API_KEY> --from-literal app-key=<DATADOG_APP_KEY>
    
  3. Create a file, datadog-agent.yaml, that contains the following configuration:

    kind: DatadogAgent
    apiVersion: datadoghq.com/v2alpha1
    metadata:
      name: datadog
    spec:
      features:
        apm:
          enabled: true
          hostPortConfig:
            enabled: true
            hostPort: 8126
        admissionController:
          enabled: true
          mutateUnlabelled: false
      global:
        site: <DATADOG_SITE>
        credentials:
          apiSecret:
            secretName: datadog-secret
            keyName: api-key
          appSecret:
            secretName: datadog-secret
            keyName: app-key
      override:
        nodeAgent:
          image:
            tag: <DATADOG_AGENT_VERSION>
    

    Replace <DATADOG_SITE> with your Datadog site (for example, datadoghq.com).

    Replace <DATADOG_AGENT_VERSION> with version 7.55.0 or later.

  4. Deploy the Datadog Agent with the above configuration file:

    kubectl apply -f /path/to/your/datadog-agent.yaml
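
    To verify that the Agent is running, check the DatadogAgent resource and the Agent pods. A minimal sketch; the pod label below is the one the Datadog Operator typically applies and may vary by version:

    # Show the status of the DatadogAgent deployment defined above
    kubectl get datadogagent datadog
    # List the node Agent pods managed by the Operator
    kubectl get pods -l app.kubernetes.io/component=agent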
    
Helm

  1. Create a Kubernetes Secret to store your Datadog API key and application key.

    kubectl create secret generic datadog-secret --from-literal api-key=<DATADOG_API_KEY> --from-literal app-key=<DATADOG_APP_KEY>
    
  2. Create a file, datadog-values.yaml, that contains the following configuration:

    datadog:
      apiKeyExistingSecret: datadog-secret
      appKeyExistingSecret: datadog-secret
      site: <DATADOG_SITE>
      apm:
        portEnabled: true
        port: 8126
    
    agents:
      image:
        tag: <DATADOG_AGENT_VERSION>
    
    clusterAgent:
      admissionController:
        enabled: true
        mutateUnlabelled: false
    

    Replace <DATADOG_SITE> with your Datadog site (for example, datadoghq.com).

    Replace <DATADOG_AGENT_VERSION> with version 7.55.0 or later.

  3. Run the following command:

    helm install <RELEASE_NAME> \
     -f datadog-values.yaml \
     --set targetSystem=<TARGET_SYSTEM> \
     datadog/datadog
    
    • Replace <RELEASE_NAME> with your release name. For example, datadog-agent.

    • Replace <TARGET_SYSTEM> with the name of your OS. For example, linux or windows.
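
    To confirm the release deployed successfully, inspect its status and the resulting pods. A minimal sketch, assuming the release name datadog-agent; the pod label may vary by chart version:

    # Show the status of the Helm release
    helm status datadog-agent
    # List the Agent pods created by the chart
    # (the "app" label value defaults to the release name in the Datadog chart)
    kubectl get pods -l app=datadog-agent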

Inject Spark instrumentation

When you run your Spark job, use the following configurations:

spark.kubernetes.{driver,executor}.label.admission.datadoghq.com/enabled (Required)
    Set to true.

spark.kubernetes.{driver,executor}.annotation.admission.datadoghq.com/java-lib.version (Required)
    Set to latest.

spark.{driver,executor}.extraJavaOptions
    A space-separated string of the following Java options:

    • -Ddd.data.jobs.enabled=true (Required): Enables Data Jobs Monitoring.

    • -Ddd.service (Optional): Your service name. Because this option sets the job name in Datadog, use a human-readable name.

    • -Ddd.env (Optional): Your environment, such as prod or dev.

    • -Ddd.version (Optional): Your version.

    • -Ddd.tags (Optional): Other tags you wish to add, in the format <KEY_1>:<VALUE_1>,<KEY_2>:<VALUE_2>.

Example: spark-submit

spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://<CLUSTER_ENDPOINT> \
  --conf spark.kubernetes.container.image=895885662937.dkr.ecr.us-west-2.amazonaws.com/spark/emr-6.10.0:latest \
  --deploy-mode cluster \
  --conf spark.kubernetes.namespace=<NAMESPACE> \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=<SERVICE_ACCOUNT> \
  --conf spark.kubernetes.authenticate.executor.serviceAccountName=<SERVICE_ACCOUNT> \
  --conf spark.kubernetes.driver.label.admission.datadoghq.com/enabled=true \
  --conf spark.kubernetes.executor.label.admission.datadoghq.com/enabled=true \
  --conf spark.kubernetes.driver.annotation.admission.datadoghq.com/java-lib.version=latest \
  --conf spark.kubernetes.executor.annotation.admission.datadoghq.com/java-lib.version=latest \
  --conf spark.driver.extraJavaOptions="-Ddd.data.jobs.enabled=true -Ddd.service=<JOB_NAME> -Ddd.env=<ENV> -Ddd.version=<VERSION> -Ddd.tags=<KEY_1>:<VALUE_1>,<KEY_2>:<VALUE_2>" \
  --conf spark.executor.extraJavaOptions="-Ddd.data.jobs.enabled=true -Ddd.service=<JOB_NAME> -Ddd.env=<ENV> -Ddd.version=<VERSION> -Ddd.tags=<KEY_1>:<VALUE_1>,<KEY_2>:<VALUE_2>" \
  local:///usr/lib/spark/examples/jars/spark-examples.jar 20

Example: AWS start-job-run

aws emr-containers start-job-run \
--virtual-cluster-id <EMR_CLUSTER_ID> \
--name myjob \
--execution-role-arn <EXECUTION_ROLE_ARN> \
--release-label emr-6.10.0-latest \
--job-driver '{
  "sparkSubmitJobDriver": {
    "entryPoint": "s3://BUCKET/spark-examples.jar",
    "sparkSubmitParameters": "--class <MAIN_CLASS> --conf spark.kubernetes.driver.label.admission.datadoghq.com/enabled=true --conf spark.kubernetes.executor.label.admission.datadoghq.com/enabled=true --conf spark.kubernetes.driver.annotation.admission.datadoghq.com/java-lib.version=latest --conf spark.kubernetes.executor.annotation.admission.datadoghq.com/java-lib.version=latest --conf spark.driver.extraJavaOptions=\"-Ddd.data.jobs.enabled=true -Ddd.service=<JOB_NAME> -Ddd.env=<ENV> -Ddd.version=<VERSION> -Ddd.tags=<KEY_1>:<VALUE_1>,<KEY_2:VALUE_2>\"  --conf spark.executor.extraJavaOptions=\"-Ddd.data.jobs.enabled=true -Ddd.service=<JOB_NAME> -Ddd.env=<ENV> -Ddd.version=<VERSION> -Ddd.tags=<KEY_1>:<VALUE_1>,<KEY_2:VALUE_2>\""
  }
}

Validation

In Datadog, view the Data Jobs Monitoring page to see a list of all your data processing jobs.
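
If a job does not appear, you can check that the Admission Controller injected the Java tracer into the Spark driver pod. A minimal sketch; <SPARK_DRIVER_POD> and <NAMESPACE> are placeholders for your driver pod and namespace, and the injected init container name varies by tracer version:

# Library injection adds an init container to the driver pod;
# an empty result suggests the tracer was not injected
kubectl get pod <SPARK_DRIVER_POD> -n <NAMESPACE> -o jsonpath='{.spec.initContainers[*].name}'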

Advanced Configuration

Tag spans at runtime

You can set tags on Spark spans at runtime. These tags apply only to spans that start after the tag is added.

// Add a tag to all subsequent Spark computations
sparkContext.setLocalProperty("spark.datadog.tags.key", "value")
spark.read.parquet(...)

To remove a runtime tag:

// Remove the tag from all subsequent Spark computations
sparkContext.setLocalProperty("spark.datadog.tags.key", null)
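
Putting the two together, you can scope a tag to a single computation by setting it before and clearing it afterward. A minimal sketch, assuming a SparkSession named spark and a hypothetical input path:

// Tag only the spans produced by the ingest read
spark.sparkContext.setLocalProperty("spark.datadog.tags.stage", "ingest")
val raw = spark.read.parquet("s3://my-bucket/raw/") // hypothetical path

// Clear the tag so later computations are not tagged
spark.sparkContext.setLocalProperty("spark.datadog.tags.stage", null)
val total = raw.count() // this count carries no stage:ingest tag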

Further Reading

Additional helpful documentation, links, and articles:
