Overview

AWS Glue is a fully managed ETL (extract, transform, and load) service that makes it simple and cost-effective to categorize, clean, enrich, and reliably move data between various data stores.

Enable this integration to see all your Glue metrics in Datadog.



Setup

If you haven't already, set up the Amazon Web Services integration first.

Metric collection

  1. On the AWS integration page, make sure that Glue is enabled under the Metric Collection tab.
  2. Install the Datadog - AWS Glue integration.


Enable logging

Configure AWS Glue to send logs to either an S3 bucket or CloudWatch.

Note: If you log to an S3 bucket, make sure that amazon_glue is set as the Target prefix.
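
As one illustration of the CloudWatch option, the sketch below builds a set of Glue job arguments that turn on job metrics and continuous CloudWatch logging. It assumes the special job parameters `--enable-metrics` and `--enable-continuous-cloudwatch-log` described in the AWS Glue documentation; the helper name `glue_logging_arguments` is hypothetical and not part of this integration.

```python
# Minimal sketch, not an official configuration.
# glue_logging_arguments is a hypothetical helper name. The two keys are
# special AWS Glue job parameters (assumed here, check the AWS docs for
# your Glue version).

def glue_logging_arguments(enable_metrics=True, continuous_logs=True):
    """Build a dict suitable for a Glue job's default arguments."""
    args = {}
    if enable_metrics:
        # Publishes job profiling metrics to CloudWatch.
        args["--enable-metrics"] = "true"
    if continuous_logs:
        # Streams driver and executor logs to CloudWatch while the job runs.
        args["--enable-continuous-cloudwatch-log"] = "true"
    return args


if __name__ == "__main__":
    print(glue_logging_arguments())
```

The resulting dictionary would typically be supplied as the job's default arguments when creating or updating the job, for example in the AWS console or through an infrastructure-as-code tool.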

Send logs to Datadog

  1. If you haven't already, set up the Datadog Forwarder Lambda function.

  2. Once the Lambda function is installed, manually add a trigger in the AWS console on the S3 bucket or CloudWatch log group that contains your AWS Glue logs.
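
The trigger from step 2 can also be described programmatically. The sketch below only builds the request payloads and does not call AWS; the helper names, the default filter name, and the example ARN are hypothetical. It assumes the payload shapes accepted by the boto3 calls `put_bucket_notification_configuration` (S3) and `put_subscription_filter` (CloudWatch Logs).

```python
# Minimal sketch under stated assumptions; helper names are hypothetical.

def s3_trigger_config(forwarder_arn, prefix="amazon_glue"):
    """Notification config that invokes the Forwarder for new objects under prefix."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": forwarder_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }


def cloudwatch_subscription(log_group, forwarder_arn, filter_name="datadog-forwarder"):
    """Keyword arguments for a subscription filter that forwards every log event."""
    return {
        "logGroupName": log_group,
        "filterName": filter_name,      # hypothetical default name
        "filterPattern": "",            # empty pattern matches all events
        "destinationArn": forwarder_arn,
    }


if __name__ == "__main__":
    # Placeholder ARN for illustration only.
    arn = "arn:aws:lambda:us-east-1:123456789012:function:datadog-forwarder"
    print(s3_trigger_config(arn))
    print(cloudwatch_subscription("/aws-glue/jobs/output", arn))
    # With boto3 installed and credentials configured, these could be sent as:
    #   boto3.client("s3").put_bucket_notification_configuration(
    #       Bucket="my-bucket", NotificationConfiguration=s3_trigger_config(arn))
    #   boto3.client("logs").put_subscription_filter(
    #       **cloudwatch_subscription("/aws-glue/jobs/output", arn))
```

In either case the Lambda function also needs a resource-based permission that allows S3 or CloudWatch Logs to invoke it; adding the trigger through the AWS console sets this up for you.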

Data collected

Metrics


The number of actively running job executors.
The maximum number of (actively running and pending) job executors needed to satisfy the current load.
The average fraction of memory used by the JVM heap (scale: 0-1) across all executors.
Shown as percent
The number of memory bytes used by the JVM heap for all executors.
Shown as byte
The average number of bytes read from Amazon S3 by all executors since the previous report.
The average fraction of CPU system load used (scale: 0-1) by all executors.
Shown as percent
The number of bytes read from all data sources by all completed Spark tasks running in all executors.
Shown as byte
The ETL elapsed time in milliseconds (does not include the job bootstrap times).
Shown as millisecond
The number of completed stages in the job.
The number of completed tasks in the job.
The number of failed tasks.
The number of tasks killed.
The number of records read from all data sources by all completed Spark tasks running in all executors.
The number of bytes written by all executors to shuffle data between them since the previous report.
The number of bytes read by all executors to shuffle data between them since the previous report.
The average number of megabytes of disk space used across all executors.
The average fraction of memory used by the JVM heap (scale: 0-1) for the driver.
Shown as percent
The number of memory bytes used by the JVM heap for the driver.
Shown as byte
The average number of bytes read from Amazon S3 by the driver since the previous report.
The average number of bytes written to Amazon S3 by the driver since the previous report.
The average fraction of CPU system load used (scale: 0-1) by the driver.
Shown as percent
The average fraction of memory used by the JVM heap (scale: 0-1) for the identified executor.
Shown as percent
The number of memory bytes used by the JVM heap for the identified executor.
Shown as byte
The average fraction of CPU system load used (scale: 0-1) by the identified executor.
Shown as percent
The average number of bytes read from Amazon S3 by the identified executor since the previous report.
The average number of bytes written to Amazon S3 by the identified executor since the previous report.
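
As an example of how two of these metrics can be read together, the sketch below compares the number of actively running executors with the maximum number needed for the current load. The function names are hypothetical, and the interpretation (a ratio above 1 suggesting the job could use more executors) is an assumption for illustration, not a rule defined by this integration.

```python
# Minimal sketch; function names and the sample values are hypothetical.

def executor_shortfall(running, max_needed):
    """How many more executors the current load asks for than are running."""
    return max(0, max_needed - running)


def scale_factor(running, max_needed):
    """Ratio of needed to running executors; above 1 hints at under-provisioning."""
    if running <= 0:
        raise ValueError("running must be a positive number of executors")
    return max_needed / running


if __name__ == "__main__":
    # Example values only: 10 executors running, 25 needed for the load.
    print(executor_shortfall(10, 25))
    print(scale_factor(10, 25))
```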


Events

The AWS Glue integration does not include any events.

Service checks

The AWS Glue integration does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.
