gcp_dataplex_task

ancestors

Type: UNORDERED_LIST_STRING

create_time

Type: TIMESTAMP
Provider name: createTime
Description: Output only. The time when the task was created.

description

Type: STRING
Provider name: description
Description: Optional. Description of the task.

execution_spec

Type: STRUCT
Provider name: executionSpec
Description: Required. Spec related to how a task is executed.

  • kms_key
    Type: STRING
    Provider name: kmsKey
    Description: Optional. The Cloud KMS key to use for encryption, of the form: projects/{project_number}/locations/{location_id}/keyRings/{key-ring-name}/cryptoKeys/{key-name}.
  • max_job_execution_lifetime
    Type: STRING
    Provider name: maxJobExecutionLifetime
    Description: Optional. The maximum duration after which the job execution is expired.
  • project
    Type: STRING
    Provider name: project
    Description: Optional. The project in which jobs are run. By default, the project containing the Lake is used. If a project is provided, the ExecutionSpec.service_account must belong to this project.
  • service_account
    Type: STRING
    Provider name: serviceAccount
    Description: Required. Service account to use to execute a task. If not provided, the default Compute service account for the project is used.
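
The documented form of kms_key can be checked before a value is compared or stored. The following is a minimal sketch, assuming each path segment is any non-empty string without a slash; the function name parse_kms_key and the pattern are illustrative and not part of any Dataplex or Datadog API.

```python
import re

# Mirrors the documented form:
# projects/{project_number}/locations/{location_id}/keyRings/{key-ring-name}/cryptoKeys/{key-name}
# Assumption: every segment is a non-empty string that contains no "/".
KMS_KEY_PATTERN = re.compile(
    r"^projects/(?P<project_number>[^/]+)"
    r"/locations/(?P<location_id>[^/]+)"
    r"/keyRings/(?P<key_ring_name>[^/]+)"
    r"/cryptoKeys/(?P<key_name>[^/]+)$"
)


def parse_kms_key(kms_key: str) -> dict:
    """Split a kms_key value into its named segments (hypothetical helper)."""
    match = KMS_KEY_PATTERN.match(kms_key)
    if match is None:
        raise ValueError(f"kms_key does not match the documented form: {kms_key!r}")
    return match.groupdict()
```

For example, parse_kms_key("projects/123/locations/us-central1/keyRings/ring/cryptoKeys/key") returns a dictionary with the four segments, and a value that is missing a segment raises ValueError.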

execution_status

Type: STRUCT
Provider name: executionStatus
Description: Output only. Status of the latest task executions.

  • latest_job
    Type: STRUCT
    Provider name: latestJob
    Description: Output only. The latest job execution.
    • end_time
      Type: TIMESTAMP
      Provider name: endTime
      Description: Output only. The time when the job ended.
    • execution_spec
      Type: STRUCT
      Provider name: executionSpec
      Description: Output only. Spec related to how a task is executed.
      • kms_key
        Type: STRING
        Provider name: kmsKey
        Description: Optional. The Cloud KMS key to use for encryption, of the form: projects/{project_number}/locations/{location_id}/keyRings/{key-ring-name}/cryptoKeys/{key-name}.
      • max_job_execution_lifetime
        Type: STRING
        Provider name: maxJobExecutionLifetime
        Description: Optional. The maximum duration after which the job execution is expired.
      • project
        Type: STRING
        Provider name: project
        Description: Optional. The project in which jobs are run. By default, the project containing the Lake is used. If a project is provided, the ExecutionSpec.service_account must belong to this project.
      • service_account
        Type: STRING
        Provider name: serviceAccount
        Description: Required. Service account to use to execute a task. If not provided, the default Compute service account for the project is used.
    • message
      Type: STRING
      Provider name: message
      Description: Output only. Additional information about the current state.
    • name
      Type: STRING
      Provider name: name
      Description: Output only. The relative resource name of the job, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/tasks/{task_id}/jobs/{job_id}.
    • retry_count
      Type: INT32
      Provider name: retryCount
      Description: Output only. The number of times the job has been retried (excluding the initial attempt).
    • service
      Type: STRING
      Provider name: service
      Description: Output only. The underlying service running a job.
      Possible values:
      • SERVICE_UNSPECIFIED - Service used to run the job is unspecified.
      • DATAPROC - Dataproc service is used to run this job.
    • service_job
      Type: STRING
      Provider name: serviceJob
      Description: Output only. The full resource name for the job run under a particular service.
    • start_time
      Type: TIMESTAMP
      Provider name: startTime
      Description: Output only. The time when the job was started.
    • state
      Type: STRING
      Provider name: state
      Description: Output only. Execution state for the job.
      Possible values:
      • STATE_UNSPECIFIED - The job state is unknown.
      • RUNNING - The job is running.
      • CANCELLING - The job is cancelling.
      • CANCELLED - The job cancellation was successful.
      • SUCCEEDED - The job completed successfully.
      • FAILED - The job is no longer running due to an error.
      • ABORTED - The job was cancelled outside of Dataplex.
    • trigger
      Type: STRING
      Provider name: trigger
      Description: Output only. Job execution trigger.
      Possible values:
      • TRIGGER_UNSPECIFIED - The trigger is unspecified.
      • TASK_CONFIG - The job was triggered by Dataplex based on trigger spec from task definition.
      • RUN_REQUEST - The job was triggered by the explicit call of Task API.
    • uid
      Type: STRING
      Provider name: uid
      Description: Output only. System generated globally unique ID for the job.
  • update_time
    Type: TIMESTAMP
    Provider name: updateTime
    Description: Output only. Last update time of the status.
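
When reading latest_job.state, it is often useful to know whether the job can still change. The sketch below groups the documented state values into finished and unfinished sets. The grouping is an interpretation of the descriptions above, not something the API states explicitly, and is_job_finished is a hypothetical helper name.

```python
# Assumption: these four states are final, based on their descriptions above.
TERMINAL_STATES = {"CANCELLED", "SUCCEEDED", "FAILED", "ABORTED"}
# Assumption: these states mean the job may still transition to another state.
NON_TERMINAL_STATES = {"STATE_UNSPECIFIED", "RUNNING", "CANCELLING"}


def is_job_finished(state: str) -> bool:
    """Return True if the job state is one of the states treated as final."""
    if state in TERMINAL_STATES:
        return True
    if state in NON_TERMINAL_STATES:
        return False
    # Reject values that are not in the documented list.
    raise ValueError(f"unknown job state: {state!r}")
```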

gcp_display_name

Type: STRING
Provider name: displayName
Description: Optional. User friendly display name.

labels

Type: UNORDERED_LIST_STRING

name

Type: STRING
Provider name: name
Description: Output only. The relative resource name of the task, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/tasks/{task_id}.

notebook

Type: STRUCT
Provider name: notebook
Description: Config related to running scheduled Notebooks.

  • archive_uris
    Type: UNORDERED_LIST_STRING
    Provider name: archiveUris
    Description: Optional. Cloud Storage URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
  • file_uris
    Type: UNORDERED_LIST_STRING
    Provider name: fileUris
    Description: Optional. Cloud Storage URIs of files to be placed in the working directory of each executor.
  • infrastructure_spec
    Type: STRUCT
    Provider name: infrastructureSpec
    Description: Optional. Infrastructure specification for the execution.
    • batch
      Type: STRUCT
      Provider name: batch
      Description: Compute resources needed for a Task when using Dataproc Serverless.
      • executors_count
        Type: INT32
        Provider name: executorsCount
        Description: Optional. Total number of job executors. Executor Count should be between 2 and 100. Default=2
      • max_executors_count
        Type: INT32
        Provider name: maxExecutorsCount
        Description: Optional. Max configurable executors. If max_executors_count > executors_count, then auto-scaling is enabled. Max Executor Count should be between 2 and 1000. Default=1000
    • container_image
      Type: STRUCT
      Provider name: containerImage
      Description: Container Image Runtime Configuration.
      • image
        Type: STRING
        Provider name: image
        Description: Optional. Container image to use.
      • java_jars
        Type: UNORDERED_LIST_STRING
        Provider name: javaJars
        Description: Optional. A list of Java JARs to add to the classpath. Valid input includes Cloud Storage URIs to JAR binaries. For example, gs://bucket-name/my/path/to/file.jar
      • python_packages
        Type: UNORDERED_LIST_STRING
        Provider name: pythonPackages
        Description: Optional. A list of Python packages to be installed. Valid formats include a Cloud Storage URI to a pip-installable library. For example, gs://bucket-name/my/path/to/lib.tar.gz
    • vpc_network
      Type: STRUCT
      Provider name: vpcNetwork
      Description: VPC network.
      • network
        Type: STRING
        Provider name: network
        Description: Optional. The Cloud VPC network in which the job is run. By default, the Cloud VPC network named Default within the project is used.
      • network_tags
        Type: UNORDERED_LIST_STRING
        Provider name: networkTags
        Description: Optional. List of network tags to apply to the job.
      • sub_network
        Type: STRING
        Provider name: subNetwork
        Description: Optional. The Cloud VPC sub-network in which the job is run.
  • notebook
    Type: STRING
    Provider name: notebook
    Description: Required. Path to input notebook. This can be the Cloud Storage URI of the notebook file or the path to a Notebook Content. The execution args are accessible as environment variables (TASK_key=value).
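
The limits described for infrastructure_spec.batch (executors_count between 2 and 100, max_executors_count between 2 and 1000, auto-scaling when the maximum exceeds the count) can be expressed as a small check. This is a sketch under those stated limits only; check_batch is a hypothetical helper, and the service may apply additional validation not shown here.

```python
def check_batch(executors_count: int = 2, max_executors_count: int = 1000) -> bool:
    """Validate the documented ranges and report whether auto-scaling is enabled.

    Defaults follow the documented defaults (2 and 1000).
    """
    if not 2 <= executors_count <= 100:
        raise ValueError("executors_count should be between 2 and 100")
    if not 2 <= max_executors_count <= 1000:
        raise ValueError("max_executors_count should be between 2 and 1000")
    # Auto-scaling is enabled when max_executors_count > executors_count.
    return max_executors_count > executors_count
```

With the documented defaults, check_batch() reports that auto-scaling is enabled, while check_batch(10, 10) reports that it is not.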

organization_id

Type: STRING

parent

Type: STRING

project_id

Type: STRING

project_number

Type: STRING

resource_name

Type: STRING

spark

Type: STRUCT
Provider name: spark
Description: Config related to running custom Spark tasks.

  • archive_uris
    Type: UNORDERED_LIST_STRING
    Provider name: archiveUris
    Description: Optional. Cloud Storage URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
  • file_uris
    Type: UNORDERED_LIST_STRING
    Provider name: fileUris
    Description: Optional. Cloud Storage URIs of files to be placed in the working directory of each executor.
  • infrastructure_spec
    Type: STRUCT
    Provider name: infrastructureSpec
    Description: Optional. Infrastructure specification for the execution.
    • batch
      Type: STRUCT
      Provider name: batch
      Description: Compute resources needed for a Task when using Dataproc Serverless.
      • executors_count
        Type: INT32
        Provider name: executorsCount
        Description: Optional. Total number of job executors. Executor Count should be between 2 and 100. Default=2
      • max_executors_count
        Type: INT32
        Provider name: maxExecutorsCount
        Description: Optional. Max configurable executors. If max_executors_count > executors_count, then auto-scaling is enabled. Max Executor Count should be between 2 and 1000. Default=1000
    • container_image
      Type: STRUCT
      Provider name: containerImage
      Description: Container Image Runtime Configuration.
      • image
        Type: STRING
        Provider name: image
        Description: Optional. Container image to use.
      • java_jars
        Type: UNORDERED_LIST_STRING
        Provider name: javaJars
        Description: Optional. A list of Java JARs to add to the classpath. Valid input includes Cloud Storage URIs to JAR binaries. For example, gs://bucket-name/my/path/to/file.jar
      • python_packages
        Type: UNORDERED_LIST_STRING
        Provider name: pythonPackages
        Description: Optional. A list of Python packages to be installed. Valid formats include a Cloud Storage URI to a pip-installable library. For example, gs://bucket-name/my/path/to/lib.tar.gz
    • vpc_network
      Type: STRUCT
      Provider name: vpcNetwork
      Description: VPC network.
      • network
        Type: STRING
        Provider name: network
        Description: Optional. The Cloud VPC network in which the job is run. By default, the Cloud VPC network named Default within the project is used.
      • network_tags
        Type: UNORDERED_LIST_STRING
        Provider name: networkTags
        Description: Optional. List of network tags to apply to the job.
      • sub_network
        Type: STRING
        Provider name: subNetwork
        Description: Optional. The Cloud VPC sub-network in which the job is run.
  • main_class
    Type: STRING
    Provider name: mainClass
    Description: The name of the driver’s main class. The JAR file that contains the class must be in the default CLASSPATH or specified in jar_file_uris. The execution args are passed in as a sequence of named process arguments (--key=value).
  • main_jar_file_uri
    Type: STRING
    Provider name: mainJarFileUri
    Description: The Cloud Storage URI of the JAR file that contains the main class. The execution args are passed in as a sequence of named process arguments (--key=value).
  • python_script_file
    Type: STRING
    Provider name: pythonScriptFile
    Description: The Cloud Storage URI of the main Python file to use as the driver. Must be a .py file. The execution args are passed in as a sequence of named process arguments (--key=value).
  • sql_script
    Type: STRING
    Provider name: sqlScript
    Description: The query text. The execution args are used to declare a set of script variables (set key="value";).
  • sql_script_file
    Type: STRING
    Provider name: sqlScriptFile
    Description: A reference to a query file. This should be the Cloud Storage URI of the query file. The execution args are used to declare a set of script variables (set key="value";).
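
The spark fields describe two shapes for execution args: named process arguments for JAR and Python drivers, and script variable declarations for SQL scripts. The sketch below only illustrates those two textual shapes. The helper names are hypothetical, and how the service quotes or escapes values is not documented here, so no escaping is attempted.

```python
from typing import Dict, List


def as_process_args(args: Dict[str, str]) -> List[str]:
    """Render args in the --key=value shape used for main_class,
    main_jar_file_uri, and python_script_file."""
    return [f"--{key}={value}" for key, value in args.items()]


def as_sql_variables(args: Dict[str, str]) -> List[str]:
    """Render args in the set key="value"; shape used for sql_script
    and sql_script_file. Assumption: values are inserted without escaping."""
    return [f'set {key}="{value}";' for key, value in args.items()]
```

For an args mapping such as {"date": "2024-01-01"}, the first helper yields --date=2024-01-01 and the second yields set date="2024-01-01";.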

state

Type: STRING
Provider name: state
Description: Output only. Current state of the task.
Possible values:

  • STATE_UNSPECIFIED - State is not specified.
  • ACTIVE - Resource is active, i.e., ready to use.
  • CREATING - Resource is under creation.
  • DELETING - Resource is under deletion.
  • ACTION_REQUIRED - Resource is active but has unresolved actions.

tags

Type: UNORDERED_LIST_STRING

trigger_spec

Type: STRUCT
Provider name: triggerSpec
Description: Required. Spec related to how often and when a task should be triggered.

  • disabled
    Type: BOOLEAN
    Provider name: disabled
    Description: Optional. Prevent the task from executing. This does not cancel already running tasks. It is intended to temporarily disable RECURRING tasks.
  • max_retries
    Type: INT32
    Provider name: maxRetries
    Description: Optional. Number of retry attempts before aborting. Set to zero to never attempt to retry a failed task.
  • schedule
    Type: STRING
    Provider name: schedule
    Description: Optional. Cron schedule (https://en.wikipedia.org/wiki/Cron) for running tasks periodically. To explicitly set a time zone for the cron schedule, prefix it with "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}". ${IANA_TIME_ZONE} must be a valid string from the IANA time zone database. For example, CRON_TZ=America/New_York 1 * * * *, or TZ=America/New_York 1 * * * *. This field is required for RECURRING tasks.
  • start_time
    Type: TIMESTAMP
    Provider name: startTime
    Description: Optional. The first run of the task will be after this time. If not specified, the task will run shortly after being submitted if ON_DEMAND and based on the schedule if RECURRING.
  • type
    Type: STRING
    Provider name: type
    Description: Required. Immutable. Trigger type of the user-specified Task.
    Possible values:
    • TYPE_UNSPECIFIED - Unspecified trigger type.
    • ON_DEMAND - The task runs once, shortly after task creation.
    • RECURRING - The task is scheduled to run periodically.
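
A schedule value may carry an optional CRON_TZ= or TZ= prefix ahead of the cron expression. The following minimal sketch separates the two parts. It assumes the prefix, when present, is the first space-delimited token, and it does not check the time zone against the IANA database; split_schedule is a hypothetical helper name.

```python
from typing import Optional, Tuple


def split_schedule(schedule: str) -> Tuple[Optional[str], str]:
    """Return (time_zone, cron_expression); time_zone is None without a prefix."""
    first, _, rest = schedule.partition(" ")
    for prefix in ("CRON_TZ=", "TZ="):
        if first.startswith(prefix):
            # Strip the prefix to get the zone name; the remainder is the cron expression.
            return first[len(prefix):], rest.strip()
    # No time zone prefix: the whole value is the cron expression.
    return None, schedule.strip()
```

Using the example values from the schedule description, both CRON_TZ=America/New_York 1 * * * * and TZ=America/New_York 1 * * * * split into the zone America/New_York and the expression 1 * * * *.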

uid

Type: STRING
Provider name: uid
Description: Output only. System generated globally unique ID for the task. This ID will be different if the task is deleted and re-created with the same name.

update_time

Type: TIMESTAMP
Provider name: updateTime
Description: Output only. The time when the task was last updated.
