gcp_dataplex_task

ancestors

Type: UNORDERED_LIST_STRING

create_time

Type: TIMESTAMP
Provider name: createTime
Description: Output only. The time when the task was created.

description

Type: STRING
Provider name: description
Description: Optional. Description of the task.

execution_spec

Type: STRUCT
Provider name: executionSpec
Description: Required. Spec related to how a task is executed.

  • kms_key
    Type: STRING
    Provider name: kmsKey
    Description: Optional. The Cloud KMS key to use for encryption, of the form: projects/{project_number}/locations/{location_id}/keyRings/{key-ring-name}/cryptoKeys/{key-name}.
  • max_job_execution_lifetime
    Type: STRING
    Provider name: maxJobExecutionLifetime
    Description: Optional. The maximum duration after which the job execution is expired.
  • project
    Type: STRING
    Provider name: project
    Description: Optional. The project in which jobs are run. By default, the project containing the Lake is used. If a project is provided, the ExecutionSpec.service_account must belong to this project.
  • service_account
    Type: STRING
    Provider name: serviceAccount
    Description: Required. Service account to use to execute a task. If not provided, the default Compute service account for the project is used.

execution_status

Type: STRUCT
Provider name: executionStatus
Description: Output only. Status of the latest task executions.

  • latest_job
    Type: STRUCT
    Provider name: latestJob
    Description: Output only. The latest job execution.
    • end_time
      Type: TIMESTAMP
      Provider name: endTime
      Description: Output only. The time when the job ended.
    • execution_spec
      Type: STRUCT
      Provider name: executionSpec
      Description: Output only. Spec related to how a task is executed.
      • kms_key
        Type: STRING
        Provider name: kmsKey
        Description: Optional. The Cloud KMS key to use for encryption, of the form: projects/{project_number}/locations/{location_id}/keyRings/{key-ring-name}/cryptoKeys/{key-name}.
      • max_job_execution_lifetime
        Type: STRING
        Provider name: maxJobExecutionLifetime
        Description: Optional. The maximum duration after which the job execution is expired.
      • project
        Type: STRING
        Provider name: project
        Description: Optional. The project in which jobs are run. By default, the project containing the Lake is used. If a project is provided, the ExecutionSpec.service_account must belong to this project.
      • service_account
        Type: STRING
        Provider name: serviceAccount
        Description: Required. Service account to use to execute a task. If not provided, the default Compute service account for the project is used.
    • message
      Type: STRING
      Provider name: message
      Description: Output only. Additional information about the current state.
    • name
      Type: STRING
      Provider name: name
      Description: Output only. The relative resource name of the job, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/tasks/{task_id}/jobs/{job_id}.
    • retry_count
      Type: INT32
      Provider name: retryCount
      Description: Output only. The number of times the job has been retried (excluding the initial attempt).
    • service
      Type: STRING
      Provider name: service
      Description: Output only. The underlying service running a job.
      Possible values:
      • SERVICE_UNSPECIFIED - Service used to run the job is unspecified.
      • DATAPROC - Dataproc service is used to run this job.
    • service_job
      Type: STRING
      Provider name: serviceJob
      Description: Output only. The full resource name for the job run under a particular service.
    • start_time
      Type: TIMESTAMP
      Provider name: startTime
      Description: Output only. The time when the job was started.
    • state
      Type: STRING
      Provider name: state
      Description: Output only. Execution state for the job.
      Possible values:
      • STATE_UNSPECIFIED - The job state is unknown.
      • RUNNING - The job is running.
      • CANCELLING - The job is cancelling.
      • CANCELLED - The job cancellation was successful.
      • SUCCEEDED - The job completed successfully.
      • FAILED - The job is no longer running due to an error.
      • ABORTED - The job was cancelled outside of Dataplex.
    • trigger
      Type: STRING
      Provider name: trigger
      Description: Output only. Job execution trigger.
      Possible values:
      • TRIGGER_UNSPECIFIED - The trigger is unspecified.
      • TASK_CONFIG - The job was triggered by Dataplex based on trigger spec from task definition.
      • RUN_REQUEST - The job was triggered by the explicit call of Task API.
    • uid
      Type: STRING
      Provider name: uid
      Description: Output only. System generated globally unique ID for the job.
  • update_time
    Type: TIMESTAMP
    Provider name: updateTime
    Description: Output only. Last update time of the status.

gcp_display_name

Type: STRING
Provider name: displayName
Description: Optional. User friendly display name.

labels

Type: UNORDERED_LIST_STRING

name

Type: STRING
Provider name: name
Description: Output only. The relative resource name of the task, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/tasks/{task_id}.
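
The resource name form above can be split into its components with a small parser. This is a minimal sketch, not part of any Dataplex client library: the function name parse_task_name and the regular expression are hypothetical and assume only the documented form projects/{project_number}/locations/{location_id}/lakes/{lake_id}/tasks/{task_id}.

```python
import re

# Pattern derived from the documented task resource name form.
# Assumption: no component contains a "/" character.
TASK_NAME_RE = re.compile(
    r"^projects/(?P<project_number>[^/]+)"
    r"/locations/(?P<location_id>[^/]+)"
    r"/lakes/(?P<lake_id>[^/]+)"
    r"/tasks/(?P<task_id>[^/]+)$"
)


def parse_task_name(name: str) -> dict:
    """Split a task resource name into its four named components."""
    match = TASK_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a Dataplex task resource name: {name!r}")
    return match.groupdict()
```

For example, parse_task_name("projects/123/locations/us-central1/lakes/my-lake/tasks/my-task") yields a dictionary keyed by project_number, location_id, lake_id, and task_id.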

notebook

Type: STRUCT
Provider name: notebook
Description: Config related to running scheduled Notebooks.

  • archive_uris
    Type: UNORDERED_LIST_STRING
    Provider name: archiveUris
    Description: Optional. Cloud Storage URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
  • file_uris
    Type: UNORDERED_LIST_STRING
    Provider name: fileUris
    Description: Optional. Cloud Storage URIs of files to be placed in the working directory of each executor.
  • infrastructure_spec
    Type: STRUCT
    Provider name: infrastructureSpec
    Description: Optional. Infrastructure specification for the execution.
    • batch
      Type: STRUCT
      Provider name: batch
      Description: Compute resources needed for a Task when using Dataproc Serverless.
      • executors_count
        Type: INT32
        Provider name: executorsCount
        Description: Optional. Total number of job executors. Executor Count should be between 2 and 100. Default=2
      • max_executors_count
        Type: INT32
        Provider name: maxExecutorsCount
        Description: Optional. Max configurable executors. If max_executors_count > executors_count, then auto-scaling is enabled. Max Executor Count should be between 2 and 1000. Default=1000
    • container_image
      Type: STRUCT
      Provider name: containerImage
      Description: Container Image Runtime Configuration.
      • image
        Type: STRING
        Provider name: image
        Description: Optional. Container image to use.
      • java_jars
        Type: UNORDERED_LIST_STRING
        Provider name: javaJars
        Description: Optional. A list of Java JARs to add to the classpath. Valid input includes Cloud Storage URIs to JAR binaries. For example, gs://bucket-name/my/path/to/file.jar
      • python_packages
        Type: UNORDERED_LIST_STRING
        Provider name: pythonPackages
        Description: Optional. A list of Python packages to be installed. Valid formats include a Cloud Storage URI to a pip-installable library. For example, gs://bucket-name/my/path/to/lib.tar.gz
    • vpc_network
      Type: STRUCT
      Provider name: vpcNetwork
      Description: VPC network.
      • network
        Type: STRING
        Provider name: network
        Description: Optional. The Cloud VPC network in which the job is run. By default, the Cloud VPC network named Default within the project is used.
      • network_tags
        Type: UNORDERED_LIST_STRING
        Provider name: networkTags
        Description: Optional. List of network tags to apply to the job.
      • sub_network
        Type: STRING
        Provider name: subNetwork
        Description: Optional. The Cloud VPC sub-network in which the job is run.
  • notebook
    Type: STRING
    Provider name: notebook
    Description: Required. Path to input notebook. This can be the Cloud Storage URI of the notebook file or the path to a Notebook Content. The execution args are accessible as environment variables (TASK_key=value).
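
The executor limits in infrastructure_spec.batch can be checked before a task is submitted. The following is a minimal sketch under stated assumptions: the function batch_autoscaling_enabled is hypothetical, the ranges and defaults are taken from the field descriptions above, and the case where max_executors_count is lower than executors_count is not documented here, so the sketch simply reports that auto-scaling is off.

```python
def batch_autoscaling_enabled(executors_count: int = 2,
                              max_executors_count: int = 1000) -> bool:
    """Validate batch executor settings and report whether auto-scaling applies.

    Ranges and defaults follow the field descriptions:
    executors_count in [2, 100], max_executors_count in [2, 1000].
    """
    if not 2 <= executors_count <= 100:
        raise ValueError("executors_count should be between 2 and 100")
    if not 2 <= max_executors_count <= 1000:
        raise ValueError("max_executors_count should be between 2 and 1000")
    # Auto-scaling is described as enabled only when the maximum exceeds
    # the configured executor count.
    return max_executors_count > executors_count
```

With the documented defaults (2 and 1000) the function returns True; setting both values to the same number returns False.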

organization_id

Type: STRING

parent

Type: STRING

project_id

Type: STRING

project_number

Type: STRING

resource_name

Type: STRING

spark

Type: STRUCT
Provider name: spark
Description: Config related to running custom Spark tasks.

  • archive_uris
    Type: UNORDERED_LIST_STRING
    Provider name: archiveUris
    Description: Optional. Cloud Storage URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
  • file_uris
    Type: UNORDERED_LIST_STRING
    Provider name: fileUris
    Description: Optional. Cloud Storage URIs of files to be placed in the working directory of each executor.
  • infrastructure_spec
    Type: STRUCT
    Provider name: infrastructureSpec
    Description: Optional. Infrastructure specification for the execution.
    • batch
      Type: STRUCT
      Provider name: batch
      Description: Compute resources needed for a Task when using Dataproc Serverless.
      • executors_count
        Type: INT32
        Provider name: executorsCount
        Description: Optional. Total number of job executors. Executor Count should be between 2 and 100. Default=2
      • max_executors_count
        Type: INT32
        Provider name: maxExecutorsCount
        Description: Optional. Max configurable executors. If max_executors_count > executors_count, then auto-scaling is enabled. Max Executor Count should be between 2 and 1000. Default=1000
    • container_image
      Type: STRUCT
      Provider name: containerImage
      Description: Container Image Runtime Configuration.
      • image
        Type: STRING
        Provider name: image
        Description: Optional. Container image to use.
      • java_jars
        Type: UNORDERED_LIST_STRING
        Provider name: javaJars
        Description: Optional. A list of Java JARs to add to the classpath. Valid input includes Cloud Storage URIs to JAR binaries. For example, gs://bucket-name/my/path/to/file.jar
      • python_packages
        Type: UNORDERED_LIST_STRING
        Provider name: pythonPackages
        Description: Optional. A list of Python packages to be installed. Valid formats include a Cloud Storage URI to a pip-installable library. For example, gs://bucket-name/my/path/to/lib.tar.gz
    • vpc_network
      Type: STRUCT
      Provider name: vpcNetwork
      Description: VPC network.
      • network
        Type: STRING
        Provider name: network
        Description: Optional. The Cloud VPC network in which the job is run. By default, the Cloud VPC network named Default within the project is used.
      • network_tags
        Type: UNORDERED_LIST_STRING
        Provider name: networkTags
        Description: Optional. List of network tags to apply to the job.
      • sub_network
        Type: STRING
        Provider name: subNetwork
        Description: Optional. The Cloud VPC sub-network in which the job is run.
  • main_class
    Type: STRING
    Provider name: mainClass
    Description: The name of the driver's main class. The jar file that contains the class must be in the default CLASSPATH or specified in jar_file_uris. The execution args are passed in as a sequence of named process arguments (--key=value).
  • main_jar_file_uri
    Type: STRING
    Provider name: mainJarFileUri
    Description: The Cloud Storage URI of the jar file that contains the main class. The execution args are passed in as a sequence of named process arguments (--key=value).
  • python_script_file
    Type: STRING
    Provider name: pythonScriptFile
    Description: The Cloud Storage URI of the main Python file to use as the driver. Must be a .py file. The execution args are passed in as a sequence of named process arguments (--key=value).
  • sql_script
    Type: STRING
    Provider name: sqlScript
    Description: The query text. The execution args are used to declare a set of script variables (set key="value";).
  • sql_script_file
    Type: STRING
    Provider name: sqlScriptFile
    Description: A reference to a query file. This should be the Cloud Storage URI of the query file. The execution args are used to declare a set of script variables (set key="value";).
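
The three ways execution args reach a task, as named process arguments (--key=value) for JAR and Python drivers, as script variables (set key="value";) for SQL, and as environment variables (TASK_key=value) for notebooks, can be illustrated with a short sketch. The helper names below are hypothetical, and the quoting or escaping that the service applies to values is not documented here, so values are rendered verbatim.

```python
def spark_process_args(args: dict) -> list:
    """Render args as named process arguments: --key=value."""
    return [f"--{key}={value}" for key, value in args.items()]


def sql_script_variables(args: dict) -> list:
    """Render args as SQL script variable declarations: set key="value";"""
    return [f'set {key}="{value}";' for key, value in args.items()]


def notebook_environment(args: dict) -> dict:
    """Render args as notebook environment variables: TASK_key=value."""
    return {f"TASK_{key}": value for key, value in args.items()}
```

Given args = {"input": "gs://bucket-name/in"}, the three helpers produce ["--input=gs://bucket-name/in"], ['set input="gs://bucket-name/in";'], and {"TASK_input": "gs://bucket-name/in"} respectively.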

state

Type: STRING
Provider name: state
Description: Output only. Current state of the task.
Possible values:

  • STATE_UNSPECIFIED - State is not specified.
  • ACTIVE - Resource is active, i.e., ready to use.
  • CREATING - Resource is under creation.
  • DELETING - Resource is under deletion.
  • ACTION_REQUIRED - Resource is active but has unresolved actions.

tags

Type: UNORDERED_LIST_STRING

trigger_spec

Type: STRUCT
Provider name: triggerSpec
Description: Required. Spec related to how often and when a task should be triggered.

  • disabled
    Type: BOOLEAN
    Provider name: disabled
    Description: Optional. Prevent the task from executing. This does not cancel already running tasks. It is intended to temporarily disable RECURRING tasks.
  • max_retries
    Type: INT32
    Provider name: maxRetries
    Description: Optional. Number of retry attempts before aborting. Set to zero to never attempt to retry a failed task.
  • schedule
    Type: STRING
    Provider name: schedule
    Description: Optional. Cron schedule (https://en.wikipedia.org/wiki/Cron) for running tasks periodically. To explicitly set a time zone for the cron tab, apply a prefix in the cron tab: "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}". The ${IANA_TIME_ZONE} must be a valid string from the IANA time zone database. For example, CRON_TZ=America/New_York 1 * * * *, or TZ=America/New_York 1 * * * *. This field is required for RECURRING tasks.
  • start_time
    Type: TIMESTAMP
    Provider name: startTime
    Description: Optional. The first run of the task will be after this time. If not specified, the task will run shortly after being submitted if ON_DEMAND and based on the schedule if RECURRING.
  • type
    Type: STRING
    Provider name: type
    Description: Required. Immutable. Trigger type of the user-specified Task.
    Possible values:
    • TYPE_UNSPECIFIED - Unspecified trigger type.
    • ON_DEMAND - The task runs one-time shortly after Task Creation.
    • RECURRING - The task is scheduled to run periodically.
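
A schedule value with an optional time zone prefix can be separated into its time zone and cron expression as follows. This is a minimal sketch: split_schedule is a hypothetical helper, it recognizes only the CRON_TZ= and TZ= prefixes described for the schedule field, and it does not check the time zone against the IANA database or validate the cron fields.

```python
def split_schedule(schedule: str) -> tuple:
    """Return (time_zone, cron_expression); time_zone is None if no prefix."""
    text = schedule.strip()
    # The prefix, when present, is the first space-separated token.
    first, _, rest = text.partition(" ")
    for prefix in ("CRON_TZ=", "TZ="):
        if first.startswith(prefix):
            return first[len(prefix):], rest.strip()
    return None, text
```

For the documented example CRON_TZ=America/New_York 1 * * * *, the helper returns the pair ("America/New_York", "1 * * * *"); a schedule without a prefix is returned unchanged with None as the time zone.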

uid

Type: STRING
Provider name: uid
Description: Output only. System generated globally unique ID for the task. This ID will be different if the task is deleted and re-created with the same name.

update_time

Type: TIMESTAMP
Provider name: updateTime
Description: Output only. The time when the task was last updated.
