gcp_dataplex_task
ancestors
Type: UNORDERED_LIST_STRING
create_time
Type: TIMESTAMP
Provider name: createTime
Description: Output only. The time when the task was created.
description
Type: STRING
Provider name: description
Description: Optional. Description of the task.
execution_spec
Type: STRUCT
Provider name: executionSpec
Description: Required. Spec related to how a task is executed.
kms_key
Type: STRING
Provider name: kmsKey
Description: Optional. The Cloud KMS key to use for encryption, of the form: projects/{project_number}/locations/{location_id}/keyRings/{key-ring-name}/cryptoKeys/{key-name}.
max_job_execution_lifetime
Type: STRING
Provider name: maxJobExecutionLifetime
Description: Optional. The maximum duration after which the job execution is expired.
project
Type: STRING
Provider name: project
Description: Optional. The project in which jobs are run. By default, the project containing the Lake is used. If a project is provided, the ExecutionSpec.service_account must belong to this project.
service_account
Type: STRING
Provider name: serviceAccount
Description: Required. Service account to use to execute a task. If not provided, the default Compute service account for the project is used.
execution_status
Type: STRUCT
Provider name: executionStatus
Description: Output only. Status of the latest task executions.
latest_job
Type: STRUCT
Provider name: latestJob
Description: Output only. Latest job execution.
end_time
Type: TIMESTAMP
Provider name: endTime
Description: Output only. The time when the job ended.
execution_spec
Type: STRUCT
Provider name: executionSpec
Description: Output only. Spec related to how a task is executed.
kms_key
Type: STRING
Provider name: kmsKey
Description: Optional. The Cloud KMS key to use for encryption, of the form: projects/{project_number}/locations/{location_id}/keyRings/{key-ring-name}/cryptoKeys/{key-name}.
max_job_execution_lifetime
Type: STRING
Provider name: maxJobExecutionLifetime
Description: Optional. The maximum duration after which the job execution is expired.
project
Type: STRING
Provider name: project
Description: Optional. The project in which jobs are run. By default, the project containing the Lake is used. If a project is provided, the ExecutionSpec.service_account must belong to this project.
service_account
Type: STRING
Provider name: serviceAccount
Description: Required. Service account to use to execute a task. If not provided, the default Compute service account for the project is used.
message
Type: STRING
Provider name: message
Description: Output only. Additional information about the current state.
name
Type: STRING
Provider name: name
Description: Output only. The relative resource name of the job, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/tasks/{task_id}/jobs/{job_id}.
retry_count
Type: INT32
Provider name: retryCount
Description: Output only. The number of times the job has been retried (excluding the initial attempt).
service
Type: STRING
Provider name: service
Description: Output only. The underlying service running a job.
Possible values:
SERVICE_UNSPECIFIED
- Service used to run the job is unspecified.
DATAPROC
- Dataproc service is used to run this job.
service_job
Type: STRING
Provider name: serviceJob
Description: Output only. The full resource name for the job run under a particular service.
start_time
Type: TIMESTAMP
Provider name: startTime
Description: Output only. The time when the job was started.
state
Type: STRING
Provider name: state
Description: Output only. Execution state for the job.
Possible values:
STATE_UNSPECIFIED
- The job state is unknown.
RUNNING
- The job is running.
CANCELLING
- The job is cancelling.
CANCELLED
- The job cancellation was successful.
SUCCEEDED
- The job completed successfully.
FAILED
- The job is no longer running due to an error.
ABORTED
- The job was cancelled outside of Dataplex.
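The state values above can be grouped by whether the job is still in flight. The following is a minimal sketch that assumes CANCELLED, SUCCEEDED, FAILED, and ABORTED are the finished states, as their descriptions suggest; the helper names `is_terminal` and `needs_attention` are hypothetical and not part of any Dataplex API.

```python
# States whose descriptions indicate the job is no longer running.
# Treating exactly these four as terminal is an assumption of this sketch.
TERMINAL_JOB_STATES = frozenset({"CANCELLED", "SUCCEEDED", "FAILED", "ABORTED"})


def is_terminal(state: str) -> bool:
    """Return True if a latest_job.state value represents a finished job."""
    return state in TERMINAL_JOB_STATES


def needs_attention(state: str) -> bool:
    """Return True for finished jobs that did not succeed."""
    return is_terminal(state) and state != "SUCCEEDED"
```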
trigger
Type: STRING
Provider name: trigger
Description: Output only. Job execution trigger.
Possible values:
TRIGGER_UNSPECIFIED
- The trigger is unspecified.
TASK_CONFIG
- The job was triggered by Dataplex based on the trigger spec in the task definition.
RUN_REQUEST
- The job was triggered by an explicit call to the Task API.
uid
Type: STRING
Provider name: uid
Description: Output only. System-generated globally unique ID for the job.
update_time
Type: TIMESTAMP
Provider name: updateTime
Description: Output only. Last update time of the status.
gcp_display_name
Type: STRING
Provider name: displayName
Description: Optional. User-friendly display name.
labels
Type: UNORDERED_LIST_STRING
name
Type: STRING
Provider name: name
Description: Output only. The relative resource name of the task, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/tasks/{task_id}.
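To illustrate the resource name form, here is a small sketch that splits a task name into its components. The function `parse_task_name` and the regular expression are hypothetical helpers built only from the documented pattern; the sketch assumes no component contains a slash.

```python
import re

# Pattern built from the documented form:
# projects/{project_number}/locations/{location_id}/lakes/{lake_id}/tasks/{task_id}
TASK_NAME_RE = re.compile(
    r"^projects/(?P<project_number>[^/]+)"
    r"/locations/(?P<location_id>[^/]+)"
    r"/lakes/(?P<lake_id>[^/]+)"
    r"/tasks/(?P<task_id>[^/]+)$"
)


def parse_task_name(name: str) -> dict:
    """Split a task resource name into its components, or raise ValueError."""
    match = TASK_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a Dataplex task name: {name!r}")
    return match.groupdict()
```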
notebook
Type: STRUCT
Provider name: notebook
Description: Config related to running scheduled Notebooks.
archive_uris
Type: UNORDERED_LIST_STRING
Provider name: archiveUris
Description: Optional. Cloud Storage URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
file_uris
Type: UNORDERED_LIST_STRING
Provider name: fileUris
Description: Optional. Cloud Storage URIs of files to be placed in the working directory of each executor.
infrastructure_spec
Type: STRUCT
Provider name: infrastructureSpec
Description: Optional. Infrastructure specification for the execution.
batch
Type: STRUCT
Provider name: batch
Description: Compute resources needed for a Task when using Dataproc Serverless.
executors_count
Type: INT32
Provider name: executorsCount
Description: Optional. Total number of job executors. Executor Count should be between 2 and 100. Default=2
max_executors_count
Type: INT32
Provider name: maxExecutorsCount
Description: Optional. Max configurable executors. If max_executors_count > executors_count, then auto-scaling is enabled. Max Executor Count should be between 2 and 1000. Default=1000
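The two executor fields interact: auto-scaling is enabled only when max_executors_count exceeds executors_count. The sketch below checks the documented ranges and reports whether auto-scaling would be enabled. The function `check_batch` is a hypothetical helper, and raising an error for out-of-range values is an assumption of this sketch rather than documented service behavior.

```python
def check_batch(executors_count: int = 2, max_executors_count: int = 1000) -> bool:
    """Validate the documented ranges and report whether auto-scaling is enabled."""
    if not 2 <= executors_count <= 100:
        raise ValueError("executors_count should be between 2 and 100")
    if not 2 <= max_executors_count <= 1000:
        raise ValueError("max_executors_count should be between 2 and 1000")
    # Auto-scaling is enabled only when the maximum exceeds the base count.
    return max_executors_count > executors_count
```

With the documented defaults (2 and 1000), the function returns True; setting both fields to the same value returns False.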
container_image
Type: STRUCT
Provider name: containerImage
Description: Container Image Runtime Configuration.
image
Type: STRING
Provider name: image
Description: Optional. Container image to use.
java_jars
Type: UNORDERED_LIST_STRING
Provider name: javaJars
Description: Optional. A list of Java JARs to add to the classpath. Valid input includes Cloud Storage URIs to JAR binaries. For example, gs://bucket-name/my/path/to/file.jar
python_packages
Type: UNORDERED_LIST_STRING
Provider name: pythonPackages
Description: Optional. A list of Python packages to be installed. Valid formats include a Cloud Storage URI to a pip-installable library. For example, gs://bucket-name/my/path/to/lib.tar.gz
vpc_network
Type: STRUCT
Provider name: vpcNetwork
Description: VPC network.
network
Type: STRING
Provider name: network
Description: Optional. The Cloud VPC network in which the job is run. By default, the Cloud VPC network named Default within the project is used.
network_tags
Type: UNORDERED_LIST_STRING
Provider name: networkTags
Description: Optional. List of network tags to apply to the job.
sub_network
Type: STRING
Provider name: subNetwork
Description: Optional. The Cloud VPC sub-network in which the job is run.
notebook
Type: STRING
Provider name: notebook
Description: Required. Path to input notebook. This can be the Cloud Storage URI of the notebook file or the path to a Notebook Content. The execution args are accessible as environment variables (TASK_key=value).
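As an illustration of reading execution args from inside a notebook, the sketch below collects every environment variable that starts with TASK_. The helper `task_args` is hypothetical, and stripping the TASK_ prefix to recover the original key is an assumption based on the TASK_key=value form described above.

```python
import os


def task_args(environ=None) -> dict:
    """Collect execution args exposed as TASK_<key> environment variables."""
    env = os.environ if environ is None else environ
    prefix = "TASK_"
    # Strip the prefix so callers see the original arg keys (an assumption).
    return {k[len(prefix):]: v for k, v in env.items() if k.startswith(prefix)}
```

Passing an explicit mapping instead of relying on os.environ makes the helper easy to try outside a running task.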
organization_id
Type: STRING
parent
Type: STRING
project_id
Type: STRING
project_number
Type: STRING
resource_name
Type: STRING
spark
Type: STRUCT
Provider name: spark
Description: Config related to running custom Spark tasks.
archive_uris
Type: UNORDERED_LIST_STRING
Provider name: archiveUris
Description: Optional. Cloud Storage URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
file_uris
Type: UNORDERED_LIST_STRING
Provider name: fileUris
Description: Optional. Cloud Storage URIs of files to be placed in the working directory of each executor.
infrastructure_spec
Type: STRUCT
Provider name: infrastructureSpec
Description: Optional. Infrastructure specification for the execution.
batch
Type: STRUCT
Provider name: batch
Description: Compute resources needed for a Task when using Dataproc Serverless.
executors_count
Type: INT32
Provider name: executorsCount
Description: Optional. Total number of job executors. Executor Count should be between 2 and 100. Default=2
max_executors_count
Type: INT32
Provider name: maxExecutorsCount
Description: Optional. Max configurable executors. If max_executors_count > executors_count, then auto-scaling is enabled. Max Executor Count should be between 2 and 1000. Default=1000
container_image
Type: STRUCT
Provider name: containerImage
Description: Container Image Runtime Configuration.
image
Type: STRING
Provider name: image
Description: Optional. Container image to use.
java_jars
Type: UNORDERED_LIST_STRING
Provider name: javaJars
Description: Optional. A list of Java JARs to add to the classpath. Valid input includes Cloud Storage URIs to JAR binaries. For example, gs://bucket-name/my/path/to/file.jar
python_packages
Type: UNORDERED_LIST_STRING
Provider name: pythonPackages
Description: Optional. A list of Python packages to be installed. Valid formats include a Cloud Storage URI to a pip-installable library. For example, gs://bucket-name/my/path/to/lib.tar.gz
vpc_network
Type: STRUCT
Provider name: vpcNetwork
Description: VPC network.
network
Type: STRING
Provider name: network
Description: Optional. The Cloud VPC network in which the job is run. By default, the Cloud VPC network named Default within the project is used.
network_tags
Type: UNORDERED_LIST_STRING
Provider name: networkTags
Description: Optional. List of network tags to apply to the job.
sub_network
Type: STRING
Provider name: subNetwork
Description: Optional. The Cloud VPC sub-network in which the job is run.
main_class
Type: STRING
Provider name: mainClass
Description: The name of the driver’s main class. The jar file that contains the class must be in the default CLASSPATH or specified in jar_file_uris. The execution args are passed in as a sequence of named process arguments (--key=value).
main_jar_file_uri
Type: STRING
Provider name: mainJarFileUri
Description: The Cloud Storage URI of the jar file that contains the main class. The execution args are passed in as a sequence of named process arguments (--key=value).
python_script_file
Type: STRING
Provider name: pythonScriptFile
Description: The Cloud Storage URI of the main Python file to use as the driver. Must be a .py file. The execution args are passed in as a sequence of named process arguments (--key=value).
sql_script
Type: STRING
Provider name: sqlScript
Description: The query text. The execution args are used to declare a set of script variables (set key="value";).
sql_script_file
Type: STRING
Provider name: sqlScriptFile
Description: A reference to a query file. This should be the Cloud Storage URI of the query file. The execution args are used to declare a set of script variables (set key="value";).
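The Spark driver fields describe two ways execution args reach the task: as named process arguments (--key=value) for jar and Python drivers, and as script variable declarations (set key="value";) for SQL scripts. The sketch below renders one set of args in both forms. The function names are hypothetical, and the sketch does no escaping of values, which a real implementation would likely need.

```python
def as_process_args(args: dict) -> list:
    """Render args in the named process argument form, --key=value."""
    return [f"--{key}={value}" for key, value in args.items()]


def as_sql_variables(args: dict) -> list:
    """Render args in the SQL script variable form, one declaration per arg."""
    # No quoting or escaping of values is attempted in this sketch.
    return [f'set {key}="{value}";' for key, value in args.items()]
```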
state
Type: STRING
Provider name: state
Description: Output only. Current state of the task.
Possible values:
STATE_UNSPECIFIED
- State is not specified.
ACTIVE
- Resource is active, i.e., ready to use.
CREATING
- Resource is under creation.
DELETING
- Resource is under deletion.
ACTION_REQUIRED
- Resource is active but has unresolved actions.
Type: UNORDERED_LIST_STRING
trigger_spec
Type: STRUCT
Provider name: triggerSpec
Description: Required. Spec related to how often and when a task should be triggered.
disabled
Type: BOOLEAN
Provider name: disabled
Description: Optional. Prevent the task from executing. This does not cancel already running tasks. It is intended to temporarily disable RECURRING tasks.
max_retries
Type: INT32
Provider name: maxRetries
Description: Optional. Number of retry attempts before aborting. Set to zero to never attempt to retry a failed task.
schedule
Type: STRING
Provider name: schedule
Description: Optional. Cron schedule (https://en.wikipedia.org/wiki/Cron) for running tasks periodically. To set a time zone explicitly, prefix the cron tab with "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}", where ${IANA_TIME_ZONE} must be a valid string from the IANA time zone database. For example, CRON_TZ=America/New_York 1 * * * *, or TZ=America/New_York 1 * * * *. This field is required for RECURRING tasks.
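The optional time zone prefix can be separated from the cron fields with a few lines of code. This is a minimal sketch: `split_schedule` is a hypothetical helper, it assumes the standard five-field cron layout used in the examples above, and it does not check the zone name against the IANA time zone database.

```python
from typing import List, Optional, Tuple


def split_schedule(schedule: str) -> Tuple[Optional[str], List[str]]:
    """Split a schedule into (time zone or None, five cron fields)."""
    tokens = schedule.split()
    zone = None
    if tokens and tokens[0].startswith(("CRON_TZ=", "TZ=")):
        # Keep only the zone name after the first "=".
        zone = tokens.pop(0).split("=", 1)[1]
    if len(tokens) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(tokens)}")
    return zone, tokens
```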
start_time
Type: TIMESTAMP
Provider name: startTime
Description: Optional. The first run of the task will be after this time. If not specified, the task will run shortly after being submitted if ON_DEMAND and based on the schedule if RECURRING.
type
Type: STRING
Provider name: type
Description: Required. Immutable. Trigger type of the user-specified Task.
Possible values:
TYPE_UNSPECIFIED
- Unspecified trigger type.
ON_DEMAND
- The task runs once, shortly after task creation.
RECURRING
- The task is scheduled to run periodically.
uid
Type: STRING
Provider name: uid
Description: Output only. System-generated globally unique ID for the task. This ID will be different if the task is deleted and re-created with the same name.
update_time
Type: TIMESTAMP
Provider name: updateTime
Description: Output only. The time when the task was last updated.