gcp_aiplatform_endpoint

`ancestors`

Type: UNORDERED_LIST_STRING

`client_connection_config`

Type: STRUCT
Provider name: clientConnectionConfig
Description: Configurations that are applied to the endpoint for online prediction.

inference_timeout
Type: STRING
Provider name: inferenceTimeout
Description: Customizable online prediction request timeout.

`create_time`

Type: TIMESTAMP
Provider name: createTime
Description: Output only. Timestamp when this Endpoint was created.

`dedicated_endpoint_dns`

Type: STRING
Provider name: dedicatedEndpointDns
Description: Output only. DNS of the dedicated endpoint. Will only be populated if dedicated_endpoint_enabled is true. Depending on the features enabled, uid might be a random number or a string. For example, if fast_tryout is enabled, uid will be fasttryout. Format: https://{endpoint_id}.{region}-{uid}.prediction.vertexai.goog.
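
The URL format above can be illustrated with a short Python sketch. The function name and the sample endpoint ID and region are hypothetical; only the URL pattern and the `fasttryout` uid come from the description.

```python
def dedicated_endpoint_dns(endpoint_id: str, region: str, uid: str) -> str:
    """Format a dedicated endpoint URL following the documented pattern."""
    return f"https://{endpoint_id}.{region}-{uid}.prediction.vertexai.goog"


# Hypothetical endpoint ID and region; uid is "fasttryout" when fast_tryout is enabled.
print(dedicated_endpoint_dns("1234567890", "us-central1", "fasttryout"))
```

The value is output only, so a helper like this is mainly useful for recognizing or checking a DNS name that the API has already returned.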

`dedicated_endpoint_enabled`

Type: BOOLEAN
Provider name: dedicatedEndpointEnabled
Description: If true, the endpoint is exposed through a dedicated DNS [Endpoint.dedicated_endpoint_dns]. Requests to the dedicated DNS are isolated from other users’ traffic and have better performance and reliability. Note: Once you enable the dedicated endpoint, you can no longer send requests to the shared DNS {region}-aiplatform.googleapis.com. This limitation will be removed soon.

`deployed_models`

Type: UNORDERED_LIST_STRUCT
Provider name: deployedModels
Description: Output only. The models deployed in this Endpoint. To add or remove DeployedModels use EndpointService.DeployModel and EndpointService.UndeployModel respectively.

automatic_resources
Type: STRUCT
Provider name: automaticResources
Description: A description of resources that are to a large degree decided by Vertex AI and require only modest additional configuration.
- max_replica_count
  Type: INT32
  Provider name: maxReplicaCount
  Description: Immutable. The maximum number of replicas that the model may be deployed on when traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale to that many replicas is guaranteed (barring service outages). If traffic increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, no upper bound for scaling under heavy traffic is assumed, though Vertex AI may be unable to scale beyond a certain replica number.
- min_replica_count
  Type: INT32
  Provider name: minReplicaCount
  Description: Immutable. The minimum number of replicas that will be always deployed on. If traffic against it increases, it may dynamically be deployed onto more replicas up to max_replica_count, and as traffic decreases, some of these extra replicas may be freed. If the requested value is too large, the deployment will error.
create_time
Type: TIMESTAMP
Provider name: createTime
Description: Output only. Timestamp when the DeployedModel was created.
dedicated_resources
Type: STRUCT
Provider name: dedicatedResources
Description: A description of resources that are dedicated to the DeployedModel, and that need a higher degree of manual configuration.
- autoscaling_metric_specs
  Type: UNORDERED_LIST_STRUCT
  Provider name: autoscalingMetricSpecs
  Description: Immutable. The metric specifications that override the target value of a resource utilization metric (CPU utilization, accelerator’s duty cycle, and so on). The target defaults to 60 if not set. At most one entry is allowed per metric. If machine_spec.accelerator_count is above 0, autoscaling is based on both the CPU utilization and accelerator’s duty cycle metrics: it scales up when either metric exceeds its target value and scales down when both metrics are under their target values. The default target value is 60 for both metrics. If machine_spec.accelerator_count is 0, autoscaling is based on the CPU utilization metric only, with a default target value of 60 if not explicitly set. For example, in the case of Online Prediction, to override the target CPU utilization to 80, set autoscaling_metric_specs.metric_name to aiplatform.googleapis.com/prediction/online/cpu/utilization and autoscaling_metric_specs.target to 80.
  - metric_name
    Type: STRING
    Provider name: metricName
    Description: Required. The resource metric name. Supported metrics for Online Prediction:
    - aiplatform.googleapis.com/prediction/online/accelerator/duty_cycle
    - aiplatform.googleapis.com/prediction/online/cpu/utilization
  - target
    Type: INT32
    Provider name: target
    Description: The target resource utilization in percentage (1% - 100%) for the given metric; once the real usage deviates from the target by a certain percentage, the machine replicas change. The default value is 60 (representing 60%) if not provided.
- machine_spec
  Type: STRUCT
  Provider name: machineSpec
  Description: Required. Immutable. The specification of a single machine being used.
  - accelerator_count
    Type: INT32
    Provider name: acceleratorCount
    Description: The number of accelerators to attach to the machine.
  - accelerator_type
    Type: STRING
    Provider name: acceleratorType
    Description: Immutable. The type of accelerator(s) that may be attached to the machine as per accelerator_count.
    Possible values:
    - ACCELERATOR_TYPE_UNSPECIFIED - Unspecified accelerator type, which means no accelerator.
    - NVIDIA_TESLA_K80 - Deprecated: Nvidia Tesla K80 GPU has reached end of support, see https://cloud.google.com/compute/docs/eol/k80-eol.
    - NVIDIA_TESLA_P100 - Nvidia Tesla P100 GPU.
    - NVIDIA_TESLA_V100 - Nvidia Tesla V100 GPU.
    - NVIDIA_TESLA_P4 - Nvidia Tesla P4 GPU.
    - NVIDIA_TESLA_T4 - Nvidia Tesla T4 GPU.
    - NVIDIA_TESLA_A100 - Nvidia Tesla A100 GPU.
    - NVIDIA_A100_80GB - Nvidia A100 80GB GPU.
    - NVIDIA_L4 - Nvidia L4 GPU.
    - NVIDIA_H100_80GB - Nvidia H100 80GB GPU.
    - NVIDIA_H100_MEGA_80GB - Nvidia H100 Mega 80GB GPU.
    - NVIDIA_H200_141GB - Nvidia H200 141GB GPU.
    - TPU_V2 - TPU v2.
    - TPU_V3 - TPU v3.
    - TPU_V4_POD - TPU v4.
    - TPU_V5_LITEPOD - TPU v5.
  - machine_type
    Type: STRING
    Provider name: machineType
    Description: Immutable. The type of the machine. See the list of machine types supported for prediction and the list of machine types supported for custom training. For DeployedModel this field is optional, and the default value is n1-standard-2. For BatchPredictionJob or as part of WorkerPoolSpec this field is required.
  - reservation_affinity
    Type: STRUCT
    Provider name: reservationAffinity
    Description: Optional. Immutable. Configuration controlling how this resource pool consumes reservation.
    - key
      Type: STRING
      Provider name: key
      Description: Optional. Corresponds to the label key of a reservation resource. To target a SPECIFIC_RESERVATION by name, use compute.googleapis.com/reservation-name as the key and specify the name of your reservation as its value.
    - reservation_affinity_type
      Type: STRING
      Provider name: reservationAffinityType
      Description: Required. Specifies the reservation affinity type.
      Possible values:
      - TYPE_UNSPECIFIED - Default value. This should not be used.
      - NO_RESERVATION - Do not consume from any reserved capacity, only use on-demand.
      - ANY_RESERVATION - Consume any reservation available, falling back to on-demand.
      - SPECIFIC_RESERVATION - Consume from a specific reservation. When chosen, the reservation must be identified via the key and values fields.
    - values
      Type: UNORDERED_LIST_STRING
      Provider name: values
      Description: Optional. Corresponds to the label values of a reservation resource. This must be the full resource name of the reservation or reservation block.
  - tpu_topology
    Type: STRING
    Provider name: tpuTopology
    Description: Immutable. The topology of the TPUs. Corresponds to the TPU topologies available from GKE. (Example: tpu_topology: “2x2x1”).
- max_replica_count
  Type: INT32
  Provider name: maxReplicaCount
  Description: Immutable. The maximum number of replicas that the model may be deployed on when traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale to that many replicas is guaranteed (barring service outages). If traffic increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, min_replica_count is used as the default value. The value of this field impacts the charge against Vertex CPU and GPU quotas. Specifically, you will be charged for (max_replica_count * number of cores in the selected machine type) and (max_replica_count * number of GPUs per replica in the selected machine type).
- min_replica_count
  Type: INT32
  Provider name: minReplicaCount
  Description: Required. Immutable. The minimum number of machine replicas that will be always deployed on. This value must be greater than or equal to 1. If traffic increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed.
- required_replica_count
  Type: INT32
  Provider name: requiredReplicaCount
  Description: Optional. Number of required available replicas for the deployment to succeed. This field is only needed when partial deployment/mutation is desired. If set, the deploy/mutate operation will succeed once available_replica_count reaches required_replica_count, and the rest of the replicas will be retried. If not set, the default required_replica_count will be min_replica_count.
- spot
  Type: BOOLEAN
  Provider name: spot
  Description: Optional. If true, schedule the deployment workload on spot VMs.
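
The autoscaling and quota rules described for dedicated_resources can be sketched in Python as follows. This is a simplified illustration, not Vertex AI's implementation: the function names are hypothetical, the "hold" outcome is an assumption for usage that is neither above one target nor below all of them, and the tolerance mentioned for `target` ("deviates from the target by a certain percentage") is ignored.

```python
from typing import Dict, List, Optional

CPU_METRIC = "aiplatform.googleapis.com/prediction/online/cpu/utilization"
ACCEL_METRIC = "aiplatform.googleapis.com/prediction/online/accelerator/duty_cycle"
DEFAULT_TARGET = 60  # documented default target, in percent


def effective_targets(metric_specs: Optional[List[dict]], accelerator_count: int) -> Dict[str, int]:
    """Return the target per metric that autoscaling would consider."""
    targets = {CPU_METRIC: DEFAULT_TARGET}
    if accelerator_count > 0:
        # With accelerators attached, the duty cycle metric is considered as well.
        targets[ACCEL_METRIC] = DEFAULT_TARGET
    for spec in metric_specs or []:
        name = spec["metric_name"]
        if name in targets:
            targets[name] = spec.get("target", DEFAULT_TARGET)
    return targets


def scaling_direction(usage: Dict[str, float], targets: Dict[str, int]) -> str:
    """Return 'up' if any metric exceeds its target, 'down' if all are under it."""
    if any(usage[metric] > target for metric, target in targets.items()):
        return "up"
    if all(usage[metric] < target for metric, target in targets.items()):
        return "down"
    return "hold"  # assumption: no change when usage sits exactly on a target


def quota_charge(max_replica_count: int, cores_per_machine: int, gpus_per_replica: int) -> Dict[str, int]:
    """Quota consumed, following the two products given for max_replica_count."""
    return {
        "cpu": max_replica_count * cores_per_machine,
        "gpu": max_replica_count * gpus_per_replica,
    }


targets = effective_targets([{"metric_name": CPU_METRIC, "target": 80}], accelerator_count=1)
print(scaling_direction({CPU_METRIC: 70, ACCEL_METRIC: 65}, targets))  # accelerator is over 60
print(quota_charge(max_replica_count=4, cores_per_machine=8, gpus_per_replica=1))
```

In the usage lines, CPU usage of 70 is under the overridden target of 80, but the accelerator duty cycle of 65 exceeds the default of 60, so the sketch reports a scale-up.
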
disable_container_logging
Type: BOOLEAN
Provider name: disableContainerLogging
Description: For custom-trained Models and AutoML Tabular Models, the container of the DeployedModel instances sends stderr and stdout streams to Cloud Logging by default. Note that these logs incur a cost and are subject to Cloud Logging pricing. You can disable container logging by setting this flag to true.
disable_explanations
Type: BOOLEAN
Provider name: disableExplanations
Description: If true, deploy the model without the explanation feature, regardless of whether Model.explanation_spec or explanation_spec is set.
enable_access_logging
Type: BOOLEAN
Provider name: enableAccessLogging
Description: If true, online prediction access logs are sent to Cloud Logging. These logs are like standard server access logs, containing information like timestamp and latency for each prediction request. Note that logs may incur a cost, especially if your project receives prediction requests at a high queries per second rate (QPS). Estimate your costs before enabling this option.
explanation_spec
Type: STRUCT
Provider name: explanationSpec
Description: Explanation configuration for this DeployedModel. When deploying a Model using EndpointService.DeployModel, this value overrides the value of Model.explanation_spec. All fields of explanation_spec are optional in the request. If a field of explanation_spec is not populated, the value of the same field of Model.explanation_spec is inherited. If the corresponding Model.explanation_spec is not populated, all fields of the explanation_spec will be used for the explanation configuration.
- metadata
  Type: STRUCT
  Provider name: metadata
  Description: Optional. Metadata describing the Model’s input and output for explanation.
  - feature_attributions_schema_uri
    Type: STRING
    Provider name: featureAttributionsSchemaUri
    Description: Points to a YAML file stored on Google Cloud Storage describing the format of the feature attributions. The schema is defined as an OpenAPI 3.0.2 Schema Object. AutoML tabular Models always have this field populated by Vertex AI. Note: The URI given on output may be different, including the URI scheme, than the one given on input. The output URI will point to a location where the user only has a read access.
  - latent_space_source
    Type: STRING
    Provider name: latentSpaceSource
    Description: Name of the source to generate embeddings for example based explanations.
- parameters
  Type: STRUCT
  Provider name: parameters
  Description: Required. Parameters that configure explaining of the Model’s predictions.
  - examples
    Type: STRUCT
    Provider name: examples
    Description: Example-based explanations that return the nearest neighbors from the provided dataset.
    - example_gcs_source
      Type: STRUCT
      Provider name: exampleGcsSource
      Description: The Cloud Storage input instances.
      - data_format
        Type: STRING
        Provider name: dataFormat
        Description: The format in which instances are given. If not specified, JSONL format is assumed. Currently only JSONL format is supported.
        Possible values:
        - DATA_FORMAT_UNSPECIFIED - Format unspecified, used when unset.
        - JSONL - Examples are stored in JSONL files.
      - gcs_source
        Type: STRUCT
        Provider name: gcsSource
        Description: The Cloud Storage location for the input instances.
        - uris
          Type: UNORDERED_LIST_STRING
          Provider name: uris
          Description: Required. Google Cloud Storage URI(-s) to the input file(s). May contain wildcards. For more information on wildcards, see https://cloud.google.com/storage/docs/wildcards.
    - neighbor_count
      Type: INT32
      Provider name: neighborCount
      Description: The number of neighbors to return when querying for examples.
    - presets
      Type: STRUCT
      Provider name: presets
      Description: Simplified preset configuration, which automatically sets configuration values based on the desired query speed-precision trade-off and modality.
      - modality
        Type: STRING
        Provider name: modality
        Description: The modality of the uploaded model, which automatically configures the distance measurement and feature normalization for the underlying example index and queries. If your model does not precisely fit one of these types, it is okay to choose the closest type.
        Possible values:
        - MODALITY_UNSPECIFIED - Should not be set. Added as a recommended best practice for enums.
        - IMAGE - IMAGE modality.
        - TEXT - TEXT modality.
        - TABULAR - TABULAR modality.
      - query
        Type: STRING
        Provider name: query
        Description: Preset option controlling parameters for speed-precision trade-off when querying for examples. If omitted, defaults to PRECISE.
        Possible values:
        - PRECISE - More precise neighbors as a trade-off against slower response.
        - FAST - Faster response as a trade-off against less precise neighbors.
  - integrated_gradients_attribution
    Type: STRUCT
    Provider name: integratedGradientsAttribution
    Description: An attribution method that computes Aumann-Shapley values taking advantage of the model’s fully differentiable structure. Refer to this paper for more details: https://arxiv.org/abs/1703.01365
    - blur_baseline_config
      Type: STRUCT
      Provider name: blurBaselineConfig
      Description: Config for IG with blur baseline. When enabled, a linear path from the maximally blurred image to the input image is created. Using a blurred baseline instead of zero (black image) is motivated by the BlurIG approach explained here: https://arxiv.org/abs/2004.03383
      - max_blur_sigma
        Type: FLOAT
        Provider name: maxBlurSigma
        Description: The standard deviation of the blur kernel for the blurred baseline. The same blurring parameter is used for both the height and the width dimension. If not set, the method defaults to the zero (i.e. black for images) baseline.
    - smooth_grad_config
      Type: STRUCT
      Provider name: smoothGradConfig
      Description: Config for SmoothGrad approximation of gradients. When enabled, the gradients are approximated by averaging the gradients from noisy samples in the vicinity of the inputs. Adding noise can help improve the computed gradients. Refer to this paper for more details: https://arxiv.org/pdf/1706.03825.pdf
      - feature_noise_sigma
        Type: STRUCT
        Provider name: featureNoiseSigma
        Description: This is similar to noise_sigma, but provides additional flexibility. A separate noise sigma can be provided for each feature, which is useful if their distributions are different. No noise is added to features that are not set. If this field is unset, noise_sigma will be used for all features.
        - noise_sigma
          Type: UNORDERED_LIST_STRUCT
          Provider name: noiseSigma
          Description: Noise sigma per feature. No noise is added to features that are not set.
          - name
            Type: STRING
            Provider name: name
            Description: The name of the input feature for which noise sigma is provided. The features are defined in explanation metadata inputs.
          - sigma
            Type: FLOAT
            Provider name: sigma
            Description: This represents the standard deviation of the Gaussian kernel that will be used to add noise to the feature prior to computing gradients. Similar to noise_sigma but represents the noise added to the current feature. Defaults to 0.1.
      - noise_sigma
        Type: FLOAT
        Provider name: noiseSigma
        Description: This is a single float value and will be used to add noise to all the features. Use this field when all features are normalized to have the same distribution: scale to range [0, 1], [-1, 1] or z-scoring, where features are normalized to have 0-mean and 1-variance. Learn more about normalization. For best results the recommended value is about 10% - 20% of the standard deviation of the input feature. Refer to section 3.2 of the SmoothGrad paper: https://arxiv.org/pdf/1706.03825.pdf. Defaults to 0.1. If the distribution is different per feature, set feature_noise_sigma instead for each feature.
      - noisy_sample_count
        Type: INT32
        Provider name: noisySampleCount
        Description: The number of gradient samples to use for approximation. The higher this number, the more accurate the gradient is, but the runtime complexity increases by this factor as well. Valid range of its value is [1, 50]. Defaults to 3.
    - step_count
      Type: INT32
      Provider name: stepCount
      Description: Required. The number of steps for approximating the path integral. A good starting value is 50; gradually increase it until the sum to diff property is met within the desired error range. Valid range of its value is [1, 100], inclusive.
  - sampled_shapley_attribution
    Type: STRUCT
    Provider name: sampledShapleyAttribution
    Description: An attribution method that approximates Shapley values for features that contribute to the label being predicted. A sampling strategy is used to approximate the value rather than considering all subsets of features. Refer to this paper for model details: https://arxiv.org/abs/1306.4265.
    - path_count
      Type: INT32
      Provider name: pathCount
      Description: Required. The number of feature permutations to consider when approximating the Shapley values. Valid range of its value is [1, 50], inclusively.
  - top_k
    Type: INT32
    Provider name: topK
    Description: If populated, returns attributions for top K indices of outputs (defaults to 1). Only applies to Models that predict more than one output (e.g., multi-class Models). When set to -1, returns explanations for all outputs.
  - xrai_attribution
    Type: STRUCT
    Provider name: xraiAttribution
    Description: An attribution method that redistributes Integrated Gradients attribution to segmented regions, taking advantage of the model’s fully differentiable structure. Refer to this paper for more details: https://arxiv.org/abs/1906.02825 XRAI currently performs better on natural images, like a picture of a house or an animal. If the images are taken in artificial environments, like a lab or manufacturing line, or from diagnostic equipment, like x-rays or quality-control cameras, use Integrated Gradients instead.
    - blur_baseline_config
      Type: STRUCT
      Provider name: blurBaselineConfig
      Description: Config for XRAI with blur baseline. When enabled, a linear path from the maximally blurred image to the input image is created. Using a blurred baseline instead of zero (black image) is motivated by the BlurIG approach explained here: https://arxiv.org/abs/2004.03383
      - max_blur_sigma
        Type: FLOAT
        Provider name: maxBlurSigma
        Description: The standard deviation of the blur kernel for the blurred baseline. The same blurring parameter is used for both the height and the width dimension. If not set, the method defaults to the zero (i.e. black for images) baseline.
    - smooth_grad_config
      Type: STRUCT
      Provider name: smoothGradConfig
      Description: Config for SmoothGrad approximation of gradients. When enabled, the gradients are approximated by averaging the gradients from noisy samples in the vicinity of the inputs. Adding noise can help improve the computed gradients. Refer to this paper for more details: https://arxiv.org/pdf/1706.03825.pdf
      - feature_noise_sigma
        Type: STRUCT
        Provider name: featureNoiseSigma
        Description: This is similar to noise_sigma, but provides additional flexibility. A separate noise sigma can be provided for each feature, which is useful if their distributions are different. No noise is added to features that are not set. If this field is unset, noise_sigma will be used for all features.
        - noise_sigma
          Type: UNORDERED_LIST_STRUCT
          Provider name: noiseSigma
          Description: Noise sigma per feature. No noise is added to features that are not set.
          - name
            Type: STRING
            Provider name: name
            Description: The name of the input feature for which noise sigma is provided. The features are defined in explanation metadata inputs.
          - sigma
            Type: FLOAT
            Provider name: sigma
            Description: This represents the standard deviation of the Gaussian kernel that will be used to add noise to the feature prior to computing gradients. Similar to noise_sigma but represents the noise added to the current feature. Defaults to 0.1.
      - noise_sigma
        Type: FLOAT
        Provider name: noiseSigma
        Description: This is a single float value and will be used to add noise to all the features. Use this field when all features are normalized to have the same distribution: scale to range [0, 1], [-1, 1] or z-scoring, where features are normalized to have 0-mean and 1-variance. Learn more about normalization. For best results the recommended value is about 10% - 20% of the standard deviation of the input feature. Refer to section 3.2 of the SmoothGrad paper: https://arxiv.org/pdf/1706.03825.pdf. Defaults to 0.1. If the distribution is different per feature, set feature_noise_sigma instead for each feature.
      - noisy_sample_count
        Type: INT32
        Provider name: noisySampleCount
        Description: The number of gradient samples to use for approximation. The higher this number, the more accurate the gradient is, but the runtime complexity increases by this factor as well. Valid range of its value is [1, 50]. Defaults to 3.
    - step_count
      Type: INT32
      Provider name: stepCount
      Description: Required. The number of steps for approximating the path integral. A good starting value is 50; gradually increase it until the sum to diff property is met within the desired error range. Valid range of its value is [1, 100], inclusive.
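
The interaction between noise_sigma and feature_noise_sigma in smooth_grad_config can be sketched as below. This is an illustrative reading of the field descriptions, not Vertex AI code: the function name is hypothetical, and `per_feature_entries` stands for the list of {name, sigma} entries held in feature_noise_sigma.noise_sigma.

```python
from typing import Dict, List, Optional

DEFAULT_SIGMA = 0.1  # documented default for noise_sigma and for a per-feature sigma


def resolve_noise_sigmas(
    feature_names: List[str],
    noise_sigma: Optional[float] = None,
    per_feature_entries: Optional[List[dict]] = None,
) -> Dict[str, float]:
    """Return the noise standard deviation that would apply to each feature."""
    if per_feature_entries is not None:
        # Per-feature mode: listed features get their own sigma,
        # and features that are not listed get no noise.
        listed = {e["name"]: e.get("sigma", DEFAULT_SIGMA) for e in per_feature_entries}
        return {name: listed.get(name, 0.0) for name in feature_names}
    # Global mode: one sigma for every feature.
    sigma = DEFAULT_SIGMA if noise_sigma is None else noise_sigma
    return {name: sigma for name in feature_names}


print(resolve_noise_sigmas(["age", "income"], per_feature_entries=[{"name": "age", "sigma": 0.3}]))
print(resolve_noise_sigmas(["age", "income"], noise_sigma=0.2))
```

The first call leaves `income` without noise because it is not listed; the second applies the single global value to both features.
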
faster_deployment_config
Type: STRUCT
Provider name: fasterDeploymentConfig
Description: Configuration for faster model deployment.
- fast_tryout_enabled
  Type: BOOLEAN
  Provider name: fastTryoutEnabled
  Description: If true, enable fast tryout feature for this deployed model.
gcp_display_name
Type: STRING
Provider name: displayName
Description: The display name of the DeployedModel. If not provided upon creation, the Model’s display_name is used.
gcp_status
Type: STRUCT
Provider name: status
Description: Output only. Runtime status of the deployed model.
- available_replica_count
  Type: INT32
  Provider name: availableReplicaCount
  Description: Output only. The number of available replicas of the deployed model.
- last_update_time
  Type: TIMESTAMP
  Provider name: lastUpdateTime
  Description: Output only. The time at which the status was last updated.
- message
  Type: STRING
  Provider name: message
  Description: Output only. The latest deployed model’s status message (if any).
id
Type: STRING
Provider name: id
Description: Immutable. The ID of the DeployedModel. If not provided upon deployment, Vertex AI will generate a value for this ID. This value should be 1-10 characters, and valid characters are /[0-9]/.
model
Type: STRING
Provider name: model
Description: Required. The resource name of the Model that this is the deployment of. Note that the Model may be in a different location than the DeployedModel’s Endpoint. The resource name may contain a version ID or version alias to specify the version. Example: projects/{project}/locations/{location}/models/{model}@2 or projects/{project}/locations/{location}/models/{model}@golden. If no version is specified, the default version will be deployed.
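
As an illustration of the resource name format, the following Python sketch splits a model value into its parts. The regular expression, the `ModelRef` type, and the sample project and model IDs are assumptions made for this example; only the overall path shape and the `@version` suffix come from the description.

```python
import re
from typing import NamedTuple, Optional

MODEL_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/models/(?P<model>[^/@]+)(?:@(?P<version>[^/@]+))?$"
)


class ModelRef(NamedTuple):
    project: str
    location: str
    model: str
    version: Optional[str]  # None means the default version is deployed


def parse_model_name(name: str) -> ModelRef:
    """Split a Model resource name into project, location, model, and version."""
    match = MODEL_NAME.match(name)
    if match is None:
        raise ValueError(f"not a Model resource name: {name!r}")
    return ModelRef(**match.groupdict())


print(parse_model_name("projects/my-project/locations/us-central1/models/123@golden"))
print(parse_model_name("projects/my-project/locations/us-central1/models/123").version)
```
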
model_version_id
Type: STRING
Provider name: modelVersionId
Description: Output only. The version ID of the model that is deployed.
private_endpoints
Type: STRUCT
Provider name: privateEndpoints
Description: Output only. Provide paths for users to send predict/explain/health requests directly to the deployed model services running on Cloud via private services access. This field is populated if network is configured.
- explain_http_uri
  Type: STRING
  Provider name: explainHttpUri
  Description: Output only. Http(s) path to send explain requests.
- health_http_uri
  Type: STRING
  Provider name: healthHttpUri
  Description: Output only. Http(s) path to send health check requests.
- predict_http_uri
  Type: STRING
  Provider name: predictHttpUri
  Description: Output only. Http(s) path to send prediction requests.
- service_attachment
  Type: STRING
  Provider name: serviceAttachment
  Description: Output only. The name of the service attachment resource. Populated if private service connect is enabled.
service_account
Type: STRING
Provider name: serviceAccount
Description: The service account that the DeployedModel’s container runs as. Specify the email address of the service account. If this service account is not specified, the container runs as a service account that doesn’t have access to the resource project. Users deploying the Model must have the iam.serviceAccounts.actAs permission on this service account.
shared_resources
Type: STRING
Provider name: sharedResources
Description: The resource name of the shared DeploymentResourcePool to deploy on. Format: projects/{project}/locations/{location}/deploymentResourcePools/{deployment_resource_pool}
speculative_decoding_spec
Type: STRUCT
Provider name: speculativeDecodingSpec
Description: Optional. Spec for configuring speculative decoding.
- draft_model_speculation
  Type: STRUCT
  Provider name: draftModelSpeculation
  Description: Draft model speculation.
  - draft_model
    Type: STRING
    Provider name: draftModel
    Description: Required. The resource name of the draft model.
- ngram_speculation
  Type: STRUCT
  Provider name: ngramSpeculation
  Description: N-Gram speculation.
  - ngram_size
    Type: INT32
    Provider name: ngramSize
    Description: The number of last N input tokens used as ngram to search/match against the previous prompt sequence. This is equal to the N in N-Gram. The default value is 3 if not specified.
- speculative_token_count
  Type: INT32
  Provider name: speculativeTokenCount
  Description: The number of speculative tokens to generate at each step.
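
To make ngram_size and speculative_token_count more concrete, here is a minimal Python sketch of generic n-gram (prompt lookup) drafting. It is an assumption about how such a scheme typically works, not a description of Vertex AI's implementation; the function name and the integer token IDs are hypothetical.

```python
from typing import List, Sequence


def ngram_draft(
    tokens: Sequence[int],
    ngram_size: int = 3,  # documented default for ngram_size
    speculative_token_count: int = 4,  # hypothetical value for illustration
) -> List[int]:
    """Propose draft tokens by matching the last ngram_size tokens against earlier context."""
    if len(tokens) <= ngram_size:
        return []
    tail = list(tokens[-ngram_size:])
    # Scan right to left so the most recent earlier occurrence wins.
    # The start index stops short of the tail itself.
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if list(tokens[start:start + ngram_size]) == tail:
            follow = start + ngram_size
            return list(tokens[follow:follow + speculative_token_count])
    return []


# The trailing [1, 2, 3] also appears at the start, so the tokens after it are proposed.
print(ngram_draft([1, 2, 3, 4, 5, 1, 2, 3]))
```

In a speculative decoding loop, the proposed tokens would then be verified by the main model, which accepts the longest matching prefix.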

`description`

Type: STRING
Provider name: description
Description: The description of the Endpoint.

`enable_private_service_connect`

Type: BOOLEAN
Provider name: enablePrivateServiceConnect
Description: Deprecated: If true, expose the Endpoint via private service connect. Only one of the fields, network or enable_private_service_connect, can be set.

`encryption_spec`

Type: STRUCT
Provider name: encryptionSpec
Description: Customer-managed encryption key spec for an Endpoint. If set, this Endpoint and all sub-resources of this Endpoint will be secured by this key.

kms_key_name
Type: STRING
Provider name: kmsKeyName
Description: Required. The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource. Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created.
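
The key name format and the same-region requirement can be checked with a sketch like the one below. The regular expression and function name are hypothetical helpers for illustration; they mirror only the path shape given in the description.

```python
import re

KMS_KEY = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/keyRings/(?P<key_ring>[^/]+)/cryptoKeys/(?P<key>[^/]+)$"
)


def kms_key_matches_region(kms_key_name: str, endpoint_region: str) -> bool:
    """Return True if the key's location equals the region of the endpoint."""
    match = KMS_KEY.match(kms_key_name)
    if match is None:
        raise ValueError(f"unexpected KMS key name: {kms_key_name!r}")
    return match["location"] == endpoint_region


key = "projects/my-project/locations/us-central1/keyRings/my-kr/cryptoKeys/my-key"
print(kms_key_matches_region(key, "us-central1"))
```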

`etag`

Type: STRING
Provider name: etag
Description: Used to perform consistent read-modify-write updates. If not set, a blind “overwrite” update happens.

`gcp_display_name`

Type: STRING
Provider name: displayName
Description: Required. The display name of the Endpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters.

`gen_ai_advanced_features_config`

Type: STRUCT
Provider name: genAiAdvancedFeaturesConfig
Description: Optional. Configuration for GenAiAdvancedFeatures. If the endpoint is serving GenAI models, advanced features like native RAG integration can be configured. Currently, only Model Garden models are supported.

rag_config
Type: STRUCT
Provider name: ragConfig
Description: Configuration for Retrieval Augmented Generation feature.
- enable_rag
  Type: BOOLEAN
  Provider name: enableRag
  Description: If true, enable Retrieval Augmented Generation in ChatCompletion request. Once enabled, the endpoint will be identified as GenAI endpoint and Arthedain router will be used.

`labels`

Type: UNORDERED_LIST_STRING

`model_deployment_monitoring_job`

Type: STRING
Provider name: modelDeploymentMonitoringJob
Description: Output only. Resource name of the Model Monitoring job associated with this Endpoint if monitoring is enabled by JobService.CreateModelDeploymentMonitoringJob. Format: projects/{project}/locations/{location}/modelDeploymentMonitoringJobs/{model_deployment_monitoring_job}

`name`

Type: STRING
Provider name: name
Description: Output only. The resource name of the Endpoint.

`network`

Type: STRING
Provider name: network
Description: Optional. The full name of the Google Compute Engine network to which the Endpoint should be peered. Private services access must already be configured for the network. If left unspecified, the Endpoint is not peered with any network. Only one of the fields, network or enable_private_service_connect, can be set. Format: projects/{project}/global/networks/{network}. Where {project} is a project number, as in 12345, and {network} is network name.
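
The format and the mutual-exclusion rule above can be expressed as a small validation sketch. The function name and regular expression are assumptions for illustration; the numeric project segment follows the statement that {project} is a project number.

```python
import re
from typing import Optional

NETWORK = re.compile(r"^projects/(?P<project_number>\d+)/global/networks/(?P<network>[^/]+)$")


def check_network_settings(network: Optional[str], enable_private_service_connect: bool) -> None:
    """Raise ValueError if the two settings conflict or the network name is malformed."""
    if network and enable_private_service_connect:
        raise ValueError("set only one of network or enable_private_service_connect")
    if network and NETWORK.match(network) is None:
        raise ValueError(f"unexpected network name: {network!r}")


check_network_settings("projects/12345/global/networks/my-vpc", False)  # passes silently
```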

`organization_id`

Type: STRING

`parent`

Type: STRING

`predict_request_response_logging_config`

Type: STRUCT
Provider name: predictRequestResponseLoggingConfig
Description: Configures the request-response logging for online prediction.

bigquery_destination
Type: STRUCT
Provider name: bigqueryDestination
Description: BigQuery table for logging. If only given a project, a new dataset will be created with name logging_<endpoint-display-name>_<endpoint-id>, where <endpoint-display-name> will be made BigQuery-dataset-name compatible (e.g. most special characters will become underscores). If no table name is given, a new table will be created with name request_response_logging.
- output_uri
  Type: STRING
  Provider name: outputUri
  Description: Required. BigQuery URI to a project or table, up to 2000 characters long. When only the project is specified, the Dataset and Table are created. When the full table reference is specified, the Dataset must exist and the Table must not exist. Accepted form: BigQuery path. For example: bq://projectId or bq://projectId.bqDatasetId or bq://projectId.bqDatasetId.bqTableId.
enabled
Type: BOOLEAN
Provider name: enabled
Description: Whether logging is enabled.
sampling_rate
Type: DOUBLE
Provider name: samplingRate
Description: Percentage of requests to be logged, expressed as a fraction in the range (0, 1].
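
The accepted output_uri forms and the sampling_rate range can be illustrated with the Python sketch below. Both helper names are hypothetical, the parser assumes project, dataset, and table IDs contain no dots, and the sampling function is only one plausible way to apply a fraction per request.

```python
import random
from typing import Optional, Tuple


def parse_bq_uri(output_uri: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Split bq://project[.dataset[.table]] into (project, dataset, table)."""
    if not output_uri.startswith("bq://") or len(output_uri) > 2000:
        raise ValueError(f"unexpected BigQuery URI: {output_uri!r}")
    parts = output_uri[len("bq://"):].split(".")
    if not 1 <= len(parts) <= 3 or not all(parts):
        raise ValueError(f"unexpected BigQuery URI: {output_uri!r}")
    # Pad with None so missing dataset or table parts are explicit.
    project, dataset, table = (parts + [None, None])[:3]
    return project, dataset, table


def should_log(sampling_rate: float, rng: random.Random) -> bool:
    """Decide whether to log one request, given a fraction in the range (0, 1]."""
    if not 0.0 < sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be in (0, 1]")
    return rng.random() < sampling_rate


print(parse_bq_uri("bq://projectId.bqDatasetId"))
print(should_log(1.0, random.Random(0)))  # a rate of 1.0 logs every request
```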

`private_service_connect_config`

Type: STRUCT
Provider name: privateServiceConnectConfig
Description: Optional. Configuration for private service connect. network and private_service_connect_config are mutually exclusive.

enable_private_service_connect
Type: BOOLEAN
Provider name: enablePrivateServiceConnect
Description: Required. If true, expose the IndexEndpoint via private service connect.
project_allowlist
Type: UNORDERED_LIST_STRING
Provider name: projectAllowlist
Description: A list of Projects from which the forwarding rule will target the service attachment.
service_attachment
Type: STRING
Provider name: serviceAttachment
Description: Output only. The name of the generated service attachment resource. This is only populated if the endpoint is deployed with PrivateServiceConnect.

`project_id`

Type: STRING

`project_number`

Type: STRING

`resource_name`

Type: STRING

`satisfies_pzi`

Type: BOOLEAN
Provider name: satisfiesPzi
Description: Output only. Reserved for future use.

`satisfies_pzs`

Type: BOOLEAN
Provider name: satisfiesPzs
Description: Output only. Reserved for future use.

`tags`

Type: UNORDERED_LIST_STRING

`update_time`

Type: TIMESTAMP
Provider name: updateTime
Description: Output only. Timestamp when this Endpoint was last updated.