Connect a TiDB cluster to Datadog to collect its metrics and logs.
Note:
- TiDB 4.0+ is required for this integration.
- For TiDB Cloud, see the TiDB Cloud Integration.
First, download and launch the Datadog Agent.
Then, manually install the TiDB check. Instructions vary depending on the environment.
Run:

```shell
datadog-agent integration install -t datadog-tidb==<INTEGRATION_VERSION>
```
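For example, on a Linux host with a systemd-managed Agent, installation and reload typically look like the following (the dd-agent user and the service name are assumptions about a standard host install; adjust for your environment):

```shell
# Install the TiDB check as the Agent user (standard host install assumed)
sudo -u dd-agent datadog-agent integration install -t datadog-tidb==<INTEGRATION_VERSION>

# Restart the Agent so it picks up the new check (systemd-managed Agent assumed)
sudo systemctl restart datadog-agent
```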
Edit the tidb.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory to start collecting your TiDB performance data. See the sample tidb.d/conf.yaml for all available configuration options. The sample tidb.d/conf.yaml only configures the PD instance; you need to configure the other instances in the TiDB cluster yourself. For example:
```yaml
init_config:

instances:
  - pd_metric_url: http://localhost:2379/metrics
    send_distribution_buckets: true
    tags:
      - cluster_name:cluster01

  - tidb_metric_url: http://localhost:10080/metrics
    send_distribution_buckets: true
    tags:
      - cluster_name:cluster01

  - tikv_metric_url: http://localhost:20180/metrics
    send_distribution_buckets: true
    tags:
      - cluster_name:cluster01

  - tiflash_metric_url: http://localhost:8234/metrics
    send_distribution_buckets: true
    tags:
      - cluster_name:cluster01

  - tiflash_proxy_metric_url: http://localhost:20292/metrics
    send_distribution_buckets: true
    tags:
      - cluster_name:cluster01
```
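Optionally, verify that each Prometheus metrics endpoint is reachable from the Agent host before restarting the Agent. The ports below match the example configuration above; yours may differ:

```shell
# Each component exposes Prometheus-format metrics over HTTP (ports from the example above)
curl -s http://localhost:2379/metrics | head    # PD
curl -s http://localhost:10080/metrics | head   # TiDB
curl -s http://localhost:20180/metrics | head   # TiKV
```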
Available for Agent versions >6.0
Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

```yaml
logs_enabled: true
```
Add this configuration block to your tidb.d/conf.yaml file to start collecting your TiDB logs:
```yaml
logs:
  # pd log
  - type: file
    path: "/tidb-deploy/pd-2379/log/pd*.log"
    service: "tidb-cluster"
    source: "pd"

  # tikv log
  - type: file
    path: "/tidb-deploy/tikv-20160/log/tikv*.log"
    service: "tidb-cluster"
    source: "tikv"

  # tidb log
  - type: file
    path: "/tidb-deploy/tidb-4000/log/tidb*.log"
    service: "tidb-cluster"
    source: "tidb"
    exclude_paths:
      - /tidb-deploy/tidb-4000/log/tidb_slow_query.log
  - type: file
    path: "/tidb-deploy/tidb-4000/log/tidb_slow_query*.log"
    service: "tidb-cluster"
    source: "tidb"
    log_processing_rules:
      - type: multi_line
        name: new_log_start_with_datetime
        pattern: '#\sTime:'
    tags:
      - "custom_format:tidb_slow_query"

  # tiflash log
  - type: file
    path: "/tidb-deploy/tiflash-9000/log/tiflash*.log"
    service: "tidb-cluster"
    source: "tiflash"
```
Change the `path` and `service` values according to your cluster's configuration.
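The multi_line rule above is needed because each slow query entry spans several lines and begins with a `# Time:` header. The snippet below only illustrates that format; the exact fields vary by TiDB version:

```text
# Time: 2024-01-15T10:23:45.123456+08:00
# Query_time: 1.52
# DB: test
SELECT * FROM orders WHERE customer_id = 42;
```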
Use these commands to find the log file paths:
```shell
# show the deployment directories of the cluster
tiup cluster display <YOUR_CLUSTER_NAME>

# find a specific log file path from a process's command-line arguments
ps -fwwp <TIDB_PROCESS_PID/PD_PROCESS_PID/etc.>
```
Run the Agent's status subcommand and look for tidb under the Checks section.
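On a host install this is typically run directly on the Agent host (sudo may be required depending on your setup):

```shell
datadog-agent status
```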
| Metric | Type | Description | Unit |
| --- | --- | --- | --- |
| tidb_cluster.tidb_executor_statement_total | count | The total number of statements executed | execution |
| tidb_cluster.tidb_server_execute_error_total | count | The total number of execution errors | error |
| tidb_cluster.tidb_server_connections | gauge | Current number of connections in TiDB server | connection |
| tidb_cluster.tidb_server_handle_query_duration_seconds.count | count | The total number of handled queries in server | query |
| tidb_cluster.tidb_server_handle_query_duration_seconds.sum | count | The sum of handled query duration in server | second |
| tidb_cluster.tikv_engine_size_bytes | gauge | The disk usage bytes of TiKV instances | byte |
| tidb_cluster.tikv_store_size_bytes | gauge | The disk capacity bytes of TiKV instances | byte |
| tidb_cluster.tikv_io_bytes | count | The I/O read/write bytes of TiKV instances | byte |
| tidb_cluster.tiflash_store_size_used_bytes | gauge | The disk usage bytes of TiFlash instances | byte |
| tidb_cluster.tiflash_store_size_capacity_bytes | gauge | The disk capacity bytes of TiFlash instances | byte |
| tidb_cluster.process_cpu_seconds_total | count | The CPU usage seconds of TiDB/TiKV/TiFlash instances | second |
| tidb_cluster.process_resident_memory_bytes | gauge | The resident memory bytes of TiDB/TiKV/TiFlash instances | byte |
It is possible to use the metrics configuration option to collect additional metrics from a TiDB cluster.
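A minimal sketch of that option, assuming the check accepts a standard OpenMetrics-style metrics list at the instance level (the metric names below are illustrative placeholders; use names actually exposed on your cluster's /metrics endpoints):

```yaml
instances:
  - pd_metric_url: http://localhost:2379/metrics
    send_distribution_buckets: true
    metrics:
      # additional Prometheus metric names to collect (illustrative)
      - pd_regions_status
      - pd_cluster_status
```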
The TiDB check does not include any events.
tidb_cluster.prometheus.health
Returns CRITICAL if the Agent cannot fetch Prometheus metrics; otherwise returns OK.
Statuses: ok, critical
CPU and Memory metrics are not provided for TiKV and TiFlash instances in the following cases:
The TiDB check enables Datadog's distribution metric type by default. These distribution metrics are high in volume and can consume significant resources. You can disable them in your tidb.d/conf.yaml file:

```yaml
send_distribution_buckets: false
```
Since a TiDB cluster exposes many important metrics, the TiDB check sets max_returned_metrics to 10000 by default. You can decrease max_returned_metrics in the tidb.d/conf.yaml file if necessary:

```yaml
max_returned_metrics: 1000
```
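Both send_distribution_buckets and max_returned_metrics are set per instance, as in the earlier example configuration; a sketch combining the two (the instance-level placement of max_returned_metrics is an assumption based on the standard OpenMetrics options):

```yaml
instances:
  - pd_metric_url: http://localhost:2379/metrics
    # disable high-volume distribution buckets
    send_distribution_buckets: false
    # cap the number of metrics collected per run (default assumed to be 10000)
    max_returned_metrics: 1000
```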
Need help? Contact Datadog support.