Overview
This check monitors IBM Db2 through the Datadog Agent.
Setup
Installation
The IBM Db2 check is included in the Datadog Agent package.
Dependencies
The ibm_db client library is required. To install it, ensure you have a working compiler and run:
Unix
sudo -Hu dd-agent /opt/datadog-agent/embedded/bin/pip install ibm_db==3.1.0
Note: If you are on an Agent running Python 2, use ibm_db==3.0.1
instead of ibm_db=3.1.0
.
Windows
For Agent versions <= 6.11:
"C:\Program Files\Datadog\Datadog Agent\embedded\python.exe" -m pip install ibm_db==3.0.1
For Agent versions >= 6.12 and < 7.0:
"C:\Program Files\Datadog\Datadog Agent\embedded<PYTHON_MAJOR_VERSION>\python.exe" -m pip install ibm_db==3.0.1
For Agent versions >= 7.0:
"C:\Program Files\Datadog\Datadog Agent\embedded3\python.exe" -m pip install ibm_db==3.2.2
On Linux there may be need for XML functionality. If you encounter errors during
the build process, install libxslt-dev
(or libxslt-devel
for RPM).
Enable monitoring
The IBM Db2 integration pulls data using the following table functions:
MON_GET_TABLESPACE
MON_GET_TRANSACTION_LOG
MON_GET_BUFFERPOOL
MON_GET_DATABASE
MON_GET_INSTANCE
For more information about these table functions, see the official IBM documentation.
To monitor a Db2 instance, create a Db2 user with either the EXECUTE
permission on the above five table functions, or grant the Db2 user one of the following roles:
DATAACCESS
authorityDBADM
authoritySQLADM
authority
To monitor the health of an instance, its associated databases, and database objects, enable the database system monitor switches for each of the objects you want to monitor:
- Statement
- Lock
- Tables
- Buffer pool
Switch to the instance master user and run these commands at the db2
prompt:
update dbm cfg using HEALTH_MON on
update dbm cfg using DFT_MON_STMT on
update dbm cfg using DFT_MON_LOCK on
update dbm cfg using DFT_MON_TABLE on
update dbm cfg using DFT_MON_BUFPOOL on
Next, run get dbm cfg
and you should see the following:
Default database monitor switches
Buffer pool (DFT_MON_BUFPOOL) = ON
Lock (DFT_MON_LOCK) = ON
Sort (DFT_MON_SORT) = OFF
Statement (DFT_MON_STMT) = ON
Table (DFT_MON_TABLE) = ON
Timestamp (DFT_MON_TIMESTAMP) = ON
Unit of work (DFT_MON_UOW) = OFF
Monitor health of instance and databases (HEALTH_MON) = ON
Configuration
Host
To configure this check for an Agent running on a host:
Metric collection
Edit the ibm_db2.d/conf.yaml
file, in the conf.d/
folder at the root of your Agent’s configuration directory to start collecting your ibm_db2
performance data. See the sample ibm_db2.d/conf.yaml for all available configuration options.
Restart the Agent.
Log collection
Available for Agent versions >6.0
Collecting logs is disabled by default in the Datadog Agent, enable it in your datadog.yaml
file:
Add this configuration block to your ibm_db2.d/conf.yaml
file to start collecting your IBM Db2 logs:
logs:
- type: file
path: /home/db2inst1/sqllib/db2dump/db2diag.log
source: ibm_db2
service: db2sysc
log_processing_rules:
- type: multi_line
name: new_log_start_with_date
pattern: \d{4}\-(0?[1-9]|[12][0-9]|3[01])\-(0?[1-9]|1[012])
Restart the Agent.
Containerized
For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.
Metric collection
Parameter | Value |
---|
<INTEGRATION_NAME> | ibm_db2 |
<INIT_CONFIG> | blank or {} |
<INSTANCE_CONFIG> | {"db": "<DB_NAME>", "username":"<USERNAME>", "password":"<PASSWORD>", "host":"%%host%%", "port":"%%port%%"} |
Log collection
Available for Agent versions >6.0
Collecting logs is disabled by default in the Datadog Agent. To enable it, see Kubernetes Log Collection.
Parameter | Value |
---|
<LOG_CONFIG> | `{“source”: “ibm_db2”, “service”: “<SERVICE_NAME>”, “log_processing_rules”: {“type”:“multi_line”,“name”:“new_log_start_with_date”, “pattern”:"\d{4}-(0?[1-9] |
Validation
Run the Agent’s status subcommand and look for ibm_db2
under the Checks section.
Data Collected
Metrics
ibm_db2.application.active (gauge) | The number of applications that are currently connected to the database. Shown as connection |
ibm_db2.application.executing (gauge) | The number of applications for which the database manager is currently processing a request. Shown as connection |
ibm_db2.backup.latest (gauge) | The time elapsed since the latest database backup was completed. Shown as second |
ibm_db2.bufferpool.column.hit_percent (gauge) | The percentage of time that the database manager did not need to load a page from disk to service a column-organized table data page request. Shown as percent |
ibm_db2.bufferpool.column.reads.logical (count) | The number of column-organized table data pages read from the logical table space containers for temporary, regular, and large table spaces. Shown as get |
ibm_db2.bufferpool.column.reads.physical (count) | The number of column-organized table data pages read from the physical table space containers for temporary, regular, and large table spaces. Shown as get |
ibm_db2.bufferpool.column.reads.total (count) | The total number of column-organized table data pages read from the table space containers for temporary, regular, and large table spaces. Shown as get |
ibm_db2.bufferpool.data.hit_percent (gauge) | The percentage of time that the database manager did not need to load a page from disk to service a data page request. Shown as percent |
ibm_db2.bufferpool.data.reads.logical (count) | The number of data pages read from the logical table space containers for temporary, regular and large table spaces. Shown as get |
ibm_db2.bufferpool.data.reads.physical (count) | The number of data pages read from the physical table space containers for temporary, regular and large table spaces. Shown as get |
ibm_db2.bufferpool.data.reads.total (count) | The total number of data pages read from the table space containers for temporary, regular and large table spaces. Shown as get |
ibm_db2.bufferpool.group.column.hit_percent (gauge) | The percentage of time that the group database manager did not need to load a page from disk to service a column-organized table data page request. Shown as percent |
ibm_db2.bufferpool.group.data.hit_percent (gauge) | The percentage of time that the group database manager did not need to load a page from disk to service a data page request. Shown as percent |
ibm_db2.bufferpool.group.hit_percent (gauge) | The percentage of time that the group database manager did not need to load a page from disk to service a page request. Shown as percent |
ibm_db2.bufferpool.group.index.hit_percent (gauge) | The percentage of time that the group database manager did not need to load a page from disk to service an index page request. Shown as percent |
ibm_db2.bufferpool.group.xda.hit_percent (gauge) | The percentage of time that the group database manager did not need to load a page from disk to service an index page request. Shown as percent |
ibm_db2.bufferpool.hit_percent (gauge) | The percentage of time that the database manager did not need to load a page from disk to service a page request. Shown as percent |
ibm_db2.bufferpool.index.hit_percent (gauge) | The percentage of time that the database manager did not need to load a page from disk to service an index page request. Shown as percent |
ibm_db2.bufferpool.index.reads.logical (count) | The number of index pages read from the logical table space containers for temporary, regular and large table spaces. Shown as get |
ibm_db2.bufferpool.index.reads.physical (count) | The number of index pages read from the physical table space containers for temporary, regular and large table spaces. Shown as get |
ibm_db2.bufferpool.index.reads.total (count) | The total number of index pages read from the table space containers for temporary, regular and large table spaces. Shown as get |
ibm_db2.bufferpool.reads.logical (count) | The number of pages read from the logical table space containers for temporary, regular and large table spaces. Shown as get |
ibm_db2.bufferpool.reads.physical (count) | The number of pages read from the physical table space containers for temporary, regular and large table spaces. Shown as get |
ibm_db2.bufferpool.reads.total (count) | The total number of pages read from the table space containers for temporary, regular and large table spaces. Shown as get |
ibm_db2.bufferpool.xda.hit_percent (gauge) | The percentage of time that the database manager did not need to load a page from disk to service an index page request. Shown as percent |
ibm_db2.bufferpool.xda.reads.logical (count) | The number of data pages for XML storage objects (XDAs) read from the logical table space containers for temporary, regular and large table spaces. Shown as get |
ibm_db2.bufferpool.xda.reads.physical (count) | The number of data pages for XML storage objects (XDAs) read from the physical table space containers for temporary, regular and large table spaces. Shown as get |
ibm_db2.bufferpool.xda.reads.total (count) | The total number of data pages for XML storage objects (XDAs) read from the table space containers for temporary, regular and large table spaces. Shown as get |
ibm_db2.connection.active (gauge) | The current number of connections. Shown as connection |
ibm_db2.connection.max (gauge) | The highest number of simultaneous connections to the database since the database was activated. Shown as connection |
ibm_db2.connection.total (count) | The total number of connections to the database since the first connect, activate, or last reset (coordinator agents). Shown as connection |
ibm_db2.lock.active (gauge) | The number of locks currently held. Shown as lock |
ibm_db2.lock.dead (count) | The total number of deadlocks that have occurred. Shown as lock |
ibm_db2.lock.pages (gauge) | The memory pages (4 KiB each) currently in use by the lock list. Shown as page |
ibm_db2.lock.timeouts (count) | The number of times that a request to lock an object timed out instead of being granted. Shown as lock |
ibm_db2.lock.wait (gauge) | The average wait time for a lock. Shown as millisecond |
ibm_db2.lock.waiting (gauge) | The number of agents waiting on a lock. Shown as lock |
ibm_db2.log.available (gauge) | The disk blocks (4 KiB each) of active log space in the database that is not being used by uncommitted transactions. Shown as block |
ibm_db2.log.reads (count) | The number of log pages read from disk by the logger. Shown as read |
ibm_db2.log.used (gauge) | The disk blocks (4 KiB each) of active log space currently used in the database. Shown as block |
ibm_db2.log.utilized (gauge) | The utilization of active log space as a percentage. Shown as percent |
ibm_db2.log.writes (count) | The number of log pages written to disk by the logger. Shown as write |
ibm_db2.row.modified.total (count) | The total number of rows inserted, updated, or deleted. Shown as row |
ibm_db2.row.reads.total (count) | The total number of rows that had to be read in order to return result sets. Shown as row |
ibm_db2.row.returned.total (count) | The total number of rows that have been selected by and returned to applications. Shown as row |
ibm_db2.tablespace.size (gauge) | The total size of the table space in bytes. Shown as byte |
ibm_db2.tablespace.usable (gauge) | The total usable size of the table space in bytes. Shown as byte |
ibm_db2.tablespace.used (gauge) | The total used size of the table space in bytes. Shown as byte |
ibm_db2.tablespace.utilized (gauge) | The utilization of the table space as a percentage. Shown as percent |
Events
ibm_db2.tablespace_state_change
is triggered whenever the state of a tablespace changes.
Service Checks
ibm_db2.can_connect
Returns CRITICAL
if the Agent is unable to connect to the monitored IBM Db2 database, or OK
otherwise.
Statuses: ok, critical
ibm_db2.status
Returns CRITICAL
if the monitored IBM Db2 database is quiesced, WARNING
for quiesce-pending or rollforwards, or OK
otherwise.
Statuses: ok, warning, critical, unknown
Troubleshooting
CLI Driver SQL1531N error
If you encounter an issue that produces error logs like the following:
2023-08-10 23:34:47 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:129 in LogMessage) | ibm_db2:c051131490335a94 | (ibm_db2.py:563) | Unable to connect to database `datadog` as user `db2inst1`: [IBM][CLI Driver] SQL1531N The connection failed because the name specified with the DSN connection string keyword could not be found in either the db2dsdriver.cfg configuration file or the db2cli.ini configuration file. Data source name specified in the connection string: "DATADOG". SQLCODE=-1531
Then it’s most likely caused by one of the following scenarios:
- The configuration (conf.yaml) is missing a host and port configuration
- The CLI Driver isn’t able to locate the database due to the absence of
db2cli.ini
and db2dsdriver.cfg
The Agent requires the information in both of the above scenarios to determine where to properly connect to the database. To solve this issue, you can either include a host and port parameter for every instance of the ibm_db2
check experiencing this issue. Alternatively, if you want to use the DSNs defined in either the db2cli.ini
or db2dsdriver.cfg
files, you can copy those files over to the clidriver
directory that the Agent uses. Under normal circumstances, that directory is located at /opt/datadog-agent/embedded/lib/python3.9/site-packages/clidriver/cfg
for Linux.
Installing ibm_db
client library offline
If you’re in an air gapped environment, or on a restricted network where it’s not possible to run pip install ibm_db==x.y.z
where x.y.z
is the version number, you can install ibm_db
using the following method:
On a machine with network access, download the source tarballs for the ibm_db
library and the ODBC and CLI. The ODBC and CLI are required to be downloaded separately because the ibm_db
library requires them, but it cannot download them via pip
. The following script installs the archive file for ibm_db==x.y.z
on a Linux machine, where x.y.z
is the version number:
curl -Lo ibm_db.tar.gz https://github.com/ibmdb/python-ibmdb/archive/refs/tags/vx.y.z.tar.gz
curl -Lo linuxx64_odbc_cli.tar.gz https://public.dhe.ibm.com/ibmdl/export/pub/software/data/db2/drivers/odbc_cli/linuxx64_odbc_cli.tar.gz
Transport the two files over to the restricted host, and then extract the archive.
tar -xvf ibm_db.tar.gz
tar -xvf linuxx64_odbc_cli.tar.gz
Set the IBM_DB_HOME
environment variable to the location of where /clidriver
was extracted from linuxx64_odbc_cli.tar.gz
. This will prevent the ibm_db
library from installing a new version of the ODBC and CLI since that would fail.
export IBM_DB_HOME=/path/to/clidriver
Using the embedded pip
on the Agent, install the ibm_db
library locally. This library’s files are contained within the extracted python-ibmdb-x.y.z
from ibm_db.tar.gz
.
/opt/datadog-agent/embedded/bin/pip install --no-index --no-deps --no-build-isolation /path/to/python-ibmdb-x.y.z/IBM_DB/ibm_db/
If you get the following error:
error: subprocess-exited-with-error
× Preparing metadata (pyproject.toml) did not run successfully.
| exit code: 1
-> [8 lines of output]
Detected 64-bit Python
Detected platform = linux, uname = x86_64
Downloading https://public.dhe.ibm.com/ibmdl/export/pub/software/data/db2/drivers/odbc_cli/linuxx64_odbc_cli.tar.gz
Downloading DSDriver from url = https://public.dhe.ibm.com/ibmdl/export/pub/software/data/db2/drivers/odbc_cli/linuxx64_odbc_cli.tar.gz
Pre-requisite check [which gcc] : Failed
No Gcc installation detected.
Please install gcc and continue with the installation of the ibm_db.
[end of output]
You may need to install gcc
.
Need help? Contact Datadog support.
Further Reading
Additional helpful documentation, links, and articles: