Hazelcast

Supported OS Linux Mac OS Windows

Integration version6.0.0

Overview

This check monitors Hazelcast v4.0+.

Setup

Installation

The Hazelcast check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

Host

To configure this check for an Agent running on a host:

Metric collection
  1. Edit the hazelcast.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory to start collecting your Hazelcast performance data. See the sample hazelcast.d/conf.yaml for all available configuration options.

    This check has a limit of 350 metrics per instance. The number of returned metrics is indicated in the status page. You can specify the metrics you are interested in by editing the configuration below. To learn how to customize the metrics to collect, see the JMX Checks documentation for more detailed instructions. If you need to monitor more metrics, contact Datadog support.

  2. Restart the Agent.

Log collection
  1. Hazelcast supports many different logging adapters. Here is an example of a log4j2.properties file:

    rootLogger=file
    rootLogger.level=info
    property.filepath=/path/to/log/files
    property.filename=hazelcast
    
    appender.file.type=RollingFile
    appender.file.name=RollingFile
    appender.file.fileName=${filepath}/${filename}.log
    appender.file.filePattern=${filepath}/${filename}-%d{yyyy-MM-dd}-%i.log.gz
    appender.file.layout.type=PatternLayout
    appender.file.layout.pattern = %d{yyyy-MM-dd HH:mm:ss} [%thread] %level{length=10} %c{1}:%L - %m%n
    appender.file.policies.type=Policies
    appender.file.policies.time.type=TimeBasedTriggeringPolicy
    appender.file.policies.time.interval=1
    appender.file.policies.time.modulate=true
    appender.file.policies.size.type=SizeBasedTriggeringPolicy
    appender.file.policies.size.size=50MB
    appender.file.strategy.type=DefaultRolloverStrategy
    appender.file.strategy.max=100
    
    rootLogger.appenderRefs=file
    rootLogger.appenderRef.file.ref=RollingFile
    
    #Hazelcast specific logs.
    
    #log4j.logger.com.hazelcast=debug
    
    #log4j.logger.com.hazelcast.cluster=debug
    #log4j.logger.com.hazelcast.partition=debug
    #log4j.logger.com.hazelcast.partition.InternalPartitionService=debug
    #log4j.logger.com.hazelcast.nio=debug
    #log4j.logger.com.hazelcast.hibernate=debug
    
  2. By default, Datadog’s integration pipeline supports the following conversion pattern:

    %d{yyyy-MM-dd HH:mm:ss} [%thread] %level{length=10} %c{1}:%L - %m%n
    

    Clone and edit the integration pipeline if you have a different format.

  3. Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
    
  4. Add the following configuration block to your hazelcast.d/conf.yaml file. Change the path and service parameter values based on your environment. See the sample hazelcast.d/conf.yaml for all available configuration options.

    logs:
      - type: file
        path: /var/log/hazelcast.log
        source: hazelcast
        service: <SERVICE>
        log_processing_rules:
          - type: multi_line
            name: log_start_with_date
            pattern: \d{4}\.\d{2}\.\d{2}
    
  5. Restart the Agent.

Containerized

Metric collection

For containerized environments, see the Autodiscovery with JMX guide.

Log collection

Collecting logs is disabled by default in the Datadog Agent. To enable it, see Docker log collection.

ParameterValue
<LOG_CONFIG>{"source": "hazelcast", "service": "<SERVICE_NAME>"}

Validation

Run the Agent’s status subcommand and look for hazelcast under the JMXFetch section:

========
JMXFetch
========
  Initialized checks
  ==================
    hazelcast
      instance_name : hazelcast-localhost-9999
      message :
      metric_count : 46
      service_check_count : 0
      status : OK

Data Collected

Metrics

hazelcast.imap.local_backup_count
(gauge)
Backup count
hazelcast.imap.local_backup_entry_count
(gauge)
Backup entry count
hazelcast.imap.local_backup_entry_memory_cost
(gauge)
Backup entry cost
hazelcast.imap.local_creation_time
(gauge)
Creation time
hazelcast.imap.local_dirty_entry_count
(gauge)
Dirty entry count
hazelcast.imap.local_event_operation_count
(gauge)
Event count
hazelcast.imap.local_get_operation_count
(gauge)
Get operation count
hazelcast.imap.local_heap_cost
(gauge)
Heap Cost
hazelcast.imap.local_hits
(gauge)
Hits
hazelcast.imap.local_last_access_time
(gauge)
Last access time
hazelcast.imap.local_last_update_time
(gauge)
Last update time
hazelcast.imap.local_locked_entry_count
(gauge)
Locked entry count
hazelcast.imap.local_max_get_latency
(gauge)
Max get latency
hazelcast.imap.local_max_put_latency
(gauge)
Max put latency
hazelcast.imap.local_max_remove_latency
(gauge)
Max remove latency
hazelcast.imap.local_other_operation_count
(gauge)
Other (keySet,entrySet etc..) operation count
hazelcast.imap.local_owned_entry_count
(gauge)
Owned entry count
hazelcast.imap.local_owned_entry_memory_cost
(gauge)
Owned entry memory cost
hazelcast.imap.local_put_operation_count
(gauge)
Put operation count
hazelcast.imap.local_remove_operation_count
(gauge)
Remove operation count
hazelcast.imap.local_total
(gauge)
Total operation count
hazelcast.imap.local_total_get_latency
(gauge)
Total get latency
hazelcast.imap.local_total_put_latency
(gauge)
Total put latency
hazelcast.imap.local_total_remove_latency
(gauge)
Total remove latency
hazelcast.imap.size
(gauge)
Size
hazelcast.instance.cluster_time
(gauge)
Elapsed time since the isntance was created
hazelcast.instance.managed_executor_service.completed_task_count
(gauge)
Completed task count
hazelcast.instance.managed_executor_service.is_shutdown
(gauge)
hazelcast.instance.managed_executor_service.is_terminated
(gauge)
hazelcast.instance.managed_executor_service.maximum_pool_size
(gauge)
Maximum thread count of the pool
hazelcast.instance.managed_executor_service.pool_size
(gauge)
Thread count of the pool
hazelcast.instance.managed_executor_service.queue_size
(gauge)
Work queue size
hazelcast.instance.managed_executor_service.remaining_queue_capacity
(gauge)
Remaining capacity of the work queue
hazelcast.instance.member_count
(gauge)
Size of the cluster
hazelcast.instance.partition_service.active_partition_count
(gauge)
Active partition count
hazelcast.instance.partition_service.is_cluster_safe
(gauge)
Cluster Safe State
hazelcast.instance.partition_service.is_local_member_safe
(gauge)
LocalMember Safe State
hazelcast.instance.partition_service.partition_count
(gauge)
Partition count
hazelcast.instance.running
(gauge)
Running state
hazelcast.instance.version
(gauge)
The Hazelcast version
hazelcast.iqueue.average_age
(gauge)
Average age
Shown as second
hazelcast.iqueue.backup_item_count
(gauge)
Backup item count
hazelcast.iqueue.empty_poll_operation_count
(gauge)
Empty poll count
hazelcast.iqueue.event_operation_count
(gauge)
Event operation count
hazelcast.iqueue.maximum_age
(gauge)
Maximum age
Shown as second
hazelcast.iqueue.minimum_age
(gauge)
Minimum age
Shown as second
hazelcast.iqueue.offer_operation_count
(gauge)
Offer count
hazelcast.iqueue.other_operation_count
(gauge)
Other operation count
hazelcast.iqueue.owned_item_count
(gauge)
Owned item count
hazelcast.iqueue.poll_operation_count
(gauge)
Poll count
hazelcast.iqueue.rejected_offer_operation_count
(gauge)
Rejected offer count
hazelcast.mc.license_expiration_time
(gauge)
The number of seconds until license expiration
Shown as second
hazelcast.member.accepted_socket_count
(gauge)
hazelcast.member.active_count
(gauge)
hazelcast.member.active_members
(gauge)
hazelcast.member.active_members_commit_index
(gauge)
hazelcast.member.async_operations
(gauge)
hazelcast.member.available_processors
(gauge)
hazelcast.member.backup_timeout_millis
(gauge)
hazelcast.member.backup_timeouts
(gauge)
hazelcast.member.bytes_read
(gauge)
hazelcast.member.bytes_received
(gauge)
hazelcast.member.bytes_send
(gauge)
hazelcast.member.bytes_transceived
(gauge)
hazelcast.member.bytes_written
(gauge)
hazelcast.member.call_timeout_count
(gauge)
hazelcast.member.client_count
(gauge)
hazelcast.member.closed_count
(gauge)
hazelcast.member.cluster_start_time
(gauge)
hazelcast.member.cluster_time
(gauge)
hazelcast.member.cluster_time_diff
(gauge)
hazelcast.member.cluster_up_time
(gauge)
hazelcast.member.commit_count
(gauge)
hazelcast.member.committed_heap
(gauge)
hazelcast.member.committed_native
(gauge)
hazelcast.member.committed_virtual_memory_size
(gauge)
hazelcast.member.completed_count
(gauge)
hazelcast.member.completed_migrations
(gauge)
hazelcast.member.completed_operation_batch_count
(gauge)
hazelcast.member.completed_operation_count
(gauge)
hazelcast.member.completed_packet_count
(gauge)
hazelcast.member.completed_partition_specific_runnable_count
(gauge)
hazelcast.member.completed_runnable_count
(gauge)
hazelcast.member.completed_task_count
(gauge)
hazelcast.member.completed_tasks
(gauge)
hazelcast.member.completed_total_count
(gauge)
hazelcast.member.connection_listener_count
(gauge)
hazelcast.member.count
(gauge)
hazelcast.member.created_count
(gauge)
hazelcast.member.daemon_thread_count
(gauge)
hazelcast.member.delayed_execution_count
(gauge)
hazelcast.member.destroyed_count
(gauge)
hazelcast.member.destroyed_group_ids
(gauge)
hazelcast.member.elapsed_destination_commit_time
(gauge)
hazelcast.member.elapsed_migration_operation_time
(gauge)
hazelcast.member.elapsed_migration_time
(gauge)
hazelcast.member.error_count
(gauge)
hazelcast.member.event_count
(gauge)
hazelcast.member.event_queue_size
(gauge)
hazelcast.member.events_processed
(gauge)
hazelcast.member.exception_count
(gauge)
hazelcast.member.failed_backups
(gauge)
hazelcast.member.frames_transceived
(gauge)
hazelcast.member.free_heap
(gauge)
hazelcast.member.free_memory
(gauge)
hazelcast.member.free_native
(gauge)
hazelcast.member.free_physical
(gauge)
hazelcast.member.free_physical_memory_size
(gauge)
hazelcast.member.free_space
(gauge)
hazelcast.member.free_swap_space_size
(gauge)
hazelcast.member.generic_priority_queue_size
(gauge)
hazelcast.member.generic_queue_size
(gauge)
hazelcast.member.generic_thread_count
(gauge)
hazelcast.member.groups
(gauge)
hazelcast.member.heartbeat_broadcast_period_millis
(gauge)
hazelcast.member.heartbeat_packets_received
(gauge)
hazelcast.member.heartbeat_packets_sent
(gauge)
hazelcast.member.idle_time_millis
(gauge)
hazelcast.member.idle_time_ms
(gauge)
hazelcast.member.imbalance_detected_count
(gauge)
hazelcast.member.in_progress_count
(gauge)
hazelcast.member.invocation_scan_period_millis
(gauge)
hazelcast.member.invocation_timeout_millis
(gauge)
hazelcast.member.invocations.last_call_id
(gauge)
hazelcast.member.invocations.pending
(gauge)
hazelcast.member.invocations.used_percentage
(gauge)
hazelcast.member.io_thread_id
(gauge)
hazelcast.member.last_heartbeat
(gauge)
hazelcast.member.last_repartition_time
(gauge)
hazelcast.member.listener_count
(gauge)
hazelcast.member.loaded_classes_count
(gauge)
hazelcast.member.local_clock_time
(gauge)
hazelcast.member.local_partition_count
(gauge)
hazelcast.member.major_count
(gauge)
hazelcast.member.major_time
(gauge)
hazelcast.member.max_backup_count
(gauge)
hazelcast.member.max_cluster_time_diff
(gauge)
hazelcast.member.max_file_descriptor_count
(gauge)
hazelcast.member.max_heap
(gauge)
hazelcast.member.max_memory
(gauge)
hazelcast.member.max_metadata
(gauge)
hazelcast.member.max_native
(gauge)
hazelcast.member.maximum_pool_size
(gauge)
hazelcast.member.member_groups_size
(gauge)
hazelcast.member.migration_active
(gauge)
hazelcast.member.migration_completed_count
(gauge)
hazelcast.member.migration_queue_size
(gauge)
hazelcast.member.minor_count
(gauge)
hazelcast.member.minor_time
(gauge)
hazelcast.member.missing_members
(gauge)
hazelcast.member.nodes
(gauge)
hazelcast.member.normal_frames_read
(gauge)
hazelcast.member.normal_frames_written
(gauge)
hazelcast.member.normal_pending_count
(gauge)
hazelcast.member.normal_timeouts
(gauge)
hazelcast.member.open_file_descriptor_count
(gauge)
hazelcast.member.opened_count
(gauge)
hazelcast.member.operation_timeout_count
(gauge)
hazelcast.member.owner_id
(gauge)
hazelcast.member.park_queue_count
(gauge)
hazelcast.member.partition_thread_count
(gauge)
hazelcast.member.peak_thread_count
(gauge)
hazelcast.member.planned_migrations
(gauge)
hazelcast.member.pool_size
(gauge)
hazelcast.member.priority_frames_read
(gauge)
hazelcast.member.priority_frames_transceived
(gauge)
hazelcast.member.priority_frames_written
(gauge)
hazelcast.member.priority_pending_count
(gauge)
hazelcast.member.priority_queue_size
(gauge)
hazelcast.member.priority_write_queue_size
(gauge)
hazelcast.member.process_count
(gauge)
hazelcast.member.process_cpu_load
(gauge)
hazelcast.member.process_cpu_time
(gauge)
hazelcast.member.proxy_count
(gauge)
hazelcast.member.publication_count
(gauge)
hazelcast.member.queue_capacity
(gauge)
hazelcast.member.queue_size
(gauge)
hazelcast.member.rejected_count
(gauge)
hazelcast.member.remaining_queue_capacity
(gauge)
hazelcast.member.replica_sync_requests_counter
(gauge)
hazelcast.member.replica_sync_semaphore
(gauge)
hazelcast.member.response_queue_size
(gauge)
hazelcast.member.responses.backup_count
(gauge)
hazelcast.member.responses.error_count
(gauge)
hazelcast.member.responses.missing_count
(gauge)
hazelcast.member.responses.normal_count
(gauge)
hazelcast.member.responses.timeout_count
(gauge)
hazelcast.member.retry_count
(gauge)
hazelcast.member.rollback_count
(gauge)
hazelcast.member.running_count
(gauge)
hazelcast.member.running_generic_count
(gauge)
hazelcast.member.running_partition_count
(gauge)
hazelcast.member.scheduled
(gauge)
hazelcast.member.selector_i_o_exception_count
(gauge)
hazelcast.member.selector_rebuild_count
(gauge)
hazelcast.member.selector_recreate_count
(gauge)
hazelcast.member.size
(gauge)
hazelcast.member.start_count
(gauge)
hazelcast.member.started_migrations
(gauge)
hazelcast.member.sync_delivery_failure_count
(gauge)
hazelcast.member.system_cpu_load
(gauge)
hazelcast.member.system_load_average
(gauge)
hazelcast.member.task_queue_size
(gauge)
hazelcast.member.terminated_raft_node_group_ids
(gauge)
hazelcast.member.text_count
(gauge)
hazelcast.member.thread_count
(gauge)
hazelcast.member.total_completed_migrations
(gauge)
hazelcast.member.total_elapsed_destination_commit_time
(gauge)
hazelcast.member.total_elapsed_migration_operation_time
(gauge)
hazelcast.member.total_elapsed_migration_time
(gauge)
hazelcast.member.total_failure_count
(gauge)
hazelcast.member.total_loaded_classes_count
(gauge)
hazelcast.member.total_memory
(gauge)
hazelcast.member.total_parked_operation_count
(gauge)
hazelcast.member.total_physical
(gauge)
hazelcast.member.total_physical_memory_size
(gauge)
hazelcast.member.total_registrations
(gauge)
hazelcast.member.total_space
(gauge)
hazelcast.member.total_started_thread_count
(gauge)
hazelcast.member.total_swap_space_size
(gauge)
hazelcast.member.unknown_count
(gauge)
hazelcast.member.unknown_time
(gauge)
hazelcast.member.unloaded_classes_count
(gauge)
hazelcast.member.uptime
(gauge)
hazelcast.member.usable_space
(gauge)
hazelcast.member.used_heap
(gauge)
hazelcast.member.used_memory
(gauge)
hazelcast.member.used_metadata
(gauge)
hazelcast.member.used_native
(gauge)
hazelcast.member.write_queue_size
(gauge)
hazelcast.multimap.local_backup_count
(gauge)
Backup count
hazelcast.multimap.local_backup_entry_count
(gauge)
Backup entry count
hazelcast.multimap.local_backup_entry_memory_cost
(gauge)
Backup entry cost
hazelcast.multimap.local_creation_time
(gauge)
Creation time
hazelcast.multimap.local_event_operation_count
(gauge)
Event count
hazelcast.multimap.local_get_operation_count
(gauge)
Get operation count
hazelcast.multimap.local_hits
(gauge)
Hits
hazelcast.multimap.local_last_access_time
(gauge)
Last access time
hazelcast.multimap.local_last_update_time
(gauge)
Last update time
hazelcast.multimap.local_locked_entry_count
(gauge)
Locked entry count
hazelcast.multimap.local_max_get_latency
(gauge)
Max get latency
hazelcast.multimap.local_max_put_latency
(gauge)
Max put latency
hazelcast.multimap.local_max_remove_latency
(gauge)
Max remove latency
hazelcast.multimap.local_other_operation_count
(gauge)
Other (keySet,entrySet etc..) operation count
hazelcast.multimap.local_owned_entry_count
(gauge)
Owned entry count
hazelcast.multimap.local_owned_entry_memory_cost
(gauge)
Owned entry memory cost
hazelcast.multimap.local_put_operation_count
(gauge)
Put operation count
hazelcast.multimap.local_remove_operation_count
(gauge)
Remove operation count
hazelcast.multimap.local_total
(gauge)
Total operation count
hazelcast.multimap.local_total_get_latency
(gauge)
Total get latency
hazelcast.multimap.local_total_put_latency
(gauge)
Total put latency
hazelcast.multimap.local_total_remove_latency
(gauge)
Total remove latency
hazelcast.multimap.size
(gauge)
Size
hazelcast.reliabletopic.creation_time
(gauge)
Creation time
Shown as second
hazelcast.reliabletopic.publish_operation_count
(gauge)
Publish count
hazelcast.reliabletopic.receive_operation_count
(gauge)
Receive count
hazelcast.replicatedmap.local_creation_time
(gauge)
Creation time
hazelcast.replicatedmap.local_event_operation_count
(gauge)
Event count
hazelcast.replicatedmap.local_get_operation_count
(gauge)
Get operation count
hazelcast.replicatedmap.local_hits
(gauge)
Hits
hazelcast.replicatedmap.local_last_access_time
(gauge)
Last access time
hazelcast.replicatedmap.local_last_update_time
(gauge)
Last update time
hazelcast.replicatedmap.local_max_get_latency
(gauge)
Max get latency
hazelcast.replicatedmap.local_max_put_latency
(gauge)
Max put latency
hazelcast.replicatedmap.local_max_remove_latency
(gauge)
Max remove latency
hazelcast.replicatedmap.local_other_operation_count
(gauge)
Other (keySet,entrySet etc..) operation count
hazelcast.replicatedmap.local_owned_entry_count
(gauge)
Owned entry count
hazelcast.replicatedmap.local_put_operation_count
(gauge)
Put operation count
hazelcast.replicatedmap.local_remove_operation_count
(gauge)
Remove operation count
hazelcast.replicatedmap.local_total
(gauge)
Total operation count
hazelcast.replicatedmap.local_total_get_latency
(gauge)
Total get latency
hazelcast.replicatedmap.local_total_put_latency
(gauge)
Total put latency
hazelcast.replicatedmap.local_total_remove_latency
(gauge)
Total remove latency
hazelcast.replicatedmap.size
(gauge)
Size
hazelcast.topic.creation_time
(gauge)
Creation time
Shown as second
hazelcast.topic.publish_operation_count
(gauge)
Publish count
hazelcast.topic.receive_operation_count
(gauge)
Receive count

Service Checks

hazelcast.can_connect
Returns CRITICAL if the Agent is unable to connect to Hazelcast, WARNING if no metrics are collected, and OK otherwise.
Statuses: ok, critical, warning

hazelcast.mc_cluster_state
Represents the state of the Hazelcast Management Center as indicated by its health check.
Statuses: ok, warning, critical

Troubleshooting

Need help? Contact Datadog support.

PREVIEWING: rtrieu/product-analytics-ui-changes