This check monitors Argo Rollouts through the Datadog Agent.
Follow the instructions below to install and configure this check for an Agent running in your Kubernetes environment. For more information about configuration in containerized environments, see the Autodiscovery Integration Templates.
Starting from Agent release 7.53.0, the Argo Rollouts check is included in the Datadog Agent package. No additional installation is needed in your environment.
This check is OpenMetrics-based and requires Python 3. It collects metrics from the OpenMetrics endpoint that Argo Rollouts exposes.
The Argo Rollouts controller exposes Prometheus-formatted metrics at `/metrics` on port `8090`. For the Agent to start collecting metrics, the Argo Rollouts pods need to be annotated. For more information about annotations, see the Autodiscovery Integration Templates. You can find additional configuration options in the sample argo_rollouts.d/conf.yaml.
Note: The listed metrics can only be collected if they are available. Some metrics are generated only when certain actions are performed. For example, the `argo_rollouts.rollout.info.replicas.updated` metric is exposed only after a replica is updated.
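Because some metrics appear only after the corresponding action has happened, it can help to check which metric names the controller currently exposes. The following is a minimal sketch, not part of the integration: it parses Prometheus text-format output and lists the metric names that have at least one sample. The `SAMPLE_METRICS` text and the raw metric names in it are illustrative assumptions; real output from the `/metrics` endpoint will differ.

```python
import re

# Illustrative sample of Prometheus text-format output (an assumption for this
# sketch); real output from the controller's /metrics endpoint will differ.
SAMPLE_METRICS = """\
# HELP rollout_info Information about rollout.
# TYPE rollout_info gauge
rollout_info{name="demo",namespace="default",phase="Healthy"} 1
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 42
"""


def exposed_metric_names(exposition_text: str) -> set[str]:
    """Return the names of metrics that have at least one sample line."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line starts with the metric name, optionally followed by labels.
        match = re.match(r"[a-zA-Z_:][a-zA-Z0-9_:]*", line)
        if match:
            names.add(match.group(0))
    return names


names = exposed_metric_names(SAMPLE_METRICS)
print("rollout_info" in names)                   # present in the sample
print("rollout_info_replicas_updated" in names)  # absent from the sample
```

A metric that is missing from this set has not been emitted by the controller yet, so the Agent has nothing to collect for it.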
The only parameter required for configuring the Argo Rollouts check is:
- `openmetrics_endpoint`: This parameter should be set to the location where the Prometheus-formatted metrics are exposed. The default port is `8090`. In containerized environments, `%%host%%` should be used for host autodetection.
```yaml
apiVersion: v1
kind: Pod
# (...)
metadata:
  name: '<POD_NAME>'
  annotations:
    ad.datadoghq.com/argo-rollouts.checks: |
      {
        "argo_rollouts": {
          "init_config": {},
          "instances": [
            {
              "openmetrics_endpoint": "http://%%host%%:8090/metrics"
            }
          ]
        }
      }
    # (...)
spec:
  containers:
    - name: 'argo-rollouts'
# (...)
```
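If the annotation is generated by tooling rather than written by hand, its value can be produced programmatically. The sketch below is one possible approach, assuming the container is named `argo-rollouts` as in the manifest above; `build_checks_annotation` is a hypothetical helper, not part of any Datadog library.

```python
import json


def build_checks_annotation(container_name: str, port: int = 8090) -> dict[str, str]:
    """Build the Autodiscovery checks annotation for the Argo Rollouts check.

    Hypothetical helper for illustration only.
    """
    # %%host%% is left as a literal template variable for the Agent to resolve.
    endpoint = f"http://%%host%%:{port}/metrics"
    check_config = {
        "argo_rollouts": {
            "init_config": {},
            "instances": [{"openmetrics_endpoint": endpoint}],
        }
    }
    key = f"ad.datadoghq.com/{container_name}.checks"
    return {key: json.dumps(check_config, indent=2)}


annotation = build_checks_annotation("argo-rollouts")
for key, value in annotation.items():
    print(key)
    print(value)
```

The container name in the annotation key has to match the `name` of the container in the pod spec. Serializing with `json.dumps` avoids hand-editing mistakes such as trailing commas.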
Available for Agent versions >6.0
Argo Rollouts logs can be collected from the different Argo Rollouts pods through Kubernetes. Collecting logs is disabled by default in the Datadog Agent. To enable it, see Kubernetes Log Collection.
See the Autodiscovery Integration Templates for guidance on applying the parameters below.
Parameter | Value |
---|---|
<LOG_CONFIG> | {"source": "argo_rollouts", "service": "<SERVICE_NAME>"} |
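As a rough illustration of how the `<LOG_CONFIG>` value might be applied, the sketch below builds a logs annotation for a pod. It assumes the annotation key follows the same `ad.datadoghq.com/<CONTAINER_NAME>.logs` pattern as the checks annotation and that the value is a JSON list of log configurations; confirm both against the Autodiscovery Integration Templates. `build_logs_annotation` is a hypothetical helper.

```python
import json


def build_logs_annotation(container_name: str, service_name: str) -> dict[str, str]:
    """Build a logs annotation for an Argo Rollouts container.

    Hypothetical helper; the key pattern and list wrapping are assumptions.
    """
    log_config = [{"source": "argo_rollouts", "service": service_name}]
    key = f"ad.datadoghq.com/{container_name}.logs"
    return {key: json.dumps(log_config)}


annotation = build_logs_annotation("argo-rollouts", "<SERVICE_NAME>")
for key, value in annotation.items():
    print(f"{key}: '{value}'")
```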
Run the Agent’s status subcommand and look for `argo_rollouts` under the Checks section.
Metric (type) | Description |
---|---|
argo_rollouts.analysis.run.info (gauge) | Information about analysis run |
argo_rollouts.analysis.run.metric.phase (gauge) | Information on the duration of a specific metric in the Analysis Run |
argo_rollouts.analysis.run.metric.type (gauge) | Information on the type of a specific metric in the Analysis Runs |
argo_rollouts.analysis.run.phase (gauge) | Information on the state of the Analysis Run |
argo_rollouts.analysis.run.reconcile.bucket (count) | The number of observations in the Analysis Run reconciliation performance histogram by upper_bound buckets |
argo_rollouts.analysis.run.reconcile.count (count) | The number of observations in the Analysis Run reconciliation performance histogram |
argo_rollouts.analysis.run.reconcile.error.count (count) | Error occurring during the analysis run |
argo_rollouts.analysis.run.reconcile.sum (count) | The duration sum of all observations in the Analysis Run reconciliation performance histogram |
argo_rollouts.controller.clientset.k8s.request.count (count) | The total number of Kubernetes requests executed during application reconciliation |
argo_rollouts.experiment.info (gauge) | Information about Experiment |
argo_rollouts.experiment.phase (gauge) | Information on the state of the experiment |
argo_rollouts.experiment.reconcile.bucket (count) | The number of observations in the Experiments reconciliation performance histogram by upper_bound buckets |
argo_rollouts.experiment.reconcile.count (count) | The number of observations in the Experiments reconciliation performance histogram |
argo_rollouts.experiment.reconcile.error.count (count) | Error occurring during the experiment |
argo_rollouts.experiment.reconcile.sum (count) | The duration sum of all observations in the Experiments reconciliation performance histogram |
argo_rollouts.go.gc.duration.seconds.count (count) | The summary count of garbage collection cycles in the Argo Rollouts instance. Shown as second. |
argo_rollouts.go.gc.duration.seconds.quantile (gauge) | A summary of the pause duration of garbage collection cycles in the Argo Rollouts instance. Shown as second. |
argo_rollouts.go.gc.duration.seconds.sum (count) | The sum of the pause duration of garbage collection cycles in the Argo Rollouts instance. Shown as second. |
argo_rollouts.go.goroutines (gauge) | The number of goroutines that currently exist in the Argo Rollouts instance |
argo_rollouts.go.info (gauge) | Metric containing the Go version as a tag |
argo_rollouts.go.memstats.alloc_bytes (gauge) | The number of bytes allocated and still in use in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.alloc_bytes.count (count) | The monotonic count of bytes allocated and still in use in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.buck_hash.sys_bytes (gauge) | The number of bytes used by the profiling bucket hash table in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.frees.count (count) | The total number of frees in the Argo Rollouts instance |
argo_rollouts.go.memstats.gc.cpu_fraction (gauge) | The fraction of this program's available CPU time used by the GC since the program started in the Argo Rollouts instance. Shown as fraction. |
argo_rollouts.go.memstats.gc.sys_bytes (gauge) | The number of bytes used for garbage collection system metadata in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.heap.alloc_bytes (gauge) | The number of heap bytes allocated and still in use in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.heap.idle_bytes (gauge) | The number of heap bytes waiting to be used in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.heap.inuse_bytes (gauge) | The number of heap bytes that are in use in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.heap.objects (gauge) | The number of allocated objects in the Argo Rollouts instance. Shown as object. |
argo_rollouts.go.memstats.heap.released_bytes (gauge) | The number of heap bytes released to the OS in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.heap.sys_bytes (gauge) | The number of heap bytes obtained from system in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.lookups.count (count) | The number of pointer lookups |
argo_rollouts.go.memstats.mallocs.count (count) | The number of mallocs |
argo_rollouts.go.memstats.mcache.inuse_bytes (gauge) | The number of bytes in use by mcache structures in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.mcache.sys_bytes (gauge) | The number of bytes used for mcache structures obtained from system in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.mspan.inuse_bytes (gauge) | The number of bytes in use by mspan structures in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.mspan.sys_bytes (gauge) | The number of bytes used for mspan structures obtained from system in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.next.gc_bytes (gauge) | The number of heap bytes when next garbage collection takes place in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.other.sys_bytes (gauge) | The number of bytes used for other system allocations in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.stack.inuse_bytes (gauge) | The number of bytes in use by the stack allocator in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.stack.sys_bytes (gauge) | The number of bytes obtained from system for stack allocator in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.memstats.sys_bytes (gauge) | The number of bytes obtained from system in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.go.threads (gauge) | The number of OS threads created in the Argo Rollouts instance. Shown as thread. |
argo_rollouts.notification.send.bucket (count) | The number of observations in the Notification send performance histogram by upper_bound buckets |
argo_rollouts.notification.send.count (count) | The number of observations in the Notification send performance histogram |
argo_rollouts.notification.send.sum (count) | The duration sum of all observations in the Notification send performance histogram |
argo_rollouts.process.cpu.seconds.count (count) | The total user and system CPU time spent in seconds in the Argo Rollouts instance. Shown as second. |
argo_rollouts.process.max_fds (gauge) | The maximum number of open file descriptors in the Argo Rollouts instance |
argo_rollouts.process.open_fds (gauge) | The number of open file descriptors in the Argo Rollouts instance |
argo_rollouts.process.resident_memory.bytes (gauge) | The resident memory size in bytes in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.process.start_time.seconds (gauge) | The start time of the process since unix epoch in seconds in the Argo Rollouts instance. Shown as second. |
argo_rollouts.process.virtual_memory.bytes (gauge) | The virtual memory size in bytes in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.process.virtual_memory.max_bytes (gauge) | The maximum amount of virtual memory available in bytes in the Argo Rollouts instance. Shown as byte. |
argo_rollouts.rollout.events.count (count) | The count of rollout events |
argo_rollouts.rollout.info (gauge) | Information about rollout |
argo_rollouts.rollout.info.replicas.available (gauge) | The number of available replicas per rollout |
argo_rollouts.rollout.info.replicas.desired (gauge) | The number of desired replicas per rollout |
argo_rollouts.rollout.info.replicas.unavailable (gauge) | The number of unavailable replicas per rollout |
argo_rollouts.rollout.info.replicas.updated (gauge) | The number of updated replicas per rollout |
argo_rollouts.rollout.phase (gauge) | Information on the state of the rollout. Argo Rollouts plans to deprecate this metric; use argo_rollouts.rollout.info instead |
argo_rollouts.rollout.reconcile.bucket (count) | The number of observations in the Rollout reconciliation performance histogram by upper_bound buckets |
argo_rollouts.rollout.reconcile.count (count) | The number of observations in the Rollout reconciliation performance histogram |
argo_rollouts.rollout.reconcile.error.count (count) | Error occurring during the rollout |
argo_rollouts.rollout.reconcile.sum (count) | The duration sum of all observations in the Rollout reconciliation performance histogram |
argo_rollouts.workqueue.adds.count (count) | The total number of adds handled by workqueue |
argo_rollouts.workqueue.depth (gauge) | The current depth of the workqueue |
argo_rollouts.workqueue.longest.running_processor.seconds (gauge) | The number of seconds the longest running workqueue processor has been running. Shown as second. |
argo_rollouts.workqueue.queue.duration.seconds.bucket (count) | The histogram bucket of how long in seconds an item stays in the workqueue before being requested. Shown as second. |
argo_rollouts.workqueue.queue.duration.seconds.count (count) | The total number of events in the workqueue duration histogram |
argo_rollouts.workqueue.queue.duration.seconds.sum (count) | The sum of the events counted in the workqueue duration histogram |
argo_rollouts.workqueue.retries.count (count) | The total number of retries handled by workqueue |
argo_rollouts.workqueue.unfinished_work.seconds (gauge) | The number of seconds of work that has been done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases. Shown as second. |
argo_rollouts.workqueue.work.duration.seconds.bucket (count) | The histogram bucket for time in seconds it takes for processing of an item in the workqueue. Shown as second. |
argo_rollouts.workqueue.work.duration.seconds.count (count) | The total number of events in the workqueue item processing duration histogram |
argo_rollouts.workqueue.work.duration.seconds.sum (count) | The sum of events in the workqueue item processing duration histogram |
The Argo Rollouts integration does not include any events.
argo_rollouts.openmetrics.health
Returns `CRITICAL` if the Agent is unable to connect to the Argo Rollouts OpenMetrics endpoint, otherwise returns `OK`.
Statuses: ok, critical
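When this service check reports `CRITICAL`, a quick manual probe of the endpoint can help narrow down whether the problem is connectivity. The sketch below only approximates the described behavior (reachable means `OK`, anything else means `CRITICAL`) and is not the Agent's implementation. The function name and the `localhost` URL are assumptions; substitute the address the Agent would actually use.

```python
import urllib.request


def probe_openmetrics_endpoint(url: str, timeout: float = 5.0) -> str:
    """Return "OK" if the endpoint answers with HTTP 200, otherwise "CRITICAL".

    Hypothetical helper that loosely mirrors the service check semantics.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return "OK" if response.status == 200 else "CRITICAL"
    except (OSError, ValueError):
        # OSError covers connection failures, timeouts, and HTTP error statuses;
        # ValueError covers malformed URLs.
        return "CRITICAL"


if __name__ == "__main__":
    # Assumed address; for example, after port-forwarding the controller's port 8090.
    print(probe_openmetrics_endpoint("http://localhost:8090/metrics"))
```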
Need help? Contact Datadog support.