Prometheus Monitoring
Overview
Prometheus is an open-source systems monitoring and alerting toolkit with a dimensional data model, flexible query language, efficient time series database, and modern alerting approach.
Prometheus collects metrics from monitored targets by scraping metrics HTTP endpoints on these targets. Since Prometheus also exposes data in the same manner about itself, it can also scrape and monitor its own health.
In Prometheus, a user configures service exporters to expose application metrics.
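To make the scraping model concrete, here is a minimal sketch of reading the text format that a metrics endpoint returns. The sample payload and the `parse_metrics` helper are illustrative assumptions, not part of the product; the parser is deliberately simplified and skips comment lines and anything it does not recognize.

```python
import re

# Illustrative payload in the Prometheus text exposition format (assumed sample data).
SAMPLE = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027
process_open_fds 12
"""

# metric_name, optional {labels}, then the sample value.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# label_name="label value" pairs inside the braces.
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition-format text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # simplified parser: ignore lines it cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    for sample in parse_metrics(SAMPLE):
        print(sample)
```

Each tuple carries the metric name, its label set, and the numeric value, which is the dimensional data model described above.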
In the new design, the monitor calls a single URL to fetch all standard metrics with the properties specified in the monitor configuration file, and one dedicated URL for each custom metric. For example, if a user provides five custom metrics and five standard metrics in a configuration file, the monitor invokes five URLs for the custom metrics and one URL for the standard metrics.
The monitor performs these calls on a single thread. Because this approach reduces the amount of data captured, the monitor captures and parses the responses faster.
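The URL arithmetic can be sketched as follows. This is only an illustration under stated assumptions: `build_metric_urls` is a hypothetical helper, and it assumes the monitor uses the Prometheus HTTP API instant-query endpoint (`/api/v1/query`) and combines the standard metrics with a regular-expression matcher on `__name__`. The product's actual request layout is not documented here.

```python
from urllib.parse import urlencode


def build_metric_urls(base_url, standard_metrics, custom_queries):
    """Return the URLs called in one collection cycle (hypothetical layout)."""
    base = base_url.rstrip("/")
    urls = []
    if standard_metrics:
        # Assumption: one request covers every standard metric via a name regex.
        selector = '{__name__=~"%s"}' % "|".join(standard_metrics)
        urls.append(f"{base}/api/v1/query?{urlencode({'query': selector})}")
    for query in custom_queries:
        # One dedicated request per custom metric.
        urls.append(f"{base}/api/v1/query?{urlencode({'query': query})}")
    return urls


if __name__ == "__main__":
    standard = [
        "container_memory_rss",
        "container_memory_cache",
        "container_fs_usage_bytes",
        "container_fs_limit_bytes",
        "container_last_seen",
    ]
    custom = [f"custom_metric_{i}" for i in range(5)]  # placeholder names
    urls = build_metric_urls("http://prometheus.example:9090", standard, custom)
    print(len(urls))  # five standard + five custom metrics -> 6 URLs
```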
Prometheus Configuration
To configure Prometheus, follow these steps:
1. Log in to the machine, go to the Product UI Home page, click the Monitors icon, and then click the Monitors menu item.

2. This displays the Monitor Group window. Select a ‘topology’ from the drop-down list and click a ‘monitor group’ name.

3. This displays a list of monitors. Expand the Prometheus group. The following monitors are displayed:
- ApplicationHealthStats
- KubernetesContainerNamespaceStats
- KubernetesContainerOverallStats
- KubernetesContainerStats
- KubernetesDeploymentStats
- KubernetesNodeStats

4. To configure any monitor, click the Configure Monitor icon.
5. This displays the Monitor Configuration window.

6. Provide the details for each field as follows:
- Exclude Tier: Select the tier to exclude from the drop-down list. Alternatively, type a regular expression that matches the tiers to exclude. Separate multiple regular expressions with commas.
- Instance Name: Type the instance name.
- Prometheus Base URL: Type the Prometheus base URL.
- User Name: Type the user name for basic HTTP authentication.
- Password: Type the password for basic HTTP authentication.
- Filter Settings: Select the filter type (None, Positive Filter, or Negative Filter) from the drop-down list, and then type the filter in the text box. Separate multiple filters with commas.
- Proxy Settings: Type the Proxy URL, Proxy User Name, and Proxy Password.
- Recovery Settings: Type the Retry Count. The default retry count is 60.
- Advanced Settings: Provide the following details under advanced settings.
  - Absolute Metrics Configuration File Path: Type the absolute path of the metrics configuration file. This is used to supply externally configured metrics.
  - Vector Persist Count: Type the count for which a vector is held when it frequently goes up and down. The default count is 5.
  - URL Response Timeout (Secs): Type the maximum time to wait for a response from the specified URL. The default value is 10 seconds.
  - Thread Pool Size: Type the number of threads used in parallel to get responses from Prometheus URLs. If the number of threads is zero, the size is calculated automatically based on the number of configured metric_url entries.



7. Click Add to add the monitor configuration.
8. To delete one or more configured monitors, click the Delete icon in the lower-right corner of the window.
9. Select the check box to enable the monitor and click Save.
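The comma-separated pattern fields and the automatic thread pool size in step 6 can be sketched as follows. Everything here is an assumption made for illustration: the helper names, the use of full-string regex matching, the reading of Positive Filter as "keep matches" and Negative Filter as "drop matches", and the cap of 8 threads are not taken from the product.

```python
import re


def split_patterns(raw):
    """Split a comma-separated settings value into trimmed, non-empty patterns."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def matches_any(name, raw_patterns):
    """True if the name fully matches at least one of the patterns."""
    return any(re.fullmatch(p, name) for p in split_patterns(raw_patterns))


def keep_by_filter(name, filter_type, raw_patterns):
    """Decide whether a name survives the filter (assumed semantics)."""
    if filter_type == "None":
        return True
    matched = matches_any(name, raw_patterns)
    if filter_type == "Positive Filter":
        return matched        # keep only names that match
    if filter_type == "Negative Filter":
        return not matched    # drop names that match
    raise ValueError(f"unknown filter type: {filter_type}")


def resolve_pool_size(configured, url_count, cap=8):
    """Use the configured size, or derive one from the URL count when it is zero."""
    if configured > 0:
        return configured
    # Assumption: one thread per URL, at least 1, bounded by a hypothetical cap.
    return max(1, min(url_count, cap))


if __name__ == "__main__":
    print(matches_any("web-tier", "web-.*, db-.*"))
    print(keep_by_filter("kube-system", "Negative Filter", "kube-.*"))
    print(resolve_pool_size(0, 6))
```

Splitting on commas means a pattern that itself contains a comma, such as a `{1,2}` quantifier, would be broken apart in this sketch.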

Prometheus Capabilities
ApplicationHealthStats
Metric | Metric Description |
---|---|
Prometheus Job Health Status | Health status of the Prometheus job (1-UP, 0-DOWN). |
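Prometheus itself records a built-in `up` series for every scrape target, with value 1 when the scrape succeeded and 0 when it failed. Assuming the health status above is read from that series (the source does not say so), a minimal sketch of decoding an instant-query response looks like this. The sample response and the `job_health` helper are illustrative.

```python
import json

# Assumed sample of an instant-query response for `up` from the Prometheus HTTP API.
SAMPLE_RESPONSE = """
{"status": "success",
 "data": {"resultType": "vector",
          "result": [
            {"metric": {"__name__": "up", "job": "prometheus", "instance": "localhost:9090"},
             "value": [1700000000.0, "1"]},
            {"metric": {"__name__": "up", "job": "node", "instance": "10.0.0.5:9100"},
             "value": [1700000000.0, "0"]}
          ]}}
"""


def job_health(response_text):
    """Map (job, instance) to 1 (UP) or 0 (DOWN) from an `up` query response."""
    body = json.loads(response_text)
    if body.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    health = {}
    for sample in body["data"]["result"]:
        labels = sample["metric"]
        _timestamp, raw_value = sample["value"]  # the value arrives as a string
        health[(labels["job"], labels["instance"])] = int(float(raw_value))
    return health


if __name__ == "__main__":
    for target, status in job_health(SAMPLE_RESPONSE).items():
        print(target, "UP" if status == 1 else "DOWN")
```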

KubernetesContainerNamespaceStats
Metric | Metric Description |
---|---|
Prometheus Kubernetes Namespaces | Number of Namespaces in Zone. |
Prometheus Kubernetes Pods | Number of Pods in Namespace. |
Prometheus Kubernetes Containers | Number of Containers in Namespace. |

KubernetesContainerOverallStats
Metric | Metric Description |
---|---|
Prometheus Kubernetes Regions | Number of Regions in Prometheus Host. |
Prometheus Kubernetes Zones | Number of Zones in Prometheus Host. |
Prometheus Kubernetes Nodes | Number of Nodes in Prometheus Host. |
Prometheus Kubernetes Namespaces | Number of Namespaces in Prometheus Host. |
Prometheus Kubernetes Pods | Number of Pods in Prometheus Host. |
Prometheus Kubernetes Containers | Number of Containers in Prometheus Host. |

KubernetesContainerStats
Metric | Metric Description |
---|---|
Container Enforcement Period Intervals | Number of elapsed enforcement period intervals. This metric is derived from prometheus metric ‘container_cpu_cfs_periods_total’. |
Container Throttled Count/Sec | Number of times per second that a task in this container has been throttled. This metric is derived from prometheus metric ‘container_cpu_cfs_throttled_periods_total’. |
Container Average Throttled Time (Sec) | Average time for which tasks in container have been throttled in seconds. This metric is derived from prometheus metric ‘container_cpu_cfs_throttled_seconds_total’. |
Container System CPU Time/Sec | Time spent by tasks of the container in kernel mode per second. This metric is derived from prometheus metric ‘container_cpu_system_seconds_total’. |
Container User CPU Time/Sec | Time spent by tasks of the container in user mode per second. This metric is derived from prometheus metric ‘container_cpu_user_seconds_total’. |
Container Total CPU Time (Sec) | Total CPU time consumed by all tasks in seconds. This metric is derived from prometheus metric ‘container_cpu_usage_seconds_total’. |
Container Available Inodes | Number of available Inodes. This metric is derived from prometheus metric ‘container_fs_inodes_free’. |
Container Total Inodes | Number of Inodes. This metric is derived from prometheus metric ‘container_fs_inodes_total’. |
Container Current IO Operations | Number of I/O operations currently in progress. This metric is derived from prometheus metric ‘container_fs_io_current’. |
Container Time Spent In IO Operations (Sec) | Time spent doing I/Os in seconds. This metric is derived from prometheus metric ‘container_fs_io_time_seconds_total’. |
Container Weighted IO Time (Sec) | Weighted I/O time in seconds. This metric is derived from prometheus metric ‘container_fs_io_time_weighted_seconds_total’. |
Container Filesystem Limit (MB) | Space that can be consumed by the container in filesystem in MegaBytes. This metric is derived from prometheus metric ‘container_fs_limit_bytes’. |
Container Disk Reads/Sec | Number of reads completed from disk per second. This metric is derived from prometheus metric ‘container_fs_reads_total’. |
Container Average Disk Read Time (Sec) | Average time in reading data from filesystem in seconds. This metric is derived from prometheus metric ‘container_fs_read_seconds_total’. |
Container Disk Merged Reads/Sec | Number of reads merged per second. This metric is derived from prometheus metric ‘container_fs_reads_merged_total’. |
Container Sector Reads/Sec | Number of sector reads completed per second. This metric is derived from prometheus metric ‘container_fs_sector_reads_total’. |
Container Sector Writes/Sec | Number of sector writes completed per second. This metric is derived from prometheus metric ‘container_fs_sector_writes_total’. |
Container Disk Writes/Sec | Number of writes completed to the filesystem per second. This metric is derived from prometheus metric ‘container_fs_writes_total’. |
Container Average Disk Write Time (Sec) | Average time in writing into filesystem in seconds. This metric is derived from prometheus metric ‘container_fs_write_seconds_total’. |
Container Disk Merged Writes/Sec | Number of writes merged per second. This metric is derived from prometheus metric ‘container_fs_writes_merged_total’. |
Container Filesystem Usage(MB) | Space consumed by the container on this filesystem in MegaBytes. This metric is derived from prometheus metric ‘container_fs_usage_bytes’. |
Container Cache Memory Size (MB) | Page cache memory in MegaBytes. This metric is derived from prometheus metric ‘container_memory_cache’. |
Container Memory Hits Limits | Number of memory usage hits limits. This metric is derived from prometheus metric ‘container_memory_failcnt’. |
Container Memory Allocation Failures/Sec | Number of memory allocation failures per second. This metric is derived from prometheus metric ‘container_memory_failures_total’. |
Container Resident Set Size (MB) | RSS (anonymous and swap cache memory) in MegaBytes. This metric is derived from prometheus metric ‘container_memory_rss’. |
Container Swaped Memory (MB) | Container swap usage in MegaBytes. This metric is derived from prometheus metric ‘container_memory_swap’. |
Container Used Memory (MB) | Total used memory in MegaBytes. This metric is derived from prometheus metric ‘container_memory_usage_bytes’. |
Container Memory Working Set (MB) | Current working set in MegaBytes. This metric is derived from prometheus metric ‘container_memory_working_set_bytes’. |
Container Memory Limit(MB) | Memory limit for the container in MegaBytes. This metric is derived from prometheus metric ‘container_spec_memory_limit_bytes’. |
Container Memory Swap Limit(MB) | Memory swap limit for the container in MegaBytes. This metric is derived from prometheus metric ‘container_spec_memory_swap_limit_bytes’. |
Container Network Received Throughput (Mbps) | Network received throughput in Megabits per second. This metric is derived from prometheus metric ‘container_network_receive_bytes_total’. |
Container Network Received Error/Sec | Number of errors encountered while receiving per second. This metric is derived from prometheus metric ‘container_network_receive_errors_total’. |
Container Network Received Packets Dropped/Sec | Number of packets dropped while receiving per second. This metric is derived from prometheus metric ‘container_network_receive_packets_dropped_total’. |
Container Network Received Packets/Sec | Number of packets received per second. This metric is derived from prometheus metric ‘container_network_receive_packets_total’. |
Container Network Transmitted Throughput (Mbps) | Network transmitted throughput in Megabits per second. This metric is derived from prometheus metric ‘container_network_transmit_bytes_total’. |
Container Network Transmitted Error/Sec | Number of errors encountered while transmitting per second. This metric is derived from prometheus metric ‘container_network_transmit_errors_total’. |
Container Network Transmitted Packets Dropped/Sec | Number of packets dropped while transmitting per second. This metric is derived from prometheus metric ‘container_network_transmit_packets_dropped_total’. |
Container Network Transmitted Packets/Sec | Number of packets transmitted per second. This metric is derived from prometheus metric ‘container_network_transmit_packets_total’. |
Container CPU Period | CPU period of the container. This metric is derived from prometheus metric ‘container_spec_cpu_period’. |
Container CPU Quota | CPU quota of the container. This metric is derived from prometheus metric ‘container_spec_cpu_quota’. |
Container CPU Share | CPU share of the container. This metric is derived from prometheus metric ‘container_spec_cpu_shares’. |
Container Start Time (Sec) | Start time of the container since unix epoch in seconds. This metric is derived from prometheus metric ‘container_start_time_seconds’. |
Container Last Seen (Sec) | Last time in seconds a container was seen by the exporter. This metric is derived from prometheus metric ‘container_last_seen’. |
Container Stopped State | Number of tasks in stopped state. This metric is derived from prometheus metric ‘container_tasks_state’. |
Container Sleeping State | Number of tasks in sleeping state. This metric is derived from prometheus metric ‘container_tasks_state’. |
Container IOwaiting State | Number of tasks in iowaiting state. This metric is derived from prometheus metric ‘container_tasks_state’. |
Container Uninterruptible State | Number of tasks in uninterruptible state. This metric is derived from prometheus metric ‘container_tasks_state’. |
Container Running State | Number of tasks in running state. This metric is derived from prometheus metric ‘container_tasks_state’. |
Container CPU Usage (Pct) | Container’s CPU utilization as a percentage. This metric is derived from the prometheus expression ‘rate(container_cpu_user_seconds_total[$(INTERVAL)])/(container_spec_cpu_quota/container_spec_cpu_period)’. |
Container Used Memory (Pct) | Container’s used memory as a percentage. This metric is derived from prometheus metric ‘container_memory_usage_bytes’. |
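Several metrics in this table are per-second rates of cumulative counters, byte values reported in MegaBytes, or ratios such as Container CPU Usage (Pct). The arithmetic can be sketched as below. The helper names are hypothetical, and the choices of 1 MB = 1024 × 1024 bytes, 1 Mbps = 10^6 bits per second, and multiplying the CPU ratio by 100 are assumptions. The two-sample rate is also a simplification of what the Prometheus `rate()` function computes over a range.

```python
def per_second_rate(v_start, v_end, t_start, t_end):
    """Average per-second increase of a counter between two samples."""
    if t_end <= t_start:
        raise ValueError("t_end must be later than t_start")
    delta = v_end - v_start
    if delta < 0:
        # Counter reset: treat the post-reset value as the whole increase.
        delta = v_end
    return delta / (t_end - t_start)


def cpu_usage_pct(user_cpu_rate, quota, period):
    """CPU seconds used per second, relative to the cores allowed by quota/period."""
    cores_allowed = quota / period
    return 100.0 * user_cpu_rate / cores_allowed


def bytes_to_mb(num_bytes):
    """Convert bytes to MegaBytes (assumed binary: 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


def bytes_per_sec_to_mbps(bytes_per_sec):
    """Convert a byte rate to Megabits per second (assumed decimal: 10**6 bits)."""
    return bytes_per_sec * 8 / 1_000_000


if __name__ == "__main__":
    # Counter grew from 100 s to 130 s of CPU time over a 60 s interval.
    rate = per_second_rate(100.0, 130.0, 0.0, 60.0)
    # Quota 200000 with period 100000 allows two cores.
    print(rate, cpu_usage_pct(rate, 200000, 100000))
    print(bytes_to_mb(536870912), bytes_per_sec_to_mbps(1_250_000))
```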

KubernetesDeploymentStats
Metric | Metric Description |
---|---|
Prometheus Kubernetes Pods Count | Number of Pods in Kubernetes Deployment. |
Prometheus Kubernetes Container Count | Number of Containers in Kubernetes Deployment. |

KubernetesNodeStats
Metric | Metric Description |
---|---|
Kubernetes Node Created Timestamp (ms) | Unix creation timestamp. This metric is derived from prometheus metric ‘kube_node_created’. |
Kubernetes Node Unschedulable | Whether a node is marked unschedulable. 1 = node cannot schedule new pods, 0 = node can schedule new pods. This metric is derived from prometheus metric ‘kube_node_spec_unschedulable’. |
Kubernetes Node Allocatable CPU Cores | The CPU resources of a node that are available for scheduling. This metric is derived from prometheus metric ‘kube_node_status_allocatable_cpu_cores’. |
Kubernetes Node Allocatable Memory (MB) | The memory resources of a node that are available for scheduling in MegaBytes. This metric is derived from prometheus metric ‘kube_node_status_allocatable_memory_bytes’. |
Kubernetes Node Allocatable Pods | The pod resources of a node that are available for scheduling. This metric is derived from prometheus metric ‘kube_node_status_allocatable_pods’. |
Kubernetes Node CPU Cores | The total CPU resources of the node. This metric is derived from prometheus metric ‘kube_node_status_capacity_cpu_cores’. |
Kubernetes Node Memory (MB) | The total memory resources of the node in MegaBytes. This metric is derived from prometheus metric ‘kube_node_status_capacity_memory_bytes’. |
Kubernetes Node Pods | The total pod resources of the node. This metric is derived from prometheus metric ‘kube_node_status_capacity_pods’. |