GCP Monitoring

Overview

A common framework for running Google Cloud Platform (GCP) monitors is implemented and used by the Cavisson Application Agent. This framework captures and sends the attributes of one or more GCP services, as specified in each monitor's configuration.

GCP Monitoring

AgentCPULoadStats

Metric | Metric Description
Load Average Over 1 Minute | Average system load over the last 1 minute
Load Average Over 5 Minutes | Average system load over the last 5 minutes
Load Average Over 15 Minutes | Average system load over the last 15 minutes
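
For reference, the three load averages listed above can be read on a Unix-like host with Python's standard library. This is only an illustrative sketch: how the agent itself collects these values is not described here, and `read_load_averages` is a hypothetical helper name.

```python
import os


def read_load_averages():
    """Return the 1-, 5-, and 15-minute load averages keyed by metric name.

    Hypothetical helper; relies on os.getloadavg(), which is available
    on Unix-like systems only.
    """
    one, five, fifteen = os.getloadavg()
    return {
        "Load Average Over 1 Minute": one,
        "Load Average Over 5 Minutes": five,
        "Load Average Over 15 Minutes": fifteen,
    }


if __name__ == "__main__":
    for name, value in read_load_averages().items():
        print(f"{name}: {value:.2f}")
```

A load average that stays above the number of CPU cores usually means that processes are waiting for CPU time.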

AgentCPUUsageStats

Metric | Metric Description
CPU Usage (Pct) | CPU usage in percentage

AgentDiskStats

Metric | Metric Description
Disk Bytes Used (MB) | Disk bytes used in megabytes, by type of usage (free, used, and reserved) on the device
Average I/O Time (ms)/Sec | Average time in milliseconds that an I/O operation took to complete on the device, per second
Merged Operations/Sec | Number of merged operations per second, by direction (read and write) on the device
Operations/Sec | Number of disk operations per second, by direction (read and write) on the device
Operation Time (ms)/Sec | Total time in milliseconds spent in disk operations (read and write) on the device, per second
Average Pending Operations | Number of pending operations on the device
Disk Used (Pct) | Percentage of disk used, by type of usage (free, used, and reserved) on the device
Bytes Read/Sec | Disk bytes read on the device per second
Weighted I/O Time (ms)/Sec | Weighted time in milliseconds spent on I/O operations on the device, per second
Disk Bytes Transferred Kilobytes/Sec | Disk bytes transferred, in kilobytes per second
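
The relationship between the byte-based and percentage-based disk metrics can be sketched as follows. The helper names `disk_used_pct` and `bytes_to_mb` are hypothetical, and the use of 1 MB = 1024 * 1024 bytes is an assumption; the agent's actual conversion is not stated here.

```python
import shutil


def bytes_to_mb(num_bytes):
    """Convert bytes to megabytes (assumes 1 MB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


def disk_used_pct(used_bytes, total_bytes):
    """Return used space as a percentage of total space."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


if __name__ == "__main__":
    # shutil.disk_usage returns a named tuple (total, used, free) in bytes.
    usage = shutil.disk_usage("/")
    print(f"Disk Bytes Used (MB): {bytes_to_mb(usage.used):.1f}")
    print(f"Disk Used (Pct): {disk_used_pct(usage.used, usage.total):.1f}")
```

On many file systems, used plus free is less than total because some blocks are reserved, which is why reserved space appears as its own usage type.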

BigQueryDatasetStats

Metric | Metric Description
BigQuery Stored bytes/Sec | Number of bytes stored per second
BigQuery Table count | Number of tables
BigQuery Uploaded bytes/min | Uploaded bytes per minute
BigQuery Uploaded rows/min | Uploaded rows per minute

BigQueryGlobalStats

Metric | Metric Description
BigQuery Queries Count | Number of in-flight queries
BigQuery Scanned Bytes/Min | Number of scanned bytes per minute
BigQuery Scanned Bytes Billed/Min | Number of scanned bytes billed per minute
BigQuery Slots Used By Project | Number of slots used by the project
BigQuery Total Slots | Total number of BigQuery slots available for the project
BigQuery Average Query Execution Time (Sec) | Average query execution time in seconds
BigQuery 5th Percentile Query Execution Time (Sec) | 5th percentile of the time taken to process a query, in seconds
BigQuery 50th Percentile Query Execution Time (Sec) | 50th percentile of the time taken to process a query, in seconds
BigQuery 95th Percentile Query Execution Time (Sec) | 95th percentile of the time taken to process a query, in seconds
BigQuery 99th Percentile Query Execution Time (Sec) | 99th percentile of the time taken to process a query, in seconds
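
To illustrate what the percentile metrics above mean, the following sketch computes percentiles from a list of raw samples using the nearest-rank method. The function name `percentile` and the sample values are hypothetical, and the values the monitor reports may be derived differently (for example, estimated from histogram buckets).

```python
import math


def percentile(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest 1-based rank covering pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical query execution times in seconds.
    execution_times = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100,
                       110, 120, 130, 140, 150, 160, 170, 180, 190, 200]
    for pct in (5, 50, 95, 99):
        print(f"{pct}th percentile: {percentile(execution_times, pct)} sec")
```

The 50th percentile is the median, while the 95th and 99th percentiles describe the slowest queries and are typically more useful than the average for spotting latency outliers.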

BigtableClusterStats

Metric | Metric Description
Bigtable Cluster Byte Used | Amount of compressed data for tables stored in a cluster
BigTable Cluster Node Count | Number of nodes in a cluster
Bigtable Cluster Disk Load | Utilization of HDD disks in a cluster
Bigtable Cluster CPU Load Hottest Node | CPU load of the busiest node in a cluster
BigTable Cluster CPU Load | CPU load of a cluster

BigtableTableStats

Metric | Metric Description
Bigtable Server Sent Bytes/Sec | Number of uncompressed bytes of response data sent by servers for a table, per second
Bigtable Server Returned Rows/Sec | Number of rows returned by server requests for a table, per second
Bigtable Server Requests/Sec | Number of server requests for a table, per second
Bigtable Server Received Bytes/Sec | Number of uncompressed bytes of request data received by servers for a table, per second
Bigtable Server Modified Rows/Sec | Number of rows modified by server requests for a table, per second
Bigtable Server Errors/Sec | Number of server requests for a table that failed with an error, per second
Bigtable Average Server Latency (ms) | Average server request latency for a table in milliseconds
Bigtable 5th Percentile Server Latency (ms) | 5th percentile server request latency for a table in milliseconds
Bigtable 50th Percentile Server Latency (ms) | 50th percentile server request latency for a table in milliseconds
Bigtable 95th Percentile Server Latency (ms) | 95th percentile server request latency for a table in milliseconds
Bigtable 99th Percentile Server Latency (ms) | 99th percentile server request latency for a table in milliseconds

CloudSqlStats

Metric | Metric Description
CloudSql Reserved Cores | Number of cores reserved for the database
CloudSql CPU Usage Time (Millisecond/Sec) | CPU usage time in milliseconds per second
CloudSql Reserved CPU Utilization (Pct) | Fraction of the reserved CPU that is currently in use, in percentage
CloudSql Bytes Used | Data utilization in bytes
CloudSql Disk Quota (GB) | Maximum data disk size in gigabytes
CloudSql Read Operations/Sec | Number of data disk read I/O operations per second
CloudSql Reserved Disk Utilization (Pct) | Fraction of the disk quota that is currently in use, in percentage
CloudSql Write Operations/Sec | Number of data disk write I/O operations per second
CloudSql Memory Quota (GB) | Maximum RAM size in gigabytes
CloudSql Memory Usage (MB) | RAM usage in megabytes
CloudSql Reserved Memory Utilization (Pct) | Fraction of the memory quota that is currently in use, in percentage
CloudSql Unflushed Pages In InnoDB Buffer Pool | Number of unflushed pages in the InnoDB buffer pool
CloudSql Unused Pages In InnoDB Buffer Pool | Number of unused pages in the InnoDB buffer pool
CloudSql Total Pages In InnoDB Buffer Pool | Total number of pages in the InnoDB buffer pool
CloudSql InnoDB Fsync() Calls/Sec | Number of InnoDB fsync() calls per second
CloudSql InnoDB Fsync() Calls To Log File/Sec | Number of InnoDB fsync() calls to the log file per second
CloudSql InnoDB Pages Read/Sec | Number of InnoDB pages read per second
CloudSql InnoDB Pages Written/Sec | Number of InnoDB pages written per second
CloudSql Queries Executed By Server/Sec | Number of statements executed by the server per second
CloudSql Statements Executed By Server Sent By The Client/Sec | Number of statements sent by the client and executed by the server per second
CloudSql Bytes Received By MySQL Process/Sec | Number of bytes received by the MySQL process per second
CloudSql Failover Operations Available On Master Instance | Number of failover operations available on the master instance
CloudSql Read Replica Is Behind Its Master (Sec) | Number of seconds the read replica is behind its master
CloudSql Bytes Sent By MySQL Process/Sec | Number of bytes sent by the MySQL process per second
CloudSql Connections To MySQL Instance | Number of connections to the Cloud SQL MySQL instance
CloudSql Bytes Received Through Network/Sec | Number of bytes received through the network per second
CloudSql Bytes Sent Count Through Network/Sec | Number of bytes sent through the network per second
CloudSql Cloud SQL PostgreSQL Instance Connections | Number of connections to the Cloud SQL PostgreSQL instance
CloudSql Transaction Count/Sec | Number of transactions in PostgreSQL per second
CloudSql Server UP Status | Indicates whether the server is up. On-demand instances are spun down if no connections are made for a sufficient amount of time
CloudSql Instance Running Time (Millisecond/Sec) | Total time in milliseconds the instance has been running, per second

ComputeEngineStats

Metric | Metric Description
Compute Engine Dropped Bytes/Sec | Number of incoming bytes dropped per second by the firewall
Compute Engine Dropped Packets/Sec | Number of incoming packets dropped per second by the firewall
Compute Engine Reserved Cores | Number of cores reserved on the host of the instance
Compute Engine Average CPU Usage (Sec) | Average CPU usage for all cores in seconds
Compute Engine Average CPU Utilization (Pct) | Fraction of the allocated CPU that is currently in use on the instance, in percentage
Compute Engine Disk Read Bytes/Sec | Number of bytes read per second from disk
Compute Engine Disk Read Operations/Sec | Number of disk read I/O operations per second
Compute Engine Throttled Read Bytes/Sec | Number of bytes per second in throttled read operations
Compute Engine Throttled Read Operations/Sec | Number of throttled read operations per second
Compute Engine Throttled Write Bytes/Sec | Number of bytes per second in throttled write operations
Compute Engine Throttled Write Operations/Sec | Number of throttled write operations per second
Compute Engine Disk Write Bytes/Sec | Number of bytes per second written to disk
Compute Engine Disk Write Operations/Sec | Number of disk write I/O operations per second
Compute Engine Received Bytes/Sec | Number of bytes received per second from the network
Compute Engine Received Packets/Sec | Number of packets received per second from the network
Compute Engine Sent Bytes/Sec | Number of bytes sent per second over the network
Compute Engine Sent Packets/Sec | Number of packets sent per second over the network
Compute Engine Uptime (Sec) | How long the VM has been running, in seconds
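
Many of the metrics above are reported as per-second rates. As an illustration only, one common way to turn two readings of a cumulative counter into such a rate is sketched below. Whether the agent derives its rates this way is an assumption, and `per_second_rate` is a hypothetical helper name.

```python
def per_second_rate(prev_value, prev_time, curr_value, curr_time):
    """Return the per-second rate between two cumulative counter samples.

    Times are in seconds. If the counter went backwards, it is assumed to
    have been reset to zero between the two samples.
    """
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("curr_time must be later than prev_time")
    delta = curr_value - prev_value
    if delta < 0:
        # Counter reset: only the growth since the reset is known.
        delta = curr_value
    return delta / elapsed


if __name__ == "__main__":
    # Hypothetical samples: 1000 bytes sent at t=0 s, 4000 bytes at t=60 s.
    rate = per_second_rate(1000, 0.0, 4000, 60.0)
    print(f"Sent Bytes/Sec: {rate}")
```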

DataProcHdfsYarnCustomStats

Metric | Metric Description
Yarn Heap Memory Usage Committed (MB) | Amount of heap memory in megabytes reserved in cluster
Yarn Heap Memory Usage Max (MB) | Maximum amount of heap memory in megabytes that can be used in cluster
Yarn Heap Memory Used (MB) | Amount of heap memory used in megabytes in cluster
Yarn Non Heap Memory Usage Committed (MB) | Amount of non-heap memory in megabytes reserved in cluster
Yarn Non Heap Memory Usage Max (MB) | Maximum amount of non-heap memory in megabytes that can be used in cluster
Yarn Non Heap Memory Usage Used (MB) | Amount of non-heap memory used in megabytes in cluster
Yarn Number Of Active NMs | Number of active NodeManagers (NMs) in cluster
Yarn Allocated Memory (MB) | Amount of memory allocated to YARN in megabytes on cluster
Yarn Application Submitted/Sec | Number of applications submitted on cluster per second
Yarn Available Memory (MB) | Amount of available memory in megabytes on cluster
Yarn Pending Containers | Number of pending containers on cluster
Yarn Pending Memory (MB) | Amount of pending memory in megabytes on cluster
Yarn Reserved Memory (MB) | Amount of reserved memory in megabytes on cluster
HDFS Heap Memory Usage Committed (MB) | Amount of heap memory in megabytes reserved for the Hadoop Distributed File System (HDFS) in cluster
HDFS Heap Memory Usage Max (MB) | Maximum amount of heap memory in megabytes that can be used by HDFS in cluster
HDFS Heap Memory Usage Used (MB) | Amount of heap memory in megabytes used by HDFS in cluster
HDFS Non Heap Memory Usage Committed (MB) | Amount of non-heap memory in megabytes reserved for HDFS in cluster
HDFS Non Heap Memory Usage Max (MB) | Maximum amount of non-heap memory in megabytes that can be used by HDFS in cluster
HDFS Non Heap Memory Usage Used (MB) | Amount of non-heap memory in megabytes used by HDFS in cluster
HDFS Capacity Remaining (GB) | Remaining HDFS capacity in gigabytes in cluster
HDFS Capacity Total (GB) | Total HDFS capacity in gigabytes in cluster
HDFS Capacity Used (GB) | Used HDFS capacity in gigabytes in cluster
HDFS Total Files | Number of files in HDFS in cluster

DataProcHdfsYarnStats

Metric | Metric Description
DataProc Yarn Apps Failed/Sec | Number of YARN applications that failed in cluster per second
DataProc Yarn Containers Allocated | Number of YARN containers allocated to cluster
DataProc Yarn Allocated Memory (MB) | YARN memory allocated to cluster, in megabytes
DataProc Yarn Vcores pending | Number of YARN virtual cores pending in cluster
DataProc Yarn Apps Killed/Sec | Number of YARN applications killed in cluster per second
DataProc Yarn Nodes Lost | Number of YARN nodes lost in cluster
DataProc Yarn Nodes Decommissioned/Sec | Number of YARN nodes decommissioned in cluster per second
DataProc Yarn Nodes Unhealthy | Number of unhealthy YARN nodes in cluster
DataProc Yarn Available Memory (MB) | YARN memory available in cluster, in megabytes
DataProc Yarn Containers Reserved | Number of YARN containers reserved in cluster
DataProc Yarn Pending Memory (MB) | YARN memory pending in cluster, in megabytes
DataProc Yarn Nodes Rebooted/Sec | Number of YARN nodes rebooted in cluster per second
DataProc Yarn Total Memory (MB) | Total YARN memory in cluster, in megabytes
DataProc Yarn Apps Completed/Sec | Number of YARN applications completed in cluster per second
DataProc Yarn Containers Pending | Number of YARN containers pending in cluster
DataProc Yarn Apps Running | Number of YARN applications running in cluster
DataProc Yarn Vcores Allocated | Number of YARN virtual cores allocated to cluster
DataProc Yarn Vcores Reserved | Number of YARN virtual cores reserved by cluster
DataProc Yarn Reserved Memory (MB) | YARN memory reserved by cluster, in megabytes
DataProc Yarn Total Vcores | Total number of YARN virtual cores available for cluster
DataProc Yarn Vcores Available | Number of YARN virtual cores available in cluster
DataProc Yarn Apps Submitted/Sec | Number of YARN applications submitted in cluster per second
DataProc Yarn Nodes Active | Number of active YARN nodes in cluster
DataProc Yarn Apps Pending | Number of YARN applications pending in cluster
DataProc Dfs Capacity Used (Bytes) | Distributed file system capacity used in bytes
DataProc Dfs Capacity Present (Bytes) | Distributed file system capacity present in bytes
DataProc Dfs Nodes Decommissioned/Sec | Number of distributed file system nodes decommissioned in cluster per second
DataProc Dfs Capacity Remaining (Bytes) | Distributed file system capacity remaining in bytes
DataProc Dfs Nodes Decommissioning | Number of distributed file system nodes being decommissioned in cluster
DataProc Dfs Blocks Missing | Number of distributed file system blocks missing in cluster
DataProc Dfs Blocks Pending Deletion | Number of distributed file system blocks pending deletion from cluster
DataProc Dfs Total Capacity (Bytes) | Distributed file system total capacity in bytes
DataProc Dfs Nodes running | Number of distributed file system nodes running in cluster
DataProc Dfs Blocks Under Replication | Number of distributed file system blocks under replication in cluster
DataProc Dfs Blocks Missing Replication | Number of distributed file system blocks missing replication in cluster
DataProc Dfs Blocks Corrupt/Sec | Number of corrupt distributed file system blocks in cluster per second

DataProcJobsStats

Metric | Metric Description
DataProc Jobs Failed/Sec | Number of jobs that failed in cluster per second
DataProc Operations Failed/Sec | Number of operations that failed in cluster per second
DataProc Jobs Running | Number of jobs running in cluster
DataProc Operation Running | Number of operations running in cluster
DataProc Jobs Submitted | Number of jobs submitted in cluster for execution
DataProc Operation Submitted | Number of operations submitted in cluster for execution
DataProc 5th Percentile Job Completion Time (ms) | 5th percentile of the time in milliseconds that jobs took to complete, from the time the user submits a job to the time Dataproc reports it as completed
DataProc 50th Percentile Job Completion Time (ms) | 50th percentile of the time in milliseconds that jobs took to complete, from the time the user submits a job to the time Dataproc reports it as completed
DataProc 95th Percentile Job Completion Time (ms) | 95th percentile of the time in milliseconds that jobs took to complete, from the time the user submits a job to the time Dataproc reports it as completed
DataProc 99th Percentile Job Completion Time (ms) | 99th percentile of the time in milliseconds that jobs took to complete, from the time the user submits a job to the time Dataproc reports it as completed
DataProc 5th Percentile Job Time Taken (ms) | 5th percentile of the time in milliseconds that jobs have spent in a given state
DataProc 50th Percentile Job Time Taken (ms) | 50th percentile of the time in milliseconds that jobs have spent in a given state
DataProc 95th Percentile Job Time Taken (ms) | 95th percentile of the time in milliseconds that jobs have spent in a given state
DataProc 99th Percentile Job Time Taken (ms) | 99th percentile of the time in milliseconds that jobs have spent in a given state
DataProc 5th Percentile Operation Completion Time (ms) | 5th percentile of the time in milliseconds that operations took to complete, from the time the user submits an operation to the time Dataproc reports it as completed
DataProc 50th Percentile Operation Completion Time (ms) | 50th percentile of the time in milliseconds that operations took to complete, from the time the user submits an operation to the time Dataproc reports it as completed
DataProc 95th Percentile Operation Completion Time (ms) | 95th percentile of the time in milliseconds that operations took to complete, from the time the user submits an operation to the time Dataproc reports it as completed
DataProc 99th Percentile Operation Completion Time (ms) | 99th percentile of the time in milliseconds that operations took to complete, from the time the user submits an operation to the time Dataproc reports it as completed
DataProc 5th Percentile Operation Time Taken (ms) | 5th percentile of the time in milliseconds that operations have spent in a given state
DataProc 50th Percentile Operation Time Taken (ms) | 50th percentile of the time in milliseconds that operations have spent in a given state
DataProc 95th Percentile Operation Time Taken (ms) | 95th percentile of the time in milliseconds that operations have spent in a given state
DataProc 99th Percentile Operation Time Taken (ms) | 99th percentile of the time in milliseconds that operations have spent in a given state

DataflowStats

Metric | Metric Description
Dataflow Job Current Active vCPUs | Number of vCPUs currently being used by this Dataflow job
Dataflow Job Data Watermark Age (Sec) | Age of the most recent item of data that has been fully processed by the pipeline, in seconds
Dataflow Job Elapsed Time (Min) | Duration that the current run of this pipeline has been in the Running state so far, in minutes
Dataflow Job Elements | Number of elements added to the PCollection (multi-element data set) so far
Dataflow Job Estimated Element Size (KB) | Estimated size (in kilobytes) of the elements added to the PCollection (multi-element data set) so far. Dataflow calculates the average encoded size of elements in a PCollection and multiplies it by the number of elements
Dataflow Job System Lag (Sec) | Current maximum duration that an item of data has been awaiting processing, in seconds
Dataflow Job Total vCPU | Total vCPU seconds used by this Dataflow job
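
The estimated element size described above is the average encoded element size multiplied by the number of elements. A minimal sketch of that arithmetic follows. The function name, the sampled sizes, and the use of 1 KB = 1024 bytes are assumptions made for illustration and are not taken from Dataflow itself.

```python
def estimated_pcollection_size_kb(sampled_sizes_bytes, element_count):
    """Estimate total PCollection size in kilobytes.

    sampled_sizes_bytes: encoded sizes (in bytes) of a sample of elements.
    element_count: number of elements added to the PCollection so far.
    Assumes 1 KB = 1024 bytes.
    """
    if not sampled_sizes_bytes:
        raise ValueError("at least one sampled size is required")
    average_size = sum(sampled_sizes_bytes) / len(sampled_sizes_bytes)
    return average_size * element_count / 1024


if __name__ == "__main__":
    # Hypothetical sample: two elements of 512 and 1536 bytes, 4 elements total.
    print(estimated_pcollection_size_kb([512, 1536], 4))
```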

DatastoreStats

Metric | Metric Description
Datastore API Request/sec | Number of API calls per second
Datastore Index Writes/sec | Number of index writes per second
Datastore 5th Percentile Read Size Entities (Bytes) | 5th percentile of the size of read entities, in bytes
Datastore 50th Percentile Read Size Entities (Bytes) | 50th percentile of the size of read entities, in bytes
Datastore 95th Percentile Read Size Entities (Bytes) | 95th percentile of the size of read entities, in bytes
Datastore 99th Percentile Read Size Entities (Bytes) | 99th percentile of the size of read entities, in bytes
Datastore 5th Percentile Written Size Entities (Bytes) | 5th percentile of the size of written entities, in bytes
Datastore 50th Percentile Written Size Entities (Bytes) | 50th percentile of the size of written entities, in bytes
Datastore 95th Percentile Written Size Entities (Bytes) | 95th percentile of the size of written entities, in bytes
Datastore 99th Percentile Written Size Entities (Bytes) | 99th percentile of the size of written entities, in bytes

InstanceGroupStats

Metric | Metric Description
Running Instances | Number of running instances in an instance group
Terminated Instances | Number of terminated instances in an instance group

MySQLBackupStats

Metric | Metric Description
Full Backup Status | Full backup status (1 = backup service is active, 0 = backup service is inactive)
Incremental Backup Status | Incremental backup status (1 = backup service is active, 0 = backup service is inactive)
Full Backup Size (MB) | Total amount of space used by the full backup, in megabytes
Incremental Backup Size (MB) | Total amount of space used by the incremental backup, in megabytes

PubSubSubscriptionStats

Metric | Metric Description
PubSub Subscription Backlog Size (KB) | Total size of the unacknowledged messages (backlog messages) in a subscription, in kilobytes
PubSub Subscription Cost (KB) | Cost of operations per subscription, measured in kilobytes. This is used to measure utilization for quotas
PubSub Subscription Updates/Sec | Number of configuration changes per subscription per second, grouped by operation type and result
PubSub Subscription ModifyAckDeadline Message Operations/Sec | Number of modify acknowledgment deadline message operations per second, grouped by result
PubSub Subscription ModifyAckDeadline Requests/Sec | Number of modify acknowledgment deadline requests per second, grouped by result
PubSub Subscription Outstanding Push Messages | Number of messages delivered to a subscription’s push endpoint, but not yet acknowledged
PubSub Subscription Retained Acknowledged Messages | Number of acknowledged messages retained in a subscription
PubSub Subscription Unacknowledged Messages | Number of unacknowledged messages (backlog messages) in a subscription
PubSub Subscription Average Oldest Retained Acknowledged Message Age (Sec) | Average age (in seconds) of the oldest acknowledged message retained in a subscription
PubSub Subscription Average Oldest Unacknowledged Message Age (Sec) | Average age (in seconds) of the oldest unacknowledged message (backlog message) in a subscription
PubSub Subscription Acknowledged Message Operations/Sec | Number of acknowledge message operations per second, grouped by result
PubSub Subscription Acknowledge Requests/Sec | Number of acknowledge requests per second, grouped by result
PubSub Subscription Pull Operations/Sec | Number of pull message operations per second, grouped by result
PubSub Subscription Pull Requests/Sec | Number of pull requests per second, grouped by result
PubSub Subscription Push Requests/Sec | Number of push attempts per second, grouped by result
PubSub Subscription Average Retained Acknowledged Size (Bytes) | Average size in bytes of the acknowledged messages retained in a subscription
PubSub Subscription Streaming Pull Acknowledge Message Operations/Sec | Number of streaming pull acknowledge message operations per second, grouped by result
PubSub Subscription Streaming Pull Acknowledge Requests/Sec | Number of streaming pull requests per second with non-empty acknowledge IDs, grouped by result
PubSub Subscription Streaming Pull Message Operations/Sec | Number of streaming pull message operations per second, grouped by result
PubSub Subscription Streaming Pull Modify Acknowledged Deadline Message Operations/Sec | Number of streaming pull modify acknowledgment deadline operations per second, grouped by result
PubSub Subscription Streaming Pull Modify Acknowledged Deadline Requests/Sec | Number of streaming pull requests per second with non-empty modify acknowledgment deadline fields, grouped by result
PubSub Subscription Streaming Pull Responses/Sec | Number of streaming pull responses per second, grouped by result

PubSubTopicStats

Metric | Metric Description
PubSub Topic Byte Cost/Sec | Cost of operations per topic per second, measured in bytes. This is used to measure utilization for quotas
PubSub Topic Updates/Sec | Number of configuration changes per topic per second, grouped by operation type and result
PubSub Topic Publish Message Operations/Sec | Number of publish message operations per second, grouped by result
PubSub Topic Publish Message Send Request/Sec | Number of publish requests per second, grouped by result
PubSub Topic Publish Average Message Size 5th Percentile (Bytes) | 5th percentile of publish message size in bytes
PubSub Topic Publish Average Message Size 50th Percentile (Bytes) | 50th percentile of publish message size in bytes
PubSub Topic Publish Average Message Size 95th Percentile (Bytes) | 95th percentile of publish message size in bytes
PubSub Topic Publish Average Message Size 99th Percentile (Bytes) | 99th percentile of publish message size in bytes

StorageBucketStats

Metric | Metric Description
Storage bytes received over the network (KB) | Count of bytes received over the network in kilobytes, grouped by the API method (Write, Read, Delete, etc.) and response code
Storage bytes sent over the network (KB) | Count of bytes sent over the network in kilobytes, grouped by the API method (Write, Read, Delete, etc.) and response code
Storage Request Counts/Sec | Number of API calls per second, grouped by the API method (Write, Read, Delete, etc.) and response code
Storage Bucket Size (KB) | Total size of all objects in the bucket, in kilobytes

StoragePermissionStats

Metric | Metric Description
Cloud Storage Read Permission | Read permission status. 1 = Allowed, 0 = Not Allowed
Cloud Storage Write Permission | Write permission status. 1 = Allowed, 0 = Not Allowed

AutoScaler

Metric | Metric Description
Minimum Replicas | Minimum number of replicas
Maximum Replicas | Maximum number of replicas
Target CPU Utilization (Pct) | Target CPU utilization percentage
