NetDiagnostics (ND) Agent Overhead – Test Report
This report analyzes the impact on a server, in terms of memory and CPU utilization, of installing the ND agent (latest release, 4.1.8). The analysis was performed on comprehensive systems at both the client end and the Cavisson end.
Overview
Application performance monitoring (APM) works by tracking application-level and system-level metrics. These metrics include, but are not limited to:
- System CPU Utilization
- Memory Utilization
- Network Throughput Used
- Garbage Collection Stats
- Method Timings
- Method Hotspots
To monitor and capture these metrics, all APM solutions use agents. With an agent-based approach, it is important to ensure that the overhead and footprint of the APM tool are extremely low.
In several legacy products, the overhead grows exponentially as the number of monitors and APM features in use increases. This is not the case with Cavisson NetDiagnostics Enterprise.
While most legacy products claim an overhead of 2-5% (and only for the lite or limited version of the APM), only Cavisson NetDiagnostics Enterprise offers minimal overhead, between 0 and 2%, even in its enterprise-ready state.
Cavisson Approach
Cavisson NetDiagnostics Enterprise (NDE) has been engineered to keep CPU and memory overhead minimal.
NDE uses two agents:
- CavMon Agent for system-level and application-level monitoring, such as:
  - CPU and memory utilization
  - Disk space usage and disk I/O
  - Garbage collection stats
- NetDiagnostics (ND) Agent for code-level monitoring, such as:
  - Method timing
  - Method hotspots
  - DB queries
  - Business transaction monitoring and flow paths
Cavisson NDE has been used for years at leading global retailers, banks, and network providers without causing any overhead issues in terms of CPU utilization.
Test Details
For the detailed analysis, three tests were performed:
Sr. No | Test Number | Scenario |
Test 1 | TR 9553 | Monitoring the server with the ND agent in inactive mode. |
Test 2 | TR 9556 | Monitoring the impact of the ND agent on the server without bytecode instrumentation (no profiling done). |
Test 3 | TR 9557 | Monitoring the impact of the ND agent on the server with 10% bytecode instrumentation (profiles created). |
Note:
- The tests were conducted in the LT2 environment, where ND was installed across all JVMs on the application servers.
Server Configuration
Model name | Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz |
L3 cache | 35840K |
CPU MHz | 2593.993 |
CPU(s) | 8 |
Total Memory | 77485012 kB |
No. of instances per server | 6 |
GC settings | UseParallelGC |
Test Scenario
Load | 50 Users |
Ramp up | 5 Minutes |
Duration | 30 Minutes |
Ramp down | 5 Minutes |
In all the tests, 50 users were ramped up into the system over 5 minutes for the dotcom channel, held at that load for a duration of 30 minutes, and then ramped down over 5 minutes.
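The load profile described above can be sketched as a simple function of elapsed time. This is an illustrative sketch only: the report does not say how users were added during ramp-up or removed during ramp-down, so the linear ramps and the function name `active_users` are assumptions.

```python
def active_users(t_min, peak=50, ramp_up=5.0, hold=30.0, ramp_down=5.0):
    """Return the target number of virtual users at minute t_min.

    Assumes linear ramp-up and ramp-down (not stated in the report).
    """
    total = ramp_up + hold + ramp_down
    if t_min < 0 or t_min > total:
        return 0
    if t_min < ramp_up:
        # Ramp-up phase: users grow linearly from 0 to peak.
        return round(peak * t_min / ramp_up)
    if t_min <= ramp_up + hold:
        # Duration phase: load is held at peak.
        return peak
    # Ramp-down phase: users fall linearly from peak to 0.
    return round(peak * (total - t_min) / ramp_down)
```

With the defaults, the function returns 50 users for any time between minute 5 and minute 35, which is the window the comparison uses.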
Comparison Approach
Test 1, Test 2, and Test 3 were compared on resource consumption during the duration phase only (i.e., the 30 minutes of constant load).
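Restricting the comparison to the duration phase amounts to averaging only the samples taken between the end of ramp-up and the start of ramp-down. The sketch below shows one way to do that. The sample format (pairs of minute and value) and the function name `duration_phase_mean` are hypothetical, since the report does not describe how the metrics were exported.

```python
from statistics import mean


def duration_phase_mean(samples, ramp_up_min=5.0, duration_min=30.0):
    """Average the values sampled inside the duration phase.

    samples: iterable of (minute_since_test_start, value) pairs.
    Samples taken during ramp-up or ramp-down are ignored.
    """
    start = ramp_up_min
    end = ramp_up_min + duration_min
    window = [value for minute, value in samples if start <= minute <= end]
    if not window:
        raise ValueError("no samples fall inside the duration phase")
    return mean(window)
```

For example, given hypothetical CPU samples `[(0, 5), (3, 10), (5, 22), (20, 24), (35, 23), (38, 12)]`, only the three samples at minutes 5, 20, and 35 contribute to the average.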
Observation
The observations for test metrics, CPU utilization, heap memory, and total number of GCs per minute are stated below.
Test Metrics
The following stats show a comparison of test metrics:
Test Results
Test Scenario | TPS | ART (msecs) | Execute Thread Total Count |
Test 1 (Without ND Agent) | 74 | 695 | 28 |
Test 2 (With ND Agent) | 79 | 560 | 30 |
Test 3 (ND Agent + Profiling) | 80 | 547 | 30 |
CPU Utilization
The following stats show a comparison of CPU utilization on one of the application servers:
Test Results
Test Scenario | Average CPU Consumption (%) | Overhead (%) |
Test 1 (Without ND Agent) | 23 | – |
Test 2 (With ND Agent) | 23 | 0 |
Test 3 (ND Agent + Profiling) | 24 | +1 |
Note: Similar behavior was observed on the other application servers.
Heap Memory
The following stats show a comparison of heap memory consumed by one of the JVMs on the Dotcom application server:
Test Results
Test Scenario | Heap Memory Consumption by the Dotcom JVMs (MB) | JVM Heap Used (%) | Overhead (percentage points) |
Test 1 (Without ND Agent) | 2657 | 37 | – |
Test 2 (With ND Agent) | 2995 | 40 | 3 |
Test 3 (With ND Agent + Profiling) | 2960 | 41 | 4 |
Note:
- Heap usage in the ND-enabled tests increased by roughly 300 MB to 350 MB compared with Test 1 (without ND).
- The same behavior was observed on the other JVMs.
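The heap figures above can be read in three ways: the absolute increase in MB, the change in heap-used percent (a difference in percentage points), and the relative change in MB against the baseline. The sketch below computes all three from the table values. The function name `heap_overhead` is illustrative and not part of any Cavisson tooling.

```python
def heap_overhead(baseline_mb, baseline_pct, test_mb, test_pct):
    """Return (extra MB, percentage-point change, relative % change in MB)."""
    delta_mb = test_mb - baseline_mb
    # Difference between two percentages, expressed in percentage points.
    delta_points = test_pct - baseline_pct
    # Relative growth of consumed heap compared with the baseline test.
    relative_pct = round(100.0 * delta_mb / baseline_mb, 1)
    return delta_mb, delta_points, relative_pct


print(heap_overhead(2657, 37, 2995, 40))  # Test 2 vs Test 1
print(heap_overhead(2657, 37, 2960, 41))  # Test 3 vs Test 1
```

The overhead column in the table corresponds to the second value (percentage points of heap used), which is a smaller number than the relative growth in consumed MB.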
Total Number of GC/min
The following stats show a comparison of the total number of GCs per minute for one of the JVMs on the Dotcom application server:
Test Results
Test Scenario | Total Number of GC/min on Dotcom JVMs | Percentage Change (%) |
Test 1 (Without ND Agent) | 0.450 | – |
Test 2 (With ND Agent) | 0.350 | -22.2 |
Test 3 (With ND Agent + Profiling) | 0.400 | -11.1 |
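The percentage change column is a relative change against the Test 1 baseline. A minimal sketch of that calculation follows. It works from the rounded GC/min values shown in the table, so its results may differ slightly from figures derived from unrounded measurements. The function name `percent_change` is illustrative.

```python
def percent_change(baseline, value):
    """Relative change of value against baseline, in percent (one decimal)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return round(100.0 * (value - baseline) / baseline, 1)


gc_test1 = 0.450  # Test 1 (without ND Agent)
print(percent_change(gc_test1, 0.350))  # Test 2 (with ND Agent)
print(percent_change(gc_test1, 0.400))  # Test 3 (with ND Agent + profiling)
```

The same helper applies to the TPS and ART columns of the test metrics table if a relative comparison is wanted there as well.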
Summary
As the metrics above show, the overhead of running the ND Agent for code-level monitoring is minimal:
- No major impact on CPU utilization.
- Heap memory utilization increased by around 3 to 4 percentage points (roughly 300 to 350 MB) in the tests with the ND Agent and with the ND Agent plus profiling.
- The total number of GCs per minute decreased in the ND-enabled tests; this change is insignificant in terms of impact.