NetDiagnostics (ND) Agent Overhead – Test Report

This section analyzes the impact of installing the ND agent (latest release, 4.1.8) on a server in terms of memory and CPU utilization. The analysis was performed on comprehensive systems at both the client end and the Cavisson end.

Overview

Application performance monitoring (APM) works by tracking application-level and system-level metrics. These metrics include, but are not limited to:

  • System CPU Utilization
  • Memory Utilization
  • Network Throughput Used
  • Garbage Collection Stats
  • Method Timings
  • Method Hotspots

To monitor and capture these metrics, all APM solutions use agents. With an agent-based approach, it is important to ensure that the overhead and footprint of the APM tool are extremely low.

In several legacy products the overhead rises sharply as the number of monitors and APM features in use increases. This is not the case with Cavisson NetDiagnostics Enterprise.

While most legacy products claim an overhead of 2-5% (and only for the lite or limited version of the APM), Cavisson NetDiagnostics Enterprise offers minimal overhead, between 0 and 2%, even in its enterprise-ready state.

Cavisson Approach

Cavisson NetDiagnostics Enterprise (NDE) has been engineered to keep CPU and memory overhead minimal.

NDE uses two agents:

  • CavMon Agent for system-level and application-level monitoring such as:
    • CPU and Memory utilization
    • Disk Space Usage, Disk I/O
    • Garbage Collection Stats
  • NetDiagnostics (ND) Agent for code-level monitoring such as:
    • Method timing
    • Method Hotspots
    • DB Query
    • Business Transaction Monitoring, Flow paths

Cavisson NDE has been used at leading global retailers, banks, and network providers for years without causing any overhead issues in terms of CPU utilization.

Test Details

For the detailed analysis, three tests were performed:

Sr. No   Test Number   Scenario
Test 1   TR 9553       Monitoring the server with the ND agent in inactive mode.
Test 2   TR 9556       Monitoring the impact of the ND Agent on the server without ByteCode Instrumentation (no profiling done).
Test 3   TR 9557       Monitoring the impact of the ND Agent on the server with 10% ByteCode Instrumentation (profiles created).

Note:

  • The tests were conducted in the LT2 environment, where ND was installed across all JVMs on the application servers.

Server Configuration

Model name                    Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
L3 cache                      35840K
CPU MHz                       2593.993
CPU(s)                        8
Total Memory                  77485012 kB
No. of instances per server   6
GC settings                   UseParallelGC

Test Scenario

Load        50 Users
Ramp up     5 Minutes
Duration    30 Minutes
Ramp down   5 Minutes

In all the tests, 50 users were ramped up over 5 minutes on the dotcom channel, the load was held for a duration of 30 minutes, and the users were then ramped down over 5 minutes.
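
The load profile described above can be sketched as a simple function of elapsed time. This is an illustration only: the report does not say how users were added during ramp-up or removed during ramp-down, so the linear ramps below are an assumption, and the function name and parameters are hypothetical rather than part of any Cavisson tool.

```python
def active_users(minute, peak_users=50, ramp_up_min=5, steady_min=30, ramp_down_min=5):
    """Target number of virtual users at a given minute of the test.

    Assumes linear ramp-up and ramp-down (not stated in the report).
    """
    steady_end = ramp_up_min + steady_min      # minute 35: duration phase ends
    test_end = steady_end + ramp_down_min      # minute 40: test ends
    if minute < 0 or minute > test_end:
        return 0.0                             # outside the test window
    if minute < ramp_up_min:
        return peak_users * minute / ramp_up_min
    if minute <= steady_end:
        return float(peak_users)               # full load during the duration phase
    return peak_users * (test_end - minute) / ramp_down_min


# Print the target user count at a few points of the 40-minute run.
for m in (0, 2.5, 5, 20, 35, 37.5, 40):
    print(m, active_users(m))
```

Under these assumptions the full test runs for 40 minutes, and the 30-minute duration phase spans minutes 5 to 35.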

Comparison Approach

Comparison was made between Test 1, Test 2, and Test 3 to see the resource consumption within the duration phase (i.e. 30 minutes).
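
Restricting the comparison to the duration phase amounts to discarding samples collected during ramp-up and ramp-down before averaging. The following is a minimal sketch of that idea, assuming each sample is a (minute, value) pair measured from the start of the test. The sample values are made up for illustration and are not measurements from this report, and the function name is hypothetical.

```python
from statistics import mean

RAMP_UP_MIN = 5    # from the Test Scenario table
DURATION_MIN = 30  # from the Test Scenario table


def duration_phase_mean(samples):
    """Average only the samples taken during the 30-minute duration phase."""
    start = RAMP_UP_MIN
    end = RAMP_UP_MIN + DURATION_MIN
    window = [value for minute, value in samples if start <= minute <= end]
    if not window:
        raise ValueError("no samples fall inside the duration phase")
    return mean(window)


# Made-up (minute, value) samples, for illustration only.
samples = [(0, 1.0), (3, 5.0), (5, 20.0), (20, 30.0), (35, 40.0), (38, 2.0)]
print(duration_phase_mean(samples))  # averages 20.0, 30.0, and 40.0
```

Excluding the ramp phases keeps all three tests comparable, since each one is then averaged over the same window at the same user load.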

Observation

The observations for test metrics, CPU utilization, heap memory, and total number of GCs per minute are stated below.

Test Metrics

The following stats show a comparison of test metrics:

Test Results
Test Scenario                   TPS   ART (msecs)   Execute Thread total count
Test 1 (Without ND Agent)       74    695           28
Test 2 (With ND Agent)          79    560           30
Test 3 (ND Agent + Profiling)   80    547           30

CPU Utilization

The following stats show a comparison of CPU utilization on one of the application servers:

Test Results
Test Scenario                   Average CPU Consumption (%)   Overhead (%)
Test 1 (Without ND Agent)       23                            -
Test 2 (With ND Agent)          23                            0
Test 3 (ND Agent + Profiling)   24                            +1

Note: Similar behavior was observed on other App servers.
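
The overhead figures for CPU appear to be the difference between each test's average CPU consumption and that of Test 1, expressed in percentage points. A short sketch of that arithmetic, using the values reported for this server (the function and variable names are hypothetical):

```python
# Average CPU consumption (%) per test, as reported for one application server.
AVG_CPU_PCT = {
    "Test 1 (Without ND Agent)": 23,
    "Test 2 (With ND Agent)": 23,
    "Test 3 (ND Agent + Profiling)": 24,
}
BASELINE = "Test 1 (Without ND Agent)"


def cpu_overhead_points(results, baseline=BASELINE):
    """Overhead of each test over the baseline, in percentage points."""
    base = results[baseline]
    return {name: value - base for name, value in results.items() if name != baseline}


for name, delta in cpu_overhead_points(AVG_CPU_PCT).items():
    print(f"{name}: {delta:+d}")
```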

Heap Memory

The following stats show a comparison of heap memory consumed by one of the JVMs on the Dotcom application server:

Test Results
Test Scenario                        Heap Memory Consumption by the Dotcom JVMs (MB)   JVM Heap Used Memory Percent (Pct)   Overhead (%)
Test 1 (Without ND Agent)            2657                                              37                                   -
Test 2 (With ND Agent)               2995                                              40                                   3
Test 3 (With ND Agent + Profiling)   2960                                              41                                   4

Note:

  • The maximum increase for the ND-enabled tests, compared to Test 1 (without ND), was approximately 300 MB to 350 MB.
  • The same behavior was observed on the other JVMs.
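
The heap figures can be cross-checked with a short calculation. This sketch assumes the overhead column is the difference in heap-used percentage relative to Test 1, which is consistent with the reported values. The function and variable names are hypothetical.

```python
# (heap used in MB, heap used in percent) per test, as reported for one Dotcom JVM.
HEAP = {
    "Test 1 (Without ND Agent)": (2657, 37),
    "Test 2 (With ND Agent)": (2995, 40),
    "Test 3 (With ND Agent + Profiling)": (2960, 41),
}
BASELINE = "Test 1 (Without ND Agent)"


def heap_overhead(results, baseline=BASELINE):
    """Return {test: (extra MB, extra percentage points)} relative to the baseline."""
    base_mb, base_pct = results[baseline]
    return {
        name: (mb - base_mb, pct - base_pct)
        for name, (mb, pct) in results.items()
        if name != baseline
    }


for name, (extra_mb, extra_points) in heap_overhead(HEAP).items():
    print(f"{name}: +{extra_mb} MB, +{extra_points} points")
```

The computed differences are 338 MB and 303 MB, which fall within the approximate 300 MB to 350 MB range given in the note.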

Total Number of GC/min

The following stats show a comparison of the total number of GCs per minute for one of the JVMs on the Dotcom application server:

Test Results
Test Scenario                        Total Number of GC/min on Dotcom JVMs   Percentage Change (%)
Test 1 (Without ND Agent)            0.450                                   -
Test 2 (With ND Agent)               0.350                                   -22.0
Test 3 (With ND Agent + Profiling)   0.400                                   -11.0
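
The percentage change is the relative difference in GC rate against Test 1. A minimal sketch of that calculation follows, with hypothetical names. The exact results are about -22.2% and -11.1%, so the -22.0 and -11.0 in the table appear to have been rounded down to whole percentages.

```python
def percent_change(baseline, measured):
    """Relative change of `measured` against `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (measured - baseline) / baseline * 100


# GC/min values reported for one Dotcom JVM.
BASELINE_GC_PER_MIN = 0.450                               # Test 1 (Without ND Agent)
print(round(percent_change(BASELINE_GC_PER_MIN, 0.350), 1))  # Test 2 (With ND Agent)
print(round(percent_change(BASELINE_GC_PER_MIN, 0.400), 1))  # Test 3 (With ND Agent + Profiling)
```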

Summary

As seen in the metrics above, the overhead of running the ND Agent for code-level monitoring is minimal:

  • No major impact on CPU utilization.
  • Heap memory utilization rose by about 3% with the ND Agent and about 4% with ND Agent + Profiling (roughly 300 to 350 MB in both cases).
  • The total number of GCs per minute decreased in the ND-enabled tests; the change is insignificant in terms of impact.