Apache Solr Monitoring

Overview

Apache Solr is an open source search server built on the Apache Lucene full-text search engine. Solr is a popular search platform for websites because it can index and search multiple sites and return recommendations for related content based on the taxonomy of the search query. It is also popular for enterprise search because it can index and search documents and email attachments. Solr is accessed over Hypertext Transfer Protocol (HTTP) and supports Extensible Markup Language (XML) for exchanging data.

Solr offers the following capabilities:
• Indexing in near real time
• Document parsing and indexing
• Designed for high-volume traffic
• Load-balanced querying

Solr Plugin / Element

Cache: Solr caches are associated with an Index Searcher, a particular ‘view’ of the index that does not change. As long as that Index Searcher is in use, every item in its cache remains valid and available for reuse. Unlike ordinary caches, Solr cached objects do not expire after a fixed period of time; they stay valid for as long as the Index Searcher is valid.
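
The searcher-scoped lifetime described above can be illustrated with a minimal Python sketch. This is not Solr's implementation: the names IndexSearcher, SearcherCache, and open_new_searcher are hypothetical and only model the idea that entries have no time-based expiry and are discarded together with the searcher that owns them.

```python
class SearcherCache:
    """A cache with no time-based expiry; it lives as long as its searcher."""

    def __init__(self):
        self._entries = {}
        self.lookups = 0
        self.hits = 0

    def get(self, key, compute):
        """Return the cached value for key, computing and storing it on a miss."""
        self.lookups += 1
        if key in self._entries:
            self.hits += 1
            return self._entries[key]
        value = compute(key)
        self._entries[key] = value
        return value


class IndexSearcher:
    """Hypothetical stand-in for a fixed view of the index."""

    def __init__(self, version):
        self.version = version
        self.cache = SearcherCache()  # owned by, and discarded with, this searcher


def open_new_searcher(old):
    """Simulate a commit: the new view of the index gets a fresh, empty cache."""
    return IndexSearcher(old.version + 1)


searcher = IndexSearcher(version=1)
searcher.cache.get("title:solr", lambda q: [3, 7, 9])  # miss: computed and stored
searcher.cache.get("title:solr", lambda q: [3, 7, 9])  # hit: reused, never expires
print(searcher.cache.hits, searcher.cache.lookups)     # 1 2

searcher = open_new_searcher(searcher)                 # the old cache is dropped
print(searcher.cache.lookups)                          # 0
```

The sketch omits autowarming, in which a new cache is pre-populated from the old one.
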
Types of Cache
1. Field Cache: The Field Cache is used for sorting (and in some cases faceting). This cache is not managed by Solr. It has no configuration options and cannot be autowarmed; it is initialized the first time it is used for each Searcher.

2. Field Value Cache: The Field Value Cache is similar to the low-level Field Cache except that it supports multiple values per document (either multivalued fields or fields with multiple terms due to tokenization). This cache is primarily used by faceting. The “keys” of the cache are field names, and the values are large data structures mapping docIds to values. If the Field Value Cache is not declared in solrconfig.xml, it is generated automatically with an initial size of 10, a maximum size of 10000, and no autowarming.

3. Query Result Cache: This cache stores ordered sets of document IDs – the top N results of a query ordered by some criteria. The memory usage for the query result cache is significantly less than that of the Filter Cache because it only stores document IDs that were returned to the user by the query.

4. Document Cache: The Document Cache stores Document objects that have been fetched from disk. Each Document object in the Document Cache contains a list of Field references. When enableLazyFieldLoading=true is set and a document cache exists, Document objects fetched from the Index Reader contain only the Fields specified in the “fl” (field list) parameter. All other Fields are marked as “LOAD_LAZY”. When there is a later cache hit on the same unique key, the Fields already loaded are used directly (if requested), while the Fields marked LOAD_LAZY are loaded lazily from the Index Reader, and the Document object then updates its references to the newly actualized Fields (which are no longer marked LOAD_LAZY). So with different “fl” parameters, the same Document object can be reused from the Document Cache, but the set of loaded Fields in that Document grows as the requested fields change.
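
The lazy-loading behavior described above can be sketched as follows. This is an illustrative Python model, not Solr code: STORED_FIELDS, CachedDocument, fetch, and the loads counter are hypothetical names, and the sketch assumes every requested field is stored for the document.

```python
LOAD_LAZY = object()  # sentinel marking a field that has not been read yet

# Hypothetical stored data standing in for the index: unique key -> stored fields.
STORED_FIELDS = {
    "doc1": {"id": "doc1", "title": "Solr caches", "body": "A long body"},
}


class CachedDocument:
    """Document whose fields outside the requested fl are loaded on demand."""

    def __init__(self, key, fl):
        self.key = key
        self.loads = 0  # number of field values read from the simulated index
        self.fields = {}
        for name in STORED_FIELDS[key]:
            self.fields[name] = self._load(name) if name in fl else LOAD_LAZY

    def _load(self, name):
        self.loads += 1
        return STORED_FIELDS[self.key][name]

    def get(self, fl):
        """Return the requested fields, actualizing any still marked LOAD_LAZY."""
        for name in fl:
            if self.fields[name] is LOAD_LAZY:
                self.fields[name] = self._load(name)
        return {name: self.fields[name] for name in fl}


document_cache = {}


def fetch(key, fl):
    """Reuse the cached Document for key, or create it with only fl loaded."""
    doc = document_cache.get(key)
    if doc is None:
        doc = document_cache[key] = CachedDocument(key, fl)
    return doc.get(fl)


print(fetch("doc1", ["id", "title"]))  # body stays LOAD_LAZY
print(fetch("doc1", ["id", "body"]))   # cache hit: only body is read now
print(document_cache["doc1"].loads)    # 3
```

The second call reuses the cached object and reads only the one field that was still marked LOAD_LAZY, which is why the load count is 3 rather than 4.
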

Update Handler: The UpdateHandler API allows you to specify a custom algorithm for determining how sequences of adds and deletes are processed by Solr. The UpdateHandler you wish to use can be configured in your solrconfig.xml, but implementing a new UpdateHandler is considered extremely advanced and is not recommended.

Query Handler: The Query Handler manages all query processing work. When Solr is used to import data, the standard approach is to define one query for the initial import, a second query to fetch the IDs of documents that have changed, and a third query to fetch the data that changed. This three-query approach is not very efficient, especially if you expect a large number of changes.

Graph Description

ApacheSolrCacheExtendedJMXStats

Metric | Description
Cache Lookups/Sec | Number of cache lookups per second.
Cache Hits/Sec | Number of cache hits per second.
Cache HitRatio | Percentage of lookups that were satisfied by the cache since the last reset.
Cache Inserts/Sec | Number of entries added to the cache per second.
Cache Evictions/Sec | Number of entries removed from the cache per second.
Number of Cache Entries | Number of entries in the cache.
Cache Warmup Time(ms) | Time taken, in milliseconds, to warm the cache since the last reset, that is, to regenerate items in the new searcher's cache from the old cache.
Total Cache Lookups | Number of cache lookups over the cache's lifetime.
Total Cache Hits | Number of lookups that were satisfied by the cache over its lifetime.
Total Cache HitRatio | Ratio of lookups that were satisfied by the cache over its lifetime (a number between 0 and 1, where 1 is ideal).
Total Cache Inserts | Number of entries added to the cache over its lifetime.
Total Cache Evictions | Number of entries removed from the cache over its lifetime.
LRU Cache Max Size | Maximum number of entries in the LRU cache.
LRU Cache Initial Size | Initial size of the LRU cache.
Concurrent LRU Cache Maximum Size | Maximum number of entries in the concurrent LRU cache.
Concurrent LRU Cache Initial Size | Initial number of entries in the concurrent LRU cache.
Concurrent LRU Cache Minimum Size | Minimum number of entries in the concurrent LRU cache. Once the number of entries reaches the maximum size, the cache tries to reduce it to this minimum.
Concurrent LRU Cache Acceptable Size | Fallback target size when the concurrent LRU cache removes old entries. The cache aims for the minimum size; if it cannot reach that, it tries to reduce the number of entries to at least this acceptable size.
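
How the maximum, minimum, and acceptable sizes interact can be illustrated with a simplified Python sketch. It is a toy model, not Solr's concurrent LRU cache: the class SizedLRUCache is hypothetical, and the rule that the first pass may only evict entries untouched since the previous sweep is an assumption made here so that the minimum size is not always reachable.

```python
import itertools


class SizedLRUCache:
    """Toy LRU cache using max, min, and acceptable size thresholds."""

    def __init__(self, max_size, min_size, acceptable_size):
        if not min_size <= acceptable_size <= max_size:
            raise ValueError("expected min_size <= acceptable_size <= max_size")
        self.max_size = max_size
        self.min_size = min_size
        self.acceptable_size = acceptable_size
        self.evictions = 0
        self._clock = itertools.count(1)  # logical time of each access
        self._last_sweep = 0
        self._entries = {}  # key -> (value, tick of last access)

    def __len__(self):
        return len(self._entries)

    def get(self, key):
        if key not in self._entries:
            return None
        value, _ = self._entries[key]
        self._entries[key] = (value, next(self._clock))
        return value

    def put(self, key, value):
        self._entries[key] = (value, next(self._clock))
        if len(self._entries) > self.max_size:
            self._sweep()

    def _evict_oldest(self, target, only_stale):
        """Evict least recently used entries until the target size is reached."""
        oldest_first = sorted(self._entries, key=lambda k: self._entries[k][1])
        for key in oldest_first:
            if len(self._entries) <= target:
                break
            if only_stale and self._entries[key][1] > self._last_sweep:
                continue  # touched since the last sweep: not a candidate
            del self._entries[key]
            self.evictions += 1

    def _sweep(self):
        # First pass (assumed rule): only entries untouched since the last
        # sweep are candidates, and the goal is min_size.
        self._evict_oldest(self.min_size, only_stale=True)
        # Fallback pass: if that was not enough, settle for acceptable_size.
        self._evict_oldest(self.acceptable_size, only_stale=False)
        self._last_sweep = next(self._clock)


cache = SizedLRUCache(max_size=10, min_size=4, acceptable_size=8)
for i in range(11):
    cache.put(f"k{i}", i)
print(len(cache))  # 8: nothing was stale yet, so only acceptable_size is reached

for i in range(11, 14):
    cache.put(f"k{i}", i)
print(len(cache))  # 4: enough stale entries existed to reach min_size
```
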

ApacheSolrCacheJMXStats

Metric | Description
Cache Lookups/Sec | Number of cache lookups per second.
Cache Hits/Sec | Number of cache hits per second.
Cache HitRatio | Percentage of lookups that were satisfied by the cache since the last reset.
Cache Inserts/Sec | Number of entries added to the cache per second.
Cache Evictions/Sec | Number of entries removed from the cache per second.
Number of Cache Entries | Number of entries in the cache.
Cache Warmup Time(ms) | Time taken, in milliseconds, to warm the cache since the last reset, that is, to regenerate items in the new searcher's cache from the old cache.
Total Cache Lookups | Number of cache lookups over the cache's lifetime.
Total Cache Hits | Number of lookups that were satisfied by the cache over its lifetime.
Total Cache HitRatio | Ratio of lookups that were satisfied by the cache over its lifetime (a number between 0 and 1, where 1 is ideal).
Total Cache Inserts | Number of entries added to the cache over its lifetime.
Total Cache Evictions | Number of entries removed from the cache over its lifetime.
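
The relationship between the per-second metrics and the lifetime totals can be sketched in Python. The source does not state how the monitor derives these values; the sketch assumes the common approach of polling cumulative counters twice and dividing the difference by the elapsed time. The names CacheSample and cache_rates, and all sample numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CacheSample:
    """One poll of the cumulative cache counters (hypothetical structure)."""
    timestamp: float  # seconds
    lookups: int
    hits: int
    inserts: int
    evictions: int


def cache_rates(prev, curr):
    """Derive per-second rates and hit ratios from two consecutive samples."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    lookups = curr.lookups - prev.lookups
    hits = curr.hits - prev.hits
    return {
        "lookups_per_sec": lookups / elapsed,
        "hits_per_sec": hits / elapsed,
        "inserts_per_sec": (curr.inserts - prev.inserts) / elapsed,
        "evictions_per_sec": (curr.evictions - prev.evictions) / elapsed,
        # Hit ratio over the interval; 0.0 when there were no lookups.
        "hit_ratio": hits / lookups if lookups else 0.0,
        # Hit ratio over the cache's lifetime so far.
        "total_hit_ratio": curr.hits / curr.lookups if curr.lookups else 0.0,
    }


prev = CacheSample(timestamp=0, lookups=1000, hits=800, inserts=200, evictions=50)
curr = CacheSample(timestamp=60, lookups=1600, hits=1340, inserts=260, evictions=80)
rates = cache_rates(prev, curr)
print(rates["lookups_per_sec"], rates["hit_ratio"], rates["total_hit_ratio"])
```

A steadily high evictions rate combined with a low hit ratio usually suggests that the cache is too small for the working set.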

ApacheSolrQueryHandlerJMXStats

Metric | Description
Query Handler Requests/Sec | Number of queries processed per second by the query handler.
Query Handler Errors/Sec | Number of errors encountered per second by the query handler while processing queries.
Query Handler Timeouts/Sec | Number of responses per second returned with partial results after query execution timed out.
Query Handler Total Time (Sec) | Sum of all query processing times, in seconds.
Query Handler Time Per Request (Sec) | Average time taken by the handler to process a request since the last reset, in seconds.
Query Handler Requests Since Reset/Sec | Average number of requests processed per second by the handler since the last reset.
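
The two since-reset averages in the table follow from the totals by simple division. The Python sketch below shows that arithmetic; the function name handler_averages and the sample values are hypothetical, and the monitor's actual computation is not documented in the source.

```python
def handler_averages(requests, total_time_sec, seconds_since_reset):
    """Return (average seconds per request, average requests per second)."""
    # Guard against division by zero right after a reset.
    time_per_request = total_time_sec / requests if requests else 0.0
    requests_per_sec = (
        requests / seconds_since_reset if seconds_since_reset > 0 else 0.0
    )
    return time_per_request, requests_per_sec


# 2400 requests took 60 seconds of processing in the 1200 seconds since reset.
print(handler_averages(2400, 60.0, 1200))  # (0.025, 2.0)
```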

ApacheSolrUpdateHandlerJMXStats

The Solr Update Handler monitor tracks how sequences of document adds, deletes, and commits are processed by Solr.

Metric | Description
Update Handler Commits/Sec | Number of commit operations per second. A commit operation makes document index changes visible to new search requests.
Update Handler Auto Commits/Sec | Number of automatic commit operations per second. An auto commit operation makes document index changes visible to new search requests.
Update Handler Optimizes Command Issued/Sec | Number of optimize operations per second performed by the update handler to merge internal data structures. For a large index, optimization takes some time to complete, but merging many small segment files into a larger one improves search performance.
Update Handler Rollbacks/Sec | Number of rollback commands executed successfully per second.
Update Handler Expunge Deletes/Sec | Number of expunge-deletes operations per second. Each operation merges segments that have more than 10% deleted documents.
Update Handler Pending Docs | Number of documents waiting to be committed.
Update Handler Doc Adds/Sec | Number of documents added per second.
Update Handler Doc Deletes By Id/Sec | Number of documents deleted from the index by ID per second.
Update Handler Doc Deletes By Query/Sec | Number of documents deleted from the index by a specified query per second.
Update Handler Errors/Sec | Number of errors per second while executing add, delete, commit, or rollback commands.
Total Update Handler Doc Adds | Total number of documents added since the last reset.
Total Update Handler Doc Deletes By Id | Total number of documents deleted from the index by the specified IDs since the last reset.
Total Update Handler Doc Deletes By Query | Total number of documents deleted from the index by the specified queries since the last reset.
Total Update Handler Errors | Cumulative number of errors for add and delete commands since the last reset.
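
To make the relationship between adds, commits, rollbacks, and pending documents concrete, here is a small Python model of the counters. It is a sketch under stated assumptions, not Solr's update handler: the class UpdateCounters is hypothetical, and it assumes that both a commit and a rollback reset the pending count while the cumulative add count keeps growing.

```python
class UpdateCounters:
    """Toy model of a few update handler counters (hypothetical)."""

    def __init__(self):
        self.adds = 0          # cumulative documents added
        self.commits = 0
        self.rollbacks = 0
        self.errors = 0
        self.docs_pending = 0  # added but not yet committed

    def add(self, doc_count=1):
        self.adds += doc_count
        self.docs_pending += doc_count

    def commit(self):
        # Assumption: a commit makes pending documents visible and clears them.
        self.commits += 1
        self.docs_pending = 0

    def rollback(self):
        # Assumption: a rollback discards pending documents.
        self.rollbacks += 1
        self.docs_pending = 0

    def record_error(self):
        self.errors += 1


counters = UpdateCounters()
counters.add(3)
print(counters.docs_pending)  # 3
counters.commit()
counters.add(2)
counters.rollback()
print(counters.adds, counters.commits, counters.rollbacks, counters.docs_pending)
# 5 1 1 0
```

In this model, a pending count that keeps growing between polls means documents are being added faster than they are committed.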