NetForest – Alerts
The Alerts feature notifies a user by email whenever an issue occurs, for example a high page response time on a server or a high response time for a page.
Viewing Alerts
On clicking the Alert menu in the left navigation, the alerts generated on the system are displayed in the right pane.
Configuring Alerts
The user can configure alert rules as required. To do this, click the icon at the top right corner of the alert window.
The Alert Rules dialog box is displayed with the existing alert rules. The user can add, edit, or delete an alert rule using this dialog box. The following sections describe how to add, edit, and delete alert rules.
Adding an Alert Rule
To add an alert rule, click the Add button in the Alert Rules dialog box. The Add New Rule dialog box is displayed.
Here, the user needs to provide the following information to add a new rule:
- Specify the rule name.
- Choose the index pattern from the drop-down list.
- Specify the description of the rule.
- Select a rule type from the provided options. The remaining input fields depend on this selection. The inputs for each rule type are described in the following sections.
Frequency
The Frequency option matches when at least a certain number of events occur within a given time frame. The events may be counted on a per-query_key basis. On clicking the Frequency button, the following input fields are added to the dialog box.
- Specify the query for which the frequency is to be monitored.
- Specify the interval (in minutes) at which the status needs to be checked.
- Specify the number of events that triggers an alert.
- Specify the time frame within which that number of events must occur.
- Specify the type of environments that must be included.
- Specify the impact of the alert.
- Specify the urgency/priority of the alert.
- Specify the group name of the alert.
- Specify the summary of the alert.
- Specify the environment where the alert is generated.
- Specify the location where the alert is generated.
- Specify the IP address of the server where the alert is generated.
- To go back to the previous screen, click Back.
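The frequency check described above can be illustrated with a minimal sketch. This is not NetForest's implementation; the function name `frequency_matches` and its parameters are hypothetical, and the sketch assumes the event timestamps have already been fetched by the query.

```python
from collections import deque
from datetime import datetime, timedelta


def frequency_matches(timestamps, num_events, timeframe):
    """Return True if at least num_events timestamps fall within one timeframe."""
    window = deque()
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop events that are older than the time frame relative to the newest event.
        while ts - window[0] > timeframe:
            window.popleft()
        if len(window) >= num_events:
            return True
    return False


base = datetime(2024, 1, 1, 12, 0)
events = [base + timedelta(minutes=m) for m in (0, 1, 2, 30)]
print(frequency_matches(events, 3, timedelta(minutes=5)))
```

A per-query_key variant would keep one such window for each distinct value of the query key.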
Spike
The Spike option matches when the volume of events during a given time period is spike_height times larger or smaller than during the previous time period. It uses two sliding windows, called “reference” and “current”, to compare the reference and current frequency of events. It has some additional input fields compared to Frequency. On clicking the Spike button, the following input fields are added to the dialog box.
- Specify the query for which the spike is to be monitored.
- Specify the interval (in minutes) at which the status needs to be checked.
- Specify the number of events that triggers an alert.
- Specify the time frame within which that number of events must occur.
- Specify the Spike Height. It is the ratio of the number of events in the last time frame to the number in the previous time frame that, when reached, triggers an alert.
- Specify the Spike Type: ‘up’, ‘down’, or ‘both’. ‘Up’ means the rule matches only when the number of events in the current window is spike_height times higher than in the reference window. ‘Down’ means the rule matches only when the number of events in the reference window is spike_height times higher than in the current window. ‘Both’ matches in either case.
- Specify the time frame for the ‘reference’ and ‘current’ windows. The rule averages the rate of events over this time period. For example, hours: 1 means that the ‘current’ window spans from the present to one hour ago, and the ‘reference’ window spans from one hour ago to two hours ago. The rule is not active until the time elapsed since the first event is at least two time frames. This prevents an alert from being triggered before a baseline rate has been established.
- Specify the type of environments that must be included.
- Specify the impact of the alert.
- Specify the urgency/priority of the alert.
- Specify the group name of the alert.
- Specify the summary of the alert.
- Specify the environment where the alert is generated.
- Specify the location where the alert is generated.
- Specify the IP address of the server where the alert is generated.
- To go back to the previous screen, click Back.
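The comparison between the ‘reference’ and ‘current’ windows can be sketched as follows. The function name `spike_matches` is hypothetical, and treating an empty reference window as "no baseline yet" is an assumption made for this illustration rather than documented NetForest behaviour.

```python
def spike_matches(reference_count, current_count, spike_height, spike_type="both"):
    """Compare the event counts of the 'reference' and 'current' windows."""
    if reference_count == 0:
        # Assumption: without a baseline there is nothing to compare against.
        return False
    spiked_up = current_count >= reference_count * spike_height
    spiked_down = current_count * spike_height <= reference_count
    if spike_type == "up":
        return spiked_up
    if spike_type == "down":
        return spiked_down
    return spiked_up or spiked_down


# 10 events in the reference hour, 30 in the current hour, spike height 3.
print(spike_matches(10, 30, 3, "up"))
```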
Percentage
The Percentage option matches when the percentage of documents in the match bucket within a calculation window is higher or lower than a threshold. By default, the calculation window is buffer_time. On clicking Percentage, the following input fields are added to the dialog box.
- Specify the query for which the percentage is to be monitored.
- Specify the interval (in minutes) at which the status needs to be checked.
- Specify the number of events that triggers an alert.
- Specify the time frame within which that number of events must occur.
- Specify the type of document to search for.
- Specify the match terms that define a filter for the match bucket. This filter should match a subset of the documents returned by the main query filter.
- Specify the minimum and maximum percentage. An alert is triggered when the percentage of matching documents is less than the minimum percentage or greater than the maximum percentage.
- Specify the type of environments that must be included.
- Specify the impact of the alert.
- Specify the urgency/priority of the alert.
- Specify the group name of the alert.
- Specify the summary of the alert.
- Specify the environment where the alert is generated.
- Specify the location where the alert is generated.
- Specify the IP address of the server where the alert is generated.
- To go back to the previous screen, click Back.
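As a rough illustration of the percentage check, the sketch below computes the share of documents that fall in the match bucket and compares it with the minimum and maximum percentages. The function `percentage_matches` and the `in_match_bucket` predicate are hypothetical names, and returning no match for an empty calculation window is an assumption.

```python
def percentage_matches(documents, in_match_bucket, min_percentage=None, max_percentage=None):
    """Return True if the share of matching documents is outside the allowed range."""
    if not documents:
        return False  # assumption: nothing to evaluate in an empty window
    matching = sum(1 for doc in documents if in_match_bucket(doc))
    percentage = 100.0 * matching / len(documents)
    if min_percentage is not None and percentage < min_percentage:
        return True
    if max_percentage is not None and percentage > max_percentage:
        return True
    return False


docs = [{"status": 200}] * 8 + [{"status": 500}] * 2
is_error = lambda doc: doc["status"] >= 500
# 20% of the documents are errors, which exceeds a 10% maximum.
print(percentage_matches(docs, is_error, max_percentage=10))
```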
Whitelist
The Whitelist option compares a certain field to a whitelist and matches if the list does not contain the term. On clicking Whitelist, the following input fields are added to the dialog box.
- Specify the query for which the whitelist is to be monitored.
- Specify the interval (in minutes) at which the status needs to be checked.
- Specify the number of events that triggers an alert.
- Specify the time frame within which that number of events must occur.
- Specify the whitelist compare key, which is the name of the field to compare against the whitelist.
- Specify if the Ignore Null field is true or false. If it is true, the events without a compare_key field will not match.
- Specify the list of whitelisted values and/or a list of paths to flat files which contain the whitelisted values using ‘!file /path/to/file’.
- Specify the type of environments that must be included.
- Specify the impact of the alert.
- Specify the urgency/priority of the alert.
- Specify the group name of the alert.
- Specify the summary of the alert.
- Specify the environment where the alert is generated.
- Specify the location where the alert is generated.
- Specify the IP address of the server where the alert is generated.
- To go back to the previous screen, click Back.
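A minimal sketch of the whitelist comparison, including the Ignore Null setting, is shown below. The function name `whitelist_matches` is hypothetical. The behaviour when Ignore Null is false (events without the field count as a match) is an assumption inferred from the field description above.

```python
def whitelist_matches(event, compare_key, whitelist, ignore_null=True):
    """Return True if the event's compare_key value is not in the whitelist."""
    value = event.get(compare_key)
    if value is None:
        # With ignore_null=True, events without the field never match.
        # Assumption: with ignore_null=False, such events do match.
        return not ignore_null
    return value not in whitelist


allowed_users = {"alice", "bob"}
print(whitelist_matches({"user": "mallory"}, "user", allowed_users))
```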
Blacklist
The Blacklist option compares a certain field to a blacklist and matches if the list contains the term. On clicking Blacklist, the following input fields are added to the dialog box.
- Specify the query for which the blacklist is to be monitored.
- Specify the interval (in minutes) at which the status needs to be checked.
- Specify the number of events that triggers an alert.
- Specify the time frame within which that number of events must occur.
- Specify the blacklist compare key, which is the name of the field to compare against the blacklist. If the field is empty in an event, that event is ignored.
- Specify the list of blacklisted values and/or a list of paths to flat files which contain the blacklisted values using ‘!file /path/to/file’.
- Specify the type of environments that must be included.
- Specify the impact of the alert.
- Specify the urgency/priority of the alert.
- Specify the group name of the alert.
- Specify the summary of the alert.
- Specify the environment where the alert is generated.
- Specify the location where the alert is generated.
- Specify the IP address of the server where the alert is generated.
- To go back to the previous screen, click Back.
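The blacklist comparison can be sketched in the same spirit. Both function names are hypothetical. The helper `parse_value_lines` assumes that a flat file of blacklisted values holds one value per line, which the text above does not state explicitly.

```python
def parse_value_lines(lines):
    """Build a set of values from text lines (assumed format: one value per line)."""
    return {line.strip() for line in lines if line.strip()}


def blacklist_matches(event, compare_key, blacklist):
    """Return True if the event's compare_key value is in the blacklist."""
    value = event.get(compare_key)
    if value is None:
        return False  # events without the field are ignored
    return value in blacklist


blocked = parse_value_lines(["10.0.0.9\n", "\n", " 10.0.0.7 \n"])
print(blacklist_matches({"client_ip": "10.0.0.9"}, "client_ip", blocked))
```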
Change
The Change option monitors a certain field and matches if that field changes. On clicking Change, the following input fields are added to the dialog box.
- Specify the query for which the change is to be monitored.
- Specify the interval (in minutes) at which the status needs to be checked.
- Specify the number of events that triggers an alert.
- Specify the time frame within which that number of events must occur.
- Specify the change field. This monitors a certain field and matches if that field changes. The field must change with respect to the last event with the same query_key.
- Specify the compare key, which is the name of the field to monitor for changes. Since this is a list of strings, the user can specify multiple keys. An alert is triggered if any of the fields change.
- Specify if the Ignore Null field is true or false. If it is true, the events without a compare_key field will not be counted as changed. All the fields in compare_key are checked for this.
- Specify the query key. This rule is applied on a per-query_key basis, and this field must be present in all of the events that are checked.
- Specify the type of environments that must be included.
- Specify the impact of the alert.
- Specify the urgency/priority of the alert.
- Specify the group name of the alert.
- Specify the summary of the alert.
- Specify the environment where the alert is generated.
- Specify the location where the alert is generated.
- Specify the IP address of the server where the alert is generated.
- To go back to the previous screen, click Back.
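The per-query_key change detection can be sketched with a small class that remembers the last values seen for each query key. The class name `ChangeRule` and its method are hypothetical, and the handling of the first event for a key (no match, since there is nothing to compare with) is an assumption.

```python
class ChangeRule:
    """Track compare_key values per query_key and report changes."""

    def __init__(self, compare_keys, query_key, ignore_null=True):
        self.compare_keys = list(compare_keys)
        self.query_key = query_key
        self.ignore_null = ignore_null
        self._last_seen = {}  # query_key value -> tuple of last compared values

    def check(self, event):
        key = event[self.query_key]
        values = tuple(event.get(name) for name in self.compare_keys)
        if self.ignore_null and any(value is None for value in values):
            return False  # events missing a compared field are not counted as changed
        previous = self._last_seen.get(key)
        self._last_seen[key] = values
        # The first event for a key only establishes the baseline.
        return previous is not None and previous != values


rule = ChangeRule(compare_keys=["country"], query_key="user")
rule.check({"user": "a", "country": "IN"})
print(rule.check({"user": "a", "country": "US"}))
```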
Flatline
The Flatline option matches when the total number of events is under a given threshold for a time period. On clicking Flatline, the following input fields are added to the dialog box.
- Specify the query for which the flatline is to be monitored.
- Specify the interval (in minutes) at which the status needs to be checked.
- Specify the number of events that triggers an alert.
- Specify the time frame within which that number of events must occur.
- Specify the threshold, which is the minimum number of events for an alert not to be triggered.
- Specify the time frame within which fewer events than the threshold must occur for an alert to be triggered.
- Specify the type of environments that must be included.
- Specify the impact of the alert.
- Specify the urgency/priority of the alert.
- Specify the group name of the alert.
- Specify the summary of the alert.
- Specify the environment where the alert is generated.
- Specify the location where the alert is generated.
- Specify the IP address of the server where the alert is generated.
- To go back to the previous screen, click Back.
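The flatline condition is the inverse of the frequency condition: an alert is due when too few events arrive. A minimal sketch, with the hypothetical function name `flatline_matches` and an explicit `now` argument assumed for testability:

```python
from datetime import datetime, timedelta


def flatline_matches(timestamps, threshold, timeframe, now):
    """Return True if fewer than threshold events occurred in the last timeframe."""
    window_start = now - timeframe
    recent = sum(1 for ts in timestamps if window_start <= ts <= now)
    return recent < threshold


now = datetime(2024, 1, 1, 12, 0)
events = [now - timedelta(minutes=m) for m in (1, 3, 45)]
# Only two events fall in the last 10 minutes, below a threshold of 5.
print(flatline_matches(events, threshold=5, timeframe=timedelta(minutes=10), now=now))
```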
All
The All option uses the any rule, which matches everything: every hit that the query returns generates an alert. On clicking All, the following input fields are added to the dialog box.
- Specify the query to which the All rule is applied.
- Specify the any field. The any rule matches everything, so every hit that the query returns generates an alert.
- Specify the type of environments that must be included.
- Specify the impact of the alert.
- Specify the urgency/priority of the alert.
- Specify the group name of the alert.
- Specify the summary of the alert.
- Specify the environment where the alert is generated.
- Specify the location where the alert is generated.
- Specify the IP address of the server where the alert is generated.
- To go back to the previous screen, click Back.
Term
The Term option matches when a value appears in a field for the first time. On clicking Term, the following input fields are added to the dialog box.
- Specify the query for which the term is to be monitored.
- Specify the interval (in minutes) at which the status needs to be checked.
- Specify the number of events that triggers an alert.
- Specify the time frame within which that number of events must occur.
- Specify the list of fields to be monitored for new terms. If the user leaves it blank, query_key is used. Each entry in the list of fields can be a list in itself.
- Specify the type of environments that must be included.
- Specify the impact of the alert.
- Specify the urgency/priority of the alert.
- Specify the group name of the alert.
- Specify the summary of the alert.
- Specify the environment where the alert is generated.
- Specify the location where the alert is generated.
- Specify the IP address of the server where the alert is generated.
- To go back to the previous screen, click Back.
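Detecting a new term amounts to remembering every value seen so far for each monitored field. The sketch below uses the hypothetical class `NewTermRule`; seeding it with already known terms through `known_terms` is an assumption about how a baseline would be supplied, and it covers only single fields, not nested field lists.

```python
class NewTermRule:
    """Report values that have not been seen before in the monitored fields."""

    def __init__(self, fields, known_terms=None):
        self.fields = list(fields)
        known_terms = known_terms or {}
        # One set of already seen values per monitored field.
        self.seen = {field: set(known_terms.get(field, ())) for field in self.fields}

    def check(self, event):
        """Return a list of (field, value) pairs that are new in this event."""
        new_terms = []
        for field in self.fields:
            if field not in event:
                continue
            value = event[field]
            if value not in self.seen[field]:
                self.seen[field].add(value)
                new_terms.append((field, value))
        return new_terms


rule = NewTermRule(["user"], known_terms={"user": {"alice"}})
print(rule.check({"user": "bob"}))
```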
Cardinality
The Cardinality option matches when the total number of unique values for a certain field within a time frame is higher or lower than a threshold. On clicking Cardinality, the following input fields are added to the dialog box.
- Specify the query for which the cardinality is to be monitored.
- Specify the interval (in minutes) at which the status needs to be checked.
- Specify the number of events that triggers an alert.
- Specify the time frame within which that number of events must occur.
- Specify the time period in which the number of unique values is counted.
- Specify the field for which the cardinality is counted.
- Specify the maximum cardinality. If the cardinality of the data is greater than this number, an alert is triggered. Each new event that raises the cardinality triggers an alert.
- Specify the minimum cardinality. If the cardinality of the data is less than this number, an alert is triggered.
- Specify the type of environments that must be included.
- Specify the impact of the alert.
- Specify the urgency/priority of the alert.
- Specify the group name of the alert.
- Specify the summary of the alert.
- Specify the environment where the alert is generated.
- Specify the location where the alert is generated.
- Specify the IP address of the server where the alert is generated.
- To go back to the previous screen, click Back.
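The cardinality check counts the distinct values of one field and compares the count with the minimum and maximum. A minimal sketch, assuming the events of the counting period are already available as dictionaries; the function name `cardinality_matches` is hypothetical.

```python
def cardinality_matches(events, field, min_cardinality=None, max_cardinality=None):
    """Return True if the number of unique field values is outside the allowed range."""
    unique_values = {event[field] for event in events if field in event}
    cardinality = len(unique_values)
    if max_cardinality is not None and cardinality > max_cardinality:
        return True
    if min_cardinality is not None and cardinality < min_cardinality:
        return True
    return False


events = [{"ip": "10.0.0.1"}, {"ip": "10.0.0.2"}, {"ip": "10.0.0.1"}, {"ip": "10.0.0.3"}]
# Three unique IP addresses exceed a maximum cardinality of 2.
print(cardinality_matches(events, "ip", max_cardinality=2))
```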
Aggregation
The Aggregation option matches when the value of a metric within the calculation window is higher or lower than a threshold. By default, the calculation window is buffer_time. On clicking Aggregation, the following input fields are added to the dialog box.
- Specify the query for which the aggregation is to be monitored.
- Specify the interval (in minutes) at which the status needs to be checked.
- Specify the number of events that triggers an alert.
- Specify the time frame within which that number of events must occur.
- Specify the metric aggregation key, which is the name of the field over which the metric value is calculated.
- Specify the type of metric aggregation to perform on the metric_agg_key field.
- Specify the type of document to search for.
- Specify the maximum threshold. If the calculated metric value is greater than this number, an alert is triggered.
- Specify the minimum threshold. If the calculated metric value is less than this number, an alert is triggered.
- Specify the type of environments that must be included.
- Specify the impact of the alert.
- Specify the urgency/priority of the alert.
- Specify the group name of the alert.
- Specify the summary of the alert.
- Specify the environment where the alert is generated.
- Specify the location where the alert is generated.
- Specify the IP address of the server where the alert is generated.
- To go back to the previous screen, click Back.
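The aggregation check can be sketched as computing one metric over the metric aggregation key and comparing it with the thresholds. The function name `aggregation_matches` and the aggregation type names (`min`, `max`, `sum`, `avg`) are assumptions for illustration; the types actually offered in the dialog box may differ.

```python
import statistics

# Assumed aggregation type names mapped to plain Python functions.
AGGREGATIONS = {"min": min, "max": max, "sum": sum, "avg": statistics.mean}


def aggregation_matches(events, metric_agg_key, metric_agg_type,
                        min_threshold=None, max_threshold=None):
    """Return True if the aggregated metric is outside the allowed range."""
    values = [event[metric_agg_key] for event in events if metric_agg_key in event]
    if not values:
        return False  # assumption: no data in the window means no match
    metric = AGGREGATIONS[metric_agg_type](values)
    if max_threshold is not None and metric > max_threshold:
        return True
    if min_threshold is not None and metric < min_threshold:
        return True
    return False


events = [{"response_time": 100}, {"response_time": 200}, {"response_time": 900}]
# The average response time is 400, above a maximum threshold of 300.
print(aggregation_matches(events, "response_time", "avg", max_threshold=300))
```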
Next Steps
- Select the Repeating Alert check box to get alerts at a fixed time interval.
- Select the Email Information check box and enter the recipient’s email address and the subject of the email.
- After providing all the required details, click Apply. The rule is applied in the system and displayed in the alert rule list.
Editing an Alert Rule
The user can edit an alert rule using the Edit button in the Alert Rules dialog box. To do this, select an alert rule from the list and then click the Edit button. The alert rule is displayed in edit mode.
Make the required changes and click the Apply button to implement them.
Deleting an Alert Rule
The user can delete an alert rule using the Delete button. To do this, select the alert rule to be deleted and click the Delete button. The alert rule is deleted.