This page covers the options for using notifications and downtime in GroundWork Monitor.
There are two methods available for setting up alert notifications: NoMa and Nagios™. The NoMa method is simple, largely automatic, and recommended. The Nagios method sends notifications for Nagios events only. You will probably use it only if you have already invested in configuring Nagios for notifications, or if you need a specific escalation or notification feature for Nagios-monitored resources.
Overview of notifications and downtime
Notifications and escalations are alerts that are sent when GroundWork Monitor detects a failure or an over-threshold condition. These notifications are sent to configured contacts using specified methods.
If a threshold value is breached for a monitored resource, an event is generated and sent to the Foundation layer of GroundWork Monitor. Additional threshold overrides (if they are set) are applied, and the state of the resource is updated accordingly. Then an alert is sent to NoMa. You can create notification definitions to send these alerts as notifications to your contacts. A notification definition defines which contacts will receive notifications for which resources (i.e., host groups, hosts, and services).
Regarding notification methods, you can be notified of problems and recoveries in almost any way you want, including by phone, SMS, email, or even an audio alert. A "method" definition can also be used to open tickets in a ticketing system. Almost any program you can run can be used as a method.
Scheduling downtime is useful during system maintenance because it suppresses notifications for entities in downtime. The Configuration → Downtime menu is used to manage the scheduling of downtime for all monitored entities, including hosts, services, host groups, and service groups. You can set regular (one-time) and recurring (e.g., daily, weekly, monthly, yearly) downtime.
During the specified downtime, alert notifications will not be sent out about the monitored entities, regardless of whether you use NoMa or Nagios notification methods. This is useful in the event you need to take a server down for an upgrade or maintenance, for example. Scheduling downtime also avoids alarm fatigue, provides more accurate data for SLA reporting, and reinforces change control discipline. When the scheduled downtime expires, notifications for the hosts and services will resume as normal.
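The suppression rule described above can be sketched in a few lines of Python. This is only an illustration, not GroundWork's implementation: the `Downtime` record, the `in_downtime` and `should_notify` helpers, and the host name `web01` are all hypothetical, and recurring downtime is not modeled.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Downtime:
    """Hypothetical one-time downtime window for a monitored entity."""
    entity: str       # e.g. a host name; illustrative only
    start: datetime
    end: datetime


def in_downtime(entity, when, windows):
    """Return True if any window for `entity` covers the time `when`."""
    return any(w.entity == entity and w.start <= when < w.end for w in windows)


def should_notify(entity, when, windows):
    """Notifications are suppressed while the entity is in downtime."""
    return not in_downtime(entity, when, windows)


# A maintenance window for the hypothetical host "web01".
windows = [Downtime("web01", datetime(2024, 1, 6, 22, 0), datetime(2024, 1, 7, 2, 0))]
```

With this window, an alert for `web01` at 23:00 on January 6 would be suppressed, while one at 02:00 on January 7 (the moment the window expires) would be sent as normal.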
Alert notifications using NoMa
GroundWork has integrated NoMa, a free-standing notification and escalation subsystem, so the use of Nagios notifications is no longer required. The NoMa subsystem also lets users whose roles have lower privileges than the system administration role change notification and escalation schedules, which makes the system more flexible to operate.
Notifications and escalations using Nagios remain available for configuration and maintenance, as shown by the direct line from Nagios to Notifications in the image below. This permits customers who have invested time or scripting in Nagios alerts to continue to use these methods.
There is also a way to send Nagios notifications to NoMa, as shown by the dotted line below. See Using Nagios to send notifications to NoMa below.
As shown, the alert flow coming from monitoring sources such as Nagios and Cloud Hub (and any other source) is sent to the RESTful API for the Foundation database. In turn, NoMa subscribes to alert messages via the REST API and sends out notifications for any alerts that pass its filters. This figure shows the relationship between Nagios, NoMa, and GroundWork Foundation.
By using the NoMa front-end, hosts and services can be assigned to notification definitions. When NoMa receives notifications, it searches for matching host and service definitions in its database. If a matching definition is found, contacts and methods are determined and notifications will be sent. The following diagram shows how the product is employed within GroundWork Monitor.
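The matching step described above can be pictured with a small Python sketch. It is a simplified model under stated assumptions, not NoMa's actual schema or matching rules: the `Alert` and `NotificationDefinition` classes, their field names, and the rule that an alert matches when its host group, host, or service is listed in a definition are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    host: str
    hostgroup: str
    service: str = ""  # empty string means a host-level alert


@dataclass
class NotificationDefinition:
    """Hypothetical stand-in for a notification definition."""
    name: str
    contacts: list
    methods: list
    hostgroups: set = field(default_factory=set)
    hosts: set = field(default_factory=set)
    services: set = field(default_factory=set)

    def matches(self, alert):
        # Assumed rule: any listed host group, host, or service matches.
        return (
            alert.hostgroup in self.hostgroups
            or alert.host in self.hosts
            or (alert.service != "" and alert.service in self.services)
        )


def recipients(alert, definitions):
    """Return (contact, method) pairs from every definition matching `alert`."""
    pairs = []
    for definition in definitions:
        if definition.matches(alert):
            for contact in definition.contacts:
                for method in definition.methods:
                    pairs.append((contact, method))
    return pairs


definitions = [
    NotificationDefinition("linux", ["alice"], ["email"], hostgroups={"linux-servers"}),
]
```

An alert with no matching definition yields an empty list, which corresponds to no notification being sent.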
Using Nagios to send notifications to NoMa
In rare cases, you may need to consider using Nagios notifications to place alerts directly in the NoMa input queue. This is useful if you have specific requirements for escalating alerts from Nagios using contacts and methods already defined in NoMa, or if you want to continuously notify a contact or a group of contacts until an issue is acknowledged or resolved.
Generally, using the continuous notification features of Nagios is not recommended, since it contributes to alarm fatigue. A single notification on state change should be sufficient. However, many organizations have become accustomed to this mode of operation, so GroundWork provides a Nagios notification script and command definition called alert_via_noma for this purpose.
If you find you need this functionality and want to implement it, see How to use Nagios to send notifications to NoMa.
Alert notifications using Nagios
GroundWork Monitor retains the ability to use Nagios for notification directly, bypassing NoMa if desired. The following is a brief discussion on setting up notifications using Nagios in GroundWork Monitor. For more details, see How to configure notifications using Nagios.
Contact notifications are communications to contacts or contact groups about the status of a monitored element. Notifications can be configured for circumstances including any hard state change, a host or service remaining in a non-OK state, and acknowledgments.
When do notifications occur?
- When a hard state change occurs and all filters are passed.
- When a host or service remains in a hard non-OK state and the time specified by the <notification_interval> option in the host or service definition has passed since the last notification was sent out (for that specified host or service).
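The two conditions above can be expressed as a single decision function. The following Python sketch is illustrative only and is not the Nagios source. The function name and parameters are hypothetical, the interval is assumed to be in minutes, filters are assumed to gate both conditions, and an interval of 0 is treated as "do not re-notify", which follows the usual Nagios convention.

```python
def notification_due(state_type, state_changed, is_ok,
                     minutes_since_last, notification_interval, filters_passed):
    """Decide whether a notification should go out for a host or service.

    state_type            -- "HARD" or "SOFT"
    state_changed         -- True if this check produced a state change
    is_ok                 -- True if the current state is OK/UP
    minutes_since_last    -- minutes since the last notification was sent
    notification_interval -- re-notification interval (assumed minutes)
    filters_passed        -- True if all notification filters were passed
    """
    # Only hard states that pass every filter can produce a notification.
    if state_type != "HARD" or not filters_passed:
        return False
    # Condition 1: a hard state change.
    if state_changed:
        return True
    # Condition 2: still in a hard non-OK state and the interval has elapsed.
    if not is_ok and notification_interval > 0:
        return minutes_since_last >= notification_interval
    return False
```

For example, with an interval of 60, a service that has stayed in a hard non-OK state is re-notified once 60 minutes have passed since the last notification, but not after 30.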
Who gets notified?
- Host: Each host may belong to one or more host groups. Each host group has a contact groups option that specifies what contact groups receive notifications for hosts in that particular host group.
- Service: Each service definition has a contact groups option that specifies what contact groups receive notifications for that particular service.
What filters must be passed in order for notification to be sent out?
- Program Wide Filters
- Host and Service Filters
- Contact Filters
Contact groups are associated with escalation trees, which are then used for host and service notifications. The table below describes the various notification objects.
|Contact Templates, Contacts, Contact Groups||Contact Templates typically store generalized contact information that is consistent across multiple contact definitions, such as time periods, the host and service states for which notifications can be sent out, and the commands used to notify of a host or service problem or recovery. Specific contact information, such as e-mail addresses and phone numbers, is contained in the contact definition. Contact definitions inherit generalized information from contact templates.|
Contacts contain individual settings defining who should get notified in the event of a problem on your network. Contact definitions also indicate which notification options will be used for the contact based on the selected contact template.
Contact Groups are definitions of one or more contacts, used to send alert and recovery notifications to those contacts. Contact definitions can be grouped into contact groups, typically by area of expertise or geographic location. For example, you might have one contact group called network-administrators and another called sanfrancisco-support. Then, when a host or service has a problem or recovers, Nagios finds the appropriate contact groups and notifies all contacts in those groups. See How to configure Nagios contacts.
|Escalations and Escalation Trees||Service escalations are used to escalate notifications for a particular service. Host escalations are used to escalate notifications for a particular host. An escalation tree is a grouping of multiple escalations which can be applied to a host, host profile, host group, or a service.|
Escalation trees are optional and are used in the GroundWork Monitor Nagios engine to alert users when monitored services and hosts change state. Escalation trees combine the specified contact groups that are to be notified when a notification is escalated.
There are two methods for assigning contact groups for notifications. The first is a direct contact group assignment, made either through a host or service template applied to an object (e.g., host, service) or directly in a host group or service group definition. The second is an escalation tree assignment to an object (host, host group, or service).
GroundWork Monitor does not assume that the underlying problem is under control just because a notification has been sent. It requires the recipient of the page to log in to the system and acknowledge having received a notification. If that acknowledgment does not occur within the period identified by the notification interval, subsequent notifications will be sent out. The monitoring system administrator can configure how many notifications are sent at each escalation level before the problem is escalated to a higher level of support.
Notifications are escalated if one or more escalation definitions match the current notification that is being sent out. If a host or service notification does not have any valid escalation definitions that apply to it, the contact group(s) specified in either the host group or service definition will be used for the notification. See How to configure Nagios escalations.
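The fallback rule above can be sketched as follows. This is a minimal illustration, not the Nagios implementation. It assumes that an escalation matches by notification number using `first_notification` and `last_notification` bounds, as in Nagios escalation object definitions, with `last_notification` of 0 meaning no upper bound. The `Escalation` class, the `groups_for` helper, and the contact group names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Escalation:
    first_notification: int
    last_notification: int   # 0 is assumed to mean "no upper bound"
    contact_groups: list


def groups_for(notification_number, escalations, default_groups):
    """Return the contact groups to notify for the given notification number.

    If any escalation matches, its contact groups are used; otherwise the
    groups from the host group or service definition (`default_groups`) apply.
    """
    matched = []
    for esc in escalations:
        upper_ok = (esc.last_notification == 0
                    or notification_number <= esc.last_notification)
        if notification_number >= esc.first_notification and upper_ok:
            for group in esc.contact_groups:
                if group not in matched:   # avoid notifying a group twice
                    matched.append(group)
    return matched if matched else list(default_groups)


# Hypothetical two-level escalation tree.
escalations = [
    Escalation(3, 5, ["managers"]),
    Escalation(6, 0, ["directors"]),
]
```

Here the first two notifications go to the default contact groups, notifications 3 through 5 go to `managers`, and notification 6 onward goes to `directors`.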