Release Notes for GDMA 2.7.0

Overview

GDMA 2.7.0 provides the following new features and updates:

  • HTTP(S) can be used to send check results to the GroundWork server, as an alternative to the NSCA protocol (GroundWork issue GDMA-425). This requires new server-side support; see below.
  • Local logging is now enabled by default as GDMA clients are installed. This is possible because automatic logfile rotation was implemented as of GDMA 2.6.1 (GroundWork issue GDMA-256). Having it enabled by default should help with quickly diagnosing problems that might appear on some random client.
  • GDMA client logging output has been improved in a number of small ways in both presentation and content, to make the logfiles a bit easier to decipher. Among those changes, the poller and spooler now print the GDMA version number into their respective logfiles when the daemons are started up and when the logfiles are automatically rotated, so that information is visibly exposed. Also, the program that directly runs checks now logs not just the results of failed checks, but also a notice of checks skipped in a given cycle and the results of successful checks. All of this should help with forensic investigations.
  • Server liveness checking has been radically reduced (GroundWork issue GDMA-407). This suppresses a lot of network noise traffic that might have interfered somewhat with acceptance of check results when the server was generally under heavy load.
  • A bug in the scheduling of checks has been fixed (GroundWork issue GDMA-452). The bug changed when particular checks were run and sometimes whether they were run at all, when a service is configured with both multiple instances and a _Check_Interval value larger than 1.
  • As many third-party components as possible have been upgraded to current external releases:
    • OpenSSL 1.1.1c
    • Nagios Plugins 2.2.1
    • NRPE 3.2.1
    • Perl 5.28.2
    • many of the extra add-on Perl packages
    • OpenLDAP 2.4.47
    • curl 7.65.0
  • The check_system_uptime.pl plugin has been replaced with a version that will operate correctly on Solaris and AIX as well as Linux.

    As of this writing, that upgrade is included in the Linux GDMA builds, but it has not yet been verified in the current GDMA test builds for Solaris and AIX.

  • The AIX copy of the check_disk plugin has been fixed to correctly report results for very large disk partitions (GroundWork issue GWMON-12081).

    As of this writing, that upgrade is not yet included in current AIX GDMA test builds.

  • Support for Jolokia (https://jolokia.org/) is included in the GDMA client.
  • The default value for the ‑‑gdma_protocol <gdma_protocol> installer parameter has been changed from "http" to "https". Sites which use unattended-mode installs and have previously allowed this option to default to http will need to add the option either on the command line or in the options file mentioned on the command line.
  • A new ‑‑gdma_auto_configuration <gdma_auto_configuration> option is now supported, for selection of which mechanism (if any) will be used to automatically establish the initial and/or ongoing configuration of the GDMA client. Its default value will be autoregistration, essentially mirroring the previous behavior. That standard default will be modified to none if username/password credentials are not supplied. Alternate values are autosetup and none.
  • A new ‑‑gdma_spooler_transport <gdma_spooler_transport> option is supported, for selection of how check results will be sent to the server. This option defaults to "nsca", specifying the legacy transport. An alternative value of "http" is also supported, to send such traffic via either HTTP or HTTPS to each server as specified in the Target_Server and Target_Server_Secondary options.
  • Auto_Register_Attempts will now be set to "never" at install time under conditions where Auto-Registration is not to be invoked. That prevents its accidental activation later on when other parts of the configuration are changed. If you want to enable Auto-Registration later on, you will need to override that setting, by setting the value typically to "fibonacci" and ensuring that the rest of the Auto_Register_* parameters are set appropriately.
  • A new ‑‑gdma_download_certs <gdma_download_certs> option is supported, to allow TLS certificates to be downloaded automatically from the GroundWork server to the GDMA client. If you have selected or defaulted HTTPS as the protocol for connections to the server (see the ‑‑gdma_protocol option above), this option defaults to "yes". That setting specifies that such downloading should be enabled when a certificate verification problem arises. An alternative value of "no" is also supported, in which case you will need to install certs manually on the GDMA client if HTTPS is to be used, as in previous GDMA releases.

    This option is provided as a convenience measure for easier GDMA client deployment. But note that in the GDMA 2.7.0 release, no certificate-revocation validation is performed on downloaded certificates. Until that changes, disabling this option and installing certs manually is recommended for full proper security.

  • A new ‑‑gdma_download_self_signed_certs <gdma_download_self_signed_certs> option is supported, to allow an automatically-downloaded self-signed TLS certificate to be treated as valid. This option defaults to "no", which is the proper setting for a secure system. An alternative value of "yes" is also supported; it should only be used if you are completely confident that your network security cannot be broken, or if you cannot obtain a non-self-signed cert for installation on the GroundWork server and you are willing to take the consequent security risks.

    This option can be enabled for convenience in quick testing of GDMA deployments where you have not yet obtained a cert for your server from some other source. For production use, be sure to pay attention to this option and set it as desired.

    If you need to use a self-signed cert, the better alternative is to install it manually on your GDMA clients, and to disable automatic downloading of any certs, self-signed or not (see the previous option).

  • A new ‑‑gdma_download_bad_server_name_certs <gdma_download_bad_server_name_certs> option is supported, to allow use of an automatically-downloaded TLS certificate whose embedded server name does not match the name of the server from which the certificate was retrieved. This option defaults to "no", which is the proper setting for a secure system. An alternative value of "yes" is also supported; it should only be used if you are completely confident that your network security cannot be broken, or in early-test situations where you do not yet have proper certificates set up on the GroundWork server and you are willing to take the consequent security risks.

    This option can be enabled for convenience in quick testing of GDMA deployments, but "yes" is definitely not the setting you should have in place for production use. Be sure to pay attention to this option and set it as desired.

    This option is provided only to make it easier to deploy GDMA clients for test purposes; it should never be used in production. Use of this option would make a Man-In-The-Middle attack easier.
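The defaulting rules for the new installer options can be hard to keep straight. The following Python sketch models them as they are described in these notes. It is not the installer's own code; the resolve_options name and the dictionary layout are assumptions made purely for illustration. Where these notes do not state a default (for example, the certificate-download option when the protocol is http), the sketch leaves the value unset rather than guessing.

```python
# Illustrative model of the documented option defaults. This is NOT the
# installer's code; resolve_options() is a hypothetical helper name.

ALLOWED = {
    "gdma_protocol": {"http", "https"},
    "gdma_auto_configuration": {"autoregistration", "autosetup", "none"},
    "gdma_spooler_transport": {"nsca", "http"},
    "gdma_download_certs": {"yes", "no"},
    "gdma_download_self_signed_certs": {"yes", "no"},
    "gdma_download_bad_server_name_certs": {"yes", "no"},
}


def resolve_options(given, have_credentials):
    """Return the effective option values for an unattended install.

    given            -- options explicitly supplied (names without dashes)
    have_credentials -- True if a username/password pair was supplied
    """
    for name, value in given.items():
        if name not in ALLOWED:
            raise ValueError(f"unknown option: {name}")
        if value not in ALLOWED[name]:
            raise ValueError(f"bad value for {name}: {value}")

    # Defaults as stated in these release notes.
    opts = {
        "gdma_protocol": "https",
        "gdma_auto_configuration": "autoregistration",
        "gdma_spooler_transport": "nsca",
        "gdma_download_self_signed_certs": "no",
        "gdma_download_bad_server_name_certs": "no",
    }
    opts.update(given)

    # Cert downloading defaults to "yes" when HTTPS is selected or defaulted.
    # The notes do not state a default for plain HTTP, so none is set here.
    if "gdma_download_certs" not in given and opts["gdma_protocol"] == "https":
        opts["gdma_download_certs"] = "yes"

    # Without credentials, auto-configuration and the HTTP transport are
    # unavailable, so any requested values for them are ignored.
    if not have_credentials:
        opts["gdma_auto_configuration"] = "none"
        opts["gdma_spooler_transport"] = "nsca"

    return opts
```

For example, calling resolve_options({"gdma_spooler_transport": "http"}, have_credentials=False) yields a transport of "nsca", reflecting the rule that the HTTP transport request is ignored when credentials are missing.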

Sending check-result data via HTTP/S

Traditionally, GDMA clients have used the NSCA protocol to send check-result data to a GroundWork server, using a dedicated port on the server which accepts such data directly into the Nagios process. Some customers have requested that such traffic be transported via an HTTP/S channel instead, either because they wish not to have unusual special ports open on the GroundWork server, or because they have HTTP/S-traffic observability tools in place that they would like to apply to this traffic as well. GDMA 2.7.0 now provides such a capability.

To support this change, modifications are made on both the client and server. The client requires at least the GDMA 2.7.0 release to be installed. On a GroundWork 7.2.1 server, some new web-access setup is needed within Apache, to accept and route this traffic. Also on a GroundWork 7.2.1 server, a new version of the libbronx.so library is provided to accept data on this channel, and an extended copy of the bronx.cfg config file supports new options to control the new facility. See below for details.

Port usage and security considerations

Historically, the Bronx event broker has used port 5667 for acceptance of check results using the NSCA protocol. To support the new facility of accepting check-result traffic using the HTTP/S protocol as seen on the GDMA client, Bronx can now accept HTTP connections on port 5657. This port supports only HTTP connections, not HTTPS connections, and it is intended only for internal use on the GroundWork server. It should not be exposed for connections outside of the GroundWork server itself, and in the GroundWork 7 context, Bronx should listen on this port only on the IPv4 localhost network interface. Supporting HTTPS will be the province of the front-end Apache processing, which will handle decryption and forward incoming data-submission requests to the Bronx port.

This new interface is not a full REST API, nor is it intended to be, ever. Except for checking certain primitive server metadata such as the server time or whether Bronx is up, Bronx does not support any data exfiltration on this port. The new interface is strictly for submission of check results, and the only data returned to a caller is a boolean indication of success or failure in accepting a set of submitted check results.

To protect access credentials, it is strongly recommended that you run your GroundWork server in HTTPS-only mode, and set the Target_Server option (and Target_Server_Secondary option, if that is in use) for all GDMA clients to use "https://" as the protocol for connecting to the GroundWork server.

Even with HTTPS in play for the GDMA client connection, it is still possible for an attacker to attempt a brute-force attack to guess the username and password. The present implementation does not introduce any artificial delays at the Bronx level to slow down such attacks. Use of rather long and random passwords is therefore recommended, to make such an attack impractical.
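As one way to produce a long, random password of the kind recommended above, the following Python sketch uses the standard secrets module. The generate_password name is hypothetical, and whether every character it can emit (letters, digits, "-" and "_") is acceptable to your GroundWork credential store is an assumption you should verify locally.

```python
import secrets


def generate_password(nbytes=32):
    """Return a URL-safe random string derived from nbytes of OS randomness.

    32 random bytes encode to roughly 43 characters, which is well beyond
    practical brute-force range for an online guessing attack.
    """
    return secrets.token_urlsafe(nbytes)
```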

Client/Server time mismatch

If HTTP/S is in use for transport of check results, a validation of the client clock is run at GDMA startup. If the time differs by more than 10 seconds in either direction from the server clock, both the nagios.log and event_broker.log logfiles on the server will contain a message such as this:

[1559272754] [BRONX] {Bronx_AccessHandlerCallback} warning: client '172.28.113.159'
    clock is running roughly 686 seconds fast relative to server clock

That should help to identify clients where the time mismatch is getting so great that Bronx or Nagios may intentionally drop incoming data on the floor because it is too far out of bounds. Such dropped data will also be logged in the same logfiles, thusly:

[1559273176] [BRONX] {process_request_payload} Received 9 check results, submitted 0, dropped 9,
    from client '172.28.113.159' via peer 127.0.0.1

If that happens, then with

logging=warning

or higher levels of logging set in bronx.cfg, you can also see the reasons for such discarding of data:

[1559273379] [BRONX] {process_request_payload} Dropping check result from '172.23.123.101' via peer 127.0.0.1
    for host 'somehost' service 'linux_disk_root' with stale timestamp (check result was 1202 seconds old).

[1559273439] [BRONX] {process_request_payload} Dropping check result from '172.23.117.259' via peer 127.0.0.1
    for host 'otherhost' service 'gdma_poller' with timestamp too far in the future.
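To spot which clients are losing data this way, the summary messages can be tallied from a logfile. The following Python sketch does so for the "Received ... dropped ..." message format shown above. The dropped_by_client name is hypothetical, and the sketch assumes the message layout matches the samples in these notes; it tolerates the message appearing either on one line or wrapped as displayed here.

```python
import re

# Matches the Bronx summary message shown above. \s+ is used between
# tokens so a wrapped (multi-line) rendering of the message still matches.
DROP_RE = re.compile(
    r"\[(?P<ts>\d+)\]\s+\[BRONX\]\s+\{process_request_payload\}\s+"
    r"Received\s+(?P<received>\d+)\s+check\s+results,\s+"
    r"submitted\s+(?P<submitted>\d+),\s+dropped\s+(?P<dropped>\d+),\s+"
    r"from\s+client\s+'(?P<client>[^']+)'"
)


def dropped_by_client(log_text):
    """Return {client_address: total_dropped} for clients with any drops."""
    totals = {}
    for match in DROP_RE.finditer(log_text):
        dropped = int(match.group("dropped"))
        if dropped:
            client = match.group("client")
            totals[client] = totals.get(client, 0) + dropped
    return totals
```

Pass it the contents of nagios.log or event_broker.log, read as text. Clients that appear in the result are good candidates for a clock check.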


GroundWork 7.2.1 server setup

To enable check-result traffic to be received on the server using HTTP/S, the following changes must be made:

  • The revised copy of /usr/local/groundwork/common/lib/libbronx.so must be installed.
  • /usr/local/groundwork/config/bronx.cfg must be configured to enable this facility. For compatibility with existing installs, it is disabled by default as shipped.
  • The Apache configuration must be extended to forward check-result traffic to Bronx. This is done by installing a new /usr/local/groundwork/apache2/conf/groundwork/bronx_httpd.conf file.

The libbronx.so and bronx_httpd.conf files are provided in and automatically installed by the current rollup patch for GroundWork Monitor 7.2.1 (see GWME-7.2.1-00 Rollup Patch Installer and GWME-7.2.1-00 Rollup Patch Details). For the bronx.cfg file, see the next section.

New bronx.cfg configuration settings

The copy of bronx.cfg provided on the Downloads page is for installation on GroundWork Monitor 7.x.x. To use it, you must have already installed a current rollup patch (see above).

To allow check-result data to be sent via HTTP/S, the updated bronx.cfg file must be put into play on the server, while preserving any local settings you may have had previously in that file.

To install the updated bronx.cfg file, take the following steps:

  • Download the file from the Downloads page.
  • Merge any previous local changes to option settings in your existing /usr/local/groundwork/config/bronx.cfg file into the new copy of the file. Also pay attention to the new options listed just below, and modify those according to your local requirements. Of the new options, you will likely only need to adjust the values of http_listener and perhaps http_listener_allowed_clients.
  • Back up the file that you already have:

    cd /usr/local/groundwork/config
    cp -p bronx.cfg bronx.cfg.orig

  • Install the modified new copy. Assuming your modified new copy is currently parked in the /tmp/ directory, you would use these commands:

    cd /usr/local/groundwork/config
    cp /tmp/bronx.cfg .
    chown nagios:nagios bronx.cfg
    chmod 600 bronx.cfg

  • Restart Nagios so Bronx is restarted and picks up the changes:

    service groundwork restart nagios
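The merge step above is easier if you first list which settings in your existing file differ from the values in the new copy. The following Python sketch does that comparison. It assumes bronx.cfg consists of key=value lines with "#" starting a comment line, which matches the settings shown in these notes but is otherwise an assumption; the parse_settings and local_changes names are hypothetical.

```python
def parse_settings(text):
    """Parse key=value lines, skipping blank lines and '#' comment lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def local_changes(old_text, new_text):
    """Return {key: (old_value, new_value)} for settings present in both
    files whose values differ. These are the candidates to carry forward
    from the existing file into the new copy."""
    old = parse_settings(old_text)
    new = parse_settings(new_text)
    return {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}
```

Read both files as text and pass their contents in. Settings that exist only in the new file (such as the http_listener family) are deliberately not reported, since they have no previous local value to preserve.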

The following new settings are available in the bronx.cfg file. Default settings for GroundWork Monitor 7.x.x are shown here. Commentary on the meaning of these options and appropriate values for them are provided in the config file. To enable accepting data via HTTP/S, the http_listener option must be turned "on".

http_listener=off
http_connection_limit=32
http_idle_connection_timeout=20
http_thread_pool_size=4
http_listener_port=5657
http_listener_ipv4_hostname=localhost
http_listener_allowed_peers=localhost
http_listener_allowed_clients=10.0.0.0/8,172.16.0.0/12,192.168.0.0/16

In GroundWork Monitor 8.0.0, three of those parameters have different standard values, which will be set as such out-of-the-box in that context:

http_listener=on
http_listener_ipv4_hostname=nagios
http_listener_allowed_peers=0.0.0.0/0

If you enable this channel, the http_listener_allowed_clients option must be set to reflect the set of hosts or subnets you expect to receive data from via this transport. (Information about the calling client must be passed along to Bronx by the fronting web server [i.e., Apache HTTP Server in GroundWork 7, NGINX in GroundWork 8]. So the fact that Bronx is listening only on a localhost network interface [as specified by http_listener_ipv4_hostname] and only accepting direct socket connections from the web server acting as a reverse proxy [as specified by http_listener_allowed_peers] does not preclude it from accepting data from your GDMA clients. It just means that there must be some intermediary process on the GroundWork server, namely the web server, that passes along each request.) If you change the port number in the http_listener_port option in GroundWork 7, you must also change it in the ProxyPass directive in the /usr/local/groundwork/apache2/conf/groundwork/bronx_httpd.conf file.

The http_thread_pool_size is set to a fairly low number, but this should be adequate even for large numbers of GDMA clients reporting in. Each thread can manage many concurrent incoming requests. Bumping up this number arbitrarily is discouraged, partly because there is currently a fairly heavy per-thread memory overhead in the allocation of thread stacks, at least in terms of virtual memory allocation.

For simplicity in initial setup, the http_listener_allowed_clients value is pre-set to allow client data submissions from all the standard private IPv4 networks. You may tune this as desired. IPv6 addresses and CIDR blocks are also supported.
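Before narrowing http_listener_allowed_clients, it can help to confirm that the addresses of your GDMA clients fall inside the list you intend to use. The following Python sketch performs that check with the standard ipaddress module. It illustrates CIDR matching in general and is not Bronx's own matching code; the client_allowed name is hypothetical, and the sketch assumes every entry is an IP literal or CIDR block (hostnames are not handled and will raise ValueError).

```python
import ipaddress


def client_allowed(client_ip, allowed_clients):
    """Return True if client_ip falls within any entry of allowed_clients.

    allowed_clients is a comma-separated string of IPv4/IPv6 addresses or
    CIDR blocks, in the same form as the http_listener_allowed_clients value.
    """
    addr = ipaddress.ip_address(client_ip)
    for entry in allowed_clients.split(","):
        # strict=False accepts entries whose host bits are set, e.g. 10.1.2.3/8
        network = ipaddress.ip_network(entry.strip(), strict=False)
        if addr in network:  # always False across differing IP versions
            return True
    return False
```

With the pre-set value of 10.0.0.0/8,172.16.0.0/12,192.168.0.0/16, a client at 172.28.113.159 is allowed, while a public address such as 8.8.8.8 is not.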

New Apache setup

The new file noted in this section is provided and automatically installed by the rollup patch for GroundWork Monitor 7.2.1.

For GroundWork 7 installs, a new file is provided:

/usr/local/groundwork/apache2/conf/groundwork/bronx_httpd.conf

You simply need to put this file in place, owned by nagios:nagios and with 644 permissions, and then bounce Apache to pick up the revised configuration:

service groundwork restart apache


GDMA client configuration settings

To have a GDMA 2.7.0 or later client send data via HTTP/S, the GDMA client's Spooler_Transport option must be set to "HTTP" instead of "NSCA". A GUI-mode or text-mode installation will prompt for this setting. An unattended-mode installation can use the ‑‑gdma_spooler_transport command-line option, which can be set to either http or nsca.

Sending in check results via HTTP/S uses the Auto_Register_User and Auto_Register_Pass credentials, so those must be set as well. For unattended installs, see the ‑‑gdma_autoregistration_username and ‑‑gdma_autoregistration_password options.

The ‑‑gdma_auto_configuration <gdma_auto_configuration> installer command-line option, or its equivalent setting in an option file or in responses to text-mode or UI-mode installation prompts, ends up adjusting the Auto_Register_Attempts and Enable_Auto_Setup options in the installed gdma/config/gdma_auto.conf file.

Auto-Registration or Auto-Setup is only possible if the username/password credentials are available. So the unattended-mode option will be ignored, and the text-mode or UI-mode question will be skipped, if those credentials are not supplied during the installation. In that case, the installation will default to an Auto_Register_Attempts setting of never and an Enable_Auto_Setup setting of off.

The ‑‑gdma_spooler_transport <gdma_spooler_transport> installer command-line option, or its equivalent setting in an option file or in a text-mode or UI-mode install, ends up setting the value of the new Spooler_Transport option in the gdma/config/gdma_auto.conf file.

# Default value shown; may be set to HTTP instead.
Spooler_Transport = "NSCA"

Use of HTTP/S for transporting check results is only possible if the username/password credentials are available. So the unattended-mode option will be ignored, and the text-mode or UI-mode question will be skipped, if those credentials are not supplied during the installation. In that case, the installation will default to a Spooler_Transport setting of NSCA.
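The dependency between the transport setting and the credentials can be checked mechanically on an installed client. The following Python sketch reads settings in the Name = "value" form shown above and reports missing credentials when the HTTP transport is selected. It assumes all values in gdma_auto.conf are double-quoted as in the sample; the parse_gdma_conf and http_transport_problems names are hypothetical.

```python
import re

# Matches lines of the form:  Name = "value"
# Comment lines start with '#', so they never match.
SETTING_RE = re.compile(r'^\s*([A-Za-z_]\w*)\s*=\s*"([^"]*)"\s*$')


def parse_gdma_conf(text):
    """Return {name: value} for every Name = "value" line in text."""
    settings = {}
    for line in text.splitlines():
        match = SETTING_RE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def http_transport_problems(settings):
    """Return a list of problems that would prevent HTTP/S transport.

    An empty list means either the NSCA transport is in use (the default)
    or both credentials are present.
    """
    if settings.get("Spooler_Transport", "NSCA").upper() != "HTTP":
        return []
    problems = []
    for name in ("Auto_Register_User", "Auto_Register_Pass"):
        if not settings.get(name):
            problems.append(f"{name} must be set when Spooler_Transport is HTTP")
    return problems
```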

Automatic client management of TLS certificates

When configuring the GDMA client to use HTTPS for communication with the GroundWork server, it has historically been necessary to manually install the server's SSL/TLS certificate on the GDMA client. The GDMA 2.7.0 release supports new options to allow this action to be carried out automatically, on an ongoing basis. The installer command-line forms are:

  • ‑‑gdma_download_certs <gdma_download_certs>
  • ‑‑gdma_download_self_signed_certs <gdma_download_self_signed_certs>
  • ‑‑gdma_download_bad_server_name_certs <gdma_download_bad_server_name_certs>

Details are available via the installer's ‑‑help option. There are equivalent options available in an option file or via responses to text-mode or UI-mode installation prompts. These settings end up manipulating the Download_Certs_Automatically, Allow_Downloaded_Self_Signed_Certs, and Allow_Downloaded_Bad_Server_Name_Certs options in the installed gdma/config/gdma_auto.conf file. For convenience at many sites, automatic cert downloading is enabled by default. For proper security, acceptance of downloaded self-signed certs and of downloaded certs with mismatched server names is disabled by default. To avoid casual or accidental enablement of the latter two options, the installer does not even place them in the installed config file if they are not enabled.

If cert downloading is disabled, the server SSL/TLS certificate will need to be managed by the customer in the legacy manner, by a mechanism such as:

  • separate placement of the cert file on the client at the time of GDMA installation on the client
  • via some sort of customer-managed configuration tool such as Ansible, Puppet, or Chef
  • via embedding in a VM instance image along with GDMA itself, that is used as the basis for the GDMA client machine

If cert downloading is enabled and a valid cert is not found on the GDMA client, the client will reach out to the server, download and verify the cert (subject to other options described below), and install it for use in subsequent communication. This action will be limited to at most one attempt per polling cycle, for each target server.

In the GDMA 2.7.0 release, downloaded certs are not subjected to any sort of certificate-has-not-been-revoked validation.

Self-signed certs may still be used in such a setup, but the GDMA client provides a separate option to control their acceptance in the case of automatic cert downloading. You should only enable acceptance of downloaded self-signed certs if you are completely sure that your infrastructure is secure, because use of such a form of certs would make it much easier for a Man-In-The-Middle attack to occur. If you need to use self-signed certs in production, you should not use the new related option. Instead, you should install those certs manually on the GDMA clients, and disable automatic cert downloading entirely.

The option to allow use of downloaded certs wherein the server name embedded in the cert does not match the server name used to retrieve the cert is provided only to simplify early testing of GDMA clients. This is not something you want to have in place in production, as it breaks a primary cert-validation test and would make it easier to mount an attack on the connection.
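How the two acceptance options interact can be summarized as a small decision function. The following Python sketch models only the policy described in this section, with the two flags standing in for the Allow_Downloaded_Self_Signed_Certs and Allow_Downloaded_Bad_Server_Name_Certs settings. It is not the GDMA client's code, it performs no actual certificate validation, and the downloaded_cert_acceptable name is hypothetical.

```python
def downloaded_cert_acceptable(self_signed, name_matches,
                               allow_self_signed=False,
                               allow_bad_server_name=False):
    """Decide whether a downloaded cert passes the two optional policy gates.

    self_signed  -- True if the downloaded cert is self-signed
    name_matches -- True if the cert's embedded server name matches the
                    server the cert was retrieved from
    Both allow_* flags default to False, mirroring the secure defaults.
    """
    if self_signed and not allow_self_signed:
        return False
    if not name_matches and not allow_bad_server_name:
        return False
    return True
```

With both flags left at their defaults, only a cert that is not self-signed and whose name matches is accepted. Each flag relaxes exactly one check, so a self-signed cert with a mismatched name is accepted only when both are enabled.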

Performance considerations

Availability of the HTTP/S channel is not an automatic recommendation for its use. By adding extra steps in the data-handling chain, namely at the fronting web server, it does impose extra load on the GroundWork server. It is possible that such additional load might help drive the total server load past the capacity of a customer's existing hardware. Partly because of that, acceptance of check-result data via NSCA and HTTP/S channels may coexist on the same GroundWork server. That capability also assists in conversion of a fleet of GDMA clients to use the alternate transport; they need not all switch over at exactly the same moment in time. A site may choose to convert some subset of GDMA clients which live in sensitive less-private locations, while leaving other GDMA clients which live fully within a protected context still running the NSCA protocol.

Local logging on GDMA clients

Starting with this release of GDMA, local logging is now enabled by default on the client. This is made possible by built-in logfile rotation, which limits both the size of each logfile and the number of rolled-out previous copies. The reason to enable logging by default is to capture any faults that might occur on a random basis throughout the infrastructure, as experienced on the GDMA client. It makes sense to preserve any evidence of such failures, to assist in diagnosis. Without logging enabled, if a fault occurs on some arbitrary machine, it can be quite difficult to identify and fix problems quickly.

The Enable_Local_Logging directive is still available, and it can still be used to control whether logging is enabled. We are simply changing the initial installed value in the gdma_auto.conf and multihost_gdma_auto.conf files on the GDMA client.

That said, as part of making the GDMA client logfiles useful in cases where arbitrary clients run into trouble, it is now recommended that you no longer override the client configuration for the Enable_Local_Logging option. This comes into play mostly in the gdma-windows host external, which should be modified to comment out this line:

Enable_Local_Logging = "off"

so it looks like:

# Enable_Local_Logging = "off"

The other standard host externals (gdma-linux, gdma-solaris, and gdma-aix) have long shipped with this option already commented out. But you may want to check your own settings for these and your own host externals.
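If you maintain many host externals, the same edit can be applied to each one's text programmatically. The following Python sketch comments out an active Enable_Local_Logging = "off" line and leaves everything else untouched, including lines that are already commented out. How you export and re-import the host external text is site-specific and not covered here; the comment_out_logging_override name is hypothetical.

```python
import re

# An active (uncommented) override line, allowing flexible spacing.
OVERRIDE_RE = re.compile(
    r'^([ \t]*)(Enable_Local_Logging[ \t]*=[ \t]*"off"[ \t]*)$',
    re.MULTILINE,
)


def comment_out_logging_override(external_text):
    """Return external_text with any active Enable_Local_Logging = "off"
    line commented out. Applying it twice gives the same result as once."""
    return OVERRIDE_RE.sub(r"\1# \2", external_text)
```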

That advice applies if in the past, with earlier releases of GDMA, you have not enabled logging directly in the client's gdma_auto.conf file. (Past releases have generally disabled logging in out-of-the-box installs, but you may have made your own adjustments past that point.) If you still have such older releases of GDMA deployed and logging is enabled locally there, then enabling it on the server in this way will risk infinite logfile growth on those older GDMA releases. Take that into account when choosing how and when to make configuration changes.

Even if general logging is disabled, daemon startup is logged anyway, to provide the most basic record of when the GDMA client was in operation.

Known issues

  • In general, OpenSSL has been upgraded to release 1.1.1c across all of the GDMA platforms. However, on Windows GDMA, the Nagios check_ldap, check_ldaps, check_mysql, check_mysql_query, and check_pgsql plugins are still linked to the OpenSSL 1.0.2h release library. (The rest of the Nagios plugins that link to OpenSSL do so using the upgraded 1.1.1c release.)
  • Microsoft's extended support for Windows Server 2003 ended on July 14, 2015. GroundWork intends to make Windows GDMA 2.7.0 the last release that will support Windows 2003 and similar Windows versions. Unless we hear from customers who still need such support, future releases of Windows GDMA will only be supported on Windows Vista, Windows Server 2008, and later versions of Windows.