Overview

GDMA 2.7.0 provides the following new features and updates:

  • HTTP(S) can be used to send check results to the GroundWork server, as an alternative to the NSCA protocol (GroundWork issue GDMA-425). This requires new server-side support, which is included with GroundWork 8; see below.
  • Local logging is now enabled by default as GDMA clients are installed. This is possible because automatic logfile rotation was implemented as of GDMA 2.6.1 (GroundWork issue GDMA-256). Having it enabled by default should help with quickly diagnosing problems that might appear on some random client.
  • GDMA client logging output has been improved in a number of small ways in both presentation and content, to make the logfiles a bit easier to decipher. Among those changes, the poller and spooler now print the GDMA version number into their respective logfiles when the daemons are started up and when the logfiles are automatically rotated, so that information is visibly exposed. Also, the program that directly runs checks now logs not just the results of failed checks, but also a notice of checks skipped in a given cycle and the results of successful checks. All of this should help with forensic investigations.
  • Server liveness checking has been radically reduced (GroundWork issue GDMA-407). This suppresses a lot of network noise traffic that might have interfered somewhat with acceptance of check results when the server was generally under heavy load.
  • A bug in the scheduling of checks has been fixed (GroundWork issue GDMA-452). The bug changed when particular checks were run and sometimes whether they were run at all, when a service is configured with both multiple instances and a _Check_Interval value larger than 1.
  • As many third-party components as we could possibly touch have been upgraded to current external releases.
    • OpenSSL 1.1.1c
    • Nagios Plugins 2.2.1
    • NRPE 3.2.1
    • Perl 5.28.2
    • many of the extra add-on Perl packages
    • OpenLDAP 2.4.47
    • curl 7.65.0
  • The check_system_uptime.pl plugin has been replaced with a version that will operate correctly on Solaris and AIX as well as Linux.

    As of this writing, that upgrade is included in the Linux GDMA builds, but it has not yet been verified in the current GDMA test builds for Solaris and AIX.

  • The AIX copy of the check_disk plugin has been fixed to correctly report results for very large disk partitions (GroundWork issue GWMON-12081).

    As of this writing, that upgrade is not yet included in current AIX GDMA test builds.

  • Support for Jolokia (https://jolokia.org/) is included in the GDMA client.
  • The default value for the ‑‑gdma_protocol <gdma_protocol> installer parameter has been changed from "HTTP" to "HTTPS". Sites which use unattended-mode installs and have previously allowed this option to default to http will need to add the option either on the command line or in the options file mentioned on the command line.
  • A new ‑‑gdma_auto_configuration <gdma_auto_configuration> option is now supported, for selection of which mechanism (if any) will be used to automatically establish the initial and/or ongoing configuration of the GDMA client. Its default value will be autoregistration, essentially mirroring the previous behavior. That standard default will be modified to none if username/password credentials are not supplied. Alternate values are autosetup and none.
  • A new ‑‑gdma_spooler_transport <gdma_spooler_transport> option is supported, for selection of how check results will be sent to the server. This option defaults to "NSCA", specifying the legacy transport. An alternative value of "HTTP" is also supported, to send such traffic via either HTTP or HTTPS to each server as specified in the Target_Server and Target_Server_Secondary options.
  • Auto_Register_Attempts will now be set to "never" at install time under conditions where Auto-Registration is not to be invoked. That prevents its accidental activation later on when other parts of the configuration are changed. If you want to enable Auto-Registration later on, you will need to override that setting, by setting the value typically to "fibonacci" and ensuring that the rest of the Auto_Register_* parameters are set appropriately.
  • A new ‑‑gdma_download_certs <gdma_download_certs> option is supported, to allow TLS certificates to be downloaded automatically from the GroundWork server to the GDMA client. If you have selected or defaulted HTTPS as the protocol for connections to the server (see the ‑‑gdma_protocol option above), this option defaults to "yes". That setting specifies that such downloading should be enabled when a certificate verify problem arises. An alternative value of "no" is also supported, in which case you will need to install certs manually on the GDMA client if HTTPS is to be used, as in previous GDMA releases.

    Automatic cert downloading works only in Linux GDMA 2.7.0.  The code for it is present in Windows GDMA 2.7.0 but it does not work on that platform.  We expect to address this in a future release on the Windows platform.  For the time being, you should forcibly disable this feature when installing Windows GDMA, by using the "--gdma_download_certs no" option when installing from the command line, or by selecting Install certs yourself instead of Download certs automatically when installing using the GUI-mode installer.

    This option is provided as a convenience measure for easier GDMA client deployment. However, in the GDMA 2.7.0 release, no certificate-revocation validation is performed on downloaded certificates. Until that changes, disabling this option and installing certs manually is recommended for full proper security.

  • A new ‑‑gdma_download_self_signed_certs <gdma_download_self_signed_certs> option is supported, to allow an automatically-downloaded self-signed TLS certificate to be treated as valid. This option defaults to "yes", which should only be used if you are completely confident that your network security cannot be broken, or if you cannot obtain a non-self-signed cert for installation on the GroundWork server and you are willing to take the consequent security risks.  An alternative value of "no" is also supported, which is the proper setting for a secure system.

    Recognition of self-signed automatically-downloaded certs works only in Linux GDMA 2.7.0.  The code for it is present in Windows GDMA 2.7.0 but it does not work on that platform.  We expect to address this in a future release on the Windows platform.  For the time being, you should forcibly disable the --gdma_download_self_signed_certs option when installing Windows GDMA.

    This option is defaulted as enabled, for convenience in quick testing of GDMA deployments where you have not yet obtained a cert for your server from some other source. For production use, be sure to pay attention to this option and set it as desired.

    If you need to use a self-signed cert, the better alternative is to install it manually on your GDMA clients, and to disable automatic downloading of any certs, self-signed or not (see the previous option).

  • A new ‑‑gdma_download_bad_server_name_certs <gdma_download_bad_server_name_certs> option is supported, to allow use of an automatically-downloaded TLS certificate whose embedded server name does not match the name of the server from which the certificate was retrieved. This option defaults to "yes", which should only be used if you are completely confident that your network security cannot be broken, or in early-test situations where you do not yet have proper certificates set up on the GroundWork server and you are willing to take the consequent security risks.  An alternative value of "no" is also supported, which is the proper setting for a secure system.

    Handling of automatically-downloaded certs works only in Linux GDMA 2.7.0.  The code for it is present in Windows GDMA 2.7.0 but it does not work on that platform.  We expect to address this in a future release on the Windows platform.  For the time being, you should forcibly disable the --gdma_download_bad_server_name_certs option when installing Windows GDMA.

    This option is defaulted as enabled, and it is only provided in the first place, for convenience in quick testing of GDMA deployments. This is definitely not the setting you should have in place for production use. Having this option enabled makes a Man-In-The-Middle attack easier.  Be sure to pay attention to this option and set it as desired.

Sending check-result data via HTTP/S

Traditionally, GDMA clients have used the NSCA protocol to send check-result data to a GroundWork server, using a dedicated port on the server which accepts such data directly into the Nagios process. Some customers have requested that such traffic be transported via an HTTP/S channel instead, either because they wish not to have unusual special ports open on the GroundWork server, or because they have HTTP/S-traffic observability tools in place that they would like to apply to this traffic as well. GDMA 2.7.0 now provides such a capability.

To support this change, modifications are made on both the client and server. The client requires at least the GDMA 2.7.0 release to be installed. In GroundWork 8, the following setup is in play on the server:

  • In the reverse-proxy container, some web-access setup is provided to accept and route this traffic.
  • A new version of the libbronx.so library, which serves as a Nagios Event Broker, is provided to accept data on this channel.
  • The bronx.cfg config file supports new options to control the new facility. See below for details.

Port usage and security considerations

Historically, the Bronx event broker has used port 5667 for acceptance of check results using the NSCA protocol. That is still available if you need it.

To support the new facility of accepting check-result traffic using the HTTP/S protocol as seen on the GDMA client, Bronx can now accept HTTP connections on port 5657. This port supports only HTTP connections, not HTTPS connections, and it is intended only for internal use on the GroundWork server (in GroundWork 8, only for internal use between the containers, on their own private network). It should not be exposed for connections outside of the GroundWork server itself. In the GroundWork 8 context, Bronx should listen on this port only on the IPv4 network interface for the nagios container, which is where Bronx runs, and that is the default setup. That allows Bronx to receive incoming data from GDMA clients, which route their requests to the reverse proxy, which handles all outside access to the GroundWork 8 server. Supporting HTTPS is the province of the reverse proxy, which will handle decryption and forward incoming data-submission requests to the Bronx port.

This new interface is not a full REST API, nor is it intended to be, ever. Except for checking certain primitive server metadata such as the server time or whether Bronx is up, Bronx does not support any data exfiltration on this port. The new interface is strictly for submission of check results, and the only data returned to a caller when submitting a set of check results is a boolean indication of success or failure in accepting that set of data.

To protect access credentials, it is strongly recommended that you run your GroundWork server in HTTPS-only mode, and set the Target_Server option (and Target_Server_Secondary option, if that is in use) for all GDMA clients to use "https://" as the protocol for connecting to the GroundWork server.
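
A quick audit of client settings against this recommendation might look like the following minimal sketch. It assumes the Target_Server and Target_Server_Secondary values have already been read from the client configuration as plain strings; the function name and the example hostnames are hypothetical.

```python
from urllib.parse import urlsplit


def uses_https(target_server: str) -> bool:
    """Return True if a Target_Server-style URL uses the https scheme."""
    return urlsplit(target_server.strip()).scheme.lower() == "https"


# Hypothetical values, as they might be set for Target_Server and
# Target_Server_Secondary on one client.
targets = ["https://gwserver.example.com", "http://gwserver2.example.com"]
insecure = [t for t in targets if not uses_https(t)]
print(insecure)
```

Any entry that ends up in the insecure list would be sending credentials over an unencrypted channel.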

Even with HTTPS in play for the GDMA client connection, it is still possible for an attacker to attempt a brute-force attack to guess the username and password. The present implementation does not introduce any artificial delays at the Bronx level to slow down such attacks. Use of rather long and random passwords is therefore recommended, to make such an attack impractical.
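
One way to produce such a password is sketched below using Python's standard secrets module. The length is an arbitrary choice for illustration, not a GroundWork requirement, and it is an assumption that every character in the generated string is acceptable to your GroundWork user setup.

```python
import secrets


def generate_password(nbytes=32):
    """Return a URL-safe random string built from nbytes of OS randomness."""
    return secrets.token_urlsafe(nbytes)


password = generate_password()
print(len(password))
```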

Client/server time mismatch

If HTTP/S is in use for transport of check results, a validation of the client clock is run at GDMA startup. If the time differs by more than 10 seconds in either direction from the server clock, both the nagios.log and event_broker.log logfiles on the server will contain a message such as this:

[1559272754] [BRONX] {Bronx_AccessHandlerCallback} warning: client '172.28.113.159'
    clock is running roughly 686 seconds fast relative to server clock

That should help to identify clients where the time mismatch is getting so great that Bronx or Nagios may intentionally drop incoming data on the floor because it is too far out of bounds. Such dropped data will also be logged in the same logfiles, thusly:

[1559273176] [BRONX] {process_request_payload} Received 9 check results, submitted 0, dropped 9,
    from client '172.28.113.159' via peer 127.0.0.1

If that happens, then with

logging=warning

or higher levels of logging set in bronx.cfg, you can also see the reasons for such discarding of data:

[1559273379] [BRONX] {process_request_payload} Dropping check result from '172.23.123.101' via peer 127.0.0.1
    for host 'somehost' service 'linux_disk_root' with stale timestamp (check result was 1202 seconds old).

[1559273439] [BRONX] {process_request_payload} Dropping check result from '172.23.117.259' via peer 127.0.0.1
    for host 'otherhost' service 'gdma_poller' with timestamp too far in the future.

In the GroundWork 8 context, the content of those logfiles is captured as part of the nagios container output.
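
Messages of this kind can be collected with a short script, as in the following sketch. The patterns are derived only from the sample messages shown above, so actual Bronx output may differ in detail; in particular, the word "slow" for a lagging client clock is an assumption, since only a "fast" example appears here.

```python
import re

# Patterns derived only from the sample messages in this section.
DRIFT_RE = re.compile(
    r"\[(\d+)\] \[BRONX\] .*client '([^']+)'\s+clock is running roughly "
    r"(\d+) seconds (fast|slow) relative to server clock"
)
DROP_RE = re.compile(
    r"\[(\d+)\] \[BRONX\] .*Received (\d+) check results, submitted (\d+), "
    r"dropped (\d+),\s+from client '([^']+)'"
)


def scan(log_text):
    """Return (clock drift warnings, drop reports) found in the log text."""
    drifts = [
        (m.group(2), int(m.group(3)) * (1 if m.group(4) == "fast" else -1))
        for m in DRIFT_RE.finditer(log_text)
    ]
    drops = [
        (m.group(5), int(m.group(4)))
        for m in DROP_RE.finditer(log_text)
        if int(m.group(4)) > 0
    ]
    return drifts, drops


sample = """\
[1559272754] [BRONX] {Bronx_AccessHandlerCallback} warning: client '172.28.113.159'
    clock is running roughly 686 seconds fast relative to server clock
[1559273176] [BRONX] {process_request_payload} Received 9 check results, submitted 0, dropped 9,
    from client '172.28.113.159' via peer 127.0.0.1
"""
drifts, drops = scan(sample)
print(drifts)
print(drops)
```

Each drift entry pairs a client address with its offset in seconds (positive for a fast clock), and each drop entry pairs a client address with the number of results discarded.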

GroundWork 8.x.x server setup

In GroundWork 8, server-side setup to enable check-result traffic to be received on the server using HTTP/S is largely present by default.  The two parts an administrator is most likely concerned with are these:

  • /usr/local/groundwork/config/bronx.cfg is configured by default as shipped to enable this facility, via the http_listener=on option.
  • GDMA traffic from the three standard private IPv4 address blocks (https://en.wikipedia.org/wiki/Private_network) is preconfigured to be allowed in, via the default setting of the http_listener_allowed_clients option.  This should satisfy the needs of customers who monitor only infrastructure internal to their organization.  Other customers may need to modify the value of that option to allow access from GDMA clients positioned elsewhere in the network.

For details of those options, see the next section.

New bronx.cfg configuration settings

The following new settings are available in the bronx.cfg file. Default settings for GroundWork Monitor 8 are shown here. Commentary on the meaning of these options and on appropriate values for them is provided in the config file. To enable accepting data via HTTP/S, the http_listener option must be "on".

http_listener=on
http_connection_limit=32
http_idle_connection_timeout=20
http_thread_pool_size=4
http_listener_port=5657
http_listener_ipv4_hostname=nagios
http_listener_allowed_peers=0.0.0.0/0
http_listener_allowed_clients=10.0.0.0/8,172.16.0.0/12,192.168.0.0/16

If you want to use this channel, the http_listener_allowed_clients option must be set to reflect the set of hosts or subnets you expect to receive data from via this transport. (Information about the calling client must be passed along to Bronx by the fronting web server [NGINX in the GroundWork 8 reverse proxy container], and that is a standard part of the GroundWork 8 reverse-proxy setup. So the fact that Bronx is listening only on an intra-container private network interface [as specified by http_listener_ipv4_hostname] and accepting direct socket connections from the web server acting as a reverse proxy [as allowed by http_listener_allowed_peers] does not preclude it from accepting data from your GDMA clients. It just means that there must be some intermediary process on the GroundWork server, namely the reverse proxy web server, that passes along each request.)

For simplicity in initial setup, the http_listener_allowed_clients value is pre-set to allow client data submissions from all the standard private IPv4 networks. You may tune this as desired. IPv6 addresses and CIDR blocks are also supported.
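
Before changing this value, it can help to check which client addresses a candidate list would admit. The sketch below uses Python's standard ipaddress module to do ordinary CIDR matching; it illustrates the concept only and is not Bronx's own matching code. The function names are hypothetical.

```python
import ipaddress

# Default value of http_listener_allowed_clients, as shown above.
ALLOWED_CLIENTS = "10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"


def parse_networks(value):
    """Split a comma-separated list of addresses or CIDR blocks."""
    return [
        ipaddress.ip_network(item.strip(), strict=False)
        for item in value.split(",")
        if item.strip()
    ]


def is_allowed(client, networks):
    """Return True if the client address falls inside any listed network."""
    addr = ipaddress.ip_address(client)
    return any(addr.version == net.version and addr in net for net in networks)


networks = parse_networks(ALLOWED_CLIENTS)
print(is_allowed("172.28.113.159", networks))
print(is_allowed("8.8.8.8", networks))
```

The same helper accepts IPv6 entries, so a mixed list such as the default value plus an IPv6 block can be checked the same way.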

If you change the port number in the http_listener_port option in GroundWork 8, you would also need to change it in the reverse-proxy setup.  There's no point to doing so, so don't do that.

The http_thread_pool_size is set to a fairly low number, but this should be adequate even for large numbers of GDMA clients reporting in. Each thread can manage many concurrent incoming requests. Bumping up this number arbitrarily is discouraged, partly because there is currently a fairly heavy per-thread memory overhead in the allocation of thread stacks, at least in terms of virtual memory allocation.

GDMA client configuration settings

To have a GDMA 2.7.0 or later client send data via HTTP/S, the GDMA client's Spooler_Transport option must be set to "HTTP" instead of "NSCA". A GUI-mode or text-mode installation will prompt for this setting. An unattended-mode installation can use the ‑‑gdma_spooler_transport command-line option, which can be set to either http or nsca.

Sending in check results via HTTP/S uses the Auto_Register_User and Auto_Register_Pass credentials, so those must be set as well. For unattended installs, see the ‑‑gdma_autoregistration_username and ‑‑gdma_autoregistration_password options.

The ‑‑gdma_auto_configuration <gdma_auto_configuration> installer command-line option, or its equivalent setting in an option file or in responses to text-mode or GUI-mode installation prompts, ends up adjusting the Auto_Register_Attempts and Enable_Auto_Setup options in the installed gdma/config/gdma_auto.conf file.

Auto-Registration or Auto-Setup is only possible if the username/password credentials are available. So if those credentials are not supplied during the installation, the unattended-mode option will be ignored, and the text-mode or GUI-mode question will be skipped. In that case, the installation will default to an Auto_Register_Attempts setting of never and an Enable_Auto_Setup setting of off.

The ‑‑gdma_spooler_transport <gdma_spooler_transport> installer command-line option, or its equivalent setting in an option file or in a text-mode or GUI-mode install, ends up setting the value of the new Spooler_Transport option in the gdma/config/gdma_auto.conf file.

# Default value shown; may be set to HTTP instead.
Spooler_Transport = "NSCA"

Use of HTTP/S for transporting check results is only possible if the username/password credentials are available. So if those credentials are not supplied during the installation, the unattended-mode option will be ignored, and the text-mode or GUI-mode question will be skipped. In that case, the installation will default to a Spooler_Transport setting of NSCA.
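
The fallback rule above, along with reading the resulting directive back from a config file, can be sketched as follows. This is an illustration only: it assumes every directive in gdma_auto.conf follows the Name = "value" form of the sample above, and the function names are hypothetical rather than part of the installer.

```python
import re

# Matches directive lines of the form shown above:  Name = "value"
DIRECTIVE_RE = re.compile(r'^\s*([A-Za-z_]\w*)\s*=\s*"([^"]*)"\s*$')


def read_directives(text):
    """Collect Name = "value" settings, skipping '#' comment lines."""
    settings = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        match = DIRECTIVE_RE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def effective_transport(requested, username, password):
    """Fall back to NSCA when credentials are missing, as described above."""
    if not (username and password):
        return "NSCA"
    return requested


conf_text = '# Default value shown; may be set to HTTP instead.\nSpooler_Transport = "NSCA"\n'
print(read_directives(conf_text))
print(effective_transport("HTTP", "", ""))
```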

Automatic client management of TLS certificates

This feature currently only works in Linux GDMA (2.7.0 or later), not in Windows GDMA.

See the warnings about all of these options in the Overview section of this page.

When configuring the GDMA client to use HTTPS for communication with the GroundWork server, it has historically been necessary to manually install the server's SSL/TLS certificate on the GDMA client. The GDMA 2.7.0 release supports new options to allow this action to be carried out automatically, on an ongoing basis. The installer command-line forms are:

  • ‑‑gdma_download_certs <gdma_download_certs>
  • ‑‑gdma_download_self_signed_certs <gdma_download_self_signed_certs>
  • ‑‑gdma_download_bad_server_name_certs <gdma_download_bad_server_name_certs>

Details are available via the installer's ‑‑help option. There are equivalent options available in an option file or via responses to text-mode or GUI-mode installation prompts. These settings end up manipulating the Download_Certs_Automatically, Allow_Downloaded_Self_Signed_Certs, and Allow_Downloaded_Bad_Server_Name_Certs options in the installed gdma/config/gdma_auto.conf file. For convenience at many sites, automatic cert downloading is defaulted as enabled, and for convenience in initial testing, both acceptance of self-signed certs and acceptance of certs with mismatched server names are also defaulted as enabled. However, for proper security, acceptance of downloaded self-signed certs and downloaded certs with mismatched server names should be disabled. To avoid casual or accidental enablement of the latter two options, the installer does not even place the corresponding directives in the installed config file if they are not enabled at install time.

If cert downloading is disabled, the server SSL/TLS certificate will need to be managed by the customer in the legacy manner, by a mechanism such as:

  • separate placement of the cert file on the client at the time of GDMA installation on the client
  • via some sort of customer-managed configuration tool such as Ansible, Puppet, or Chef
  • via embedding in a VM instance image along with GDMA itself, that is used as the basis for the GDMA client machine

If cert downloading is enabled and a valid cert is not found on the GDMA client, the client will reach out to the server, download and verify the cert (subject to other options described below), and install it for use in subsequent communication. This action will be limited to at most one attempt per polling cycle, for each target server.
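
The per-cycle, per-server limit on download attempts can be pictured with a small bookkeeping class such as the one below. This is a sketch of the rule as stated, not the GDMA client's actual code; the class name, method name, and server names are hypothetical.

```python
class CertDownloadLimiter:
    """Allow at most one download attempt per polling cycle per server."""

    def __init__(self):
        # Maps each target server to the cycle number of its last attempt.
        self._last_cycle = {}

    def may_attempt(self, server, cycle):
        """Record and permit an attempt unless one already ran this cycle."""
        if self._last_cycle.get(server) == cycle:
            return False
        self._last_cycle[server] = cycle
        return True


limiter = CertDownloadLimiter()
print(limiter.may_attempt("primary.example.com", 1))
print(limiter.may_attempt("primary.example.com", 1))
print(limiter.may_attempt("secondary.example.com", 1))
```

The second call is refused because the same server was already tried in that cycle, while a different target server keeps its own allowance.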

In the GDMA 2.7.0 release, downloaded certs are not subjected to any sort of certificate-has-not-been-revoked validation.

Self-signed certs may still be used in such a setup, but the GDMA client provides a separate option to control their acceptance in the case of automatic cert downloading. You should only enable acceptance of downloaded self-signed certs if you are completely sure that your infrastructure is secure, because use of such a form of certs would make it much easier for a Man-In-The-Middle attack to occur. If you need to use self-signed certs in production, you should not use the new related option. Instead, you should install those certs manually on the GDMA clients, and disable automatic cert downloading entirely.

The option to allow use of downloaded certs wherein the server name embedded in the cert does not match the server name used to retrieve the cert is provided only to simplify early testing of GDMA clients. This is not something you want to have in place in production, as it breaks a primary cert-validation test and would make it easier to mount an attack on the connection.
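
How the three options interact can be summarized in a small decision function. The sketch below models only the option logic described in this section; it assumes the two properties of the certificate (self-signed or not, server name matching or not) have already been determined by some other means, and it leaves out every other validation step. All names are hypothetical.

```python
def accept_downloaded_cert(download_certs, allow_self_signed,
                           allow_bad_server_name, is_self_signed,
                           name_matches):
    """Decide whether an automatically downloaded cert may be installed."""
    if not download_certs:
        return False  # downloading disabled: certs are managed manually
    if is_self_signed and not allow_self_signed:
        return False  # self-signed certs need their own opt-in
    if not name_matches and not allow_bad_server_name:
        return False  # mismatched server names need their own opt-in
    return True


# A hardened setup: downloading on, both risky allowances off.
print(accept_downloaded_cert(True, False, False,
                             is_self_signed=True, name_matches=True))
```

With both allowances off, only a cert that is not self-signed and whose embedded name matches the server is accepted.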

Performance considerations

Availability of the HTTP/S channel is not an automatic recommendation for its use. By adding extra steps in the data-handling chain, namely at the fronting web server, it does impose extra load on the GroundWork server. It is possible that such additional load might help drive the total server load past the capacity of a customer's existing hardware. Partly because of that, acceptance of check-result data via NSCA and HTTP/S channels may coexist on the same GroundWork server. That capability also assists in conversion of a fleet of GDMA clients to use the alternate transport; they need not all switch over at exactly the same moment in time. A site may choose to convert some subset of GDMA clients which live in sensitive less-private locations, while leaving other GDMA clients which live fully within a protected context still running the NSCA protocol.

Conversely, the implementation of the HTTP/S support within Bronx may well provide greater concurrency in accepting check results from many GDMA clients, when compared to the current implementation of the NSCA support within Bronx.  That could potentially relieve some traffic jams experienced in large installations.  So that could be a strong reason to use the HTTP/S transport, even if you are not concerned about port usage, firewall rules, or traffic observability.

Local logging on GDMA clients

Starting with this release of GDMA, local logging is now enabled by default on the client. This is made possible by built-in logfile rotation, which limits both the size of each logfile and the number of rolled-out previous copies. The reason to enable logging by default is to capture any faults that might occur on a random basis throughout the infrastructure, as experienced on the GDMA client. It makes sense to preserve any evidence of such failures, to assist in diagnosis. Without logging enabled, if a fault occurs on some arbitrary machine, it can be quite difficult to identify and fix problems quickly.
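
The way size-limited rotation bounds disk usage can be seen in a small experiment with Python's standard RotatingFileHandler. This is an analogy only: GDMA implements its own rotation, and the limits used here (1 KB per file, two rolled-out copies) are illustrative values, not GDMA's settings.

```python
import logging
import logging.handlers
import os
import tempfile

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "example.log")

# Illustrative limits only: 1 KB per file, two rolled-out copies.
handler = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=1024, backupCount=2)
logger = logging.getLogger("rotation_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

# Write far more than the limits allow; older content is discarded.
for i in range(500):
    logger.info("check cycle %d completed", i)
handler.close()

# Never more than the active file plus backupCount rolled copies.
print(sorted(os.listdir(log_dir)))
```

However many messages are written, the directory holds at most the active logfile plus the configured number of older copies.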

The Enable_Local_Logging directive is still available, and it can still be used to control whether logging is enabled. We are simply changing the initial installed value in the gdma_auto.conf and multihost_gdma_auto.conf files on the GDMA client.

That said, as part of making the GDMA client logfiles useful in cases where arbitrary clients run into trouble, it is now recommended that you no longer override the client configuration for the Enable_Local_Logging option. This comes into play mostly in the gdma-windows host external, which should be modified to comment out this line:

Enable_Local_Logging = "off"

so it looks like:

# Enable_Local_Logging = "off"

The other standard host externals (gdma-linux, gdma-solaris, and gdma-aix) have long shipped with this option already commented out. But you may want to check your own settings for these and your own host externals.
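
If you maintain many host externals as text, the same edit can be applied mechanically. The following sketch assumes the externals content is available as a plain string and that the directive appears in the form shown above; the function name is hypothetical.

```python
import re


def comment_out(text, directive="Enable_Local_Logging"):
    """Prefix uncommented assignments of the directive with '# '."""
    pattern = re.compile(
        rf"^([ \t]*)({re.escape(directive)}[ \t]*=.*)$", re.MULTILINE)
    return pattern.sub(r"\1# \2", text)


before = 'Enable_Local_Logging = "off"\n'
print(comment_out(before), end="")
```

Lines that are already commented out are left untouched, so the function is safe to run more than once.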

That advice applies if in the past, with earlier releases of GDMA, you have not enabled logging directly in the client's gdma_auto.conf file. (Past releases have generally disabled logging in out-of-the-box installs, but you may have made your own adjustments past that point.) If you still have such older releases of GDMA deployed and logging is enabled locally there, then enabling it on the server in this way will risk infinite logfile growth on those older GDMA releases. Take that into account when choosing how and when to make configuration changes.

Even if general logging is disabled, daemon startup is logged anyway, to provide the most basic record of when the GDMA client was in operation.

Known Issues

  • In general, OpenSSL has been upgraded to release 1.1.1c across all of the GDMA platforms. However, on Windows GDMA, the Nagios check_ldap, check_ldaps, check_mysql, check_mysql_query, and check_pgsql plugins are still linked to the OpenSSL 1.0.2h release library. (The rest of the Nagios plugins that link to OpenSSL do so using the upgraded 1.1.1c release.)
  • Microsoft's extended support for Windows Server 2003 ended on July 14, 2015. GroundWork intends to make Windows GDMA 2.7.0 the last release that will support Windows 2003 and similar Windows versions. Unless we hear from customers who still need such support, future releases of Windows GDMA will only be supported on Windows Vista, Windows Server 2008, and later versions of Windows.

Related Resources