About Transit Connection Generator (TCG)

The connection between GroundWork servers and monitoring data sources such as Nagios is handled by a new component called TCG, the Transit Connection Generator (see https://github.com/gwos/tcg). Based on the NATS project, TCG creates resilient connections that automatically re-establish when broken, and buffers data to ensure a lossless stream of metrics. TCG is flexible and easy to configure.

The GroundWork TCG is open source and available on GitHub.

Note that TCG does not support looped connections (sending data to itself).

Types of TCG Connections

The following connection types are supported in TCG as of GroundWork Monitor 8.2.1:

APM

The Application Performance Monitoring connector, used with Jaeger, also relies on TCG, along with a Prometheus client, to forward custom metrics from your applications to GroundWork for monitoring. For details see Application Performance Monitoring.

Elastic

This connector is similar to a Cloud Hub connector in that it connects to a source of monitoring data, in this case the Elastic Stack. It uses TCG to send its state and metric data to GroundWork. For details see Elasticsearch Monitoring.

Kubernetes

The Kubernetes connector is designed to run in Kubernetes and access the metrics API. Once connected to GroundWork, it gives you a clean set of services detailing metrics on your Nodes, Pods, and Namespaces in Kubernetes. You need only install the Metrics API, set it up, and authenticate to start monitoring your Kubernetes workloads in a clear, useful format alongside your monitoring of hosts and network resources. For details see Kubernetes Monitoring.

Nagios

The typical default Standalone GroundWork server (8.1.0>) uses TCG to forward data from Nagios to GroundWork Foundation, our normalization and aggregation layer. You will see a connection called “Local Nagios” under the Connectors menu; this is the local internal connection that performs this forwarding. Additionally, Child servers use TCG to forward Nagios state data and metrics to Parent servers.

This is a connector to a local Nagios instance with the Data Geyser enhancement running.
[Screenshot: Nagios connection type set to Local]
You can also connect this to a Parent server to create an independently managed GroundWork Child.
[Screenshot: Nagios connection type set to Parent]

The Nagios Parent Managed Child is a special connection type for sending data from a Child to a Parent. It is special because it is provisioned and managed on the Parent, not the Child. 

Office 365

The Office 365 connector gives you usage metrics for your Microsoft™ subscription services, whether you use Microsoft Azure™ or Microsoft 365™ to access them. By simply authenticating to the Microsoft API with this connector, you can be notified when the services your business runs on are down or degraded, or when the licenses or resources you use fall below acceptable levels. For details see Microsoft 365 Monitoring.

SNMP

This connector connects to NeDi and independently pulls metrics from the NeDi database for the systems under monitoring. For details see SNMP Monitoring.

Configuration Details

TCG uses NATS for guaranteed delivery of monitoring messages. The NATS component uses on-disk queues to save messages that cannot be delivered immediately. These queues can be limited separately by size and by age. Here is how these settings can be adjusted for your TCG implementations.

Defaults

The maximum age a message can remain in TCG is 10 days (240 hours), and the maximum size a queue can reach is 50 GB. This means that by default you need to have 50 GB of disk set aside per connector. In a standalone GroundWork installation, this 50 GB requirement is covered easily by the 200 GB minimum needed for running GroundWork 8.x. However, when one or more of the optional configurations above is used, more space has to be provisioned.
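
For reference, these defaults correspond to the following entries in a connector's configuration file (tcg_config.yaml, or datageyser_tcg_config.yaml for the Nagios connector); the procedures below show where each file is located:

# Default NATS store limits: 240h0m0s = 10 days, 53687091200 bytes = 50 x 1024^3 = 50 GB
natsStoreMaxAge: 240h0m0s
natsStoreMaxBytes: 53687091200
CODE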

Changing Maximum Age and Queue Size for Nagios Connections

The Nagios instances on GroundWork Standalone or Child servers will queue and retransmit monitoring results using TCG. If you have Child servers, you may want to adjust the settings for how much data to queue and for how long. 

To adjust the Nagios connector size on a Standalone or Child GroundWork server: 

  1. Access the command line of your GroundWork server and change to the gw8 directory:

    cd gw8
    CODE
  2. Edit the datageyser_tcg_config.yaml file in the nagios container: 

    docker-compose exec nagios vi /usr/local/groundwork/config/datageyser_tcg_config.yaml
    CODE
  3. Adjust the following lines as needed: 

        natsStoreMaxAge: 240h0m0s
        natsStoreMaxBytes: 53687091200

    For example, to set the maximum age to 3 days (72 hours) and the maximum size to 1 GB:

        natsStoreMaxAge: 72h0m0s
        natsStoreMaxBytes: 1073741824

    and save the file.

  4. Restart GroundWork to make the changes take effect: 

    docker-compose down
    docker-compose up -d
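
    Once the containers are back up, you can spot-check that the new values are in place. For example (assuming grep is available in the nagios container, as vi is above):

    docker-compose exec nagios grep natsStore /usr/local/groundwork/config/datageyser_tcg_config.yaml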

Changing Maximum Age and Queue Size for Containerized Connections

If you run connectors on your GroundWork server as containers, these connections will queue and retransmit monitoring results using TCG. You may want to adjust how much data is queued and for how long, since the local connectors can consume a lot of disk space if they are unable to communicate with the GroundWork system for a while.

These settings are saved in a tcg_config.yaml file (or datageyser_tcg_config.yaml for the Nagios connector). This file is typically located in the container running the connector, or in the directory where the connector runs when it runs as a Linux service.
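
If you are unsure of the file's path inside a particular connector container, you can list it directly. For example, for the Elastic connector container used below (this assumes a standard shell and ls are available in the image):

docker-compose exec tcg-elastic sh -c 'ls /tcg/*/tcg_config.yaml'
CODE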

For example, to adjust the Elasticsearch connector maximum age and queue size on a GroundWork server: 

  1. Access the command line of your GroundWork server and change to the gw8 directory:

    cd gw8
  2. Edit the tcg_config.yaml file in the connector container. For example, if you are running the Elastic connector as a container, the container will be called tcg-elastic and the file to edit will be /tcg/elastic-connector/tcg_config.yaml:

    docker-compose exec tcg-elastic vi /tcg/elastic-connector/tcg_config.yaml
  3. Adjust the following lines as needed: 

        natsStoreMaxAge: 240h0m0s
        natsStoreMaxBytes: 53687091200

    For example, to set the maximum age to 3 days (72 hours) and the maximum size to 1 GB:

        natsStoreMaxAge: 72h0m0s
        natsStoreMaxBytes: 1073741824

    and save the file.

  4. Restart GroundWork to make the changes take effect: 

    docker-compose down
    docker-compose up -d

    Changes to values in this file will be preserved across restarts of GroundWork, but other changes such as comments will not. 
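
    Because the NATS queue is stored under the connector's /tcg volume, you can also spot-check how much disk space it is currently using. A quick check, assuming the tcg-elastic container from this example and that du is available in the image:

    docker-compose exec tcg-elastic du -sh /tcg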

Optimizing Inventory and Metrics Transmission

Data Batching

High-load performance can potentially be improved by sending metrics and inventory in batches. Batching is enabled in specific contexts by default (8.2.1>), and can be changed through environment variables set in docker-compose.override.yml. Non-Nagios TCG connectors do not have batching enabled by default; it can be configured for them using time-limited and size-limited values. Nagios, however, is pre-configured with batching enabled (8.2.1>): the pre-configured values for events, metrics, and the maximum bytes of a Nagios batch are 5 seconds, 10 seconds, and 100 kilobytes (102400 bytes), respectively. If you wish to override these values, the starting point is the following docker-compose.override.yml settings, which apply to any TCG connector container and are shown here for Nagios:

The default docker-compose.override.yml that exists after installation already has a nagios section, which should either be used for these settings or removed.

services:
  nagios:
    environment:
      - TCG_CONNECTOR_BATCHEVENTS=5s
      - TCG_CONNECTOR_BATCHMETRICS=10s
      - TCG_CONNECTOR_BATCHMAXBYTES=102400
CODE
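
The same variables can be applied to non-Nagios connector containers to enable batching for them. As a sketch, assuming you run the Elastic connector as the tcg-elastic container described elsewhere in this document, the override would look like:

services:
  tcg-elastic:
    environment:
      - TCG_CONNECTOR_BATCHEVENTS=5s
      - TCG_CONNECTOR_BATCHMETRICS=10s
      - TCG_CONNECTOR_BATCHMAXBYTES=102400
CODE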

Gzip Compression

As of GroundWork Monitor 8.2.2, you can compress inventory and metrics over the connection between TCG and Foundation. This is especially useful in architectures where there is network latency between the physical devices where metrics are collected and the central database, such as a Nagios Parent Managed Child setup. Gzip compression can reduce network bottlenecks in high-load situations where metrics are sent in batches, at a minimal cost in processing power.

An entry in docker-compose.override.yml can be used to override the default behavior. Possible values of the TCG_CONNECTOR_GWENCODE environment variable are force, child, or off. If this variable is omitted from your configuration, TCG and Nagios compression will default to child in all contexts. The child value enables compression only on connections from a Nagios Child to a Parent Foundation. The following configuration forces Nagios to use compression in all monitoring architectures, not only Nagios Child to Parent:

services:
  nagios:
    environment:
      - TCG_CONNECTOR_GWENCODE=force
CODE
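
Conversely, the off value can be used to disable compression for a connector entirely:

services:
  nagios:
    environment:
      - TCG_CONNECTOR_GWENCODE=off
CODE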

Appendices

Appendix A: Installing Containerized TCG Post GroundWork 8 Install

During an initial GroundWork Monitor 8 installation, you will be prompted for optional services to install; any of these services can also be installed at a later time. Doing so is easiest when your GroundWork server can pull the required containers from the Docker Hub repository where GroundWork publishes them (free to download), but some customers deploy their GroundWork Monitor system in offline environments where this isn't possible. This section steps through how to install the TCG connectors without accessing the Internet at all, as long as you have the installer file present. This pertains to:

  • TCG APM: Install TCG APM Connector to use along with a Prometheus client to forward custom metrics from your applications to GroundWork for monitoring
  • TCG ELASTIC: Install TCG Elastic Connector if you are using Elastic
  • TCG SNMP: Install TCG SNMP Connector if you are using NeDi for Network Monitoring

To install these optional containers, you will need to rerun the installer with --noexec, which unpacks the container images, and then load them as follows:

  1. Place the installer file in the directory immediately above the gw8 directory, and run it:

    ./gw8setup-8.2.2-GA.run --noexec
    CODE
  2. Then, change to the gw8 directory, where you will see several .bz2 archives:

    cd gw8
    CODE
  3. Enter the following to load the tcg image:

    docker load -i gw8images-tcg.tar.bz2
    CODE
  4. Next, edit the following .yml file:

    docker-compose.override.yml
    CODE
  5. Include the required text to start the container and deploy the volumes, depending on which one you need. Here's what you need for each:

      # APM Connector
      tcg-apm:
        image: groundworkdevelopment/tcg:${TAG}
        entrypoint: ["/app/docker_cmd.sh", "apm-connector"]
        volumes:
          - tcg-var:/tcg
      # Elastic Connector
      tcg-elastic:
        image: groundworkdevelopment/tcg:${TAG}
        entrypoint: ["/app/docker_cmd.sh", "elastic-connector"]
        volumes:
          - tcg-var:/tcg
      # SNMP Connector
      tcg-snmp:
        image: groundworkdevelopment/tcg:${TAG}
        entrypoint: ["/app/docker_cmd.sh", "snmp-connector"]
        environment:
          - NEDI_CONF_PATH=/usr/local/groundwork/config/nedi
        volumes:
          - tcg-var:/tcg
          - ulg:/usr/local/groundwork/config/
      # Kubernetes Connector
      tcg-kubernetes:
        image: groundworkdevelopment/tcg:${TAG}
        entrypoint: ["/app/docker_cmd.sh", "kubernetes-connector"]
        volumes:
          - tcg-var:/tcg
      # Office 365 Connector
      tcg-office:
        image: groundworkdevelopment/tcg:${TAG}
        entrypoint: ["/app/docker_cmd.sh", "office-connector"]
        volumes:
          - tcg-var:/tcg
    CODE
  6. Also include the volumes section with the tcg-var volume, if it isn't already present. This is typically near the end of the file:

    volumes:
       tcg-var:
    CODE
  7. Restart GroundWork to make the changes take effect: 

    docker-compose down
    docker-compose up -d
    CODE
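
    Once the stack is back up, you can confirm that the new connector container is running. For example, if you added the Elastic connector (the service name matches the override file above):

    docker-compose ps tcg-elastic
    CODE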

Related Resources