This article reviews how to add and configure a Cloudera connection using GroundWork Cloud Hub. The connection requires a unique set of parameters (e.g., credentials). If you are connecting to a remote GroundWork server to send results, you will need the RESTAPIACCESS token from that remote GroundWork server.
Adding a new connection
To access Cloud Hub configuration, log in to GroundWork Monitor as a member of the Admin role (e.g., user admin), and select Configuration > Cloud Hub. To add a new connection, click the +Add button next to the Cloudera connector icon. You will need to create a new connection in this way for each Cloudera project to be monitored.
The data the GroundWork server receives comes from the remote Cloudera server. The information is pulled from the API periodically, at the check interval that is set. On the configuration page you will need to enter both the GroundWork Server and Cloudera Connector parameters, and select the Service Views (resources to be monitored).
- The GroundWork Server can simply be the same as the one you are running the Cloud Hub connector on, or it can be a remote server.
- If it's the same as the one you are running on, leave the directive Use Local Connection checked.
- Otherwise, uncheck this box, fill in the hostname of the remote GroundWork server in the Hostname field, leave RESTAPIACCESS in the Username field, and paste in the encrypted Token. Users within the Admin role can obtain the token on the remote GroundWork server by going to Administration > Security and looking under Webservices API Account: RESTAPIACCESS, Encrypted Token. Copy the key from the remote server into the Token field on the Cloud Hub server.
Once you have the GroundWork Server side of the form filled out, click Test. If the credentials are correct and you have access to the API, you will see a Success message. Otherwise, an error message will give you a hint as to what is wrong so you can try again.
When you use a remote server, the Cloudera monitoring data populates that remote server and does not appear in the local GroundWork server.
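The local versus remote choice described above can be summarized as a small validation routine. This is only an illustrative sketch: the `GroundWorkTarget` class, its field names, and the `validate` function are hypothetical and are not part of Cloud Hub. They mirror the form fields named in this article.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class GroundWorkTarget:
    """Hypothetical model of the GroundWork Server half of the form."""
    use_local_connection: bool = True
    hostname: str = "localhost"
    username: str = "RESTAPIACCESS"
    token: str = ""
    ssl: bool = False


def validate(target: GroundWorkTarget) -> List[str]:
    """Return a list of problems; an empty list means the entries look complete."""
    problems: List[str] = []
    if target.use_local_connection:
        # Results go to the server Cloud Hub runs on; nothing else is required.
        return problems
    if not target.hostname.strip():
        problems.append("Hostname is required for a remote GroundWork server")
    # The Hostname field should not carry a port number.
    # (This simple check does not handle IPv6 literals.)
    if re.search(r":\d+$", target.hostname):
        problems.append("Hostname must not include a port number")
    if not target.token.strip():
        problems.append("Encrypted Token is required for a remote GroundWork server")
    return problems
```

For example, `validate(GroundWorkTarget())` returns an empty list, while a remote target with an empty token reports that the token is missing. The actual Test button also contacts the server, which this sketch does not attempt.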
Table: GroundWork server values
- Version: Indicates the minimum GroundWork Monitor version needed. A version below the indicated value is incompatible.
- Hostname: The host name or IP address where a GroundWork server is running. A port number should not be entered here. If GroundWork is running on the same server, you can enter localhost.
- Username: The provisioned Username granted API access on the GroundWork server.
- Token: The corresponding API Token for the given Username on the GroundWork server; see Administration > Security under Webservices API Account: RESTAPIACCESS, Encrypted Token.
- SSL: Check this box if the GroundWork server is provisioned with a secure HTTPS transport.
- Merge Hosts: If checked, this option combines all metrics of same-named hosts under one host. For example, if there is a Nagios configured host named demo1 and a Cloud Hub discovered host named demo1, the services for both configured and discovered hosts will be combined under the hostname demo1 (case-sensitive).
- Monitor: If checked, enables the connection to be monitored.
- Use Local Connection: This directive determines where the Cloud Hub results are sent. If this field is checked, results are posted to the same server Cloud Hub is running on. If it is unchecked, you can forward results to any accessible GroundWork server you define with the name and API key.
- Ownership: Identifies the owner of a connector's hosts; ownership can be switched. When a Cloud Hub connector is instantiated, the following options are available for ownership:
  - Always take ownership: The connector will assume ownership of all hosts it instantiates, even merged hosts. This will remain true even if another app merges the host.
  - Leave ownership if already owned: The connector host will remain with the existing owner until or unless the owner deletes the host.
  - Always defer ownership (default): This option leaves ownership unchanged on merged hosts, and allows other apps to take ownership.
  Note that multiple apps can report on a single service, but only one can own the host.
- Connection Status: Click Test to verify a connection using the GroundWork server entries.
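To make the Merge Hosts behavior concrete, the sketch below groups reported services by exact host name. It is an assumption-laden illustration, not GroundWork's implementation: the `merge_hosts` function, the source labels, and the service names are all hypothetical. The only behavior taken from the description above is that same-named hosts are combined and that the comparison is case-sensitive.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def merge_hosts(reports: Iterable[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Group (source, hostname, service) reports under each exact host name."""
    merged: Dict[str, List[str]] = defaultdict(list)
    for source, hostname, service in reports:
        # Exact string match: "demo1" and "Demo1" are treated as different hosts.
        merged[hostname].append(f"{source}:{service}")
    return dict(merged)


# Hypothetical reports from a Nagios-configured host and a Cloud Hub discovered host.
reports = [
    ("nagios", "demo1", "local_cpu"),
    ("cloudhub", "demo1", "hdfs.capacity"),
    ("cloudhub", "Demo1", "hdfs.capacity"),
]
merged = merge_hosts(reports)
```

Here both `demo1` entries end up under a single host with two services, while `Demo1` stays separate because the names differ in case.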
Next you will need to fill in the Cloudera Connector parameters and test the connection.
- Enter a Display Name.
- Enter the Cloudera Server.
- Enter a Username.
- Enter a Password.
- Optionally check Prefix Service Name with Cluster; refer to the table below for a description of this directive.
- Optionally set the Interval, Timeout, and Retry directives, and verify the Port entry.
- Validate the connection by clicking Test. A dialog displays either a Success message or, if the Cloudera server cannot be contacted, an error message with a hint as to why the connection failed. When a successful connection is made, the Connection Status buttons change to green.
- Click Save in the upper right corner to save your correct connection parameters.
- Next, discover and select which resources to monitor. On the right side of the screen are the Service Views, which are optional features of Cloudera. These features are the core components, or services, that are managed by Cloudera. Services include Cluster, Host, HBase, HDFS, Hive, Hue, Impala, KS_Indexer, Oozie, Solr, Spark on Yarn, Zookeeper, Yarn, and Kafka. Each of these services has its own rich set of metrics. By default, all services are selected, and a checked service is monitored. Cloudera also provides Cluster and Host metrics, which can optionally be collected. If there are one or more clusters or hosts in the system, they are automatically detected and collected. If you were collecting metrics for a service and then uncheck that Cloudera service, the existing hosts and metrics stored in the GroundWork server will be deleted.
- De-select any resource you do not want to monitor.
- Click Save when finished.
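The connector parameters entered in the steps above (server, username, password, port, timeout) map onto a single authenticated HTTP request, which can be handy for checking credentials outside of Cloud Hub. The sketch below is an assumption, not a description of what the Test button does: it calls the Cloudera Manager `/api/version` endpoint with HTTP basic authentication. The function names and the `cm.example.com` host are hypothetical.

```python
import base64
import urllib.request


def build_version_request(server, username, password, port=7180, use_ssl=False):
    """Build a basic-auth request for the Cloudera Manager API version endpoint."""
    scheme = "https" if use_ssl else "http"
    url = f"{scheme}://{server}:{port}/api/version"
    credentials = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {credentials}")
    return request


def check_connection(server, username, password, port=7180, timeout_ms=5000):
    """Return (ok, detail): the API version on success, or an error hint on failure."""
    request = build_version_request(server, username, password, port)
    try:
        # urlopen expects seconds; the form's Timeout field is in milliseconds.
        with urllib.request.urlopen(request, timeout=timeout_ms / 1000) as response:
            return True, response.read().decode("utf-8").strip()
    except OSError as exc:  # URLError, HTTPError, and socket timeouts are OSError subclasses
        return False, str(exc)
```

A call such as `check_connection("cm.example.com", "admin", "secret")` returns a success flag plus either the reported API version or the error text, which is roughly the information the Test dialog conveys.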
After the credentials have been validated and the resources indicated, select the Metrics link (top navigation) to start customizing metrics for the connection. Please refer to the article How to determine Cloud Hub metrics to be monitored.
Table: Cloudera server values
- Display Name: This is the configuration’s name displayed in the list of Cloud Hub connectors on the Cloud Hub home page.
- Cloudera Server: The host name or IP address where a Cloudera server is running. A port number should not be entered here.
- Username: The provisioned Username granted API access on the Cloudera server.
- Password: The corresponding Password for the given Username on the Cloudera server.
- Prefix Service Names with Cluster?: A Cluster is a logical entity that contains a set of hosts and the service instances running on the hosts. If this directive is checked, the name of the service will be prefixed with the name of the cluster in the various GroundWork monitoring visualizations.
  This option is useful when you are running two or more Cloudera clusters. Without this option checked, Cloudera services are stored as hosts with a host name directly corresponding to the Cloudera service name. For example, the HDFS service is stored in GroundWork as a host named hdfs. If this box is checked, the hostname will be prefixed by the name of the cluster it is running under. Given a cluster named cluster1, the hostname for HDFS will be stored as cluster1-hdfs. Similarly, the SOLR service will be stored as the hostname cluster1-solr. This option is disabled by default. If you find Cloudera services are not being mapped to unique GroundWork hostnames, you can use this feature even with a single-cluster Cloudera deployment.
- Interval (min): This is the metric gathering interval for collecting monitoring data from Cloudera and sending it to the GroundWork server. The value is in minutes.
- Timeout (ms): The connection timeout in milliseconds. Normally the default value of 5000 is sufficient. If you have a slow network connection, you may want to increase the default value.
- Infinite Retries: Check this box if you want Cloud Hub to retry the connection to Cloudera indefinitely when the connection fails. When this box is checked, the Retry Limit field is disabled. When this box is unchecked, the Retry Limit field is enabled.
- Retry Limit: This entry is the number of retries for the connection and sets a limit on how many attempts are made after a failure. The number set indicates how many connections are attempted before the connection is left in an inactive state. At this point, the connection is suspended and you will need to manually restart it. When a retry limit is exhausted, all hosts managed by this connection are set to the monitor status Unreachable and all services for the matched hosts are set to the status of Unknown.
- Port Number: The optional port number for the Cloudera server API. Default is 7180.
- Connection Status: Click Test to verify a connection using the Cloudera connector entries.
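Two of the directives above, the cluster prefix and the retry limit, describe simple rules that can be sketched in a few lines. Both functions below are hypothetical illustrations rather than Cloud Hub code. The naming function covers only the examples given in the table (hdfs, cluster1-hdfs, cluster1-solr). The retry function assumes that Retry Limit counts retries after the first failed attempt, which is one reading of the description.

```python
from typing import Callable, Optional


def groundwork_hostname(service: str, cluster: Optional[str],
                        prefix_with_cluster: bool = False) -> str:
    """Derive the GroundWork host name for a Cloudera service."""
    name = service.lower()  # e.g. the HDFS service is stored as "hdfs"
    if prefix_with_cluster and cluster:
        return f"{cluster}-{name}"  # e.g. "cluster1-hdfs"
    return name


def connect_with_retries(connect: Callable[[], bool], retry_limit: int,
                         infinite_retries: bool = False) -> str:
    """Return "connected" on success, or "suspended" once retries are exhausted."""
    failures = 0
    while True:
        if connect():
            return "connected"
        failures += 1
        # Assumption: the initial attempt is not counted against the limit.
        # With infinite_retries=True this loop only ends when connect() succeeds.
        if not infinite_retries and failures > retry_limit:
            return "suspended"
```

With the prefix enabled, `groundwork_hostname("HDFS", "cluster1", True)` yields `cluster1-hdfs`. A connection function that always fails with `retry_limit=3` is tried four times under this assumption before the sketch reports `suspended`, the point at which the table says hosts become Unreachable and services Unknown.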
Cloud Hub (Documentation)
How to determine Cloud Hub metrics to be monitored (Knowledge Base)
Cloud Hub troubleshooting (Knowledge Base)
How to configure Cloud Hub connectors (Knowledge Base)
Ownership options (Documentation)
Transit Connection Generator (TCG) (Documentation)