Managing and Monitoring a Cluster

SmartSense alerts

Descriptions, potential causes, and possible remedies for alerts triggered by SmartSense.

Table 1. SmartSense Alerts
| Alert | Description | Potential Causes | Possible Remedies |
|---|---|---|---|
| SmartSense Server Process | This alert is triggered if the HST server process cannot be confirmed to be up and listening on the network within the configured critical threshold, given in seconds. | HST server is not running. | Start the HST server process. If startup fails, check hst-server.log. |
| SmartSense Bundle Capture Failure | This alert is triggered if the most recently triggered SmartSense bundle failed or timed out. | Some nodes timed out or failed during data capture, or the upload to Hortonworks failed. | On the Bundles page, check the status of the bundle. Then check which agents failed or timed out, and review their logs. You can also initiate a new capture. |
| SmartSense Long Running Bundle | This alert is triggered if the in-progress SmartSense bundle may not complete successfully on time. | Service components being collected may not be running, or some agents may be timing out during data collection or upload. | Restart the services that are not running. Force-complete the bundle and start a new capture. |
| SmartSense Gateway Status | This alert is triggered if the SmartSense Gateway server process is enabled but cannot be reached. | SmartSense Gateway is not running. | Start the gateway. If the gateway fails to start, review hst-gateway.log. |
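
When troubleshooting the SmartSense Server Process alert, it can help to confirm by hand whether anything is listening on the HST server port. The following is a minimal sketch of such a check, not the logic Ambari itself uses. The function name `is_listening` is hypothetical, and the host, port, and timeout you pass in are assumptions that must be replaced with the values configured for your cluster.

```python
import socket


def is_listening(host: str, port: int, timeout_seconds: float) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    Hypothetical helper for manual troubleshooting; host, port, and timeout
    are supplied by the caller and are not read from any SmartSense config.
    """
    try:
        # create_connection raises OSError on refusal, unreachable host, or timeout.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        return False
```

For example, `is_listening("hst-host.example.com", 9000, 5.0)` would try the connection for up to five seconds. Both the host name and the port in that call are placeholders. A result of `False` only indicates that the connection could not be made from the machine running the check; hst-server.log remains the place to look for the cause.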