Monitor cluster health with ksck
The kudu CLI includes a tool called
      ksck that can be used for gathering
    information about the state of a Kudu cluster, including checking its health. ksck will identify issues such as under-replicated
    tablets, unreachable tablet servers, or tablets without a leader. 
      ksck should be run from the command line as the Kudu admin user, and requires
      the full list of master addresses to be specified: 
$ sudo -u kudu kudu cluster ksck master-01.example.com,master-02.example.com,master-03.example.com
     To see a full list of the options available with ksck, use the
        --help flag. If the cluster is healthy, ksck will print
      information about the cluster, a success message, and return a zero (success) exit status. 
Master Summary
               UUID               |       Address         | Status
----------------------------------+-----------------------+---------
 a811c07b99394df799e6650e7310f282 | master-01.example.com | HEALTHY
 b579355eeeea446e998606bcb7e87844 | master-02.example.com | HEALTHY
 cfdcc8592711485fad32ec4eea4fbfcd | master-02.example.com | HEALTHY
Tablet Server Summary
               UUID               |        Address         | Status
----------------------------------+------------------------+---------
 a598f75345834133a39c6e51163245db | tserver-01.example.com | HEALTHY
 e05ca6b6573b4e1f9a518157c0c0c637 | tserver-02.example.com | HEALTHY
 e7e53a91fe704296b3a59ad304e7444a | tserver-03.example.com | HEALTHY
Version Summary
 Version |      Servers
---------+-------------------------
  1.7.1  | all 6 server(s) checked
Summary by table
   Name   | RF | Status  | Total Tablets | Healthy | Recovering | Under-replicated | Unavailable
----------+----+---------+---------------+---------+------------+------------------+-------------
 my_table | 3  | HEALTHY | 8             | 8       | 0          | 0                | 0
                | Total Count
----------------+-------------
 Masters        | 3
 Tablet Servers | 3
 Tables         | 1
 Tablets        | 8
 Replicas       | 24
OK
     If the cluster is unhealthy, for instance if a tablet server process has stopped,
        ksck will report the issue(s) and return a non-zero exit status, as shown
      in the abbreviated snippet of ksck output below: 
Tablet Server Summary
               UUID               |        Address         |   Status
----------------------------------+------------------------+-------------
 a598f75345834133a39c6e51163245db | tserver-01.example.com | HEALTHY
 e05ca6b6573b4e1f9a518157c0c0c637 | tserver-02.example.com | HEALTHY
 e7e53a91fe704296b3a59ad304e7444a | tserver-03.example.com | UNAVAILABLE
Error from 127.0.0.1:7150: Network error: could not get status from server: Client connection negotiation failed: client connection to 127.0.0.1:7150: connect: Connection refused (error 61) (UNAVAILABLE)
... (full output elided)
------------------
Errors:
------------------
Network error: error fetching info from tablet servers: failed to gather info for all tablet servers: 1 of 3 had errors
Corruption: table consistency check error: 1 out of 1 table(s) are not healthy
FAILED
Runtime error: ksck discovered errors
     To verify data integrity, the optional --checksum_scan flag can be set,
      which will ensure the cluster has consistent data by scanning each tablet replica and
      comparing results. The --tables or --tablets flags can be
      used to limit the scope of the checksum scan to specific tables or tablets, respectively. 
 For example, checking data integrity on the my_table table can be done with
      the following command: 
$ sudo -u kudu kudu cluster ksck --checksum_scan --tables my_table master-01.example.com,master-02.example.com,master-03.example.com
     By default, ksck will attempt to use a snapshot scan of the table, so the
      checksum scan can be done while writes continue. 
 Finally, ksck also supports output in JSON format using the
        --ksck_format flag. JSON output contains the same information as the plain
      text output, but in a format that can be used by other tools. See kudu cluster ksck
        --help for more information. 
