Downloading HdfsFindTool from the CDH archives

If you have paywall credentials for the CDH archives and want to use HdfsFindTool with your CDP deployment, you can download the tool from the archives.

You must have the required paywall credentials to access the CDH archives.
The CDH5 search tarball in the CDH archives contains the jar that you need to run HdfsFindTool commands.
  1. Log in to the CDH archives with your paywall credentials and download the search tarball.
    $ wget --user=<your_paywall_username> --ask-password https://archive.cloudera.com/p/cdh5/cdh/5/search-1.0.0-cdh5.16.2.tar.gz

    Here, <your_paywall_username> is your paywall username. Enter your password when prompted.

  2. Extract the downloaded tarball file.
    $ tar -xvf search-1.0.0-cdh5.16.2.tar.gz
  3. Run HdfsFindTool.
    $ hadoop jar cloudera-search-1.0.0-cdh5.16.2/dist/search-mr-1.0.0-cdh5.16.2.jar org.apache.solr.hadoop.HdfsFindTool -find /tmp/ -type f
The following example uses HdfsFindTool to list all files under /tmp/, followed by sample output:
$ hadoop jar cloudera-search-1.0.0-cdh5.16.2/dist/search-mr-1.0.0-cdh5.16.2.jar org.apache.solr.hadoop.HdfsFindTool -find /tmp/ -type f
WARNING: Use "yarn jar" to launch YARN applications.
hdfs://ns1/tmp/logs/systest/logs/application_1575879043308_0001/file1
hdfs://ns1/tmp/my_table.avsc
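
Because the jar path and class name are long, it can help to derive them from a single version string and wrap the call in a small shell function. The following is a minimal sketch, not part of the tool itself: the variable names and the hdfs_find function are hypothetical, and the paths assume the tarball was extracted in the current directory as described in the steps above.

```shell
#!/usr/bin/env bash
# Illustrative names only; adjust CDH_VERSION if you downloaded another release.
CDH_VERSION="5.16.2"
SEARCH_RELEASE="search-1.0.0-cdh${CDH_VERSION}"

# Tarball name and archive URL, matching the wget step.
TARBALL="${SEARCH_RELEASE}.tar.gz"
TARBALL_URL="https://archive.cloudera.com/p/cdh5/cdh/5/${TARBALL}"

# Jar location inside the extracted directory, matching the hadoop jar step.
FINDTOOL_JAR="cloudera-${SEARCH_RELEASE}/dist/search-mr-1.0.0-cdh${CDH_VERSION}.jar"
FINDTOOL_CLASS="org.apache.solr.hadoop.HdfsFindTool"

# Hypothetical wrapper: passes its arguments through as the find expression.
hdfs_find() {
  hadoop jar "${FINDTOOL_JAR}" "${FINDTOOL_CLASS}" -find "$@"
}

# Print the derived values so they can be checked before anything is run.
echo "${TARBALL_URL}"
echo "${FINDTOOL_JAR}"
```

After sourcing this file on a host where the hadoop command is available, running hdfs_find /tmp/ -type f is equivalent to the command in step 3.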