If you have access to CDH credentials and want to use HDFSFindtool with your CDP
deployment, then you can download the tool from the archives.
You can download the CDH5 jar to run HDFSFindtool commands
from the CDH archives.
-
Log into the CDH archives with the help of your paywall credentials and
download the search tarball.
$ wget --user=<your_paywall_username> --ask-password https://archive.cloudera.com/p/cdh5/cdh/5/search-1.0.0-cdh5.16.2.tar.gz
Here, <your_paywall_username> indicates your Paywall
username. In addition, enter your password when prompted.
-
Extract the downloaded tarball file.
$ tar -xvf search-1.0.0-cdh5.16.2.tar.gz
-
Run HDFSFindtool.
$ hadoop jar cloudera-search-1.0.0-cdh5.16.2/dist/search-mr-1.0.0-cdh5.16.2.jar org.apache.solr.hadoop.HdfsFindTool -find /tmp/ -type f
The following example shows how you can use HDFSFindtool:
$ hadoop jar cloudera-search-1.0.0-cdh5.16.2/dist/search-mr-1.0.0-cdh5.16.2.jar org.apache.solr.hadoop.HdfsFindTool -find /tmp/ -type f
WARNING: Use "yarn jar" to launch YARN applications.
hdfs://ns1/tmp/logs/systest/logs/application_1575879043308_0001/file1
hdfs://ns1/tmp/my_table.avsc