Copy Sample Tweets to HDFS
- Copy the provided sample tweets to HDFS. These tweets will be used
to demonstrate the batch indexing capabilities of Cloudera Search:
- Security Enabled:
kinit hdfs@EXAMPLE.COM
hdfs dfs -mkdir -p /user/jdoe
hdfs dfs -chown jdoe:jdoe /user/jdoe
kinit jdoe@EXAMPLE.COM
hdfs dfs -mkdir -p /user/jdoe/indir
hdfs dfs -put /opt/cloudera/parcels/CDH/share/doc/search*/examples/test-documents/sample-statuses-*.avro /user/jdoe/indir/
hdfs dfs -ls /user/jdoe/indir
- Security Disabled:
sudo -u hdfs hdfs dfs -mkdir -p /user/jdoe
sudo -u hdfs hdfs dfs -chown jdoe:jdoe /user/jdoe
hdfs dfs -mkdir -p /user/jdoe/indir
hdfs dfs -put /opt/cloudera/parcels/CDH/share/doc/search*/examples/test-documents/sample-statuses-*.avro /user/jdoe/indir/
hdfs dfs -ls /user/jdoe/indir
- Ensure that outdir exists in HDFS and is empty:
hdfs dfs -rm -r -skipTrash /user/jdoe/outdir
hdfs dfs -mkdir /user/jdoe/outdir
hdfs dfs -ls /user/jdoe/outdir
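The output-directory reset above can be wrapped in a small helper so that it is safe to rerun. This is a minimal sketch, not part of Cloudera Search: the function names reset_outdir and check_indir are hypothetical, and the sketch assumes the hdfs client is on the PATH and, on a secured cluster, that a valid Kerberos ticket has already been obtained with kinit. It adds the -f option to -rm so that the command does not fail when the directory does not exist yet, and uses -mkdir -p for the same reason.

```shell
#!/usr/bin/env bash
# Sketch only: reset_outdir and check_indir are hypothetical helper names.

# Recreate an empty output directory in HDFS.
# -f keeps -rm from returning an error when the directory is missing,
# so the function can be run repeatedly.
reset_outdir() {
  local dir="$1"
  hdfs dfs -rm -r -f -skipTrash "$dir" &&
    hdfs dfs -mkdir -p "$dir"
}

# Succeed only if the given input directory exists in HDFS.
check_indir() {
  local dir="$1"
  hdfs dfs -test -d "$dir"
}
```

For example, running check_indir /user/jdoe/indir && reset_outdir /user/jdoe/outdir confirms that the input directory is present before clearing the output directory.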
The sample tweets are now in HDFS and ready to be indexed. Continue to the next section to index them.