Copy sample tweets to HDFS
Copy the provided sample tweets to HDFS. These tweets are used to demonstrate the batch indexing capabilities of Cloudera Search.
-
Copy the provided sample tweets to HDFS:
Security Enabled:
-
kinit [***hdfs@EXAMPLE.COM***]
-
hdfs dfs -mkdir -p /user/[***USER***]
-
hdfs dfs -chown [***USER***]:[***GROUP***] /user/[***USER***]
-
kinit [***USER@EXAMPLE.COM***]
-
hdfs dfs -mkdir -p /user/[***USER***]/indir
-
hdfs dfs -put /opt/cloudera/parcels/CDH/share/doc/search*/examples/test-documents/sample-statuses-*.avro /user/[***USER***]/indir/
-
hdfs dfs -ls /user/[***USER***]/indir
Security Disabled: Run the following commands as [***USER***]:sudo -u hdfs hdfs dfs -mkdir -p /user/[***USER***]
sudo -u hdfs hdfs dfs -chown [***USER***]:[***GROUP***] /user/[***USER***]
hdfs dfs -mkdir -p /user/[***USER***]/indir
hdfs dfs -put /opt/cloudera/parcels/CDH/share/doc/search*/examples/test-documents/sample-statuses-*.avro /user/[***USER***]/indir/
hdfs dfs -ls /user/[***USER***]/indir
-
- Ensure that
outdir
is empty and exists in HDFS:hdfs dfs -rm -r -skipTrash /user/[***USER***]/outdir
hdfs dfs -mkdir /user/[***USER***]/outdir
hdfs dfs -ls /user/[***USER***]/outdir
The sample tweets are now in HDFS and ready to be indexed. Continue to the next section to index the sample tweets.