Configuring Hive access for S3A

You must configure specific properties for client applications such as Hive to access the Ozone data store using S3A.

  • You must import the CA certificate of the Ozone S3 Gateway so that the S3A filesystem can access the gateway over TLS.
  • You must configure the following Hive properties using the Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml:
    fs.s3a.bucket.<bucketname>.access.key = <access key>
    fs.s3a.bucket.<bucketname>.secret.key = <secret key>
    fs.s3a.endpoint = <Ozone S3 Gateway endpoint URL>
    fs.s3a.bucket.probe = 0
    fs.s3a.change.detection.version.required = false
    fs.s3a.change.detection.mode = none
  • You must grant the required permissions in Ranger to the user running the queries. The following example grants a user all permissions; you can adjust the permissions based on your requirements.
    • In a Hadoop SQL resource-based policy, grant the user all permissions on the Database, table/udf, and URL resources.
    • In an Ozone policy, add the user to S3_VOLUME_POLICY.
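As a concrete illustration, the safety-valve properties above take the following form when rendered into core-site.xml. This is a sketch only: the bucket name s3hive, the host s3g.example.com, the port, and the key values are placeholders that you must replace with your own values.

```xml
<!-- Sketch of the core-site.xml entries. The bucket name "s3hive",
     the endpoint host/port, and the key values are placeholders. -->
<property>
  <name>fs.s3a.bucket.s3hive.access.key</name>
  <value>ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.bucket.s3hive.secret.key</name>
  <value>SECRET_KEY</value>
</property>
<property>
  <name>fs.s3a.endpoint</name>
  <value>https://s3g.example.com:9879</value>
</property>
<property>
  <name>fs.s3a.bucket.probe</name>
  <value>0</value>
</property>
<property>
  <name>fs.s3a.change.detection.version.required</name>
  <value>false</value>
</property>
<property>
  <name>fs.s3a.change.detection.mode</name>
  <value>none</value>
</property>
```

Setting the access key and secret key per bucket (rather than globally) lets the same client reach other S3-compatible stores with different credentials.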
The following procedure explains how to log on to the Hive shell, create a Hive table using S3A, add data to the table, and view the added data. You can perform the same steps from Hue by using the Hive or Beeline shell.
  1. Create an Ozone bucket.
    The following example shows how you can create a bucket named s3hive:
    ozone sh bucket create /s3v/s3hive
  2. Log on to the Hive shell and perform the following steps.
    1. Create a table on Ozone using S3A.
      jdbc:hive2://> create external table mytable1(key string, value int) location 's3a://s3hive/mytable1';
    2. Add data to the table.
      jdbc:hive2://> insert into mytable1 values("cldr",1);
      jdbc:hive2://> insert into mytable1 values("cldr-cdp",1);
    3. View the data added to the table.
      jdbc:hive2://> select * from mytable1;
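After the insert statements complete, you can confirm from the Ozone side that Hive wrote the data through S3A. The following is a sketch, assuming the s3hive bucket created in step 1; the Beeline JDBC URL is a placeholder for your own HiveServer2 host and port.

```shell
# List the keys Hive created under the table location;
# you should see objects under the mytable1/ prefix.
ozone sh key list /s3v/s3hive

# Equivalent of step 2.3 run non-interactively through Beeline.
# Replace the host and port with your HiveServer2 instance.
beeline -u "jdbc:hive2://hiveserver2.example.com:10000" \
  -e "select * from mytable1;"
```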