Creating tables by importing CSV files from ABFS
You can create tables in Hue by importing CSV files stored in ABFS. Hue automatically detects the schema and the column types, so you can create tables without writing a CREATE TABLE statement.
Only Hue Superusers can access ABFS and import files to create tables.
The maximum file size supported is three gigabytes.
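If you prepare the CSV file locally before uploading it, you can check its size against this limit first. The following is a minimal sketch; it assumes that "three gigabytes" means 3 GiB (3 * 1024**3 bytes), which the source does not specify, and the function names are hypothetical.

```python
import os

# Assumption: the 3 GB limit is interpreted as 3 GiB. Adjust if your
# deployment documents the limit in decimal gigabytes instead.
MAX_IMPORT_BYTES = 3 * 1024**3


def within_import_limit(size_bytes: int, limit: int = MAX_IMPORT_BYTES) -> bool:
    """Return True if a file of size_bytes does not exceed the limit."""
    return 0 <= size_bytes <= limit


def check_local_csv(path: str) -> bool:
    """Check a local file's size on disk against the import limit."""
    return within_import_limit(os.path.getsize(path))
```

For example, `check_local_csv("sales.csv")` returns False for a file larger than the assumed limit, in which case you would split the file before importing it.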
- Create a storage account from the Microsoft Azure portal. In the Azure portal, enable Data Lake Storage Gen2 so that the objects and files within your account can be organized into a hierarchy of directories and nested subdirectories, in the same way that the file system on your computer is organized.
- While registering an Azure environment in CDP Management Console, set the Storage Location Base in the Data Access section as follows:
abfs://storage-fs@[***AZURE-STORAGE-ACCOUNT-NAME***].dfs.core.windows.net
This location is used to read and store data.
- Enable the ABFS file browser in Hue.
- Sign in to Cloudera Data Warehouse.
- Go to the Virtual Warehouse from which you want to access the ABFS containers and click its edit icon.
- On the Virtual Warehouses detail page, go to the Hue tab and select hue-safety-valve from the drop-down menu.
- Add the following configuration for a Hive or Impala Virtual Warehouse in the space provided:
  For Hive Virtual Warehouse:
    [desktop]
    # Remove the file browser from the blacklisted apps.
    # Tweak the app_blacklist property to suit your app configuration.
    app_blacklist=oozie,search,hbase,security,pig,sqoop,spark,impala
    [azure]
    [[azure_accounts]]
    [[[default]]]
    client_id=[***AZURE-ACCOUNT-CLIENT-ID***]
    client_secret=[***AZURE-ACCOUNT-CLIENT-SECRET***]
    tenant_id=[***AZURE-ACCOUNT-TENANT-ID***]
    [[abfs_clusters]]
    [[[default]]]
    fs_defaultfs=abfs://[***CONTAINER-NAME***]@[***AZURE-STORAGE-ACCOUNT-NAME***].dfs.core.windows.net
    webhdfs_url=https://[***AZURE-STORAGE-ACCOUNT-NAME***].dfs.core.windows.net/
  For Impala Virtual Warehouse:
    [desktop]
    # Remove the file browser from the blacklisted apps.
    # Tweak the app_blacklist property to suit your app configuration.
    app_blacklist=spark,zookeeper,hive,hbase,search,oozie,jobsub,pig,sqoop,security
    [azure]
    [[azure_accounts]]
    [[[default]]]
    client_id=[***AZURE-ACCOUNT-CLIENT-ID***]
    client_secret=[***AZURE-ACCOUNT-CLIENT-SECRET***]
    tenant_id=[***AZURE-ACCOUNT-TENANT-ID***]
    [[abfs_clusters]]
    [[[default]]]
    fs_defaultfs=abfs://[***CONTAINER-NAME***]@[***AZURE-STORAGE-ACCOUNT-NAME***].dfs.core.windows.net
    webhdfs_url=https://[***AZURE-STORAGE-ACCOUNT-NAME***].dfs.core.windows.net/
Make sure that the container name and the Azure storage account name that you specify under the abfs_clusters section are the same as those you specified for the Storage Location Base while activating the Azure environment, so that Hive or Impala has permission to access the uploaded files.
- Click Apply in the upper right corner of the page.
After the Virtual Warehouse restarts, the ABFS File Browser icon appears in the left Assist panel of the Hue web UI.
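The requirement that fs_defaultfs matches the Storage Location Base can be checked mechanically before you apply the configuration. The following is a minimal sketch using only the Python standard library; it is not part of Hue or CDP, the helper names abfs_parts and same_location are hypothetical, and the account name in the usage example is made up.

```python
from typing import Tuple
from urllib.parse import urlsplit

_DFS_SUFFIX = ".dfs.core.windows.net"


def abfs_parts(uri: str) -> Tuple[str, str]:
    """Split abfs://CONTAINER@ACCOUNT.dfs.core.windows.net[/path]
    into (container, account)."""
    parts = urlsplit(uri)
    # abfss is the TLS variant of the abfs scheme; accept both.
    if parts.scheme not in ("abfs", "abfss"):
        raise ValueError(f"not an ABFS URI: {uri!r}")
    host = parts.hostname or ""
    if not parts.username or not host.endswith(_DFS_SUFFIX):
        raise ValueError(f"expected CONTAINER@ACCOUNT{_DFS_SUFFIX}: {uri!r}")
    return parts.username, host[: -len(_DFS_SUFFIX)]


def same_location(fs_defaultfs: str, storage_location_base: str) -> bool:
    """True if both URIs name the same container and storage account."""
    return abfs_parts(fs_defaultfs) == abfs_parts(storage_location_base)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    base = "abfs://storage-fs@exampleacct.dfs.core.windows.net"
    configured = "abfs://storage-fs@exampleacct.dfs.core.windows.net"
    print(same_location(configured, base))
```

A mismatch in either the container or the account name makes same_location return False, which is the situation in which Hive or Impala would lack permission to read the uploaded files.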