Data Access

Contents

1. What's New in Data Access for HDP 2.6
What's New in Apache Hive
What's New in Apache Tez
What's New in Apache HBase
What's New in Apache Phoenix
Druid
2. Data Warehousing with Apache Hive
Content Roadmap
Features Overview
Temporary Tables
Optimized Row Columnar (ORC) Format
SQL Optimization
SQL Compliance and ACID-Based Transactions
Streaming Data Ingestion
Query Vectorization
Beeline versus Hive CLI
Hive JDBC and ODBC Drivers
Moving Data into Apache Hive
Using an External Table
Using Sqoop
Incrementally Updating a Table
Queries on Data Stored in Remote Clusters
Query Capability on Remote Clusters
Configuring HiveServer2
Configuring HiveServer2 for Transactions (ACID Support)
Configuring HiveServer2 for LDAP and for LDAP over SSL
Securing Apache Hive
Authorization Using Apache Ranger Policies
SQL Standard-Based Authorization
Required Privileges for Hive Operations
Storage-Based Authorization
Configuring Storage-Based Authorization
Permissions for Apache Hive Operations
Row-Level Filtering and Column Masking
Troubleshooting
JIRAs
3. Enabling Efficient Execution with Apache Pig and Apache Tez
4. Managing Metadata Services with Apache HCatalog
HCatalog Community Information
WebHCat Community Information
Security for WebHCat
5. Persistent Read/Write Data Access with Apache HBase
Content Roadmap
Deploying Apache HBase
Installation and Setup
Cluster Capacity and Region Sizing
Enabling Multitenancy with Namespaces
Security Features Available in Technical Preview
Managing Apache HBase Clusters
Monitoring Apache HBase Clusters
Optimizing Apache HBase I/O
Importing Data into HBase with Bulk Load
Using Snapshots
Backing up and Restoring Apache HBase Datasets
Planning a Backup-and-Restore Strategy for Your Environment
Best Practices for Backup-and-Restore
Running the Backup-and-Restore Utility
Medium Object (MOB) Storage Support in Apache HBase
Enabling MOB Storage Support
Testing the MOB Storage Support Configuration
Tuning MOB Storage Cache Properties
HBase Quota Management
Setting Up Quotas
Throttle Quotas
Space Quotas
Quota Enforcement
Quota Violation Policies
Impact of Quota Violation Policy
Number-of-Tables Quotas
Number-of-Regions Quotas
HBase Best Practices
6. Orchestrating SQL and APIs with Apache Phoenix
Enabling Phoenix and Interdependent Components
Thin Client Connectivity with Phoenix Query Server
Securing Authentication on the Phoenix Query Server
Selecting and Obtaining a Client Driver
Creating and Using User-Defined Functions (UDFs) in Phoenix
Mapping Phoenix Schemas to HBase Namespaces
Enabling Namespace Mapping
Creating New Schemas and Tables with Namespace Mapping
Associating Tables of a Schema to a Namespace
Phoenix Repair Tool
Running the Phoenix Repair Tool
7. Real-Time Data Analytics with Druid
Content Roadmap
Architecture
Installing and Configuring Druid
Interdependencies for the Ambari-Assisted Druid Installation
Assigning Slave and Client Components
Configuring the Druid Installation
Security and Druid
Securing Druid Web UIs and Accessing Endpoints
High Availability in Druid Clusters
Configuring Druid Clusters for High Availability
Leveraging Druid to Accelerate Hive SQL Queries
How Druid Indexes Hive-Sourced Data
Transforming Hive Data to Druid Datasources
Performance-Related druid.* Properties