Hive Properties in CDH 4.6.0
gatewaydefaultgroup
Advanced
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Deploy Directory | The directory where the client configs will be deployed | /etc/hive | client_config_root_dir | true | |
Hive Client Advanced Configuration Snippet (Safety Valve) for hive-site.xml | For advanced use only, a string to be inserted into the client configuration for hive-site.xml. | hive_client_config_safety_valve | false | ||
Gateway Client Environment Advanced Configuration Snippet (Safety Valve) for hive-env.sh | For advanced use only, key-value pairs (one on each line) to be inserted into the client configuration for hive-env.sh | hive_client_env_safety_valve | false | ||
Client Java Configuration Options | These are Java command line arguments. Commonly, garbage collection flags or extra debugging flags would be passed here. | -Djava.net.preferIPv4Stack=true | hive_client_java_opts | false | |
Hive Metastore Connection Timeout | Timeout for requests to the Hive Metastore Server. Consider increasing this if you have tables with a lot of metadata and see timeout errors. Used by most Hive Metastore clients such as Hive CLI and HiveServer2, but not by Impala. Impala has a separately configured timeout. | hive.metastore.client.socket.timeout | 5 minute(s) | hive_metastore_timeout | false |
Gateway Logging Advanced Configuration Snippet (Safety Valve) | For advanced use only, a string to be inserted into log4j.properties for this role only. | log4j_safety_valve | false |
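
The hive-site.xml safety valves take a string of configuration XML. The following is a minimal sketch of building such a string, assuming the usual Hadoop `<property>` element layout. The helper name `safety_valve_property` and the value `600` are illustrative assumptions, not values from this reference; check the unit expected by hive.metastore.client.socket.timeout for your Hive version before using it.

```python
import xml.etree.ElementTree as ET


def safety_valve_property(name: str, value: str) -> str:
    """Return one <property> element as a string (hypothetical helper)."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(prop, encoding="unicode")


# Example value only; 600 is an assumption, not a recommended setting.
snippet = safety_valve_property("hive.metastore.client.socket.timeout", "600")
print(snippet)
```

The resulting string is what would be pasted into the safety valve field; ElementTree escapes special characters in the value.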
Logs
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Gateway Logging Threshold | The minimum log level for Gateway logs | INFO | log_threshold | false |
Monitoring
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false |
Other
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Alternatives Priority | The priority level that the client configuration will have in the Alternatives system on the hosts. Higher priority levels will cause Alternatives to prefer this configuration over any others. | 90 | client_config_priority | true |
hivemetastoreserverdefaultgroup
Advanced
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Hive Metastore Server Advanced Configuration Snippet (Safety Valve) for hive-site.xml | For advanced use only, a string to be inserted into hive-site.xml for this role only. | hive_metastore_config_safety_valve | false | ||
Hive Metastore Server Environment Advanced Configuration Snippet (Safety Valve) | For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. | hive_metastore_env_safety_valve | false | ||
Java Configuration Options for Hive Metastore Server | These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here. | -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled | hive_metastore_java_opts | false | |
Max Hive Metastore Server Threads | Maximum number of worker threads in the Hive Metastore Server's thread pool | hive.metastore.server.max.threads | 100000 | hive_metastore_max_threads | true |
Min Hive Metastore Server Threads | Minimum number of worker threads in the Hive Metastore Server's thread pool | hive.metastore.server.min.threads | 200 | hive_metastore_min_threads | true |
Hive Metastore Server Logging Advanced Configuration Snippet (Safety Valve) | For advanced use only, a string to be inserted into log4j.properties for this role only. | log4j_safety_valve | false | ||
Heap Dump Directory | Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. | oom_heap_dump_dir | /tmp | oom_heap_dump_dir | false |
Dump Heap When Out of Memory | When set, generates a heap dump file when java.lang.OutOfMemoryError is thrown. | false | oom_heap_dump_enabled | true | |
Kill When Out of Memory | When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. | true | oom_sigkill_enabled | true | |
Automatically Restart Process | When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. | false | process_auto_restart | true |
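
The Heap Dump Directory description asks for more free space than the role's maximum Java heap size. A minimal sketch of that check follows; the function names are hypothetical, and the 256 MiB figure is taken from the default heap size listed under Resource Management.

```python
import shutil

MIB = 1024 ** 2


def has_room_for_heap_dump(free_bytes: int, max_heap_bytes: int) -> bool:
    """True if free space is strictly greater than the maximum heap size."""
    return free_bytes > max_heap_bytes


def check_heap_dump_dir(path: str, max_heap_bytes: int = 256 * MIB) -> bool:
    """Check the filesystem that contains `path` (hypothetical helper)."""
    return has_room_for_heap_dump(shutil.disk_usage(path).free, max_heap_bytes)
```

For example, `check_heap_dump_dir("/tmp")` checks the default heap dump directory against the default heap size.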
Logs
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Hive Metastore Server Log Directory | Directory where Hive Metastore Server will place its log files. | /var/log/hive | hive_log_dir | false | |
Hive Metastore Server Logging Threshold | The minimum log level for Hive Metastore Server logs | INFO | log_threshold | false | |
Hive Metastore Server Maximum Log File Backups | The maximum number of rolled log files to keep for Hive Metastore Server logs. Typically used by log4j or logback. | 10 | max_log_backup_index | false | |
Hive Metastore Server Max Log Size | The maximum size, in megabytes, per log file for Hive Metastore Server logs. Typically used by log4j or logback. | 200 MiB | max_log_size | false |
Monitoring
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Enable Health Alerts for this Role | When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold | true | enable_alerts | false | |
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false | |
Heap Dump Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. | Warning: 10 GiB, Critical: 5 GiB | heap_dump_directory_free_space_absolute_thresholds | false | |
Heap Dump Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | heap_dump_directory_free_space_percentage_thresholds | false | |
File Descriptor Monitoring Thresholds | The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. | Warning: 50.0 %, Critical: 70.0 % | hivemetastore_fd_thresholds | false | |
Hive Metastore Server Host Health Test | When computing the overall Hive Metastore Server health, consider the host's health. | true | hivemetastore_host_health_enabled | false | |
Hive Metastore Server Process Health Test | Enables the health test that the Hive Metastore Server's process state is consistent with the role configuration | true | hivemetastore_scm_health_enabled | false | |
Log Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. | Warning: 10 GiB, Critical: 5 GiB | log_directory_free_space_absolute_thresholds | false | |
Log Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | log_directory_free_space_percentage_thresholds | false | |
Hive Metastore Canary Health Test | Enables the health test that checks that basic Hive Metastore operations succeed | true | metastore_canary_health_enabled | false | |
Process Swap Memory Thresholds | The health test thresholds on the swap memory usage of the process. | Warning: Any, Critical: Never | process_swap_memory_thresholds | false | |
Role Triggers | The configured triggers for this role. This is a JSON formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields: | [] | role_triggers | true | |
Unexpected Exits Thresholds | The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. | Warning: Never, Critical: Any | unexpected_exits_thresholds | false | |
Unexpected Exits Monitoring Period | The period to review when computing unexpected exits. | 5 minute(s) | unexpected_exits_window | false |
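
The free space thresholds above come in an absolute and a percentage form, and the percentage form is not used when an absolute setting is configured. The sketch below illustrates that precedence under stated assumptions: `None` stands for "Never", comparisons are strict, and the result labels are illustrative rather than Cloudera Manager's own health states.

```python
from typing import Optional

GIB = 1024 ** 3


def free_space_health(
    free_bytes: int,
    capacity_bytes: int,
    abs_warning: Optional[int] = 10 * GIB,
    abs_critical: Optional[int] = 5 * GIB,
    pct_warning: Optional[float] = None,
    pct_critical: Optional[float] = None,
) -> str:
    """Classify free space; None means the threshold is 'Never'."""
    if abs_warning is not None or abs_critical is not None:
        # Absolute thresholds are configured, so percentages are ignored.
        value, warning, critical = free_bytes, abs_warning, abs_critical
    else:
        value = 100.0 * free_bytes / capacity_bytes
        warning, critical = pct_warning, pct_critical
    if critical is not None and value < critical:
        return "critical"
    if warning is not None and value < warning:
        return "warning"
    return "good"
```

With the defaults shown in the table, 8 GiB of free space falls between the two absolute thresholds and is reported as a warning.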
Performance
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Maximum Process File Descriptors | If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. | rlimit_fds | false |
Ports and Addresses
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Hive Metastore Server Port | Port on which Hive Metastore Server will listen for connections. | hive.metastore.port | 9083 | hive_metastore_port | false |
Resource Management
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Java Heap Size of Hive Metastore Server in Bytes | Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. | 256 MiB | hive_metastore_java_heapsize | false | |
Cgroup CPU Shares | Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. | cpu.shares | 1024 | rm_cpu_shares | true |
Cgroup I/O Weight | Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. | blkio.weight | 500 | rm_io_weight | true |
Cgroup Memory Hard Limit | Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. | memory.limit_in_bytes | -1 MiB | rm_memory_hard_limit | true |
Cgroup Memory Soft Limit | Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. | memory.soft_limit_in_bytes | -1 MiB | rm_memory_soft_limit | true |
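
The Cgroup CPU Shares and Cgroup I/O Weight rows state allowed ranges (2 to 262144 and 100 to 1000). A small sketch of validating values against those ranges before applying them; the function name is hypothetical and the defaults are the ones listed in the table.

```python
from typing import List


def validate_cgroup_settings(cpu_shares: int = 1024, io_weight: int = 500) -> List[str]:
    """Return a list of range violations (empty if both values are valid)."""
    errors = []
    if not 2 <= cpu_shares <= 262144:
        errors.append(f"rm_cpu_shares={cpu_shares} is outside 2..262144")
    if not 100 <= io_weight <= 1000:
        errors.append(f"rm_io_weight={io_weight} is outside 100..1000")
    return errors
```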
Stacks Collection
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Stacks Collection Data Retention | The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. | stacks_collection_data_retention | 100 MiB | stacks_collection_data_retention | false |
Stacks Collection Directory | The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. | stacks_collection_directory | stacks_collection_directory | false | |
Stacks Collection Enabled | Whether or not periodic stacks collection is enabled. | stacks_collection_enabled | false | stacks_collection_enabled | true |
Stacks Collection Frequency | The frequency with which stacks are collected. | stacks_collection_frequency | 5.0 second(s) | stacks_collection_frequency | false |
Stacks Collection Method | The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. | stacks_collection_method | jstack | stacks_collection_method | false |
hiveserver2defaultgroup
Advanced
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml | For advanced use only, a string to be inserted into hive-site.xml for this role only. | hive_hs2_config_safety_valve | false | ||
HiveServer2 Environment Advanced Configuration Snippet (Safety Valve) | For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. | hive_hs2_env_safety_valve | false | ||
Hive Downloaded Resources Directory | Local directory where Hive stores jars downloaded for remote file systems (HDFS). If not specified, Hive uses a default location. | hive.downloaded.resources.dir | hiveserver2_downloaded_resources_dir | false | |
Enable Explain Logging | When enabled, HiveServer2 logs EXPLAIN EXTENDED output for every query at INFO log4j level. | hive.log.explain.output | true | hiveserver2_enable_explain_output | false |
Hive Local Scratch Directory | Local Directory where Hive stores jars and data when performing a MapJoin optimization. If not specified, Hive uses a default location. | hive.exec.local.scratchdir | hiveserver2_exec_local_scratchdir | false | |
Hive HDFS Scratch Directory | Directory in HDFS where Hive writes intermediate data between MapReduce jobs. If not specified, Hive uses a default location. | hive.exec.scratchdir | hiveserver2_exec_scratchdir | false | |
Java Configuration Options for HiveServer2 | These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here. | -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled | hiveserver2_java_opts | false | |
Max HiveServer2 Threads | Maximum number of worker threads in HiveServer2's thread pool | hive.server2.thrift.max.worker.threads | 100 | hiveserver2_max_threads | true |
Min HiveServer2 Threads | Minimum number of worker threads in HiveServer2's thread pool | hive.server2.thrift.min.worker.threads | 5 | hiveserver2_min_threads | true |
HiveServer2 Logging Advanced Configuration Snippet (Safety Valve) | For advanced use only, a string to be inserted into log4j.properties for this role only. | log4j_safety_valve | false | ||
Heap Dump Directory | Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. | oom_heap_dump_dir | /tmp | oom_heap_dump_dir | false |
Dump Heap When Out of Memory | When set, generates a heap dump file when java.lang.OutOfMemoryError is thrown. | false | oom_heap_dump_enabled | true | |
Kill When Out of Memory | When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. | true | oom_sigkill_enabled | true | |
Automatically Restart Process | When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. | false | process_auto_restart | true |
Logs
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
HiveServer2 Log Directory | Directory where HiveServer2 will place its log files. | /var/log/hive | hive_log_dir | false | |
HiveServer2 Logging Threshold | The minimum log level for HiveServer2 logs | INFO | log_threshold | false | |
HiveServer2 Maximum Log File Backups | The maximum number of rolled log files to keep for HiveServer2 logs. Typically used by log4j or logback. | 10 | max_log_backup_index | false | |
HiveServer2 Max Log Size | The maximum size, in megabytes, per log file for HiveServer2 logs. Typically used by log4j or logback. | 200 MiB | max_log_size | false |
Monitoring
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Enable Health Alerts for this Role | When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold | true | enable_alerts | false | |
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false | |
Heap Dump Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. | Warning: 10 GiB, Critical: 5 GiB | heap_dump_directory_free_space_absolute_thresholds | false | |
Heap Dump Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | heap_dump_directory_free_space_percentage_thresholds | false | |
Hive Local Scratch Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's Hive Local Scratch Directory. | Warning: 10 GiB, Critical: 5 GiB | hiveserver2_exec_local_scratch_directory_free_space_absolute_thresholds | false | |
Hive Local Scratch Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's Hive Local Scratch Directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Hive Local Scratch Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | hiveserver2_exec_local_scratch_directory_free_space_percentage_thresholds | false | |
File Descriptor Monitoring Thresholds | The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. | Warning: 50.0 %, Critical: 70.0 % | hiveserver2_fd_thresholds | false | |
HiveServer2 Host Health Test | When computing the overall HiveServer2 health, consider the host's health. | true | hiveserver2_host_health_enabled | false | |
HiveServer2 Process Health Test | Enables the health test that HiveServer2's process state is consistent with the role configuration | true | hiveserver2_scm_health_enabled | false | |
Log Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. | Warning: 10 GiB, Critical: 5 GiB | log_directory_free_space_absolute_thresholds | false | |
Log Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | log_directory_free_space_percentage_thresholds | false | |
Process Swap Memory Thresholds | The health test thresholds on the swap memory usage of the process. | Warning: Any, Critical: Never | process_swap_memory_thresholds | false | |
Role Triggers | The configured triggers for this role. This is a JSON formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields: | [] | role_triggers | true | |
Unexpected Exits Thresholds | The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. | Warning: Never, Critical: Any | unexpected_exits_thresholds | false | |
Unexpected Exits Monitoring Period | The period to review when computing unexpected exits. | 5 minute(s) | unexpected_exits_window | false |
Other
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
HiveServer2 Load Balancer | Address of the load balancer used for HiveServer2 roles, specified in host:port format. If port is not specified, the port used by HiveServer2 is used. Note: Changing this property regenerates Kerberos keytabs for all HiveServer2 roles. | hiverserver2_load_balancer | false | ||
HiveServer2 Enable Impersonation | HiveServer2 will impersonate the Beeline client user when talking to other services such as MapReduce and HDFS. | hive.server2.enable.impersonation | true | hiveserver2_enable_impersonation | false |
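
The HiveServer2 Load Balancer value is given in host:port format, and the HiveServer2 port is used when no port is specified. A minimal sketch of that fallback, assuming a plain hostname or IPv4 address (no IPv6 literals); the function name and the example hostname are hypothetical, and 10000 is the default HiveServer2 port listed under Ports and Addresses.

```python
from typing import Tuple


def parse_load_balancer(address: str, hs2_port: int = 10000) -> Tuple[str, int]:
    """Split 'host[:port]', falling back to the HiveServer2 port."""
    host, _, port = address.partition(":")
    return host, int(port) if port else hs2_port


print(parse_load_balancer("hs2-lb.example.com"))
print(parse_load_balancer("hs2-lb.example.com:10001"))
```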
Performance
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Hive Auto Convert Join Noconditional Size | If Hive auto convert join is on, and the sum of the sizes of n-1 of the tables/partitions in an n-way join is smaller than the specified size, the join is directly converted to a MapJoin (there is no conditional task). | hive.auto.convert.join.noconditionaltask.size | 20 MiB | hiveserver2_auto_convert_join_noconditionaltask_size | false |
Enable MapJoin Optimization | Enable optimization that converts common join into MapJoin based on input file size. | hive.auto.convert.join | true | hiveserver2_enable_mapjoin | false |
Hive Optimize Sorted Merge Bucket Join | Whether to try sorted merge bucket (SMB) join. | hive.optimize.bucketmapjoin.sortedmerge | false | hiveserver2_optimize_bucketmapjoin_sortedmerge | false |
Hive SMB Join Cache Rows | The number of rows with the same key value to be cached in memory per SMB-joined table. | hive.smbjoin.cache.rows | 10000 | hiveserver2_smbjoin_cache_rows | false |
Maximum Process File Descriptors | If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. | rlimit_fds | false |
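
The Hive Auto Convert Join Noconditional Size rule can be illustrated with a short sketch. It assumes the n-1 tables in question are all tables except the largest, since that choice gives the smallest possible sum; the function name and the sizes in the example are hypothetical, and this only models the size comparison, not Hive's actual planner.

```python
from typing import Sequence

MIB = 1024 ** 2


def converts_directly_to_mapjoin(
    table_sizes: Sequence[int],
    threshold: int = 20 * MIB,
    auto_convert_join: bool = True,
) -> bool:
    """True if the n-1 smallest inputs together are below the threshold."""
    if not auto_convert_join or len(table_sizes) < 2:
        return False
    # Sum of every input except the largest one.
    small_side = sum(table_sizes) - max(table_sizes)
    return small_side < threshold


# 5 MiB + 10 MiB = 15 MiB, which is below the 20 MiB default.
print(converts_directly_to_mapjoin([500 * MIB, 5 * MIB, 10 * MIB]))
```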
Ports and Addresses
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
HiveServer2 Port | Port on which HiveServer2 will listen for connections. | hive.server2.thrift.port | 10000 | hs2_thrift_address_port | false |
Resource Management
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Java Heap Size of HiveServer2 in Bytes | Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. | 256 MiB | hiveserver2_java_heapsize | false | |
Cgroup CPU Shares | Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. | cpu.shares | 1024 | rm_cpu_shares | true |
Cgroup I/O Weight | Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. | blkio.weight | 500 | rm_io_weight | true |
Cgroup Memory Hard Limit | Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. | memory.limit_in_bytes | -1 MiB | rm_memory_hard_limit | true |
Cgroup Memory Soft Limit | Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. | memory.soft_limit_in_bytes | -1 MiB | rm_memory_soft_limit | true |
Stacks Collection
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Stacks Collection Data Retention | The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. | stacks_collection_data_retention | 100 MiB | stacks_collection_data_retention | false |
Stacks Collection Directory | The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. | stacks_collection_directory | stacks_collection_directory | false | |
Stacks Collection Enabled | Whether or not periodic stacks collection is enabled. | stacks_collection_enabled | false | stacks_collection_enabled | true |
Stacks Collection Frequency | The frequency with which stacks are collected. | stacks_collection_frequency | 5.0 second(s) | stacks_collection_frequency | false |
Stacks Collection Method | The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. | stacks_collection_method | jstack | stacks_collection_method | false |
service_wide
Advanced
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Hive Auxiliary JARs Directory | Directory containing auxiliary JARs used by Hive. This should be a directory location and not a classpath containing one or more JARs. This directory must be created and managed manually on the Hive CLI or HiveServer2 host. The directory location is set in the environment as HIVE_AUX_JARS_PATH and will generally override the hive.aux.jars.path property set in XML files, even if hive.aux.jars.path is set in an advanced configuration snippet. | hive_aux_jars_path_dir | false | ||
Bypass Hive Metastore Server | Instead of talking to Hive Metastore Server for Metastore information, Hive clients will talk directly to the Metastore database. | false | hive_bypass_metastore_server | false | |
Hive Service Advanced Configuration Snippet (Safety Valve) for core-site.xml | For advanced use only, a string to be inserted into core-site.xml. Applies to configurations of all roles in this service except client configuration. | hive_core_site_safety_valve | false | ||
Server Name for Sentry Authorization | The server name used when defining privilege rules in Sentry authorization. Sentry uses this name as an alias for the Hive service. It does not correspond to any physical server name. | hive.sentry.server | server1 | hive_sentry_server | false |
Hive Service Advanced Configuration Snippet (Safety Valve) for sentry-site.xml | For advanced use only, a string to be inserted into sentry-site.xml. Applies to configurations of all roles in this service except client configuration. | hive_server2_sentry_safety_valve | false | ||
Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml | For advanced use only, a string to be inserted into hive-site.xml. Applies to configurations of all roles in this service except client configuration. | hive_service_config_safety_valve | false | ||
Hive Service Environment Advanced Configuration Snippet (Safety Valve) | For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of all roles in this service except client configuration. | hive_service_env_safety_valve | false | ||
Hive Client Advanced Configuration Snippet (Safety Valve) for navigator.client.properties | For advanced use only, a string to be inserted into the client configuration for navigator.client.properties. | navigator_client_config_safety_valve | false | ||
System Group | The group that this service's processes should run as. | hive | process_groupname | true | |
System User | The user that this service's processes should run as. | hive | process_username | true |
Cloudera Navigator
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Enable Audit Collection | Enable collection of audit events from the service's roles. | true | navigator_audit_enabled | false | |
Audit Event Filter | Event filters are defined in a JSON object like the following: { "defaultAction" : ("accept", "discard"), "rules" : [ { "action" : ("accept", "discard"), "fields" : [ { "name" : "fieldName", "match" : "regex" } ] } ] } A filter has a default action and a list of rules, in order of precedence. Each rule defines an action, and a list of fields to match against the audit event. A rule is "accepted" if all the listed field entries match the audit event. At that point, the action declared by the rule is taken. If no rules match the event, the default action is taken. Actions default to "accept" if not defined in the JSON object. The following is the list of fields that can be filtered for Hive events: | navigator.event.filter | { "comment" : [ "The default Hive audit event filter discards HDFS directory events ", "generated by Hive jobs that reference the /tmp directory." ], "defaultAction" : "accept", "rules" : [ { "action" : "discard", "fields" : [ { "name" : "operation", "match" : "QUERY" }, { "name" : "objectType", "match" : "DFS_DIR" }, { "name" : "resourcePath", "match" : "/tmp/hive-(?:.+)?/hive_(?:.+)?/-mr-.*" } ] } ] } | navigator_audit_event_filter | false |
Audit Queue Policy | Action to take when the audit event queue is full: drop the event or shut down the affected process. | navigator.batch.queue_policy | DROP | navigator_audit_queue_policy | false |
Audit Event Tracker | Configures the rules for event tracking and coalescing. This feature is used to define equivalency between different audit events. When events match, according to a set of configurable parameters, only one entry in the audit list is generated for all the matching events. Tracking works by keeping a reference to events when they first appear, and comparing other incoming events against the "tracked" events according to the rules defined here. Event trackers are defined in a JSON object like the following: { "timeToLive" : [integer], "fields" : [ { "type" : [string], "name" : [string] } ] } Where: | navigator_event_tracker | navigator_event_tracker | false |
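
The rule evaluation described for Audit Event Filter can be sketched in Python. This is an illustrative sketch only, not Navigator's implementation: the helper name `filter_action`, the dictionary shape of an event, and the use of full-string regex matching are assumptions. The filter is the default value from the table, written as a Python dict without its comment entry.

```python
import re

# Default Hive audit event filter from the table above (comment omitted).
DEFAULT_FILTER = {
    "defaultAction": "accept",
    "rules": [
        {
            "action": "discard",
            "fields": [
                {"name": "operation", "match": "QUERY"},
                {"name": "objectType", "match": "DFS_DIR"},
                {"name": "resourcePath",
                 "match": r"/tmp/hive-(?:.+)?/hive_(?:.+)?/-mr-.*"},
            ],
        }
    ],
}


def filter_action(event, event_filter):
    """Return "accept" or "discard" for an audit event (hypothetical helper).

    Rules are tried in order of precedence; the first rule whose fields all
    match decides the action. Full-string matching is an assumption.
    """
    for rule in event_filter.get("rules", []):
        fields = rule.get("fields", [])
        # A rule applies only if every listed field matches the event.
        if all(
            re.fullmatch(f["match"], str(event.get(f["name"], "")))
            for f in fields
        ):
            # Actions default to "accept" when not defined.
            return rule.get("action", "accept")
    # No rule matched: fall back to the default action.
    return event_filter.get("defaultAction", "accept")


if __name__ == "__main__":
    scratch_event = {
        "operation": "QUERY",
        "objectType": "DFS_DIR",
        "resourcePath": "/tmp/hive-alice/hive_2014-01-01_00-00-00/-mr-10000",
    }
    print(filter_action(scratch_event, DEFAULT_FILTER))
```

Under these assumptions, an HDFS directory event for a Hive scratch directory under /tmp is discarded, while any event that fails one of the three field matches falls through to the default action.
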
Hive Metastore Database
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Auto Create and Upgrade Hive Metastore Database Schema | Automatically create or upgrade tables in the Hive Metastore database when needed. Consider setting this to false and managing the schema manually. | datanucleus.autoCreateSchema | false | hive_metastore_database_auto_create_schema | false |
Hive Metastore Database DataNucleus Metadata Validation | Perform DataNucleus validation of metadata during startup. Note: when enabled, Hive will log DataNucleus warnings even though Hive will function normally. | datanucleus.metadata.validate | false | hive_metastore_database_datanucleus_metadata_validation | false |
Fixed Datastore | Disallow any implicit schema changes in the Hive Metastore database via DataNucleus. | datanucleus.fixedDatastore | true | hive_metastore_database_fixed_datastore | false |
Hive Metastore Database Host | Host name of Hive Metastore database | localhost | hive_metastore_database_host | false | |
Hive Metastore Database Name | Name of Hive Metastore database | metastore | hive_metastore_database_name | false | |
Hive Metastore Database Password | Password for Hive Metastore database | javax.jdo.option.ConnectionPassword | hive_metastore_database_password | false | |
Hive Metastore Database Port | Port number of Hive Metastore database | 3306 | hive_metastore_database_port | false | |
Hive Metastore Database Type | Type of Hive Metastore database. Note that Derby is not recommended and Cloudera Impala does not support Derby. | mysql | hive_metastore_database_type | false | |
Hive Metastore Database User | User for Hive Metastore database | javax.jdo.option.ConnectionUserName | hive | hive_metastore_database_user | false |
Hive Metastore Derby Path | Directory name where Hive Metastore's database is stored (only for Derby) | /var/lib/hive/cloudera_manager/derby/metastore_db | hive_metastore_derby_path | false |
Logs
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Audit Log Directory | Path to the directory where audit logs will be written. The directory will be created if it doesn't exist. | audit_event_log_dir | /var/log/hive/audit | audit_event_log_dir | false |
Maximum Audit Log File Size | Maximum size of audit log file in MB before it is rolled over. | navigator.audit_log_max_file_size | 100 MiB | navigator_audit_log_max_file_size | false |
Number of Audit Logs to Retain | Maximum number of rolled over audit logs to retain. The logs will not be deleted if they contain audit events that have not yet been propagated to Audit Server. | navigator.client.max_num_audit_log | 10 | navigator_client_max_num_audit_log | false |
Monitoring
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Enable Service Level Health Alerts | When set, Cloudera Manager will send alerts when the health of this service reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold | true | enable_alerts | false | |
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false | |
Healthy Hive Metastore Server Monitoring Thresholds | The health test thresholds of the overall Hive Metastore Server health. The check returns "Concerning" health if the percentage of "Healthy" Hive Metastore Servers falls below the warning threshold. The check is unhealthy if the total percentage of "Healthy" and "Concerning" Hive Metastore Servers falls below the critical threshold. | Warning: 99.0 %, Critical: 51.0 % | hive_hivemetastores_healthy_thresholds | false | |
Healthy HiveServer2 Monitoring Thresholds | The health test thresholds of the overall HiveServer2 health. The check returns "Concerning" health if the percentage of "Healthy" HiveServer2s falls below the warning threshold. The check is unhealthy if the total percentage of "Healthy" and "Concerning" HiveServer2s falls below the critical threshold. | Warning: 99.0 %, Critical: 51.0 % | hive_hiveserver2s_healthy_thresholds | false | |
Healthy WebHCat Server Monitoring Thresholds | The health test thresholds of the overall WebHCat Server health. The check returns "Concerning" health if the percentage of "Healthy" WebHCat Servers falls below the warning threshold. The check is unhealthy if the total percentage of "Healthy" and "Concerning" WebHCat Servers falls below the critical threshold. | Warning: 99.0 %, Critical: 51.0 % | hive_webhcats_healthy_thresholds | false | |
Service Triggers | The configured triggers for this service. This is a JSON formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields: | [] | service_triggers | true | |
Service Monitor Client Config Overrides | For advanced use only, a list of configuration properties that will be used by the Service Monitor instead of the current client configuration for the service. | <property><name>hive.metastore.client.socket.timeout</name><value>60</value></property> | smon_client_config_overrides | false | |
Service Monitor Derived Configs Advanced Configuration Snippet (Safety Valve) | For advanced use only, a list of derived configuration properties that will be used by the Service Monitor instead of the default ones. | smon_derived_configs_safety_valve | false |
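
The threshold logic described for the Healthy Hive Metastore Server, HiveServer2, and WebHCat Server checks can be sketched as follows. This is a minimal sketch, not Cloudera Manager's code: the function name, its arguments, and the returned state labels are assumptions chosen for illustration; the default thresholds are the 99.0 % and 51.0 % values from the table.

```python
def overall_role_health(healthy, concerning, total,
                        warning=99.0, critical=51.0):
    """Classify overall health from role counts (hypothetical helper).

    healthy / concerning: number of roles in each state.
    total: total number of roles of this type.
    warning / critical: thresholds in percent.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    healthy_pct = 100.0 * healthy / total
    healthy_or_concerning_pct = 100.0 * (healthy + concerning) / total
    # Check the more severe condition first.
    if healthy_or_concerning_pct < critical:
        return "Unhealthy"
    if healthy_pct < warning:
        return "Concerning"
    return "Healthy"


if __name__ == "__main__":
    # Four HiveServer2 roles, one of them in a concerning state.
    print(overall_role_health(healthy=3, concerning=1, total=4))
```

With the default thresholds, a single role that is not fully healthy is enough to bring the overall check to "Concerning", while the check becomes unhealthy only when fewer than 51 % of the roles are healthy or concerning.
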
Other
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Hive Bytes Per Reducer | Size per reducer. If the input size is 10GiB and this is set to 1GiB, Hive will use 10 reducers. | hive.exec.reducers.bytes.per.reducer | 64 MiB | hive_bytes_per_reducer | false |
Hive Max Reducers | Max number of reducers to use. If the configuration parameter Hive Reduce Tasks is negative, Hive will limit the number of reducers to the value of this parameter. | hive.exec.reducers.max | 1099 | hive_max_reducers | false |
Hive Reduce Tasks | Default number of reduce tasks per job. Usually set to a prime number close to the number of available hosts. Ignored when mapred.job.tracker is "local". Hadoop sets this to 1 by default, while Hive uses -1 as the default. When set to -1, Hive will automatically determine an appropriate number of reducers for each job. | mapred.reduce.tasks | -1 | hive_reduce_tasks | false |
Set User and Group Information | In unsecure mode, setting this property to true will cause the Metastore Server to execute DFS operations using the client's reported user and group permissions. Cloudera Manager will set this for all clients and servers. | hive.metastore.execute.setugi | true | hive_set_ugi | true |
Hive Warehouse Directory | Hive warehouse directory is the location in HDFS where Hive's tables are stored. Note that Hive's default value for its warehouse directory is '/user/hive/warehouse'. | hive.metastore.warehouse.dir | /user/hive/warehouse | hive_warehouse_directory | false |
Hive Warehouse Subdirectories Inherit Permissions | Let the table directories inherit the permission of the Warehouse or Database directory instead of being created with the permissions derived from dfs umask. This allows Impala to insert into tables created via Hive. | hive.warehouse.subdir.inherit.perms | true | hive_warehouse_subdir_inherit_perms | true |
MapReduce Service | MapReduce jobs are run against this service. | mapreduce_yarn_service | true | ||
ZooKeeper Service | Name of the ZooKeeper service that this Hive service instance depends on. | zookeeper_service | false |
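
The interaction between Hive Bytes Per Reducer, Hive Max Reducers, and Hive Reduce Tasks can be sketched as a small calculation. This is a simplified illustration of the behavior the descriptions state, not Hive's actual planner code: the function name `estimate_reducers` is hypothetical, and the floor of one reducer and the handling of a non-negative Hive Reduce Tasks value are assumptions.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def estimate_reducers(input_bytes, bytes_per_reducer=64 * MIB,
                      max_reducers=1099, reduce_tasks=-1):
    """Estimate the reducer count for a job (hypothetical helper).

    Defaults mirror the table: 64 MiB per reducer, at most 1099 reducers,
    and -1 for Hive Reduce Tasks (automatic).
    """
    # Assumption: a non-negative Hive Reduce Tasks value is used as-is.
    if reduce_tasks >= 0:
        return reduce_tasks
    # Ceiling division: one reducer per bytes_per_reducer of input.
    wanted = (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    # Assumption: at least one reducer; never more than Hive Max Reducers.
    return max(1, min(max_reducers, wanted))


if __name__ == "__main__":
    # The example from the description: 10 GiB of input at 1 GiB per reducer.
    print(estimate_reducers(10 * GIB, bytes_per_reducer=1 * GIB))
```

With the default of 64 MiB per reducer, the same 10 GiB of input would ask for 160 reducers, and 1 TiB of input would be capped at the Hive Max Reducers value of 1099.
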
Policy File Based Sentry
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Sentry User to Group Mapping Class | The class to use in Sentry authorization for user to group mapping. Sentry authorization may be configured to use either Hadoop user to group mapping or local groups defined in the policy file. Hadoop user to group mapping may be configured in the Cloudera Manager HDFS service configuration page under the Security section. | hive.sentry.provider | org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider | hive_sentry_provider | false |
Sentry Global Policy File | HDFS path to the global policy file for Sentry authorization. This should be a relative path (and not a full HDFS URL). The global policy file must be in Sentry policy file format. | hive.sentry.provider.resource | /user/hive/sentry/sentry-provider.ini | hive_sentry_provider_resource | false |
Allow URIs in Database Policy File | Allows URIs when defining privileges in per-database policy files. Warning: Typically, this configuration should be disabled. Enabling it allows the owner of a database policy file (who is generally not the Hive admin user) to grant load privileges on any directory that the Hive admin user can read, including databases controlled by other database policy files. | sentry.allow.uri.db.policyfile | false | sentry_allow_uri_db_policyfile | false |
Enable Sentry Authorization using Policy Files | Use Sentry to enable role-based, fine-grained authorization. This configuration enables Sentry using policy files. To enable Sentry using the Sentry service instead, add the Sentry service as a dependency to the Hive service. The Sentry service provides concurrent and secure access to authorization policy metadata and is the recommended option for enabling Sentry. Sentry is supported only on CDH 4.4 or later deployments. Before enabling Sentry, read the requirements and configuration steps in Setting Up Hive Authorization with Sentry. | hive.sentry.enabled | false | sentry_enabled | false |
Proxy
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Hive Metastore Access Control and Proxy User Groups Override | This configuration overrides the value set for the Hive Proxy User Groups configuration in the HDFS service for use by Hive Metastore Server. Specify a comma-delimited list of groups that you want to allow access to Hive Metastore metadata and allow the Hive user to impersonate. A value of '*' allows all groups. The default (empty) value inherits the value set for the Hive Proxy User Groups configuration in the HDFS service. | hadoop.proxyuser.hive.groups | hive_proxy_user_groups_list | false |
Security
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Enable TLS/SSL for HiveServer2 | Encrypt communication between clients and HiveServer2 using Transport Layer Security (TLS), formerly known as Secure Sockets Layer (SSL). | hive.server2.enable.SSL | false | hiveserver2_enable_ssl | false |
HiveServer2 TLS/SSL Server JKS Keystore File Password | The password for the HiveServer2 JKS keystore file. | hive.server2.keystore.password | hiveserver2_keystore_password | false | |
HiveServer2 TLS/SSL Server JKS Keystore File Location | The path to the TLS/SSL keystore file containing the server certificate and private key used for TLS/SSL. Used when HiveServer2 is acting as a TLS/SSL server. The keystore must be in JKS format. | hive.server2.keystore.path | hiveserver2_keystore_path | false | |
Kerberos Principal | Kerberos principal short name used by all roles of this service. | hive | kerberos_princ_name | true |
webhcatserverdefaultgroup
Advanced
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
WebHCat Server Advanced Configuration Snippet (Safety Valve) for webhcat-site.xml | For advanced use only, a string to be inserted into webhcat-site.xml for this role only. | hive_webhcat_config_safety_valve | false | ||
WebHCat Server Environment Advanced Configuration Snippet (Safety Valve) | For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. | hive_webhcat_env_safety_valve | false | ||
WebHCat Server Advanced Configuration Snippet (Safety Valve) for hive-site.xml | For advanced use only, a string to be inserted into hive-site.xml for this role only. | hive_webhcat_hive_config_safety_valve | false | ||
Java Configuration Options for WebHCat Server | These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here. | -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled | hive_webhcat_java_opts | false | |
WebHCat Server Logging Advanced Configuration Snippet (Safety Valve) | For advanced use only, a string to be inserted into log4j.properties for this role only. | log4j_safety_valve | false | ||
Heap Dump Directory | Path to the directory where heap dumps are generated when java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. | oom_heap_dump_dir | /tmp | oom_heap_dump_dir | false |
Dump Heap When Out of Memory | When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. | false | oom_heap_dump_enabled | true | |
Kill When Out of Memory | When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. | true | oom_sigkill_enabled | true | |
Automatically Restart Process | When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. | false | process_auto_restart | true |
Logs
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
WebHCat Server Log Directory | Directory where WebHCat Server will place its log files. | /var/log/hcatalog | hcatalog_log_dir | false | |
WebHCat Server Logging Threshold | The minimum log level for WebHCat Server logs | INFO | log_threshold | false | |
WebHCat Server Maximum Log File Backups | The maximum number of rolled log files to keep for WebHCat Server logs. Typically used by log4j or logback. | 10 | max_log_backup_index | false | |
WebHCat Server Max Log Size | The maximum size, in megabytes, per log file for WebHCat Server logs. Typically used by log4j or logback. | 200 MiB | max_log_size | false |
Monitoring
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Enable Health Alerts for this Role | When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold | true | enable_alerts | false | |
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false | |
Heap Dump Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. | Warning: 10 GiB, Critical: 5 GiB | heap_dump_directory_free_space_absolute_thresholds | false | |
Heap Dump Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | heap_dump_directory_free_space_percentage_thresholds | false | |
Log Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. | Warning: 10 GiB, Critical: 5 GiB | log_directory_free_space_absolute_thresholds | false | |
Log Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | log_directory_free_space_percentage_thresholds | false | |
Process Swap Memory Thresholds | The health test thresholds on the swap memory usage of the process. | Warning: Any, Critical: Never | process_swap_memory_thresholds | false | |
Role Triggers | The configured triggers for this role. This is a JSON formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields: | [] | role_triggers | true | |
Unexpected Exits Thresholds | The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. | Warning: Never, Critical: Any | unexpected_exits_thresholds | false | |
Unexpected Exits Monitoring Period | The period to review when computing unexpected exits. | 5 minute(s) | unexpected_exits_window | false | |
File Descriptor Monitoring Thresholds | The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. | Warning: 50.0 %, Critical: 70.0 % | webhcat_fd_thresholds | false | |
WebHCat Server Host Health Test | When computing the overall WebHCat Server health, consider the host's health. | true | webhcat_host_health_enabled | false | |
WebHCat Server Process Health Test | Enables the health test that the WebHCat Server's process state is consistent with the role configuration | true | webhcat_scm_health_enabled | false |
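
The relationship between the absolute and percentage free space thresholds for the heap dump and log directories can be sketched as follows. This is a hedged illustration, not Cloudera Manager's implementation: the function name, the state labels, the use of `None` for an unconfigured pair or a "Never" threshold, and the strict less-than comparison are all assumptions. What comes from the table is that the percentage thresholds are ignored when absolute thresholds are configured, and the defaults of 10 GiB (warning) and 5 GiB (critical).

```python
GIB = 1024 ** 3


def free_space_health(free_bytes, capacity_bytes,
                      absolute=(10 * GIB, 5 * GIB), percentage=None):
    """Classify free space on a filesystem (hypothetical helper).

    absolute / percentage are (warning, critical) pairs. Use None for a
    pair that is not configured, or for a threshold set to "Never".
    """
    if absolute is not None:
        # Absolute thresholds take priority; percentage is not used.
        warning, critical = absolute
        value = free_bytes
    elif percentage is not None:
        warning, critical = percentage
        value = 100.0 * free_bytes / capacity_bytes
    else:
        return "Healthy"
    if critical is not None and value < critical:
        return "Unhealthy"
    if warning is not None and value < warning:
        return "Concerning"
    return "Healthy"


if __name__ == "__main__":
    # 7 GiB free on a 100 GiB filesystem with the default absolute thresholds.
    print(free_space_health(7 * GIB, 100 * GIB))
```
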
Performance
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Maximum Process File Descriptors | If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. | rlimit_fds | false |
Ports and Addresses
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
WebHCat Server Port | Port on which WebHCat Server will listen for connections. | templeton.port | 50111 | hive_webhcat_address_port | false |
Resource Management
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Java Heap Size of WebHCat Server in Bytes | Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. | 256 MiB | hive_webhcat_java_heapsize | false | |
Cgroup CPU Shares | Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. | cpu.shares | 1024 | rm_cpu_shares | true |
Cgroup I/O Weight | Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. | blkio.weight | 500 | rm_io_weight | true |
Cgroup Memory Hard Limit | Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. | memory.limit_in_bytes | -1 MiB | rm_memory_hard_limit | true |
Cgroup Memory Soft Limit | Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. | memory.soft_limit_in_bytes | -1 MiB | rm_memory_soft_limit | true |
Stacks Collection
Display Name | Description | Related Name | Default Value | API Name | Required |
---|---|---|---|---|---|
Stacks Collection Data Retention | The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. | stacks_collection_data_retention | 100 MiB | stacks_collection_data_retention | false |
Stacks Collection Directory | The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. | stacks_collection_directory | stacks_collection_directory | false | |
Stacks Collection Enabled | Whether or not periodic stacks collection is enabled. | stacks_collection_enabled | false | stacks_collection_enabled | true |
Stacks Collection Frequency | The frequency with which stacks are collected. | stacks_collection_frequency | 5.0 second(s) | stacks_collection_frequency | false |
Stacks Collection Method | The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. | stacks_collection_method | jstack | stacks_collection_method | false |