Hive Properties in CDH 5.4.0

Role groups:

gatewaydefaultgroup
hivemetastoreserverdefaultgroup
hiveserver2defaultgroup
service_wide
webhcatserverdefaultgroup

gatewaydefaultgroup

Advanced

Display Name	Description	Related Name	Default Value	API Name	Required
Deploy Directory	The directory where the client configs will be deployed		/etc/hive	`client_config_root_dir`	true
Hive Client Advanced Configuration Snippet (Safety Valve) for hive-site.xml	For advanced use only, a string to be inserted into the client configuration for hive-site.xml.			`hive_client_config_safety_valve`	false
Gateway Client Environment Advanced Configuration Snippet (Safety Valve) for hive-env.sh	For advanced use only, key-value pairs (one on each line) to be inserted into the client configuration for hive-env.sh			`hive_client_env_safety_valve`	false
Client Java Configuration Options	These are Java command line arguments. Commonly, garbage collection flags or extra debugging flags would be passed here.		-Djava.net.preferIPv4Stack=true	`hive_client_java_opts`	false
Hive Metastore Connection Timeout	Timeout for requests to the Hive Metastore Server. Consider increasing this if you have tables with a lot of metadata and see timeout errors. Used by most Hive Metastore clients such as Hive CLI and HiveServer2, but not by Impala. Impala has a separately configured timeout.	`hive.metastore.client.socket.timeout`	5 minute(s)	`hive_metastore_timeout`	false
Gateway Logging Advanced Configuration Snippet (Safety Valve)	For advanced use only, a string to be inserted into log4j.properties for this role only.			`log4j_safety_valve`	false
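
The hive-site.xml safety valve takes a string that is inserted into the generated file. A minimal sketch of producing such a string, assuming the snippet uses the Hadoop-style `<property>` element layout; the helper name `render_property` is hypothetical, and the value `600` (including its unit) is illustrative only and should be checked against the Hive documentation for your version:

```python
from xml.sax.saxutils import escape


def render_property(name: str, value: str) -> str:
    """Render one Hadoop-style <property> element (hypothetical helper)."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(value)}</value>\n"
        "</property>"
    )


# Illustrative value only; the property name comes from the table above.
snippet = render_property("hive.metastore.client.socket.timeout", "600")
print(snippet)
```

Escaping the name and value keeps characters such as `&` or `<` from producing malformed XML in the deployed file.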

Logs

Display Name	Description	Related Name	Default Value	API Name	Required
Gateway Logging Threshold	The minimum log level for Gateway logs		INFO	`log_threshold`	false

Monitoring

Display Name	Description	Related Name	Default Value	API Name	Required
Enable Configuration Change Alerts	When set, Cloudera Manager will send alerts when this entity's configuration changes.		false	`enable_config_alerts`	false

Other

Display Name	Description	Related Name	Default Value	API Name	Required
Alternatives Priority	The priority level that the client configuration will have in the Alternatives system on the hosts. Higher priority levels will cause Alternatives to prefer this configuration over any others.		90	`client_config_priority`	true

Resource Management

Display Name	Description	Related Name	Default Value	API Name	Required
Client Java Heap Size in Bytes	Maximum size in bytes for the Java process heap memory. Passed to Java -Xmx.		1 GiB	`hive_client_java_heapsize`	false
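
The API Name column is the key used when a property is set programmatically. A minimal sketch of building an update body, assuming the Cloudera Manager REST API accepts a JSON object with an `items` list of name/value pairs; the helper name `config_update_payload` is hypothetical, and the endpoint path and API version are deliberately left out and should be taken from the API documentation for your release:

```python
import json


def config_update_payload(settings: dict) -> str:
    """Build a JSON body of name/value items (assumed request shape)."""
    items = [{"name": name, "value": str(value)} for name, value in settings.items()]
    return json.dumps({"items": items})


# 1 GiB expressed in bytes, matching the default shown in the table above.
ONE_GIB = 1024 ** 3
payload = config_update_payload({"hive_client_java_heapsize": ONE_GIB})
print(payload)
```

Values are sent as strings in this sketch, so sizes shown in the tables as GiB or MiB are converted to bytes first.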

hivemetastoreserverdefaultgroup

Advanced

Display Name	Description	Related Name	Default Value	API Name	Required
Hive Metastore Server Advanced Configuration Snippet (Safety Valve) for hive-site.xml	For advanced use only, a string to be inserted into hive-site.xml for this role only.			`hive_metastore_config_safety_valve`	false
Hive Metastore Delegation Token Store	The delegation token store implementation class. Use DBTokenStore for Highly Available Metastore Configuration.	`hive.cluster.delegation.token.store.class`	org.apache.hadoop.hive.thrift.MemoryTokenStore	`hive_metastore_delegation_token_store`	false
Hive Metastore Server Environment Advanced Configuration Snippet (Safety Valve)	For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration.			`hive_metastore_env_safety_valve`	false
Java Configuration Options for Hive Metastore Server	These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here.		-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled	`hive_metastore_java_opts`	false
Max Hive Metastore Server Threads	Maximum number of worker threads in the Hive Metastore Server's thread pool	`hive.metastore.server.max.threads`	100000	`hive_metastore_max_threads`	true
Min Hive Metastore Server Threads	Minimum number of worker threads in the Hive Metastore Server's thread pool	`hive.metastore.server.min.threads`	200	`hive_metastore_min_threads`	true
Hive Metastore Server Logging Advanced Configuration Snippet (Safety Valve)	For advanced use only, a string to be inserted into log4j.properties for this role only.			`log4j_safety_valve`	false
Heap Dump Directory	Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java process heap size configured for this role.	`oom_heap_dump_dir`	/tmp	`oom_heap_dump_dir`	false
Dump Heap When Out of Memory	When set, generates heap dump file when java.lang.OutOfMemoryError is thrown.		false	`oom_heap_dump_enabled`	true
Kill When Out of Memory	When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown.		true	`oom_sigkill_enabled`	true
Automatically Restart Process	When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure.		false	`process_auto_restart`	true

Logs

Display Name	Description	Default Value	API Name	Required
Hive Metastore Server Log Directory	Directory where Hive Metastore Server will place its log files.	/var/log/hive	`hive_log_dir`	false
Hive Metastore Server Logging Threshold	The minimum log level for Hive Metastore Server logs	INFO	`log_threshold`	false
Hive Metastore Server Maximum Log File Backups	The maximum number of rolled log files to keep for Hive Metastore Server logs. Typically used by log4j or logback.	10	`max_log_backup_index`	false
Hive Metastore Server Max Log Size	The maximum size, in megabytes, per log file for Hive Metastore Server logs. Typically used by log4j or logback.	200 MiB	`max_log_size`	false
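
The two rolling settings above together bound how much disk the role's logs can occupy. A rough sketch of that arithmetic, assuming the usual rolling-file behavior of one active log file plus the configured number of rolled backups; the function name is hypothetical:

```python
def max_log_footprint_mib(max_log_size_mib: int, max_log_backup_index: int) -> int:
    """Upper bound on log disk usage: the active file plus all rolled backups."""
    return max_log_size_mib * (max_log_backup_index + 1)


# Defaults from the table: 200 MiB per file, 10 backups.
print(max_log_footprint_mib(200, 10))  # 2200
```

With the defaults, the bound is 2200 MiB, which is worth comparing against the free space available under the log directory.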

Monitoring

Display Name	Description	Default Value	API Name	Required
Enable Health Alerts for this Role	When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold	true	`enable_alerts`	false
Enable Configuration Change Alerts	When set, Cloudera Manager will send alerts when this entity's configuration changes.	false	`enable_config_alerts`	false
Heap Dump Directory Free Space Monitoring Absolute Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory.	Warning: 10 GiB, Critical: 5 GiB	`heap_dump_directory_free_space_absolute_thresholds`	false
Heap Dump Directory Free Space Monitoring Percentage Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.	Warning: Never, Critical: Never	`heap_dump_directory_free_space_percentage_thresholds`	false
File Descriptor Monitoring Thresholds	The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit.	Warning: 50.0 %, Critical: 70.0 %	`hivemetastore_fd_thresholds`	false
Hive Metastore Server Host Health Test	When computing the overall Hive Metastore Server health, consider the host's health.	true	`hivemetastore_host_health_enabled`	false
Hive Metastore Server Process Health Test	Enables the health test that the Hive Metastore Server's process state is consistent with the role configuration	true	`hivemetastore_scm_health_enabled`	false
Log Directory Free Space Monitoring Absolute Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory.	Warning: 10 GiB, Critical: 5 GiB	`log_directory_free_space_absolute_thresholds`	false
Log Directory Free Space Monitoring Percentage Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured.	Warning: Never, Critical: Never	`log_directory_free_space_percentage_thresholds`	false
Hive Metastore Canary Health Test	Enables the health test that checks that basic Hive Metastore operations succeed	true	`metastore_canary_health_enabled`	false
Process Swap Memory Thresholds	The health test thresholds on the swap memory usage of the process.	Warning: Any, Critical: Never	`process_swap_memory_thresholds`	false
Role Triggers	The configured triggers for this role. This is a JSON formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields: `triggerName` (mandatory) - The name of the trigger. This value must be unique for the specific role. `triggerExpression` (mandatory) - A tsquery expression representing the trigger. `streamThreshold` (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire. `enabled` (optional) - By default set to 'true'. If set to 'false', the trigger will not be evaluated. `expressionEditorConfig` (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here may lead to inconsistencies. For example, the following JSON formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened: `[{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}]` See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change in the future; as a result, backward compatibility is not guaranteed between releases at this time.	[]	`role_triggers`	true
Unexpected Exits Thresholds	The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role.	Warning: Never, Critical: Any	`unexpected_exits_thresholds`	false
Unexpected Exits Monitoring Period	The period to review when computing unexpected exits.	5 minute(s)	`unexpected_exits_window`	false
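
The Role Triggers value is a JSON list, so it can be assembled and sanity-checked before it is pasted into the field. A minimal sketch that reuses the sample trigger from the description; the helper `build_role_triggers` is hypothetical and only checks the two mandatory fields and name uniqueness, not the tsquery expression itself:

```python
import json

MANDATORY_FIELDS = ("triggerName", "triggerExpression")


def build_role_triggers(triggers):
    """Check mandatory fields and name uniqueness, then serialize to JSON."""
    seen_names = set()
    for trigger in triggers:
        for field in MANDATORY_FIELDS:
            if not trigger.get(field):
                raise ValueError(f"trigger is missing mandatory field {field!r}")
        name = trigger["triggerName"]
        if name in seen_names:
            raise ValueError(f"duplicate triggerName {name!r}")
        seen_names.add(name)
    return json.dumps(triggers)


# The sample trigger quoted in the Role Triggers description.
sample_trigger = {
    "triggerName": "sample-trigger",
    "triggerExpression": (
        "IF (SELECT fd_open WHERE roleName=$ROLENAME "
        "and last(fd_open) > 1500) DO health:bad"
    ),
    "streamThreshold": 0,
    "enabled": "true",
}

role_triggers_value = build_role_triggers([sample_trigger])
print(role_triggers_value)
```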

Performance

Display Name	Description	Related Name	Default Value	API Name	Required
Maximum Process File Descriptors	If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value.			`rlimit_fds`	false

Ports and Addresses

Display Name	Description	Related Name	Default Value	API Name	Required
Hive Metastore Server Port	Port on which Hive Metastore Server will listen for connections.	`hive.metastore.port`	9083	`hive_metastore_port`	false

Resource Management

Display Name	Description	Related Name	Default Value	API Name	Required
Java Heap Size of Hive Metastore Server in Bytes	Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx.		1 GiB	`hive_metastore_java_heapsize`	false
Cgroup CPU Shares	Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager.	`cpu.shares`	1024	`rm_cpu_shares`	true
Cgroup I/O Weight	Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager.	`blkio.weight`	500	`rm_io_weight`	true
Cgroup Memory Hard Limit	Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit.	`memory.limit_in_bytes`	-1 MiB	`rm_memory_hard_limit`	true
Cgroup Memory Soft Limit	Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit.	`memory.soft_limit_in_bytes`	-1 MiB	`rm_memory_soft_limit`	true
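
CPU shares are relative weights rather than absolute limits. A simplified sketch of how they translate into a fraction of CPU under contention, assuming every listed role is CPU-bound and competes at the same cgroup level; the role names and the `cpu_fractions` helper are hypothetical:

```python
CPU_SHARES_MIN, CPU_SHARES_MAX = 2, 262144  # range stated in the table above


def cpu_fractions(shares_by_role):
    """Return each role's fraction of CPU when all roles are contending."""
    for role, shares in shares_by_role.items():
        if not CPU_SHARES_MIN <= shares <= CPU_SHARES_MAX:
            raise ValueError(f"{role}: cpu.shares {shares} is out of range")
    total = sum(shares_by_role.values())
    return {role: shares / total for role, shares in shares_by_role.items()}


# A role at the default 1024 shares next to a role given 3072 shares.
fractions = cpu_fractions({"hive_metastore": 1024, "busy_neighbor": 3072})
print(fractions["hive_metastore"])  # 0.25
```

When the host is not under CPU contention, shares have no effect and a role can use whatever CPU is idle.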

Stacks Collection

Display Name	Description	Related Name	Default Value	API Name	Required
Stacks Collection Data Retention	The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted.	`stacks_collection_data_retention`	100 MiB	`stacks_collection_data_retention`	false
Stacks Collection Directory	The directory in which stacks logs are placed. If not set, stacks are logged into a `stacks` subdirectory of the role's log directory.	`stacks_collection_directory`		`stacks_collection_directory`	false
Stacks Collection Enabled	Whether or not periodic stacks collection is enabled.	`stacks_collection_enabled`	false	`stacks_collection_enabled`	true
Stacks Collection Frequency	The frequency with which stacks are collected.	`stacks_collection_frequency`	5.0 second(s)	`stacks_collection_frequency`	false
Stacks Collection Method	The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stacks traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped.	`stacks_collection_method`	jstack	`stacks_collection_method`	false

hiveserver2defaultgroup

Advanced

Display Name	Description	Related Name	Default Value	API Name	Required
HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml	For advanced use only, a string to be inserted into hive-site.xml for this role only.			`hive_hs2_config_safety_valve`	false
HiveServer2 Environment Advanced Configuration Snippet (Safety Valve)	For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration.			`hive_hs2_env_safety_valve`	false
Hive Downloaded Resources Directory	Local directory where Hive stores jars downloaded for remote file systems (HDFS). If not specified, Hive uses a default location.	`hive.downloaded.resources.dir`		`hiveserver2_downloaded_resources_dir`	false
Enable Explain Logging	When enabled, HiveServer2 logs EXPLAIN EXTENDED output for every query at INFO log4j level.	`hive.log.explain.output`	true	`hiveserver2_enable_explain_output`	false
Hive Local Scratch Directory	Local Directory where Hive stores jars and data when performing a MapJoin optimization. If not specified, Hive uses a default location.	`hive.exec.local.scratchdir`		`hiveserver2_exec_local_scratchdir`	false
Hive HDFS Scratch Directory	Directory in HDFS where Hive writes intermediate data between MapReduce jobs. If not specified, Hive uses a default location.	`hive.exec.scratchdir`		`hiveserver2_exec_scratchdir`	false
Idle Operation Timeout	Operation will be closed when not accessed for this duration of time, in milliseconds; disable by setting to zero. For a positive value, checked for operations in terminal state only (FINISHED, CANCELED, CLOSED, ERROR). For a negative value, checked for all of the operations regardless of state.	`hive.server2.idle.operation.timeout`	0 second(s)	`hiveserver2_idle_operation_timeout`	false
Idle Session Timeout	Session will be closed when not accessed for this duration of time, in milliseconds; disable by setting to zero or a negative value.	`hive.server2.idle.session.timeout`	0 second(s)	`hiveserver2_idle_session_timeout`	false
Java Configuration Options for HiveServer2	These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here.		-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled	`hiveserver2_java_opts`	false
Max HiveServer2 Threads	Maximum number of worker threads in HiveServer2's thread pool	`hive.server2.thrift.max.worker.threads`	100	`hiveserver2_max_threads`	true
Min HiveServer2 Threads	Minimum number of worker threads in HiveServer2's thread pool	`hive.server2.thrift.min.worker.threads`	5	`hiveserver2_min_threads`	true
Session Check Interval	The check interval for session/operation timeout, in milliseconds, which can be disabled by setting to zero or a negative value.	`hive.server2.session.check.interval`	0 second(s)	`hiveserver2_session_check_interval`	false
HiveServer2 Logging Advanced Configuration Snippet (Safety Valve)	For advanced use only, a string to be inserted into log4j.properties for this role only.			`log4j_safety_valve`	false
Heap Dump Directory	Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java process heap size configured for this role.	`oom_heap_dump_dir`	/tmp	`oom_heap_dump_dir`	false
Dump Heap When Out of Memory	When set, generates heap dump file when java.lang.OutOfMemoryError is thrown.		false	`oom_heap_dump_enabled`	true
Kill When Out of Memory	When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown.		true	`oom_sigkill_enabled`	true
Automatically Restart Process	When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure.		false	`process_auto_restart`	true

Logs

Display Name	Description	Default Value	API Name	Required
HiveServer2 Log Directory	Directory where HiveServer2 will place its log files.	/var/log/hive	`hive_log_dir`	false
HiveServer2 Logging Threshold	The minimum log level for HiveServer2 logs	INFO	`log_threshold`	false
HiveServer2 Maximum Log File Backups	The maximum number of rolled log files to keep for HiveServer2 logs. Typically used by log4j or logback.	10	`max_log_backup_index`	false
HiveServer2 Max Log Size	The maximum size, in megabytes, per log file for HiveServer2 logs. Typically used by log4j or logback.	200 MiB	`max_log_size`	false

Monitoring

Display Name	Description	Default Value	API Name	Required
Enable Health Alerts for this Role	When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold	true	`enable_alerts`	false
Enable Configuration Change Alerts	When set, Cloudera Manager will send alerts when this entity's configuration changes.	false	`enable_config_alerts`	false
Heap Dump Directory Free Space Monitoring Absolute Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory.	Warning: 10 GiB, Critical: 5 GiB	`heap_dump_directory_free_space_absolute_thresholds`	false
Heap Dump Directory Free Space Monitoring Percentage Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.	Warning: Never, Critical: Never	`heap_dump_directory_free_space_percentage_thresholds`	false
Hive Local Scratch Directory Free Space Monitoring Absolute Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's Hive Local Scratch Directory.	Warning: 10 GiB, Critical: 5 GiB	`hiveserver2_exec_local_scratch_directory_free_space_absolute_thresholds`	false
Hive Local Scratch Directory Free Space Monitoring Percentage Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's Hive Local Scratch Directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Hive Local Scratch Directory Free Space Monitoring Absolute Thresholds setting is configured.	Warning: Never, Critical: Never	`hiveserver2_exec_local_scratch_directory_free_space_percentage_thresholds`	false
File Descriptor Monitoring Thresholds	The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit.	Warning: 50.0 %, Critical: 70.0 %	`hiveserver2_fd_thresholds`	false
HiveServer2 Host Health Test	When computing the overall HiveServer2 health, consider the host's health.	true	`hiveserver2_host_health_enabled`	false
HiveServer2 Process Health Test	Enables the health test that the HiveServer2's process state is consistent with the role configuration	true	`hiveserver2_scm_health_enabled`	false
Log Directory Free Space Monitoring Absolute Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory.	Warning: 10 GiB, Critical: 5 GiB	`log_directory_free_space_absolute_thresholds`	false
Log Directory Free Space Monitoring Percentage Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured.	Warning: Never, Critical: Never	`log_directory_free_space_percentage_thresholds`	false
Process Swap Memory Thresholds	The health test thresholds on the swap memory usage of the process.	Warning: Any, Critical: Never	`process_swap_memory_thresholds`	false
Role Triggers	The configured triggers for this role. This is a JSON formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields: `triggerName` (mandatory) - The name of the trigger. This value must be unique for the specific role. `triggerExpression` (mandatory) - A tsquery expression representing the trigger. `streamThreshold` (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire. `enabled` (optional) - By default set to 'true'. If set to 'false', the trigger will not be evaluated. `expressionEditorConfig` (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here may lead to inconsistencies. For example, the following JSON formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened: `[{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}]` See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change in the future; as a result, backward compatibility is not guaranteed between releases at this time.	[]	`role_triggers`	true
Unexpected Exits Thresholds	The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role.	Warning: Never, Critical: Any	`unexpected_exits_thresholds`	false
Unexpected Exits Monitoring Period	The period to review when computing unexpected exits.	5 minute(s)	`unexpected_exits_window`	false

Other

Display Name	Description	Related Name	Default Value	API Name	Required
HiveServer2 Load Balancer	Address of the load balancer used for HiveServer2 roles, specified in host:port format. If port is not specified, the port used by HiveServer2 is used. Note: Changing this property regenerates Kerberos keytabs for all HiveServer2 roles.			`hiverserver2_load_balancer`	false
HiveServer2 Enable Impersonation	HiveServer2 will impersonate the Beeline client user when talking to other services such as MapReduce and HDFS.	`hive.server2.enable.doAs`	true	`hiveserver2_enable_impersonation`	false

Performance

Display Name	Description	Related Name	Default Value	API Name	Required
Hive Auto Convert Join Noconditional Size	If Hive auto convert join is on, and the sum of the size for n-1 of the tables/partitions for a n-way join is smaller than the specified size, the join is directly converted to a MapJoin (there is no conditional task).	`hive.auto.convert.join.noconditionaltask.size`	20 MiB	`hiveserver2_auto_convert_join_noconditionaltask_size`	false
Enable Stats Optimization	Enable optimization that checks if a query can be answered using statistics. If so, answers the query using only statistics stored in metastore.	`hive.compute.query.using.stats`	false	`hiveserver2_compute_query_using_stats`	false
Enable Cost Based Optimizer for Hive	Enables the Calcite-based Cost Based Optimizer for HiveServer2.	`hive.cbo.enable`	false	`hiveserver2_enable_cbo`	false
Enable MapJoin Optimization	Enable optimization that converts common join into MapJoin based on input file size.	`hive.auto.convert.join`	true	`hiveserver2_enable_mapjoin`	false
Fetch Task Query Conversion	Some select queries can be converted to a single FETCH task instead of MapReduce task, minimizing latency. A value of none disables all conversion, minimal converts simple queries such as SELECT * and filter on partition columns, and more will convert SELECT queries including FILTERS.	`hive.fetch.task.conversion`	minimal	`hiveserver2_fetch_task_conversion`	false
Fetch Task Query Conversion Threshold	Above this size, queries will not be converted to fetch tasks.	`hive.fetch.task.conversion.threshold`	256 MiB	`hiveserver2_fetch_task_conversion_threshold`	false
Maximum ReduceSink Top-K Memory Usage	The max percentage of heap to be used for hash in ReduceSink operator for Top-K selection. A 0 means the optimization is disabled. Values accepted are between 0 and 1.	`hive.limit.pushdown.memory.usage`	0.1	`hiveserver2_limit_pushdown_memory_usage`	false
Enable Map-Side Aggregation	Enable map-side partial aggregation, which causes the mapper to generate fewer rows. This reduces the data to be sorted and distributed to reducers.	`hive.map.aggr`	true	`hiveserver2_map_aggr`	false
Ratio of Memory Usage for Map-Side Aggregation	Portion of total memory used in map-side partial aggregation. When exceeded, the partially aggregated results will be flushed from the map task to the reducers.	`hive.map.aggr.hash.percentmemory`	0.5	`hiveserver2_map_aggr_hash_memory_ratio`	false
Enable Merging Small Files - Map-Only Job	Merge small files at the end of a map-only job. When enabled, a map-only job is created to merge the files in the destination table/partitions.	`hive.merge.mapfiles`	true	`hiveserver2_merge_mapfiles`	false
Enable Merging Small Files - Map-Reduce Job	Merge small files at the end of a map-reduce job. When enabled, a map-only job is created to merge the files in the destination table/partitions.	`hive.merge.mapredfiles`	false	`hiveserver2_merge_mapredfiles`	false
Desired File Size After Merging	The desired file size after merging. This should be larger than hive.merge.smallfiles.avgsize.	`hive.merge.size.per.task`	256 MiB	`hiveserver2_merge_size_per_task`	false
Small File Average Size Merge Threshold	When the average output file size of a job is less than the value of this property, Hive will start an additional map-only job to merge the output files into bigger files. This is only done for map-only jobs if hive.merge.mapfiles is true, for map-reduce jobs if hive.merge.mapredfiles is true, and for Spark jobs if hive.merge.sparkfiles is true.	`hive.merge.smallfiles.avgsize`	16 MiB	`hiveserver2_merge_smallfiles_avgsize`	false
Enable Merging Small Files - Spark Job	Merge small files at the end of a Spark job. When enabled, a map-only job is created to merge the files in the destination table/partitions.	`hive.merge.sparkfiles`	true	`hiveserver2_merge_sparkfiles`	false
Hive Optimize Sorted Merge Bucket Join	Whether to try sorted merge bucket (SMB) join.	`hive.optimize.bucketmapjoin.sortedmerge`	false	`hiveserver2_optimize_bucketmapjoin_sortedmerge`	false
Enable Automatic Use of Indexes	Whether to use the indexing optimization for all queries.	`hive.optimize.index.filter`	true	`hiveserver2_optimize_index_filter`	false
Enable ReduceDeDuplication Optimization	Remove extra map-reduce jobs if the data is already clustered by the same key, eliminating the need to repartition the dataset again.	`hive.optimize.reducededuplication`	true	`hiveserver2_optimize_reducededuplication`	false
Minimum Reducers for ReduceDeDuplication Optimization	When the number of ReduceSink operators after merging is less than this number, the ReduceDeDuplication optimization will be disabled.	`hive.optimize.reducededuplication.min.reducer`	4	`hiveserver2_optimize_reducededuplication_min_reducer`	false
Enable Sorted Dynamic Partition Optimizer	When dynamic partition is enabled, reducers keep only one record writer at all times, which lowers the memory pressure on reducers.	`hive.optimize.sort.dynamic.partition`	false	`hiveserver2_optimize_sort_dynamic_partition`	false
Hive SMB Join Cache Rows	The number of rows with the same key value to be cached in memory per SMB-joined table.	`hive.smbjoin.cache.rows`	10000	`hiveserver2_smbjoin_cache_rows`	false
Spark Driver Maximum Java Heap Size	Maximum size of each Spark driver's Java heap memory when Hive is running on Spark.	`spark.driver.memory`	256 MiB	`hiveserver2_spark_driver_memory`	false
Enable Dynamic Executor Allocation	When enabled, Spark will add and remove executors dynamically to Hive jobs. This is done based on the workload.	`spark.dynamicAllocation.enabled`	true	`hiveserver2_spark_dynamic_allocation_enabled`	true
Initial Number of Executors	Initial number of executors used by the application at any given time. This is required if the dynamic executor allocation feature is enabled.	`spark.dynamicAllocation.initialExecutors`	1	`hiveserver2_spark_dynamic_allocation_initial_executors`	true
Lower Bound on Number of Executors	Lower bound on the number of executors used by the application at any given time. This is used by dynamic executor allocation	`spark.dynamicAllocation.minExecutors`	1	`hiveserver2_spark_dynamic_allocation_min_executors`	true
Spark Executor Cores	Number of cores per Spark executor.	`spark.executor.cores`	1	`hiveserver2_spark_executor_cores`	true
Spark Executors Per Application	Number of Spark executors assigned to each application. This should not be set when Dynamic Executor Allocation is enabled.	`spark.executor.instances`		`hiveserver2_spark_executor_instances`	false
Spark Executor Maximum Java Heap Size	Maximum size of each Spark executor's Java heap memory when Hive is running on Spark.	`spark.executor.memory`	256 MiB	`hiveserver2_spark_executor_memory`	true
Spark Driver Memory Overhead	This is the amount of extra off-heap memory that can be requested from YARN, per driver. This, together with spark.driver.memory, is the total memory that YARN can use to create JVM for a driver process.	`spark.yarn.driver.memoryOverhead`	26 MiB	`hiveserver2_spark_yarn_driver_memory_overhead`	false
Spark Executor Memory Overhead	This is the amount of extra off-heap memory that can be requested from YARN, per executor process. This, together with spark.executor.memory, is the total memory that YARN can use to create JVM for an executor process.	`spark.yarn.executor.memoryOverhead`	26 MiB	`hiveserver2_spark_yarn_executor_memory_overhead`	true
Load Column Statistics	Whether column stats for a table are fetched during explain.	`hive.stats.fetch.column.stats`	true	`hiveserver2_stats_fetch_column_stats`	false
Enable Vectorization Optimization	Enable the optimization that vectorizes query execution, streamlining operations by processing a block of 1024 rows at a time.	`hive.vectorized.execution.enabled`	true	`hiveserver2_vectorized_enabled`	false
Vectorized GroupBy Check Interval	In vectorized group-by, the number of row entries added to the hash table before re-checking average variable size for memory usage estimation.	`hive.vectorized.groupby.checkinterval`	4096	`hiveserver2_vectorized_groupby_checkinterval`	false
Vectorized GroupBy Flush Ratio	Ratio between 0.0 and 1.0 of entries in the vectorized group-by aggregation hash that is flushed when the memory threshold is exceeded.	`hive.vectorized.groupby.flush.percent`	0.1	`hiveserver2_vectorized_groupby_flush_ratio`	false
Enable Reduce-Side Vectorization	Whether to vectorize the reduce side of query execution.	`hive.vectorized.execution.reduce.enabled`	false	`hiveserver2_vectorized_reduce_enabled`	false
Maximum Process File Descriptors	If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value.			`rlimit_fds`	false
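
The Spark Driver Memory Overhead and Spark Executor Memory Overhead rows above describe the YARN memory for a process as the Java heap plus the off-heap overhead. The following is a minimal sketch of that sum using the default values from the table. The function name is hypothetical, and the sketch does not model any rounding that YARN may apply to container sizes.

```python
def yarn_memory_request_mib(heap_mib: int, overhead_mib: int) -> int:
    """Total memory (MiB) requested from YARN for one driver or executor.

    Hypothetical helper: heap size (spark.driver.memory or
    spark.executor.memory) plus the matching memoryOverhead setting.
    """
    if heap_mib < 0 or overhead_mib < 0:
        raise ValueError("memory sizes must be non-negative")
    return heap_mib + overhead_mib


# Defaults listed in the table: 256 MiB heap, 26 MiB overhead.
executor_total = yarn_memory_request_mib(256, 26)
driver_total = yarn_memory_request_mib(256, 26)
print(executor_total, driver_total)
```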

Ports and Addresses

Display Name	Description	Related Name	Default Value	API Name	Required
HiveServer2 Port	Port on which HiveServer2 will listen for connections.	`hive.server2.thrift.port`	10000	`hs2_thrift_address_port`	false

Resource Management

Display Name	Description	Related Name	Default Value	API Name	Required
Java Heap Size of HiveServer2 in Bytes	Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx.		256 MiB	`hiveserver2_java_heapsize`	false
Cgroup CPU Shares	Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager.	`cpu.shares`	1024	`rm_cpu_shares`	true
Cgroup I/O Weight	Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager.	`blkio.weight`	500	`rm_io_weight`	true
Cgroup Memory Hard Limit	Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit.	`memory.limit_in_bytes`	-1 MiB	`rm_memory_hard_limit`	true
Cgroup Memory Soft Limit	Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit.	`memory.soft_limit_in_bytes`	-1 MiB	`rm_memory_soft_limit`	true
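
The ranges stated for Cgroup CPU Shares (2 to 262144) and Cgroup I/O Weight (100 to 1000) can be checked before a value is submitted. The sketch below is one way to do that; the function name and message wording are assumptions made for illustration and are not part of Cloudera Manager.

```python
from typing import List


def validate_cgroup_settings(cpu_shares: int, io_weight: int) -> List[str]:
    """Return a list of problems; an empty list means both values are in range.

    Hypothetical helper. The bounds come from the descriptions of
    rm_cpu_shares and rm_io_weight in the table above.
    """
    problems = []
    if not 2 <= cpu_shares <= 262144:
        problems.append(f"rm_cpu_shares {cpu_shares} is outside 2..262144")
    if not 100 <= io_weight <= 1000:
        problems.append(f"rm_io_weight {io_weight} is outside 100..1000")
    return problems


# The table defaults (1024 shares, weight 500) pass the check.
print(validate_cgroup_settings(1024, 500))
```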

Stacks Collection

Display Name	Description	Related Name	Default Value	API Name	Required
Stacks Collection Data Retention	The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted.	`stacks_collection_data_retention`	100 MiB	`stacks_collection_data_retention`	false
Stacks Collection Directory	The directory in which stacks logs are placed. If not set, stacks are logged into a `stacks` subdirectory of the role's log directory.	`stacks_collection_directory`		`stacks_collection_directory`	false
Stacks Collection Enabled	Whether or not periodic stacks collection is enabled.	`stacks_collection_enabled`	false	`stacks_collection_enabled`	true
Stacks Collection Frequency	The frequency with which stacks are collected.	`stacks_collection_frequency`	5.0 second(s)	`stacks_collection_frequency`	false
Stacks Collection Method	The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped.	`stacks_collection_method`	jstack	`stacks_collection_method`	false

service_wide

Advanced

Display Name	Description	Related Name	Default Value	API Name	Required
Hive Auxiliary JARs Directory	Directory containing auxiliary JARs used by Hive. This should be a directory location and not a classpath containing one or more JARs. This directory must be created and managed manually on Hive CLI or HiveServer2 host. The directory location is set in the environment as HIVE_AUX_JARS_PATH and will generally override hive.aux.jars.path property set in XML files, even if hive.aux.jars.path is set in an advanced configuration snippet.			`hive_aux_jars_path_dir`	false
Bypass Hive Metastore Server	Instead of talking to Hive Metastore Server for Metastore information, Hive clients will talk directly to the Metastore database.		false	`hive_bypass_metastore_server`	false
Hive Service Advanced Configuration Snippet (Safety Valve) for core-site.xml	For advanced use only, a string to be inserted into core-site.xml. Applies to configurations of all roles in this service except client configuration.			`hive_core_site_safety_valve`	false
Hive Copy Large File Size	Smaller than this size, Hive uses a single-threaded copy; larger than this size, Hive uses DistCp.	`hive.exec.copyfile.maxsize`	32 MiB	`hive_exec_copyfile_maxsize`	false
Server Name for Sentry Authorization	The server name used when defining privilege rules in Sentry authorization. Sentry uses this name as an alias for the Hive service. It does not correspond to any physical server name.	`hive.sentry.server`	server1	`hive_sentry_server`	false
Hive Service Advanced Configuration Snippet (Safety Valve) for sentry-site.xml	For advanced use only, a string to be inserted into sentry-site.xml. Applies to configurations of all roles in this service except client configuration.			`hive_server2_sentry_safety_valve`	false
Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml	For advanced use only, a string to be inserted into hive-site.xml. Applies to configurations of all roles in this service except client configuration.			`hive_service_config_safety_valve`	false
Hive Service Environment Advanced Configuration Snippet (Safety Valve)	For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of all roles in this service except client configuration.			`hive_service_env_safety_valve`	false
Hive Client Advanced Configuration Snippet (Safety Valve) for navigator.client.properties	For advanced use only, a string to be inserted into the client configuration for navigator.client.properties.			`navigator_client_config_safety_valve`	false
System Group	The group that this service's processes should run as.		hive	`process_groupname`	true
System User	The user that this service's processes should run as.		hive	`process_username`	true

Cloudera Navigator

Display Name	Description	Related Name	Default Value	API Name	Required
Enable Audit Collection	Enable collection of audit events from the service's roles.		true	`navigator_audit_enabled`	false
Audit Event Filter	Event filters are defined in a JSON object like the following: `{ "defaultAction" : ("accept", "discard"), "rules" : [ { "action" : ("accept", "discard"), "fields" : [ { "name" : "fieldName", "match" : "regex" } ] } ] }` A filter has a default action and a list of rules, in order of precedence. Each rule defines an action, and a list of fields to match against the audit event. A rule is "accepted" if all the listed field entries match the audit event. At that point, the action declared by the rule is taken. If no rules match the event, the default action is taken. Actions default to "accept" if not defined in the JSON object. The following is the list of fields that can be filtered for Hive events: userName: the user performing the action. ipAddress: the IP from where the request originated. operation: the Hive operation being performed. databaseName: the databaseName for the operation. tableName: the tableName for the operation. The default Hive audit event filter discards HDFS directory events generated by Hive jobs that reference the /tmp directory.	`navigator.event.filter`	comment : [ The default Hive audit event filter discards HDFS directory events , generated by Hive jobs that reference the /tmp directory. ], defaultAction : accept, rules : [ action : discard, fields : [ name : operation, match : QUERY , name : objectType, match : DFS_DIR, name : resourcePath, match : /tmp/hive-(?:.+)?/hive_(?:.+)?/-mr-.* ] ]	`navigator_audit_event_filter`	false
Audit Queue Policy	Action to take when the audit event queue is full. Drop the event or shutdown the affected process.	`navigator.batch.queue_policy`	DROP	`navigator_audit_queue_policy`	false
Audit Event Tracker	Configures the rules for event tracking and coalescing. This feature is used to define equivalency between different audit events. When events match, according to a set of configurable parameters, only one entry in the audit list is generated for all the matching events. Tracking works by keeping a reference to events when they first appear, and comparing other incoming events against the "tracked" events according to the rules defined here. Event trackers are defined in a JSON object like the following: `{ "timeToLive" : [integer], "fields" : [ { "type" : [string], "name" : [string] } ] }` Where: timeToLive: maximum amount of time an event will be tracked, in milliseconds. Must be provided. This defines how long an event will be tracked after it is first seen. A value of 0 disables tracking. fields: list of fields to compare when matching events against tracked events. Each field has an evaluator type associated with it. The evaluator defines how the field data is to be compared. The following evaluators are available: value: uses the field value for comparison. userName: treats the field value as a user name, and ignores any host-specific data. This is useful for environments using Kerberos, so that only the principal name and realm are compared. The following is the list of fields that can be used to compare Hive events: operation: the Hive operation being performed. username: the user performing the action. ipAddress: the IP from where the request originated. allowed: whether the operation was allowed or denied. databaseName: the database affected by the operation. tableName: the table or view affected by the operation. objectType: the type of object affected by the operation. resourcePath: the path of the resource affected by the operation.	`navigator_event_tracker`		`navigator_event_tracker`	false
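
The Audit Event Filter description above says that rules are tried in order, that a rule applies when all of its listed fields match the event, and that the default action is taken when no rule applies. The sketch below illustrates that evaluation order only. It is not Navigator's implementation: the function name is hypothetical, whole-string regex matching is an assumption, treating a field missing from the event as a non-match is an assumption, and the `DEFAULT_FILTER` dictionary is a reconstruction of the flattened default value shown in the table.

```python
import re
from typing import Any, Dict

# Reconstructed from the flattened default value in the table (assumption).
DEFAULT_FILTER: Dict[str, Any] = {
    "defaultAction": "accept",
    "rules": [
        {
            "action": "discard",
            "fields": [
                {"name": "operation", "match": "QUERY"},
                {"name": "objectType", "match": "DFS_DIR"},
                {"name": "resourcePath",
                 "match": "/tmp/hive-(?:.+)?/hive_(?:.+)?/-mr-.*"},
            ],
        }
    ],
}


def filter_action(event_filter: Dict[str, Any], event: Dict[str, Any]) -> str:
    """Return "accept" or "discard" for one audit event (hypothetical helper)."""
    for rule in event_filter.get("rules", []):
        fields = rule.get("fields", [])
        # A rule applies only if every listed field matches the event.
        if all(
            f["name"] in event
            and re.fullmatch(f["match"], str(event[f["name"]])) is not None
            for f in fields
        ):
            # Actions default to "accept" when not defined.
            return rule.get("action", "accept")
    return event_filter.get("defaultAction", "accept")


tmp_event = {
    "operation": "QUERY",
    "objectType": "DFS_DIR",
    "resourcePath": "/tmp/hive-alice/hive_2015-01-01/-mr-10000",
}
print(filter_action(DEFAULT_FILTER, tmp_event))
```

The event values used here (`alice`, the dated directory name) are made up for the example.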

Hive Metastore Database

Display Name	Description	Related Name	Default Value	API Name	Required
Auto Create and Upgrade Hive Metastore Database Schema	Automatically create or upgrade tables in the Hive Metastore database when needed. Consider setting this to false and managing the schema manually.	`datanucleus.autoCreateSchema`	false	`hive_metastore_database_auto_create_schema`	false
Hive Metastore Database DataNucleus Metadata Validation	Perform DataNucleus validation of metadata during startup. Note: when enabled, Hive will log DataNucleus warnings even though Hive will function normally.	`datanucleus.metadata.validate`	false	`hive_metastore_database_datanucleus_metadata_validation`	false
Hive Metastore Database Host	Host name of Hive Metastore database		localhost	`hive_metastore_database_host`	false
Hive Metastore Database Name	Name of Hive Metastore database		metastore	`hive_metastore_database_name`	false
Hive Metastore Database Password	Password for Hive Metastore database	`javax.jdo.option.ConnectionPassword`		`hive_metastore_database_password`	false
Hive Metastore Database Port	Port number of Hive Metastore database		3306	`hive_metastore_database_port`	false
Hive Metastore Database Type	Type of Hive Metastore database. Note that Derby is not recommended and Cloudera Impala does not support Derby.		mysql	`hive_metastore_database_type`	false
Hive Metastore Database User	User for Hive Metastore database	`javax.jdo.option.ConnectionUserName`	hive	`hive_metastore_database_user`	false
Hive Metastore Derby Path	Directory name where Hive Metastore's database is stored (only for Derby)		/var/lib/hive/cloudera_manager/derby/metastore_db	`hive_metastore_derby_path`	false
Strict Hive Metastore Schema Validation	Prevent Metastore operations in the event of schema version incompatibility. Consider setting this to true to reduce probability of schema corruption during Metastore operations. Note that setting this property to true will also set datanucleus.autoCreateSchema property to false and datanucleus.fixedDatastore property to true. Any values set in Cloudera Manager for these properties will be overridden.	`hive.metastore.schema.verification`	true	`hive_metastore_schema_verification`	false

Logs

Display Name	Description	Related Name	Default Value	API Name	Required
Audit Log Directory	Path to the directory where audit logs will be written. The directory will be created if it doesn't exist.	`audit_event_log_dir`	/var/log/hive/audit	`audit_event_log_dir`	false
Maximum Audit Log File Size	Maximum size of audit log file in MB before it is rolled over.	`navigator.audit_log_max_file_size`	100 MiB	`navigator_audit_log_max_file_size`	false
Number of Audit Logs to Retain	Maximum number of rolled over audit logs to retain. The logs will not be deleted if they contain audit events that have not yet been propagated to Audit Server.	`navigator.client.max_num_audit_log`	10	`navigator_client_max_num_audit_log`	false

Monitoring

Display Name	Description	Default Value	API Name	Required
Enable Service Level Health Alerts	When set, Cloudera Manager will send alerts when the health of this service reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold	true	`enable_alerts`	false
Enable Configuration Change Alerts	When set, Cloudera Manager will send alerts when this entity's configuration changes.	false	`enable_config_alerts`	false
Healthy Hive Metastore Server Monitoring Thresholds	The health test thresholds of the overall Hive Metastore Server health. The check returns "Concerning" health if the percentage of "Healthy" Hive Metastore Servers falls below the warning threshold. The check is unhealthy if the total percentage of "Healthy" and "Concerning" Hive Metastore Servers falls below the critical threshold.	Warning: 99.0 %, Critical: 51.0 %	`hive_hivemetastores_healthy_thresholds`	false
Healthy HiveServer2 Monitoring Thresholds	The health test thresholds of the overall HiveServer2 health. The check returns "Concerning" health if the percentage of "Healthy" HiveServer2s falls below the warning threshold. The check is unhealthy if the total percentage of "Healthy" and "Concerning" HiveServer2s falls below the critical threshold.	Warning: 99.0 %, Critical: 51.0 %	`hive_hiveserver2s_healthy_thresholds`	false
Healthy WebHCat Server Monitoring Thresholds	The health test thresholds of the overall WebHCat Server health. The check returns "Concerning" health if the percentage of "Healthy" WebHCat Servers falls below the warning threshold. The check is unhealthy if the total percentage of "Healthy" and "Concerning" WebHCat Servers falls below the critical threshold.	Warning: 99.0 %, Critical: 51.0 %	`hive_webhcats_healthy_thresholds`	false
Service Triggers	The configured triggers for this service. This is a JSON formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields: `triggerName` (mandatory) - The name of the trigger. This value must be unique for the specific service. `triggerExpression` (mandatory) - A tsquery expression representing the trigger. `streamThreshold` (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire. `enabled` (optional) - By default set to 'true'. If set to 'false', the trigger will not be evaluated. `expressionEditorConfig` (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here may lead to inconsistencies. For example, the following JSON formatted trigger fires if there are more than 10 DataNodes with more than 500 file-descriptors opened: `[{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleType = DataNode and last(fd_open) > 500) DO health:bad", "streamThreshold": 10, "enabled": "true"}]` See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change in the future and, as a result, backward compatibility is not guaranteed between releases at this time.	[]	`service_triggers`	true
Service Monitor Client Config Overrides	For advanced use only, a list of configuration properties that will be used by the Service Monitor instead of the current client configuration for the service.	<property><name>hive.metastore.client.socket.timeout</name><value>60</value></property>	`smon_client_config_overrides`	false
Service Monitor Derived Configs Advanced Configuration Snippet (Safety Valve)	For advanced use only, a list of derived configuration properties that will be used by the Service Monitor instead of the default ones.		`smon_derived_configs_safety_valve`	false
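
The three "Healthy ... Monitoring Thresholds" rows above share one rule: the check is unhealthy when the combined percentage of "Healthy" and "Concerning" roles falls below the critical threshold, and "Concerning" when the percentage of "Healthy" roles falls below the warning threshold. The following sketch shows that rule with the default thresholds. The function name and the returned labels are assumptions for illustration, not Cloudera Manager output.

```python
def overall_role_health(healthy: int, concerning: int, total: int,
                        warning_pct: float = 99.0,
                        critical_pct: float = 51.0) -> str:
    """Classify overall health from per-role counts (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    healthy_pct = 100.0 * healthy / total
    healthy_or_concerning_pct = 100.0 * (healthy + concerning) / total
    # Check the critical threshold first because it is the more severe result.
    if healthy_or_concerning_pct < critical_pct:
        return "Unhealthy"
    if healthy_pct < warning_pct:
        return "Concerning"
    return "Healthy"


# Three HiveServer2 roles, one of them in "Concerning" state.
print(overall_role_health(healthy=2, concerning=1, total=3))
```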

Other

Display Name	Description	Related Name	Default Value	API Name	Required
Enable Hive on Spark (Unsupported)	Cloudera does not support Hive on Spark in CDH 5.4. Enable Hive to use Spark for execution even though it is not supported. For evaluation purposes only. This configuration only takes effect when Hive is configured with a Spark On YARN Service. See Configuring Hive on Spark for more information about using Hive on Spark.	`hive.enable.spark.execution.engine`	false	`enable_hive_on_spark`	true
Hive Bytes Per Reducer	Size per reducer. If the input size is 10GiB and this is set to 1GiB, Hive will use 10 reducers.	`hive.exec.reducers.bytes.per.reducer`	64 MiB	`hive_bytes_per_reducer`	false
Hive Max Reducers	Max number of reducers to use. If the configuration parameter Hive Reduce Tasks is negative, Hive will limit the number of reducers to the value of this parameter.	`hive.exec.reducers.max`	1099	`hive_max_reducers`	false
Hive Reduce Tasks	Default number of reduce tasks per job. Usually set to a prime number close to the number of available hosts. Ignored when mapred.job.tracker is "local". Hadoop sets this to 1 by default, while Hive uses -1 as the default. When set to -1, Hive will automatically determine an appropriate number of reducers for each job.	`mapred.reduce.tasks`	-1	`hive_reduce_tasks`	false
Set User and Group Information	In unsecure mode, setting this property to true will cause the Metastore Server to execute DFS operations using the client's reported user and group permissions. Cloudera Manager will set this for all clients and servers.	`hive.metastore.execute.setugi`	true	`hive_set_ugi`	true
Hive Warehouse Directory	Hive warehouse directory is the location in HDFS where Hive's tables are stored. Note that Hive's default value for its warehouse directory is '/user/hive/warehouse'.	`hive.metastore.warehouse.dir`	/user/hive/warehouse	`hive_warehouse_directory`	false
Hive Warehouse Subdirectories Inherit Permissions	Let the table directories inherit the permission of the Warehouse or Database directory instead of being created with the permissions derived from dfs umask. This allows Impala to insert into tables created via Hive.	`hive.warehouse.subdir.inherit.perms`	true	`hive_warehouse_subdir_inherit_perms`	true
MapReduce Service	MapReduce jobs are run against this service.			`mapreduce_yarn_service`	true
Sentry Service	Name of the Sentry service that this Hive service instance depends on. If selected, Hive uses this Sentry service to look up authorization privileges. Before enabling Sentry, read the requirements and configuration steps in Setting Up The Sentry Service.			`sentry_service`	false
Spark On YARN Service	Name of the Spark on YARN service that this Hive service instance depends on. If selected and Enable Hive on Spark is set to true, Hive jobs can use the Spark execution engine instead of MapReduce2. Requires that Hive depends on YARN. See Configuring Hive on Spark for more information about Hive on Spark.			`spark_on_yarn_service`	false
ZooKeeper Service	Name of the ZooKeeper service that this Hive service instance depends on.			`zookeeper_service`	false
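
Hive Bytes Per Reducer, Hive Max Reducers, and Hive Reduce Tasks interact: when Hive Reduce Tasks is negative, the reducer count is derived from the input size and capped by Hive Max Reducers. The sketch below reproduces the 10 GiB / 1 GiB example from the table. It is an approximation, not Hive's own code: rounding up and the floor of one reducer are assumptions, and the function name is hypothetical.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def estimate_reducers(input_bytes: int,
                      bytes_per_reducer: int = 64 * MIB,
                      max_reducers: int = 1099,
                      reduce_tasks: int = -1) -> int:
    """Estimate the reducer count for a job (hypothetical helper)."""
    if reduce_tasks >= 0:
        # An explicit mapred.reduce.tasks value is used as given.
        return reduce_tasks
    # Ceiling division without floating point (rounding up is an assumption).
    estimate = -(-input_bytes // bytes_per_reducer)
    # Cap at hive.exec.reducers.max; the floor of 1 is an assumption.
    return max(1, min(max_reducers, estimate))


# The example from the table: 10 GiB of input at 1 GiB per reducer.
print(estimate_reducers(10 * GIB, bytes_per_reducer=1 * GIB))
```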

Policy File Based Sentry

Display Name	Description	Related Name	Default Value	API Name	Required
Sentry User to Group Mapping Class	The class to use in Sentry authorization for user to group mapping. Sentry authorization may be configured to use either Hadoop user to group mapping or local groups defined in the policy file. Hadoop user to group mapping may be configured in the Cloudera Manager HDFS service configuration page under the Security section.	`hive.sentry.provider`	org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider	`hive_sentry_provider`	false
Sentry Global Policy File	HDFS path to the global policy file for Sentry authorization. This should be a relative path (and not a full HDFS URL). The global policy file must be in Sentry policy file format.	`hive.sentry.provider.resource`	/user/hive/sentry/sentry-provider.ini	`hive_sentry_provider_resource`	false
Allow URIs in Database Policy File	Allows URIs when defining privileges in per-database policy files. Warning: Typically, this configuration should be disabled. Enabling it would allow the owner of a database policy file (who is generally not the Hive admin user) to grant load privileges on any directory that the Hive admin user can read, including databases controlled by other database policy files.	`sentry.allow.uri.db.policyfile`	false	`sentry_allow_uri_db_policyfile`	false
Enable Sentry Authorization using Policy Files	Use Sentry to enable role-based, fine-grained authorization. This configuration enables Sentry using policy files. To enable Sentry using the Sentry service instead, add the Sentry service as a dependency to the Hive service. The Sentry service provides concurrent and secure access to authorization policy metadata and is the recommended option for enabling Sentry. Sentry is supported only on CDH 4.4 or later deployments. Before enabling Sentry, read the requirements and configuration steps in Setting Up Hive Authorization with Sentry.	`hive.sentry.enabled`	false	`sentry_enabled`	false

Proxy

Display Name	Description	Related Name	Default Value	API Name	Required
Hive Metastore Access Control and Proxy User Groups Override	This configuration overrides the value set for Hive Proxy User Groups configuration in HDFS service for use by Hive Metastore Server. Specify a comma-delimited list of groups that you want to allow access to Hive Metastore metadata and allow the Hive user to impersonate. A value of '*' allows all groups. The default value of empty inherits the value set for Hive Proxy User Groups configuration in the HDFS service.	`hadoop.proxyuser.hive.groups`		`hive_proxy_user_groups_list`	false

Security

Display Name	Description	Related Name	Default Value	API Name	Required
Enable LDAP Authentication	When checked, LDAP-based authentication for users is enabled. This option is incompatible with Kerberos authentication for HiveServer2 at this time.		false	`hiveserver2_enable_ldap_auth`	false
Enable TLS/SSL for HiveServer2	Encrypt communication between clients and HiveServer2 using Transport Layer Security (TLS) (formerly known as Secure Socket Layer (SSL)).	`hive.server2.use.SSL`	false	`hiveserver2_enable_ssl`	false
HiveServer2 TLS/SSL Server JKS Keystore File Password	The password for the HiveServer2 JKS keystore file.	`hive.server2.keystore.password`		`hiveserver2_keystore_password`	false
HiveServer2 TLS/SSL Server JKS Keystore File Location	The path to the TLS/SSL keystore file containing the server certificate and private key used for TLS/SSL. Used when HiveServer2 is acting as a TLS/SSL server. The keystore must be in JKS format.	`hive.server2.keystore.path`		`hiveserver2_keystore_path`	false
LDAP BaseDN	When set, this parameter is used to convert the username into the LDAP Distinguished Name (DN), so that the resulting DN looks like uid=username,X. For example, if this parameter is set to "ou=People,dc=cloudera,dc=com", and the username passed in is "mike", the resulting authentication passed to the LDAP server will look like "uid=mike,ou=People,dc=cloudera,dc=com". This parameter is generally most useful when authenticating against an OpenLDAP server. This parameter is mutually exclusive with LDAP Domain.	`hive.server2.authentication.ldap.baseDN`		`hiveserver2_ldap_basedn`	false
LDAP Domain	When set, this value will be appended to all usernames before authenticating with the LDAP server. For example, if this parameter is set to "my.domain.com" and the user authenticating is "mike", then "mike@my.domain.com" will be passed to the LDAP server. If this field is unset, the username will remain unaltered before being passed to the LDAP server. LDAP Domain is most useful when authenticating against an Active Directory server. This parameter is mutually exclusive with LDAP BaseDN.	`hive.server2.authentication.ldap.Domain`		`hiveserver2_ldap_domain`	false
LDAP URI	The URI of the LDAP server to use if LDAP authentication is enabled. The URI must be prefixed with ldap:// or ldaps://. Usernames and passwords will go over the wire in the clear unless an "ldaps://" URI is specified. The URI can optionally specify the port, for example: ldaps://ldap_server.example.com:636.	`hive.server2.authentication.ldap.url`		`hiveserver2_ldap_uri`	false
HiveServer2 TLS/SSL Certificate Trust Store File	The location on disk of the trust store, in .jks format, used to confirm the authenticity of TLS/SSL servers that HiveServer2 might connect to. This is used when HiveServer2 is the client in a TLS/SSL connection. This trust store must contain the certificate(s) used to sign the service(s) being connected to. If this parameter is not provided, the default list of well-known certificate authorities is used instead.			`hiveserver2_truststore_file`	false
HiveServer2 TLS/SSL Certificate Trust Store Password	The password for the HiveServer2 TLS/SSL Certificate Trust Store File. Note that this password is not required to access the trust store: this field can be left blank. This password provides optional integrity checking of the file. The contents of trust stores are certificates, and certificates are public information.			`hiveserver2_truststore_password`	false
Kerberos Principal	Kerberos principal short name used by all roles of this service.		hive	`kerberos_princ_name`	true
Bypass Sentry Authorization Users	List of users that are allowed to bypass Sentry Authorization in the Hive metastore. These are usually service users that already ensure that all activity has been authorized, such as hive and impala. Only applies when Hive is using Sentry Service.	`sentry.metastore.service.users`	hive, impala, hue, hdfs	`sentry_metastore_service_users`	false
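
The LDAP BaseDN and LDAP Domain rows above describe two mutually exclusive ways of turning a login name into the name sent to the LDAP server. The sketch below mirrors the two examples given in those descriptions. The function name is hypothetical, and raising an error when both values are set is an assumption about how to handle the mutually exclusive case.

```python
from typing import Optional


def ldap_bind_name(username: str,
                   base_dn: Optional[str] = None,
                   domain: Optional[str] = None) -> str:
    """Build the name passed to the LDAP server (hypothetical helper)."""
    if base_dn and domain:
        raise ValueError("LDAP BaseDN and LDAP Domain are mutually exclusive")
    if base_dn:
        # BaseDN form, described as most useful with OpenLDAP.
        return f"uid={username},{base_dn}"
    if domain:
        # Domain form, described as most useful with Active Directory.
        return f"{username}@{domain}"
    # Neither set: the username is passed through unaltered.
    return username


print(ldap_bind_name("mike", base_dn="ou=People,dc=cloudera,dc=com"))
print(ldap_bind_name("mike", domain="my.domain.com"))
```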

webhcatserverdefaultgroup

Advanced

Display Name	Description	Related Name	Default Value	API Name	Required
WebHCat Server Advanced Configuration Snippet (Safety Valve) for webhcat-site.xml	For advanced use only, a string to be inserted into webhcat-site.xml for this role only.			`hive_webhcat_config_safety_valve`	false
WebHCat Server Environment Advanced Configuration Snippet (Safety Valve)	For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration.			`hive_webhcat_env_safety_valve`	false
WebHCat Server Advanced Configuration Snippet (Safety Valve) for hive-site.xml	For advanced use only, a string to be inserted into hive-site.xml for this role only.			`hive_webhcat_hive_config_safety_valve`	false
Java Configuration Options for WebHCat Server	These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here.		-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled	`hive_webhcat_java_opts`	false
WebHCat Server Logging Advanced Configuration Snippet (Safety Valve)	For advanced use only, a string to be inserted into log4j.properties for this role only.			`log4j_safety_valve`	false
Heap Dump Directory	Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role.	`oom_heap_dump_dir`	/tmp	`oom_heap_dump_dir`	false
Dump Heap When Out of Memory	When set, generates heap dump file when java.lang.OutOfMemoryError is thrown.		false	`oom_heap_dump_enabled`	true
Kill When Out of Memory	When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown.		true	`oom_sigkill_enabled`	true
Automatically Restart Process	When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure.		false	`process_auto_restart`	true

Logs

Display Name	Description	Default Value	API Name	Required
WebHCat Server Log Directory	Directory where WebHCat Server will place its log files.	/var/log/hcatalog	`hcatalog_log_dir`	false
WebHCat Server Logging Threshold	The minimum log level for WebHCat Server logs	INFO	`log_threshold`	false
WebHCat Server Maximum Log File Backups	The maximum number of rolled log files to keep for WebHCat Server logs. Typically used by log4j or logback.	10	`max_log_backup_index`	false
WebHCat Server Max Log Size	The maximum size, in megabytes, per log file for WebHCat Server logs. Typically used by log4j or logback.	200 MiB	`max_log_size`	false

Monitoring

Display Name	Description	Default Value	API Name	Required
Enable Health Alerts for this Role	When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold	true	`enable_alerts`	false
Enable Configuration Change Alerts	When set, Cloudera Manager will send alerts when this entity's configuration changes.	false	`enable_config_alerts`	false
Heap Dump Directory Free Space Monitoring Absolute Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory.	Warning: 10 GiB, Critical: 5 GiB	`heap_dump_directory_free_space_absolute_thresholds`	false
Heap Dump Directory Free Space Monitoring Percentage Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.	Warning: Never, Critical: Never	`heap_dump_directory_free_space_percentage_thresholds`	false
Log Directory Free Space Monitoring Absolute Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory.	Warning: 10 GiB, Critical: 5 GiB	`log_directory_free_space_absolute_thresholds`	false
Log Directory Free Space Monitoring Percentage Thresholds	The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured.	Warning: Never, Critical: Never	`log_directory_free_space_percentage_thresholds`	false
Process Swap Memory Thresholds	The health test thresholds on the swap memory usage of the process.	Warning: Any, Critical: Never	`process_swap_memory_thresholds`	false
Role Triggers	The configured triggers for this role. This is a JSON formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields: `triggerName` (mandatory) - The name of the trigger. This value must be unique for the specific role. `triggerExpression` (mandatory) - A tsquery expression representing the trigger. `streamThreshold` (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire. `enabled` (optional) - By default set to 'true'. If set to 'false', the trigger will not be evaluated. `expressionEditorConfig` (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here may lead to inconsistencies. For example, the following JSON formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors open: `[{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}]` See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change in the future and, as a result, backward compatibility is not guaranteed between releases at this time.	[]	`role_triggers`	true
Unexpected Exits Thresholds	The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role.	Warning: Never, Critical: Any	`unexpected_exits_thresholds`	false
Unexpected Exits Monitoring Period	The period to review when computing unexpected exits.	5 minute(s)	`unexpected_exits_window`	false
File Descriptor Monitoring Thresholds	The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit.	Warning: 50.0 %, Critical: 70.0 %	`webhcat_fd_thresholds`	false
WebHCat Server Host Health Test	When computing the overall WebHCat Server health, consider the host's health.	true	`webhcat_host_health_enabled`	false
WebHCat Server Process Health Test	Enables the health test that the WebHCat Server's process state is consistent with the role configuration	true	`webhcat_scm_health_enabled`	false
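
The Role Triggers row describes a JSON list whose entries carry `triggerName`, `triggerExpression`, and the optional `streamThreshold` and `enabled` fields. The sketch below shows one way such a value could be assembled in Python before pasting it into the setting. The helper names `make_trigger` and `triggers_to_json` are hypothetical, and the expression is the DataNode sample from the description.

```python
import json


def make_trigger(name, expression, stream_threshold=0, enabled=True):
    """Build one trigger dict with the fields listed in the Role Triggers row."""
    if not name or not expression:
        raise ValueError("triggerName and triggerExpression are mandatory")
    return {
        "triggerName": name,
        "triggerExpression": expression,
        "streamThreshold": stream_threshold,
        # The documented sample stores this flag as a string, so do the same.
        "enabled": "true" if enabled else "false",
    }


def triggers_to_json(triggers):
    """Serialize a trigger list, rejecting duplicate names within the role."""
    names = [t["triggerName"] for t in triggers]
    if len(names) != len(set(names)):
        raise ValueError("triggerName must be unique within a role")
    return json.dumps(triggers)


sample = make_trigger(
    "sample-trigger",
    "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) "
    "DO health:bad",
)
print(triggers_to_json([sample]))
```

Because the description warns that the JSON format may change between releases, treat the field set here as specific to this release.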

Performance

Display Name	Description	Related Name	Default Value	API Name	Required
Maximum Process File Descriptors	If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value.			`rlimit_fds`	false

Ports and Addresses

Display Name	Description	Related Name	Default Value	API Name	Required
WebHCat Server Port	Port on which WebHCat Server will listen for connections.	`templeton.port`	50111	`hive_webhcat_address_port`	false

Resource Management

Display Name	Description	Related Name	Default Value	API Name	Required
Java Heap Size of WebHCat Server in Bytes	Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx.		256 MiB	`hive_webhcat_java_heapsize`	false
Cgroup CPU Shares	Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager.	`cpu.shares`	1024	`rm_cpu_shares`	true
Cgroup I/O Weight	Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager.	`blkio.weight`	500	`rm_io_weight`	true
Cgroup Memory Hard Limit	Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit.	`memory.limit_in_bytes`	-1 MiB	`rm_memory_hard_limit`	true
Cgroup Memory Soft Limit	Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit.	`memory.soft_limit_in_bytes`	-1 MiB	`rm_memory_soft_limit`	true
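
The Cgroup CPU Shares and Cgroup I/O Weight rows each state an allowed range. As a rough illustration, the Python sketch below checks candidate values against those ranges before they are entered. The function name is hypothetical, and the defaults are the ones shown in the table.

```python
# Ranges quoted from the Cgroup CPU Shares and Cgroup I/O Weight rows.
CPU_SHARES_RANGE = (2, 262144)
IO_WEIGHT_RANGE = (100, 1000)


def validate_cgroup_settings(cpu_shares=1024, io_weight=500):
    """Return a list of problems; an empty list means both values are in range."""
    problems = []
    low, high = CPU_SHARES_RANGE
    if not low <= cpu_shares <= high:
        problems.append(
            "rm_cpu_shares must be between {} and {}".format(low, high)
        )
    low, high = IO_WEIGHT_RANGE
    if not low <= io_weight <= high:
        problems.append(
            "rm_io_weight must be between {} and {}".format(low, high)
        )
    return problems


if __name__ == "__main__":
    print(validate_cgroup_settings())            # table defaults
    print(validate_cgroup_settings(1, 2000))     # both out of range
```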

Stacks Collection

Display Name	Description	Related Name	Default Value	API Name	Required
Stacks Collection Data Retention	The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted.	`stacks_collection_data_retention`	100 MiB	`stacks_collection_data_retention`	false
Stacks Collection Directory	The directory in which stacks logs are placed. If not set, stacks are logged into a `stacks` subdirectory of the role's log directory.	`stacks_collection_directory`		`stacks_collection_directory`	false
Stacks Collection Enabled	Whether or not periodic stacks collection is enabled.	`stacks_collection_enabled`	false	`stacks_collection_enabled`	true
Stacks Collection Frequency	The frequency with which stacks are collected.	`stacks_collection_frequency`	5.0 second(s)	`stacks_collection_frequency`	false
Stacks Collection Method	The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped.	`stacks_collection_method`	jstack	`stacks_collection_method`	false
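
The API Name column gives the keys used when settings are changed programmatically. The Python sketch below builds, but does not send, a request path and body for enabling stacks collection. It assumes the Cloudera Manager REST API's role config group `config` endpoint and its `items` list body, and it assumes API version `v10`; verify both against the API documentation for your release. The cluster, service, and role group names are hypothetical placeholders.

```python
import json
from urllib.parse import quote


def config_path(cluster, service, group, api_version="v10"):
    """Build the assumed role config group path, URL-encoding each name."""
    return "/api/{}/clusters/{}/services/{}/roleConfigGroups/{}/config".format(
        api_version,
        quote(cluster, safe=""),
        quote(service, safe=""),
        quote(group, safe=""),
    )


def config_body(settings):
    """Wrap API Name/value pairs in the assumed {"items": [...]} structure."""
    items = [{"name": k, "value": str(v)} for k, v in sorted(settings.items())]
    return json.dumps({"items": items})


if __name__ == "__main__":
    # "Cluster 1", "hive", and "hive-WEBHCAT-BASE" are placeholder names.
    print(config_path("Cluster 1", "hive", "hive-WEBHCAT-BASE"))
    print(config_body({
        "stacks_collection_enabled": "true",
        "stacks_collection_method": "jstack",
    }))
```

Only settings whose values appear literally in the table are used here, since the wire format for sizes and durations is not stated in this document.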
