PutClouderaHiveQL

Description:

Executes a HiveQL DDL/DML command (e.g., UPDATE, INSERT). The content of an incoming FlowFile is expected to be the HiveQL command to execute. The HiveQL command may use ? as a placeholder for parameters. In this case, the parameters to use must exist as FlowFile attributes with the naming convention hiveql.args.N.type and hiveql.args.N.value, where N is a positive integer. The hiveql.args.N.type is expected to be a number indicating the JDBC Type. The content of the FlowFile is expected to be in UTF-8 format.
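
As an illustration of the attribute naming convention, the following Python sketch builds the attribute map for a statement with two ? placeholders. The helper name build_hiveql_attributes, the statement, and the table name are hypothetical; the numeric codes are the standard java.sql.Types values (INTEGER = 4, VARCHAR = 12, DOUBLE = 8). How the attributes actually get attached to the FlowFile (for example with an UpdateAttribute processor) is outside the scope of this sketch.

```python
# JDBC type codes as defined by java.sql.Types (a small subset).
JDBC_TYPES = {"INTEGER": 4, "VARCHAR": 12, "DOUBLE": 8}


def build_hiveql_attributes(params):
    """Turn an ordered list of (jdbc_type_name, value) pairs into attributes.

    Parameters are numbered from 1, in the order the ? placeholders appear.
    FlowFile attributes are strings, so both type and value are stringified.
    """
    attributes = {}
    for index, (type_name, value) in enumerate(params, start=1):
        attributes[f"hiveql.args.{index}.type"] = str(JDBC_TYPES[type_name])
        attributes[f"hiveql.args.{index}.value"] = str(value)
    return attributes


# Hypothetical FlowFile content and parameters, for illustration only.
statement = "INSERT INTO sales.orders VALUES (?, ?)"
attributes = build_hiveql_attributes([("INTEGER", 42), ("VARCHAR", "widget")])
```

Here the first ? is bound from hiveql.args.1.value using the type in hiveql.args.1.type, and the second from the hiveql.args.2.* pair.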

Tags:

sql, hive, put, database, update, insert

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Display Name	API Name	Default Value	Allowable Values	Description
Hive Database Connection Pooling Service	hive3-dbcp-service		Controller Service API: ClouderaHiveDBCPService Implementation: ClouderaHiveConnectionPool	The Hive Controller Service that is used to obtain connection(s) to the Hive database
Batch Size	hive-batch-size	100		The preferred number of FlowFiles to put to the database in a single transaction
Query timeout	hive3-query-timeout	0		Sets the number of seconds the driver will wait for a query to execute. A value of 0 means no timeout. NOTE: Non-zero values may not be supported by the driver. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
Character Set	hive3-charset	UTF-8		Specifies the character set of the record data.
Statement Delimiter	statement-delimiter	;		Delimiter used to separate SQL statements in a multi-statement script
Rollback On Failure	rollback-on-failure	false	true false	Specifies how errors are handled. By default (false), if an error occurs while processing a FlowFile, the FlowFile is routed to the 'failure' or 'retry' relationship based on the error type, and the processor continues with the next FlowFile. If you instead want to roll back the FlowFiles currently being processed and stop further processing immediately, enable this property. When enabled, failed FlowFiles stay in the input relationship without being penalized and are processed repeatedly until they are processed successfully or removed by other means. It is important to set an adequate 'Yield Duration' to avoid retrying too frequently.
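
To make the role of the Statement Delimiter concrete, here is a minimal Python sketch of how a multi-statement script could be broken into individual statements. It is a naive split under stated assumptions, not the processor's actual parsing logic: the function name split_statements is hypothetical, and delimiters that appear inside string literals or are escaped are not handled.

```python
def split_statements(script, delimiter=";"):
    """Split a script on the delimiter and drop empty fragments.

    Naive on purpose: a delimiter inside a quoted string would also split.
    """
    statements = []
    for part in script.split(delimiter):
        part = part.strip()
        if part:
            statements.append(part)
    return statements


# Hypothetical script, for illustration only.
script = "CREATE TABLE t (id INT); INSERT INTO t VALUES (1);"
statements = split_statements(script)
```

With the default delimiter, the script above yields two statements; the trailing delimiter does not produce an empty third one.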

Relationships:

Name	Description
retry	A FlowFile is routed to this relationship if the database cannot be updated but attempting the operation again may succeed
success	A FlowFile is routed to this relationship after the database is successfully updated
failure	A FlowFile is routed to this relationship if the database cannot be updated and retrying the operation will also fail, such as an invalid query or an integrity constraint violation

Reads Attributes:

Name	Description
hiveql.args.N.type	Incoming FlowFiles are expected to be parametrized HiveQL statements. The type of each Parameter is specified as an integer that represents the JDBC Type of the parameter.
hiveql.args.N.value	Incoming FlowFiles are expected to be parametrized HiveQL statements. The values of the parameters are specified as hiveql.args.1.value, hiveql.args.2.value, hiveql.args.3.value, and so on. The type of the hiveql.args.1.value parameter is specified by the hiveql.args.1.type attribute.

Writes Attributes:

Name	Description
query.input.tables	This attribute is written on the FlowFiles routed to the 'success' relationship, and contains input table names (if any) in comma-delimited 'databaseName.tableName' format.
query.output.tables	This attribute is written on the FlowFiles routed to the 'success' relationship, and contains the target table names in 'databaseName.tableName' format.
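
A downstream consumer might want these attributes as structured data. The Python sketch below parses a comma-delimited list of 'databaseName.tableName' entries into (database, table) pairs. The function name parse_table_list and the sample value are hypothetical, and the sketch assumes neither database nor table names contain commas or dots.

```python
def parse_table_list(attribute_value):
    """Parse 'db1.tableA,db2.tableB' into [(db1, tableA), (db2, tableB)]."""
    tables = []
    for entry in attribute_value.split(","):
        entry = entry.strip()
        if not entry:
            # An empty attribute means no tables were reported.
            continue
        # Split on the first dot: left is the database, right is the table.
        database, _, table = entry.partition(".")
        tables.append((database, table))
    return tables


# Hypothetical value of query.input.tables, for illustration only.
input_tables = parse_table_list("default.orders, sales.customers")
```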

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

None specified.