IcebergCatalogService
Implementations: HadoopCatalogService
HiveCatalogService
Specifies the Controller Service to use for handling references to the table’s metadata files. |
Catalog Namespace | catalog-namespace | | | The namespace of the catalog. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Table Name | table-name | | | The name of the Iceberg table to write to. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Equality Delete Field Strategy | Equality Delete Field Strategy | All Fields | - Primary and Partition Keys
- All Fields
| Columns for equality delete files are selected based on this strategy. Note: columns of type double or float cannot be used in equality delete files. |
Unmatched Column Behavior | unmatched-column-behavior | Fail on Unmatched Columns | - Ignore Unmatched Columns
- Warn on Unmatched Columns
- Fail on Unmatched Columns
| If an incoming record does not have a field mapping for all of the database table's columns, this property specifies how to handle the situation. |
File Format | file-format | | | File format to use when writing Iceberg data files. If not set, the 'write.format.default' table property is used; its default value is parquet. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Record Type | Record Type | Debezium | - Debezium
- GoldenGate
- Custom
| Specifies the type of the incoming CDC record. If the Custom record type is chosen, the operation type values for insert, delete and update must be specified. |
Record Reader | record-reader | | Controller Service API: RecordReaderFactory Implementations: EBCDICRecordReader JsonTreeReader GrokReader ReaderLookup IPFIXReader WindowsEventLogReader ParquetReader CSVReader Syslog5424Reader JASN1Reader ExcelReader CiscoEmblemSyslogMessageReader ScriptedReader ProtobufReader JsonPathReader XMLReader CEFReader SyslogReader AvroReader YamlTreeReader | Specifies the Controller Service to use for parsing incoming data and determining the data's schema. |
Insert Operation Type | Insert Operation Type | | | Specifies the operation type values that denote an insert operation in the incoming record.
This Property is only considered if the [Record Type] Property has a value of "Custom". |
Delete Operation Type | Delete Operation Type | | | Specifies the operation type values that denote a delete operation in the incoming record.
This Property is only considered if the [Record Type] Property has a value of "Custom". |
Update Operation Type | Update Operation Type | | | Specifies the operation type values that denote an update operation in the incoming record.
This Property is only considered if the [Record Type] Property has a value of "Custom". |
Operation RecordPath | Operation RecordPath | | | This property denotes a RecordPath that will be evaluated against each incoming Record in order to determine the operation type. The RecordPath must evaluate to one of the valid Iceberg Operation Types, or the incoming FlowFile will be routed to failure.
This Property is only considered if the [Record Type] Property has a value of "Custom". |
Before Data RecordPath | Before Data RecordPath | | | This property denotes a RecordPath that will be evaluated against each incoming Record and marks the record state before the operation.
This Property is only considered if the [Record Type] Property has a value of "Custom". |
After Data RecordPath | After Data RecordPath | | | This property denotes a RecordPath that will be evaluated against each incoming Record and marks the record state after the operation.
This Property is only considered if the [Record Type] Property has a value of "Custom". |
Failure Strategy | Failure Strategy | Route to Failure | - Route to Failure
- Rollback Session
| If one or more Records cannot be processed or the operation cannot be applied to the Iceberg table, specifies how to handle the failure. |
Maximum File Size | maximum-file-size | | | The maximum size of a data file. If this size is exceeded, a new file is generated with the remaining data. If not set, the 'write.target-file-size-bytes' table property is used; its default value is 512 MB. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Kerberos User Service | kerberos-user-service | | Controller Service API: KerberosUserService Implementations: KerberosTicketCacheUserService KerberosKeytabUserService KerberosPasswordUserService | Specifies the Kerberos User Controller Service that should be used for authenticating with Kerberos. |
Number of Commit Retries | number-of-commit-retries | 10 | | Number of times to retry a commit before failing. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Minimum Commit Wait Time | minimum-commit-wait-time | 100 ms | | Minimum time to wait before retrying a commit. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Maximum Commit Wait Time | maximum-commit-wait-time | 2 sec | | Maximum time to wait before retrying a commit. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Maximum Commit Duration | maximum-commit-duration | 30 sec | | Total retry timeout period for a commit. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
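The four commit properties above interact, so a small sketch may help. It is written in Python for illustration only and is not the processor's implementation: the names `CommitRetryConfig`, `backoff_schedule` and `commit_with_retries` are hypothetical, and the doubling of the wait time between retries is an assumption, since the table does not state how the wait grows between the minimum and the maximum.

```python
import time
from dataclasses import dataclass


@dataclass
class CommitRetryConfig:
    # Defaults copied from the property table above.
    num_retries: int = 10          # Number of Commit Retries
    min_wait_ms: int = 100         # Minimum Commit Wait Time
    max_wait_ms: int = 2_000       # Maximum Commit Wait Time
    total_timeout_ms: int = 30_000  # Maximum Commit Duration


def backoff_schedule(cfg, scale=2.0):
    """Return the wait (in ms) before each retry.

    Assumption: the wait grows by `scale` per retry, is capped at
    max_wait_ms, and retries stop once the summed waits would exceed
    total_timeout_ms. Time spent inside the commit itself is ignored.
    """
    waits = []
    elapsed = 0
    wait = cfg.min_wait_ms
    for _ in range(cfg.num_retries):
        if elapsed + wait > cfg.total_timeout_ms:
            break
        waits.append(wait)
        elapsed += wait
        wait = min(int(wait * scale), cfg.max_wait_ms)
    return waits


def commit_with_retries(commit, cfg, sleep=time.sleep):
    """Call commit() once, then retry after each wait in the schedule."""
    schedule = backoff_schedule(cfg)
    last_error = None
    for attempt in range(len(schedule) + 1):
        try:
            return commit()
        except RuntimeError as err:  # stand-in for a commit conflict
            last_error = err
            if attempt < len(schedule):
                sleep(schedule[attempt] / 1000.0)
    raise last_error
```

Under these assumptions the default values allow all ten retries, because the summed waits stay well below the 30 second limit. Lowering Maximum Commit Duration is what shortens the schedule.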
Relationships:
Name | Description |
---|---|
success | A FlowFile is routed to this relationship after all data operations were successful. |
failure | A FlowFile is routed to this relationship when a data operation fails. |
Reads Attributes:
None specified.
Writes Attributes:
Name | Description |
---|---|
iceberg.cdc.record.count | The number of CDC records in the FlowFile. |
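To make the notion of a CDC record and the record count more concrete, here is a minimal Python sketch of how one record might be classified and how the records in a FlowFile might be counted. It is not the processor's code. The functions `classify` and `summarize` are hypothetical, plain dictionary keys stand in for RecordPath expressions, and mapping the Debezium snapshot code `r` to an insert is an assumption. The `op`, `before` and `after` field names follow the Debezium change event envelope.

```python
from collections import Counter

# Debezium change events carry an "op" code plus "before"/"after" row images.
# Treating the snapshot code "r" as an insert is an assumption of this sketch.
DEBEZIUM_OPS = {"c": "insert", "r": "insert", "u": "update", "d": "delete"}


def classify(record, op_field="op", before_field="before",
             after_field="after", op_values=DEBEZIUM_OPS):
    """Return (operation, before_image, after_image) for one CDC record.

    The *_field arguments play the role of the Operation, Before Data and
    After Data RecordPath properties; op_values plays the role of the
    Insert/Update/Delete Operation Type properties.
    """
    code = record.get(op_field)
    if code not in op_values:
        # The documented behavior for an invalid operation is routing
        # to failure; here that is modeled as an exception.
        raise ValueError(f"unknown operation type: {code!r}")
    return op_values[code], record.get(before_field), record.get(after_field)


def summarize(records, **kwargs):
    """Return (total record count, per-operation counts)."""
    counts = Counter(classify(r, **kwargs)[0] for r in records)
    return sum(counts.values()), dict(counts)
```

A Custom record type would pass its own field names and operation values, for example `classify(rec, op_field="kind", before_field="old", after_field="new", op_values={"I": "insert"})`, where all of those names are illustrative.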
State management:
This component does not store state.
Restricted:
This component is not restricted.
System Resource Considerations:
None specified.