Querying live data from Kafka
You can use typical Hive queries to get useful information, including Kafka record metadata, from a table of Kafka data.
Each Kafka record consists of a user payload key (byte[]) and value (byte[]), plus the following metadata fields:
- Partition int32
- Offset int64
- Timestamp int64
A Hive row represents both parts of the Kafka record:
- The user payload serialized in the value byte array
- The metadata: key byte array, partition, offset, and timestamp fields
In the Hive representation of the Kafka record, the key byte array is called __key and is of type binary. You can cast __key at query time. Hive appends __key after the last column derived from the value byte array, and then appends the partition, offset, and timestamp columns after __key. These columns are named __partition, __offset, and __timestamp.
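The following queries are a minimal sketch of how the metadata columns might be used. The table name kafka_table is hypothetical and stands for any Hive table backed by a Kafka topic. The cast of __key to STRING assumes the producer wrote keys as UTF-8 text; choose a cast that matches your own key format.

```sql
-- kafka_table is a hypothetical Hive table backed by a Kafka topic.
-- The metadata column names begin with underscores, so quote them in backticks.

-- Inspect the metadata stored with each record.
-- Assumption: the key bytes are UTF-8 text, so a cast to STRING is meaningful.
SELECT CAST(`__key` AS STRING) AS record_key,
       `__partition`,
       `__offset`,
       `__timestamp`
FROM kafka_table
LIMIT 10;

-- Find the highest offset and the number of records in each partition.
SELECT `__partition`,
       MAX(`__offset`) AS latest_offset,
       COUNT(*) AS record_count
FROM kafka_table
GROUP BY `__partition`;
```

The first query reads the metadata alongside the payload columns. The second aggregates on the metadata alone, which can be a quick way to check how far each partition has been read.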