Ingesting Data into Apache Hive in CDP Public Cloud

Understand the use case
You can use Apache NiFi to move data from a range of locations into a Data Engineering cluster running Apache Hive in CDP Public Cloud.

Meet the prerequisites
Use this checklist to make sure that you meet all the requirements before you start building your data flow.

Configure the service account
Configure the Service Account you will use to ingest data into Hive.

Create the IDBroker mapping
To enable your CDP user to utilize the central authentication features CDP provides and to exchange credentials for AWS or Azure access tokens, you have to map this CDP user to the correct IAM role or Azure Managed Service Identity (MSI). The option to add/modify these mappings is available from the Management Console in your CDP environment.

Create the Hive target table
Before you can ingest data into Apache Hive in CDP Public Cloud, ensure that you have a Hive target table. These steps walk you through creating a simple table. Modify these instructions based on your data ingest target table needs.

Add Ranger policies
Add Ranger policies to ensure that you have write access to your Hive tables.

Obtain the Hive connection details
To enable an Apache NiFi data flow to communicate with Hive, you must obtain the Hive connection details by downloading several client configuration files. The NiFi processors require these files to parse the configuration values and use those values to communicate with Hive.

Build the data flow
From the Apache NiFi canvas, set up the elements of your data flow. This involves opening NiFi in CDP Public Cloud, adding processors to your NiFi canvas, and connecting the processors.

Configure the controller services
You can add Controller Services to provide shared services to be used by the processors in your data flow. You will use these Controller Services later when you configure your processors.

Configure the processor for your data source
You can set up a data flow to move data from many locations into Apache Hive. This example assumes that you are configuring ConsumeKafkaRecord_2_0. If you are moving data from a location other than Kafka, review Getting Started with Apache NiFi for information about how to build a data flow, and about other data consumption processor options.

Configure the processor for your data target
You can set up a data flow to move data into many locations. This example assumes that you are moving data into Apache Hive using PutHive3Streaming. If you are moving data into another location, review Getting Started with Apache NiFi for information about how to build a data flow, and about other data ingest processor options.

Start your data flow
Start your data flow to verify that you have created a working data flow and to begin your data ingest process.

Verify your data flow
Learn how you can verify the operation of your Hive ingest data flow.

Next steps
Find out what to do once you have moved data into Hive in CDP Public Cloud.
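
Example: composing a target table definition
As a rough illustration of the "Create the Hive target table" step, the following Python sketch builds a HiveQL CREATE TABLE statement for a simple table. The database name, table name, and columns are hypothetical placeholders, not values from this guide. The ORC storage format and the transactional table property reflect the general expectation that Hive streaming ingest targets ACID tables; treat both as assumptions and confirm them against the detailed table-creation steps for your environment.

```python
def build_create_table(database, table, columns):
    """Return a HiveQL CREATE TABLE statement for a simple ORC table.

    `columns` is a list of (name, hive_type) pairs. All names used in the
    example call below are hypothetical placeholders.
    """
    if not columns:
        raise ValueError("at least one column is required")
    # Backtick-quote identifiers so reserved words do not break the DDL.
    cols = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS `{database}`.`{table}` ({cols}) "
        # Assumption: the streaming target is a transactional ORC table.
        "STORED AS ORC TBLPROPERTIES ('transactional'='true')"
    )


# Hypothetical example table with two columns.
ddl = build_create_table(
    "retail",
    "customer",
    [("customer_id", "INT"), ("customer_name", "STRING")],
)
print(ddl)
```

You could run the printed statement from a Hive client such as Beeline, adjusting the columns to match the records your data flow will deliver.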