Analyzing your data with HBase

This topic describes how to create and use a Flink streaming application with an HBase sink in CDP Public Cloud.

  • You have a CDP Public Cloud environment.
  • You have a CDP username (it can be your own CDP user or a CDP machine user) and a password set to access Data Hub clusters.

    This user must have at least the EnvironmentUser predefined resource role. This resource role provides the ability to view Data Hub clusters and set the FreeIPA password for the environment.

  • Your user is synchronized to the CDP Public Cloud environment.
  • You have a Streaming Analytics cluster.
  • You have an Operational Database with SQL cluster in the same Data Hub environment as the Streaming Analytics cluster.
  • Your CDP user has the correct permissions set up in Ranger allowing access to HBase.
  1. Choose a source for your Flink application and add the connector to your application.
  2. Add HBase as a sink to your Flink application.
    The following code example shows how to build your application logic with an HBase sink:
    
    // Writes each QueryResult as one row of the ITEM_QUERIES table.
    HBaseSinkFunction<QueryResult> hbaseSink = new HBaseSinkFunction<QueryResult>("ITEM_QUERIES") {
      @Override
      public void executeMutations(QueryResult qresult, Context context, BufferedMutator mutator) throws Exception {
        // The query ID is the row key.
        Put put = new Put(Bytes.toBytes(qresult.queryId));
        // addColumn takes the column family, the column qualifier, and the value, in that order.
        put.addColumn(Bytes.toBytes("itemId"), Bytes.toBytes("str"), Bytes.toBytes(qresult.itemInfo.itemId));
        put.addColumn(Bytes.toBytes("quantity"), Bytes.toBytes("int"), Bytes.toBytes(qresult.itemInfo.quantity));
        mutator.mutate(put);
      }
    };
    // Flush buffered mutations to HBase at an interval of 1000 ms.
    hbaseSink.setWriteOptions(HBaseWriteOptions.builder()
      .setBufferFlushIntervalMillis(1000)
      .build()
    );
    streamqueryResultStream.addSink(hbaseSink);
  3. Start generating data to your source connector.
  4. Deploy your Flink streaming application.
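When you verify the rows written by the sink in step 2, it can help to know how the cell values are encoded. The following is a minimal sketch in plain Java, with no HBase dependency, of the byte layouts that the HBase `Bytes.toBytes` utility produces for a string (UTF-8 bytes) and for an int (4 bytes, big-endian). The class name `CellEncodingSketch`, its helper methods, and the sample values are hypothetical, and the sketch assumes that `itemId` is a `String` and `quantity` is an `int`, which the example above does not state.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class for illustration only; it is not part of Flink or HBase.
class CellEncodingSketch {

    // Same layout as HBase Bytes.toBytes(String): the UTF-8 bytes of the string.
    static byte[] encodeString(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Same layout as HBase Bytes.toBytes(int): 4 bytes, most significant byte first.
    // ByteBuffer uses big-endian byte order by default.
    static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    // Reverses encodeInt, comparable to HBase Bytes.toInt(byte[]).
    static int decodeInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt();
    }

    public static void main(String[] args) {
        // Sample values are assumptions, not taken from a real data set.
        String itemId = "item_42";
        int quantity = 5;

        // Column family "itemId", qualifier "str": readable text bytes.
        System.out.println("itemId:str   = " + Arrays.toString(encodeString(itemId)));
        // Column family "quantity", qualifier "int": fixed-width binary, not text.
        System.out.println("quantity:int = " + Arrays.toString(encodeInt(quantity)));
        System.out.println("decoded      = " + decodeInt(encodeInt(quantity)));
    }
}
```

Because the quantity is stored as a 4-byte binary value rather than as text, a quantity of 5 is stored as the bytes 0, 0, 0, 5, and a client that reads the table needs to decode it as an int instead of a string.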
You have the following options to monitor and manage your Flink applications: