Analyzing your data with HBase

After preparing your environment, you need to choose a source to which you connect Flink in Data Hub. After generating data to your source, Flink applies the computations you have added in your application design. The results are redirected to your HBase sink.

  • You have a CDP Public Cloud environment.
  • You have a CDP username (it can be your own CDP user or a CDP machine user) and a password set to access Data Hub clusters.

    The predefined resource role of this user is at least EnvironmentUser. This resource role provides the ability to view Data Hub clusters and set the FreeIPA password for the environment.

  • Your user is synchronized to the CDP Public Cloud environment.
  • You have a Streaming Analytics cluster.
  • You have an Operational Database with SQL cluster in the same Data Hub environment as the Streaming Analytics cluster.
  • Your CDP user has the correct permissions set up in Ranger allowing access to HBase.
  1. Choose a source for your Flink application and add the connector to your application.
  2. Add HBase as sink to your Flink application.
    The following code example shows how to build your application logic with an HBase sink:
    HBaseSinkFunction<QueryResult> hbaseSink = new HBaseSinkFunction<QueryResult>("ITEM_QUERIES") {
    public void executeMutations(QueryResult qresult, Context context, BufferedMutator mutator) throws Exception {
      Put put = new Put(Bytes.toBytes(qresult.queryId));
      put.addColumn(Bytes.toBytes("itemId"), Bytes.toBytes("str"),   Bytes.toBytes(qresult.itemInfo.itemId));
      put.addColumn(Bytes.toBytes("quantity"), Bytes.toBytes("int"), Bytes.toBytes(qresult.itemInfo.quantity));
  3. Start generating data to your source connector.
  4. Deploy your Flink streaming application.
You have the following options to monitor and manage your Flink applications: