Setting up your CDP pattern infrastructure
Before starting, ensure that your central or departmental IT has onboarded to CDP Public Cloud and registered an environment with Cloudera. Business Intelligence at Scale pattern leverages Streams Messaging Manager (SMM), Cloudera DataFlow (CDF), Cloudera Data Engineering (CDE), and Cloudera Data Warehouse (CDW) services. You must set up these services as part of your infrastructure.
Most steps required to set up your CDP pattern infrastructure should be performed by a
CDP administrator and require the
EnvironmentAdmin CDP role. Since the
goal of this pattern is to enable self-service DataFlow development, Data Engineering,
and Data Warehousing, make sure that the IT team has created the necessary users and
groups, and have granted the required permissions to these users and groups.
When your IT team creates and registers an environment with CDP, it is assumed that they create a medium-duty Data Lake to import and manage streaming data in the Streams Messaging Data Hub cluster, as well as when running other CDP services.
- Ensure that you have an available CDP environment
- Ensure that you have CDP login credentials
- Ensure that you have a running Data Lake
- Ensure that your CDP user is synchronized to the CDP Public Cloud environment
- Review the AWS environments requirements checklist
For test and development environments, you can set up small to medium sized clusters, and scale up for production use.
After you register your environment with CDP, you must also activate entitlements for using Unified Analytics and Data Visualization in CDW by contacting your Cloudera Account Representative.
The following diagram shows all the Data Services used to implement the Business Intelligence at Scale pattern: