Flow Management cluster layout

The Data Hub service provides two default Flow Management cluster definitions: Flow Management: Light Duty and Flow Management: Heavy Duty. Understanding the layout, capacity, and components of these definitions is essential for effective deployment.

Flow Management: Light Duty cluster layout

The Flow Management: Light Duty cluster definition is suitable for development, testing, or proof of concept scenarios.

The cluster has the following layout and node specifications:

  • NiFi and ZooKeeper are co-located on all instances.

  • Specifications for nodes hosting NiFi and ZooKeeper:
    • AWS: m5.2xlarge
    • Azure: D8_v3
    • GCE: e2-standard-8
  • Storage requirements per NiFi node:
    • AWS: 4 x 500GB EBS ST1
    • Azure: 4 x 500GB Standard SSD
    • GCE: 4 x 500GB PD-Standard
  • Each NiFi node hosts:
    • FlowFile repository
    • Content repository
    • Provenance repository
    • Log and Database repository

For more information, see the Instance types and Storage information specific to your cloud provider.
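
As a rough illustration of the storage figures above, the following Python sketch computes the raw volume capacity attached to Light Duty NiFi nodes. The function name and the 3-node cluster size are assumptions made for this example only; they are not part of the cluster definition.

```python
# Volumes attached to each NiFi node in the Light Duty definition,
# taken from the storage list above (4 x 500GB on every provider).
VOLUMES_PER_NODE = 4
VOLUME_SIZE_GB = 500


def raw_storage_gb(node_count: int) -> int:
    """Return the total raw attached storage, in GB, for node_count NiFi nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return node_count * VOLUMES_PER_NODE * VOLUME_SIZE_GB


if __name__ == "__main__":
    print(raw_storage_gb(1))  # 2000
    print(raw_storage_gb(3))  # 6000 (hypothetical 3-node cluster)
```

The result is raw capacity only. Usable space for each repository depends on how the volumes are allocated, which this sketch does not model.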

Flow Management: Heavy Duty cluster layout

The Flow Management: Heavy Duty cluster definition is intended for production scenarios.

The cluster has the following layout and node specifications:

  • NiFi and ZooKeeper run on separate nodes.

  • NiFi nodes scale independently of ZooKeeper.

  • Specifications for each ZooKeeper node:
    • AWS: m5.2xlarge
    • Azure: D8_v3
    • GCE: e2-standard-8
  • Specifications for each NiFi node:
    • AWS: m5.2xlarge
    • Azure: F16sv2
    • GCE: e2-standard-8
  • Storage requirements per NiFi node:
    • AWS: 4 x 1TB EBS GP2
    • Azure: 4 x 1TB Premium SSD
    • GCE: 4 x 1TB PD-SSD
  • Each NiFi node hosts:
    • FlowFile repository
    • Content repository
    • Provenance repository
    • Log and Database repository

For more information, see the Instance types and Storage information specific to your cloud provider.
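
To compare the two definitions side by side, the NiFi node specifications listed in this section can be captured in a small lookup table. The sketch below is illustrative only: the `NodeSpec` class, the `nifi_node_spec` helper, and the `"light"`/`"heavy"` keys are hypothetical names, and 1TB is treated as 1000 GB for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    instance_type: str
    volume_count: int
    volume_size_gb: int
    volume_type: str


# Keys are (definition, provider); values are copied from the lists in this section.
# Assumption: 1TB is represented as 1000 GB.
NIFI_NODE_SPECS = {
    ("light", "aws"): NodeSpec("m5.2xlarge", 4, 500, "EBS ST1"),
    ("light", "azure"): NodeSpec("D8_v3", 4, 500, "Standard SSD"),
    ("light", "gce"): NodeSpec("e2-standard-8", 4, 500, "PD-Standard"),
    ("heavy", "aws"): NodeSpec("m5.2xlarge", 4, 1000, "EBS GP2"),
    ("heavy", "azure"): NodeSpec("F16sv2", 4, 1000, "Premium SSD"),
    ("heavy", "gce"): NodeSpec("e2-standard-8", 4, 1000, "PD-SSD"),
}


def nifi_node_spec(definition: str, provider: str) -> NodeSpec:
    """Look up the NiFi node spec for a definition ("light" or "heavy") and provider."""
    key = (definition.lower(), provider.lower())
    if key not in NIFI_NODE_SPECS:
        raise ValueError(f"unknown definition/provider combination: {key}")
    return NIFI_NODE_SPECS[key]


if __name__ == "__main__":
    spec = nifi_node_spec("heavy", "Azure")
    print(spec.instance_type)                         # F16sv2
    print(spec.volume_count * spec.volume_size_gb)    # 4000
```

The table covers NiFi nodes only. In the Heavy Duty definition, ZooKeeper runs on separate nodes with their own instance types, as listed above.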