Recommendations for bulk operations in EFM

Learn about the guidance that Cloudera provides for setting the Edge Flow Manager (EFM) properties that control bulk operations.

Cloudera recommends the following when setting up EFM properties:
  • efm.operation.monitoring.rollingBatchOperationsSize: Set to 10-20% of the total number of agents, but do not exceed 1000.
  • efm.operation.monitoring.rollingBatchOperationsFrequency: Based on previous execution times, choose a frequency at which at most 25% of the rolling batch size frees up in a single iteration.
  • efm.monitor.maxHeartbeatInterval in combination with efm.operation.monitoring.inQueuedStateTimeoutHeartbeatRate: The maximum heartbeat interval should be close to the 75th percentile, so that you can keep inQueuedStateTimeoutHeartbeatRate at a value no higher than 3. If you need a higher rate, investigate why the affected agents do not meet these criteria.
  • efm.operation.monitoring.inDeployedStateTimeout in combination with efm.operation.monitoring.inDeployedStateCheckFrequency: The deployed state timeout should be 120% of the longest expected operation execution time. Set the state check frequency so that EFM checks the state no more than 4-10 times during the operation execution.
  • If you put together all the configurations explained in the scenarios, you can get an expected execution time formula: