Recommendations for bulk operations in EFM
Learn about the guidance that Cloudera provides for setting the Edge Flow Manager (EFM) properties that control bulk operations.
Cloudera recommends the following while setting up EFM properties:
efm.operation.monitoring.rollingBatchOperationsSize
: Set to 10-20% of the total number of agents, but do not exceed 1000.
efm.operation.monitoring.rollingBatchOperationsFrequency
: Based on previous execution times, choose a frequency at which at most 25% of the rolling batch size frees up in a single iteration.
efm.monitor.maxHeartbeatInterval in combination with efm.operation.monitoring.inQueuedStateTimeoutHeartbeatRate
: The maximum heartbeat interval should be close to the 75th percentile, so that you can keep inQueuedStateTimeoutHeartbeatRate at a value no greater than 3. If a higher rate is needed, investigate why the affected agents do not meet these criteria.
efm.operation.monitoring.inDeployedStateTimeout in combination with efm.operation.monitoring.inDeployedStateCheckFrequency
: The deployed state timeout should be 120% of the longest expected operation execution time. Set the state check frequency so that EFM checks the state 4-10 times during the operation execution.

If you put together all the configurations explained in the scenarios, you can get an expected execution time formula:
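As a rough illustration of the sizing recommendations listed above (not the execution time formula itself), the following Python sketch derives candidate values from a few measurements. The function name, its parameters, and the use of seconds are hypothetical. The sketch also assumes that the 75th percentile refers to observed agent heartbeat intervals and that the state check frequency is an interval between checks; neither is confirmed here. It does not cover efm.operation.monitoring.rollingBatchOperationsFrequency, which depends on your own execution history, or the value format that EFM expects for each property.

```python
from statistics import quantiles


def recommend_efm_settings(agent_count, heartbeat_intervals_s, longest_operation_s,
                           batch_percent=15, checks_per_operation=6):
    """Hypothetical helper: turn the recommendations into candidate numbers.

    All durations are plain seconds (an assumption of this sketch).
    """
    if not 10 <= batch_percent <= 20:
        raise ValueError("batch_percent should be between 10 and 20")
    if not 4 <= checks_per_operation <= 10:
        raise ValueError("checks_per_operation should be between 4 and 10")

    # 10-20% of all agents (rounded up), never more than 1000.
    batch_size = min(1000, max(1, (agent_count * batch_percent + 99) // 100))

    # Assumption: the 75th percentile is taken over observed heartbeat intervals.
    # quantiles(n=4) returns the three quartile cut points; index 2 is the 75th.
    p75_heartbeat_s = quantiles(heartbeat_intervals_s, n=4, method="inclusive")[2]

    return {
        "efm.operation.monitoring.rollingBatchOperationsSize": batch_size,
        "efm.monitor.maxHeartbeatInterval": p75_heartbeat_s,
        # Recommended upper bound; investigate agents that need more.
        "efm.operation.monitoring.inQueuedStateTimeoutHeartbeatRate": 3,
        # 120% of the longest expected operation execution time.
        "efm.operation.monitoring.inDeployedStateTimeout": 1.2 * longest_operation_s,
        # Assumption: treated as the interval between state checks.
        "efm.operation.monitoring.inDeployedStateCheckFrequency":
            longest_operation_s / checks_per_operation,
    }


if __name__ == "__main__":
    settings = recommend_efm_settings(
        agent_count=5000,
        heartbeat_intervals_s=[10, 20, 30, 40, 50],
        longest_operation_s=300,
    )
    for name, value in settings.items():
        print(f"{name} = {value}")
```

Treat the printed numbers as starting points to validate against your own environment, not as values to copy into the EFM configuration unchanged.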