Configurations of the setup

Learn about the configurations of the Edge Flow Manager (EFM) setup used for this guide.

EFM

EFM is configured to run in clustered mode. There are 3 instances, each running in Docker in a Kubernetes POD with 6 cores and 12 GB of RAM allocated.

Note that the PODs have exclusive use of their Kubernetes nodes, which minimizes the noisy neighbor issue.

The Java version used in this setup is Adoptium OpenJDK (Temurin) 11.0.15.
Properties Description
EFM properties

management.metrics.efm.enabled=true
management.metrics.export.prometheus.enabled=true
Metrics collection and metrics export to Prometheus are enabled.

logging.level.com.cloudera.cem.efm=ERROR
The log level is set to ERROR. The same setting is recommended for high-volume production environments, because logging affects performance and logs can quickly fill up the disks.

efm.db.maxConnections=150
The maximum number of database connections is increased from 50 to 150 so that EFM can serve the higher number of database connection requests that occur in some circumstances. Note that this is a per-instance property, so the 3-instance EFM cluster can open up to 450 connections in total against the single database.

efm.server.jetty.threads.max=600
The maximum number of Jetty threads is increased from 200 (default) to 600. This gives EFM a buffer of threads for new connections when some requests straggle.

efm.event.maxAgeToKeep.debug=0
The retention period for debug-level events is set to 0. In high-volume deployments, thousands of events are generated every minute, which would otherwise fill up the in-memory event store.
JVM properties

-XX:+UseG1GC
-XX:+UseStringDeduplication
-XX:+ParallelRefProcEnabled
The recommended garbage collector on Java 11 is G1. On Java 8, the CMS garbage collector is recommended. Always use the latest Java update so that you have all the fixes and backports.

-Xms8g
-Xmx8g
The heap is configured to 8 GB. It is recommended to set the initial and the maximum heap size to the same value.

Note that only 8 GB is allocated for the heap while the POD has 12 GB of memory available. The reason is that, besides the heap, other memory areas and entities also consume space, such as the metaspace, network buffers, and threads.
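
The heap sizing described above can be sanity-checked with a little arithmetic. The following is a minimal Python sketch and is not part of EFM. The function names jvm_heap_flags and non_heap_headroom_gb are hypothetical helpers introduced only for illustration, and the numbers are the ones used in this setup rather than a general sizing rule.

```python
from typing import List


def jvm_heap_flags(heap_gb: int) -> List[str]:
    """Build -Xms/-Xmx flags with the same initial and maximum heap size."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


def non_heap_headroom_gb(pod_memory_gb: int, heap_gb: int) -> int:
    """Memory left in the POD for metaspace, network buffers, threads, and so on."""
    if heap_gb >= pod_memory_gb:
        raise ValueError("the heap must be smaller than the POD memory")
    return pod_memory_gb - heap_gb


# Values from this setup: 12 GB POD memory, 8 GB heap.
print(jvm_heap_flags(8))            # ['-Xms8g', '-Xmx8g']
print(non_heap_headroom_gb(12, 8))  # 4
```

With the values of this setup, 4 GB of the POD memory remains for everything outside the heap.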

Database

EFM requires a relational database for storing persistent data. Although EFM can run with an in-memory H2 database, clustered setups and production environments require an external database instance.

MySQL is a popular relational database that is supported by EFM. This setup uses the MySQL 8.0.28 Docker image, which is publicly available on Docker Hub.

There is 1 instance running in a Kubernetes POD with 4 cores and 8 GB RAM allocated.
Configurations Description
--max_connections=451
The maximum number of connections is increased to 451 to stay in sync with the EFM configuration: 150 connections are configured per instance for the 3-instance cluster, 450 in total.

--innodb-dedicated-server=ON
For production deployments, set this option so that MySQL can use all the available resources on the hosting instance.
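
The relationship between the EFM connection pool size and the MySQL connection limit described above can be checked with a short calculation. The following is a minimal Python sketch under the assumption that the EFM instances are the only clients considered. The helper names total_efm_connections and spare_connections are hypothetical and exist only for this illustration.

```python
EFM_INSTANCES = 3                # instances in the EFM cluster
EFM_MAX_CONNECTIONS = 150        # efm.db.maxConnections, a per-instance value
MYSQL_MAX_CONNECTIONS = 451      # --max_connections on the MySQL instance


def total_efm_connections(instances: int, per_instance: int) -> int:
    """Upper bound on the connections the whole EFM cluster can open."""
    return instances * per_instance


def spare_connections(mysql_max: int, instances: int, per_instance: int) -> int:
    """Connections left on the database once every EFM pool is exhausted."""
    spare = mysql_max - total_efm_connections(instances, per_instance)
    if spare < 0:
        raise ValueError("MySQL max_connections is lower than the EFM cluster total")
    return spare


print(total_efm_connections(EFM_INSTANCES, EFM_MAX_CONNECTIONS))  # 450
print(spare_connections(MYSQL_MAX_CONNECTIONS, EFM_INSTANCES, EFM_MAX_CONNECTIONS))  # 1
```

If the number of EFM instances or the per-instance pool size changes, rerunning the same check shows whether the database limit still covers the cluster total.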