Using MiNiFi as a log collector pod in Kubernetes
Learn how to use MiNiFi as a log collector pod in Kubernetes.
If your Kubernetes deployment runs several pods, you can use MiNiFi to view the logs from those pods in one central location. To do so, set up a log collector pod (in a daemon set) that runs MiNiFi. MiNiFi collects the logs from the other pods and pushes them to the central location of your choice (for example, Kafka). Once the logs are in the central location, they can be searched, archived, and so on.
The MiNiFi flow in the log collector pod needs the following components:
- A KubernetesControllerService controller service
  You can configure which pods to collect logs from by setting the Pod Name Filter and Container Name Filter attributes on the KubernetesControllerService. If none of these are set, the default is to collect logs from all pods in the
- A TailFile processor with the following properties set:
  - Attribute Provider Service property set to the name of the KubernetesControllerService
  - tail-mode property set to
  - File to Tail property set to
  - tail-base-directory property set to
- Some other processor which uploads the log lines output by the TailFile processor somewhere (for example, PublishKafka)
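As a rough illustration of how the first two components refer to each other, the fragment below sketches a controller service and a TailFile processor in config.yml form. It is not a complete flow: the ids, the processor name, the filter values, and the exact class strings are assumptions made for illustration, and the downstream upload processor and the connections are left out.

```yaml
# Sketch only: ids, names, filter values, and class strings are hypothetical.
Controller Services:
  - name: KubernetesControllerService
    id: 00000000-0000-0000-0000-000000000001    # hypothetical id
    class: KubernetesControllerService          # assumed class string
    Properties:
      Pod Name Filter: 'my-app-.*'              # hypothetical value
      Container Name Filter: 'my-container'     # hypothetical value

Processors:
  - name: Tail Kubernetes log files             # hypothetical name
    id: 00000000-0000-0000-0000-000000000002    # hypothetical id
    class: org.apache.nifi.minifi.processors.TailFile   # assumed class string
    Properties:
      # Must match the `name` of the controller service defined above.
      Attribute Provider Service: KubernetesControllerService
      # tail-mode, File to Tail, and tail-base-directory also belong here,
      # set as described in this section.
```

The point of the sketch is the link by name: the TailFile processor finds the controller service through its Attribute Provider Service property, so renaming one without the other would break the flow.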
You can find a sample config.yml file, which contains all these settings, at
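The flow files output by the TailFile processor carry attributes that describe where each log line came from:
- The namespace of the pod.
- The name of the pod.
- The unique ID of the pod.
- The name of the container inside the pod.
- The location of the log file on the node; usually something like: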
You can process the flow files further before uploading them, for example with:
- The RouteOnAttribute processor to separate the flow files by any of the attributes above.
- The DefragmentText processor to merge multi-line log messages into a single flow file.
- The MergeContent processor to batch multiple log lines into a single flow file.
- The UpdateAttribute processor to create further attributes based on the existing ones.
Because you probably want the log collector pod to run on every node in your cluster, Cloudera recommends running it as a DaemonSet. For more information, see https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/.
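A minimal sketch of such a DaemonSet is shown below. The resource name, namespace, labels, and image are placeholders, and the manifest assumes that the nodes write pod logs under /var/log/pods (the usual kubelet location), so adjust the host path if your nodes differ. Any ServiceAccount or RBAC rules the collector may need, and the way the MiNiFi flow configuration is delivered to the container, are not shown.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: minifi-log-collector          # hypothetical name
  namespace: logging                  # hypothetical namespace
spec:
  selector:
    matchLabels:
      app: minifi-log-collector
  template:
    metadata:
      labels:
        app: minifi-log-collector     # must match the selector above
    spec:
      containers:
        - name: minifi
          image: registry.example.com/minifi:latest   # placeholder image
          volumeMounts:
            # Give MiNiFi read-only access to the node's pod log files.
            - name: pod-logs
              mountPath: /var/log/pods
              readOnly: true
      volumes:
        - name: pod-logs
          hostPath:
            path: /var/log/pods       # assumed log location on the node
```

Saved as, say, minifi-daemonset.yaml, the manifest could be applied with `kubectl apply -f minifi-daemonset.yaml`. Kubernetes then schedules one collector pod on each eligible node, including nodes that join the cluster later.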