Application Logging Made Simple with Kubernetes, Elasticsearch, Fluent Bit and Kibana
Posted on April 5, 2019
Today, we are going to talk about the EFK stack: Elasticsearch, Fluent Bit, and Kibana. You will learn about the stack and how to configure it to centralize logging for applications deployed on Kubernetes.
Application Logging Process Overview
The 3 components of the EFK stack are as follows:
- Elasticsearch
- Fluent Bit/Fluentd
- Kibana

We will focus on Fluent Bit, as it is the lightweight version of Fluentd and better suited to Kubernetes. Additionally, we will talk about how we reached the final solution and the hurdles we had to overcome. Last but not least, we’ll show you how we handled application logs without installing third-party clients like https://github.com/fluent/fluent-logger-java.
So, first off, Fluent Bit is the agent which tails your logs, parses them, and sends them to Elasticsearch. Elasticsearch stores them in different (or the same) indexes and provides an easy and powerful way to query your data. We use Kibana, a UI on top of Elasticsearch, to help you query your data either by using the Lucene query syntax or by clicking on certain values for certain labels. It’s useful if you need all the logs that have the log_level set to ERROR, for example. Generally speaking, Kibana makes visualization easier.
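For example, a query typed into Kibana’s search bar might look like this (the field names depend on how your logs are parsed; kubernetes.namespace_name is one of the fields added by the Fluent Bit kubernetes filter):
log_level:ERROR AND kubernetes.namespace_name:default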
Getting Started
The simplest way to start off with creating this stack is to deploy all components using Helm charts. If you do not know what Helm is or what Helm charts are, we will try to briefly explain the two. Helm is a package manager for Kubernetes, much like APT is a package manager for Debian. Helm charts are packages which describe different software components (databases, caches, etc.). A Helm chart is composed of two main parts:
- Templates
- Values
By using this approach we get rid of duplicate code and store only what changes from one environment to another in the values file. When you download a chart, you only need to change the values file to suit your needs. It’s useful if you want to change the size of the PVC, drop the PVC altogether, or change the resource limits on your containers. The sky is the limit.
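As a rough illustration of the two parts, here is a simplified fragment (not taken from any particular chart) showing how a template reads a value from the values file:
# templates/deployment.yaml (simplified template fragment)
        resources:
          limits:
            memory: {{ .Values.resources.limits.memory }}

# values.yaml
resources:
  limits:
    memory: 512Mi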
For the deployment of Elasticsearch and Kibana, we are not going to use Helm. We want to deploy a single-node Elasticsearch, and our Kubernetes cluster will be the one provided by Docker for Desktop. Fluent Bit will be installed by using its Helm chart.
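As a sketch, a single-node Elasticsearch can be deployed with a plain manifest along these lines (the image tag, resource sizes and the efk namespace are assumptions, chosen to match the service name used later in this post):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: elasticsearch
  namespace: efk
spec:
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.7.0
          env:
            # single-node mode: no cluster discovery needed
            - name: discovery.type
              value: single-node
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"
          ports:
            - containerPort: 9200
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: efk
spec:
  selector:
    app: elasticsearch
  ports:
    - port: 9200
      targetPort: 9200
Kibana can be deployed in the same way, pointed at http://elasticsearch.efk.svc.cluster.local:9200.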
As you can see, we have deployed our components:

And to generate some dummy logs, we have used the following command:
kubectl run --image=cloudhero/fakelogs fakelogs
The above command runs a pod from the cloudhero/fakelogs image that just outputs the same Java log every 5 seconds, to simulate multi-line logs.
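The entries follow the typical Spring Boot layout that the springboot parser further down expects; an illustrative (not verbatim) multi-line entry looks like this:
2019-04-05 10:15:32.123 ERROR 1 --- [main] com.example.demo.DemoApplication : Something went wrong
java.lang.IllegalStateException: simulated failure
        at com.example.demo.DemoApplication.run(DemoApplication.java:42)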
Method 1: Deploy Fluent Bit and Send All the Logs to the Same Index
The first thing which everybody does: deploy the Fluent Bit daemonset and send all the logs to the same index. The results are shown below:


As you can see, our application log went into the same index as all the other logs and was parsed with the default Docker parser. This presents the following problems:
- All logs go into the same index. This makes searches more complicated and slower, as Elasticsearch has to look through all the logs. You also do not have control over log cleanup: maybe you need to keep application logs for 30 days, but clean up all other logs after 7 days.
- All logs are parsed with the same parser. As we all know, not all applications have the same log format, so some logs will be parsed correctly while others will be parsed incorrectly, or not at all.
- You could specify multiple INPUT and OUTPUT plugins in your Fluent Bit configuration file, but that would lead to duplicate logs, which could end up being very costly in terms of disk space.
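For reference, this method boils down to a configuration of roughly the following shape: one tail input over all container logs, the kubernetes filter, and a single Elasticsearch output (a simplified sketch, not the Helm chart's exact defaults):
[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Parser            docker
    Tag               kube.*
    Refresh_Interval  5
    Mem_Buf_Limit     5MB
    Skip_Long_Lines   On

[FILTER]
    Name      kubernetes
    Match     kube.*
    Kube_URL  https://kubernetes.default.svc:443
    Merge_Log On

[OUTPUT]
    Name            es
    Match           *
    Host            elasticsearch.efk.svc.cluster.local
    Port            9200
    Logstash_Format On
    Logstash_Prefix kubernetes_cluster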
Method 2: Fluent Bit and Running Logrotate as a Sidecar for Application Logging
The next thing we can do is deploy our applications with Fluent Bit and logrotate sidecars, and direct the stdout of your application to a shared emptyDir volume. Below is an example of how you can do this with the cloudhero/fakelogs image:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fakelogs
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fakelogs
  template:
    metadata:
      labels:
        app: fakelogs
    # I have chosen user 1000 on every container because logrotate will not run as root.
    spec:
      containers:
        - name: fakelogs
          image: cloudhero/fakelogs
          command: ["/bin/sh", "-c"]
          args: ["./main > /app/log/app.log"]
          volumeMounts:
            - name: log-volume
              mountPath: /app/log
          securityContext:
            allowPrivilegeEscalation: false
            runAsUser: 1000
        - name: fluentbit
          image: fluent/fluent-bit:1.0.6
          volumeMounts:
            - name: log-volume
              mountPath: /app/log
            - name: config
              mountPath: /fluent-bit/etc/fluent-bit.conf
              subPath: fluent-bit.conf
            - name: config
              mountPath: /fluent-bit/etc/parsers_springboot.conf
              subPath: parsers_springboot.conf
          securityContext:
            allowPrivilegeEscalation: false
            runAsUser: 1000
        - name: logrotate
          image: cloudhero/logrotate
          volumeMounts:
            - name: log-volume
              mountPath: /app/log
            - name: config
              mountPath: /etc/logrotate.conf
              subPath: logrotate.conf
            - name: config
              mountPath: /etc/logrotate.d/logrotate-java.conf
              subPath: logrotate-java.conf
          securityContext:
            allowPrivilegeEscalation: false
            runAsUser: 1000
      volumes:
        - name: log-volume
          emptyDir: {}
        - name: config
          configMap:
            name: fakelogs-configmap
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fakelogs-configmap
  labels:
    component: fakelogs-configmap
data:
  logrotate.conf: |-
    weekly
    rotate 4
    create
    tabooext + .apk-new
    compress
    include /etc/logrotate.d
  logrotate-java.conf: |-
    /app/log/app.log {
      hourly
      size 100M
      missingok
      rotate 1
      compress
      notifempty
      create 0640 root root
    }
  fluent-bit.conf: |-
    [SERVICE]
        Flush        1
        Daemon       Off
        Log_Level    info
        Parsers_File parsers_springboot.conf
    [INPUT]
        Name             tail
        Path             /app/log/app.log
        Multiline        on
        Parser_Firstline springboot
    [OUTPUT]
        Name            es
        Match           *
        Host            elasticsearch.efk.svc.cluster.local
        Port            9200
        Logstash_Format On
        Retry_Limit     False
        Type            flb_type
        Time_Key        @timestamp
        Logstash_Prefix fakelogs
  parsers_springboot.conf: |-
    [PARSER]
        Name        springboot
        Format      regex
        Regex       /^(?<date>[0-9]+-[0-9]+-[0-9]+\s+[0-9]+:[0-9]+:[0-9]+.[0-9]+)\s+(?<log_level>[Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr?(?:or)?|ERR?(?:OR)?|[Cc]rit?(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)\s+(?<pid>[0-9]+)\s+---\s+\[(?<thread>.*)\]\s+(?<class_name>.*)\s+:\s+(?<message>.*)$/
        Time_Key    time
        Time_Format %Y-%m-%
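Assuming you saved both manifests to a single file, you can deploy and inspect the sidecars like this (the file and pod names are placeholders):
kubectl apply -f fakelogs.yaml
kubectl get pods -l app=fakelogs
kubectl logs <fakelogs-pod-name> -c fluentbit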
With this setup, we have fixed the problems with the prior setup. Results are shown below:


As you can see, our logs are parsed and have their own index, so we don’t interfere with the kubernetes_cluster-* index, where the rest of the logs are located.
As good as this method may seem to be, we are still facing some problems:
- kubectl logs will not output anything anymore, since the application no longer writes to stdout.
- You need a logrotate sidecar to take care of your log file if you are redirecting the output from stdout. There are also cases where logrotate will not even work; take Nginx, for example: when rotating the log file, you need to send a signal to the Nginx PID, which is not possible when logrotate runs in another container.
- You use up more resources by having a minimum of 3 containers per pod.
How to Solve Your Problems
For the first problem, we came up with the solution of an Nginx sidecar which serves the application logs as a web page. But this adds another container to our pod, taking the number of containers in the pod to 4.
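A minimal sketch of what such an Nginx sidecar could serve, assuming it mounts the same log volume (this is not the exact configuration we used):
server {
    listen 8080;
    location /logs/ {
        alias /app/log/;          # the shared emptyDir volume
        autoindex on;             # list the rotated log files
        default_type text/plain;
    }
}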
For the second problem, you could create an image which runs both Nginx and logrotate. Yet, it would force you to add more things to your container, like SupervisorD, to handle process failures. We have also tried writing to a named pipe instead of a file, but the Fluent Bit tail plugin does not work on pipes (the head plugin seems to work, but we have concluded that it is not very reliable). Moreover, we tried piping the logs to netcat and sending them over the network, but the forward plugin does not work this way, and the TCP plugin expects JSON output and does not support parsing.
The third problem is solvable only if you create an image with all 3 processes, which is not advisable.
Method 3: Using Fluent Bit and Kubernetes Annotations for Application Logging
Recently, Fluent Bit has added support for Kubernetes annotations. Currently, there are two annotations (https://docs.fluentbit.io/manual/filter/kubernetes#kubernetes-annotations):
- fluentbit.io/parser[_stream][-container]
- fluentbit.io/exclude
The first one is cool, but it still does not enable you to send logs to different indexes in Elasticsearch. The second one is the interesting one. It lets you exclude your application logs from the main tailing process (which tails /var/log/containers/*), and then create separate INPUT and OUTPUT stages in your Fluent Bit configuration file for each application.
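To use it, you add the annotation to the pod template of the application you want to pull out of the main stream; for the fakelogs deployment from earlier, that looks like this:
  template:
    metadata:
      labels:
        app: fakelogs
      annotations:
        fluentbit.io/exclude: "true"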
First off, you turn on K8S-Logging.Exclude in the main kubernetes filter, and create a new filter which takes care of your application logs:
[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_URL            https://kubernetes.default.svc:443
    Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
    Merge_Log           On
    K8S-Logging.Exclude On

[FILTER]
    Name                kubernetes
    Match               app-.*
    Kube_URL            https://kubernetes.default.svc:443
    Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
    Merge_Log           On
    K8S-Logging.Exclude Off
You then write another INPUT/OUTPUT pair in the config:
[INPUT]
    Name              tail
    Path              /var/log/containers/fakelogs*_default_fakelogs-*.log
    Tag               app-fakelogs.*
    Refresh_Interval  5
    Mem_Buf_Limit     5MB
    Skip_Long_Lines   On
    Multiline         on
    Parser_Firstline  springboot

[OUTPUT]
    Name            es
    Match           app-fakelogs.*
    Host            elasticsearch.efk.svc.cluster.local
    Port            9200
    Logstash_Format On
    Retry_Limit     False
    Type            flb_type
    Time_Key        @timestamp
    Logstash_Prefix fakelogs
This works because the log names in the /var/log/containers folder are in the <deployment_name>*_<namespace>_<container>-*.log format.
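For example, the file for our fakelogs pod would be named something like this (the pod hash and container ID are made up and shortened):
/var/log/containers/fakelogs-7f9c6bd7c4-x2kqp_default_fakelogs-3f2a1b0c9d8e.log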
This method gives you a centralized configuration file for all your applications, removes the need for additional Fluent Bit processes, gives you maximum flexibility for your parsers and Elasticsearch indexes, and still lets you have your logs when running the kubectl logs command!
I hope you found this blog post useful and that it will help you debug your applications faster! Here are five tips and tricks that will help you manage your Elasticsearch clusters.