Simplified guide to logging Docker to Elasticsearch in 2019 (With syslog-ng)

(Last Updated On: 02/19/2019)

This simplified guide to logging Docker to Elasticsearch shows you how to send the logs of your containers into Elastic. Although there are many tutorials on logging Docker to Elasticsearch, this one is different because it uses syslog-ng. You can visualize the logs on a nice dashboard in Kibana, and you can download everything at the end of the post!

Update: I moved the chapters about parsing and visualizing NGINX / Apache access logs in Kibana into a dedicated post. I hope it improves the readability of both subjects.

Docker logs in a Kibana dashboard

Why store the logs of containers?

I have a Docker environment on my Linux home server. It provides the different services I use and sandboxes to try out new things. I already have a central log server that stores logs from different sources; the logs of the Docker containers are yet to be added.
Although Docker acts like a machine within the machine, it is still possible to collect the logs of the containers and store them on the same central log server. They will be handy both for troubleshooting and analysis.

Choosing the right Docker log driver

The format of the logs can differ from container to container. Fortunately, the Docker daemon provides multiple drivers to collect the logs and forward them to other systems regardless of their format.

Depending on which log driver we use (Docker supports many of them), it can even attach metadata to the logs.

The most obvious driver is ‘syslog’, but the documentation mentions an important disadvantage:

“Needs to be set up as highly available (HA) or else there can be issues on container start if it’s not available”

Although this is the only driver which supports TLS transport, it is overkill to set up an HA syslog infrastructure solely to collect container logs. I think journald is a better alternative to syslog.

I chose the journald driver over syslog because:

  1. All mainstream Linux distributions ship journald, and it works out of the box (a global default driver is also possible, see the sketch after this list).
  2. The service runs on the host and collects system logs even when no container is running.
  3. It can collect Docker daemon logs as well, not just the logs of the containers.
  4. The docker logs feature can be used in parallel with journald.
  5. Regardless of what the documentation says, it also supports logging tags.
  6. The only downside is that journald stores logs in a binary format, so processing them takes extra effort. Luckily for us, syslog-ng can decode them, and it can also relay the log messages via TLS if you still need that feature.
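
If you would rather not repeat the logging section in every docker-compose service, the driver can also be set globally. The snippet below is only a minimal sketch of that approach, assuming a standard installation where the Docker daemon reads /etc/docker/daemon.json; the per-service settings in the compose file later in this post would override it anyway.

# Optional sketch: make journald the default log driver for every container.
# Assumes the Docker daemon reads /etc/docker/daemon.json (the default location).
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "log-driver": "journald"
}
EOF
# The daemon has to be restarted; already running containers keep their old driver
# until they are recreated.
sudo systemctl restart docker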

Orchestrating containers by docker-compose

I use docker-compose for container orchestration because it is much simpler to use in smaller server environments than Kubernetes. Plus, you can use Docker Swarm if you like.

The following chart gives you an overview of a single-host system serving Nextcloud sites. From our perspective it is just PHP services and HTTP servers that are reachable from the Internet through a reverse proxy.

Docker to Elastic logging architecture with syslog-ng

Every Docker container sends its log messages to journald. Syslog-ng reads the journals and sends the processed messages on to Elasticsearch, which runs in the same Docker environment.

The logging daemon stores the logs on the local filesystem as well, so no logs are lost if Elastic goes down.
The logs on the host are in the same format as they are in Elastic, so you can reuse them anytime.

By default journald does not persist journals across system reboots, or only keeps them for a limited time.
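
If you want the journals themselves to survive reboots, persistent storage can be enabled on the host. The following is a minimal sketch for a systemd-based distribution; check your distribution's defaults before changing journald.conf.

# Sketch: switch journald to persistent storage so container logs survive reboots
sudo mkdir -p /var/log/journal
sudo sed -i 's/^#\?Storage=.*/Storage=persistent/' /etc/systemd/journald.conf
sudo systemctl restart systemd-journald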

Elastic works for me as a platform to visualize and analyze logs. I do not treat it as long-term storage, although it could work like that. If I need more server resources, I simply delete Elastic and bootstrap it from the local logs whenever I need it again.
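
Because the local files already contain one JSON document per line (the same format syslog-ng later sends to Elastic), re-feeding them is mostly a matter of wrapping each line in a bulk action. Below is a rough, hypothetical sketch of such a replay; the file path, index name and address match the examples used later in this post.

# Hypothetical sketch: replay a locally stored JSON log file into Elasticsearch
# through the _bulk API. Adjust the path, index and address to your setup.
while read -r line; do
    printf '{ "index":{} }\n%s\n' "$line"
done < /var/log/docker/2019.02.04/balagedocker_nginx_proxy_1.json \
  | curl -s -H 'Content-Type: application/x-ndjson' \
         -XPOST 'http://172.20.0.40:9200/docker-containers/test/_bulk' --data-binary @-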

Configuring Docker daemon to store logs of containers in journald

Please note that I only use Nextcloud as an example. The sole purpose of this docker-compose YAML file is to show how a similar system could be configured. You can swap Nextcloud for any service you may already have in your containers.

version: '3'
services:

    nginx_proxy:
        image: nginx:1.14
        ports:
            - "80:80"
            - "443:443"
        volumes:
            - ./reverse_proxy.conf:/etc/nginx/conf.d/reverse_proxy.conf:ro
            - ./ssl:/etc/nginx/ssl:ro
        depends_on:
            - nginx_nextcloud_1
            # a second backend (nginx_nextcloud_2) exists on the diagram but is omitted here for brevity
            - elasticsearch
        networks:
            - example-net
        logging:
            driver: journald
            options:
                labels: "application"
                tag: "nginx"
        labels:
            application: "reverse_proxy"

    nginx_nextcloud_1:
        image: nginx:1.14
        volumes:
            - ./backend_nextcloud/:/etc/nginx:ro
            - /srv/www/nextcloud_1:/var/www
        depends_on:
            - php-fpm-nextcloud_1
        networks:
            - example-net
        logging:
            driver: journald
            options:
                labels: "application"
                tag: "nginx"
        labels:
            application: "nextcloud_1"

    php-fpm-nextcloud_1:
        image: php:7.2-fpm
        volumes:
            - /srv/www/nextcloud_1:/var/www
        networks:
            - example-net
        extra_hosts:
            - "mariadb:172.18.0.100"
        logging:
            driver: journald
            options:
                labels: "application"
                tag: "php"
        labels:
            application: "nextcloud_1"

    elasticsearch:
        image: "docker.elastic.co/elasticsearch/elasticsearch:6.5.0"
        hostname: elasticsearch
        ports:
            - "9200:9200"
        environment:
            - bootstrap.memory_lock=true
            - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
        volumes:
            - elasticdata:/usr/share/elasticsearch/data
            - "./elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml"
        ulimits:
            memlock:
              soft: -1
              hard: -1
        logging:
            driver: journald
            options:
                labels: "application"
                tag: "elasticsearch"
        labels:
            application: "elastic"
        networks:
            example-net:
                ipv4_address: 172.20.0.40

    kibana:
        image: "docker.elastic.co/kibana/kibana:6.5.0"
        hostname: kibana
        ports:
            - "5601:5601"
        volumes:
            - kibanadata:/usr/share/kibana/data
            - "./kibana.yml:/usr/share/kibana/config/kibana.yml"
        logging:
            driver: journald
            options:
                labels: "application"
                tag: "kibana"
        labels:
            application: "kibana"
        networks:
            - example-net

volumes:
    elasticdata:
        driver: local
    kibanadata:
        driver: local

networks:
    example-net:
        driver: bridge
        driver_opts:
            com.docker.network.bridge.name: br-docker-0
        ipam:
            config:
                - subnet: 172.20.0.0/16

Explaining the logging-related options

To log Docker to Elasticsearch, we first need to use the journald driver to collect the logs of the containers. The following Compose snippet does exactly that.

services:
    nginx_proxy:
        logging:
            driver: journald
            options:
                labels: "application"
                tag: "nginx"
        labels:
            application: "reverse_proxy" 

The logging→options→labels:application key-value pair maps to labels→application. Effectively it adds the application:reverse_proxy label to the container called nginx_proxy; a handy piece of metadata we can use in the Elastic index later.

Defining the tag (logging→options→tag:nginx) is very important. It overwrites the PROGRAM field of the syslog messages, in this case to nginx. Without it, all logs from the containers would appear to originate from the program name dockerd.

See the effect without “tag” being specified:

Nov 02 11:39:29 microchuck dockerd[1982]: 1.2.3.4 - - [02/Nov/2018:11:39:29 +0100] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36" "-"
Nov 02 11:52:23 microchuck dockerd[1982]: 1.2.3.4 - - [02/Nov/2018:11:52:23 +0100] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/601.7.7 (KHTML, like Gecko) Version/9.1.2 Safari/601.7.7" "-" 

And with “tag” specified:

Feb 04 00:20:39 microchuck nginx[1980]: 1.2.3.4 - - [04/Feb/2019:00:20:39 +0100] "GET /.well-known/security.txt HTTP/1.1" 404 169 "-" "-" "-"
Feb 04 00:20:52 microchuck nginx[1980]: 1.2.3.4 - - [04/Feb/2019:00:20:52 +0100] "GET /favicon.ico HTTP/1.1" 404 169 "-" "python-requests/2.10.0" "-" 
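
You can check that the tag is applied even before anything reaches Elastic. The journald driver stores the tag in the syslog identifier and also attaches container metadata fields, so both of the queries below should work; the tag used here is the one from my setup, adjust it to yours.

# Filter by the syslog identifier that the "tag" option sets
sudo journalctl -t nginx --since "1 hour ago"

# Or filter by the container metadata fields added by the journald driver
sudo journalctl CONTAINER_TAG=nginx -o verbose | head -n 40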

Explaining the network-related options

I need a custom network setup to assign a fixed IP address to Elastic on a bridged interface. Syslog-ng needs this to address it from outside of the Docker containers: although DNS resolution works inside that Docker network, it does not work outside of it.

services: 
    elasticsearch:
        ports: - "9200:9200"
        networks:
            example-net:
                ipv4_address: 172.20.0.40

networks:
    example-net:
        driver: bridge
        driver_opts:
            com.docker.network.bridge.name: br-docker-0
        ipam:
            config:
                - subnet: 172.20.0.0/16

Explaining the database-related options

You may have noticed in the architecture diagram that I did not place the database servers inside a container. I consider it bad practice to run a database server in containers. It could work for development purposes, and although it is technically feasible, I have had bad experiences with it (data loss or corruption, slow startup).

An SQL server is installed on the host. The PHP containers can access it on a specified host name with the help of the extra_hosts setting of docker-compose.

 services:
    php-fpm-nextcloud_1:
        networks:
            - example-net
        extra_hosts:
            - "mariadb:172.18.0.20"

With the help of these examples you should be able to make the Docker daemon log into journald.

Managing container log messages by syslog-ng

Some metadata values, like the label and the tag, were already set by Docker Compose. The next step is to make syslog-ng read the containers' logs from journald.

Collect and store container logs locally

I provide a syslog-ng configuration file, similar to the ones in my posts about setting up a network source to collect OpenWRT logs and getting GeoIP metadata sent into Elasticsearch.

The basic configuration only stores container logs locally. It will be incrementally extended in the next chapters to reach its full potential.

# log messages are actually created by Docker Daemon
filter f_dockerd {"${.journald._COMM}" eq "dockerd"};

# put the files locally, either for long term or just for troubleshooting
destination d_docker_file {
    file(
        "/var/log/docker/$S_YEAR.$S_MONTH.$S_DAY/${.journald.CONTAINER_NAME}.json"
        template("$(format_json --rekey .journald.* --shift-levels 2 --scope rfc5424 --key HOST --key ISODATE --key .journald.APPLICATION --key .journald.CONTAINER_*)\n")
        create-dirs(yes)
    );
};

log {
    source(src);
    filter(f_dockerd);
    destination(d_docker_file);
};

The main building blocks are the usual ones: source drivers, filters, destination drivers, and log paths which wire them together. You may notice that the source driver is not defined here, only referenced as source(src) from the main syslog-ng.conf file.

Notice: on Debian-based systems the default source is usually called “s_src”. On openSUSE it is simply “src”, which is what I use here.
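
For reference, a minimal default source usually looks like the sketch below; the system() driver automatically includes the systemd journal on systemd-based hosts, which is where the .journald.* fields used above come from. Your distribution's default syslog-ng.conf most likely already contains something equivalent, so treat this only as an illustration.

# Sketch of a typical default source (already present in most syslog-ng.conf files)
source src {
    system();    # includes the systemd journal on systemd-based systems
    internal();  # syslog-ng's own messages
};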

The filter called f_dockerd selects the logs created by the Docker daemon. Syslog-ng exposes journald's _COMM field, which we can use for filtering.

The file destination driver uses a template for file names and directories. It is a nice feature which helps you structure the logs by date and container name.

Save the config as “/etc/syslog-ng/conf.d/docker-journal-elastic.conf”, then reload the syslog-ng service.
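
Before reloading, it is worth letting syslog-ng validate the configuration; a typo in a template is much easier to catch here than after a silent reload.

# Parse the configuration without starting the daemon
sudo syslog-ng --syntax-only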

user@microchuck:~> sudo systemctl reload syslog-ng

Check the results either by looking into the journals:

user@microchuck:~> sudo journalctl -u docker --since="1 min ago"
-- Logs begin at Sun 2018-07-01 14:51:52 CEST, end at Mon 2019-02-04 13:25:12 CET. --
Feb 04 13:24:42 microchuck php[1980]: 172.20.0.10 -  04/Feb/2019:13:24:42 +0100 "GET /apps/notifications/api/v2/notifications" 200
Feb 04 13:24:42 microchuck nginx[1980]: 172.20.0.30 forwarded for 1.2.3.4 - - [04/Feb/2019:13:24:42 +0100]  "GET /oc/ocs/v2.php/apps/notifications/api/v2/notifications HTTP/1.0" 200 74 "-" "Mozilla/5.0 (Ma>
Feb 04 13:24:42 microchuck nginx[1980]: 1.2.3.4 - - [04/Feb/2019:13:24:42 +0100] "GET /oc/ocs/v2.php/apps/notifications/api/v2/notifications HTTP/1.1" 200 74 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.>
Feb 04 13:25:12 microchuck php[1980]: 172.20.0.10 -  04/Feb/2019:13:25:12 +0100 "GET /apps/notifications/api/v2/notifications" 200
Feb 04 13:25:12 microchuck nginx[1980]: 172.20.0.30 forwarded for 1.2.3.4 - - [04/Feb/2019:13:25:12 +0100]  "GET /oc/ocs/v2.php/apps/notifications/api/v2/notifications HTTP/1.0" 200 74 "-" "Mozilla/5.0 (Ma>
Feb 04 13:25:12 microchuck nginx[1980]: 1.2.3.4 - - [04/Feb/2019:13:25:12 +0100] "GET /oc/ocs/v2.php/apps/notifications/api/v2/notifications HTTP/1.1" 200 74 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.

Or by checking the contents of the /var/log/docker/ directory:

user@microchuck:~> sudo tree -L 2 /var/log/docker
/var/log/docker
├── 2019.02.03
│   ├── balagedocker_nginx_proxy_1.json
│   ├── balagedocker_php-fpm-nextcloud_1_1.json
│   └── balagedocker_php-fpm-nextcloud_2_1.json
└── 2019.02.04
    ├── balagedocker_nginx_proxy_1.json
    ├── balagedocker_php-fpm-nextcloud_1_1.json
    └── balagedocker_php-fpm-nextcloud_2_1.json

Transfer log messages into Elasticsearch

Syslog-ng already collects and stores the messages of the containers on the local filesystem, both in the volatile journals and in persistent text files.
Let's extend the previous config file by adding a new destination driver for Elastic and connecting it to the existing log path, so the same logs are also sent to Elastic.

Check the differences against the previous config to see what has changed.
If you have any problems applying this output, don't worry: I will post a full downloadable example at the end of this post.

user@microchuck:~> diff -u docker-journal.conf docker-journal-elastic.conf
--- docker-journal.conf 2019-02-04 13:30:44.926795685 +0100
+++ docker-journal-elastic.conf 2019-02-04 13:30:32.358695533 +0100
@@ -8,8 +8,20 @@
     );
 };
 
+destination d_elastic_docker {
+    http(url("http://172.20.0.40:9200/docker-containers/test/_bulk")
+        method("POST")
+        flush-lines(3)
+        workers(4)
+        headers("Content-Type: application/x-ndjson")
+        body-suffix("\n")
+        body("{ \"index\":{} }\n$(format-json --rekey .journald.* --shift-levels 2 --scope rfc5424 --key HOST --key ISODATE --key .journald.APPLICATION --key .journald.CONTAINER_*)\n")
+    );
+};
+
 log {
     source(src);
     filter(f_dockerd);
     destination(d_docker_file);
+    destination(d_elastic_docker);
 };

I have hard-coded values like the IP, port, index and type of Elastic in the config. Do not forget to adjust them to suit your needs.

One could also use reusable config blocks with variables in syslog-ng to reduce the boilerplate. This time I flattened the configuration for the sake of simplicity and readability.
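
For the curious, such a reusable block could look roughly like the sketch below. The block name and its parameters are made up for illustration; only the http() options inside mirror the real destination above.

# Hypothetical sketch of a reusable destination block with parameters
block destination elastic_bulk(address("172.20.0.40:9200") index("docker-containers")) {
    http(
        url("http://`address`/`index`/test/_bulk")
        method("POST")
        flush-lines(3)
        workers(4)
        headers("Content-Type: application/x-ndjson")
        body-suffix("\n")
        body("{ \"index\":{} }\n$(format-json --rekey .journald.* --shift-levels 2 --scope rfc5424 --key HOST --key ISODATE --key .journald.APPLICATION --key .journald.CONTAINER_*)\n")
    );
};

# It could then replace d_elastic_docker like this:
destination d_elastic_docker {
    elastic_bulk(index("docker-containers"));
};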

You are ready to reload syslog-ng for the changes to take effect. After a couple of logs have reached Elastic, make sure you create an index pattern for the documents; it is mandatory to be able to discover them in Kibana.
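
Before heading to Kibana, you can confirm from the command line that documents are really arriving; a quick sketch using the address and index from the destination above:

# List the index and its document count
curl -s 'http://172.20.0.40:9200/_cat/indices/docker-containers?v'
curl -s 'http://172.20.0.40:9200/docker-containers/_count?pretty'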

Discovering container logs in Kibana

If you followed the steps successfully, you will have an index pattern called “docker-containers*”. Try to browse the log messages in the Kibana→Discover menu.

The following screenshot shows an example from my system. The most interesting metadata fields are visible on the left:
APPLICATION, CONTAINER_NAME and CONTAINER_TAG. Of course, the standard fields from RFC 5424 are also available.

Docker logs discovery in Kibana

This is already useful, but visualizations will add even more to it.

Creating Docker visualizations in Kibana

There are many visualization types, like Data Table, Vertical Bar, Pie Chart and Line. I created short videos about how you can make use of them.

Creating a stacked Vertical Bar visualization

This chart is useful for showing how the amount of logs per application changes over time. You can use this type of visualization for other attributes too.

Creating a Line visualization for getting log trends

I use this type of visualization to see trends in how the number of logs changes over time. You can put a Data Table next to it in a Dashboard to show more insights.

Creating a Data Table visualization to show the amount of logs per container

The Data Table visualization is usually more readable than Vertical Bars; for instance, when the texts on the X-Axis are too long I prefer the Data Table.
The video below was made for HTTP agents. Just change the field of the aggregation to “CONTAINER_NAME”.

Creating a Pie Chart visualization to show the amount of logs per app

Pie Charts are pretty good for showing amounts of data when your data set has limited variability, for example when there are only a dozen application names and not hundreds.
The video below was made for HTTP response codes. You should be able to adjust the field of the aggregation to “APPLICATION”.

Creating dashboard from visualizations in Kibana

You should add the recently created visualizations to a Dashboard.
Check the video below to see how you can do that.
Although the video presents NGINX logs, the method is the same for Docker logs.

Final words

I really liked this project. I learned many new things and the results are simply amazing. My plan is to use this system and improve it as I go.

As I promised, you can download everything from this GitHub repository: the configurations for syslog-ng and docker-compose, plus the visualizations and the dashboard for Kibana. Feel free to use them.

Update: I moved the chapters about parsing and visualizing NGINX / Apache access logs in Kibana into a dedicated post. Please make sure you check it out.

If this post was helpful for you, then please share it. Most importantly, should you have anything to add, please leave a comment below. I will appreciate it.

Thank you.
