Python application logging with Docker

Docker best practices suggest logging to stdout but there are some gotchas.

5 December 2022 Updated 5 December 2022
Post main image: https://unsplash.com/@agk42

When you develop a software application, logging is probably one of the first things you set up: first only to the console, but soon you add log files as well.

Some years ago I started using Docker for development and production. I did not make many changes when moving to Docker. The application log files are still in a log directory on a Docker Volume. This means that the log files are part of the application framework.

Best practices on the internet suggest using Docker logging. This means our Docker application must print log records to stdout (and/or stderr), instead of writing them to our application log files. When we send our log records to stdout in Docker, we can specify a logging driver to export them to an external tool such as Syslog, using Docker-supplied log drivers or third-party log drivers. Fair enough, I understand that this can have advantages, or even be essential.

Again, I do not want to change too much, so I stick to the default logging driver in Docker: json-file. Our log records will then be present in the file:

/var/lib/docker/containers/<container id>/<container id>-json.log
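Each line in this file is a JSON object written by the json-file driver, containing the log message, the stream, and a timestamp, roughly like this (contents illustrative):

{"log":"INFO my first log record\n","stream":"stdout","time":"2022-12-05T10:15:30.123456789Z"}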

One of my current Python projects is an application consisting of a lot of (micro) services. A service is represented by a container with log files still in the context / framework of the service. In this post I convert my existing application logging method to Docker logging.

Changing the logging module

For my applications I developed a custom logging module. Inside this module the standard Python logging methods are used, meaning my application code has lines like:

logger.debug(...)
logger.info(...)
logger.error(...)

I am using this logging module everywhere. This means I only needed to add a new logging mode, controlled by a variable, use_docker_logging=True, that indicates that log data should be written to stdout instead of to a file.

We can do this very easily using logging.StreamHandler(stdout). When using the Python logging module, the data is flushed after every record. This means there is no need to start Python with 'python -u' or set the environment variable PYTHONUNBUFFERED.
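As a minimal sketch, assuming a hypothetical helper get_logger and an example file path (only use_docker_logging comes from my module), the mode switch could look like this:

import logging
import sys

def get_logger(name, use_docker_logging=True):
    # return a logger that writes to stdout (Docker logging) or to a log file
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if use_docker_logging:
        # Docker captures stdout and stores it via the json-file driver
        handler = logging.StreamHandler(sys.stdout)
    else:
        # the 'classic' way: an application log file, e.g. on a Docker volume
        handler = logging.FileHandler('/home/app/log/app.log')
    handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)-8.8s %(message)s'))
    logger.addHandler(handler)
    return logger

logger = get_logger(__name__)
logger.info('hello from the container')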

Docker Exec, problems, where are my log records?

Unfortunately, we are not done, there is a problem. Do not know about you, but I do all my development using Docker containers today. In many cases I have the command in the docker-compose.yml file like:

command: tail -f /dev/null

This means, start the container and keep it alive. Then I 'Docker Exec' into the container (shell), and run the Python scripts. When I did this, the log records from the 'Docker Exec' session did not appear in the Docker logs.

After searching the internet I found this (Docker issues) page 'Proposal: additional logging options for docker exec', see links below.

It appears that when you 'Docker Exec' into a container, the stdout of this session is not the stdout of the container's initial process. There will be reasons for this; we cannot change it, so we have to deal with it.

The solution is to redirect log records to:

/proc/1/fd/1

Then I found another (Docker issues) page on the internet 'Echoing to /dev/stdout does not appear in docker logs', see links below. One of the suggestions is to log to a sym-linked file. We create the symlink in the Dockerfile:

RUN ln -sf /proc/1/fd/1 /var/log/test.log

Back to the solution we had so far: we replace the logging.StreamHandler with the logging.FileHandler:

logging.FileHandler('/var/log/test.log')

Now the log records from Python scripts running in the 'Docker Exec' session, appear in the Docker logs.
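Putting the pieces together, the Docker logging branch now looks roughly like the sketch below; /var/log/test.log is the symlink created in the Dockerfile above, the other names are just for illustration:

import logging

# /var/log/test.log is a symlink to /proc/1/fd/1 (stdout of PID 1),
# so records written here also reach the Docker logs from a
# 'Docker Exec' session
docker_handler = logging.FileHandler('/var/log/test.log')
docker_handler.setFormatter(logging.Formatter('%(levelname)-8.8s %(message)s'))

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
logger.addHandler(docker_handler)
logger.info('this record appears in the Docker logs, also from a Docker Exec session')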

Docker Exec session, no console logging

Unfortunately, we are not done; there is another problem. Because we use the logging.FileHandler, we only log to the symlinked file, that is, to the Docker logs. To see the log records on the screen in the 'Docker Exec' session, we must add the logging.StreamHandler again.

But wait, we must do this only for the 'Docker Exec' session, otherwise we will see duplicated log records in the Docker logs.

I solved this in a way that is a bit hackish, by finding the top parent process name. If this process name is 'sh' or 'bash' then I assume we used Docker Exec to enter the container.

import os
import psutil

    ...
    # walk up the process tree to find the 'top' parent process
    # of this docker exec session
    parent_process_pid = os.getpid()
    parent_process_name = None
    while True:
        parent_process = psutil.Process(parent_process_pid)
        pid = parent_process.ppid()
        parent_process_name = parent_process.name()
        if pid == 0:
            # reached the top of the process tree inside the container
            break
        parent_process_pid = pid

    if parent_process_name in ['sh', 'bash']:
        # top parent is a shell: assume 'Docker Exec', add console logging
        ...
    else:
        # started as the container's main process: no console logging
        ...
    ...

Log lines with extra fields

All log records of a container are now in a single log file. If you have multiple services (processes) running in a container, you probably want to add extra fields to the log lines that identify the service (process). This is in addition to the log level (logging.DEBUG, logging.ERROR, etc.) that we already insert in the log lines.

In Python logging we can add an extra field to a log line with the logging.setLogRecordFactory() method. An example is given on the page 'Using LogRecordFactory in python to add custom fields for logging', see links below.
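A minimal sketch of this approach, using the proc_id field that also appears in my formatter string further below; the value 'my-service' is just an example:

import logging

old_factory = logging.getLogRecordFactory()

def record_factory(*args, **kwargs):
    # add a custom field to every log record that identifies the service
    record = old_factory(*args, **kwargs)
    record.proc_id = 'my-service'
    return record

logging.setLogRecordFactory(record_factory)

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('%(proc_id)-15.15s %(levelname)-8.8s %(message)s'))

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info('a log record with the extra proc_id field')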

Other changes

Note that with the default Docker logging driver we can only write a string to stdout; writing a JSON object (dictionary) is not possible. A severe limitation, but we have to live with it.

If we want to add tags to all (!) log records, we can use labels in Docker-Compose:

Example:

  some-service:
    image: ...
    ports:
      - "8082:8000"
    labels:
      log_for_application: "myapp"

When logging to Docker, the timestamp is added automatically by Docker. This means we must remove it from our Python log line. My final logger formatter string looks like:

'%(proc_id)-15.15s %(levelname)-8.8s [%(filename)-30s%(funcName)20s():%(lineno)03s] %(message)s'

Viewing the Docker logs of all containers with Dozzle

There are many solutions to view the Docker logs of all containers. I tried a few, and the easiest to set up and use was Dozzle. Dozzle is a real-time log viewer for Docker containers, see links below. To run it, just pull the container and you are ready to go. Dozzle looks like Logs Explorer, an extension for Docker Desktop, see links below.

Be careful: by default Dozzle connects to Google Analytics, so make sure to start Dozzle with the '--no-analytics' flag:

docker run --name dozzle -d --volume=/var/run/docker.sock:/var/run/docker.sock -p 8888:8080 amir20/dozzle:latest --no-analytics

Once started, point your browser to:

http://127.0.0.1:8888

With Dozzle it is very easy to view your Docker logs.

Search uses regex, meaning that if you want to filter on two terms, for example 'Term1' and 'Term2', you can type in the Search box:

Term1.*?Term2

Another nice thing about Dozzle is that we can also start it for just a single container, or a few containers. What is really missing is a way to store and reselect your searches. But I am not complaining. Great tool!

Summary

This took far more time than I expected. It was much more than just printing to stdout. Is it worth the trouble? For the current application, that consists of many services represented by containers, I believe it is.

Now that logging is done via Docker, I can look at other ways to view the logs, generate alerts, etc.

Links / credits

Docker - Configure logging drivers
https://docs.docker.com/config/containers/logging/configure

Dozzle, a real-time log viewer for docker containers
https://dozzle.dev

Echoing to /dev/stdout does not appear in 'docker logs' #19616
https://github.com/moby/moby/issues/19616

Proposal: additional logging options for docker exec #8662
https://github.com/moby/moby/issues/8662

Using LogRecordFactory in python to add custom fields for logging
https://stackoverflow.com/questions/59585861/using-logrecordfactory-in-python-to-add-custom-fields-for-logging
