Aggregate and tail Docker container logs using Docker SDK for Python

Use grep, sed, threading and Docker SDK for Python to log errors in a single file.

11 December 2022 Updated 11 December 2022

The problem: You finally have a Docker application consisting of many (micro) services (containers) and want to monitor all these containers for errors.

The services are mostly Python scripts that use the standard Python logging module and print messages to stdout (and stderr):

... DEBUG ... 
... INFO ... 
... ERROR ...

Using the default Docker json-file logging driver, these messages end up in the Docker logs. We can view the logs for a container using the command:

docker logs <container-id>

docker logs <container-name>

In this post I present a way to view the errors of all containers in a single file. We are not using 'container_name: <container-name>' in Docker-compose. To select and access the Docker containers, we use the Docker SDK for Python.

Steps:

Filtering log lines with grep and sed
Docker and container naming
Selecting and accessing containers with the Docker SDK for Python
Starting multiple Docker log processes
The code
Testing

Filtering log lines with grep and sed

To tail the Docker log we use the following command that also shows the messages sent to stderr:

docker logs --timestamps --follow <container-id> 2>&1

In the commands below we add a 'do not buffer' argument to make sure we do not pass parts of messages.

We do not want to see log lines with 'DEBUG', and show only lines with 'ERROR' and 'WARNING'. Because we also have other services, we search for case-insensitive as well:

grep --line-buffered -v DEBUG | grep --line-buffered -i -e "ERROR" -i -e "WARNING"

Because we have multiple container logs, we insert the container name at the start of a log line:

sed -u "s/^/<container-name> /"

Finally, we append these lines to a single log file:

>> my_app.errors.log

Here is the full line:

docker logs --timestamps --follow <container-id> 2>&1 | grep --line-buffered -v DEBUG | grep --line-buffered -i -e "ERROR" -i -e "WARNING" | sed -u "s/^/<container-name> /" >> my_app.errors.log

Then, in another console window we tail this file:

tail - f my_app.errors.log

Docker and container naming

In Docker, it is possible assign a name to a container. This works fine if there is only one such container. But if you want to run multiple instances (replicas) of the container, you cannot do this and must let Docker generate the container names.

A Docker generated container name looks like:

<project-name>-<service-name>_1

Multiple replicas will be numbered '_2', '_3', etc. The first part of the container name is called the project name, in this case the name of the directory where the docker-compose.yml file is.

You can also specify the project name using environment variable COMPOSE_PROJECT_NAME. For example, with docker-compose.yml and .env files:

# file: docker-compose.yml
version: "3"

services:
  web:
    image: "nginx:latest"
    env_file:
      - ./.env
    deploy:
      replicas: 3
    ports:
      - "1000-1002:80"

# file: .env
COMPOSE_PROJECT_NAME=MY_PROJECT

Then, bringing up the project using:

docker-compose up -d

And the result is:

Creating my_project_web_1 ... done
Creating my_project_web_2 ... done
Creating my_project_web_3 ... done

Selecting and accessing containers with the Docker SDK for Python

We specify the containers we want to show the logs for using the project names and service names. If we omit the service names, then all services of a project are selected. If we want to select all containers from the project 'my_project':

projects_services = {
    'my_project': []
}

If we want to select only the service 'web' from the project 'my_project':

projects_services = {
    'my_project': ['web']
}

To select the containers in Python, we use the Docker SDK for Python. We use the project names to search the containers. By default we select all services of a project.

In your virtual environment, install the Docker SDK for Python:

pip install docker

To get all containers:

    client = docker.from_env()
    containers = client.containers.list()

For each container, the Docker SDK exposes a labels dictionary with keys including:

com.docker.compose.container-number
com.docker.compose.project
com.docker.compose.service

With these, we can select the containers.

Starting multiple Docker log processes

There are several ways to start multiple Docker log processes. Because we are using Python we do this with threads and subprocess. Once we have the command, see above, we start the process as follows:

    cmd = ...
    p = subprocess.Popen(cmd, shell=True)
    p.wait()

Docker containers can be present, not present, up, down, meaning that we must continuously (every some seconds) check if our Docker logs command is still running, and if not, start it again.

The code

In case you want to try this, here is the code:

# file: dockertail.py
import docker
import logging
import subprocess
import sys
import threading
import time

def get_logger():
    logger_format = '[%(asctime)s] [%(levelname)s] %(message)s'
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.DEBUG)
    # console
    console_handler = logging.StreamHandler(sys.stdout)                             
    console_handler.setLevel(logging.DEBUG)
    console_handler.setFormatter(logging.Formatter(logger_format))
    logger.addHandler(console_handler)                                            
    return logger

logger = get_logger()

# empty list means all services
projects_services = {
    'my_project': [],
}

class DockerTail:
    def __init__(self):
        self.client = docker.from_env()

    def get_docker_containers(self):
        return self.client.containers.list()

    def discover_docker_containers(self):
        logger.debug('containers:')
        for container in self.get_docker_containers():
            labels = container.labels
            logger.debug('- project  = {}'.format(labels['com.docker.compose.project']))
            logger.debug('- service  = {}'.format(labels['com.docker.compose.service']))
            logger.debug('- container.name = {}'.format(container.name))
            logger.debug('- short_id = {}'.format(container.short_id))
            logger.debug('- number   = {}'.format(labels['com.docker.compose.container-number']))

    def get_projects_containers_by_projects_services(self, projects_services):
        # group by project
        projects_containers = {}
        for container in self.get_docker_containers():
            labels = container.labels
            project = labels['com.docker.compose.project']
            service = labels['com.docker.compose.service']
            if project in projects_services:
                if service in projects_services[project] or len(projects_services[project]) == 0:
                    if project not in projects_containers:
                        projects_containers[project] = []
                    projects_containers[project].append(container)
        # sort projects by name
        projects_containers = dict(sorted(projects_containers.items()))            
        for project, containers in projects_containers.items():
            # sort project containers by service name
            containers.sort(key=lambda x: x.labels['com.docker.compose.service'])
        # show what we got
        logger.debug('projects:')
        for project, containers in projects_containers.items():
            logger.debug('- {}'.format(project))
            logger.debug('  services:')
            for container in containers:
                labels = container.labels
                logger.debug('  - {}'.format(labels['com.docker.compose.service']))
                logger.debug('      name:      {}'.format(container.name))
                logger.debug('      number:    {}'.format(labels['com.docker.compose.container-number']))
                logger.debug('      status:    {}'.format(container.status))
                logger.debug('      short_id:  {}'.format(container.short_id))
        return projects_containers

def docker_logs_follow(project, container):
    cmd = 'docker logs --timestamps --follow {} 2>&1'.format(container.short_id)
    cmd += ' | grep --line-buffered -v "DEBUG"'
    cmd += ' | grep --line-buffered -i -e "ERROR" -i -e "WARNING" '
    cmd += ' | sed -u "s/^/{} /"'.format(container.name)
    cmd += ' >> my_app.errors.log'
    logger.debug('cmd = {}'.format(cmd))
    p = subprocess.Popen(cmd, shell=True)
    p.wait()

def main():
    dt = DockerTail()
    dt.discover_docker_containers()

    container_name_threads = {}
    # start 'docker logs <container-id> ...' 
    while True:
        projects_containers = dt.get_projects_containers_by_projects_services(projects_services)
        for project, containers in projects_containers.items():
            for container in containers:
                t = container_name_threads.get(container.name)
                if t is None or not t.is_alive():
                    logger.debug('starting docker logs follow for {}:{}:{}'.format(project, container.labels['com.docker.compose.service'], container.name))
                    t = threading.Thread(target=docker_logs_follow, args=(project, container))
                    t.start()
                    container_name_threads[container.name] = t
        time.sleep(2)

if __name__ == '__main__':
    main()

Testing

Bring up the Docker-compose example with the three replicas as shown before:

docker-compose up -d

In another console window start dockertail.py:

python dockertail.py

And in another console window tail our application error log file, the aggregation of error messages of all application containers:

tail -f my_app.errors.log

Now in the first window, let's access a container without generating an error:

wget 127.0.0.1:1001

Repeat this with a different URL to generate an error:

wget 127.0.0.1:1001/nothing

This will print the following error message in my_app.errors.log tail:

my_project_web_3 2022-12-11T10:44:45.239669141Z 2022/12/11 10:44:45 [error] 36#36: *7 open() "/usr/share/nginx/html/nothing" failed (2: No such file or directory), client: 172.17.82.1, server: localhost, request: "GET /nothing HTTP/1.1", host: "127.0.0.1:1001"

Nice.

Single line exceptions

Almost perfect. In the above, when an exception occurs in a Python script we only see the first line. The traceback is not visible. If we do not want to touch our dockertail.py script, there is only one way to show the traceback in our log and that is to have the exception printed as a single line. You can find examples on the internet on how to do this.

Another way would be to start every traceback line with a special prefix and then modify our grep filter above to always pass these lines.

Summary

I was looking for a very basic way to monitor the logs of many Docker containers for errors. In many Docker based applications, the containers are named by Docker. Docker API for Python is a nice tool to discover and select Docker containers using the project name and service names. To filter the Docker container logs for errors, we use the Linux grep command and use the sed command to add the container name to the log message.

Once started, all we have to do is look at the output of the aggregate log file using:

tail -f my_app.errors.log

Automate or terminate!

Links / credits

Docker SDK for Python - Containers
https://docker-py.readthedocs.io/en/stable/containers.html

How can a container identify which container it is in a set of a scaled docker-compose service?
https://stackoverflow.com/questions/39770712/how-can-a-container-identify-which-container-it-is-in-a-set-of-a-scaled-docker-c/39895650#39895650

How do I can format exception stacktraces in Python logging?
https://stackoverflow.com/questions/28180159/how-do-i-can-format-exception-stacktraces-in-python-logging

How to reach additional containers by the hostname after docker-compose scale?
https://stackoverflow.com/questions/36031176/how-to-reach-additional-containers-by-the-hostname-after-docker-compose-scale

Aggregate and tail Docker container logs using Docker SDK for Python

Filtering log lines with grep and sed

Docker and container naming

Selecting and accessing containers with the Docker SDK for Python

Starting multiple Docker log processes

The code

Testing

Single line exceptions

Summary

Links / credits

Read more

Docker Log file Logging

Leave a comment

Comments

Leave a reply

Recent

Most viewed

Tags