Flask, Celery, Redis and Docker
Docker-compose makes it very easy to use the same Docker image for your Flask application and the Celery worker(s).
This is a post about how I use Docker and Docker-compose to develop and run my Flask website with Celery and Redis. There are many articles on the internet about this; if you are searching for them, do not forget to also search on GitHub. I just grabbed the bits and pieces and created my own setup. Before going into this I want to mention two other things that came up when adding Celery to Flask.
The Flask application pattern
Again I must refer to Miguel Grinberg's nice post about this. The problem is that the Celery object must be created and initialized at the time create_app() sets up our Flask app. There are several solutions to this; I used the one described in the article 'Flask + Celery = how to', see the links below. The trick is to use __init__.py for another purpose. In most cases __init__.py is the file that holds the create_app() function. What we do is move all that code to a new file, factory.py, and use __init__.py to instantiate the Celery object:
# __init__.py
from celery import Celery

def make_celery(app_name=__name__):
    # create the Celery object here; it is initialized later in create_app()
    celery = Celery(app_name)
    return celery

celery = make_celery()
What happens now is that the Celery object is instantiated when the app package is imported, and only later initialized in create_app() with the parameters we specified in app.config. To run the app we change the run.py file so that it passes the Celery object to the create_app() function, something like:
# run.py
import app
from app import factory
my_app = factory.create_app(celery=app.celery, ...)
For details, refer to the article mentioned above.
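To make this concrete, here is a minimal sketch of what factory.py could look like; the config object name and the exact initialization are assumptions, not my literal code:

# factory.py - a minimal sketch of the application factory (assumed names,
# see the 'Flask + Celery = how to' article for a full version)
from flask import Flask

def create_app(celery=None):
    app = Flask(__name__)
    app.config.from_object('config.Config')   # assumed config object

    if celery is not None:
        # initialize the Celery object created in __init__.py with the
        # broker and backend settings from app.config
        celery.conf.update(app.config)

    # ... register blueprints, extensions, etc.
    return app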
Celery and time zones
I thought I would mention this as well because I struggled with it. I tried setting the timezone for both Flask and Celery to something other than UTC. With timezone 'Europe/Amsterdam' I kept getting this message (from Flower):
Substantial drift from celery@75895a6a62ab may mean clocks are out of sync. Current drift is 7200 seconds.
I did not solve this, but fortunately it is not really a problem. It is good practice to run a web application in UTC and only convert to a local timezone when needed, for example when showing a date and time to a visitor. To avoid problems, use UTC everywhere!
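In Python that can look like the following minimal sketch; it assumes Python 3.9+ for zoneinfo (on older versions pytz does the same job):

# a minimal sketch: store and compute in UTC, convert only for display
# (assumes Python 3.9+ for zoneinfo; on older versions use pytz)
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

created_at = datetime.now(timezone.utc)                       # always store UTC
local = created_at.astimezone(ZoneInfo('Europe/Amsterdam'))   # convert for the visitor
print(local.strftime('%Y-%m-%d %H:%M %Z'))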
Using Docker
I am using Docker for development, testing, staging and production. Because my production server is running ISPConfig with a MariaDB database and I also have a MariaDB database installed on my development system, I did not add a database to my Docker configuration but instead connect to the MariaDB database using a unix socket.
I have a shared docker-compose file, docker-compose_shared.yml, and one docker-compose file per deployment option: docker-compose_development.yml, docker-compose_production.yml, ... And every deployment option has its own environment file.
To start development I run:
docker-compose -f docker-compose_shared.yml -f docker-compose_development.yml up
And to start production I run:
docker-compose -f docker-compose_shared.yml -f docker-compose_production.yml up -d
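Tip: replacing 'up' with 'config' in these commands makes Docker-compose print the merged configuration, which is a quick way to verify how the shared and deployment-specific files are combined.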
Using the same Docker image for the web-service and the celery-worker
Before adding Celery I had only one service: web. With Celery I have at least three more:
- Redis
- One or more workers
- Flower
Redis and Flower are trivial to add, but how do we add a worker? After reading up on the internet I decided that the worker should use the same image as the web service. Of course there is a lot of overhead (dead code) in the worker image, but it also makes our life easier: the worker can reuse working code we wrote and tested before, as the hypothetical task below illustrates.
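Here is such a hypothetical task module; User and send_mail are made-up names, the point is that the worker imports the same application code that is already in the web image:

# tasks.py - hypothetical example; User and send_mail are made-up names
from app import celery

@celery.task
def send_confirmation_mail(user_id):
    # reuses models and helpers that are already present in the web image
    from app.models import User       # assumed module of the web application
    from app.email import send_mail   # assumed helper of the web application
    user = User.query.get(user_id)
    send_mail(user.email, 'Please confirm your account')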
When building with docker-compose, the image for the web-service is built. When running with docker-compose, both the web-service and the celery-worker-service must use this image.
Below is the docker-compose_shared.yml file. I am only showing the important lines. Some variables in the .env file for production:
PROJECT_NAME=peterspython
PROJECT_CONFIG=production
DOCKER_IMAGE_VERSION=1.456
# docker-compose_shared.yml
version: "3.7"

services:

  redis:
    image: "redis:5.0.9-alpine"
    ...

  web:
    image: ${PROJECT_NAME}_${PROJECT_CONFIG}_web_image:${DOCKER_IMAGE_VERSION}
    container_name: ${PROJECT_NAME}_${PROJECT_CONFIG}_web_container
    env_file:
      - ./.env
    build:
      context: ./project
      dockerfile: Dockerfile
      args:
        ...
    ports:
      - "${SERVER_PORT_HOST}:${SERVER_PORT_CONTAINER}"
    environment:
      ...
    volumes:
      # connect to mysql via unix socket
      - /var/run/mysqld:/var/run/mysqld
      # files outside the container:
      ...
    depends_on:
      - redis

  celery_worker1:
    image: ${PROJECT_NAME}_${PROJECT_CONFIG}_web_image:${DOCKER_IMAGE_VERSION}
    env_file:
      - ./.env
    restart: always
    environment:
      ...
    volumes:
      # connect to mysql via unix socket
      - /var/run/mysqld:/var/run/mysqld
      # files outside the container:
      ...
    # for development we start the celery worker by hand after entering the
    # container, so we can stop and start it after editing files,
    # see docker-compose_development.yml
    command: celery -A celery_worker.celery worker -Q ${CELERY_WORKER1_QUEUE} -n ${CELERY_WORKER1_NAME} ${CELERY_WORKER1_OPTIONS} --logfile=...
    depends_on:
      - web
      - redis

  flower:
    image: "mher/flower:0.9.5"
    ...
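The worker command above points at a module called celery_worker. A minimal sketch of what such a celery_worker.py entry point could look like, assuming the factory pattern from the beginning of this post:

# celery_worker.py - a minimal sketch of the worker entry point, assuming
# the __init__.py / factory.py split described earlier in this post
import app
from app import factory

flask_app = factory.create_app(celery=app.celery)

# push an application context so tasks can use app.config, the database, etc.
flask_app.app_context().push()

# 'celery -A celery_worker.celery worker' picks up this attribute
celery = app.celery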
The docker-compose_development.yml file:
# docker-compose_development.yml
version: "3.7"

services:

  web:
    ports:
      - "${SERVER_PORT_HOST}:${SERVER_PORT_CONTAINER}"
    volumes:
      # development: use files outside the container
      - ./project:/home/flask/project/
    command: python3 run_all.py

  celery_worker1:
    restart: "no"
    volumes:
      # development: use files outside the container
      - ./project:/home/flask/project/
    command: echo "do not run"
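The development override runs python3 run_all.py. A sketch of what such an entry point might contain (the names and port handling are assumptions):

# run_all.py - a sketch of the development entry point (assumed; the real
# file will differ), enabling DEBUG so Flask auto-reloads on code changes
import os
import app
from app import factory

my_app = factory.create_app(celery=app.celery)

if __name__ == '__main__':
    # port taken from the compose environment shown above (assumption)
    my_app.run(host='0.0.0.0',
               port=int(os.environ.get('SERVER_PORT_CONTAINER', 5000)),
               debug=True)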
And the docker-compose_production.yml file:
# docker-compose_production.yml
version: "3.7"

services:

  web:
    ports:
      - "${SERVER_PORT_HOST}:${SERVER_PORT_CONTAINER}"
    volumes:
      - /var/www/clients/${GROUP}/${OWNER}/web/static:/home/flask/project/sites/peterspython/static
    command: /usr/local/bin/gunicorn ${GUNICORN_PARAMETERS} -b :${SERVER_PORT_CONTAINER} wsgi_all:application
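The production file runs Gunicorn against wsgi_all:application. A sketch of what such a wsgi_all.py module might contain, again assuming the factory pattern:

# wsgi_all.py - a sketch of the Gunicorn entry point (assumed, based on the
# run.py fragment earlier in this post)
import app
from app import factory

# Gunicorn looks for the 'application' attribute ('wsgi_all:application')
application = factory.create_app(celery=app.celery)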
Manually starting and stopping the worker during development
When developing a Flask application we use the DEBUG option. The application then restarts automatically when we make changes, for example when saving a file.
The worker is a separate program; once it is running it does not know that we changed the code. This means we must stop and start the worker ourselves after making changes to a task. The docker-compose_shared.yml file contains a command that starts the worker, but in the development file I override this command with:
command: echo "do not run"
During development I open a terminal window and start everything using 'docker-compose up'. Now I can see all (debug) messages and problems in the application.
The worker has not started yet. In another terminal window I enter the worker Docker container using 'docker-compose run celery_worker1 sh'. Note that I could also use 'docker-compose exec celery_worker1 sh' here.
I already created a shell script start_workers.sh in my project directory:
#!/bin/sh
# start_workers.sh
celery -A celery_worker.celery worker -Q celery_worker1_queue -n celery_worker1 --loglevel=DEBUG --logfile=...
Now all I have to do to start the worker is type in the container shell:
./start_workers.sh
and Celery starts.
To stop Celery I just hit CTRL-C. Response:
worker: Hitting Ctrl+C again will terminate all running tasks!
worker: Warm shutdown (MainProcess)
This is a warm shutdown: the worker finishes the tasks it is working on and then exits. Hitting CTRL-C a second time terminates any running tasks immediately.
It is also possible to do a graceful shutdown of the worker using the shell's job control. Go to the terminal window of the worker and type CTRL-Z to suspend the worker. Response:
[1]+ Stopped ./start_workers.sh
Then type:
kill %1
Now Celery responds with a warm shutdown, followed by the shell's message that the process terminated:
worker: Warm shutdown (MainProcess)
[1]+ Terminated ./start_workers.sh
Celery worker memory usage
The presented solution is not the most memory-friendly one, but it is not that bad either. To get an indication we use the Docker command:
docker stats
Response:
CONTAINER ID   NAME                                        CPU %   MEM USAGE / LIMIT     MEM %   NET I/O           BLOCK I/O   PIDS
d531bfb686d0   peterspython_production_flower_1            0.17%   32.03MiB / 7.791GiB   0.40%   6.18MB / 2.26MB   0B / 0B     6
a63af28ce411   peterspython_production_celery_worker1_1    0.30%   121.6MiB / 7.791GiB   1.52%   9.95MB / 10.4MB   0B / 0B     3
b8b9f080dc26   peterspython_production_web_container       0.02%   467.3MiB / 7.791GiB   5.86%   1.35MB / 55.7MB   0B / 0B     6
de4fb0ef253a   peterspython_production_redis_1             0.16%   9.059MiB / 7.791GiB   0.11%   12.6MB / 16.1MB   0B / 0B     4
I started the web-service with 5 Gunicorn workers. There is one Celery worker with a concurrency of two (--concurrency=2). The Celery worker here takes some 120 MB. Of course memory usage will go up in some cases, but as long as most tasks are not very memory intensive I do not think it is worth the trouble to strip code from the Celery worker.
Summary
Using Docker with Docker-compose is not only great because we get an (almost) identical system for development and production. It is also very easy to add services like Redis and Flower. And using the same Docker image for our application and the Celery worker is straightforward with Docker-compose. As always there is a downside: it takes time to set this up. But the result makes up for it.
Links / credits
Celery and the Flask Application Factory Pattern
https://blog.miguelgrinberg.com/post/celery-and-the-flask-application-factory-pattern/page/0
Celery in a Flask Application Factory
https://github.com/zenyui/celery-flask-factory
Dockerize a Flask, Celery, and Redis Application with Docker Compose
https://nickjanetakis.com/blog/dockerize-a-flask-celery-and-redis-application-with-docker-compose
Flask + Celery = how to.
https://medium.com/@frassetto.stefano/flask-celery-howto-d106958a15fe
start-celery-for-dev.py
https://gist.github.com/chenjianjx/53d8c2317f6023dc2fa0