Reducing page response times of a Flask SQLAlchemy website

Reduce the carbon dioxide (CO2) emissions of your web app using query result caching and template caching.

29 August 2019 Updated 30 August 2019
post main image

Objects, it is nice to build an application with them but it has one very big disadvantage: it can be very slow because of the extra CPU-cycles and all the extra memory used. The slowdown of course is very much caused by the extra mapper layers and extra data. 

Should you care about performance? Yes! The admin part of an application does not have to be very fast but the frontend pages, the pages visitors can access must load as fast as possible. I was shocked by the enormous difference between an ORM query and a raw SQLAlchemy query. All these extra CPU-cycles, all this extra memory used. And not very environmentally friendly also. I do not want my website to be responsible for the climate change ... ;-)

Then there is also Jinja templating. I see times of 30 milliseconds building the blog posts for the home page. Why? How is this possible? I did some debugging but still did not find the cause yet, but on staging and production, both using Gunicorn, it seems the Jinja delays are much smaller. 

What we can do to reduce page generation time

We have two options:

1. Leave the object / ORM mode

- Transform some ORM queries to raw SQL queries
- Build some Jinja template parts in Python

- Very fast, also on first hit

- Very much work

2. Add application level caching

- Cache the result of some ORM queries
- Cache the result of some template parts

- Easy to add, not much code changes

- First hit remains slow if cache times expired

What is the use of an application level cache?

Databases also do caching so why cache anyway? Using application level query result caching we avoid access to the database within a specified cache time. Many pages on a website are static in a sense that they only change once in a while. If there is just a part of a page static then we can cache only that part.

Caching is not not for the the first visitor, it serves its purpose when many visitors access the website. So the first first visitor gets a page response time time of 200 milliseconds, but the visitors following will get a response time of 30 milliseconds, if they visit the pages within the cache time.

There is a problem here and that is the time recorded by search engines. Google may rate your website lower because of first time (no cached) visits. There is not really much we can about this. An option would be to prefetch certain items but this an order more complex.

Implementing a cache

Caching is not really difficult to code ourselves. We could also use Flask-Caching here. I like the Caching Jinja2 Snippets feature:

{% cache [timeout [,[key1, [key2, ...]]]] %}
{% endcache %}

But as I said before, I want to do as much as possible myself, that is how I learn Python, Flask. In the past I implemented caching for a PHP website using data from a remote system which proved very effective, so now let's do this in Python.

The caching object

As this website app is running on a Linux system I believe there is no need for Redis or Memcached, we can use the filesystem for caching. The Linux filesystem is fast enough. I cache only some query results and Jinja template parts. How do you know what you should cache? This is simple, measure, measure, measure, see also the post about Profiling. I added timers to the home page with blog posts, to the blog post page. Then I selected the queries and template parts that would save the most time.

The implementation of the caching object looks very much like the Settings implementation, see a previous post. Nice thing of Flask is that we can create an object during app creating time, and use it during app run time.

Below is the code of the caching class. It has two main functions: dump, to store data in the cache, and load, to get data from the cache. Both use a cache_item_id that identifies the cached item. Pickl is used write objects to a file and read objects from a file. At dump time we can also pass other parameters like non-default cache time and a cache_item_type. This type is not necessary, I use it to select a different subdirectory so all query results are in directory A and all render_template results are in directory B. Easy for debugging.

class AppCache:

    def __init__(self, app=None): = app
        if app is not None:

    def init_app(self, app):
        self.cache_item_id2cache_details = {}

    def get_subdir_and_seconds(self, cache_item_type, cache_item_seconds):
        return cache_subdir, cache_seconds

    def dump(self, cache_item_id, cache_item_data, cache_item_type=None , cache_item_seconds=None):

        cache_subdir, cache_seconds = self.get_subdir_and_seconds(cache_item_type, cache_item_seconds)

        cache_file = os.path.join('cache', cache_subdir, cache_item_id + '.pickle')

        temp_name = next(tempfile._get_candidate_names())
        cache_file_tmp = os.path.join('cache', cache_subdir, cache_item_id + '_' + temp_name + '.pickle')

            fh = open(cache_file_tmp, 'wb')
            pickle.dump(cache_item_data, fh)
            # rename is atomic
            os.rename(cache_file_tmp, cache_file)
            return False

        # no errors so store
        if cache_item_id not in self.cache_item_id2cache_details:
            self.cache_item_id2cache_details[cache_item_id] = { 'cache_file': cache_file, 'cache_seconds': cache_seconds } 
            self.cache_item_id2cache_details[cache_item_id]['cache_file'] = cache_file
            self.cache_item_id2cache_details[cache_item_id]['cache_seconds'] = cache_seconds
        return True

    def load(self, cache_item_id):
        if cache_item_id in self.cache_item_id2cache_details:
            cache_file = self.cache_item_id2cache_details[cache_item_id]['cache_file']
            cache_seconds = self.cache_item_id2cache_details[cache_item_id]['cache_seconds']

                if os.path.isfile(cache_file):
                    mtime = os.path.getmtime(cache_file)
                    if (mtime + cache_seconds) > time.time():
                        fh = open(cache_file, 'rb')
                        return True, pickle.load(fh) 

        return False, None

As with the Settings class before, I instantiate the object in the create_app function:

    app.app_cache = AppCache(app)

Now we can use it in other views by calling current_app.app_cache.dump() and current_app.app_cache.load():

    cache_item_id = 'some_cache_item_id'
    hit, data = current_app.app_cache.load(cache_item_id)
    if not hit:
        # get the data somewhere
        # cache it
        current_app.app_cache.dump(cache_item_id, data)
    # use data

Caching ORM query results

We should limit ourselves to queries that really slow down the system. A good candidate is the query that gets the blog posts for the home page. The home page can be displayed in different languages and the blog posts are partitioned into pages having a maximum of 10 blog posts on a page. This means the cache_item_id must reflect both the language and the page number:

    cache_item_id = 'home_page_blog_posts_' + lang_code + '_p' + str(page_number)
    hit, home_page_blog_posts = current_app.app_cache.load(cache_item_id)
    if not hit:
        home_page_blog_posts = get_home_page_blog_posts(lang_code, page_number)
        # cache it
        current_app.app_cache.dump(cache_item_id, home_page_blog_posts)

Caching Jinja render_template results

In Flask we use render_template to generate the HTML for a page. For the home page I have an index.html
that includes a for-loop to print the blog posts. To improve performance we can cache exactly this part. To do this we add another render_template function. First we render the blog_posts, the result is called rendered_blog_posts. In the second render_template we use the rendered blog posts.

    cache_item_id = 'homepage_blog_posts_' + lang_code + '_p' + str(page_number)
    hit, rendered_blog_posts = current_app.app_cache.load(cache_item_id)
    if not hit:
        rendered_blog_posts = render_template(

        # cache it
        current_app.app_cache.dump(cache_item_id, rendered_blog_posts)

    return render_template(

Of course we must split the template file index.html into two files, I called them index_blog_posts.html and
index_using_rendered_blog_posts.html. First we copy index.html to these two files, then we edit the files. As you can imagine the index_blog_posts.html looks something like this:

    {% if blog_posts %}
        {% for blog_post in blog_posts %}
            <div class="row">
                <div class="col-12 p-0">

                    <h1><a href="{{ url_for('pages.blog_post_view', slug=blog_post.slug) }}">{{ blog_post.title }}</a></h1>

        {% endfor %}

        {{ list_page_pagination(pagination) if pagination }}

    {% endif %}

and the index_using_rendered_blog_posts.html looks like this:

{% extends "content_right_column.html" %}

{% block content %}

    {% if rendered_blog_posts %}
        {{ rendered_blog_posts|safe }}
    {% endif %}
{% endblock %}


The above changes reduced the - second hit - home page generation time from 150 milliseconds to 30 milliseconds on my local PC. Caching is a very effective way to reduce response times. In fact I believe most Flask SQLAlchemy websites cannot perform without it. The code changes are minimal and we only have to add caching where it really matters. If we really need first hit performance we can always convert our 'slow' ORM queries to raw SQL. Until that time we focus on functionality, not on time-consuming optimizations.

Links / credits

Flask Extension Development


pickle — Python object serialization

Leave a comment

Comment anonymously or log in to comment.


Leave a reply

Reply anonymously or log in to reply.