angle-uparrow-clockwisearrow-counterclockwisearrow-down-uparrow-leftatcalendarcard-listchatcheckenvelopefolderhouseinfo-circlepencilpeoplepersonperson-fillperson-plusphoneplusquestion-circlesearchtagtrashx

Hiding database UUID primary keys of your web application

Do not make it easy for others to play with the data your web application exposes.

29 March 2024 Updated 30 March 2024
post main image
https://unsplash.com/@davidkristianto

When you create a web application, you must be very careful not to expose too much information. If you use (auto-increment) Integer IDs in your database, then you are probably exposing too much already. Some examples. An Integer user_id makes it easy to guess how many new registrations you receive every day. An Integer order_id makes it easy to guess how many orders you receive every day.

In addition, visitors can try to decrement or increment these values in URLs or forms. If you do not have the proper logic present, then they may be able to see previously registered users, previous orders.

Many of these problems go away when using UUID4s as primary keys. This makes it impossible to guess previous values. But you still need to guard against brute force guessing.

But even UUID4s we sometimes do not want to expose. There are many methods to hide your IDs, here I present another one. The assumption is that you already use UUID4s as primary keys. Note that there are not really easy solutions here.

As always I do this on Ubuntu 22.04.

How it works

The assumption is that you already use UUID4s for primary keys (and foreign keys).

Encoding

Before sending the data from the server to the client, we encode the IDs in the data:

  • We replace all primary key UUID4s value, the 'from_ids', by new UUID4s values, the 'to_ids'.
  • This original and new UUIDs are stored in a newly created 'from_to_record'.
  • This 'from_to_record' is written to the database.
from_to_record = {
    'id': <uuid4>,
    'user_account_id': <user_account_id>,
    'created_on': datetime.datetime.utcnow(),
    'to_id_from_ids': {
        <to_id1>: <from_id1>,
        <to_id2>: <from_id2>,
        ...
    },
}

Note(s):

  • On every encoding, a new 'from_to_record' is created.
  • 'from_to_records' never change, they are created once and remain valid until expired.
  • We do not store 'from_id_to_ids' but 'to_id_from_ids'. The reason is that we use 'from_to_records' only to look up a 'from_id' from a 'to_id'.
  • Here we show a single 'from_to_record' with a JSONB column type to store multiple pairs. You can of course also have multiple records instead.
  • A 'user_account_id' connects a record to a specific user.

Decoding

When we receive data from the client:

  • First we get non-expired 'from_to_records' from the database using the 'user_account_id' and 'created_on' fields.
  • Then, using these records, we replace the 'to_ids' in the data from the client by the 'from_ids'.

Pros and Cons

As already mentioned, whatever method you want to use, you must do extra work. Here are some pros and cons of this method:

Pros:

  • Straightforward replacement.
  • Almost no changes at the client.
  • Generation of new UUID4s is an optimized function.
  • We do not touch original records.
  • Easy time restrictions when using a creation date/timeout.

Cons:

  • Requires a database.
  • Not very fast.

Example: Flask application with a form

The, very limited, example below demonstrates how this is working. In the example, we can list and edit members without exposing the actual primary keys.

We use Flask-Caching (FileSystemCache) as a database for the members and for the 'from_to_records'. Typically you would use a real database system for the members and something like Redis for the 'from_to_records'.

Create a virtual environment and then:

pip install flask
pip install Flask-Caching

There are three classes:

  • 'IdFromTo'
    Used to retrieve and save 'from_to_records' and to translate the 'from_ids' to 'to_ids', and vice-versa.
  • 'Db'
    Database queries
  • 'DbWrapper'
    Our new methods handling the 'from_ids' and 'to_ids' in requests.

The project tree:

.
├── project
│   ├── app
│   │   └── factory.py
│   └── run.py

Here are the two files:

# run.py
from app.factory import create_app

host = '127.0.0.1'
port = 5050

app = create_app()
app.config['SERVER_NAME'] = host + ':' + str(port)

if __name__ == '__main__':
    app.run(
        host=host,
        port=port,
        use_debugger=True,
        use_reloader=True,
    )
# factory.py
import datetime
import logging
import uuid
import os
import sys
from flask import current_app, Flask, redirect, render_template, request, url_for
from flask_caching import Cache

cache = Cache()

logging.basicConfig(
    format='%(asctime)s %(levelname)8s [%(filename)-15s%(funcName)15s():%(lineno)03s] %(message)s',
    level=logging.DEBUG,
)
logger = logging.getLogger()

# use string uuid
def get_uuid4():
    return str(uuid.uuid4())

class IdFromTo:
    def __init__(self, user_account_id):
        self.user_account_id = user_account_id
        self.expire_seconds = 30
        self.expired_on = datetime.datetime.utcnow() - datetime.timedelta(seconds=self.expire_seconds)
        self.from_to_records_loaded = False
        self.from_to_records = []
        self.from_ids_to_ids = {}

    def load_from_to_records(self):
        # filter here, we do not have a real database
        for from_to_record in cache.get('from_to_records') or []:
            if from_to_record['created_on'] < self.expired_on:
                logger.debug(f'expired, skipping from_to_record = {from_to_record} ...')
                continue
            if from_to_record['user_account_id'] != self.user_account_id:
                logger.debug(f'not a dataset of me, skipping from_to_record = {from_to_record} ...')
                continue
            self.from_to_records.append(from_to_record)
        from_to_records_len = len(self.from_to_records)
        logger.debug(f'from_to_records_len = {from_to_records_len}')

    # get from_id: match with previously saved from_to_records
    def get_from_id(self, to_id):
        if not self.from_to_records_loaded:
            self.load_from_to_records()
            self.from_to_records_loaded = True
        for from_to_record in self.from_to_records:
            to_id_from_ids = from_to_record['to_id_from_ids']
            if to_id in to_id_from_ids:
                return to_id_from_ids[to_id]
        logger.debug(f'not found in to_id_from_ids, to_id = {to_id}')
        return None

    # get to_id: create new/append to from_ids_to_ids
    def get_to_id(self, from_id):
        from_id = str(from_id)
        if from_id in self.from_ids_to_ids:
            # already created 
            logger.debug(f'use already created to_id_for from_id = {from_id}')
            return self.from_ids_to_ids[from_id]
        logger.debug(f'create new to_id_for from_id = {from_id}')
        to_id = get_uuid4()
        self.from_ids_to_ids[from_id] = to_id
        return to_id

    def save(self):
        # load, append, overwrite
        from_to_records = cache.get('from_to_records') or []
        # swap
        from_ids_to_ids_len = len(self.from_ids_to_ids)
        if from_ids_to_ids_len == 0:
            return
        to_id_from_ids = {}
        for from_id, to_id in self.from_ids_to_ids.items():
            to_id_from_ids[to_id] = from_id
        from_to_record = {
            'id': get_uuid4(),
            'user_account_id': self.user_account_id,
            'created_on': datetime.datetime.utcnow(),
            'to_id_from_ids': to_id_from_ids,
        }
        from_to_records.append(from_to_record)
        cache.set('from_to_records', from_to_records)

class Db:
    def __init__(self):
        # initial members
        self.members = [{
            'id': 'b5ff1840-38a8-44cc-8f54-730dcf0b1358',
            'name': 'John',
        },{
            'id': '27e14ff0-7620-4e17-9fa7-f491bab22c8a',
            'name': 'Jane',
        }]

    def get_members(self):
        members = cache.get('db_members')
        if members is None:
            # first time only
            cache.set('db_members', self.members)
        return cache.get('db_members')

    def get_member(self, member_id):
        for member in cache.get('db_members') or []:
            if member['id'] == member_id:
                return member
        return None

    def update_member(self, member_id, name):
        members = cache.get('db_members')
        for member in members:
            if member['id'] == member_id:
                member['name'] = name
                cache.set('db_members', members)
                return member
        return None

class DbWrapper:
    def __init__(self, user_account_id):
        self.user_account_id = user_account_id
        self.db = Db()

    def get_members(self):
        id_from_to = IdFromTo(self.user_account_id)
        members = []
        for member in self.db.get_members() or []:
            to_id = id_from_to.get_to_id(member['id'])
            members.append({
                'id': to_id,
                'name': member['name']
            })
        id_from_to.save()
        return members

    def get_member(self, to_id):
        id_from_to = IdFromTo(self.user_account_id)
        from_id = id_from_to.get_from_id(to_id)
        member = self.db.get_member(from_id)
        if member is None:
            return None
        return {
            'id': to_id,
            'name': member['name'],
        }

    def update_member(self, to_id, name):
        # do not swap change to_id
        id_from_to = IdFromTo(self.user_account_id)
        from_id = id_from_to.get_from_id(to_id)
        member = self.db.update_member(from_id, name)
        return {
            'id': to_id,
            'name': member['name'],
        }
    
def create_app():
    app = Flask(__name__, instance_relative_config=True)

    app.config.update({
        'CACHE_TYPE': 'FileSystemCache',
        'CACHE_DEFAULT_TIMEOUT': 3600,
        'CACHE_DIR': '.'
    })
    cache.init_app(app)

    user_account_id = '47742ae4-67bd-4164-b044-e893344c861c'

    db = DbWrapper(user_account_id)

    @app.route('/')
    def home():
        return redirect(url_for('members'))

    @app.route('/members', methods=['GET', 'POST'])
    def members():
        members = db.get_members()
        page_data_lines = ['Members:']
        for member in members:
            member_id = member['id']
            member_name = member['name']
            member_edit_url = url_for('member_edit', member_id=member_id)
            page_data_lines.append(f'<a href="{member_edit_url}">{member_name}</a> ({member_id})')
        return f"""{'<br>'.join(page_data_lines)}"""

    @app.route('/member/edit/<member_id>', methods=['GET', 'POST'])
    def member_edit(member_id):
        member = db.get_member(member_id)
        members_url = url_for('members')
        if member is None:
            return f"""Error: Expired?
                <a href="{members_url}">Start again</a>"""
        name = member['name']
        logger.debug(f'member = {member}')
        error = ''
        if request.method == 'POST':
            name = request.form.get('name').strip()
            if len(name) > 0:
                member = db.update_member(member_id, name)
                return redirect(url_for('members'))
            error = 'Error: enter a name'
        members_url = url_for('members')
        return f"""
            <a href="{members_url}">Members</a><br><br>
            Edit member:<br>
            <form method="post">
            <input type="text" name="name" value="{name}"><br>
            <input type="submit" name="button" value="Update">
            </form>
            {error}
            """

    return app

To run, go to the project directory and type:

python run.py

Then point your browser to:

127.0.0.1:5050

Summary

There are many ways to hide your database IDs.  The solution presented is for a database that uses UUID4 primary keys.

Before sending the data to the client, primary UUID4s values are replaced with new UUID4s, and the original and replacement values are stored. When data is received from the client, we load the original and replacement values and replace. If we wait too long, the values expire and we must start again.

Only minimal changes are required in the client.

Links / credits

Hiding, obfuscating or encrypting database IDs
https://bhch.github.io/posts/2021/07/hiding-obfuscating-or-encrypting-database-ids

Leave a comment

Comment anonymously or log in to comment.

Comments

Leave a reply

Reply anonymously or log in to reply.