angle-uparrow-clockwisearrow-counterclockwisearrow-down-uparrow-leftatcalendarcard-listchatcheckenvelopefolderhouseinfo-circlepencilpeoplepersonperson-fillperson-plusphoneplusquestion-circlesearchtagtrashx

SQLAlchemy: Using Cascade Deletes to delete related objects

Cascade Deletes is much more than just adding 'all, delete' to the relationship.

16 July 2022 Updated 16 July 2022
post main image

Although the documentation of SQLAlchemy ORM is very good it sometimes is confusing and lacks clear examples, but maybe this is just me. About Cascade Deletes, I find the following text from the SQLAlchemy documentation mind-blowing:

The default value of the relationship 'cascade'-option is 'save-update, merge'. The typical alternative setting for this parameter is either all or more commonly 'all, delete-orphan'. The 'all' symbol is a synonym for 'save-update, merge', 'refresh-expire', 'expunge', 'delete', and using it in conjunction with 'delete-orphan' indicates that the child object should follow along with its parent in all cases, and be deleted once it is no longer associated with that parent.

Huh? What? And then after some time: WTF! Help! But I want only recipes!

But there are no recipes. The only way to make sure that we are doing the right thing is to read the documentation, read the documentation again, and create some examples that confirm that we understood.

SQLAlchemy Cascade Deletes

Cascade Deletes in SQLAlchemy can be specified in a relationship using the 'cascade'-option. The 'cascade'-option has a default value. This implies that if you were not using the 'cascade'-option and want to add to a 'cascade'-option to a relationship, you are changing (!) the default behavior. To keep default behavior, you must add 'save-update' and 'merge' to the 'cascade'-option.

The SQLAlchemy documentation then gives an example of Cascade Delete: When the parent is deleted, its children also must be deleted. In the example the Cascade Delete line is:

cascade='all, delete'

This is very confusing because 'delete' is already part of 'all'. This means we can also write:

cascade='all'

But wait, 'all' includes also 'refresh-expire' and 'expunge'. There is a warning in the documentation that 'refresh-expire' changes the default behavior and also 'expunge' changes the default behavior. This means we do not want 'all' but probably:

cascade='save-update, merge, delete'

Example: Customer, Order (One-To-Many)

When we delete a customer, we also want to delete all orders of this customer.

We have two models, Customer and Order. Customer-Order is a One-To-Many relationship. A customer can have zero or more orders and an order is part of only one customer. We create a bidirectional relationship. A customer can refer to orders using Customer.orders and an order can refer to a customer using Order.customer.

Note that in the ForeignKey Order.customer_id I added the attribute 'nullable=False'. The reason is that we do not want orders without a customer floating around.

We make the Cascade Delete work by adding the following line to the relationship in the Customer:

cascade='save-update, merge, delete'

Our model classes:

class Customer(Base):
    __tablename__ = 'customer'

    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(100), nullable=False)

    # relationship: orders
    orders = relationship(
        'Order',
        back_populates='customer',
        # delete customer orders when customer deleted
        cascade='save-update, merge, delete'
    )

class Order(Base):
    __tablename__ = 'order'

    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(100), nullable=False)

    # relationship: customer
    customer_id = sa.Column(sa.Integer, sa.ForeignKey('customer.id'), nullable=False, index=True)
    customer = relationship(
        'Customer',
        back_populates='orders',
    )

Assuming a customer with two orders, the following SQL-statements are generated by SQLAlchemy:

DELETE FROM "order" WHERE "order".id = ?
((1,), (2,))
DELETE FROM customer WHERE customer.id = ?
(2,)

You would expect three SQL-statements, the first one selecting all orders of the customer. In my test script I created the customers and orders before the DELETE-statement meaning that the order objects are in the session. SQLAlchemy knows this, and decided there is no need to get them from the database.

Now let's change the 'cascade'-option: remove 'delete' and add 'delete-orphan', that also deletes any children that were removed from the parent, even if the parent is not deleted. Can probably not be used without 'delete'. Let's see what happens.

cascade='save-update, merge, delete-orphan',

The SQL-statements and result generated by SQLAlchemy:

UPDATE "order" SET customer_id=? WHERE "order".id = ?
((None, 1), (None, 2))
exception = IntegrityError, e.args = ('(sqlite3.IntegrityError) NOT NULL constraint failed: order.customer_id',)

SQLAlchemy makes an attempt to remove the orders from the customer by updating the ForeignKeys to NULL, but that is not allowed because the ForeignKey includes 'nullable=False'.

Finally, let's create the 'cascade'-option with both 'delete' and 'delete-orphan':

cascade='save-update, merge, delete, delete-orphan',

The SQL-statements generated by SQLAlchemy:

DELETE FROM "order" WHERE "order".id = ?
((1,), (2,))
DELETE FROM customer WHERE customer.id = ?
(2,)

This is exactly the same as the 'delete'-only case.

Database object deletes using ForeignKey ON DELETE CASCADE

Many database engines, but not all, support the 'ON DELETE CASCADE' action: Each row in the child table associated with the deleted parent row, is also deleted.
SQLAlchemy has support for this. In addition to the 'cascade'-option described above we must add:

  • passive_deletes=True, to the relationship in the parent
  • ondelete='CASCADE', to the ForeignKey in the child

Customer and Order model classes after making these changes:

class Customer(Base):
    __tablename__ = 'customer'

    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(100), nullable=False)

    # relationship: orders
    orders = relationship(
        'Order',
        back_populates='customer',
        # delete customer orders when customer deleted
        cascade='save-update, merge, delete',
        passive_deletes=True,
    )

class Order(Base):
    __tablename__ = 'order'

    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(100), nullable=False)

    # relationship: customer
    customer_id = sa.Column(sa.Integer, sa.ForeignKey('customer.id', ondelete='CASCADE'), nullable=False, index=True)
    customer = relationship(
        'Customer',
        back_populates='orders',
    )

If you try this you may find that the generated SQL is exactly the same as above. Meaning two SQL-statements, one to delete the orders and one to delete the customer (and optionally a third SQL-statement if the orders are not in the session). The reason is that the objects to be deleted are already in the current session. From the documentation:

SQLAlchemy only emits DELETE for those rows that are already locally present in the Session; for any collections that are unloaded, it leaves them to the database to handle, rather than emitting a SELECT for them.

We can also set passive_deletes='all', From the SQLAlchemy documentation:

This has the effect of entirely disabling SQLAlchemy’s behavior of setting the foreign key column to NULL, and a DELETE will be emitted for the parent row without any affect on the child row, even if the child row is present in memory.

If I try this I get the following error message, do not understand why:

can't set passive_deletes='all' in conjunction with 'delete' or 'delete-orphan' cascade

Anyway, to check and see what is happening we can either expunge the objects first, or, start a new session, select a customer and delete this customer. I used the latter. Now only a single SQL-statement is generated:

DELETE FROM customer WHERE customer.id = ?
(2,)

Of course this did not work immediately. It appears you must enable Foreign Keys when using SQLite. When you forget this, the orders of the customer are NOT deleted. It fails silently ... no errors or messages ... sigh.

SQLAlchemy Cascade Deletes or database Cascade Deletes?

From the SQLAlchemy documentation:

Database level ON DELETE CASCADE is generally much more efficient than relying upon the “cascade” delete feature of SQLAlchemy.

Fine. I like efficient. I am using PostgreSQL and am going to use database Cascade Deletes where possible. But in the end it all depends on your application. A web application typically has only a minimal number of deletes, so why bother?

The code

In case you want to try yourself, here is the code I used:

# customer-order ondelete
import sys

import sqlalchemy as sa
from sqlalchemy.orm import declarative_base, sessionmaker, relationship

from sqlalchemy.engine import Engine
from sqlalchemy import event
from sqlite3 import Connection as SQLite3Connection

# enable foreign keys. if you do not do this, ON DELETE CASCADE fails silently!
@event.listens_for(Engine, "connect")
def _set_sqlite_pragma(dbapi_connection, connection_record):
    if isinstance(dbapi_connection, SQLite3Connection):
        print('turning on foreign keys ...')
        cursor = dbapi_connection.cursor()
        cursor.execute('PRAGMA foreign_keys=ON;')
        cursor.close()

Base = declarative_base()

class Customer(Base):
    __tablename__ = 'customer'

    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(100), nullable=False)

    # relationship: orders
    orders = relationship(
        'Order',
        back_populates='customer',
        cascade='save-update, merge, delete',
        passive_deletes=True,
    )

class Order(Base):
    __tablename__ = 'order'

    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(100), nullable=False)

    # relationship: customer
    customer_id = sa.Column(sa.Integer, sa.ForeignKey('customer.id', ondelete='CASCADE'), nullable=False, index=True)
    #customer_id = sa.Column(sa.Integer, sa.ForeignKey('customer.id'), nullable=False, index=True)
    customer = relationship(
        'Customer',
        back_populates='orders',
    )

# get engine and create all
engine_echo = True
engine = sa.create_engine('sqlite:///:memory:', echo=engine_echo)
Base.metadata.drop_all(engine, checkfirst=True)
Base.metadata.create_all(engine)

# class to dump objects
class DumpItem:
    def __init__(self, model=None, many_attr_name=None):
        self.model = model
        self.many_attr_name = many_attr_name

# function to dump objects
def dump_all(title, do_not_print=False):
    lines = []
    sep_line = '-' * 60
    dump_items = [
        DumpItem(model=Customer, many_attr_name='orders'),
        DumpItem(model=Order),
    ]
    lines.append('\n{}\n{}'.format(sep_line, title))
    for dump_item in dump_items:
        stmt = sa.select(dump_item.model).order_by(dump_item.model.name)
        objs = session.execute(stmt).scalars().all()
        lines.append('{}\n{}, total {}\n{}'.format(sep_line, dump_item.model.__name__, len(objs), sep_line))
        for i, obj in enumerate(objs):
            lines.append('[{:d}] {}'.format(i, obj.name))
            try:
                many_attr = getattr(obj, dump_item.many_attr_name, None)
            except:
                continue
            for j, many_obj in enumerate(many_attr):
                lines.append('    [{}] {}'.format(j, many_obj.name))
    dump = '{}\n'.format('\n'.join(lines))
    if not do_not_print:
        print(dump)
    return dump

print('create session')
Session = sessionmaker(bind=engine)
session = Session()

print('create customers ...')
customers = [
    Customer(name='customer0'), Customer(name='customer1'),
    Customer(name='customer2'), Customer(name='customer3'),
]
session.add_all(customers)

print('create customer orders ...')
customers[1].orders = [Order(name='order0'), Order(name='order1')]
customers[2].orders = [Order(name='order2'), Order(name='order3')]

session.commit()
session.flush()

objs_before = dump_all('before delete', do_not_print=True)

# select delete with current session or delete with new session
delete_with_current_session = False

if delete_with_current_session:
    print('before delete customer1, current session ...')
    
    try:
        session.delete(customers[1])
        '''
        emitted sql:

        DELETE FROM "order" WHERE "order".id = ?
        ((1,), (2,))
        DELETE FROM customer WHERE customer.id = ?
        (2,)
        '''
        session.commit()
    except Exception as e:
        print('exception = {}, e.args = {}'.format(type(e).__name__, e.args))
        session.rollback()
        sys.exit()
    print('after delete customer1, current session ...')

else:
    print('before delete customer1, new session ...')

    print('create a new session')
    Session2 = sessionmaker(bind=engine)
    session2 = Session2()

    print('get customer1')
    stmt = sa.select(Customer).where(Customer.name == 'customer1')
    customer1 = session2.execute(stmt).scalars().first()

    try:
        session2.delete(customer1)
        '''
        emitted sql:

        DELETE FROM customer WHERE customer.id = ?
        (2,)
        '''
        session2.commit()
    except Exception as e:
        print('exception = {}, e.args = {}'.format(type(e).__name__, e.args))
        session2.rollback()
        sys.exit()
    print('after delete customer1, current session ...')

objs_after = dump_all('after delete', do_not_print=True)
print(objs_before)
print(objs_after)

And the result:

------------------------------------------------------------
before delete
------------------------------------------------------------
Customer, total 4
------------------------------------------------------------
[0] customer0
[1] customer1
    [0] order0
    [1] order1
[2] customer2
    [0] order2
    [1] order3
[3] customer3
------------------------------------------------------------
Order, total 4
------------------------------------------------------------
[0] order0
[1] order1
[2] order2
[3] order3

------------------------------------------------------------
after delete
------------------------------------------------------------
Customer, total 3
------------------------------------------------------------
[0] customer0
[1] customer2
    [0] order2
    [1] order3
[2] customer3
------------------------------------------------------------
Order, total 2
------------------------------------------------------------
[0] order2
[1] order3

Summary

As often with SQLAlchemy: You needed it, you did not get into all details, it worked, everybody happy. But when you look into it again, you start reading more, and make some cases to double check that it is working as expected.

Cascade Deletes is a difficult subject. You must make the correct decisions for your data model and the only way to do this is to fully understand what is happening.

This was the first time I fully read the 'SQLAlchemy - Cascades' page, see links below. I assumed that SQLAlchemy would make the best choice for me (and it worked, nothing wrong with that), but I was naive and did not fully understand the problem.

From the SQLAlchemy documentation:

If cascades are confusing, we’ll refer to their conclusion, stating “The sections we have just covered can be a bit confusing. However, in practice, it all works out nicely.”

Yeah sure, it all worked out nicely but please replace 'in practice' by 'after many many hours'.

Links / credits

Cascade all vs all, delete #6403
https://github.com/sqlalchemy/sqlalchemy/issues/6403

SQLAlchemy - Basic Relationship Patterns
https://docs.sqlalchemy.org/en/14/orm/basic_relationships.html

SQLAlchemy - Cascades
https://docs.sqlalchemy.org/en/14/orm/cascades.html

SQLAlchemy Cascade Delete: Clarity through Examples
https://esmithy.net/2020/06/20/sqlalchemy-cascade-delete

Read more

SQLAlchemy

Leave a comment

Comment anonymously or log in to comment.

Comments

Leave a reply

Reply anonymously or log in to reply.