SQLAlchemy: Using Cascade Deletes to delete related objects
Cascade Deletes is much more than just adding 'all, delete' to the relationship.
Although the documentation of SQLAlchemy ORM is very good it sometimes is confusing and lacks clear examples, but maybe this is just me. About Cascade Deletes, I find the following text from the SQLAlchemy documentation mind-blowing:
Huh? What? And then after some time: WTF! Help! But I want only recipes!
But there are no recipes. The only way to make sure that we are doing the right thing is to read the documentation, read the documentation again, and create some examples that confirm that we understood.
SQLAlchemy Cascade Deletes
Cascade Deletes in SQLAlchemy can be specified in a relationship using the 'cascade'-option. The 'cascade'-option has a default value. This implies that if you were not using the 'cascade'-option and want to add to a 'cascade'-option to a relationship, you are changing (!) the default behavior. To keep default behavior, you must add 'save-update' and 'merge' to the 'cascade'-option.
The SQLAlchemy documentation then gives an example of Cascade Delete: When the parent is deleted, its children also must be deleted. In the example the Cascade Delete line is:
cascade='all, delete'
This is very confusing because 'delete' is already part of 'all'. This means we can also write:
cascade='all'
But wait, 'all' includes also 'refresh-expire' and 'expunge'. There is a warning in the documentation that 'refresh-expire' changes the default behavior and also 'expunge' changes the default behavior. This means we do not want 'all' but probably:
cascade='save-update, merge, delete'
Example: Customer, Order (One-To-Many)
When we delete a customer, we also want to delete all orders of this customer.
We have two models, Customer and Order. Customer-Order is a One-To-Many relationship. A customer can have zero or more orders and an order is part of only one customer. We create a bidirectional relationship. A customer can refer to orders using Customer.orders and an order can refer to a customer using Order.customer.
Note that in the ForeignKey Order.customer_id I added the attribute 'nullable=False'. The reason is that we do not want orders without a customer floating around.
We make the Cascade Delete work by adding the following line to the relationship in the Customer:
cascade='save-update, merge, delete'
Our model classes:
class Customer(Base):
__tablename__ = 'customer'
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String(100), nullable=False)
# relationship: orders
orders = relationship(
'Order',
back_populates='customer',
# delete customer orders when customer deleted
cascade='save-update, merge, delete'
)
class Order(Base):
__tablename__ = 'order'
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String(100), nullable=False)
# relationship: customer
customer_id = sa.Column(sa.Integer, sa.ForeignKey('customer.id'), nullable=False, index=True)
customer = relationship(
'Customer',
back_populates='orders',
)
Assuming a customer with two orders, the following SQL-statements are generated by SQLAlchemy:
DELETE FROM "order" WHERE "order".id = ?
((1,), (2,))
DELETE FROM customer WHERE customer.id = ?
(2,)
You would expect three SQL-statements, the first one selecting all orders of the customer. In my test script I created the customers and orders before the DELETE-statement meaning that the order objects are in the session. SQLAlchemy knows this, and decided there is no need to get them from the database.
Now let's change the 'cascade'-option: remove 'delete' and add 'delete-orphan', that also deletes any children that were removed from the parent, even if the parent is not deleted. Can probably not be used without 'delete'. Let's see what happens.
cascade='save-update, merge, delete-orphan',
The SQL-statements and result generated by SQLAlchemy:
UPDATE "order" SET customer_id=? WHERE "order".id = ?
((None, 1), (None, 2))
exception = IntegrityError, e.args = ('(sqlite3.IntegrityError) NOT NULL constraint failed: order.customer_id',)
SQLAlchemy makes an attempt to remove the orders from the customer by updating the ForeignKeys to NULL, but that is not allowed because the ForeignKey includes 'nullable=False'.
Finally, let's create the 'cascade'-option with both 'delete' and 'delete-orphan':
cascade='save-update, merge, delete, delete-orphan',
The SQL-statements generated by SQLAlchemy:
DELETE FROM "order" WHERE "order".id = ?
((1,), (2,))
DELETE FROM customer WHERE customer.id = ?
(2,)
This is exactly the same as the 'delete'-only case.
Database object deletes using ForeignKey ON DELETE CASCADE
Many database engines, but not all, support the 'ON DELETE CASCADE' action: Each row in the child table associated with the deleted parent row, is also deleted.
SQLAlchemy has support for this. In addition to the 'cascade'-option described above we must add:
- passive_deletes=True, to the relationship in the parent
- ondelete='CASCADE', to the ForeignKey in the child
Customer and Order model classes after making these changes:
class Customer(Base):
__tablename__ = 'customer'
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String(100), nullable=False)
# relationship: orders
orders = relationship(
'Order',
back_populates='customer',
# delete customer orders when customer deleted
cascade='save-update, merge, delete',
passive_deletes=True,
)
class Order(Base):
__tablename__ = 'order'
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String(100), nullable=False)
# relationship: customer
customer_id = sa.Column(sa.Integer, sa.ForeignKey('customer.id', ondelete='CASCADE'), nullable=False, index=True)
customer = relationship(
'Customer',
back_populates='orders',
)
If you try this you may find that the generated SQL is exactly the same as above. Meaning two SQL-statements, one to delete the orders and one to delete the customer (and optionally a third SQL-statement if the orders are not in the session). The reason is that the objects to be deleted are already in the current session. From the documentation:
We can also set passive_deletes='all', From the SQLAlchemy documentation:
If I try this I get the following error message, do not understand why:
can't set passive_deletes='all' in conjunction with 'delete' or 'delete-orphan' cascade
Anyway, to check and see what is happening we can either expunge the objects first, or, start a new session, select a customer and delete this customer. I used the latter. Now only a single SQL-statement is generated:
DELETE FROM customer WHERE customer.id = ?
(2,)
Of course this did not work immediately. It appears you must enable Foreign Keys when using SQLite. When you forget this, the orders of the customer are NOT deleted. It fails silently ... no errors or messages ... sigh.
SQLAlchemy Cascade Deletes or database Cascade Deletes?
From the SQLAlchemy documentation:
Fine. I like efficient. I am using PostgreSQL and am going to use database Cascade Deletes where possible. But in the end it all depends on your application. A web application typically has only a minimal number of deletes, so why bother?
The code
In case you want to try yourself, here is the code I used:
# customer-order ondelete
import sys
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base, sessionmaker, relationship
from sqlalchemy.engine import Engine
from sqlalchemy import event
from sqlite3 import Connection as SQLite3Connection
# enable foreign keys. if you do not do this, ON DELETE CASCADE fails silently!
@event.listens_for(Engine, "connect")
def _set_sqlite_pragma(dbapi_connection, connection_record):
if isinstance(dbapi_connection, SQLite3Connection):
print('turning on foreign keys ...')
cursor = dbapi_connection.cursor()
cursor.execute('PRAGMA foreign_keys=ON;')
cursor.close()
Base = declarative_base()
class Customer(Base):
__tablename__ = 'customer'
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String(100), nullable=False)
# relationship: orders
orders = relationship(
'Order',
back_populates='customer',
cascade='save-update, merge, delete',
passive_deletes=True,
)
class Order(Base):
__tablename__ = 'order'
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String(100), nullable=False)
# relationship: customer
customer_id = sa.Column(sa.Integer, sa.ForeignKey('customer.id', ondelete='CASCADE'), nullable=False, index=True)
#customer_id = sa.Column(sa.Integer, sa.ForeignKey('customer.id'), nullable=False, index=True)
customer = relationship(
'Customer',
back_populates='orders',
)
# get engine and create all
engine_echo = True
engine = sa.create_engine('sqlite:///:memory:', echo=engine_echo)
Base.metadata.drop_all(engine, checkfirst=True)
Base.metadata.create_all(engine)
# class to dump objects
class DumpItem:
def __init__(self, model=None, many_attr_name=None):
self.model = model
self.many_attr_name = many_attr_name
# function to dump objects
def dump_all(title, do_not_print=False):
lines = []
sep_line = '-' * 60
dump_items = [
DumpItem(model=Customer, many_attr_name='orders'),
DumpItem(model=Order),
]
lines.append('\n{}\n{}'.format(sep_line, title))
for dump_item in dump_items:
stmt = sa.select(dump_item.model).order_by(dump_item.model.name)
objs = session.execute(stmt).scalars().all()
lines.append('{}\n{}, total {}\n{}'.format(sep_line, dump_item.model.__name__, len(objs), sep_line))
for i, obj in enumerate(objs):
lines.append('[{:d}] {}'.format(i, obj.name))
try:
many_attr = getattr(obj, dump_item.many_attr_name, None)
except:
continue
for j, many_obj in enumerate(many_attr):
lines.append(' [{}] {}'.format(j, many_obj.name))
dump = '{}\n'.format('\n'.join(lines))
if not do_not_print:
print(dump)
return dump
print('create session')
Session = sessionmaker(bind=engine)
session = Session()
print('create customers ...')
customers = [
Customer(name='customer0'), Customer(name='customer1'),
Customer(name='customer2'), Customer(name='customer3'),
]
session.add_all(customers)
print('create customer orders ...')
customers[1].orders = [Order(name='order0'), Order(name='order1')]
customers[2].orders = [Order(name='order2'), Order(name='order3')]
session.commit()
session.flush()
objs_before = dump_all('before delete', do_not_print=True)
# select delete with current session or delete with new session
delete_with_current_session = False
if delete_with_current_session:
print('before delete customer1, current session ...')
try:
session.delete(customers[1])
'''
emitted sql:
DELETE FROM "order" WHERE "order".id = ?
((1,), (2,))
DELETE FROM customer WHERE customer.id = ?
(2,)
'''
session.commit()
except Exception as e:
print('exception = {}, e.args = {}'.format(type(e).__name__, e.args))
session.rollback()
sys.exit()
print('after delete customer1, current session ...')
else:
print('before delete customer1, new session ...')
print('create a new session')
Session2 = sessionmaker(bind=engine)
session2 = Session2()
print('get customer1')
stmt = sa.select(Customer).where(Customer.name == 'customer1')
customer1 = session2.execute(stmt).scalars().first()
try:
session2.delete(customer1)
'''
emitted sql:
DELETE FROM customer WHERE customer.id = ?
(2,)
'''
session2.commit()
except Exception as e:
print('exception = {}, e.args = {}'.format(type(e).__name__, e.args))
session2.rollback()
sys.exit()
print('after delete customer1, current session ...')
objs_after = dump_all('after delete', do_not_print=True)
print(objs_before)
print(objs_after)
And the result:
------------------------------------------------------------
before delete
------------------------------------------------------------
Customer, total 4
------------------------------------------------------------
[0] customer0
[1] customer1
[0] order0
[1] order1
[2] customer2
[0] order2
[1] order3
[3] customer3
------------------------------------------------------------
Order, total 4
------------------------------------------------------------
[0] order0
[1] order1
[2] order2
[3] order3
------------------------------------------------------------
after delete
------------------------------------------------------------
Customer, total 3
------------------------------------------------------------
[0] customer0
[1] customer2
[0] order2
[1] order3
[2] customer3
------------------------------------------------------------
Order, total 2
------------------------------------------------------------
[0] order2
[1] order3
Summary
As often with SQLAlchemy: You needed it, you did not get into all details, it worked, everybody happy. But when you look into it again, you start reading more, and make some cases to double check that it is working as expected.
Cascade Deletes is a difficult subject. You must make the correct decisions for your data model and the only way to do this is to fully understand what is happening.
This was the first time I fully read the 'SQLAlchemy - Cascades' page, see links below. I assumed that SQLAlchemy would make the best choice for me (and it worked, nothing wrong with that), but I was naive and did not fully understand the problem.
From the SQLAlchemy documentation:
Yeah sure, it all worked out nicely but please replace 'in practice' by 'after many many hours'.
Links / credits
Cascade all vs all, delete #6403
https://github.com/sqlalchemy/sqlalchemy/issues/6403
SQLAlchemy - Basic Relationship Patterns
https://docs.sqlalchemy.org/en/14/orm/basic_relationships.html
SQLAlchemy - Cascades
https://docs.sqlalchemy.org/en/14/orm/cascades.html
SQLAlchemy Cascade Delete: Clarity through Examples
https://esmithy.net/2020/06/20/sqlalchemy-cascade-delete
Read more
SQLAlchemy
Leave a comment
Comment anonymously or log in to comment.
Comments (3)
Leave a reply
Reply anonymously or log in to reply.
good one,might be better if you had used declarative mapping
Confusion - I have no better words for this. As a database/SQLAlchemy newbie I am struggling to understand what is going on so "trial and error/not what I want" method is my best friend:)
Very nice article, thanks for this!
Recent
Most viewed
- Using PyInstaller and Cython to create a Python executable
- Reducing page response times of a Flask SQLAlchemy website
- Using Python's pyOpenSSL to verify SSL certificates downloaded from a host
- Connect to a service on a Docker host from a Docker container
- Using UUIDs instead of Integer Autoincrement Primary Keys with SQLAlchemy and MariaDb
- SQLAlchemy: Using Cascade Deletes to delete related objects