A Few Django ORM Mistakes
See if you can figure out what's wrong with the code snippets below! Ask yourself: what is the problem, what effect will it have, and how can you fix it?
These examples are for Django, but probably apply to many other ORMs.
Bug 1
def create():
    with transaction.atomic():
        thing = Thing.objects.create(foo=1, bar=1)
        set_foo(thing.id)
        thing.bar = 2
        thing.save()

def set_foo(id):
    thing = Thing.objects.get(id=id)
    thing.foo = 2
    thing.save()
Hint
The save method saves all attributes.
Solution
The problem with this code is that two Python instances of the same database row exist. Here's the annotated source:
def create():
    with transaction.atomic():
        # Create database row foo=1 bar=1
        thing = Thing.objects.create(foo=1, bar=1)
        set_foo(thing.id)
        # The database row has been updated with foo=2 bar=1, but this
        # instance still has foo=1 bar=1 as it hasn't been reloaded
        thing.bar = 2
        thing.save()  # Writes foo=1 bar=2
        # The foo=2 write has been lost

def set_foo(id):
    # Look up the same Thing, but create a new instance
    thing = Thing.objects.get(id=id)
    thing.foo = 2
    thing.save()  # Writes foo=2, bar=1
The result is a single Thing with a foo of 1 and a bar of 2. A write has been lost!
Here's one possible fix:
def create():
    with transaction.atomic():
        thing = Thing.objects.create(foo=1, bar=1)
        set_foo(thing)
        thing.bar = 2
        thing.save()

def set_foo(thing):
    thing.foo = 2
    thing.save()
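Another possible fix, if set_foo must keep taking an id, is to reload the instance before touching it again. A sketch using refresh_from_db, which re-reads every field from the database:

```python
def create():
    with transaction.atomic():
        thing = Thing.objects.create(foo=1, bar=1)
        set_foo(thing.id)
        # Pick up the foo=2 written by set_foo before saving again.
        thing.refresh_from_db()
        thing.bar = 2
        thing.save()  # Writes foo=2 bar=2 — no write is lost
```

This keeps the two call sites decoupled at the cost of an extra SELECT.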
Bug 2
class Thing(Model):
    foo = ...
    bar = ...

def thing_set_foo(id, value):
    thing = Thing.objects.get(id=id)
    thing.foo = value
    thing.save()

def thing_set_bar(id, value):
    thing = Thing.objects.get(id=id)
    thing.bar = value
    thing.save()
Hint
Assume thing_set_foo and thing_set_bar can happen simultaneously.
Solution
It's possible for a thread to read from the database just before a write happens in another thread. Both threads then hold the same original row: each modifies a different field and saves, but save() writes every attribute, so whichever thread saves last overwrites the other's change with the stale value it read earlier. One of the writes is silently lost.
Here's one possible solution:
def thing_set_foo(id, value):
    thing = Thing.objects.get(id=id)
    thing.foo = value
    thing.save(update_fields=["foo"])

def thing_set_bar(id, value):
    thing = Thing.objects.get(id=id)
    thing.bar = value
    thing.save(update_fields=["bar"])
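An alternative is to skip the read-modify-write cycle entirely with a queryset update, which issues a single UPDATE statement for just that column. A sketch:

```python
def thing_set_foo(id, value):
    # UPDATE thing SET foo = %s WHERE id = %s — no stale instance involved.
    Thing.objects.filter(id=id).update(foo=value)

def thing_set_bar(id, value):
    Thing.objects.filter(id=id).update(bar=value)
```

This also saves a round trip, since there is no initial SELECT.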
Bug 3
def increment(id):
    counter = Counter.objects.get(id=id)
    counter.count = counter.count + 1
    counter.save()
Solution
This is very much like bug 2, but the twist is that the increment function can conflict with itself. If called concurrently in two threads, both may read the same value, so even though increment is called twice the counter may only go up by 1.

One way to fix this is to make the increment operation atomic. The way to do this in the Django ORM is to use F objects:
def increment(id):
    counter = Counter.objects.get(id=id)
    counter.count = F('count') + 1
    counter.save()
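The same fix can be written as a one-line queryset update. Note that after save() with an F expression, counter.count holds the expression rather than a number, so call refresh_from_db() before reading it back. A sketch:

```python
from django.db.models import F

def increment(id):
    # Executes UPDATE counter SET count = count + 1 WHERE id = %s,
    # letting the database do the arithmetic atomically.
    Counter.objects.filter(id=id).update(count=F('count') + 1)
```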
Isolation Levels
READ COMMITTED Isolation Level
This is the default for PostgreSQL. Transactions can read updates from other transactions after they have been committed.
REPEATABLE READ Isolation Level
This is the default for MySQL. A snapshot is established on the first read in the transaction, and all subsequent reads are from the snapshot.
Going forward, assume we are using MySQL in its default configuration.
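For reference, the isolation level can be changed per database in Django's settings (the rest of this article assumes the MySQL default). A sketch for MySQL, where the backend accepts an 'isolation_level' option; the database name is hypothetical:

```python
# settings.py — switch MySQL to READ COMMITTED,
# matching PostgreSQL's default behaviour.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'mydb',  # hypothetical
        'OPTIONS': {
            'isolation_level': 'read committed',
        },
    },
}
```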
Bug 4
def state_transition(id):
    with transaction.atomic():
        stateful = Stateful.objects.get(id=id)
        if stateful.state == DONE:
            raise AlreadyDone
        do_state_transition()
        stateful.state = DONE
        stateful.save()
Solution
It is possible for do_state_transition to be executed twice if the state transition is executed concurrently. This could be a problem if your state transition includes side effects!
One simple solution to this problem is to lock the object:
def state_transition(id):
    with transaction.atomic():
        stateful = Stateful.objects.select_for_update().get(id=id)
        if stateful.state == DONE:
            raise AlreadyDone
        do_state_transition()
        stateful.state = DONE
        stateful.save()
But, generally, you should try to avoid doing side effects in transactions!
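If a side effect can't be avoided, Django's transaction.on_commit hook at least keeps it from firing for a transaction that later rolls back. A sketch, where notify_done is a hypothetical side effect:

```python
from django.db import transaction

def state_transition(id):
    with transaction.atomic():
        stateful = Stateful.objects.select_for_update().get(id=id)
        if stateful.state == DONE:
            raise AlreadyDone
        stateful.state = DONE
        stateful.save()
        # Deferred until COMMIT succeeds; never runs on rollback.
        transaction.on_commit(lambda: notify_done(stateful.id))
```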
Optimistic Locking
Another way of handling this is to use optimistic locking. Instead of naively saving every value, optimistic locking issues an update with a WHERE state != DONE clause. The update first locks the database row and reads the latest state (updates do not use the snapshot) before writing. If no rows are successfully updated (i.e. the row has already transitioned), the transaction is rolled back.

There are two caveats to optimistic locking. The first is that it won't help you if your transition has side effects, and the second is that if you take other locks in the transaction you must avoid deadlocks.
This is the approach taken by the django-fsm library.
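A minimal sketch of the optimistic approach using the ORM (per the first caveat above, this only works if the transition itself is a pure state change):

```python
def state_transition(id):
    # Single UPDATE ... WHERE id = %s AND NOT (state = DONE);
    # update() returns the number of rows actually changed.
    updated = Stateful.objects.filter(id=id).exclude(state=DONE).update(state=DONE)
    if updated == 0:
        # Another caller already completed the transition.
        raise AlreadyDone
```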
Bug 5
def create_payment(collection_id, amount):
    with transaction.atomic():
        payment_collection = PaymentCollection.objects.get(id=collection_id)
        Payment.objects.create(
            amount=amount, payment_collection=payment_collection)
        payment_collection.total = (
            payment_collection.payment_set.all()
            .aggregate(total=Sum('amount'))['total'])
        payment_collection.save()
Solution
If executed concurrently, one transaction will not see the Payment newly created in the other transaction. The last write wins and the total will be inconsistent.

In addition, in MySQL this can potentially deadlock, causing your transaction to roll back entirely: creating the Payment takes a lock that blocks the other transaction's aggregation read.
Both issues can be fixed by locking the model being updated (not the payment!) at the start of the transaction:
def create_payment(collection_id, amount):
    with transaction.atomic():
        payment_collection = (PaymentCollection.objects
                              .select_for_update().get(id=collection_id))
        Payment.objects.create(
            amount=amount, payment_collection=payment_collection)
        payment_collection.total = (
            payment_collection.payment_set.all()
            .aggregate(total=Sum('amount'))['total'])
        payment_collection.save()
Or, alternatively, making the update atomic:
def create_payment(collection_id, amount):
    payment_collection = PaymentCollection.objects.get(id=collection_id)
    Payment.objects.create(amount=amount, payment_collection=payment_collection)
    with connection.cursor() as cursor:
        cursor.execute("""
            UPDATE payment_collection,
                   (SELECT payment_collection_id, SUM(amount) AS total
                    FROM payment
                    GROUP BY payment_collection_id) totals
            SET payment_collection.total = totals.total
            WHERE totals.payment_collection_id = payment_collection.id
              AND payment_collection.id = %s
        """, [payment_collection.id])
Note that this cannot be in a transaction, or the deadlock issues will remain!
In my opinion, the safest way to do this is by using a SQL view instead of storing the total. Views can be awkward to use with Django unfortunately.
CREATE VIEW payment_collection_totals AS
SELECT payment_collection_id, SUM(amount) AS total
FROM payment
GROUP BY payment_collection_id;

CREATE VIEW payment_collection_with_total AS
SELECT payment_collection.*, COALESCE(totals.total, 0) AS total
FROM payment_collection
LEFT JOIN payment_collection_totals totals
  ON (totals.payment_collection_id = payment_collection.id);
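If you go the view route, one way to read it from Django is an unmanaged model mapped onto the view. A sketch; the field definitions are hypothetical and must mirror your actual schema:

```python
from django.db import models

class PaymentCollectionWithTotal(models.Model):
    # Hypothetical fields — match whatever payment_collection has.
    name = models.CharField(max_length=100)
    total = models.DecimalField(max_digits=12, decimal_places=2)

    class Meta:
        managed = False  # Django won't create or migrate this "table"
        db_table = 'payment_collection_with_total'
```

Because rows come from the view, the total is always consistent with the payments; treat the model as read-only.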
Bug 6
def foo(id):
    with transaction.atomic():
        foo = Foo.objects.get(id=id)
        bar = Bar.objects.create(...)
        with lock():
            foo.refresh_from_db()
            # If foo.bar has already been set in another thread,
            # raise an exception and rollback the transaction
            if foo.bar:
                raise Exception
            foo.bar = bar
            foo.save()
Hint
This is a bug in MySQL, but not PostgreSQL.
Solution
This bug is a result of the REPEATABLE READ isolation level. The first read after the transaction starts establishes a snapshot, so when refresh_from_db is performed after waiting for the lock, it reads from the snapshot, not the most recent value. This means the foo.bar check is made against potentially stale data, which can cause multiple Bar rows to be created with only one of them linked to the correct Foo.
Confusingly, replacing with lock() with a select_for_update() will work on MySQL, because MySQL has a weird quirk where locked reads do not read from the snapshot. When using REPEATABLE READ with PostgreSQL, this will throw a serialization error instead.

The preferred way is either to move the lock to the top, outside of the transaction, or to use select_for_update() as follows:
def foo(id):
    with transaction.atomic():
        foo = Foo.objects.select_for_update().get(id=id)
        bar = Bar.objects.create(...)
        # If foo.bar has already been set in another thread,
        # raise an exception and rollback the transaction
        if foo.bar:
            raise Exception
        foo.bar = bar
        foo.save()
How Common Are These Bugs?
These bugs are simplified versions of bugs found in production code written by experienced Django developers. Remember, real bugs won't be as easy to spot as they are here: they can be buried deep among hundreds of lines of code.
After spending a very short time looking at a few open source Django projects, I can tell that these bugs are common. In fact, at the time of writing the Django example app contains two of them! There are probably a few reasons for this:
- Thinking about concurrency is just hard.
- Developers are not taking enough time to understand the underlying database technology.
- In most cases the lost writes will happen very rarely.
- When they do happen they are often relatively harmless.
- It's a silent failure. How would you tell if your app was losing an occasional write?
So it's your call if you want to ignore these problems. But if you decide you don't care, please don't write financial or medical software. ;)
Tips
- Remember the ORM is an in-memory cache.
- ORMs can obscure bugs. Look at the SQL!
- Avoid read-modify-write
- If you don't you'll probably need a lock.
- If it's not locked, it's not up to date.
- Lock before reads to avoid weird MySQL behaviour.
- Locking reads don't use the snapshot in MySQL.
- Prefer immutable database design if practical.
- See: Immutable Data.
- Consider using a serializable isolation level.
- You don't have to worry about locking if you use serializable transactions.
- Has other drawbacks.
- PostgreSQL implementation is nicer than the MySQL one IMO.