Database Transactions

Published on June 25, 2025 · 14 min read

A database transaction is a logical unit of work that consists of one or more database operations (such as reads, writes, updates, or deletes) that must be executed as an indivisible unit.

Transactions transform a database from one consistent state to another, ensuring that either all operations within the transaction succeed or none of them take effect.

Logical Unit: Groups related operations that represent a complete business action
State Transformation: Moves the database from one valid state to another
All-or-Nothing Execution: Either completes entirely or has no effect at all

sql

BEGIN TRANSACTION;
    UPDATE accounts SET balance = balance - 100 WHERE account_id = 'A123';
    UPDATE accounts SET balance = balance + 100 WHERE account_id = 'B456';
    INSERT INTO transaction_log (from_account, to_account, amount, timestamp)
        VALUES ('A123', 'B456', 100, NOW());
COMMIT;

ACID properties

ACID properties define the fundamental guarantees that database transactions must provide to ensure reliability and consistency:

Property	Definition	Implementation	Example
Atomicity	All operations within a transaction must complete successfully, or the entire transaction is rolled back.	Achieved through write-ahead logging (WAL) and undo/redo mechanisms. The database maintains a transaction log that records all changes before they’re applied to the actual data pages.	In a bank transfer, if debiting the source account succeeds but crediting the destination fails, atomicity ensures the debit is reversed, preventing money from disappearing.
Consistency	A transaction must bring the database from one valid state to another, maintaining all defined rules, constraints, and triggers	Types of Constraints: - Domain constraints: Data type restrictions - Key constraints: Primary and foreign key relationships - Check constraints: Custom validation rules - Triggers: Procedural consistency rules	A constraint ensuring account balances never go negative will cause a transaction to fail if it would violate this rule.
Isolation	Concurrent transactions must not interfere with each other; each transaction should execute as if it were the only transaction running on the system. Complete isolation severely impacts performance, leading to various isolation levels that balance consistency with concurrency.	Implementation Approaches: - Pessimistic: Lock-based protocols - Optimistic: Multi-Version Concurrency Control (MVCC)	In a shared booking system: – Transaction 1 (User A): SELECT availableseats FROM flights WHERE flightid = ‘XYZ123’; – Returns 1 – Transaction 2 (User B): SELECT availableseats FROM flights WHERE flightid = ‘XYZ123’; – Returns 1 – Without Isolation, both might try to book the last seat, leading to overbooking. – With Isolation (e.g., using locks on `XYZ123` during update): – Only one transaction will successfully decrement `available_seats` to 0; the other will be blocked or fail, preventing double-booking.
Durability	Once a transaction commits, its changes must persist even in the event of system failures.	Through write-ahead logging, where changes are first written to a durable transaction log before being applied to data pages. Regular checkpoints ensure efficient recovery.	After a successful online purchase: – Transaction (Purchase): INSERT INTO orders (…) VALUES (…); UPDATE productinventory SET quantity = quantity - 1 WHERE productid = ‘XYZ’; COMMIT; – If the system crashes immediately after COMMIT, upon restart, the database recovery process uses the transaction log to ensure the order record and reduced inventory are still present. The purchase is guaranteed to be permanent.

Isolation

Isolation Anomalies

Dirty Read: Reading uncommitted changes that might be rolled back

sql

-- Transaction 1
UPDATE products SET price = 50 WHERE id = 1;
-- (not yet committed)

-- Transaction 2 (Read Uncommitted)
SELECT price FROM products WHERE id = 1; -- Sees 50
-- Transaction 1 rolls back

Non-Repeatable Read: Getting different values when reading the same row twice

sql

-- Transaction 1
SELECT price FROM products WHERE id = 1; -- Returns 100

-- Transaction 2
UPDATE products SET price = 150 WHERE id = 1;
COMMIT;

-- Transaction 1
SELECT price FROM products WHERE id = 1; -- Returns 150

Phantom Read: New rows appearing that match a previous query condition

sql

-- Transaction 1
SELECT COUNT(*) FROM orders WHERE status = 'pending'; -- Returns 5

-- Transaction 2
INSERT INTO orders (status) VALUES ('pending');
COMMIT;

-- Transaction 1
SELECT COUNT(*) FROM orders WHERE status = 'pending'; -- Returns 6

Serialization anomaly(Write Skew): successfully committing a group of transactions is inconsistent with all possible orderings of running those transactions one at a time.

sql

-- Transaction 1 - Dr. Alice wants to go off-call, Read current on-call doctors, Alice checks if there are enough other doctors on call before she goes off.
SELECT COUNT(*) FROM Doctors WHERE status = 'On-Call'; -- Returns 2

-- Transaction 1: Logic Check (based on read count), If count > 1, Alice can go off-call. (2 > 1, so TRUE), at this point, imagine a delay or complex operation within Transaction A, before it attempts to update.

-- Transaction 2 - Dr. Bob also wants to go off-call, Read current on-call doctors, Bob checks if there are enough other doctors on call before he goes off.
SELECT COUNT(*) FROM Doctors WHERE status = 'On-Call'; -- Returns 2 (Still sees 2, as T1 hasn't committed)

-- Transaction 2: Logic Check (based on read count), if count > 1, Bob can go off-call. (2 > 1, so TRUE), Update Dr. Bob's status, commits
UPDATE Doctors SET status = 'Off-Call' WHERE doctor_id = 2;
COMMIT;

-- Back to Transaction 1: Update Dr. Alice's status, commits
UPDATE Doctors SET status = 'Off-Call' WHERE doctor_id = 1;
COMMIT;

-- Check consistency rule:
SELECT COUNT(*) FROM Doctors WHERE status = 'On-Call'; -- Returns 0 (Violation: 0 < 1)

Isolation Levels

Isolation levels represent different trade-offs between data consistency and system performance. The SQL standard defines four isolation levels:

Level	Behavior	Anomalies Prevented	Anomalies allowed	Use Case	Performance
Read Uncommitted	Transactions can read uncommitted changes from other transactions	-	Dirty reads, non-repeatable reads, phantom reads	Rarely used except for approximate queries where accuracy isn’t critical	Highest concurrency, lowest overhead
Read Committed	Transactions only see committed changes from other transactions	Dirty reads	Non-repeatable reads, phantom reads	Default level in many databases (PostgreSQL, Oracle)	Short-duration read locks or MVCC snapshots
Repeatable Read	Guarantees that repeated reads of the same data return the same values	Dirty reads, non-repeatable reads	Phantom reads (in some cases)	When transactions need stable views of data they’ve already read	Longer-duration read locks or consistent MVCC snapshots
Serializable	Transactions execute as if they ran sequentially	All anomalies	-	Critical operations requiring absolute consistency	Predicate locking, serializable snapshot isolation (SSI)

Locking Mechanisms

Two-Phase Locking (2PL):

Growing Phase: Acquire locks as needed
Shrinking Phase: Release all locks at transaction end
Variants: Strict 2PL (holds all locks until commit)

Lock Types:

Shared (S) Locks: Allow concurrent reads
Exclusive (X) Locks: Prevent all concurrent access

diagram

Other types of locks

Intention Locks: which do not directly lock a node, but instead denote the existence, or intent to add, a lock of the specified type lower in the node hierarchy.
Predicate Locks: Instead of locking records, lock predicates

Multi-Version Concurrency Control (MVCC)

MVCC maintains multiple versions of data to allow concurrent access without locking:

Key Concepts:

Each transaction sees a consistent snapshot of the database
Writers don’t block readers, readers don’t block writers
Old versions are garbage collected when no longer needed

Implementation Details:

code

Row Version Structure:
[Row Data | Transaction ID | Pointer to Previous Version]

Write-Ahead Logging (WAL)

WAL ensures durability and enables recovery:

Log Before Write: Changes are first written to the log
Force Log at Commit: Log records must be on disk before commit completes
Background Flushing: Data pages written to disk asynchronously

Log Record Types:

REDO: Information to replay changes
UNDO: Information to reverse changes
COMMIT/ABORT: Transaction outcome markers

diagram

Commit Protocols

Local Transactions:

Write all changes to WAL
Mark transaction as committed in log
Return success to client
Asynchronously apply changes to data pages

Distributed Transactions

Two-Phase Commit:

Phases:

Prepare Phase: All participants vote to commit or abort
Commit Phase: Coordinator instructs final outcome based on votes

diagram

Cons:

Single point of failure (Coordinator)

Three-Phase Commit:

Improve fault tolerance compared to 2PC by reducing the risk of indefinite blocking.

Phase 1 (Prepare): Same as 2PC.
Phase 2 (Pre-Commit): The coordinator, upon receiving all “yes” votes, sends a “pre-commit” message. Participants acknowledge the message. This phase ensures participants are ready to commit.
Phase 3 (Commit/Abort): If all participants acknowledge the “pre-commit” message, the coordinator sends the final “commit” or “abort” decision.

diagram

The pre-commit phase allows participants to prepare for commit, and if the coordinator fails after this phase, participants can still decide whether to commit or abort based on their pre-commit state, preventing indefinite blocking.

Database-Specific Implementations

PostgreSQL

MVCC Implementation: Stores multiple row versions inline with visibility information

Isolation: Every transaction in Postgres gets a transaction ID called XID. This includes single one statement transactions such as an insert, update, or delete, as well as explicitly wrapping a group of statements together via BEGIN - COMMIT. When a transaction starts, Postgres increments an XID and assigns it to the current transaction. Postgres also stores transaction information on every row in the system, which is used to determine whether a row is visible to the transaction or not.

WAL: Sophisticated WAL with full-page writes for torn-page protection

Unique Features:

Rich set of isolation levels including true serializable
Advanced SSI implementation detecting write skew anomalies
Powerful LISTEN/NOTIFY for transaction coordination

MySQL/`InnoDB`

MVCC Implementation: Maintains undo logs in separate table space. InnoDB uses the information in the rollback segment to perform the undo operations needed in a transaction rollback. It also uses the information to build earlier versions of a row for a consistent read. Isolation: Next-key locking to prevent phantoms

WAL: Repeatable Read with some non-standard behaviors

Features:

Gap locking for phantom prevention
Adaptive hash indexes for frequent queries
Change buffering for secondary index updates

Anti-Patterns

Long-Running Transaction Anti-Pattern

Keeping transactions open for extended periods while performing non-database operations.

sql

-- BAD: Long-running transaction
BEGIN TRANSACTION;
SELECT * FROM orders WHERE status = 'pending';
-- ... application processes data for 30 seconds ...
-- ... makes external API calls ...
-- ... sends emails ...
UPDATE orders SET status = 'processed' WHERE id = 123;
COMMIT;

Minimize transaction scope and use optimistic locking.

sql

-- GOOD: Short transaction with optimistic locking
-- 1. Read data outside transaction
SELECT id, status, version FROM orders WHERE status = 'pending';

-- 2. Process data in application

-- 3. Short transaction for update only
BEGIN TRANSACTION;
UPDATE orders
SET status = 'processed', version = version + 1
WHERE id = 123 AND version = @original_version;

IF @@ROWCOUNT = 0
    ROLLBACK;
    -- Handle concurrent modification
ELSE
    COMMIT;

Nested Transaction Trap

Attempting to use nested transactions incorrectly.

sql

-- BAD: Assuming nested transaction support
BEGIN TRANSACTION; -- Outer transaction
    INSERT INTO orders (...) VALUES (...);

    BEGIN TRANSACTION; -- This doesn't create a new transaction!
        INSERT INTO order_items (...) VALUES (...);
        -- Error occurs here
    ROLLBACK; -- This rolls back EVERYTHING, not just inner "transaction"

COMMIT; -- This will fail because transaction was already rolled back

Use savepoints for partial rollbacks.

sql

-- GOOD: Using savepoints
BEGIN TRANSACTION;
    INSERT INTO orders (...) VALUES (...);

    SAVEPOINT order_items_start;
    BEGIN TRY
        INSERT INTO order_items (...) VALUES (...);
    END TRY
    BEGIN CATCH
        ROLLBACK TO SAVEPOINT order_items_start;
        -- Handle error, transaction continues
    END CATCH

COMMIT;

Silent Autocommit Trap

Unaware of autocommit behavior leading to partial updates.

python

### BAD: Autocommit enabled, no explicit transaction
def transfer_money(from_account, to_account, amount):
    cursor.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                   (amount, from_account))

    # If this fails, the debit above is already committed!
    cursor.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
                   (amount, to_account))

Always use explicit transactions for multi-statement operations.

python

### GOOD: Explicit transaction control
def transfer_money(from_account, to_account, amount):
    try:
        cursor.execute("BEGIN")
        cursor.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                       (amount, from_account))
        cursor.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
                       (amount, to_account))
        cursor.execute("COMMIT")
    except Exception as e:
        cursor.execute("ROLLBACK")
        raise

Read-Modify-Write Race Condition

Non-atomic read-modify-write operations.

sql

-- BAD: Race condition between read and write
SELECT quantity FROM inventory WHERE product_id = 123; -- Returns 10
-- Another transaction might modify quantity here
UPDATE inventory SET quantity = 9 WHERE product_id = 123; -- Assumes quantity is still 10

Use atomic operations or proper locking.

sql

-- GOOD: Atomic update
UPDATE inventory
SET quantity = quantity - 1
WHERE product_id = 123 AND quantity >= 1;

-- Check if update succeeded
IF @@ROWCOUNT = 0
    RAISERROR('Insufficient inventory', 16, 1);

Deadlock-Prone Resource Ordering

Inconsistent resource acquisition order causing deadlocks.

sql

-- Transaction A
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1; -- Lock account 1
UPDATE accounts SET balance = balance + 100 WHERE id = 2; -- Lock account 2
COMMIT;

-- Transaction B (running concurrently)
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 50 WHERE id = 2;  -- Lock account 2
UPDATE accounts SET balance = balance + 50 WHERE id = 1;  -- Wait for account 1 (DEADLOCK!)
COMMIT;

Always acquire resources in consistent order.

sql

-- GOOD: Consistent ordering (always update lower ID first)
CREATE PROCEDURE transfer_money(@from_id INT, @to_id INT, @amount DECIMAL)
AS
BEGIN
    DECLARE @first_id INT, @second_id INT;
    DECLARE @first_amount DECIMAL, @second_amount DECIMAL;

    IF @from_id < @to_id
    BEGIN
        SET @first_id = @from_id; SET @first_amount = -@amount;
        SET @second_id = @to_id; SET @second_amount = @amount;
    END
    ELSE
    BEGIN
        SET @first_id = @to_id; SET @first_amount = @amount;
        SET @second_id = @from_id; SET @second_amount = -@amount;
    END

    BEGIN TRANSACTION;
    UPDATE accounts SET balance = balance + @first_amount WHERE id = @first_id;
    UPDATE accounts SET balance = balance + @second_amount WHERE id = @second_id;
    COMMIT;
END

Distributed Database Transactions

Eventual Consistency Pattern

Accept temporary inconsistency in favor of availability and partition tolerance.

diagram

Trade-offs:

✅ High availability and performance
✅ Scales horizontally
❌ Temporary inconsistency
❌ Complex conflict resolution

Saga Pattern

Break distributed transactions into a sequence of local transactions, each with compensating actions.

Orchestration-Based Saga

diagram

The Orchestrator directs the workflow (A → B → C).
If all succeed, the saga completes.
If any service (e.g., C) fails, the orchestrator triggers compensating actions in reverse order (B → A).
This ensures data consistency without distributed transactions.

Choreography-Based Saga

diagram

No orchestrator: Services listen to events and act accordingly.
Each service is autonomous: It knows how to continue the saga or handle failures.
Event-driven: The saga progresses through a chain of events.

Sources

https://www.cs.cmu.edu/~natassa/courses/15-721/papers/p97-astrahan.pdf
https://jimgray.azurewebsites.net/papers/thetransactionconcept.pdf
https://cs-people.bu.edu/mathan/reading-groups/papers-classics/recovery.pdf
https://learning.oreilly.com/library/view/designing-data-intensive-applications/9781491903063/
https://arxiv.org/pdf/1208.4179
https://www.postgresql.org/docs/17/wal-intro.html
https://dev.mysql.com/doc/refman/8.4/en/innodb-multi-versioning.html
https://dev.mysql.com/doc/refman/8.4/en/innodb-locking.html
https://arxiv.org/abs/2310.04601

database transactions acid

$Database Transactions

Related Notes

Database Transactions