Database Transactions Deep Dive

Published on June 21, 2025

A database transaction is a logical unit of work that consists of one or more database operations (such as reads, writes, updates, or deletes) that must be executed as an indivisible unit.

Transactions transform a database from one consistent state to another, ensuring that either all operations within the transaction succeed or none of them take effect.

  • Logical Unit: Groups related operations that represent a complete business action
  • State Transformation: Moves the database from one valid state to another
  • All-or-Nothing Execution: Either completes entirely or has no effect at all
BEGIN TRANSACTION;
    UPDATE accounts SET balance = balance - 100 WHERE account_id = 'A123';
    UPDATE accounts SET balance = balance + 100 WHERE account_id = 'B456';
    INSERT INTO transaction_log (from_account, to_account, amount, timestamp)
        VALUES ('A123', 'B456', 100, NOW());
COMMIT;

ACID properties

ACID properties define the fundamental guarantees that database transactions must provide to ensure reliability and consistency:

PropertyDefinitionImplementationExample
AtomicityAll operations within a transaction must complete successfully, or the entire transaction is rolled back.Achieved through write-ahead logging (WAL) and undo/redo mechanisms. The database maintains a transaction log that records all changes before they’re applied to the actual data pages.In a bank transfer, if debiting the source account succeeds but crediting the destination fails, atomicity ensures the debit is reversed, preventing money from disappearing.
ConsistencyA transaction must bring the database from one valid state to another, maintaining all defined rules, constraints, and triggersTypes of Constraints:
- Domain constraints: Data type restrictions
- Key constraints: Primary and foreign key relationships
- Check constraints: Custom validation rules
- Triggers: Procedural consistency rules
A constraint ensuring account balances never go negative will cause a transaction to fail if it would violate this rule.
IsolationConcurrent transactions must not interfere with each other; each transaction should execute as if it were the only transaction running on the system.

Complete isolation severely impacts performance, leading to various isolation levels that balance consistency with concurrency.
Implementation Approaches:

- Pessimistic: Lock-based protocols
- Optimistic: Multi-Version Concurrency Control (MVCC)
In a shared booking system:

Transaction 1 (User A):
SELECT available_seats FROM flights WHERE flight_id = ‘XYZ123’; — Returns 1

Transaction 2 (User B):
SELECT available_seats FROM flights WHERE flight_id = ‘XYZ123’; — Returns 1

— Without Isolation, both might try to book the last seat, leading to overbooking.

With Isolation (e.g., using locks on XYZ123 during update):
— Only one transaction will successfully decrement available_seats to 0; the other will be blocked or fail, preventing double-booking.
DurabilityOnce a transaction commits, its changes must persist even in the event of system failures.Through write-ahead logging, where changes are first written to a durable transaction log before being applied to data pages. Regular checkpoints ensure efficient recovery.After a successful online purchase:

Transaction (Purchase):
INSERT INTO orders (…) VALUES (…);
UPDATE product_inventory SET quantity = quantity - 1 WHERE product_id = ‘XYZ’;
COMMIT;

— If the system crashes immediately after COMMIT, upon restart, the database recovery process uses the transaction log to ensure the order record and reduced inventory are still present. The purchase is guaranteed to be permanent.

Isolation

Isolation Anomalies

Dirty Read: Reading uncommitted changes that might be rolled back

-- Transaction 1
UPDATE products SET price = 50 WHERE id = 1;
-- (not yet committed)

-- Transaction 2 (Read Uncommitted)
SELECT price FROM products WHERE id = 1; -- Sees 50
-- Transaction 1 rolls back

Non-Repeatable Read: Getting different values when reading the same row twice

-- Transaction 1
SELECT price FROM products WHERE id = 1; -- Returns 100

-- Transaction 2
UPDATE products SET price = 150 WHERE id = 1;
COMMIT;

-- Transaction 1
SELECT price FROM products WHERE id = 1; -- Returns 150

Phantom Read: New rows appearing that match a previous query condition

-- Transaction 1
SELECT COUNT(*) FROM orders WHERE status = 'pending'; -- Returns 5

-- Transaction 2
INSERT INTO orders (status) VALUES ('pending');
COMMIT;

-- Transaction 1
SELECT COUNT(*) FROM orders WHERE status = 'pending'; -- Returns 6

Serialization anomaly(Write Skew): successfully committing a group of transactions is inconsistent with all possible orderings of running those transactions one at a time.

-- Transaction 1 - Dr. Alice wants to go off-call, Read current on-call doctors, Alice checks if there are enough other doctors on call before she goes off.
SELECT COUNT(*) FROM Doctors WHERE status = 'On-Call'; -- Returns 2

-- Transaction 1: Logic Check (based on read count), If count > 1, Alice can go off-call. (2 > 1, so TRUE), at this point, imagine a delay or complex operation within Transaction A, before it attempts to update.

-- Transaction 2 - Dr. Bob also wants to go off-call, Read current on-call doctors, Bob checks if there are enough other doctors on call before he goes off.
SELECT COUNT(*) FROM Doctors WHERE status = 'On-Call'; -- Returns 2 (Still sees 2, as T1 hasn't committed)

-- Transaction 2: Logic Check (based on read count), if count > 1, Bob can go off-call. (2 > 1, so TRUE), Update Dr. Bob's status, commits
UPDATE Doctors SET status = 'Off-Call' WHERE doctor_id = 2;
COMMIT;

-- Back to Transaction 1: Update Dr. Alice's status, commits
UPDATE Doctors SET status = 'Off-Call' WHERE doctor_id = 1;
COMMIT;

-- Check consistency rule:
SELECT COUNT(*) FROM Doctors WHERE status = 'On-Call'; -- Returns 0 (Violation: 0 < 1)

Isolation Levels

Isolation levels represent different trade-offs between data consistency and system performance. The SQL standard defines four isolation levels:

LevelBehaviorAnomalies PreventedAnomalies allowedUse CasePerformance
Read UncommittedTransactions can read uncommitted changes from other transactions-Dirty reads, non-repeatable reads, phantom readsRarely used except for approximate queries where accuracy isn’t criticalHighest concurrency, lowest overhead
Read CommittedTransactions only see committed changes from other transactionsDirty readsNon-repeatable reads, phantom readsDefault level in many databases (PostgreSQL, Oracle)Short-duration read locks or MVCC snapshots
Repeatable ReadGuarantees that repeated reads of the same data return the same valuesDirty reads, non-repeatable readsPhantom reads (in some cases)When transactions need stable views of data they’ve already readLonger-duration read locks or consistent MVCC snapshots
SerializableTransactions execute as if they ran sequentiallyAll anomalies-Critical operations requiring absolute consistencyPredicate locking, serializable snapshot isolation (SSI)

Locking Mechanisms

Two-Phase Locking (2PL):

  • Growing Phase: Acquire locks as needed
  • Shrinking Phase: Release all locks at transaction end
  • Variants: Strict 2PL (holds all locks until commit)

Lock Types:

  • Shared (S) Locks: Allow concurrent reads
  • Exclusive (X) Locks: Prevent all concurrent access
Two-phase locking diagram showing growing and shrinking phases with shared and exclusive locks

Other types of locks

  • Intention Locks: which do not directly lock a node, but instead denote the existence, or intent to add, a lock of the specified type lower in the node hierarchy.
  • Predicate Locks: Instead of locking records, lock predicates

Multi-Version Concurrency Control (MVCC)

MVCC maintains multiple versions of data to allow concurrent access without locking:

Key Concepts:

  • Each transaction sees a consistent snapshot of the database
  • Writers don’t block readers, readers don’t block writers
  • Old versions are garbage collected when no longer needed

Implementation Details:

Row Version Structure:
[Row Data | Transaction ID | Pointer to Previous Version]

Write-Ahead Logging (WAL)

WAL ensures durability and enables recovery:

  1. Log Before Write: Changes are first written to the log
  2. Force Log at Commit: Log records must be on disk before commit completes
  3. Background Flushing: Data pages written to disk asynchronously

Log Record Types:

  • REDO: Information to replay changes
  • UNDO: Information to reverse changes
  • COMMIT/ABORT: Transaction outcome markers
Write Ahead log
Write Ahead log during restart

Commit Protocols

Local Transactions:

  1. Write all changes to WAL
  2. Mark transaction as committed in log
  3. Return success to client
  4. Asynchronously apply changes to data pages

Distributed Transactions

Two-Phase Commit:

Phases:

  1. Prepare Phase: All participants vote to commit or abort
  2. Commit Phase: Coordinator instructs final outcome based on votes
Two-phase commit protocol diagram showing coordinator and participant interactions during prepare and commit phases

Cons:

  • Single point of failure (Coordinator)
Three-Phase Commit:

Improve fault tolerance compared to 2PC by reducing the risk of indefinite blocking.

  • Phase 1 (Prepare): Same as 2PC.
  • Phase 2 (Pre-Commit): The coordinator, upon receiving all “yes” votes, sends a “pre-commit” message. Participants acknowledge the message. This phase ensures participants are ready to commit.
  • Phase 3 (Commit/Abort): If all participants acknowledge the “pre-commit” message, the coordinator sends the final “commit” or “abort” decision.
Three-phase commit protocol diagram showing prepare, pre-commit, and commit phases for improved fault tolerance

The pre-commit phase allows participants to prepare for commit, and if the coordinator fails after this phase, participants can still decide whether to commit or abort based on their pre-commit state, preventing indefinite blocking.

Database-Specific Implementations

PostgreSQL

MVCC Implementation: Stores multiple row versions inline with visibility information

Isolation: Every transaction in Postgres gets a transaction ID called XID. This includes single one statement transactions such as an insert, update, or delete, as well as explicitly wrapping a group of statements together via BEGIN - COMMIT. When a transaction starts, Postgres increments an XID and assigns it to the current transaction. Postgres also stores transaction information on every row in the system, which is used to determine whether a row is visible to the transaction or not.

WAL: Sophisticated WAL with full-page writes for torn-page protection

Unique Features:

  • Rich set of isolation levels including true serializable
  • Advanced SSI implementation detecting write skew anomalies
  • Powerful LISTEN/NOTIFY for transaction coordination

More reading:

  1. https://arxiv.org/pdf/1208.4179
  2. https://www.postgresql.org/docs/17/wal-intro.html

MySQL/InnoDB

MVCC Implementation: Maintains undo logs in separate table space. InnoDB uses the information in the rollback segment to perform the undo operations needed in a transaction rollback. It also uses the information to build earlier versions of a row for a consistent read.
Isolation: Next-key locking to prevent phantoms

WAL: Repeatable Read with some non-standard behaviors

Features:

  • Gap locking for phantom prevention
  • Adaptive hash indexes for frequent queries
  • Change buffering for secondary index updates

More reading:

  1. https://dev.mysql.com/doc/refman/8.4/en/innodb-multi-versioning.html
  2. https://dev.mysql.com/doc/refman/8.4/en/innodb-locking.html

Common Transaction Anti-Patterns and How to Avoid Them

Long-Running Transaction Anti-Pattern

Keeping transactions open for extended periods while performing non-database operations.

-- BAD: Long-running transaction
BEGIN TRANSACTION;
SELECT * FROM orders WHERE status = 'pending';
-- ... application processes data for 30 seconds ...
-- ... makes external API calls ...
-- ... sends emails ...
UPDATE orders SET status = 'processed' WHERE id = 123;
COMMIT;

Minimize transaction scope and use optimistic locking.

-- GOOD: Short transaction with optimistic locking
-- 1. Read data outside transaction
SELECT id, status, version FROM orders WHERE status = 'pending';

-- 2. Process data in application

-- 3. Short transaction for update only
BEGIN TRANSACTION;
UPDATE orders
SET status = 'processed', version = version + 1
WHERE id = 123 AND version = @original_version;

IF @@ROWCOUNT = 0
    ROLLBACK;
    -- Handle concurrent modification
ELSE
    COMMIT;

Nested Transaction Trap

Attempting to use nested transactions incorrectly.

-- BAD: Assuming nested transaction support
BEGIN TRANSACTION; -- Outer transaction
    INSERT INTO orders (...) VALUES (...);

    BEGIN TRANSACTION; -- This doesn't create a new transaction!
        INSERT INTO order_items (...) VALUES (...);
        -- Error occurs here
    ROLLBACK; -- This rolls back EVERYTHING, not just inner "transaction"

COMMIT; -- This will fail because transaction was already rolled back

Use savepoints for partial rollbacks.

-- GOOD: Using savepoints
BEGIN TRANSACTION;
    INSERT INTO orders (...) VALUES (...);

    SAVEPOINT order_items_start;
    BEGIN TRY
        INSERT INTO order_items (...) VALUES (...);
    END TRY
    BEGIN CATCH
        ROLLBACK TO SAVEPOINT order_items_start;
        -- Handle error, transaction continues
    END CATCH

COMMIT;

Silent Autocommit Trap

Unaware of autocommit behavior leading to partial updates.

### BAD: Autocommit enabled, no explicit transaction
def transfer_money(from_account, to_account, amount):
    cursor.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                   (amount, from_account))

    # If this fails, the debit above is already committed!
    cursor.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
                   (amount, to_account))

Always use explicit transactions for multi-statement operations.

### GOOD: Explicit transaction control
def transfer_money(from_account, to_account, amount):
    try:
        cursor.execute("BEGIN")
        cursor.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                       (amount, from_account))
        cursor.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
                       (amount, to_account))
        cursor.execute("COMMIT")
    except Exception as e:
        cursor.execute("ROLLBACK")
        raise

Read-Modify-Write Race Condition

Non-atomic read-modify-write operations.

-- BAD: Race condition between read and write
SELECT quantity FROM inventory WHERE product_id = 123; -- Returns 10
-- Another transaction might modify quantity here
UPDATE inventory SET quantity = 9 WHERE product_id = 123; -- Assumes quantity is still 10

Use atomic operations or proper locking.

-- GOOD: Atomic update
UPDATE inventory
SET quantity = quantity - 1
WHERE product_id = 123 AND quantity >= 1;

-- Check if update succeeded
IF @@ROWCOUNT = 0
    RAISERROR('Insufficient inventory', 16, 1);

Deadlock-Prone Resource Ordering

Inconsistent resource acquisition order causing deadlocks.

-- Transaction A
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1; -- Lock account 1
UPDATE accounts SET balance = balance + 100 WHERE id = 2; -- Lock account 2
COMMIT;

-- Transaction B (running concurrently)
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 50 WHERE id = 2;  -- Lock account 2
UPDATE accounts SET balance = balance + 50 WHERE id = 1;  -- Wait for account 1 (DEADLOCK!)
COMMIT;

Always acquire resources in consistent order.

-- GOOD: Consistent ordering (always update lower ID first)
CREATE PROCEDURE transfer_money(@from_id INT, @to_id INT, @amount DECIMAL)
AS
BEGIN
    DECLARE @first_id INT, @second_id INT;
    DECLARE @first_amount DECIMAL, @second_amount DECIMAL;

    IF @from_id < @to_id
    BEGIN
        SET @first_id = @from_id; SET @first_amount = -@amount;
        SET @second_id = @to_id; SET @second_amount = @amount;
    END
    ELSE
    BEGIN
        SET @first_id = @to_id; SET @first_amount = @amount;
        SET @second_id = @from_id; SET @second_amount = -@amount;
    END

    BEGIN TRANSACTION;
    UPDATE accounts SET balance = balance + @first_amount WHERE id = @first_id;
    UPDATE accounts SET balance = balance + @second_amount WHERE id = @second_id;
    COMMIT;
END

8. Modern Distributed Database Transaction Approaches

8.1. Eventual Consistency Pattern

Accept temporary inconsistency in favor of availability and partition tolerance.

Eventual consistency pattern diagram showing how data synchronizes across distributed systems over time

Trade-offs:

  • ✅ High availability and performance
  • ✅ Scales horizontally
  • ❌ Temporary inconsistency
  • ❌ Complex conflict resolution

8.2. Saga Pattern

Break distributed transactions into a sequence of local transactions, each with compensating actions.

8.2.1. Orchestration-Based Saga

Saga orchestration pattern diagram showing centralized coordinator managing distributed transaction workflow
  • The Orchestrator directs the workflow (A → B → C).
  • If all succeed, the saga completes.
  • If any service (e.g., C) fails, the orchestrator triggers compensating actions in reverse order (B → A).
  • This ensures data consistency without distributed transactions.

8.2.2. Choreography-Based Saga

Saga choreography pattern diagram showing event-driven distributed transaction coordination without central orchestrator
  • No orchestrator: Services listen to events and act accordingly.
  • Each service is autonomous: It knows how to continue the saga or handle failures.
  • Event-driven: The saga progresses through a chain of events.

Sources

  1. https://www.cs.cmu.edu/~natassa/courses/15-721/papers/p97-astrahan.pdf
  2. https://jimgray.azurewebsites.net/papers/thetransactionconcept.pdf
  3. https://cs-people.bu.edu/mathan/reading-groups/papers-classics/recovery.pdf
  4. https://learning.oreilly.com/library/view/designing-data-intensive-applications/9781491903063/
  5. https://arxiv.org/pdf/1208.4179
  6. https://www.postgresql.org/docs/17/wal-intro.html
  7. https://dev.mysql.com/doc/refman/8.4/en/innodb-multi-versioning.html
  8. https://dev.mysql.com/doc/refman/8.4/en/innodb-locking.html
  9. https://arxiv.org/abs/2310.04601