Redo vs. Undo
Let's first go over the rules for the ARIES recovery protocol. What's the key rule? WAL (write-ahead logging). What does it determine? When data must be flushed to disk in order to guarantee two properties: Atomicity and Durability.
The rule to guarantee Atomicity: before writing a dirty page to disk, write the corresponding log records. Why? Otherwise a crash could leave uncommitted changes on disk with no record of how to undo them.
The rule to guarantee Durability: flush the transaction's log records, up to and including its commit record, to disk before reporting the transaction as committed.
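The two WAL rules above can be sketched as checks in a toy buffer manager. This is a minimal illustration, not ARIES itself: the names `Log`, `Page`, `update`, `write_page_to_disk`, and `commit` are hypothetical, the "disk" is a plain dict, and an LSN is simply a position in an in-memory list.

```python
class Log:
    def __init__(self):
        self.records = []      # all log records, in LSN order
        self.flushed_lsn = -1  # highest LSN assumed to be on stable storage

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1  # LSN = position in the log

    def flush(self, up_to_lsn):
        # Stand-in for forcing the log to disk through up_to_lsn.
        self.flushed_lsn = max(self.flushed_lsn, up_to_lsn)


class Page:
    def __init__(self):
        self.data = {}
        self.page_lsn = -1  # LSN of the latest update applied to this page


def update(log, page, txn, key, new_value):
    # Log the before- and after-image, then modify the page in memory.
    lsn = log.append(("UPDATE", txn, key, page.data.get(key), new_value))
    page.data[key] = new_value
    page.page_lsn = lsn


def write_page_to_disk(log, page, disk):
    # Rule 1: the log records describing this page must reach disk first.
    if log.flushed_lsn < page.page_lsn:
        log.flush(page.page_lsn)
    disk.update(page.data)


def commit(log, txn):
    lsn = log.append(("COMMIT", txn))
    # Rule 2: force the log through the commit record before acknowledging.
    log.flush(lsn)
    return lsn
```

Note that `commit` never touches data pages; in this sketch only the log is forced at commit time, and pages go out whenever the buffer manager decides, as long as rule 1 holds.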
Questions to work on in groups:
1. Do you need to flush the abort records? CLR records?
2. Suppose you could only record UNDO records (what the data looked like before it was modified), or only REDO records (what the data looks like after modifications). What modifications to WAL would you do? What modifications to recovery? Or is this impossible?
What's the goal of having the log? What are the key things that we need to do on recovery?
- Remove the data from uncommitted transactions
- Persist the data from committed transactions
UNDO ONLY:
- Reconstruct the active transaction table
- For every transaction that was active, undo its effects, starting from its latest undo record and working backward.
What about committed transactions?
- You don't have enough information to redo them.
- So WAL changes as follows: for every transaction, flush all of its DB changes (dirty pages) to disk before you commit.
- Do you still have to flush log tail before the corresponding dirty pages? YES!
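The undo-only recovery steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the log is a list of tuples `("UPDATE", txn, key, old_value)` and `("COMMIT", txn)`, the disk is a dict, `None` as an old value means "the key did not exist", and aborts and CLRs are ignored. The function name `undo_only_recover` is hypothetical.

```python
def undo_only_recover(log, disk):
    """Roll back every transaction that has no COMMIT record in the log."""
    # Reconstruct the active transaction table: updaters minus committers.
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    active = {rec[1] for rec in log if rec[0] == "UPDATE"} - committed

    # Scan backward so the latest change of each transaction is undone first.
    for rec in reversed(log):
        if rec[0] == "UPDATE" and rec[1] in active:
            _, _txn, key, old_value = rec
            if old_value is None:
                disk.pop(key, None)    # key was created by the transaction
            else:
                disk[key] = old_value  # restore the before-image
    return active
```

Committed transactions need no work here only because their pages were flushed before the commit record was written; the sketch relies on that modified WAL rule.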
REDO ONLY:
- Redo everything like ARIES? No, can't do that: some transactions would not have committed, and without undo records their changes could not be rolled back afterwards.
- Analysis phase: find the earliest LSN whose changes may not have made it to disk (the smallest recLSN).
- Find all transactions with a commit record after that recLSN.
- Scan forward, redo the changes for those transactions.
What about uncommitted transactions? You can't undo! So modify WAL as follows:
- Never flush data from an uncommitted transaction to disk.
- On commit, flush the log tail.
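The redo-only recovery steps above can be sketched in the same style. Assumptions: log records are `("UPDATE", txn, key, new_value)` and `("COMMIT", txn)`, the disk is a dict that never received uncommitted data, and the scan starts at the beginning of the log rather than at the recLSN found by an analysis phase. The function name `redo_only_recover` is hypothetical.

```python
def redo_only_recover(log, disk):
    """Reapply the updates of committed transactions, in log order."""
    # Only transactions with a COMMIT record are winners.
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}

    # Scan forward and install the after-images of committed transactions.
    for rec in log:
        if rec[0] == "UPDATE" and rec[1] in committed:
            _, _txn, key, new_value = rec
            disk[key] = new_value
    return committed
```

Updates from uncommitted transactions are simply skipped. That is safe only because the modified WAL rule guarantees none of their data reached disk, so there is nothing to undo.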
--------------
What are the pros and cons of REDO and UNDO in terms of performance, complexity?
UNDO:
- Flush DB changes on commit: random writes.
- On recovery, don't redo, only undo
REDO:
- Commits are fast: only flush the log
- Aborts are fast: discard the log.
- On recovery, only redo.
- Disadvantage: can't flush a page until it contains only committed data. Buffer pool can overflow, checkpoints are difficult, log may become long.