Lectures - 11 Transactions Logging
Exam
Grades
Feedback Themes
- TAs were ultra-responsive during the “24 hours”
- Test was too long. We’ll recalibrate for Finals
- Most liked 24 hour flex time
UPDATE Product
SET Price = Price - 1.99
WHERE pname = 'Gizmo'
DELETE FROM Product
WHERE price <= 0.99
How?
Example: Mobile Game (Report & Share), DB v0
Real-Time, DBMS, Business/Product Analysis
Q1: 1000 users/sec?
Q2: Offline?
Q3: Support v1, v1’ versions?
Q4: Which user cohorts?
Q5: Next features to build? Experiments to run?
Q6: Predict ads demand?
Q7: How to model/evolve game data?
Q8: How to scale to millions of users?
Q9: When machines die, restore game state gracefully?
(Recap lectures)
Today’s Lecture
2. Transactions
4. Logging
Example: Unpack ATM DB Transaction
Order 1: Read Balance → Give money → Update Balance
vs
Order 2: Read Balance → Update Balance → Give money
Visa does > 60,000 TXNs/sec with users & merchants
Want your $4 Starbucks transaction to wait for a stranger’s $10k bet in Las Vegas?
⇒ Transactions can (1) be quick or take a long time, (2) be unrelated to you
Transactions are at the core of
-- payment, stock market, banks, ticketing
-- Gmail, Google Docs (e.g., multiple people editing)
Example: Monthly bank interest transaction ‘T-Monthly-423’
[Figure: Money table before the run, and Money table @4:29 am day+1]
Monthly Interest 10%
4:28 am: Starts run on 100M bank accounts
Takes 24 hours to run
UPDATE Money
SET Balance = Balance * 1.1
Example: Monthly bank interest transaction
[Figure: Money table before the run, and Money table @4:29 am day+1]
Cost to update all data
100M bank accounts → 100M seeks? (worst case)
Problem 1: SLOW :(
Example: Monthly bank interest transaction ‘T-Monthly-423’, with crash
[Figure: Money table before the run, and Money table @10:45 am with unknown (??) values]
Monthly Interest 10%
4:28 am: Starts run on 100M bank accounts
Takes 24 hours to run
Network outage at 10:29 am
System access at 10:45 am
4002 deposited $20 at 10:45 am
Case 1: T-Monthly-423 crashed
Case 2: T-Monthly-423 completed
Problem 2: Wrong :(
Primary data structures/algorithms: LOGS, LOCKS
Big Scale
Roadmap
?????
Today’s Lecture
1. Why Transactions?
2. Properties of Transactions: ACID
3. Logging
Transactions: Basic Definition
START TRANSACTION
UPDATE Product
SET Price = Price - 1.99
WHERE pname = 'Gizmo'
COMMIT
Transactions in SQL
START TRANSACTION
UPDATE Bank SET amount = amount - 100
WHERE name = 'Bob'
UPDATE Bank SET amount = amount + 100
WHERE name = 'Joe'
COMMIT
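A minimal sketch of the transfer above, run against an in-memory SQLite database from Python. The table layout and the starting balances are assumptions made for illustration, and SQLite spells START TRANSACTION as BEGIN.

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode, so the BEGIN and
# COMMIT below are issued explicitly, mirroring the SQL on the slide.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Bank (name TEXT PRIMARY KEY, amount REAL)")
conn.execute("INSERT INTO Bank VALUES ('Bob', 500), ('Joe', 200)")  # assumed balances

conn.execute("BEGIN")  # SQLite's spelling of START TRANSACTION
conn.execute("UPDATE Bank SET amount = amount - 100 WHERE name = 'Bob'")
conn.execute("UPDATE Bank SET amount = amount + 100 WHERE name = 'Joe'")
conn.execute("COMMIT")

balances = dict(conn.execute("SELECT name, amount FROM Bank"))
print(balances)
```

Both updates become visible together at COMMIT; the total across the two accounts is unchanged.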
Motivation for Transactions
Grouping user actions (reads & writes) into Transactions helps with two goals:
Next lecture
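Client 1:
INSERT INTO CheapProduct(name, price)
SELECT pname, price
FROM Product
WHERE price <= 0.99
-- Crash / abort!
DELETE FROM Product
WHERE price <= 0.99

Client 1:
START TRANSACTION
INSERT INTO CheapProduct(name, price)
SELECT pname, price
FROM Product
WHERE price <= 0.99
DELETE FROM Product
WHERE price <= 0.99
COMMIT
Now works like a charm: we’ll see how / why next lecture…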
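The crash / abort scenario can be sketched with SQLite from Python. This is only an illustration: the sample rows are assumed, and the "crash" is simulated with an exception followed by ROLLBACK, which models an abort rather than a real machine failure.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/ROLLBACK
conn.execute("CREATE TABLE Product (pname TEXT, price REAL)")
conn.execute("CREATE TABLE CheapProduct (name TEXT, price REAL)")
conn.execute("INSERT INTO Product VALUES ('Gizmo', 0.50), ('Gadget', 9.99)")  # assumed rows

try:
    conn.execute("BEGIN")
    conn.execute(
        "INSERT INTO CheapProduct(name, price) "
        "SELECT pname, price FROM Product WHERE price <= 0.99"
    )
    # Hypothetical failure point: the DELETE that should follow never runs.
    raise RuntimeError("simulated crash / abort")
except RuntimeError:
    conn.execute("ROLLBACK")  # undo the half-finished transaction

cheap_rows = conn.execute("SELECT COUNT(*) FROM CheapProduct").fetchone()[0]
product_rows = conn.execute("SELECT COUNT(*) FROM Product").fetchone()[0]
print(cheap_rows, product_rows)
```

After the rollback, neither table shows any effect of the interrupted transaction, so no product ends up copied without being deleted.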
3. Properties of Transactions
What you will learn about in this section
1. Atomicity
2. Consistency
3. Isolation
4. Durability
ACID: Atomicity
Conceptually,
• similar to OS “sandboxes”
• E.g. TXNs can’t observe each other’s “partial updates”
ACID: Durability
• Atomic
• State shows either all the effects of TXN, or none of them
• Consistent
• TXN moves from a state where integrity holds, to another where integrity holds
• Isolated
• Effect of TXNs is the same as TXNs running one after another
• Durable
• Once a TXN has committed, its effects remain in the database
A Note: ACID is one popular option!
Idea:
• Log consists of an ordered list of Update Records
• Log record contains UNDO information for every update!
<TransactionID, &reference (e.g., key), old value, new value>
What does the DB do?
• Owns the log “service” for all applications/transactions.
• Appends to the log. Flushes when necessary (force-writes to disk)
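A rough sketch of such an update record and an append-only log in Python. The class names, field names, JSON-lines file format, and the key "acct:3001" are all assumptions for illustration; a real DBMS uses its own binary log format.

```python
import json
import os
import tempfile
from dataclasses import dataclass, asdict


@dataclass
class UpdateRecord:
    txn_id: str       # TransactionID
    ref: str          # &reference, e.g. a key
    old_value: float  # UNDO information
    new_value: float


class Log:
    """Append-only log file (illustrative, not a real DBMS API)."""

    def __init__(self, path):
        self.f = open(path, "a", encoding="utf-8")

    def append(self, record):
        self.f.write(json.dumps(asdict(record)) + "\n")

    def flush(self):
        self.f.flush()             # push Python's buffer to the OS
        os.fsync(self.f.fileno())  # ask the OS to force the write to disk


path = os.path.join(tempfile.mkdtemp(), "wal.log")
log = Log(path)
log.append(UpdateRecord("T-Monthly-423", "acct:3001", 100.0, 110.0))
log.flush()

with open(path, encoding="utf-8") as f:
    records = [json.loads(line) for line in f]
print(records)
```

Appending only ever writes at the end of the file, which is why the log can be written with sequential I/O.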
Example: Monthly bank interest transaction ‘T-Monthly-423’, full run
[Figure: log holding Update Records followed by a Commit Record]
Monthly Interest 10%
4:28 am: Starts run on 100M bank accounts
Takes 24 hours to run
START TRANSACTION
UPDATE Money
SET Amt = Amt * 1.10
COMMIT
Example: Monthly bank interest TXN ‘T-Monthly-423’, with crash
[Figure: Money table before the run, Money table @10:45 am with unknown (??) values, and WA Log @10:29 am]
Monthly Interest 10%
4:28 am: Starts run on 100M bank accounts
Did T-Monthly-423 complete?
Which tuples are bad?
Example: Monthly bank interest transaction
System recovery (after 10:45 am)
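One way to picture recovery from the log: find the transactions with no commit record and restore the old value from each of their update records. The sketch below is a toy model under assumed names and values (tuples for log records, a dict for the data on disk), not the lecture's exact algorithm.

```python
def recover(log, data):
    """Return a copy of data with the updates of uncommitted TXNs undone."""
    committed = {rec[0] for rec in log if rec[1:] == ("COMMIT",)}
    restored = dict(data)
    for rec in reversed(log):  # newest first, so the oldest old_value wins
        if len(rec) == 4 and rec[0] not in committed:
            _txn, key, old_value, _new_value = rec
            restored[key] = old_value
    return restored


# Update records are (txn_id, key, old_value, new_value);
# a commit record would be (txn_id, "COMMIT"). Values are made up.
crashed_log = [
    ("T-Monthly-423", "acct1", 100.0, 110.0),
    ("T-Monthly-423", "acct2", 200.0, 220.0),
]  # crash before any commit record was written
on_disk = {"acct1": 110.0, "acct2": 220.0, "acct3": 300.0}

recovered = recover(crashed_log, on_disk)
print(recovered)
```

The "bad" tuples are exactly those named in update records of a transaction that never logged a commit; acct3 was never touched, so it is left alone.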
[Figure: Main Memory (RAM) holding B=5; Data on Disk holding A=7; Log on Disk. Step 2: flushed as DB blocks on disk (sequential I/O).]
Why do we need logging for atomicity?
• Could we just write TXN updates to disk only once the whole TXN completes?
• Then, if abort / crash and TXN not complete, it has no effect: atomicity!
• With unlimited memory and time, this could work…
• We’ll see why logging works by looking at other protocols which are incorrect!
[Figure: Main Memory holding B=5; Data on Disk holding A=7; Log-Disk. “OK, Commit!”]
Write-Ahead Logging (WAL) Commit Protocol
T: R(A), W(A)
Commit after we’ve written the log to disk but before we’ve written data to disk… this is WAL!
[Figure: T sets A=13 in Main Memory; Log-Disk holds <Tid, &A, 7, 13>; Data on Disk shows A=7, then A=13. “OK, Commit!”]
If we crash now, is T durable? Yes: USE THE LOG!
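The WAL commit rule can be sketched as a toy model in which Python dicts and lists stand in for RAM and disk. The class and method names are hypothetical, and recovery here only redoes committed updates; it is an illustration of the ordering, not a full protocol.

```python
class TinyDB:
    """Toy model of WAL: log reaches disk before the commit is acknowledged."""

    def __init__(self):
        self.data_disk = {"A": 7, "B": 5}  # durable data
        self.log_disk = []                 # durable log
        self.ram = dict(self.data_disk)    # volatile buffer
        self.log_ram = []                  # volatile log tail

    def write(self, tid, key, new_value):
        self.log_ram.append((tid, key, self.ram[key], new_value))
        self.ram[key] = new_value

    def commit(self, tid):
        # WAL: force the log to disk BEFORE acknowledging the commit;
        # the data on disk may be updated later.
        self.log_ram.append((tid, "COMMIT"))
        self.log_disk.extend(self.log_ram)
        self.log_ram.clear()
        return "OK, Commit!"

    def crash_and_recover(self):
        self.ram, self.log_ram = {}, []    # RAM contents are lost
        committed = {r[0] for r in self.log_disk if r[1:] == ("COMMIT",)}
        for r in self.log_disk:            # redo committed updates from the log
            if len(r) == 4 and r[0] in committed:
                self.data_disk[r[1]] = r[3]
        self.ram = dict(self.data_disk)


db = TinyDB()
db.write("T", "A", 13)
ack = db.commit("T")
a_on_disk_at_commit = db.data_disk["A"]  # data on disk is still the old value
db.crash_and_recover()
print(a_on_disk_at_commit, db.data_disk)
```

At commit time the data on disk still holds the old A, yet T survives the crash because the log record and commit record were already durable.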
Write-Ahead Logging (WAL)
Algorithm: WAL
[Figure: T sets A=13 in Main Memory (B=5); T’s log record is still in Log-RAM; Data on Disk holds A=7. “OK, Commit!”]
If we crash now, is T durable? No: lost T’s update!
Incorrect Commit Protocol #2
Let’s try committing after we’ve written data but before we’ve written LOG to disk…
T: R(A), W(A) A: 7→13
[Figure: T sets A=13 in Main Memory (B=5); T’s log record is still in Log-RAM. “OK, Commit!”]
If we crash now, is T durable? Yes! Except… lose information on log
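A small sketch of what goes missing under Incorrect Commit Protocol #2, again with plain Python containers standing in for RAM and disk (an assumed toy model, with made-up variable names):

```python
data_disk = {"A": 7}
log_disk = []                  # durable log, still empty
log_ram = [("T", "A", 7, 13)]  # T's log record exists only in memory

data_disk["A"] = 13            # data is written to disk first
ack = "OK, Commit!"            # commit acknowledged before the log is flushed

log_ram = []                   # crash: RAM, including the log tail, is lost

# After restart, A=13 survived, but the durable log holds no trace of T.
known_txns = {rec[0] for rec in log_disk}
can_tell_t_committed = "T" in known_txns
old_value_available = any(rec[1] == "A" for rec in log_disk)
print(data_disk["A"], can_tell_t_committed, old_value_available)
```

The new value is on disk, but recovery cannot tell whether T committed and has no old value for A to undo with.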
Example: Monthly bank interest transaction
[Figure: Money table before the run, Money table @4:29 am day+1, and WAL @4:29 am day+1]
Cost to update all data
100M bank accounts → 100M seeks? (worst case)
Cost to append to log
+ 1 seek to get ‘end of log’
+ write 100M log entries sequentially
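A back-of-the-envelope version of this comparison. The seek time, sequential bandwidth, and log record size below are assumed round numbers chosen for illustration, not figures from the lecture, so only the gap between the two results matters.

```python
ACCOUNTS = 100_000_000   # from the example
SEEK_S = 0.010           # assumed: 10 ms per random seek
SEQ_BYTES_PER_S = 100e6  # assumed: 100 MB/s sequential write
LOG_RECORD_BYTES = 32    # assumed size of one update record

# Worst case for updating data in place: one seek per account.
update_in_place_s = ACCOUNTS * SEEK_S

# Appending to the log: one seek to the end, then sequential writes.
log_append_s = SEEK_S + ACCOUNTS * LOG_RECORD_BYTES / SEQ_BYTES_PER_S

print(f"in-place, worst case: {update_in_place_s / 86400:.1f} days")
print(f"log append: {log_append_s:.0f} seconds")
```

Under these assumptions the random-seek path costs days while the sequential log append costs seconds.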