CAS CS 460/660 Introduction To Database Systems Query Evaluation I
Introduction
[Figure: Query Optimizer; labels: Schema, Statistics, HeapScan]
Query Optimization
A deep subject, focuses on multi-table queries
We will only need a cookbook version for now.
Build the dataflow bottom up:
  Choose an Access Method (HeapScan or IndexScan)
    Non-trivial, we’ll learn about this later!
  Next apply any WHERE clause filters
  Next apply GROUP BY and aggregation
    Can choose between sorting and hashing!
  Next apply any HAVING clause filters
  Next Sort to help with ORDER BY and DISTINCT
    In absence of ORDER BY, can do DISTINCT via hashing!
[Figure: example dataflow, top to bottom: Distinct, Sort, Filter, HashAgg, Filter, HeapScan]
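The bottom-up cookbook can be sketched as a chain of Python generators, one per operator in the dataflow. This is only an illustration: the `emp` table, its columns, the query in the comment, and all function names are assumptions made up for this example, and an in-memory list stands in for the heap file.

```python
from collections import defaultdict

def heap_scan(table):
    """Access method: yield every row of the (in-memory stand-in) table."""
    for row in table:
        yield row

def filter_rows(child, predicate):
    """WHERE or HAVING: pass through only rows that satisfy the predicate."""
    for row in child:
        if predicate(row):
            yield row

def hash_agg(child, key):
    """GROUP BY key with COUNT(*), done by hashing: one counter per group."""
    counts = defaultdict(int)
    for row in child:
        counts[row[key]] += 1
    for group, n in counts.items():
        yield {key: group, "cnt": n}

def sort_rows(child, key):
    """ORDER BY key (in memory here; a real engine may sort externally)."""
    yield from sorted(child, key=lambda r: r[key])

def distinct_sorted(child):
    """DISTINCT over sorted input: drop a row equal to its predecessor."""
    prev = None
    for row in child:
        if row != prev:
            yield row
        prev = row

# Hypothetical query:
#   SELECT DISTINCT dept, COUNT(*) AS cnt FROM emp WHERE salary > 50
#   GROUP BY dept HAVING COUNT(*) >= 2 ORDER BY dept
emp = [
    {"name": "a", "dept": "toys", "salary": 60},
    {"name": "b", "dept": "toys", "salary": 70},
    {"name": "c", "dept": "shoes", "salary": 80},
    {"name": "d", "dept": "books", "salary": 90},
    {"name": "e", "dept": "books", "salary": 55},
    {"name": "f", "dept": "books", "salary": 10},
]

plan = heap_scan(emp)                                  # access method
plan = filter_rows(plan, lambda r: r["salary"] > 50)   # WHERE
plan = hash_agg(plan, "dept")                          # GROUP BY + aggregate
plan = filter_rows(plan, lambda r: r["cnt"] >= 2)      # HAVING
plan = sort_rows(plan, "dept")                         # ORDER BY
plan = distinct_sorted(plan)                           # DISTINCT
result = list(plan)
```

Each step wraps the previous one, so the last line pulls rows through the whole chain, mirroring the stack of operators in the figure.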
Iterators
class iterator {
    void init();         // set up state, open inputs
    tuple next();        // return the next tuple, or EOF
    void close();        // close inputs, release state
    iterator inputs[];   // child iterators (edges of the dataflow graph)
    // additional state goes here
}
Note:
Edges in the graph are specified by inputs (max 2, usually 1)
Encapsulation: any iterator can be input to any other!
When subclassing, different iterators will keep different kinds of state information
Example: Scan
class Scan extends iterator {
    void init();
    tuple next();
    void close();
    iterator inputs[1];
    bool_expr filter_expr;
    proj_attr_list proj_list;
}
init():
  Set up internal state
  call init() on child – often a file open
next():
  call next() on child until qualifying tuple found or EOF
  keep only those fields in “proj_list”
  return tuple (or EOF -- “End of File” -- if no tuples remain)
close():
  call close() on child
  clean up internal state
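A minimal Python sketch of the Scan iterator described above. Everything beyond the init/next/close interface is an assumption for illustration: `ListSource` is a hypothetical leaf that stands in for an open file, tuples are plain dicts, and `None` plays the role of EOF.

```python
EOF = None  # sentinel returned by next() when no tuples remain

class Iterator:
    """Base class: init/next/close plus a list of input (child) iterators."""
    def __init__(self, inputs=()):
        self.inputs = list(inputs)

    def init(self):
        for child in self.inputs:
            child.init()

    def next(self):
        raise NotImplementedError

    def close(self):
        for child in self.inputs:
            child.close()

class ListSource(Iterator):
    """Hypothetical leaf standing in for a file of tuples."""
    def __init__(self, rows):
        super().__init__()
        self.rows = rows
        self.pos = 0

    def init(self):
        self.pos = 0  # analogous to opening the file

    def next(self):
        if self.pos >= len(self.rows):
            return EOF
        row = self.rows[self.pos]
        self.pos += 1
        return row

class Scan(Iterator):
    """Filter with filter_expr, then keep only the fields in proj_list."""
    def __init__(self, child, filter_expr, proj_list):
        super().__init__([child])
        self.filter_expr = filter_expr
        self.proj_list = proj_list

    def next(self):
        child = self.inputs[0]
        while True:
            row = child.next()
            if row is EOF:
                return EOF
            if self.filter_expr(row):
                return {attr: row[attr] for attr in self.proj_list}

def drain(it):
    """Run an iterator to completion and collect its output."""
    it.init()
    out = []
    while True:
        row = it.next()
        if row is EOF:
            break
        out.append(row)
    it.close()
    return out

rows = [
    {"id": 1, "age": 30, "name": "x"},
    {"id": 2, "age": 17, "name": "y"},
    {"id": 3, "age": 45, "name": "z"},
]
scan = Scan(ListSource(rows), lambda r: r["age"] >= 18, ["id", "name"])
adults = drain(scan)
```

Because `Scan` only talks to its child through `init`, `next`, and `close`, any other iterator could be plugged in as its input, which is the encapsulation point made on the Iterators slide.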
Example: Sort
class Sort extends iterator {
    void init();
    tuple next();
    void close();
    iterator inputs[1];
    int numberOfRuns;
    DiskBlock runs[];
    RID nextRID[];
}
init():
  generate the sorted runs on disk
  Allocate runs[] array and fill in with disk pointers.
  Initialize numberOfRuns
  Allocate nextRID array and initialize to NULLs
next():
  nextRID array tells us where we’re “up to” in each run
  find the next tuple to return based on nextRID array
  advance the corresponding nextRID entry
  return tuple (or EOF -- “End of File” -- if no tuples remain)
close():
  deallocate the runs and nextRID arrays
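The Sort iterator can be sketched in Python as follows, under some simplifying assumptions: the runs are in-memory lists rather than disk blocks, `run_size` (tuples per run) is a made-up parameter, `next_pos` holds integer offsets in place of the nextRID entries (starting at 0 rather than NULL), and `None` stands for EOF.

```python
class SortIterator:
    """Sketch of Sort: init() builds sorted runs, next() merges them lazily."""

    def __init__(self, child_rows, run_size):
        self.child_rows = child_rows  # stand-in for the input iterator
        self.run_size = run_size      # assumed number of tuples per run
        self.runs = []
        self.next_pos = []
        self.number_of_runs = 0

    def init(self):
        rows = list(self.child_rows)
        # Generate the sorted runs (kept in memory here instead of on disk).
        self.runs = [
            sorted(rows[i:i + self.run_size])
            for i in range(0, len(rows), self.run_size)
        ]
        self.number_of_runs = len(self.runs)
        # One cursor per run: where we are "up to" in that run.
        self.next_pos = [0] * self.number_of_runs

    def next(self):
        best = None  # index of the run whose current tuple is smallest
        for r in range(self.number_of_runs):
            pos = self.next_pos[r]
            if pos >= len(self.runs[r]):
                continue  # this run is exhausted
            if best is None or self.runs[r][pos] < self.runs[best][self.next_pos[best]]:
                best = r
        if best is None:
            return None  # EOF: every run is exhausted
        value = self.runs[best][self.next_pos[best]]
        self.next_pos[best] += 1  # advance the corresponding cursor
        return value

    def close(self):
        self.runs = []
        self.next_pos = []
        self.number_of_runs = 0

sorter = SortIterator([5, 3, 9, 1, 4, 8, 2], run_size=3)
sorter.init()
runs_built = sorter.number_of_runs
ordered = []
while True:
    item = sorter.next()
    if item is None:
        break
    ordered.append(item)
sorter.close()
```

Note that all the heavy work happens in `init()`; each `next()` call only compares the current heads of the runs.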
Streaming through RAM
Simple case: “Map”. (assume many records per disk page)
Goal: Compute f(x) for each record, write out the result
Challenge: minimize RAM, call read/write rarely
Approach
Read a chunk from INPUT to an Input Buffer
Write f(x) for each item to an Output Buffer
When Input Buffer is consumed, read another chunk
When Output Buffer fills, write it to OUTPUT
Reads and Writes are not coordinated (i.e., not in lockstep)
E.g., if f() is Compress(), you read many chunks per write.
E.g., if f() is DeCompress(), you write many chunks per read.
[Figure: Input Buffer -> f(x) -> Output Buffer]
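A small Python sketch of the streaming approach, to show that reads and writes need not happen in lockstep. The names `stream_map` and `make_reader` and the buffer sizes are assumptions for illustration; `f` is allowed to return several output items per record so that a DeCompress-like function can be imitated.

```python
def make_reader(records, chunk_size):
    """Hypothetical INPUT: each call returns the next chunk, or [] at the end."""
    pos = 0

    def read_chunk():
        nonlocal pos
        chunk = records[pos:pos + chunk_size]
        pos += len(chunk)
        return chunk

    return read_chunk

def stream_map(read_chunk, write_chunk, f, out_capacity):
    """Apply f to every record using one input and one output buffer.

    Returns (number of reads, number of writes).
    """
    out_buf = []
    reads = writes = 0
    while True:
        in_buf = read_chunk()          # refill the input buffer
        if not in_buf:
            break
        reads += 1
        for x in in_buf:
            out_buf.extend(f(x))       # f yields zero or more output items
            while len(out_buf) >= out_capacity:
                write_chunk(out_buf[:out_capacity])  # output buffer is full
                del out_buf[:out_capacity]
                writes += 1
    if out_buf:                        # flush the last partial buffer
        write_chunk(list(out_buf))
        writes += 1
    return reads, writes

# A DeCompress-like f: every input record expands to three output items.
written = []
reads, writes = stream_map(
    make_reader([1, 2, 3, 4, 5, 6], chunk_size=2),
    written.append,
    lambda x: [x] * 3,
    out_capacity=2,
)
flat = [x for chunk in written for x in chunk]
```

With these sizes the sketch performs several writes for every read; an `f` that shrinks its input would show the opposite ratio.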
Time-space Rendezvous
Items that must be processed together need to be in the same place (RAM) at the same time
There may be many combos of such items
Divide and Conquer
[Figure: buffer pool with an IN page, B-2 pages, and an OUT page, between INPUT and OUTPUT]
Divide and Conquer
Phase 1
“streamwise” divide into N/(B-2) megachunks
output (write) to disk one megachunk at a time
[Figure: buffer pool with an IN page, B-2 pages, and an OUT page, between INPUT and OUTPUT]
Divide and Conquer
Phase 2
Now megachunks will be the input
process each megachunk individually.
[Figure: buffer pool with an IN page, B-2 pages, and an OUT page, between INPUT and OUTPUT]
Sorting: 2-Way
• Pass 0:
– read a page, sort it, write it.
– only one buffer page is used
– a repeated “batch job”
[Figure: INPUT -> one I/O buffer in RAM (sort) -> OUTPUT]
Sorting: 2-Way (cont.)
[Figure: in RAM, buffers INPUT 1 and INPUT 2 are merged into an OUTPUT buffer]
Two-Way External Merge Sort
Sort subfiles and Merge
How many passes?
  N pages in the file => the number of passes = $\lceil \log_2 N \rceil + 1$
Total I/O cost? (reads + writes)
  Each pass we read + write each page in file. So total cost is: $2N(\lceil \log_2 N \rceil + 1)$
Example (7-page file, pages separated by spaces, runs separated by "|"):
  Input file:            3,4 | 6,2 | 9,4 | 8,7 | 5,6 | 3,1 | 2
  PASS 0 (1-page runs):  3,4 | 2,6 | 4,9 | 7,8 | 5,6 | 1,3 | 2
  PASS 1 (2-page runs):  2,3 4,6 | 4,7 8,9 | 1,3 5,6 | 2
  PASS 2 (4-page runs):  2,3 4,4 6,7 8,9 | 1,2 3,5 6
  PASS 3 (8-page runs):  1,2 2,3 3,4 4,5 6,6 7,8 9
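The 7-page example can be replayed with a short Python simulation. This is a sketch under simplifying assumptions: pages are small in-memory lists, runs are kept as flat lists instead of being paginated, and the function names are made up for illustration. The pass count is then compared against the usual two-way formula, $\lceil \log_2 N \rceil + 1$.

```python
import heapq

def two_way_external_sort(pages):
    """Simulate two-way external merge sort.

    pages: list of pages, each a list of values.
    Returns (sorted values, number of passes).
    """
    # Pass 0: sort each page on its own, producing 1-page runs.
    runs = [sorted(page) for page in pages]
    passes = 1
    # Passes 1, 2, ...: merge pairs of runs into runs twice as long.
    while len(runs) > 1:
        merged = []
        for i in range(0, len(runs), 2):
            pair = runs[i:i + 2]  # the last run may have no partner
            merged.append(list(heapq.merge(*pair)))
        runs = merged
        passes += 1
    return (runs[0] if runs else []), passes

def expected_passes(n_pages):
    """ceil(log2 N) + 1, computed with integers to avoid float rounding."""
    return (n_pages - 1).bit_length() + 1

input_file = [[3, 4], [6, 2], [9, 4], [8, 7], [5, 6], [3, 1], [2]]
sorted_values, passes = two_way_external_sort(input_file)
n = len(input_file)
total_io = 2 * n * passes  # every pass reads and writes each page once
```

For this input the simulation goes through the same four passes (0 to 3) as the worked example.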
General External Merge Sort
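[Figure: pass 0. B pages at a time are read from disk into buffers INPUT 1 ... INPUT B, sorted in RAM, and written back to disk as a run]
Merging Runs
[Figure: later passes. Buffers INPUT 1 ... INPUT B-1 are merged in RAM into one OUTPUT buffer; runs are read from disk and the merged run is written back to disk]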
Cost of External Merge Sort
Number of passes: $1 + \lceil \log_{B-1} \lceil N/B \rceil \rceil$
Cost = 2N * (# of passes)
E.g., with 5 buffer pages, to sort 108 page file:
  Pass 0: $\lceil 108/5 \rceil$ = 22 sorted runs of 5 pages each (last run is only 3 pages)
  Pass 1: $\lceil 22/4 \rceil$ = 6 sorted runs of 20 pages each (last run is only 8 pages)
  Pass 2: 2 sorted runs, 80 pages and 28 pages
  Pass 3: Sorted file of 108 pages
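The arithmetic in this example can be checked with a few lines of Python. The function names are hypothetical; the sketch assumes the standard scheme in which pass 0 produces runs of B pages and every later pass merges B-1 runs at a time.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b."""
    return -(-a // b)

def run_counts(n_pages, n_buffers):
    """Number of sorted runs remaining after each pass (pass 0 first)."""
    if n_pages < 1 or n_buffers < 3:
        raise ValueError("need at least 1 page and 3 buffer pages")
    counts = [ceil_div(n_pages, n_buffers)]            # pass 0: runs of B pages
    while counts[-1] > 1:
        counts.append(ceil_div(counts[-1], n_buffers - 1))  # (B-1)-way merge
    return counts

def sort_cost(n_pages, n_buffers):
    """Total I/O: each pass reads and writes every page once."""
    return 2 * n_pages * len(run_counts(n_pages, n_buffers))

counts = run_counts(108, 5)
cost = sort_cost(108, 5)
```

For N = 108 and B = 5 this reproduces the run counts listed above (22, 6, 2, then a single sorted file), which gives 4 passes and 2 * 108 * 4 = 864 page I/Os.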
Memory Requirement for External Sorting
Alternative: Hashing
Idea:
Many times we don’t require order
E.g.: removing duplicates
E.g.: forming groups
Often just need to rendezvous matches
Hashing does this
And may be cheaper than sorting! (Hmmm…!)
But how to do it out-of-core??
Divide
Divide & Conquer
Streaming Partition (divide):
  Use a hash function h_p to stream records to disk-based partitions
  All matches rendezvous in the same partition.
  Streaming alg to create partitions on disk:
    “Spill” partitions to disk via output buffers
ReHash (conquer):
  Read partitions into RAM-based hash table one at a time, using hash function h_r
  Then go through each bucket of this hash table to achieve rendezvous in RAM
Note: Two different hash functions
  h_p is coarser-grained than h_r
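A Python sketch of the two phases, applied to duplicate elimination. Several things here are assumptions for illustration: partitions are in-memory lists standing in for files on disk, the two hash functions are derived from Python's built-in `hash` (dividing by the partition count before taking the second modulus so that the two functions look at different parts of the hash value), and the partition and bucket counts are arbitrary.

```python
def h_p(rec, num_partitions):
    """Coarse partitioning hash: picks one of a few partitions."""
    return hash(rec) % num_partitions

def h_r(rec, num_partitions, num_buckets):
    """Finer rehash function, using bits that h_p did not consume."""
    return (hash(rec) // num_partitions) % num_buckets

def external_hash_distinct(records, num_partitions=3, num_buckets=16):
    # Partition (divide): stream each record to its partition.
    # Lists stand in for the on-disk partition files.
    partitions = [[] for _ in range(num_partitions)]
    for rec in records:
        partitions[h_p(rec, num_partitions)].append(rec)

    # ReHash (conquer): build a hash table for one partition at a time.
    result = []
    for part in partitions:
        buckets = [[] for _ in range(num_buckets)]
        for rec in part:
            bucket = buckets[h_r(rec, num_partitions, num_buckets)]
            if rec not in bucket:      # matches rendezvous in the same bucket
                bucket.append(rec)
        for bucket in buckets:
            result.extend(bucket)
    return result

data = [7, 3, 7, 1, 3, 9, 1, 7, 12, 9]
distinct = external_hash_distinct(data)
```

Equal records always land in the same partition and then in the same bucket, so duplicates only ever need to be compared within one bucket. The output order depends on the hash values, not on the input order.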
Two Phases
Partition:
[Figure: the Original Relation is read through one INPUT buffer; hash function h_p routes each record to one of B-1 OUTPUT buffers, which are written to Partitions 1 ... B-1]
Rehash:
[Figure: each partition Ri (k <= B pages) is read from Partitions into a hash table built with hash fn h_r, producing the Result]
Memory Requirement
How does this compare with external sorting?
So which is better?
Simplest analysis:
  Same memory requirement for 2 passes
  Same I/O cost
  But we can dig a bit deeper…
Sorting pros:
  Great if input already sorted (or almost sorted) w/heapsort
  Great if need output to be sorted anyway
  Not sensitive to “data skew” or “bad” hash functions
Hashing pros:
  For duplicate elimination, scales with # of values, not # of items! We’ll see this again.
  Can exploit extra memory to reduce # IOs (stay tuned…)
Summing Up 1
Summary Part 2
Sort/Hash Duality
Sorting is Conquer & Merge
Hashing is Divide & Conquer
Sorting is overkill for rendezvous
But sometimes a win anyhow
Sorting sensitive to internal sort alg
Quicksort vs. Heapsort
In practice, Quicksort tends to be used
Don’t forget double buffering (with threads)