1. CSE232A: Database System
Principles
Notes 02: Hardware
1
Database System Architecture
Query Processing Transaction Management
SQL query Calls from Transactions (read,write)
Parser Transaction
relational algebra Manager
Query View
Hardware
Rewriter definitions aspects of
Concurrency Lock
Controller Table
and
Optimizer
Statistics &
Catalogs &
storing and
query execution
System Data retrieving data
Recovery
plan Manager
Execution Buffer
Engine Manager
Data + Indexes Log
Memory Hierarchy
• Cache memory
– On-chip and L2
– Caching outside control of DB
Cost per byte
system
• RAM
Capacity
Access Speed
– Addressable space includes virtual
memory but DB systems avoid it
– Main memory DBs rely more on
OS
• Disk
– Access speed & Transfer rate
– Winchester, arrays,…
• Tertiary storage
– Tapes, jukeboxes, DVDs
3
1
2. Storage Cost
nearline offline
1015 tape & tape
optical
typical capacity (bytes)
1013 magnetic disks
optical
1011 electronic disks
secondary online
109 electronic tape
main
107
from Gray & Reuter
105 cache
103
10-9 10-6 10-3 10-0 103
access time (sec)
4
Storage Cost from Gray & Reuter
104 cache
electronic
102 online
main
tape
dollars/MB
electronic
secondary magnetic
optical nearline
100 tape &
disks
optical
disks
10-2 offline
tape
10-4
10-9 10-6 10-3 10-0 103
access time (sec)
5
Volatile Vs Non-Volatile
Storage
• Persistence important for transaction
atomicity and durability
• Even if database fits in main memory
changes have to be written in non-
volatile storage
• Hard disk
• RAM disks w/ battery
• Flash memory
6
2
3. Cost of Disk Access
• How many blocks were accessed ?
• Clustered/consecutive ?
7
Moore’s Law: Different Rates
of Improvement
• Processor speed Clustered/sequential
access-based algorithms
Disk Transfer Rate
• Main memory bit/$
become relatively
• Disk bit/$ better
• RAM access speed
• Disk access speed
• Disk transfer rate
Access
Time
Disk
8
Moore’s Law: Different Rates
of Improvement
Cost of “miss”
increases
Cache Capacity
RAM Capacity
Access
Time
Disk
9
3
4. Focus on: “Typical Disk”
Controller
BUS
Disk
…
Terms: Platter, Head, Actuator
Cylinder, Track
Sector (physical),
Block (logical), Gap
10
Often different numbers
Top View of sectors per track
Sector
Track
Block
(typically Gap
multiple
sectors) 11
“Typical” Numbers
Diameter: 1 inch ® 15 inches
Cylinders: 100 ® 20000
Surfaces: 1 (CDs) ®
(Tracks/cyl) 2 (floppies) ®
® 5 (typical hd)
® 30
Sector Size: 512B ® 50K
Capacity: 360 KB (old floppy)
® 30 GB
12
4
5. Key performance metric: Time
to fetch block
I want block x
block X in memory
?
Time = Seek Time (locate track) +
Rotational Delay (locate sector)+
Transfer Time (fetch block) +
Other (disk controller, …)
13
Track
Seek Delay Where
Head
must go
Track
Where
Head is
14
Rotational Delay
Head Here
Block I Want
15
5
6. Seek Time
3 or 5x
Time
x
Few ms
1 N
Cylinders Traveled
16
Average Random Seek Time
N N
å å SEEKTIME (i ® j)
i=1 j=1
S= j¹i
N(N-1)
“Typical” S: 10 ms ® 40 ms
17
Average Rotational Delay
R = 1/2 revolution
“typical” R = 8.33 ms (7200 RPM)
Assume we have to start reading
from start of first sector
18
6
7. Transfer Rate: t
• “typical” t: 1 ® 3 MB/second
• transfer time: block size
t
19
Other Delays
• CPU time to issue I/O
• Contention for controller
• Contention for bus, memory
“Typical” Value: 0
20
Practice Problem
• Single surface
• Rotation speed 7200rpm
• 16,384 tracks
• 128 sectors/track
• 4096 bytes/sector
• 4 sectors/block (16,384 bytes/block)
• SEEKTIME (i ® j) = [1000 + (j-i)] µs
• Neglect gaps
• Calculate minimum, maximum, average time
to fetch one block
21
7
8. Practice Problem: Minimum Time
• Head is at the start of the first sector of
the block
• Just compute transfer time
• 4 sectors cover 4/128 of a track
• 1 full rotation takes 60/7200=8.33ms
• Transfer time is 8.33 * 4 /128 = 0.26ms
22
Practice Problem: Maximum Time
• Assume read must start at the first
sector
• Head is at innermost, required track is
the outermost
• Seek time = …
• Head just missed the beginning
• Rotational delay = …
• Transfer time = …
23
Practice problem: Average
time
• Solve…
24
8
9. • So far: Random Block Access
• What about: Reading “Next” block?
Time to get = Block Size + Negligible
block t
- skip gap
- switch track
- once in a while,
next cylinder
25
Rule of Random I/O: Expensive
Thumb Sequential I/O: Much less
• Ex: 1 KB Block
» Random I/O: ~ 20 ms.
» Sequential I/O: ~ 1 ms.
26
Practice Problem cont’d:
Sustained Bandwidth over Track
• Assume required blocks are consecutive
on single track
• What is the sustained bandwidth of
fetching consecutive blocks?
• 128 sectors/track * 4KB/sector in
8.33ms/track full rotation =
512KB/8.33ms = 61.46KB/ms
27
9
10. Suggested optimization
• Cluster data in consecutive blocks
• Give an extra point to algorithms that
– exploit data clustering by avoiding
“random” accesses
– Read/write consecutive blocks
28
Example: 2-Phase Merge Sort
P K AD L E ZW J C RH Y F X I
Main Memory: 4 blocks
READ
P K A D L E ZW
SORT
A D K P L D WP
E K P K Z WRITE A D K P L DWP
E K P K Z
…
MERGE
AD DP
C K F
WRITE
C F H I J R X Y
Improve by bringing max number of 29
blocks in memory
Cost for Writing similar to Reading
…. unless we want to verify!
need to add (full) rotation + Block size
t
30
10
11. • To Modify a Block?
To Modify Block:
(a) Read Block
(b) Modify in Memory
(c) Write Block
[(d) Verify?]
31
Block Address:
• Physical Device
• Cylinder #
• Surface #
• Sector
Once upon a time DBs
had access to such – now
it is the OS’s domain
32
Optimizations (in controller or O.S.)
• Disk Scheduling Algorithms
– e.g., elevator algorithm
• Track (or larger) Buffer
• Pre-fetch
• Arrays
• Mirrored Disks
33
11
12. Double Buffering
Problem: Have a File
» Sequence of Blocks B1, B2
Have a Program
» Process B1
» Process B2
» Process B3
...
34
Single Buffer Solution
(1) Read B1 ® Buffer
(2) Process Data in Buffer
(3) Read B2 ® Buffer
(4) Process Data in Buffer ...
35
Say P = time to process/block
R = time to read in 1 block
n = # blocks
Single buffer time = n(P+R)
36
12
13. Double Buffering
process process
Memory:
C
A B
Disk:
A B C D E F G
done
done
37
Say P ³ R
P = Processing time/block
R = IO time/block
n = # blocks
What is processing time?
• Double buffering time = R + nP
• Single buffering time = n(R+P)
Improvement much more dramatic if
consequtive blocks: … 38
Block Size Selection?
• Big Block ® Amortize I/O Cost
Unfortunately...
• Big Block Þ Read in more useless stuff!
and takes longer to read
39
13
14. Trend
• memory prices drop and memory capacities
increase,
• transfer rates increase
• Disk access times do not increase that much
Þ blocks get bigger ...
40
Disk Failures (Sec 2.5)
• Partial ® Total
• Intermittent ® Permanent
41
Coping with Disk Failures
• Detection
– e.g. Checksum
• Correction
Þ Redundancy
42
14
15. At what level do we cope?
• Single Disk
– e.g., Error Correcting Codes
• Disk Array
Logical Physical
43
Operating System
e.g., Stable Storage
Logical Block Copy A Copy B
44
Database System
• e.g.,
Log
Current DB Last week’s DB
45
15