2. Goals
Make the Storage and Buffer Managers thread-safe
Improve performance by allowing simultaneous operations
Provide flexible compile-time options to select optimal behavior
3. Problem overview
The Buffer Manager and Storage Manager are global shared objects
Everything above them can run in multiple parallel threads
Needs thread control!
[Diagram: multiple threads -> Access manager etc -> Buffer Manager -> Storage Manager -> File, with a "?" beside the Storage Manager]
4. Fine-grained control
From bottom to top:
Physical storage is already controlled by the OS
Storage Manager can use a simple SWMR (single-writer, multiple-reader) lock, because reads do not modify data
Buffer Manager needs something else…
[Diagram: Access manager etc -> Buffer Manager (???) -> Storage Manager (SWMR concurrency) -> File (OS-based)]
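A minimal sketch of what SWMR locking around the Storage Manager could look like, assuming C++17 and `std::shared_mutex`. The class `SwmrStorage`, its in-memory page array and its method signatures are hypothetical stand-ins, not the real Storage Manager interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical stand-in for the Storage Manager: pages live in memory here.
class SwmrStorage {
public:
    SwmrStorage(std::size_t pageCount, std::size_t pageSize)
        : pageSize_(pageSize), data_(pageCount * pageSize, 0) {}

    // Readers take the lock in shared mode, so many can read at once.
    bool readPage(std::size_t pageNum, char* out) const {
        std::shared_lock<std::shared_mutex> guard(lock_);
        if ((pageNum + 1) * pageSize_ > data_.size()) return false;
        std::memcpy(out, data_.data() + pageNum * pageSize_, pageSize_);
        return true;
    }

    // A writer takes the lock exclusively: no readers, no other writers.
    bool writePage(std::size_t pageNum, const char* in) {
        std::unique_lock<std::shared_mutex> guard(lock_);
        if ((pageNum + 1) * pageSize_ > data_.size()) return false;
        std::memcpy(data_.data() + pageNum * pageSize_, in, pageSize_);
        return true;
    }

private:
    std::size_t pageSize_;
    std::vector<char> data_;
    mutable std::shared_mutex lock_;  // mutable so const readers can lock it
};
```

Reads do not change the stored pages, which is why shared mode is enough for them; only `writePage` needs exclusive access.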
5. Fine-grained control
A global lock provides only safety
No performance benefit!
[Diagram: multiple threads -> Access manager etc -> Buffer Manager and Storage Manager (Locked!) -> File]
6. Deep down in the Buffer Manager
Underneath, the BM consists of:
External interface
Buffer Pool object
List of Buffer Control Blocks (BCBs)
Chunk of memory for cached Pages
[Diagram: Access manager etc -> Buffer Manager (Buffer Pool, Buffer Control Block list, Pages) -> Storage Manager -> File]
7. Buffer Manager API
class KAIROSMOBILE_API CBufferManager
{
    // ACTUAL API
    BM_RESULT readPage(TBSID tbsID, PAGEID pageNum, PAGEP& retPage);
    BM_RESULT writePage(TBSID tbsID, PAGEID pageNum);
    BM_RESULT fixPage(TBSID tbsID, PAGEID pageNum, PAGEP& retPage);
    BM_RESULT unFixPage(TBSID tbsID, PAGEID pageNum);
    // ALLOCATION API OF Storage Manager
    BM_RESULT allocPage( TBSID tbsID, PAGEID& pageID );
    BM_RESULT freePage( TBSID tbsID, PAGEID pageID );
    // Unused API
    BM_RESULT checkPage(TBSID tbsID, PAGEID pageNum);
    BM_RESULT touchPage(TBSID tbsID, PAGEID pageNum);
    BM_RESULT invalidatePage(TBSID tbsID, PAGEID pageNum);
}
// …
We can identify 3 groups:
Read, write and page pinning API
Allocation API from Storage Manager
Unused part of API
Our goal is the first group
More on this later…
8. Existing algorithm
fixPage( pageId ):
    // lookup page in BP
    pageBcb = bp.lookup( pageId )
    if pageBcb found:
        // in buffer
        return pageBcb.buf
    // need to read
    pageBcb = bp.getLRU()
    if pageBcb.isDirty: sm.flushPage( … )
    sm.readPage( pageBcb.buf )
    return pageBcb.buf
[Flowchart: fixPage -> lookup( page ) -> found? If found: return. If not found: getLRU page -> dirty? (YES: flushPage) -> readPage]
9. Race condition
[Time diagram]
T1: Read N: Not found, read -> Reading... -> Read complete
T2: Read N: Not found, read -> Reading... -> Read complete
Both threads miss page N and both read it from storage.
A simple lock can't solve race conditions across multiple requests; another solution is needed.
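The interleaving in the time diagram can be replayed deterministically on a single thread. The sketch below is an illustration only: `NaiveCache` and `replayRace` are hypothetical names, and the "disk read" is faked. It assumes a lock that protects each individual step (lookup, store) but not the request as a whole.

```cpp
#include <cassert>
#include <map>
#include <mutex>

// Hypothetical cache that locks each step, but not the whole request.
class NaiveCache {
public:
    // Step 1 of a request: check the cache under the lock.
    bool isCached(int pageId) {
        std::lock_guard<std::mutex> guard(lock_);
        return pages_.count(pageId) != 0;
    }

    // Step 2: the lock was released after step 1, so the miss seen there
    // may already be stale when this runs.
    void loadFromStorage(int pageId) {
        int content = pageId * 10;  // stands in for the slow disk read
        std::lock_guard<std::mutex> guard(lock_);
        pages_[pageId] = content;
        ++storageReads_;
    }

    int storageReads() {
        std::lock_guard<std::mutex> guard(lock_);
        return storageReads_;
    }

private:
    std::mutex lock_;
    std::map<int, int> pages_;
    int storageReads_ = 0;
};

// Replays "T1: Read N" and "T2: Read N" in the order shown in the diagram.
int replayRace() {
    NaiveCache cache;
    const int n = 7;
    bool t1Miss = !cache.isCached(n);      // T1: not found
    bool t2Miss = !cache.isCached(n);      // T2: not found (T1 has not loaded yet)
    if (t1Miss) cache.loadFromStorage(n);  // T1: read, read complete
    if (t2Miss) cache.loadFromStorage(n);  // T2: read, read complete
    return cache.storageReads();           // the same page was read twice
}
```

Every access to the map is locked here, yet the page is still read twice, because the decision "not found, so read" spans two separately locked steps.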
10. New approach
Provides per-request lock granularity
Solves inconsistency
But requires a few more changes…
11. Now
Existing Buffer Control Block (BCB) structure
Buffer Pool maintains the set of BCBs in linear and LRU lists
BCB contains management information for Page buffers
[Diagram: Buffer Pool -> BCB, BCB, BCB, BCB -> Page, Page, Page; BCB fields: page*, prev*, next*, lruprev*, lrunext*, flags]
12. Changes
BCB flags have two more states: reading, waiting
BCB is extended with a new field that refers to a wait-object (WOB)
The wait-object is a sort of condition variable
[Diagram: Buffer Pool -> BCB, BCB, BCB, BCB -> Page, Page, Page, plus a WOB; BCB fields: page*, prev*, next*, lruprev*, lrunext*, flags, wob]
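As a rough illustration, the extended BCB might be laid out as below. The field names follow the diagram, but every type, the flag values and the `WaitObject` struct are assumptions made for this sketch. The wait-object is referenced by pointer here, although it could equally be an index into a table of wait-objects.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>

// Flag bits; the names and values are illustrative, not the real ones.
enum BcbFlags : std::uint32_t {
    BCB_DIRTY   = 1u << 0,  // already needed by the existing algorithm
    BCB_READING = 1u << 1,  // new: page is being loaded from storage
    BCB_WAITING = 1u << 2,  // new: some thread waits for that load
};

// Assumed shape of a wait-object: a condition variable plus a waiter count.
struct WaitObject {
    std::condition_variable cond;
    int waiters = 0;
};

struct Bcb {
    char* page = nullptr;       // page buffer
    Bcb* prev = nullptr;        // linear list links
    Bcb* next = nullptr;
    Bcb* lruprev = nullptr;     // LRU list links
    Bcb* lrunext = nullptr;
    std::uint32_t flags = 0;
    WaitObject* wob = nullptr;  // new: assigned only while someone waits

    void setFlag(BcbFlags f)   { flags |= f; }
    void clearFlag(BcbFlags f) { flags &= ~static_cast<std::uint32_t>(f); }
    bool isDirty() const    { return (flags & BCB_DIRTY) != 0; }
    bool isReading() const  { return (flags & BCB_READING) != 0; }
    bool hasWaiters() const { return (flags & BCB_WAITING) != 0; }
};
```

Keeping `wob` null until the first waiter arrives means pages that are never contended pay only for one extra pointer.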
13. New algorithm
Starts with the local lock
Looks up page in BP…
If page found, check the reading flag in BCB
If the page is being read and we are the first to wait, set up and assign a Wait-object
Wait for read to complete
Free Wait-object
Return page
[Flowchart: fixPage -> LOCK -> lookup( page ) -> found?
 found: reading? NO: UNLOCK -> return. YES: wob assigned? (NO: assign wob) -> mark (waiting) -> wait on (wob) -> free wob -> UNLOCK -> return
 not found: getLRU page -> dirty? (YES: writePage) -> reset flags, set reading -> UNLOCK -> readPage -> LOCK -> waiting? (YES: signal (wob)) -> reset flags -> UNLOCK -> return]
14. New algorithm 2
Starts with the local lock
Looks up page in BP…
If page not found, check dirty flag in BCB
Flush page if dirty
Release lock
Request read
Reacquire lock
If waiting flag is set: signal 'wait complete'
Reset flags
Return page
[Flowchart: the same fixPage flowchart as on slide 13]
15. New algorithm 3
Note: only requests for a page that is already being read are put on hold
Also, releasing the lock around the 'read' call allows other threads to proceed without blocking
[Flowchart: the same fixPage flowchart as on slide 13]
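The flowchart steps can be sketched in C++ as follows. This is a simplified illustration under several assumptions, not the actual implementation: `MiniBufferManager`, `Frame` and `demoConcurrentFix` are hypothetical names, pages are plain integers, storage is faked with a short sleep, and LRU eviction with the dirty-page flush is left out entirely. The comments map each line back to a flowchart box.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <map>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical, simplified BCB: one entry per cached page.
struct Frame {
    int content = 0;
    bool reading = false;  // a thread is loading this page right now
    bool waiting = false;  // at least one thread sleeps on the wait-object
    int waiters = 0;       // makes the wait-object "counted"
    std::unique_ptr<std::condition_variable> wob;  // assigned on first wait
};

class MiniBufferManager {
public:
    int fixPage(int pageId) {
        std::unique_lock<std::mutex> lock(mutex_);           // LOCK
        auto it = frames_.find(pageId);                      // lookup( page )
        if (it != frames_.end()) {                           // found
            Frame& f = it->second;
            if (f.reading) {                                 // reading?
                if (!f.wob)                                  // assign wob
                    f.wob = std::make_unique<std::condition_variable>();
                f.waiting = true;                            // mark (waiting)
                ++f.waiters;
                f.wob->wait(lock, [&f] { return !f.reading; });  // wait on (wob)
                if (--f.waiters == 0) f.wob.reset();         // free wob
            }
            return f.content;                                // UNLOCK, return
        }
        // Not found. Choosing an LRU victim and flushing it if dirty is
        // omitted: this sketch never evicts.
        Frame& f = frames_[pageId];  // std::map keeps this reference valid
        f.reading = true;                                    // set reading
        lock.unlock();                                       // UNLOCK
        int content = readFromStorage(pageId);               // readPage
        lock.lock();                                         // LOCK
        f.content = content;
        f.reading = false;                                   // reset flags
        if (f.waiting) {                                     // waiting?
            f.waiting = false;
            f.wob->notify_all();                             // signal (wob)
        }
        return f.content;                                    // UNLOCK, return
    }

    int storageReads() const { return storageReads_.load(); }

private:
    int readFromStorage(int pageId) {
        ++storageReads_;
        std::this_thread::sleep_for(std::chrono::milliseconds(20));  // fake I/O
        return pageId * 10;
    }

    std::mutex mutex_;
    std::map<int, Frame> frames_;
    std::atomic<int> storageReads_{0};
};

// Four threads request the same page; returns the number of storage reads.
int demoConcurrentFix() {
    MiniBufferManager bm;
    std::vector<int> results(4, 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < 4; ++i)
        threads.emplace_back([&bm, &results, i] { results[i] = bm.fixPage(7); });
    for (auto& t : threads) t.join();
    for (int r : results) assert(r == 70);
    return bm.storageReads();
}
```

Whichever thread misses first sets the reading flag before releasing the lock, so later requests for the same page wait on the wait-object instead of issuing a second read, while requests for other pages only contend for the short locked sections.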
16. Time diagram
[Time diagram]
T1: Read N: Not found, read -> Reading... -> Read complete -> Signal completion
T2: Read N: Found, "reading" -> Waiting on Wob -> Acquired
T3: Read M: Reading... -> Read complete
* Readers can proceed if they request a different page, and wait if the page is loading.
17. Summary of changes
Extending Buffer Control Block with flags and WOB index field
Waiting Object (sort of counted condition variable) and fast-mutex cross-platform primitives
Upgrade Buffer Manager algorithm
Possible Buffer Manager API consolidation
18. API Consolidation
Consistent BM API:
fixPage( id, flags )
unfixPage( id )
flushPage( id )
+ unused API
Currently:
readPage is the same as fixPage + copy + unfix
writePage is actually flush
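To illustrate why readPage adds nothing beyond the pinning calls, here is a small single-threaded sketch. `PinningCache` is a hypothetical stand-in for the consolidated Buffer Manager, page IDs are plain integers, and the `flags` argument of fixPage is left out because the slides do not say what it carries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical stand-in exposing only the pinning calls.
class PinningCache {
public:
    explicit PinningCache(std::size_t pageSize) : pageSize_(pageSize) {}

    // Pins the page in the cache and returns its buffer.
    char* fixPage(int pageId) {
        Entry& e = pages_[pageId];
        if (e.data.empty()) e.data.assign(pageSize_, 0);  // pretend to load it
        ++e.pinCount;
        return e.data.data();
    }

    // Releases one pin taken by fixPage.
    void unfixPage(int pageId) {
        auto it = pages_.find(pageId);
        if (it != pages_.end() && it->second.pinCount > 0) --it->second.pinCount;
    }

    int pinCount(int pageId) const {
        auto it = pages_.find(pageId);
        return it == pages_.end() ? 0 : it->second.pinCount;
    }

    std::size_t pageSize() const { return pageSize_; }

private:
    struct Entry {
        std::vector<char> data;
        int pinCount = 0;
    };
    std::size_t pageSize_;
    std::map<int, Entry> pages_;
};

// readPage expressed purely through the consolidated API:
// fixPage + copy + unfix.
void readPage(PinningCache& bm, int pageId, char* out) {
    const char* page = bm.fixPage(pageId);
    std::memcpy(out, page, bm.pageSize());
    bm.unfixPage(pageId);
}
```

Because `readPage` is built only from `fixPage` and `unfixPage`, making those two calls thread-safe would cover it as well.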
19. Alternatives
Per-thread Buffer Manager does not require any of the above
It can also reduce cache fighting between threads (thrashing)
Only requires an SWMR lock on the Storage Manager
[Diagram: Access manager etc -> Buffer Manager, Buffer Manager (per thread) -> Storage Manager (SWMR concurrency) -> File (OS-based)]
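One way to picture the per-thread alternative is with C++ `thread_local` storage, as in the sketch below. All names (`SharedStorage`, `ThreadLocalBuffer`, `globalStorage`, `myBuffer`, `cachedPagesSeenByNewThread`) are hypothetical, pages are plain integers, and the sketch ignores how a write made through one thread would invalidate copies cached by other threads.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Shared Storage Manager stand-in, protected by an SWMR lock.
class SharedStorage {
public:
    int readPage(int pageId) const {
        std::shared_lock<std::shared_mutex> guard(lock_);
        auto it = pages_.find(pageId);
        return it == pages_.end() ? 0 : it->second;
    }
    void writePage(int pageId, int content) {
        std::unique_lock<std::shared_mutex> guard(lock_);
        pages_[pageId] = content;
    }

private:
    mutable std::shared_mutex lock_;
    std::map<int, int> pages_;
};

// Buffer Manager stand-in used by exactly one thread, so it needs no lock.
class ThreadLocalBuffer {
public:
    explicit ThreadLocalBuffer(SharedStorage& storage) : storage_(storage) {}

    int fixPage(int pageId) {
        auto it = cache_.find(pageId);
        if (it == cache_.end())
            it = cache_.emplace(pageId, storage_.readPage(pageId)).first;
        return it->second;
    }

    std::size_t cachedPages() const { return cache_.size(); }

private:
    SharedStorage& storage_;
    std::map<int, int> cache_;
};

SharedStorage& globalStorage() {
    static SharedStorage storage;
    return storage;
}

// Each thread lazily gets its own private buffer on first use.
ThreadLocalBuffer& myBuffer() {
    thread_local ThreadLocalBuffer buffer(globalStorage());
    return buffer;
}

// Fixes a page on a fresh thread; returns how many pages that thread's
// private cache held before the call.
std::size_t cachedPagesSeenByNewThread(int pageId, int& content) {
    std::size_t before = 0;
    std::thread t([&] {
        before = myBuffer().cachedPages();
        content = myBuffer().fixPage(pageId);
    });
    t.join();
    return before;
}
```

The only synchronization left is the SWMR lock inside `SharedStorage`; the price is that each thread keeps its own copy of every page it touches.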