CSE232A: Database System Principles

Notes 02: Hardware




Database System Architecture

[Figure: DBMS component diagram]
• Query Processing: SQL query → Parser → relational algebra → Query Rewriter and Optimizer (uses View definitions; Statistics & Catalogs & System Data) → query execution plan → Execution Engine → Buffer Manager → Data + Indexes
• Transaction Management: Calls from Transactions (read, write) → Transaction Manager → Concurrency Controller (Lock Table) and Recovery Manager (Log)
• Focus of these notes: hardware aspects of storing and retrieving data




Memory Hierarchy
• Cache memory
   – On-chip and L2
   – Caching outside control of DB system
• RAM
   – Addressable space includes virtual memory but DB systems avoid it
   – Main memory DBs rely more on OS
• Disk
   – Access speed & Transfer rate
   – Winchester, arrays,…
• Tertiary storage
   – Tapes, jukeboxes, DVDs

[Figure: the hierarchy annotated with arrows labeled Cost per byte, Capacity, and Access Speed]




Storage Cost

[Figure, from Gray & Reuter: typical capacity (bytes, 10^3 to 10^15) vs. access time (sec, 10^-9 to 10^3). Devices shown, in order of increasing access time: cache; electronic main; electronic secondary; magnetic and optical disks; online tape; nearline tape & optical disks; offline tape.]




Storage Cost

[Figure, from Gray & Reuter: dollars/MB (10^-4 to 10^4) vs. access time (sec, 10^-9 to 10^3). Devices shown, in order of increasing access time: cache; electronic main; electronic secondary; magnetic and optical disks; online tape; nearline tape & optical disks; offline tape.]




Volatile vs. Non-Volatile Storage

• Persistence important for transaction
  atomicity and durability
• Even if database fits in main memory,
  changes have to be written to non-volatile storage
• Hard disk
• RAM disks w/ battery
• Flash memory




Cost of Disk Access

    • How many blocks were accessed?
    • Clustered/consecutive?




Moore’s Law: Different Rates of Improvement

•   Processor speed
•   Main memory bit/$
•   Disk bit/$
•   RAM access speed
•   Disk access speed
•   Disk transfer rate

[Figure: Disk Transfer Rate vs. Disk Access Time. Clustered/sequential access-based algorithms become relatively better.]




Moore’s Law: Different Rates of Improvement

[Figure: Cache Capacity and RAM Capacity vs. Disk Access Time. Cost of “miss” increases.]




Focus on: “Typical Disk”

[Figure: BUS connected to a Controller, which is connected to one or more Disks (…)]

  Terms: Platter, Head, Actuator
         Cylinder, Track
         Sector (physical),
         Block (logical), Gap




Top View

[Figure: top view of a platter showing a Track, a Sector, a Gap, and a Block (typically multiple sectors). Often different numbers of sectors per track.]




“Typical” Numbers
     Diameter:               1 inch → 15 inches
     Cylinders:              100 → 20000
     Surfaces (Tracks/cyl):  1 (CDs) → 2 (floppies) → 5 (typical hd) → 30
     Sector Size:            512B → 50K
     Capacity:               360 KB (old floppy) → 30 GB




Key performance metric: Time to fetch block

[Figure: “I want block X” → ? → “block X in memory”]

Time = Seek Time (locate track) +
       Rotational Delay (locate sector) +
       Transfer Time (fetch block) +
       Other (disk controller, …)




Seek Delay

[Figure: the head moves from the track where the head is to the track where the head must go]




Rotational Delay

[Figure: the head is at one position on the track (“Head Here”) and must wait until the “Block I Want” rotates under it]




Seek Time

[Figure: seek time vs. cylinders traveled (1 to N). The curve starts at x (a few ms) for 1 cylinder and rises to 3x or 5x for N cylinders.]




Average Random Seek Time

\[
S = \frac{\displaystyle\sum_{i=1}^{N} \; \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathrm{SEEKTIME}(i \to j)}{N(N-1)}
\]

“Typical” S: 10 ms → 40 ms
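The average above can be evaluated directly as a double sum. The sketch below is illustrative only: `average_seek_time` and `linear_seek_us` are hypothetical names, and the linear seek model (1000 µs startup plus 1 µs per cylinder traveled) is an assumption chosen for the example, not a property of any real drive.

```python
def average_seek_time(n_cylinders, seek_time):
    """Average of seek_time(i, j) over all ordered pairs (i, j) with i != j."""
    total = 0.0
    for i in range(1, n_cylinders + 1):
        for j in range(1, n_cylinders + 1):
            if i != j:
                total += seek_time(i, j)
    return total / (n_cylinders * (n_cylinders - 1))


def linear_seek_us(i, j):
    # Assumed model: 1000 us startup plus 1 us per cylinder traveled.
    return 1000 + abs(j - i)


# For this linear model the mean distance |j - i| over i != j is (N + 1) / 3.
print(average_seek_time(100, linear_seek_us))
```

The double sum costs O(N²) evaluations, so for large N a closed form (when the seek model allows one) is the practical choice.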




Average Rotational Delay

   R = 1/2 revolution

   “typical” R = 4.17 ms (7200 RPM; one full revolution takes 8.33 ms)

  Assume we have to start reading
  from start of first sector
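As a quick check of the arithmetic, the average rotational delay follows from the rotation speed alone. This is a small sketch with hypothetical function names; it assumes the head waits half a revolution on average.

```python
def rotation_time_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def average_rotational_delay_ms(rpm):
    """On average the head waits half a revolution for the wanted sector."""
    return rotation_time_ms(rpm) / 2


for rpm in (3600, 7200):
    print(rpm, "RPM:",
          f"revolution = {rotation_time_ms(rpm):.2f} ms,",
          f"average delay = {average_rotational_delay_ms(rpm):.2f} ms")
```

At 7200 RPM a full revolution takes about 8.33 ms, so the average delay is about 4.17 ms; 8.33 ms is the average delay of a 3600 RPM disk.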




Transfer Rate: t

• “typical” t: 1 → 3 MB/second
• transfer time = block size / t
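A minimal sketch of the transfer-time formula. The function name is hypothetical, and the example assumes 1 MB = 10^6 bytes, which the slides do not specify.

```python
def transfer_time_ms(block_bytes, rate_bytes_per_s):
    """Transfer time = block size / transfer rate, in milliseconds."""
    return 1000.0 * block_bytes / rate_bytes_per_s


# Example: a 16,384-byte block at an assumed 2 MB/s (2 * 10**6 bytes/s).
print(f"{transfer_time_ms(16_384, 2 * 10**6):.3f} ms")
```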




Other Delays


• CPU time to issue I/O
• Contention for controller
• Contention for bus, memory


“Typical” Value: 0





Practice Problem
•   Single surface
•   Rotation speed 7200rpm
•   16,384 tracks
•   128 sectors/track
•   4096 bytes/sector
•   4 sectors/block (16,384 bytes/block)
•   SEEKTIME (i → j) = [1000 + (j-i)] µs
•   Neglect gaps
•   Calculate minimum, maximum, average time
    to fetch one block




Practice Problem: Minimum Time

 • Head is at the start of the first sector of
   the block
 • Just compute transfer time
 • 4 sectors cover 4/128 of a track
 • 1 full rotation takes 60/7200=8.33ms
 • Transfer time is 8.33 * 4 /128 = 0.26ms
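The minimum-time calculation can be reproduced in a few lines. The constants come from the practice problem; the variable names are only illustrative.

```python
# Disk parameters from the practice problem.
RPM = 7200
SECTORS_PER_TRACK = 128
SECTORS_PER_BLOCK = 4

rotation_ms = 60_000 / RPM                      # one full rotation
transfer_ms = rotation_ms * SECTORS_PER_BLOCK / SECTORS_PER_TRACK

# Best case: no seek and no rotational delay, only the transfer itself.
minimum_ms = transfer_ms

print(f"rotation = {rotation_ms:.2f} ms, minimum fetch time = {minimum_ms:.2f} ms")
# prints: rotation = 8.33 ms, minimum fetch time = 0.26 ms
```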






Practice Problem: Maximum Time

 • Assume read must start at the first
   sector
 • Head is at innermost, required track is
   the outermost
 • Seek time = …
 • Head just missed the beginning
 • Rotational delay = …
 • Transfer time = …
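The slide leaves the three values open as an exercise. One possible way to work them out is sketched below, under stated assumptions: tracks are numbered 1 to 16,384, the seek formula is read as using the distance |j - i|, and "just missed the beginning" is approximated by one full rotation (an upper bound).

```python
# Disk parameters from the practice problem.
RPM = 7200
TRACKS = 16_384
SECTORS_PER_TRACK = 128
SECTORS_PER_BLOCK = 4


def seek_time_us(i, j):
    # Seek model from the problem, read as using the distance |j - i| (assumption).
    return 1000 + abs(j - i)


rotation_ms = 60_000 / RPM

# Worst-case seek: from the first track to the last one.
max_seek_ms = seek_time_us(1, TRACKS) / 1000
# Head just missed the start of the block: wait (almost) one full rotation.
max_rotational_ms = rotation_ms
transfer_ms = rotation_ms * SECTORS_PER_BLOCK / SECTORS_PER_TRACK

maximum_ms = max_seek_ms + max_rotational_ms + transfer_ms
print(f"seek = {max_seek_ms:.3f} ms, rotation = {max_rotational_ms:.3f} ms, "
      f"transfer = {transfer_ms:.3f} ms, total = {maximum_ms:.2f} ms")
```

Under these assumptions the seek dominates: about 17.4 ms of the roughly 26 ms total.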




Practice Problem: Average Time

 • Solve…
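A possible solution sketch, not the official answer. It assumes that source and target tracks are chosen uniformly at random with i ≠ j, that the seek formula uses the distance |j - i|, and that the rotational delay averages half a rotation. For N tracks the mean distance over all ordered pairs with i ≠ j is (N + 1)/3, which the brute-force helper checks on a small N.

```python
# Disk parameters from the practice problem.
RPM = 7200
TRACKS = 16_384
SECTORS_PER_TRACK = 128
SECTORS_PER_BLOCK = 4


def mean_distance_closed_form(n):
    """Mean of |j - i| over ordered pairs i != j drawn from 1..n."""
    return (n + 1) / 3


def mean_distance_brute_force(n):
    """Same quantity by direct enumeration; only practical for small n."""
    total = sum(abs(j - i)
                for i in range(1, n + 1)
                for j in range(1, n + 1) if i != j)
    return total / (n * (n - 1))


rotation_ms = 60_000 / RPM
avg_seek_ms = (1000 + mean_distance_closed_form(TRACKS)) / 1000
avg_rotational_ms = rotation_ms / 2
transfer_ms = rotation_ms * SECTORS_PER_BLOCK / SECTORS_PER_TRACK

average_ms = avg_seek_ms + avg_rotational_ms + transfer_ms
print(f"seek = {avg_seek_ms:.3f} ms, rotation = {avg_rotational_ms:.3f} ms, "
      f"transfer = {transfer_ms:.3f} ms, total = {average_ms:.2f} ms")
```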




• So far: Random Block Access
• What about: Reading “Next” block?

Time to get block = Block Size / t + Negligible

Negligible:
  - skip gap
  - switch track
  - once in a while, next cylinder




Rule of Thumb
    Random I/O: Expensive
    Sequential I/O: Much less

• Ex: 1 KB Block
    » Random I/O: ~ 20 ms.
    » Sequential I/O: ~ 1 ms.




Practice Problem cont’d: Sustained Bandwidth over Track

• Assume required blocks are consecutive
  on single track
• What is the sustained bandwidth of
  fetching consecutive blocks?
• 128 sectors/track * 4KB/sector in
  8.33ms/track full rotation =
  512KB/8.33ms ≈ 61.46KB/ms
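The bandwidth figure can be recomputed as follows. This is a sketch using the practice problem's parameters and assuming 1 KB = 1024 bytes; with the unrounded rotation time the result is 61.44 KB/ms rather than 61.46 KB/ms.

```python
# Disk parameters from the practice problem.
RPM = 7200
SECTORS_PER_TRACK = 128
BYTES_PER_SECTOR = 4096

rotation_ms = 60_000 / RPM
track_kb = SECTORS_PER_TRACK * BYTES_PER_SECTOR / 1024   # 512 KB per track

# One full track passes under the head per rotation.
bandwidth_kb_per_ms = track_kb / rotation_ms
bandwidth_mb_per_s = bandwidth_kb_per_ms * 1000 / 1024

print(f"{bandwidth_kb_per_ms:.2f} KB/ms = {bandwidth_mb_per_s:.1f} MB/s")
```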




Suggested optimization

  • Cluster data in consecutive blocks
  • Give an extra point to algorithms that
    – exploit data clustering by avoiding
      “random” accesses
    – read/write consecutive blocks




Example: 2-Phase Merge Sort

Input: P K A D L E Z W J C R H Y F X I
Main Memory: 4 blocks

[Figure: Phase 1 READs a memory-load of the input (first P K A D L E Z W), SORTs it in memory, and WRITEs it out as a sorted run; the second run is C F H I J R X Y. Phase 2 MERGEs the sorted runs and WRITEs the result.]

Improve by bringing max number of blocks in memory
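The two phases can be sketched in a few lines. This is an in-memory illustration only: Python lists stand in for disk-resident runs, `two_phase_merge_sort` is a hypothetical name, and the memory capacity of 8 records assumes 4 blocks of 2 records each, which is inferred from the example rather than stated.

```python
import heapq


def two_phase_merge_sort(records, memory_capacity):
    """Return (sorted_runs, merged_result) for a list of records."""
    # Phase 1: read one memory-load at a time, sort it, write it out as a run.
    runs = []
    for start in range(0, len(records), memory_capacity):
        chunk = records[start:start + memory_capacity]
        runs.append(sorted(chunk))

    # Phase 2: merge all runs, always emitting the smallest head element.
    merged = list(heapq.merge(*runs))
    return runs, merged


records = list("PKADLEZWJCRHYFXI")
runs, result = two_phase_merge_sort(records, memory_capacity=8)
print(["".join(run) for run in runs])
print("".join(result))
```

In a real system each run is read sequentially from disk during the merge, which is why larger memory (fewer, longer runs) reduces random I/O.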




Cost for Writing similar to Reading

…. unless we want to verify!
  need to add (full) rotation + Block size / t




To Modify a Block:
    (a) Read Block
    (b) Modify in Memory
    (c) Write Block
    [(d) Verify?]




Block Address:


•   Physical Device
•   Cylinder #
•   Surface #
•   Sector

Once upon a time DBs
had access to such – now
it is the OS’s domain




Optimizations (in controller or O.S.)

• Disk Scheduling Algorithms
    – e.g., elevator algorithm
• Track (or larger) Buffer
• Pre-fetch
• Arrays
• Mirrored Disks
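To illustrate the elevator idea, the sketch below orders a fixed batch of pending cylinder requests: the head keeps moving in its current direction, serves everything on the way, then reverses. It is a simplification (a real scheduler also accepts requests that arrive during the sweep), and the function names are hypothetical.

```python
def elevator_order(head, requests, moving_up=True):
    """Order in which a fixed batch of cylinder requests is served."""
    pending = sorted(requests)
    if moving_up:
        first = [c for c in pending if c >= head]          # on the way up
        second = [c for c in pending if c < head][::-1]    # then sweep down
    else:
        first = [c for c in pending if c <= head][::-1]    # on the way down
        second = [c for c in pending if c > head]          # then sweep up
    return first + second


def total_travel(head, order):
    """Total number of cylinders the head crosses when serving `order`."""
    distance = 0
    for cylinder in order:
        distance += abs(cylinder - head)
        head = cylinder
    return distance


requests = [10, 95, 60, 20, 70]
order = elevator_order(50, requests)
print(order, total_travel(50, order))        # elevator order and its travel
print(requests, total_travel(50, requests))  # first-come-first-served travel
```

For this batch the elevator order travels 130 cylinders versus 250 for arrival order, which is the kind of seek saving the optimization targets.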




Double Buffering
Problem: Have a File
            » Sequence of Blocks B1, B2

         Have a Program
            » Process B1
            » Process B2
            » Process B3
                ...




Single Buffer Solution

(1)   Read B1 → Buffer
(2)   Process Data in Buffer
(3)   Read B2 → Buffer
(4)   Process Data in Buffer ...




Say P = time to process/block
    R = time to read in 1 block
    n = # blocks

Single buffer time = n(P+R)








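Double Buffering

[Figure: memory holds two buffers. While the program processes the block in one buffer, the next block is read from disk (blocks A B C D E F G) into the other; blocks already processed are marked done.]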




 Say P ≥ R
                     P = Processing time/block
                     R = IO time/block
                     n = # blocks

 What is processing time?

 • Double buffering time = R + nP

 • Single buffering time = n(R+P)

Improvement much more dramatic if
consecutive blocks: …
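The two formulas can be checked with a small timeline simulation. This is a sketch under simplifying assumptions: exactly two buffers, constant P and R, and a read that starts as soon as the disk is idle and a buffer is free. The function names are hypothetical.

```python
def single_buffer_time(n, P, R):
    """Read and process strictly alternate: n * (P + R)."""
    return n * (P + R)


def double_buffer_time(n, P, R):
    """Simulate two buffers: reading block k+1 overlaps processing block k."""
    read_end = R      # time at which the current block is fully in memory
    proc_end = 0.0    # time at which the previous block finished processing
    for _ in range(n):
        # Processing needs the block in memory and the CPU free. At that same
        # moment the disk is idle and the other buffer is free, so the next
        # read starts too.
        start = max(read_end, proc_end)
        proc_end = start + P
        read_end = start + R
    return proc_end


n, P, R = 3, 5.0, 2.0                  # P >= R, as on the slide
print(single_buffer_time(n, P, R))     # n(R+P)
print(double_buffer_time(n, P, R))     # R + nP
```

When P < R the same simulation gives nR + P instead, since the disk rather than the processor becomes the bottleneck.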




Block Size Selection?
 • Big Block → Amortize I/O Cost

  Unfortunately...

 • Big Block ⇒ Read in more useless stuff!
               and takes longer to read




Trend

• Memory prices drop and memory capacities
  increase
• Transfer rates increase
• Disk access times do not improve that much

⇒     blocks get bigger ...




Disk Failures (Sec 2.5)

• Partial → Total
• Intermittent → Permanent




Coping with Disk Failures

• Detection
    – e.g. Checksum

• Correction
    ⇒ Redundancy
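As an illustration of detection by checksum, the sketch below appends a CRC-32 to each block on write and verifies it on read. CRC-32 via `zlib.crc32` is a choice made for this example; the slides do not name a specific checksum, and `write_block` / `read_block` are hypothetical helpers.

```python
import zlib


def write_block(data: bytes) -> bytes:
    """Return the bytes to store: the data followed by its 4-byte CRC-32."""
    return data + zlib.crc32(data).to_bytes(4, "big")


def read_block(stored: bytes) -> bytes:
    """Verify the stored checksum and return the data, or raise IOError."""
    data, stored_crc = stored[:-4], int.from_bytes(stored[-4:], "big")
    if zlib.crc32(data) != stored_crc:
        raise IOError("checksum mismatch: block is corrupted")
    return data


stored = write_block(b"some block contents")
print(read_block(stored))

# Flip one bit to simulate a media error; the read now fails loudly.
corrupted = bytes([stored[0] ^ 0x01]) + stored[1:]
try:
    read_block(corrupted)
except IOError as err:
    print("detected:", err)
```

A checksum only detects the failure; recovering the data requires redundancy such as a second copy.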




At what level do we cope?

• Single Disk
  – e.g., Error Correcting Codes
• Disk Array

[Figure: Logical → Physical mapping]




• Operating System
  – e.g., Stable Storage

[Figure: one Logical Block stored as Copy A and Copy B]
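A minimal in-memory sketch of the stable-storage idea: each logical block is kept as two checksummed copies, written one after the other, and a read falls back to the second copy if the first fails verification. The class and helper names are hypothetical, CRC-32 is an assumed checksum, and real stable storage also has to handle crashes between the two writes.

```python
import zlib


def seal(data: bytes) -> bytes:
    """Attach a 4-byte CRC-32 so corruption can be detected later."""
    return data + zlib.crc32(data).to_bytes(4, "big")


def is_valid(stored: bytes) -> bool:
    """True if the stored bytes carry a checksum that matches their data."""
    if len(stored) < 4:
        return False
    return zlib.crc32(stored[:-4]) == int.from_bytes(stored[-4:], "big")


class StableBlock:
    """One logical block backed by two physical copies (A and B)."""

    def __init__(self):
        self.copy_a = seal(b"")
        self.copy_b = seal(b"")

    def write(self, data: bytes) -> None:
        self.copy_a = seal(data)   # write copy A first ...
        self.copy_b = seal(data)   # ... then copy B

    def read(self) -> bytes:
        for copy in (self.copy_a, self.copy_b):
            if is_valid(copy):
                return copy[:-4]
        raise IOError("both copies are corrupted")


block = StableBlock()
block.write(b"account balance = 100")
block.copy_a = b"garbage" + block.copy_a[7:]   # simulate damage to copy A
print(block.read())                            # still served from copy B
```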




• Database System
  – e.g.,

[Figure: Current DB, Log, Last week’s DB]




Summary
• Secondary storage, mainly disks
• I/O times
• I/Os should be avoided,
          especially random ones…..

Powerful Google developer tools for immediate impact! (2023-24 C)
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 

Hardware

  • 1. CSE232A: Database System Principles
    Notes 02: Hardware

    Database System Architecture
    [Figure: system architecture diagram]
    • Query Processing: SQL query → Parser → (relational algebra) → Query Rewriter and Optimizer → (query execution plan) → Execution Engine → Buffer Manager → Data + Indexes
    • The Query Rewriter and Optimizer uses View definitions and Statistics & Catalogs & System Data
    • Transaction Management: Calls from Transactions (read, write) → Transaction Manager, Concurrency Controller (with Lock Table), Recovery Manager (with Log)
    • Topic of these notes: hardware aspects of storing and retrieving data

    Memory Hierarchy
    • Cache memory
       – On-chip and L2
       – Caching outside control of DB system
    • RAM
       – Addressable space includes virtual memory but DB systems avoid it
       – Main memory DBs rely more on OS
    • Disk
       – Access speed & Transfer rate
       – Winchester, arrays, …
    • Tertiary storage
       – Tapes, jukeboxes, DVDs
    [Figure: arrows along the hierarchy labeled Cost per byte, Capacity, Access Speed]
  • 2. Storage Cost
    [Figure, from Gray & Reuter: typical capacity (bytes), 10^3 to 10^15, versus access time (sec), 10^-9 to 10^3. Plotted levels: cache, electronic main, electronic secondary, online magnetic and optical disks, online tape, nearline tape & optical disks, offline tape]

    Storage Cost
    [Figure, from Gray & Reuter: dollars/MB, 10^-4 to 10^4, versus access time (sec), 10^-9 to 10^3, for the same storage levels]

    Volatile Vs Non-Volatile Storage
    • Persistence important for transaction atomicity and durability
    • Even if database fits in main memory, changes have to be written in non-volatile storage
    • Hard disk
    • RAM disks w/ battery
    • Flash memory
  • 3. Cost of Disk Access
    • How many blocks were accessed?
    • Clustered/consecutive?

    Moore’s Law: Different Rates of Improvement
    • Processor speed
    • Main memory bit/$
    • Disk bit/$
    • RAM access speed
    • Disk access speed
    • Disk transfer rate
    Clustered/sequential access-based algorithms become relatively better
    [Figure: Disk Transfer Rate compared with Disk Access Time]

    Moore’s Law: Different Rates of Improvement
    Cost of “miss” increases
    [Figure: Cache Capacity and RAM Capacity compared with Disk Access Time]
  • 4. Focus on: “Typical Disk”
    [Figure: disks attached to a controller, which connects to the BUS]
    Terms: Platter, Head, Actuator, Cylinder, Track, Sector (physical), Block (logical), Gap

    Top View
    [Figure: top view of a platter with labels Track, Sector, Gap, Block (typically multiple sectors)]
    Often different numbers of sectors per track

    “Typical” Numbers
    • Diameter: 1 inch → 15 inches
    • Cylinders: 100 → 20000
    • Surfaces (tracks/cyl): 1 (CDs) → 2 (floppies) → 5 (typical hd) → 30
    • Sector size: 512 B → 50 K
    • Capacity: 360 KB (old floppy) → 30 GB
  • 5. Key performance metric: Time to fetch block
    [Figure: request “I want block x”; block X in memory?]
    Time = Seek Time (locate track)
         + Rotational Delay (locate sector)
         + Transfer Time (fetch block)
         + Other (disk controller, …)

    Track Seek Delay
    [Figure: the track where the head is and the track where the head must go]

    Rotational Delay
    [Figure: the head's current position on the track (“Head Here”) and the position of “Block I Want”]
  • 6. Seek Time
    [Figure: seek time versus cylinders traveled (1 to N); axis labels x (few ms) and 3 or 5x]

    Average Random Seek Time
    $S = \frac{\sum_{i=1}^{N} \sum_{j=1,\, j \ne i}^{N} \mathrm{SEEKTIME}(i \to j)}{N(N-1)}$
    “Typical” S: 10 ms → 40 ms

    Average Rotational Delay
    R = 1/2 revolution
    “Typical” R = 4.17 ms (7200 RPM, where a full revolution takes 8.33 ms)
    Assume we have to start reading from the start of the first sector
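    The average random seek time formula can be evaluated directly as a double sum. The sketch below is only an illustration: the function names are made up for this example, and the linear seek model (1000 µs startup plus 1 µs per cylinder crossed, using the absolute distance) is an assumption, not a property of any real disk.

```python
def average_seek_time(n_cylinders, seek_time):
    """Average seek_time(i, j) over all ordered pairs of cylinders with i != j."""
    total = 0.0
    for i in range(1, n_cylinders + 1):
        for j in range(1, n_cylinders + 1):
            if i != j:
                total += seek_time(i, j)
    return total / (n_cylinders * (n_cylinders - 1))


def linear_seek_us(i, j):
    # Assumed model: 1000 us to start and stop, plus 1 us per cylinder crossed.
    return 1000 + abs(j - i)


n = 200
print(average_seek_time(n, linear_seek_us))  # brute-force double sum
print(1000 + (n + 1) / 3)                    # closed form for this linear model
```

    For a linear model the mean distance between two distinct cylinders is (N + 1)/3, so the average seek is roughly a third of a full sweep plus the fixed startup cost.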
  • 7. Transfer Rate: t
    • “Typical” t: 1 → 3 MB/second
    • Transfer time: block size / t

    Other Delays
    • CPU time to issue I/O
    • Contention for controller
    • Contention for bus, memory
    “Typical” Value: 0

    Practice Problem
    • Single surface
    • Rotation speed 7200 rpm
    • 16,384 tracks
    • 128 sectors/track
    • 4096 bytes/sector
    • 4 sectors/block (16,384 bytes/block)
    • SEEKTIME (i → j) = [1000 + (j-i)] µs
    • Neglect gaps
    • Calculate minimum, maximum, average time to fetch one block
  • 8. Practice Problem: Minimum Time
    • Head is at the start of the first sector of the block
    • Just compute transfer time
    • 4 sectors cover 4/128 of a track
    • 1 full rotation takes 60/7200 s = 8.33 ms
    • Transfer time is 8.33 * 4/128 = 0.26 ms

    Practice Problem: Maximum Time
    • Assume read must start at the first sector
    • Head is at innermost, required track is the outermost
    • Seek time = …
    • Head just missed the beginning
    • Rotational delay = …
    • Transfer time = …

    Practice Problem: Average Time
    • Solve…
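    One possible way to work the practice problem numerically is sketched below. It rests on assumptions that the slides do not spell out: the seek distance is taken as |j - i|, the worst-case rotational delay is rounded up to one full rotation, and the average seek is taken over distinct track pairs, following the average random seek time formula. All variable names are illustrative.

```python
RPM = 7200
TRACKS = 16_384
SECTORS_PER_TRACK = 128
SECTORS_PER_BLOCK = 4

rotation_ms = 60_000 / RPM  # one full revolution, about 8.33 ms
transfer_ms = rotation_ms * SECTORS_PER_BLOCK / SECTORS_PER_TRACK


def seek_ms(i, j):
    # SEEKTIME(i -> j) = [1000 + |j - i|] us, converted to ms (|j - i| is assumed).
    return (1000 + abs(j - i)) / 1000


# Minimum: head already at the start of the block, so only the transfer counts.
min_ms = transfer_ms

# Maximum: seek across all tracks, wait (almost) a full rotation, then transfer.
max_ms = seek_ms(1, TRACKS) + rotation_ms + transfer_ms

# Average: mean distance between two distinct tracks is (TRACKS + 1) / 3,
# and the mean rotational delay is half a rotation.
avg_seek_ms = (1000 + (TRACKS + 1) / 3) / 1000
avg_ms = avg_seek_ms + rotation_ms / 2 + transfer_ms

print(f"min {min_ms:.2f} ms, max {max_ms:.2f} ms, avg {avg_ms:.2f} ms")
```

    Under these assumptions the seek dominates the worst case, while seek and rotational delay together dominate the average case; the transfer itself is a small fraction of either.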
  • 9. • So far: Random Block Access
    • What about: Reading “Next” block?
    Time to get block = Block Size / t + Negligible
       – skip gap
       – switch track
       – once in a while, next cylinder

    Rule of Thumb
    • Random I/O: Expensive
    • Sequential I/O: Much less
    • Ex: 1 KB Block
       – Random I/O: ~ 20 ms
       – Sequential I/O: ~ 1 ms

    Practice Problem cont’d: Sustained Bandwidth over Track
    • Assume required blocks are consecutive on single track
    • What is the sustained bandwidth of fetching consecutive blocks?
    • 128 sectors/track * 4 KB/sector in 8.33 ms/track full rotation = 512 KB / 8.33 ms = 61.46 KB/ms
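    The rule of thumb and the sustained-bandwidth calculation can be compared with a few lines of arithmetic. This is a rough sketch: the 20 ms and 1 ms figures are the rule-of-thumb values quoted above, the function name is invented for the example, and the model assumes a sequential scan pays one random access for its first block.

```python
def read_time_ms(n_blocks, sequential, random_io_ms=20.0, sequential_io_ms=1.0):
    """Estimated time to read n_blocks using the rule-of-thumb I/O costs."""
    if n_blocks == 0:
        return 0.0
    if sequential:
        # One random access to reach the first block, then cheap "next" reads.
        return random_io_ms + (n_blocks - 1) * sequential_io_ms
    return n_blocks * random_io_ms


print(read_time_ms(1000, sequential=False))  # every block is a random I/O
print(read_time_ms(1000, sequential=True))   # one seek, then sequential reads

# Sustained bandwidth over one track of the practice-problem disk.
rotation_ms = 60_000 / 7200      # unrounded rotation time
track_kb = 128 * 4               # 128 sectors of 4 KB each
bandwidth_kb_per_ms = track_kb / rotation_ms
print(round(bandwidth_kb_per_ms, 2))
```

    With the unrounded rotation time the bandwidth comes out at 61.44 KB/ms; the 61.46 figure results from dividing by the rounded 8.33 ms.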
  • 10. Suggested optimization
    • Cluster data in consecutive blocks
    • Give an extra point to algorithms that
       – exploit data clustering by avoiding “random” accesses
       – Read/write consecutive blocks

    Example: 2-Phase Merge Sort
    • Input blocks: P K A D L E Z W J C R H Y F X I
    • Main Memory: 4 blocks
    • READ blocks into memory (P K A D), SORT them (A D K P), WRITE the sorted run; repeat for the remaining blocks
    • MERGE the sorted runs and WRITE the result
    • Improve by bringing max number of blocks in memory
    [Figure: animation of the READ, SORT, WRITE, and MERGE steps]

    Cost for Writing similar to Reading
    …. unless we want to verify!
    Need to add (full) rotation + Block size / t
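    The two phases of the merge sort example can be sketched in a few lines. This is a simplified in-memory illustration, not the slides' exact procedure: each letter stands for one block, Python lists stand in for disk files, and the sketch ignores the input and output buffers that a real merge phase needs for each run. The function name and parameters are invented for the example.

```python
import heapq


def two_phase_merge_sort(blocks, memory_capacity):
    """Sort blocks using sorted runs of at most memory_capacity items."""
    # Phase 1: read memory_capacity blocks at a time, sort them in memory,
    # and write each sorted run back out (here: append to a list of runs).
    runs = []
    for start in range(0, len(blocks), memory_capacity):
        runs.append(sorted(blocks[start:start + memory_capacity]))

    # Phase 2: merge all runs in a single pass, always taking the smallest
    # remaining head element among the runs.
    return list(heapq.merge(*runs))


data = list("PKADLEZWJCRHYFXI")
print("".join(two_phase_merge_sort(data, memory_capacity=4)))
```

    Both phases read and write the data in long consecutive stretches, which is why the algorithm benefits from clustering: each run is written sequentially in phase 1 and read sequentially in phase 2.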
  • 11. To Modify a Block?
    To Modify Block:
    (a) Read Block
    (b) Modify in Memory
    (c) Write Block
    [(d) Verify?]

    Block Address:
    • Physical Device
    • Cylinder #
    • Surface #
    • Sector
    Once upon a time DBs had access to such addresses; now it is the OS’s domain

    Optimizations (in controller or O.S.)
    • Disk Scheduling Algorithms
       – e.g., elevator algorithm
    • Track (or larger) Buffer
    • Pre-fetch
    • Arrays
    • Mirrored Disks
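    As an illustration of disk scheduling, the sketch below orders a batch of pending requests the way an elevator sweep would and compares the head travel with first-come-first-served order. It is a minimal model under stated assumptions: requests are plain cylinder numbers, no new requests arrive during the sweep, and the function names and the sample request list are made up for this example.

```python
def elevator_order(pending, head, moving_up=True):
    """Serve requests in the current direction first, then reverse once."""
    if moving_up:
        ahead = sorted(c for c in pending if c >= head)
        behind = sorted((c for c in pending if c < head), reverse=True)
    else:
        ahead = sorted((c for c in pending if c <= head), reverse=True)
        behind = sorted(c for c in pending if c > head)
    return ahead + behind


def total_travel(order, head):
    """Total number of cylinders the head crosses serving order in sequence."""
    travel = 0
    for cylinder in order:
        travel += abs(cylinder - head)
        head = cylinder
    return travel


requests = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical pending requests
start = 53
sweep = elevator_order(requests, start)
print(sweep, total_travel(sweep, start))
print(requests, total_travel(requests, start))  # first-come-first-served
```

    The elevator order trades fairness to early requests for shorter total head movement, since the head never reverses direction more than once per sweep.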
  • 12. Double Buffering Problem
    Have a File
       – Sequence of Blocks B1, B2, ...
    Have a Program
       – Process B1
       – Process B2
       – Process B3
       ...

    Single Buffer Solution
    (1) Read B1 → Buffer
    (2) Process Data in Buffer
    (3) Read B2 → Buffer
    (4) Process Data in Buffer
    ...

    Say P = time to process/block
        R = time to read in 1 block
        n = # blocks
    Single buffer time = n(P+R)
  • 13. Double Buffering
    [Figure: memory buffers holding blocks from the disk sequence A B C D E F G; one block is processed while another is read, and finished blocks are marked done]

    Say P ≥ R
        P = Processing time/block
        R = IO time/block
        n = # blocks
    What is processing time?
    • Double buffering time = R + nP
    • Single buffering time = n(R+P)
    Improvement much more dramatic if consecutive blocks: …

    Block Size Selection?
    • Big Block → Amortize I/O Cost
    Unfortunately...
    • Big Block ⇒ Read in more useless stuff! and takes longer to read
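    The two timing formulas can be checked with a small calculation. The sketch below is illustrative only: the function names are invented, and the double-buffering formula is generalized with max(P, R) so that it also covers P < R, which goes beyond the P ≥ R case stated on the slide.

```python
def single_buffer_time(n, p, r):
    """Read and process strictly alternate: n * (P + R)."""
    return n * (p + r)


def double_buffer_time(n, p, r):
    """Reads overlap with processing of the previous block.

    The first read cannot overlap with anything, the last block still has to
    be processed, and each step in between costs max(P, R). For P >= R this
    reduces to R + n * P. The max() generalization is an assumption added here.
    """
    if n == 0:
        return 0
    return r + (n - 1) * max(p, r) + p


n, p, r = 100, 3, 2  # hypothetical values with P >= R
print(single_buffer_time(n, p, r))
print(double_buffer_time(n, p, r))
```

    When P ≥ R the disk is never the bottleneck after the first read, so the total is just the processing time plus one initial read.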
  • 14. Trend
    • Memory prices drop and memory capacities increase
    • Transfer rates increase
    • Disk access times do not improve that much
    ⇒ blocks get bigger ...

    Disk Failures (Sec 2.5)
    • Partial → Total
    • Intermittent → Permanent

    Coping with Disk Failures
    • Detection
       – e.g. Checksum
    • Correction
       ⇒ Redundancy
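    Checksum-based detection can be sketched as follows. The choice of CRC-32 (via Python's `zlib.crc32`) and the layout that appends the checksum to the block are assumptions made for this example; the slides only say "checksum" and do not prescribe a particular code. The function names are hypothetical.

```python
import zlib


def write_block(payload: bytes) -> bytes:
    """Return the stored form of a block: payload followed by its CRC-32."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def read_block(stored: bytes) -> bytes:
    """Return the payload, or raise if the stored checksum does not match."""
    payload = stored[:-4]
    stored_crc = int.from_bytes(stored[-4:], "big")
    if zlib.crc32(payload) != stored_crc:
        raise IOError("checksum mismatch: block is corrupted")
    return payload


block = write_block(b"some database page")
print(read_block(block))

corrupted = bytes([block[0] ^ 0x01]) + block[1:]  # flip one bit
try:
    read_block(corrupted)
except IOError as err:
    print("detected:", err)
```

    A checksum only detects the failure; recovering the data still requires redundancy, such as a second copy of the block.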
  • 15. At what level do we cope?
    • Single Disk
       – e.g., Error Correcting Codes
    • Disk Array
    [Figure: mapping from Logical to Physical]

    Operating System
    e.g., Stable Storage
    [Figure: one Logical Block stored as Copy A and Copy B]

    Database System
    • e.g., Log
    [Figure: Current DB and Last week’s DB]
  • 16. Summary
    • Secondary storage, mainly disks
    • I/O times
    • I/Os should be avoided, especially random ones…