11.4.14   - mycassandra -   1

NoSQL, Key-Value Store (KVS), Document-oriented DB, GraphDB
Examples: memcached, Google Bigtable, Amazon Dynamo, Amazon SimpleDB, Apache Cassandra, Voldemort, Ringo, Vpork, MongoDB, CouchDB, Tokyo Cabinet/Tokyo Tyrant, Flare, ROMA, kumofs, Kai, Redis, Hadoop HBase, Hypertable, Yahoo! PNUTS, Scalaris, Dynomite, ThruDB, Neo4j, IBM ObjectGrid, Oracle Coherence, Velocity, …
•  (join, transaction)

•  data model
     •  key/value vs. multi-dimensional map vs. document vs. graph
•  consistency
     •  strong vs. weak
•  data partition
     •  row vs. column
•  node organization
     •  master/slave vs. decentralized

write-optimized vs. read-optimized

                    write-optimized                               read-optimized
Systems             Bigtable, Cassandra, HBase                    MySQL, Sherpa
Index structure     Log-Structured Merge Tree [P. O’Neil ’96]     B+-Tree [R. Bayer ’72]
                    Bigtable                                      MySQL

[Figure: write-optimized and read-optimized systems compared under Write-Heavy and Read-Heavy workloads. Source: Yahoo! Cloud Serving Benchmark, SOCC ’10]

1. MyCassandra
2. MyCassandra Cluster
[Figure: read-optimized, write-optimized, read/write-optimized]

Apache Cassandra
•  Consistent Hashing (ID), N = 3
•  request proxy
•  primary node
•  secondary node
[Figure: hash ring with nodes A, F, N, Q, V, Z; hash(key) = Q; the key and its values are held by the primary, secondary 1, and secondary 2]

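The replica placement on this slide can be sketched as follows. This is a minimal illustration, not Cassandra's implementation: the node names come from the figure, while the `Ring` class, the choice of MD5 as the hash function, and the rule that the secondaries are the next nodes clockwise after the primary are assumptions made for the example.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string to a position on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy consistent-hashing ring; a hypothetical helper, not a Cassandra API."""

    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        # Sort the nodes by their position on the ring.
        self.entries = sorted((ring_position(n), n) for n in nodes)
        self.positions = [pos for pos, _ in self.entries]

    def replica_nodes(self, key):
        """Return [primary, secondary 1, secondary 2, ...] for a key."""
        # The primary is the first node at or after hash(key), wrapping around.
        start = bisect.bisect_left(self.positions, ring_position(key))
        count = min(self.replicas, len(self.entries))
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring(["A", "F", "N", "Q", "V", "Z"], replicas=3)
primary, *secondaries = ring.replica_nodes("some-key")
print("primary:", primary, "secondaries:", secondaries)
```

Any node can act as the request proxy in this model, because every node can compute the same replica list from the key alone.
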
Google Bigtable (write)
•  Bigtable: sequential write I/O
•  always writable (write-lock)
[Figure: write path in Cassandra. A write goes to the Commit Log on disk and to the Memtable in memory (map: <key, ColumnFamily>, e.g. <k1, cf1+cf2>); data reaches the SSTables on disk asynchronously (<k1, cf1>, <k1, cf2>)]

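The write path in the figure can be modelled with a toy store. This is a sketch under stated assumptions, not Cassandra or Bigtable code: plain Python lists stand in for files on disk, the class name and `flush_threshold` parameter are hypothetical, and the flush runs inline whereas the slide marks it as async.

```python
class ToyWriteStore:
    """Toy Bigtable-style write path; lists stand in for files on disk."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold
        self.commit_log = []   # append-only log ("disk", sequential writes)
        self.memtable = {}     # key -> {column family -> value} ("memory")
        self.sstables = []     # immutable sorted tables ("disk")

    def write(self, key, column_family, value):
        # 1. Append to the commit log first so the write survives a crash.
        self.commit_log.append((key, column_family, value))
        # 2. Apply the write to the in-memory Memtable; no disk read is needed.
        self.memtable.setdefault(key, {})[column_family] = value
        # 3. Flush when the Memtable holds enough keys (inline here for
        #    simplicity; the slide marks this step as async).
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the Memtable out as one sorted, immutable SSTable.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs the log


store = ToyWriteStore(flush_threshold=2)
store.write("k1", "cf1", "a")
store.write("k1", "cf2", "b")   # same key: the Memtable now holds <k1, cf1+cf2>
store.write("k2", "cf1", "c")   # a second key reaches the threshold and flushes
print(store.sstables)
```

Every step on the write path is either an append or an in-memory update, which is why this design favours write-heavy workloads.
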
Google Bigtable (read)
•  Memtable: value
•  SSTable: value, I/O
[Figure: read path in Cassandra. Memory: Memtable (map: <key, ColumnFamily>) holding <k1, CF4>. Disk: Commit Log, and SSTables holding <key, CF1>, <key, CF2>, <key, CF3>, read with I/O]

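The cost of a read in this layout can be illustrated with a short sketch. It assumes that a read consults the Memtable plus every SSTable and merges the column families it finds; the `read` function and the lookup counter are hypothetical helpers for illustration, and real systems reduce the number of lookups with additional structures that are not modelled here.

```python
def read(key, memtable, sstables):
    """Merge the column families of `key` from the Memtable and all SSTables.

    Returns (merged_row, sstable_lookups). SSTables are ordered oldest first,
    so newer data overrides older data.
    """
    merged = {}
    lookups = 0
    for table in sstables:                 # each lookup stands for one disk I/O
        lookups += 1
        merged.update(table.get(key, {}))
    merged.update(memtable.get(key, {}))   # in memory, newest data, no disk I/O
    return merged, lookups


# Layout mirroring the figure: CF1..CF3 in three SSTables, CF4 in the Memtable.
sstables = [{"k1": {"CF1": 1}}, {"k1": {"CF2": 2}}, {"k1": {"CF3": 3}}]
memtable = {"k1": {"CF4": 4}}
row, lookups = read("k1", memtable, sstables)
print(row, lookups)  # {'CF1': 1, 'CF2': 2, 'CF3': 3, 'CF4': 4} 3
```

One logical read turns into three disk lookups in this example, which is the read-side price of the write-optimized layout.
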
1. MyCassandra
[Figure: read-optimized, write-optimized]

Cassandra
[Figure: Consistent Hashing and Gossip Protocol above the storage engines Bigtable, MySQL (InnoDB, MyISAM, Memory, …), Redis, …]

MyCassandra
•  Cassandra
•  JDBC API / stored procedure
•  key-value store
•  MyCassandra node × 6

2. MyCassandra Cluster
[Figure: read/write-optimized]

•  sync / async
•  Quorum Protocol: (write replicas) + (read replicas) > (total replicas)
•  mem

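The operands of the quorum inequality were lost from this slide. Assuming it refers to the usual quorum condition, in which the number of replicas that acknowledge a write plus the number consulted on a read must exceed the total number of replicas, the check can be written as a one-line function. The function names are hypothetical.

```python
def is_quorum_consistent(write_replicas: int, read_replicas: int, total_replicas: int) -> bool:
    """True when every read set overlaps every write set: W + R > N."""
    return write_replicas + read_replicas > total_replicas


def majority(total_replicas: int) -> int:
    """Smallest quorum size q such that q + q > N."""
    return total_replicas // 2 + 1


print(is_quorum_consistent(2, 2, 3))  # True
print(is_quorum_consistent(1, 1, 3))  # False
print(majority(3))                    # 2
```

With three replicas and a quorum of two on both sides, any read overlaps the latest acknowledged write in at least one replica.
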
•  W: write-optimized node
•  R: read-optimized node
•  RW: read/write-optimized node
•  MyCassandra nodes: (W) / (R) / (RW)
•  gossip protocol
•  Consistent Hashing on node ID, N = 3
     1.  (key)
     2.  × N-1
[Figure: ring of W, R, and RW nodes, labelled gossip]

host / node / storage
(1) 1 storage / 1 node / 1 host
(2) k nodes / host: virtual node [Amazon Dynamo, SOSP ’07]
(3) k storages / node
•  Fault Tolerance (FT) and space
[Figure: options (1), (2), and (3) placed on two axes, 1 node / host to k nodes / host and 1 storage / node to k storages / node]

Write
•  replicas = 3, quorum = 2
•  W:RW:R = 1:1:1
[Figure: Client → Proxy → replicas. Steps: 1), 2) W, RW (ACK), 3a), 3b) R (ACK)]
•  Latency: max (W, RW)

Read
•  replicas = 3, quorum = 2
•  W:RW:R = 1:1:1
[Figure: Client → Proxy → replicas. Steps: 1), 2) R, RW, 3a), 3b) or W, 4) Proxy (Cassandra read repair)]
•  Latency: max (R, RW)

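The latency labels on the two flow slides, max (W, RW) for writes and max (R, RW) for reads, can be expressed as two small functions. This is a sketch of one reading of those labels: it assumes the proxy waits only for the two node types named in each label, and the per-node latencies below are made-up numbers for illustration, not measurements from this deck.

```python
def write_latency(latency_ms: dict) -> float:
    """A write is assumed to return once both the W and RW replicas have acknowledged."""
    return max(latency_ms["W"], latency_ms["RW"])


def read_latency(latency_ms: dict) -> float:
    """A read is assumed to return once both the R and RW replicas have answered."""
    return max(latency_ms["R"], latency_ms["RW"])


# Hypothetical per-node-type latencies in milliseconds.
write_ms = {"W": 0.5, "RW": 0.2, "R": 4.0}   # the R node is slow at writes
read_ms = {"W": 8.0, "RW": 0.3, "R": 1.0}    # the W node is slow at reads

print(write_latency(write_ms))  # 0.5
print(read_latency(read_ms))    # 1.0
```

In both cases the slowest node type for the operation stays off the critical path, which is the point of mixing W, R, and RW replicas.
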
•  MyCassandra Cluster: 6×3 = 18 nodes / 6 hosts (W:R:RW = 6 : 6 : 6)
•  Cassandra: 6 nodes / 6 hosts
•  replicas = 3, quorum = 2
•  Storage engines: Bigtable (W), MySQL / InnoDB (R), Redis (RW)
•  Benchmark: YCSB (Yahoo! Cloud Serving Benchmark) [SOCC ’10]
     1.  MyCassandra/Cassandra × 6, YCSB Client × 1
     2.  1KB values (100 [Bytes] × 10 [columns]) + key, 1,000
     4.  YCSB
     5.  YCSB Stat

YCSB
•  4 workloads

               Workload       Application example    Operation ratio            Record selection
Write Heavy    Write-Only     Log                    Read: 0%, Write: 100%      Zipfian (*)
               Write-Heavy    Session Store          Read: 50%, Write: 50%
Read Heavy     Read-Heavy     Photo tagging          Read: 95%, Write: 5%
               Read-Only      Cache                  Read: 100%, Write: 0%

(*) Zipfian

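The operation ratios in the workload table can be turned into a small generator. This is a minimal sketch and not YCSB's own code: the `WORKLOADS` mapping and `generate_operations` function are hypothetical names, only the read/write mix is modelled, and the Zipfian record selection is left out.

```python
import random

# Read fraction of each workload, taken from the workload table.
WORKLOADS = {
    "Write-Only": 0.00,
    "Write-Heavy": 0.50,
    "Read-Heavy": 0.95,
    "Read-Only": 1.00,
}


def generate_operations(workload: str, count: int, seed: int = 0):
    """Return a list of "read"/"write" operations following the workload's ratio."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    read_fraction = WORKLOADS[workload]
    return ["read" if rng.random() < read_fraction else "write" for _ in range(count)]


ops = generate_operations("Read-Heavy", 10_000)
print(ops.count("read") / len(ops))
```

The printed fraction should come out close to 0.95; the exact value depends on the seed.
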
[Figure: avg. write-latency (ms) and avg. read-latency (ms) for Write-Only (write:100%, read:0%), Write-Heavy (write:50%, read:50%), Read-Heavy (write:5%, read:95%), and Read-Only (write:0%, read:100%); lower is better. Series: Cassandra, MyCassandra Cluster. Labels: MySQL + Redis; 11.5~23.5% on write latency; 49.7%, 85.2%, 88.5% on read latency]

[Figure: max. qps for 40 clients (query/sec) for Write-Only [100:0], Write-Heavy [50:50], Read-Heavy [5:95], and Read-Only [0:100] ([write:read]); higher is better. Series: Cassandra, MyCassandra Cluster. Labels: 0.99, 1.49, 0.62, 6.53]
•  Write Heavy
•  Read Heavy: 6.53

(1) HDD vs. SSD
[Figure: throughput (qps) of Cassandra and of MyCassandra Cluster, each on HDD and SSD; higher is better]

IOZone benchmark      HDD: Western Digital     SSD: Crucial
sequential write      86,277 qps               96,401 qps
sequential read       108,914 qps              216,099 qps
random write          2,485 qps                29,045 qps
random read           926 qps                  21,751 qps

Read-Heavy
•  88.5%
•  6.53

Write-Heavy
•  Cassandra

(1/2)
Write-Heavy
•  MySQL
•  write-optimized, write-heavy
[Figure: write latency, read latency, and throughput of Cassandra and MyCassandra cluster]

(2/2)
Amazon EC2
•  1 /N

FD-Tree: Tree Indexing on Flash Disks, VLDB ’10
•  B+tree + LSM-tree
•  SSD

•  MySQL: RDBMS
•  Anvil, SOSP ’09
•  Cloudy, VLDB ’10
•  Dynamo, SOSP ’07
•  MyCassandra

MyCassandra / MyCassandra Cluster

                 Cassandra     1. MyCassandra          2. MyCassandra Cluster
throughput       write         write or read           write and read
latency          low           lower in case           lower
persistence      yes           yes or no (memory)      yes

Common to all three:
•  data model: multi-dimensional map (Column Family)
•  consistency: weak (eventual, quorum)
•  replication: sync / async
•  data partition: row
•  node organization: decentralized

1)
2) MySQL + memcached: MyCassandra Cluster

Table
movie-id      name      thumb-name     tag                        count
704122313     movieA    EY37lHk5bgU    sport, soccer, FIFA, …     169,374
704122314     movieB    Zk3BSYMWjzQ    music, jazz, …             472,803

[Figure labels on the table: Read-Heavy, Write-Heavy]
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 

Cloud Storage That Achieves Both Read Performance and Write Performance (OS-117-24)

  • 1. 11.4.14 - mycassandra - 1
  • 2. NoSQL, Key-Value Store (KVS), Document-oriented DB, GraphDB: memcached, Google Bigtable, Amazon Dynamo, Amazon SimpleDB, Apache Cassandra, Voldemort, Ringo, Vpork, MongoDB, CouchDB, Tokyo Cabinet/Tokyo Tyrant, Flare, ROMA, kumofs, Kai, Redis, Hadoop HBase, Hypertable, Yahoo! PNUTS, Scalaris, Dynomite, ThruDB, Neo4j, IBM ObjectGrid, Oracle Coherence, Velocity, … • (join, transaction)
  • 3. key/value vs. multi-dimensional map vs. document vs. graph • strong vs. weak • row vs. column • master/slave vs. decentralized
  • 4. key/value vs. multi-dimensional map vs. document vs. graph • strong vs. weak • row vs. column • master/slave vs. decentralized
  • 5. Write-optimized vs. read-optimized • Bigtable, Cassandra, HBase: Log-Structured Merge Tree [P. O’Neil ’96] • MySQL, Sherpa: B+-Tree [R. Bayer ’72]
  • 6. [Figure: write-optimized vs. read-optimized systems under Write-Heavy and Read-Heavy workloads. Source: Yahoo! Cloud Serving Benchmark, SOCC ’10]
  • 7. 1. MyCassandra: read-optimized / write-optimized • 2. MyCassandra Cluster: read/write-optimized
  • 8. Apache Cassandra • Consistent Hashing • N = 3 • request proxy • primary node • secondary node [Figure: consistent-hashing ring with nodes A, F, N, Q, V, Z; hash(key) = Q; primary, secondary 1, secondary 2; key, values]
  • 9. Google Bigtable: write • Bigtable: sequential write I/O • always writable • write-lock [Figure: Cassandra write path; write of <k1, cf1+cf2>; in memory: Memtable (map: <key, ColumnFamily>); on disk: Commit Log and SSTable (<k1, cf1>, <k1, cf2>); async]
  • 10. Google Bigtable: read • key • Memtable: value • SSTable: value, I/O [Figure: Cassandra read path; in memory: Memtable (Map <key, ColumnFamily>, <k1, CF4>); on disk: Commit Log and SSTable (<key, CF1>, <key, CF2>, <key, CF3>), I/O]
  • 11. 1. MyCassandra: read-optimized / write-optimized
  • 12. Cassandra • Consistent Hashing • Gossip Protocol • Bigtable, MySQL (InnoDB, MyISAM, Memory, …), Redis, …
  • 13. MyCassandra • Cassandra • JDBC API / stored procedure • key-value store • MyCassandra node × 6
  • 14. 2. MyCassandra Cluster: read/write-optimized
  • 15. sync • async • Quorum Protocol: (replicas written) + (replicas read) > (total replicas) => mem
  • 16. W: write-optimized • R: read-optimized • RW: read/write-optimized • MyCassandra: (W) / (R) / (RW) • gossip protocol • N = 3 • Consistent Hashing [Figure: consistent-hashing ring of W, R and RW nodes; gossip]
  • 17. host, node, storage • (1) • (2) [Amazon Dynamo, SOSP ’07] • (3) FT space • Fault Tolerance (FT) space [Figure: 1 node / host vs. k nodes / host (virtual node); 1 storage / node vs. k storages / node; 1 storage / 1 node / 1 host]
  • 18. W:RW:R = 1:1:1 • 1) Client, Proxy • 2) W, RW: ACK • 3a) W • 3b) R: ACK • max (W, RW)
  • 19. W:RW:R = 1:1:1 • 1) Client, Proxy • 2) R, RW • 3a) • 3b) or W • 4) Proxy (Cassandra read repair) • max (R, RW)
  • 20. MyCassandra Cluster: 6 × 3 = 18 / 6 (W:R:RW = 6:6:6) • Cassandra: 6 / 6 • Bigtable (W), MySQL / InnoDB (R), Redis (RW) • YCSB (Yahoo! Cloud Serving Benchmark) [SOCC ’10] • 1. MyCassandra/Cassandra × 6, YCSB Client × 1 • 2. 1 KB values (100 [Bytes] × 10 [columns]) + key, 1,000 • 4. YCSB • 5. YCSB Stat
  • 21. YCSB • 4 workloads (Workload: Application Example; Operation Ratio) • Write-Only: Log; Read: 0%, Write: 100% • Write-Heavy: Session Store; Read: 50%, Write: 50% • Read-Heavy: Photo tagging; Read: 95%, Write: 5% • Read-Only: Cache; Read: 100%, Write: 0% • Record Selection: Zipfian
  • 22. [Figure: avg. write-latency (ms) and avg. read-latency (ms) for the Write-Only, Write-Heavy, Read-Heavy and Read-Only workloads; legend: Cassandra, MyCassandra Cluster, MySQL + Redis; annotations on the write panel: 11.5~23.5%; on the read panel: 88.5%, 85.2%, 49.7%]
  • 23. [Figure: max. qps for 40 clients (query/sec), Cassandra vs. MyCassandra Cluster, by [write:read] ratio: [100:0] Write-Only, [50:50] Write-Heavy, [5:95] Read-Heavy, [0:100] Read-Only; annotations: 0.99, 6.53, 0.62, 1.49] • 6.53
  • 24. (1) HDD vs. SSD [Figure: throughput (qps) of Cassandra and MyCassandra Cluster on HDD and on SSD] • HDD/SSD, IOZone benchmark (HDD: Western digital | SSD: Crucial) • sequential write: 86,277 qps | 96,401 qps • sequential read: 108,914 qps | 216,099 qps • random write: 2,485 qps | 29,045 qps • random read: 926 qps | 21,751 qps
  • 25. Read-Heavy • 88.5% • 6.53 • Write-Heavy • Cassandra
  • 26. (1/2) Write-Heavy • MySQL • write-optimized, write-heavy [Figure: write latency, read latency and throughput of Cassandra and MyCassandra cluster]
  • 27. (2/2) Amazon EC2 • 1/N
  • 28. FD-Tree: Tree Indexing on Flash Disks, VLDB ’10 • B+tree + LSM-tree • SSD • MySQL: RDBMS • Anvil, SOSP ’09 • Cloudy, VLDB ’10 • Dynamo, SOSP ’07 • MyCassandra
  • 29. MyCassandra / MyCassandra Cluster (Cassandra | 1. MyCassandra | 2. MyCassandra Cluster) • data model: multi-dimensional map (Column Family) • throughput: write | write or read | write and read • latency: low | lower in case | lower • persistence: yes | yes or no (memory) | yes • consistency: weak (eventual, quorum) • replication: sync / async • data partition: row • node organization: decentralized • throughput, latency
  • 30. MySQL + memcached • MyCassandra Cluster • Table (movie-id | name | thumb-name | tag | count) • 704122313 | movieA | EY37lHk5bgU | sport, soccer, FIFA, … | 169,374 • 704122314 | movieB | Zk3BSYMWjzQ | music, jazz, … | 472,803 • Read-Heavy • Write-Heavy
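
Slides 15, 18 and 19 combine a Quorum Protocol with a replica set built from W, RW and R nodes, and give max (W, RW) and max (R, RW) for the write and read paths. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes 3 replicas with write and read quorums of 2, and it models the proxy as waiting for the fastest ACKs, which simplifies the sync/async steps on the slides. The names `Replica`, `quorums_overlap` and `quorum_latency` are hypothetical, and the latency numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    kind: str        # node type from the slides: "W", "R" or "RW"
    write_ms: float  # hypothetical write latency, not a measured value
    read_ms: float   # hypothetical read latency, not a measured value


def quorums_overlap(replicas: int, write_quorum: int, read_quorum: int) -> bool:
    """Quorum condition: any read set shares at least one replica with any write set."""
    return write_quorum + read_quorum > replicas


def quorum_latency(latencies: list[float], quorum: int) -> float:
    """Latency seen by a proxy that waits for the `quorum` fastest ACKs."""
    return sorted(latencies)[quorum - 1]


# One replica of each type, as in the W:RW:R = 1:1:1 layout on slides 18 and 19.
replica_set = [
    Replica("W", write_ms=0.5, read_ms=8.0),
    Replica("RW", write_ms=0.3, read_ms=0.4),
    Replica("R", write_ms=6.0, read_ms=1.0),
]

# With a quorum of 2, the slow writer (R) and the slow reader (W) are off the
# critical path, so the results equal max(W, RW) and max(R, RW) respectively.
write_latency = quorum_latency([r.write_ms for r in replica_set], quorum=2)
read_latency = quorum_latency([r.read_ms for r in replica_set], quorum=2)

print(quorums_overlap(3, 2, 2))  # True
print(write_latency)             # 0.5
print(read_latency)              # 1.0
```

With these made-up numbers the write waits only for the W and RW replicas and the read waits only for the R and RW replicas, which is why the slowest replica of each operation does not appear in either result.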