SlideShare uma empresa Scribd logo
1 de 27
IN-MEMORY COMPUTING
ROBERT FRIBERG, DEVREX LABS
HTTP://DEVREXLABS.COM/
@ROBERTFRIBERG
About me
◦ Independent Developer and Trainer
◦ Sql Server DBA since 6.5
◦ C#, javascript, perl, java, +++
◦ Machine learning, AI
◦ Squash fanatic
Agenda
◦ Revisiting Traditional RDBMS
◦ Defining IMDB
◦ A look at a few in-memory products
◦ OrigoDB in depth
◦ Goals
◦ Learn technical stuff
◦ Thinking different
What is a database?
◦ An organized collection of information
◦ Allows reading and writing
◦ Provides authorization and authentication
◦ Provides some level of data safety
Demand drives change
◦ Performance
◦ Data volume
◦ Scalability
◦ Availability
◦ Modeling
• NoSQL
• Big data
• Graph
• Real time analytics
• In-memory computing
• Column stores
One size no longer fits all
B-TREE data structure
B-trees and Transactions
LOG
DATA 64KB blocks w 8x8KB pages
Logical BTREE of 8kb data pages
In the buffer pool (cache)
Buffer
Manager
Transactions append inserted, deleted, original and modified pages to the LOG
• Fill factor
• Page splits
• Clustered index
• Checkpoint
Transactions
A Atomic
C Consistent
I Isolated
D Durable s0 s1 s2t1 t2
What is s?
Isolation Levels
◦ SERIALIZABLE
◦ REPEATABLE_READ
◦ READ_COMMITTED_SNAPSHOT (MVCC) (Row versioning)
◦ READ_COMMITTED
◦ READ_UNCOMMITTED – dont worry, be happy
consistency
performance
“the B-tree is optimized for
systems that read and
write large blocks of data”
- Wikipedia
The Traditional RDBMS Architecture
”.. is obsolete”
-Michael
Stonebraker
Reference: OLTP through the looking glass, Stonebraker et al
OLTP vs. OLAP mismatch
Read load
OLAP Read intensive, touches a lot of
data, benefits from indexes
Write
load
OLTP
Write intensive
- Small writes
- small reads
- hot spots
Indexes hurt
write
performance
In-memory
Disk-based
What is an in-memory database?
◦ PRIMARY representation is in-memory
◦ Memory optimized data structures
◦ ALL the data in memory (possibly distributed)
(in-memory is not necessarily in-process)
Transaction logging
◦ Write Ahead Logging – write to disk before commit
◦ Effect logging – persist the effected datapages
◦ Command logging – persist the cause
IMDB Applications
◦ Real time applications with no durability requirements
◦ Embedded, router, online gaming
◦ Real time applications with durability requirements, low latency, high throughput
◦ Traditional applications during test and development (and production)
◦ Whenever data fits in RAM or can be distributed
◦ General OLTP replacement when DB < 2TB
Some In-memory Products
memcached
In-memory computing
SQL Server Hekaton
◦ Memory optimized tree structure
◦ Almost Lock-free Mvcc concurrency control
◦ Command logging
◦ Seamlessly Integrated in the traditional model
◦ Indexing
◦ Joins
◦ Querying
redis
◦ Redis is an open source, BSD licensed, advanced key-value store. It is often referred to
as a data structure server since keys can contain strings, hashes, lists, sets and sorted
sets.
◦ Extremely popular and widespread
(twitter, flicker, github, digg, disqus, Instagram, stackoverflow)
◦ Written in C, great performance
Comparison Matrix
Product License Datamodel Interface ACID Distributed Concurrency
Control
VoltDB OSS Relational Java/sql yes Yes (2PC) Serialized
memsql $$ Relational SQL Almost Yes Mvcc
aerospike $$ Key/value many yes Yes(2PC) CAS
SQL Server $$ Relational + T-SQL Yes (no) No Locking,
mvcc
NuoDB $$ Relational SQL Yes
Hazelcast OSS Key/value+ java Almost Yes (2PC)
Gridgain OSS Key/value Java,sql Yes Yes (2PC) mvcc
Origodb OSS + User defined NET/REST Yes No
Master/slave
Serialized +
Redis OSS Key/value + Many/LUA Yes No
Master/Slave
Serialized
OrigoDB
◦ Is it a database? (first name was Livedomain)
◦ Database Toolkit - Define your own datamodel
◦ Write ahead command logging + snapshots
◦ Single writer + multiple reader concurrency (serialized)
◦ Open source embedded engine
◦ 100% ACID
◦ Commercial server with master/slave replication
Design goals
◦Simplicity and correctness before performance
◦Flexibility
◦Rapid development
Modular Kernels
◦ Optimistic Kernel – write ahead logging assumes command will succeed
◦ Royal Food Taster – 2 identical in-memory models
◦ Immutability Kernel – Lock free, writers don’t block readers
◦ Requires immutable model
Evolution of OrigoDB
◦ File formats for document-oriented Desktop applications (java 1996)
◦ Cache invalidation experiments
◦ In-memory search indexes (offline built snapshots -> live updates)
◦ Inception: lambda expressions (NET 2008)
Cousins of OrigoDB
◦ Prevayler (java)
◦ Bamboo (net)
◦ Madeleine (ruby)
◦ Perlvayler
◦ Twisted-python
◦ Lmax
◦ Java-chronicle
Bring your own data model
◦ Generic models = Extra schema + mapping is complex so why?
◦ Relational
◦ Key/Value (value is a blob)
◦ Document (document is structured and queryable)
◦ Graph, nodes and edges
◦ Domain specific models
◦ OO Domain model (DDD) (typed graph)
◦ Javascript V8 environment (persisted node.js)
◦ Machine learning models (Accord.NET)
◦ Lucene.NET indexes
Demo time!
◦ TODO example – Anemic model, transaction script pattern (fat commands)
◦ Twitter clone – rich model with proxy, no commands
◦ Geekstream http://geekstream.devrexlabs.com/
◦ OrigoDB Server http://origodb.com/
Last words
◦ Times are changing! Embrace!
◦ One size does not fit all – go polyglot persistence!
◦ Choose the most appropriate data model
◦ If data fits in RAM go in-memory!
Thank you!
robert@devrexlabs.com
@robertfriberg

Mais conteúdo relacionado

Mais procurados

Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Improve PostgreSQL replication with Oracle GoldenGate
Improve PostgreSQL replication with Oracle GoldenGateImprove PostgreSQL replication with Oracle GoldenGate
Improve PostgreSQL replication with Oracle GoldenGateBobby Curtis
 
Columnar Databases (1).pptx
Columnar Databases (1).pptxColumnar Databases (1).pptx
Columnar Databases (1).pptxssuser55cbdb
 
Real-Time Streaming: Intro to Amazon Kinesis
Real-Time Streaming: Intro to Amazon KinesisReal-Time Streaming: Intro to Amazon Kinesis
Real-Time Streaming: Intro to Amazon KinesisAmazon Web Services
 
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 DataWorks Summit
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake OverviewJames Serra
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databasesJames Serra
 
SeaweedFS introduction
SeaweedFS introductionSeaweedFS introduction
SeaweedFS introductionchrislusf
 
Reduce Amazon RDS Costs up to 50% with Proxies
Reduce Amazon RDS Costs up to 50% with ProxiesReduce Amazon RDS Costs up to 50% with Proxies
Reduce Amazon RDS Costs up to 50% with ProxiesJohn Varghese
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftAmazon Web Services
 
Delta Lake Cheat Sheet.pdf
Delta Lake Cheat Sheet.pdfDelta Lake Cheat Sheet.pdf
Delta Lake Cheat Sheet.pdfkaransharma62792
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekVenkata Naga Ravi
 
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Amazon Web Services
 
Predicting Flights with Azure Databricks
Predicting Flights with Azure DatabricksPredicting Flights with Azure Databricks
Predicting Flights with Azure DatabricksSarah Dutkiewicz
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In DepthFabio Fumarola
 
Building modern data lakes
Building modern data lakes Building modern data lakes
Building modern data lakes Minio
 
Memory management in oracle
Memory management in oracleMemory management in oracle
Memory management in oracleDavin Abraham
 

Mais procurados (20)

Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Improve PostgreSQL replication with Oracle GoldenGate
Improve PostgreSQL replication with Oracle GoldenGateImprove PostgreSQL replication with Oracle GoldenGate
Improve PostgreSQL replication with Oracle GoldenGate
 
Columnar Databases (1).pptx
Columnar Databases (1).pptxColumnar Databases (1).pptx
Columnar Databases (1).pptx
 
Real-Time Streaming: Intro to Amazon Kinesis
Real-Time Streaming: Intro to Amazon KinesisReal-Time Streaming: Intro to Amazon Kinesis
Real-Time Streaming: Intro to Amazon Kinesis
 
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
 
SeaweedFS introduction
SeaweedFS introductionSeaweedFS introduction
SeaweedFS introduction
 
Reduce Amazon RDS Costs up to 50% with Proxies
Reduce Amazon RDS Costs up to 50% with ProxiesReduce Amazon RDS Costs up to 50% with Proxies
Reduce Amazon RDS Costs up to 50% with Proxies
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Delta Lake Cheat Sheet.pdf
Delta Lake Cheat Sheet.pdfDelta Lake Cheat Sheet.pdf
Delta Lake Cheat Sheet.pdf
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeek
 
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
 
Predicting Flights with Azure Databricks
Predicting Flights with Azure DatabricksPredicting Flights with Azure Databricks
Predicting Flights with Azure Databricks
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth
 
Denormalization
DenormalizationDenormalization
Denormalization
 
Building modern data lakes
Building modern data lakes Building modern data lakes
Building modern data lakes
 
Big data architecture
Big data architectureBig data architecture
Big data architecture
 
Memory management in oracle
Memory management in oracleMemory management in oracle
Memory management in oracle
 

Semelhante a In-memory Databases

VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right VMworld
 
OrigoDB - take the red pill
OrigoDB - take the red pillOrigoDB - take the red pill
OrigoDB - take the red pillRobert Friberg
 
High-Performance Storage Services with HailDB and Java
High-Performance Storage Services with HailDB and JavaHigh-Performance Storage Services with HailDB and Java
High-Performance Storage Services with HailDB and Javasunnygleason
 
Key-value databases in practice Redis @ DotNetToscana
Key-value databases in practice Redis @ DotNetToscanaKey-value databases in practice Redis @ DotNetToscana
Key-value databases in practice Redis @ DotNetToscanaMatteo Baglini
 
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...Data Con LA
 
"Clouds on the Horizon Get Ready for Drizzle" by David Axmark @ eLiberatica 2009
"Clouds on the Horizon Get Ready for Drizzle" by David Axmark @ eLiberatica 2009"Clouds on the Horizon Get Ready for Drizzle" by David Axmark @ eLiberatica 2009
"Clouds on the Horizon Get Ready for Drizzle" by David Axmark @ eLiberatica 2009eLiberatica
 
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. Nielsen
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. NielsenJ1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. Nielsen
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. NielsenMS Cloud Summit
 
001 hbase introduction
001 hbase introduction001 hbase introduction
001 hbase introductionScott Miao
 
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016Mladen Kovacevic
 
Citrix Synergy 2014: Going the CloudPlatform Way
Citrix Synergy 2014: Going the CloudPlatform WayCitrix Synergy 2014: Going the CloudPlatform Way
Citrix Synergy 2014: Going the CloudPlatform WayIliyas Shirol
 
Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...
Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...
Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...Citrix
 
NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]Huy Do
 
Summer 2017 undergraduate research powerpoint
Summer 2017 undergraduate research powerpointSummer 2017 undergraduate research powerpoint
Summer 2017 undergraduate research powerpointChristopher Dubois
 
dba_lounge_Iasi: Everybody likes redis
dba_lounge_Iasi: Everybody likes redisdba_lounge_Iasi: Everybody likes redis
dba_lounge_Iasi: Everybody likes redisLiviu Costea
 
Exploiting NoSQL Like Never Before
Exploiting NoSQL Like Never BeforeExploiting NoSQL Like Never Before
Exploiting NoSQL Like Never BeforeFrancis Alexander
 
Cassandra Summit 2014: Deploying Cassandra for Call of Duty
Cassandra Summit 2014: Deploying Cassandra for Call of DutyCassandra Summit 2014: Deploying Cassandra for Call of Duty
Cassandra Summit 2014: Deploying Cassandra for Call of DutyDataStax Academy
 
Why we love ArangoDB. The hunt for the right NosQL Database
Why we love ArangoDB. The hunt for the right NosQL DatabaseWhy we love ArangoDB. The hunt for the right NosQL Database
Why we love ArangoDB. The hunt for the right NosQL DatabaseAndreas Jung
 

Semelhante a In-memory Databases (20)

VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right VMworld 2013: Virtualizing Databases: Doing IT Right
VMworld 2013: Virtualizing Databases: Doing IT Right
 
OrigoDB - take the red pill
OrigoDB - take the red pillOrigoDB - take the red pill
OrigoDB - take the red pill
 
High-Performance Storage Services with HailDB and Java
High-Performance Storage Services with HailDB and JavaHigh-Performance Storage Services with HailDB and Java
High-Performance Storage Services with HailDB and Java
 
Key-value databases in practice Redis @ DotNetToscana
Key-value databases in practice Redis @ DotNetToscanaKey-value databases in practice Redis @ DotNetToscana
Key-value databases in practice Redis @ DotNetToscana
 
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
 
"Clouds on the Horizon Get Ready for Drizzle" by David Axmark @ eLiberatica 2009
"Clouds on the Horizon Get Ready for Drizzle" by David Axmark @ eLiberatica 2009"Clouds on the Horizon Get Ready for Drizzle" by David Axmark @ eLiberatica 2009
"Clouds on the Horizon Get Ready for Drizzle" by David Axmark @ eLiberatica 2009
 
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. Nielsen
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. NielsenJ1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. Nielsen
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. Nielsen
 
Drupal performance
Drupal performanceDrupal performance
Drupal performance
 
001 hbase introduction
001 hbase introduction001 hbase introduction
001 hbase introduction
 
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
 
Citrix Synergy 2014: Going the CloudPlatform Way
Citrix Synergy 2014: Going the CloudPlatform WayCitrix Synergy 2014: Going the CloudPlatform Way
Citrix Synergy 2014: Going the CloudPlatform Way
 
Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...
Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...
Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...
 
NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]NoSQL for great good [hanoi.rb talk]
NoSQL for great good [hanoi.rb talk]
 
Summer 2017 undergraduate research powerpoint
Summer 2017 undergraduate research powerpointSummer 2017 undergraduate research powerpoint
Summer 2017 undergraduate research powerpoint
 
dba_lounge_Iasi: Everybody likes redis
dba_lounge_Iasi: Everybody likes redisdba_lounge_Iasi: Everybody likes redis
dba_lounge_Iasi: Everybody likes redis
 
What's New in Apache Hive
What's New in Apache HiveWhat's New in Apache Hive
What's New in Apache Hive
 
Cosmos db
Cosmos dbCosmos db
Cosmos db
 
Exploiting NoSQL Like Never Before
Exploiting NoSQL Like Never BeforeExploiting NoSQL Like Never Before
Exploiting NoSQL Like Never Before
 
Cassandra Summit 2014: Deploying Cassandra for Call of Duty
Cassandra Summit 2014: Deploying Cassandra for Call of DutyCassandra Summit 2014: Deploying Cassandra for Call of Duty
Cassandra Summit 2014: Deploying Cassandra for Call of Duty
 
Why we love ArangoDB. The hunt for the right NosQL Database
Why we love ArangoDB. The hunt for the right NosQL DatabaseWhy we love ArangoDB. The hunt for the right NosQL Database
Why we love ArangoDB. The hunt for the right NosQL Database
 

Último

Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 

Último (20)

Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 

In-memory Databases

  • 1. IN-MEMORY COMPUTING ROBERT FRIBERG, DEVREX LABS HTTP://DEVREXLABS.COM/ @ROBERTFRIBERG
  • 2. About me ◦ Independent Developer and Trainer ◦ Sql Server DBA since 6.5 ◦ C#, javascript, perl, java, +++ ◦ Machine learning, AI ◦ Squash fanatic
  • 3. Agenda ◦ Revisiting Traditional RDBMS ◦ Defining IMDB ◦ A look at a few in-memory products ◦ OrigoDB in depth ◦ Goals ◦ Learn technical stuff ◦ Thinking different
  • 4. What is a database? ◦ An organized collection of information ◦ Allows reading and writing ◦ Provides authorization and authentication ◦ Provides some level of data safety
  • 5. Demand drives change ◦ Performance ◦ Data volume ◦ Scalability ◦ Availability ◦ Modeling • NoSQL • Big data • Graph • Real time analytics • In-memory computing • Column stores One size no longer fits all
  • 7. B-trees and Transactions LOG DATA 64KB blocks w 8x8KB pages Logical BTREE of 8kb data pages In the buffer pool (cache) Buffer Manager Transactions append inserted, deleted, original and modified pages to the LOG • Fill factor • Page splits • Clustered index • Checkpoint
  • 8. Transactions A Atomic C Consistent I Isolated D Durable s0 s1 s2t1 t2 What is s?
  • 9. Isolation Levels ◦ SERIALIZABLE ◦ REPEATABLE_READ ◦ READ_COMMITTED_SNAPSHOT (MVCC) (Row versioning) ◦ READ_COMMITTED ◦ READ_UNCOMMITTED – dont worry, be happy consistency performance
  • 10. “the B-tree is optimized for systems that read and write large blocks of data” - Wikipedia
  • 11. The Traditional RDBMS Architecture ”.. is obsolete” -Michael Stonebraker Reference: OLTP through the looking glass, Stonebraker et al
  • 12. OLTP vs. OLAP mismatch Read load OLAP Read intensive, touches a lot of data, benefits from indexes Write load OLTP Write intensive - Small writes - small reads - hot spots Indexes hurt write performance In-memory Disk-based
  • 13. What is an in-memory database? ◦ PRIMARY representation is in-memory ◦ Memory optimized data structures ◦ ALL the data in memory (possibly distributed) (in-memory is not necessarily in-process)
  • 14. Transaction logging ◦ Write Ahead Logging – write to disk before commit ◦ Effect logging – persist the effected datapages ◦ Command logging – persist the cause
  • 15. IMDB Applications ◦ Real time applications with no durability requirements ◦ Embedded, router, online gaming ◦ Real time applications with durability requirements, low latency, high throughput ◦ Traditional applications during test and development (and production) ◦ Whenever data fits in RAM or can be distributed ◦ General OLTP replacement when DB < 2TB
  • 17. SQL Server Hekaton ◦ Memory optimized tree structure ◦ Almost Lock-free Mvcc concurrency control ◦ Command logging ◦ Seamlessly Integrated in the traditional model ◦ Indexing ◦ Joins ◦ Querying
  • 18. redis ◦ Redis is an open source, BSD licensed, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets. ◦ Extremely popular and widespread (twitter, flicker, github, digg, disqus, Instagram, stackoverflow) ◦ Written in C, great performance
  • 19. Comparison Matrix Product License Datamodel Interface ACID Distributed Concurrency Control VoltDB OSS Relational Java/sql yes Yes (2PC) Serialized memsql $$ Relational SQL Almost Yes Mvcc aerospike $$ Key/value many yes Yes(2PC) CAS SQL Server $$ Relational + T-SQL Yes (no) No Locking, mvcc NuoDB $$ Relational SQL Yes Hazelcast OSS Key/value+ java Almost Yes (2PC) Gridgain OSS Key/value Java,sql Yes Yes (2PC) mvcc Origodb OSS + User defined NET/REST Yes No Master/slave Serialized + Redis OSS Key/value + Many/LUA Yes No Master/Slave Serialized
  • 20. OrigoDB ◦ Is it a database? (first name was Livedomain) ◦ Database Toolkit - Define your own datamodel ◦ Write ahead command logging + snapshots ◦ Single writer + multiple reader concurrency (serialized) ◦ Open source embedded engine ◦ 100% ACID ◦ Commercial server with master/slave replication
  • 21. Design goals ◦Simplicity and correctness before performance ◦Flexibility ◦Rapid development
  • 22. Modular Kernels ◦ Optimistic Kernel – write ahead logging assumes command will succeed ◦ Royal Food Taster – 2 identical in-memory models ◦ Immutability Kernel – Lock free, writers don’t block readers ◦ Requires immutable model
  • 23. Evolution of OrigoDB ◦ File formats for document-oriented Desktop applications (java 1996) ◦ Cache invalidation experiments ◦ In-memory search indexes (offline built snapshots -> live updates) ◦ Inception: lambda expressions (NET 2008)
  • 24. Cousins of OrigoDB ◦ Prevayler (java) ◦ Bamboo (net) ◦ Madeleine (ruby) ◦ Perlvayler ◦ Twisted-python ◦ Lmax ◦ Java-chronicle
  • 25. Bring your own data model ◦ Generic models = Extra schema + mapping is complex so why? ◦ Relational ◦ Key/Value (value is a blob) ◦ Document (document is structured and queryable) ◦ Graph, nodes and edges ◦ Domain specific models ◦ OO Domain model (DDD) (typed graph) ◦ Javascript V8 environment (persisted node.js) ◦ Machine learning models (Accord.NET) ◦ Lucene.NET indexes
  • 26. Demo time! ◦ TODO example – Anemic model, transaction script pattern (fat commands) ◦ Twitter clone – rich model with proxy, no commands ◦ Geekstream http://geekstream.devrexlabs.com/ ◦ OrigoDB Server http://origodb.com/
  • 27. Last words ◦ Times are changing! Embrace! ◦ One size does not fit all – go polyglot persistence! ◦ Choose the most appropriate data model ◦ If data fits in RAM go in-memory! Thank you! robert@devrexlabs.com @robertfriberg

Notas do Editor

  1. Explain the basic operations like insert, seek and scan
  2. Explain the basics quickly.Talk about the boundaries of s.Ask: Is an RDBMS ACID? Answer on next slide.
  3. Consistency and isolation are not binary.
  4. Reporting.In-memory pushes the boundaries
  5. Explain each of the bullets relating to previous topics.Recall slide ”What is a database”?
  6. Great performance comes for free but could be optimized.
  7. Some other frameworks based on or supporting write-ahead command logging and snapshots with a user defined in-memory model.
  8. Defining a custom data model is what makes OrigoDB unique.