Bitsy Graph Database

Sridhar Ramachandran
Founder, LambdaZen LLC
What is Bitsy?
● A small, fast, embeddable,
durable, in-memory graph
database.
● Maintains an on-disk copy of
the graph database.
● Designed for multi-threaded
OLTP applications.
● Provides ACID guarantees
and optimistic concurrency
control for transactions.
● Compatible with
Tinkerpop/Blueprints -- the
graph database standard.

Tinkerpop software stack
From https://github.com/tinkerpop/blueprints/wiki
In-memory and durable?
● Bitsy maintains a copy of the entire graph in memory
data-structures.
● Bitsy saves all changes made to the database, to the
disk, during a commit operation.
● Commits from different threads are forced to the disk at
once, thereby improving the write performance in a
multithreaded OLTP environment.
● The database is loaded from files during startup.
● All database files are append-only text files with JSON-encoded vertices and edges.
● The database files are periodically compacted by a
background thread.
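
The load-from-files and compaction steps above can be sketched as follows. This is a minimal illustration, not Bitsy's actual file format: the record fields (`id`, `version`, `deleted`), the file name, and the function names are all assumptions made for the example.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one JSON-encoded record as a single line and force it to disk."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())


def load_graph(path):
    """Rebuild the in-memory vertex map by replaying the log from the start."""
    vertices = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            rec = json.loads(line)
            if rec.get("deleted"):
                vertices.pop(rec["id"], None)
            else:
                vertices[rec["id"]] = rec  # later lines override earlier ones
    return vertices


def compact(path):
    """Rewrite the log so it holds only the latest version of each live vertex."""
    live = load_graph(path)
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        for rec in live.values():
            f.write(json.dumps(rec) + "\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # swap the compacted file in place of the old log


def count_lines(path):
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def demo():
    with tempfile.TemporaryDirectory() as d:
        log = os.path.join(d, "vertices.txt")  # hypothetical file name
        append_record(log, {"id": "v1", "name": "alice", "version": 1})
        append_record(log, {"id": "v2", "name": "bob", "version": 1})
        append_record(log, {"id": "v1", "name": "alice2", "version": 2})
        append_record(log, {"id": "v2", "deleted": True})
        before = count_lines(log)
        compact(log)
        after = count_lines(log)
        return before, after, load_graph(log)
```

Because updates and deletes are appended rather than written in place, the log keeps growing until compaction drops the superseded lines; in this sketch the compaction runs inline rather than on a background thread.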
Design Principle #1: No Seek
● Bitsy appends all changes to an
unordered transaction log, unlike
most databases which persist data in
B-Trees and other ordered
structures.
● Ordered data structures perform
multiple seeks per updated element.
● Seek operations on the hard-disk are
expensive (5-15 ms).
● Bitsy avoids seeks per element, and
addresses rotational latency by
combining commits from concurrent
transactions.

Hard disk head: Seek
operations require a
mechanical movement of
the hard disk head which
takes 5-15ms.
Rotational latency is the
time taken for the
requested sector in the
rotating platter to reach
the head. Takes 2-4ms.
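
A back-of-the-envelope model shows why combining commits matters. The latencies below are midpoints of the ranges quoted on this slide; the number of seeks per update (2) and the batch size (150) are assumptions picked only for illustration, not measurements.

```python
def commits_per_second(latency_ms_per_flush, commits_per_flush=1):
    """Upper bound on commit rate when every flush pays a fixed disk latency."""
    return commits_per_flush * 1000.0 / latency_ms_per_flush


SEEK_MS = 10.0        # midpoint of the 5-15 ms seek time
ROTATION_MS = 3.0     # midpoint of the 2-4 ms rotational latency

# Ordered structure: assume 2 seeks per updated element, one element per commit.
ordered = commits_per_second(2 * SEEK_MS)               # 50.0

# Append-only log, one commit per flush: only rotational latency is paid.
append_single = commits_per_second(ROTATION_MS)         # roughly 333

# Append-only log with 150 concurrent commits sharing one flush (assumed batch).
append_batched = commits_per_second(ROTATION_MS, 150)   # 50000.0
```

Under this simple model the commit rate scales linearly with the number of transactions that share a flush, which is why the benefit grows with the number of concurrent threads.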
Design Principle #2: No Socket
● Typical databases run in a separate
process exposing a socket-based
protocol to applications.
● The cost of serializing and
deserializing the requests and
responses, and calling OS-level
functions, reduces the overall
throughput of the database.
● By avoiding a socket-based protocol
between the application and the
database, Bitsy can achieve sub-microsecond query latencies.

The OSI model requires
serialization and
deserialization as the
packet crosses from one
layer to another
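
The serialization cost can be seen even without opening a socket. The sketch below compares a direct in-process lookup with the same lookup wrapped in a JSON-encoded request and response; the tiny `GRAPH` dictionary and the request shape are assumptions for the example, and the timings depend entirely on the machine it runs on.

```python
import json
import timeit

GRAPH = {"v1": {"name": "alice"}}  # stand-in for an in-memory graph


def query_in_process(vertex_id):
    """Embedded access: a plain function call on in-memory data."""
    return GRAPH[vertex_id]


def query_via_serialization(vertex_id):
    """Client/server style access, minus the network: encode, decode, encode, decode."""
    request = json.dumps({"op": "get", "id": vertex_id}).encode("utf-8")
    parsed = json.loads(request.decode("utf-8"))
    response = json.dumps(GRAPH[parsed["id"]]).encode("utf-8")
    return json.loads(response.decode("utf-8"))


def compare(n=10000):
    """Return (seconds for n direct calls, seconds for n serialized calls)."""
    direct = timeit.timeit(lambda: query_in_process("v1"), number=n)
    serialized = timeit.timeit(lambda: query_via_serialization("v1"), number=n)
    return direct, serialized
```

Both functions return the same vertex; calling `compare()` shows how much of the per-query time goes to encoding and decoding alone, before any OS-level socket calls are added.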
Design Principle #3: No SQL
● Tuning a SQL database is
a non-trivial task.
● The biggest factor in a
SQL query's efficiency is
its execution plan.
● By avoiding SQL and the
execution plans that come
with it, Bitsy ensures that
all queries and updates
are efficient*.

An example execution plan from Oracle's
documentation

* The "allow full-graph scan" option must be disabled to guarantee quick responses.
Concurrency Model
● Bitsy is designed to work in multi-threaded OLTP
environments.
● It implements optimistic concurrency control where
edges and vertices are tied to version numbers that are
incremented on updates.
● A BitsyRetryException is raised during a transaction
commit, if an updated vertex/edge has a different
version at the time of commit, than at the time of query.
● The application should retry the entire transaction in
case of conflict.
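
The version check and retry loop described above can be sketched as follows. This is a generic illustration of optimistic concurrency control, not Bitsy's implementation: `RetryException`, `Store`, and `run_with_retry` are hypothetical names, and the store holds plain values rather than vertices and edges.

```python
import threading


class RetryException(Exception):
    """Hypothetical stand-in for a commit-time version conflict."""


class Store:
    def __init__(self):
        self._data = {}  # key -> (version, value)
        self._lock = threading.Lock()

    def read(self, key):
        """Return (version, value); version 0 means the key does not exist yet."""
        with self._lock:
            return self._data.get(key, (0, None))

    def commit(self, key, read_version, new_value):
        """Apply the update only if nobody changed the key since it was read."""
        with self._lock:
            current_version, _ = self._data.get(key, (0, None))
            if current_version != read_version:
                raise RetryException(key)
            self._data[key] = (current_version + 1, new_value)


def run_with_retry(store, key, update, max_attempts=10000):
    """Re-run the whole read-modify-commit transaction until it commits."""
    for _ in range(max_attempts):
        version, value = store.read(key)
        try:
            store.commit(key, version, update(value))
            return
        except RetryException:
            continue  # another thread won the race; read again and retry
    raise RuntimeError("too many conflicts")


def demo(threads=8, increments=200):
    store = Store()
    store.commit("counter", 0, 0)  # create the key at version 1

    def worker():
        for _ in range(increments):
            run_with_retry(store, "counter", lambda v: v + 1)

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return store.read("counter")
```

No update is lost even though readers never block each other: a transaction that loses the race simply re-reads the current version and tries again, which is what retrying "the entire transaction" means here.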
Write Algorithms
● The write algorithms operate on
three levels of "double buffers".
● The transaction buffers capture
transactions to be committed
simultaneously.
● The commit waits for the buffer to
flush to a transaction file (A/B).
● Transaction files are moved to
vertex and edge files on exceeding
a threshold size (default is 4MB).
● Vertex and edge files are
reorganized after a period of growth
(default is +1x initial size).
● Online backups trigger a
transaction flush, and then copy the
vertex and edge files
representing the DB snapshot.
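
The first level, where a transaction buffer collects commits that then share one flush, can be sketched as below. This is an assumption-laden illustration of the double-buffer idea only: the class and method names are invented, the other two levels (transaction files to vertex/edge files, and reorganization) are not modeled, and Bitsy's real algorithm may differ.

```python
import os
import tempfile
import threading


class GroupCommitLog:
    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8")
        self._cond = threading.Condition()
        self._active = []     # buffer currently accepting commits
        self._enqueued = 0    # commits accepted so far
        self._flushed = 0     # commits known to be on disk
        self._closed = False
        self.flush_count = 0
        self._flusher = threading.Thread(target=self._run, daemon=True)
        self._flusher.start()

    def commit(self, line):
        """Enqueue a line and block until the flush that contains it completes."""
        with self._cond:
            self._active.append(line)
            self._enqueued += 1
            ticket = self._enqueued
            self._cond.notify_all()
            while self._flushed < ticket:
                self._cond.wait()

    def _run(self):
        while True:
            with self._cond:
                while not self._active and not self._closed:
                    self._cond.wait()
                if not self._active:
                    return  # closed and nothing left to flush
                # Swap buffers: writers fill a fresh list while this one is flushed.
                batch, self._active = self._active, []
            self._file.write("".join(entry + "\n" for entry in batch))
            self._file.flush()
            os.fsync(self._file.fileno())  # one forced write for the whole batch
            with self._cond:
                self._flushed += len(batch)
                self.flush_count += 1
                self._cond.notify_all()

    def close(self):
        with self._cond:
            self._closed = True
            self._cond.notify_all()
        self._flusher.join()
        self._file.close()


def demo(threads=16, commits_each=25):
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "txlog.txt")  # hypothetical file name
        log = GroupCommitLog(path)

        def worker(t):
            for i in range(commits_each):
                log.commit(f"tx {t}-{i}")

        workers = [threading.Thread(target=worker, args=(t,)) for t in range(threads)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        log.close()
        with open(path, encoding="utf-8") as f:
            lines = f.read().splitlines()
        return len(lines), log.flush_count
```

`demo()` returns the number of lines written and the number of flushes performed; whenever several threads commit while a flush is in progress, their lines land in the same batch, so the flush count comes out lower than the commit count.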
Write throughput in an OLTP setting
● The plot below shows the throughput of a test application* that repeatedly
commits a small transaction (1 vertex + 1 edge) from multiple threads.
● The throughput exceeds 50K ops/second at 750 concurrent threads.
● The comparison with Neo4J 1.9.2 illustrates the benefit of "No Seek".

* Tests performed on a $600 HP p7-1287c desktop PC with a single 7200 rpm hard disk.
Read throughput in an OLTP setting
● The plot below shows the read throughput of threads repeatedly traversing
separate portions of the graph on a desktop PC*.
● Bitsy implements mostly lock-free read algorithms that can perform close
to 20M ops/second at 1000 threads -- on par with Neo4J’s warm caches.

* Tests performed on a $600 HP p7-1287c desktop PC with 4 cores
Monitoring and Management
● Offline backup and
restore operations are
simple file copy
operations on the
database directory.
● Bitsy exposes a JMX
interface to make online
backups, and adjust
database parameters.
● Bitsy logs messages
using the SLF4J API with
logger names starting
with "com.lambdazen".

Online backup through jconsole
Dependencies
● Blueprints Core
● Jackson JSON Processor
● SLF4J API
● Ness Computing Core Component: For fast UUID
serialization/deserialization
License
● Bitsy is a dual-licensed product.
● The AGPL v3 license can be used for open-source
projects and internally-used closed-source projects.
● The commercial license is an extremely liberal license
that provides rights to modify and use Bitsy in an
unlimited number of instances, products* and services.

Pricing details with a 15% promotional discount (till Feb 2014)

Startups and small businesses (1-10 employees):  $425 annual,   $1699 perpetual
Medium-sized enterprises (10-500 employees):     $849 annual,   $3399 perpetual
Large-sized enterprises (500+ employees):        $1275 annual,  $5099 perpetual

* The products must not encourage the direct use of Bitsy APIs.
Wrap-up
● Bitsy is a small, fast, embeddable, durable, in-memory
graph database, with the following features:
○ ACID guarantees and clean recovery from crashes
○ Query latency in sub-microseconds
○ High transaction throughput in an OLTP setting with multiple
clients/threads accessing the database
○ Well-defined optimistic concurrency model
○ Support for online backups
○ Human-readable database files
○ Small code footprint (~1.5MB with dependencies)
● Bitsy is dual-licensed under AGPL and a liberal
commercial license for unlimited enterprise-wide use.
Questions and Feedback
● The project is hosted at https://bitbucket.org/lambdazen/bitsy
with publicly accessible
○ Documentation and install instructions (in Wiki)
○ Links to downloads
○ Issue management

● Please email your questions and feedback to
bisty@lambdazen.com
