SQLite is a widely used embedded database engine, known for its simplicity and lightweight design. However, the original SQLite project does not accept contributions from third parties and does not use third-party code, which can limit its potential for innovation. This talk is an overview of SQLite architecture and an introduction to libSQL: Chiselstrike's fork of SQLite.
Piotr Sarna will show how this fork can be used in distributed settings, with automatic backups and the ability to replicate data across multiple nodes. Chiselstrike's modifications also include integration with WebAssembly, which allows users to define custom functions and procedures using Wasm, a compact and portable binary format.
You'll learn the reasons behind this fork of SQLite, and the challenges and trade-offs involved in extending the database with these new features. Piotr also presents Chiselstrike's plans for future work. This talk will be relevant to database researchers and practitioners interested in leveraging SQLite for applications that require custom functions and/or distributed support.
2. Piotr Sarna
■ Graduated from University of Warsaw
with MSc in Computer Science
■ Used to develop a distributed file system
■ Wrote a few patches for the Linux kernel
■ Ex-maintainer of ScyllaDB
and the ScyllaDB Rust Driver project
■ Maintainer of libSQL
3. libSQL
A fork of SQLite, focused on:
1. open contribution
2. modernizing SQLite for the purpose of edge computing and
distributed systems
3. Rust language preferred for new features
https://libsql.org
https://github.com/libsql/libsql
7. B-Tree
Quoting the docs: “Knuth, The Art Of Computer Programming,
Volume 3 "Sorting and Searching", pages 471-479”
Two implementations internally:
● Table B-Trees, indexed by i64 and storing arbitrary data in
leaves only
● Index B-Trees, indexed by arbitrary data stored in all nodes
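The two B-Tree flavors can be observed from SQL. The sketch below uses Python's standard sqlite3 module; the table and index names are made up for illustration. It shows that a table row is keyed by an integer rowid and that the table and its index each get their own root page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("CREATE INDEX users_by_email ON users (email)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

# Every ordinary table row is keyed by a 64-bit integer rowid (table B-Tree).
rowid = conn.execute("SELECT rowid FROM users").fetchone()[0]

# sqlite_master lists one root page per B-Tree:
# one for the table and a separate one for the index.
roots = dict(conn.execute(
    "SELECT name, rootpage FROM sqlite_master WHERE name LIKE 'users%'"))
print(rowid, roots)
```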
8. Pager
Pager is an interface for accessing pages of data.
Pages can be 512 to 65536 bytes long (always a power of two).
B-Tree is implemented entirely on top of the pager.
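A small sketch of how the page size shows up in practice, assuming a fresh database in a temporary directory (the file name is arbitrary). The page size has to be chosen before the first page is written, and the file ends up as a whole number of pages.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "pager_demo.db")
conn = sqlite3.connect(path)

# The page size must be set before the first page of a new database is written.
conn.execute("PRAGMA page_size = 8192")
conn.execute("CREATE TABLE t (x)")
conn.commit()

page_size = conn.execute("PRAGMA page_size").fetchone()[0]
page_count = conn.execute("PRAGMA page_count").fetchone()[0]
conn.close()

# The database file is a sequence of fixed-size pages.
file_size = os.path.getsize(path)
print(page_size, page_count, file_size)
```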
9. VFS
Virtual File System
Serves a similar purpose to the VFS in the Linux kernel: it provides a common
interface for the underlying implementations.
Initially created to support multiple operating systems.
It turned out to be more generic, and allows various implementations:
● encrypted I/O
● compressed I/O
● wrappers producing extra logs and metrics
● etc.
10. VDBE
Virtual machine responsible for running SQL
commands.
Each statement gets prepared, i.e. compiled to VM
instructions.
The statement can then be executed with
sqlite3_step and other functions.
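The compiled program can be inspected with EXPLAIN. This is a minimal sketch using Python's sqlite3 module, which prepares and steps statements on our behalf; the table is made up, and the exact opcodes printed vary between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")

# EXPLAIN returns one row per VM instruction:
# (addr, opcode, p1, p2, p3, p4, p5, comment)
program = conn.execute("EXPLAIN SELECT x FROM t WHERE x > 10").fetchall()
opcodes = [row[1] for row in program]

for addr, opcode, *operands in program:
    print(addr, opcode, operands)
```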
12. Transactions
Atomic execution of statements.
Flavors of transactions in SQLite:
● read/write
● DEFERRED/IMMEDIATE/EXCLUSIVE
Implicitly or explicitly, each statement is executed
within a transaction.
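A sketch of implicit versus explicit transactions, using Python's sqlite3 module with its own implicit BEGIN disabled so that the SQL below is exactly what reaches SQLite. The accounts table is an illustrative assumption.

```python
import sqlite3

# isolation_level=None turns off the module's implicit BEGIN,
# so transaction boundaries are exactly the statements we send.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")

# Implicit transaction: a single statement commits on its own.
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 0)")

# Explicit transaction: IMMEDIATE takes the write lock up front.
conn.execute("BEGIN IMMEDIATE")
conn.execute("UPDATE accounts SET balance = balance - 70 WHERE id = 1")
conn.execute("UPDATE accounts SET balance = balance + 70 WHERE id = 2")
conn.execute("ROLLBACK")  # both updates are undone together
after_rollback = conn.execute(
    "SELECT balance FROM accounts ORDER BY id").fetchall()

# DEFERRED (the default) acquires locks lazily, on first access.
conn.execute("BEGIN DEFERRED")
conn.execute("UPDATE accounts SET balance = balance - 70 WHERE id = 1")
conn.execute("UPDATE accounts SET balance = balance + 70 WHERE id = 2")
conn.execute("COMMIT")
after_commit = conn.execute(
    "SELECT balance FROM accounts ORDER BY id").fetchall()
print(after_rollback, after_commit)
```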
14. Journaling Modes
Rollback journal
● Pages are copied to rollback journal
before writing
● On rollback, pages are copied from
the rollback journal and applied
● Reads are performed from the main
database file
● Works on shared, distributed file
systems
Write-ahead log (WAL)
● Pages are appended to WAL on writes
● On rollback, pages are discarded from
WAL
● A “checkpoint” operation moves pages
back to the original database file
● Reads are performed both from the main
database file and WAL
● A WAL index is needed for efficient
access to newest pages
● Works within a single machine, because
WAL index lives in shared memory
15. Journaling Modes
Rollback journal
● Rollback journal files can be
deleted, truncated or left intact
once not needed
● 1 writer or N readers can access
the database concurrently
Write-ahead log (WAL)
● WAL file can grow indefinitely, unless a
“checkpoint” operation moves its pages
back to the database file
● “autocheckpoint” option usually takes
care of automatic cleanup
● WAL2 format uses 2 files in order
to prevent unbounded growth of WAL
files*
● 1** writer and N readers can access
the database
*WAL2 mode is not merged upstream yet
**N writers are supported as well, conditions apply, not generally available
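The WAL behavior above can be tried out with a few PRAGMAs. This is a sketch against stock SQLite through Python's sqlite3 module, assuming a local temporary directory; the file name is arbitrary. It switches to WAL mode, writes some rows, and then checkpoints with TRUNCATE so the WAL file shrinks back to zero bytes.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "wal_demo.db")
conn = sqlite3.connect(path, isolation_level=None)

# Switch from the default rollback journal to write-ahead logging.
mode = conn.execute("PRAGMA journal_mode = WAL").fetchall()[0][0]

conn.execute("CREATE TABLE t (x)")
conn.execute("BEGIN")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
conn.execute("COMMIT")

# Writes were appended to the WAL file next to the database.
wal_path = path + "-wal"
wal_size_before = os.path.getsize(wal_path)

# A checkpoint copies WAL pages back into the main database file.
# TRUNCATE additionally shrinks the WAL file to zero bytes.
busy, wal_frames, checkpointed = conn.execute(
    "PRAGMA wal_checkpoint(TRUNCATE)").fetchall()[0]
wal_size_after = os.path.getsize(wal_path)

# Automatic checkpoints run once the WAL reaches this many pages.
autocheckpoint = conn.execute("PRAGMA wal_autocheckpoint").fetchone()[0]
print(mode, wal_size_before, wal_size_after, autocheckpoint)
```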
16. Concurrency
Rollback journal concurrency:
● either 1 writer or N readers
WAL concurrency:
● 1 writer and N readers
BEGIN CONCURRENT:
● N writers and N readers
as long as writers do not overlap,
available only on a separate code branch
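The "1 writer and N readers" case in WAL mode can be sketched with two connections to the same file. This uses Python's sqlite3 module and a temporary database; names are illustrative. The reader is not blocked by the open write transaction and sees the last committed snapshot.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "concurrency_demo.db")

writer = sqlite3.connect(path, isolation_level=None)
writer.execute("PRAGMA journal_mode = WAL").fetchall()
writer.execute("CREATE TABLE t (x)")

reader = sqlite3.connect(path, isolation_level=None)

# The writer opens a write transaction and leaves it uncommitted.
writer.execute("BEGIN IMMEDIATE")
writer.execute("INSERT INTO t VALUES (1)")

# The reader proceeds without waiting and sees only committed data.
seen_during_write = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

writer.execute("COMMIT")
seen_after_commit = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(seen_during_write, seen_after_commit)
```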
17. SQLITE_BUSY
When a transaction cannot continue because another connection holds a lock,
the SQLITE_BUSY error code is returned, and the caller is expected to retry or give up.
A busy handler can also be registered on the connection.
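A sketch of hitting SQLITE_BUSY from Python's sqlite3 module, where it surfaces as an OperationalError. Passing timeout=0 is an assumption made here to get the error immediately; a positive timeout makes the module wait and retry for that long, which is the simplest form of busy handling.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "busy_demo.db")

# timeout=0: do not wait for locks, report SQLITE_BUSY right away.
first = sqlite3.connect(path, timeout=0, isolation_level=None)
second = sqlite3.connect(path, timeout=0, isolation_level=None)
first.execute("CREATE TABLE t (x)")

first.execute("BEGIN IMMEDIATE")  # first connection takes the write lock

try:
    second.execute("BEGIN IMMEDIATE")
    got_busy = False
except sqlite3.OperationalError as exc:
    # SQLITE_BUSY is reported as "database is locked".
    got_busy = True
    print("busy:", exc)

first.execute("COMMIT")  # lock released

# Retrying once the lock is gone succeeds.
second.execute("BEGIN IMMEDIATE")
second.execute("INSERT INTO t VALUES (1)")
second.execute("COMMIT")
rows = second.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```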
18. Distributed SQLite?
● rqlite
○ https://github.com/rqlite/rqlite
○ SQLite distributed with Raft, on a statement string level
● dqlite
○ https://github.com/canonical/dqlite
○ SQLite distributed with Raft, on WAL level
● Litestream
○ https://github.com/benbjohnson/litestream
○ SQLite with WAL files replicated to S3
● Verneuil
○ https://github.com/backtrace-labs/verneuil/
○ VFS for SQLite, using S3-compatible storage as backend
● mvSQLite
○ https://github.com/losfair/mvsqlite
○ SQLite on top of FoundationDB with multi-version concurrency control, implementation based on VFS
● sqld
○ https://github.com/libsql/sqld
○ Server-side SQLite with multiple pluggable backends
20. Virtual WAL
Write-ahead log journaling brings superior write concurrency, but
contrary to VFS, it does not have a virtual interface in SQLite.
In libSQL, it does. That allows:
● committing database pages to a remote distributed system
● reading pages from a distributed system without keeping
them locally
● implementing lightweight read replicas
● implementing efficient disaster recovery mechanisms
21. Write Path
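Participants: B-Tree, Pager, WAL
B-Tree → Pager: write pages n,m,o,p
Pager → WAL: begin write transaction
WAL → Pager: yessir
Pager → WAL: write pages n,m,o,p
WAL → Pager: yessir
Pager → WAL: end write transaction
WAL → Pager: yessir
Pager → B-Tree: k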
22. Read Path: Page Exists in WAL
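Participants: B-Tree, Pager, WAL
B-Tree → Pager: read page n
Pager → WAL: begin read transaction
WAL → Pager: yessir
Pager → WAL: is page n in WAL?
WAL → Pager: here’s newest committed frame: f
Pager → WAL: read frame f
WAL → Pager: here it is
Pager → B-Tree: here’s page n
Pager → WAL: end read transaction
WAL → Pager: yessir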
23. Read Path: Page not in WAL
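Participants: B-Tree, Pager, WAL
B-Tree → Pager: read page n
Pager → WAL: begin read transaction
WAL → Pager: yessir
Pager → WAL: is page n in WAL?
WAL → Pager: no sir
Pager: *reads from the main database file*
Pager → WAL: end read transaction
WAL → Pager: yessir
Pager → B-Tree: here’s page n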
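The write and read paths can be modeled in a few lines. The classes below are a hypothetical toy in Python, not the libSQL virtual WAL interface: ToyWal and ToyPager are invented names, pages live in plain dictionaries, and locking and transaction begin/end calls are left out. The point is only the lookup order: newest committed WAL frame first, main database file otherwise.

```python
class ToyWal:
    """Hypothetical in-memory stand-in for a WAL (not the libSQL API)."""

    def __init__(self):
        self.frames = []    # append-only list of (page_no, data)
        self.committed = 0  # number of frames visible to readers
        self.index = {}     # page_no -> newest committed frame number

    def append(self, pages):
        for page_no, data in pages.items():
            self.frames.append((page_no, data))

    def commit(self):
        # Publish the new frames to readers through the WAL index.
        for frame_no in range(self.committed, len(self.frames)):
            self.index[self.frames[frame_no][0]] = frame_no
        self.committed = len(self.frames)

    def rollback(self):
        # Uncommitted frames are simply discarded.
        del self.frames[self.committed:]

    def find_frame(self, page_no):
        return self.index.get(page_no)

    def read_frame(self, frame_no):
        return self.frames[frame_no][1]


class ToyPager:
    def __init__(self, db_file):
        self.db_file = db_file  # page_no -> data, models the main database file
        self.wal = ToyWal()

    def write_pages(self, pages):
        self.wal.append(pages)
        self.wal.commit()

    def read_page(self, page_no):
        frame_no = self.wal.find_frame(page_no)
        if frame_no is not None:
            return self.wal.read_frame(frame_no)  # page exists in WAL
        return self.db_file[page_no]              # fall back to the main file

    def checkpoint(self):
        # Move the newest version of each page back to the main file.
        for page_no, frame_no in self.wal.index.items():
            self.db_file[page_no] = self.wal.read_frame(frame_no)
        self.wal = ToyWal()


pager = ToyPager({1: b"old-1", 2: b"old-2"})
pager.write_pages({1: b"new-1"})
from_wal = pager.read_page(1)
from_db = pager.read_page(2)
pager.checkpoint()
after_checkpoint = pager.read_page(1)
```

Swapping ToyWal for an object that ships frames to remote storage is the kind of substitution a virtual WAL interface is meant to allow.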
24. Virtual WAL: Example Use Case
A virtual WAL implementation which continuously backs up
the main database file and its WAL to remote storage:
https://github.com/libsql/bottomless
https://github.com/libsql/sqld/