An introduction to Pincaster
1. This is just a kiwi, what were you thinking about?
2. What is Pincaster?
• Not Only SQL (well, not at all)
• Not only a database for geolocalized apps
• Not only a key/value store
• Not only a persistent database
• Not only a non-relational database
• Not only a database.
3. Speaks HTTP / JSON
• Adds a slight overhead over a binary protocol, but makes it dead
simple to use in almost any language and environment.
• Pincaster is fast and lightweight no matter what. And there's
plenty of room for optimization. HTTP keep-alive is supported.
• Written in C. Runs on OS X / Linux / *BSD and doesn't require
any external dependency. Valgrind doesn't cry. BSD license.
• Event-driven model with asynchronous workers. Powered by the
awesome libevent2 library.
• And shamelessly reuses (in a different implementation) some of
the nice concepts from Redis. Why not?
4. Threads
Diagram (components only): HTTP server, domain handler, op queue, reply queue, three workers, and a fork()ed journal rewriter. Annotation: zero copy.
5. Layers
• A layer is like a database, identified by a unique
name.
• A layer contains a set of records.
• Layers are independent and can have different
settings.
• Layers can be created / deleted online.
• Layers can mix different types of records.
6. Void records
• Just unique keys.
• Fast and memory efficient.
• Serialized data can be embedded in keys.
• Useful as flags and in range queries.
7. Hashes
Key
  Property: binary-safe value
  Property: binary-safe value
  Property: binary-safe value
8. Atomic operations
No transaction, but multiple changes can be combined as a
single atomic operation:
• Add new properties
• Update properties
• Delete properties
• Change special properties
• Increment/decrement counters which are automatically
created if needed.
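The idea of combining several changes into one atomic operation can be sketched in a few lines. This is only an illustration of the concept, not Pincaster's C implementation: the `Record` class, its `apply` method and the parameter names are all hypothetical.

```python
import threading


class Record:
    """A hash record: property name -> value. Hypothetical, for illustration."""

    def __init__(self):
        self.props = {}
        self._lock = threading.Lock()

    def apply(self, set_props=None, delete_props=(), increments=None):
        """Apply several changes as a single all-or-nothing step.

        Changes are made on a copy that replaces the record only at the end,
        so a failure (e.g. incrementing a non-numeric value) changes nothing.
        """
        with self._lock:
            new = dict(self.props)
            for name, value in (set_props or {}).items():
                new[name] = value              # add or update a property
            for name in delete_props:
                new.pop(name, None)            # deleting a missing property is a no-op
            for name, delta in (increments or {}).items():
                # Counters are created on first use, starting from 0.
                new[name] = int(new.get(name, 0)) + delta
            self.props = new
            return dict(new)
```

For example, `Record().apply(set_props={"name": "Donald"}, increments={"visits": 1})` sets a property and creates a counter in one step.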
9. Points
Key: latitude, longitude or x, y
Location is set through the special _loc property.
Indexed with quad-trees.
Space efficient, points are grouped into buckets.
Designed for dynamic data like geolocalized applications.
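To make "quad-trees with buckets" concrete, here is a minimal sketch of a point quadtree whose leaves hold small buckets and split when they overflow. It is a generic textbook structure, not Pincaster's code; the class name, `bucket_size` and `max_depth` are assumptions made for the example.

```python
class QuadTree:
    """Point quadtree over the area (x0, y0) - (x1, y1); leaves hold buckets."""

    def __init__(self, x0, y0, x1, y1, bucket_size=4, max_depth=16, depth=0):
        self.bounds = (x0, y0, x1, y1)
        self.bucket_size = bucket_size
        self.max_depth = max_depth    # stops endless splitting on duplicate points
        self.depth = depth
        self.bucket = []              # (x, y, key) tuples while this node is a leaf
        self.children = None          # four sub-quadrants once the node has split

    def insert(self, x, y, key):
        if self.children is not None:
            self._child_for(x, y).insert(x, y, key)
            return
        self.bucket.append((x, y, key))
        if len(self.bucket) > self.bucket_size and self.depth < self.max_depth:
            self._split()

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        args = (self.bucket_size, self.max_depth, self.depth + 1)
        self.children = [
            QuadTree(x0, y0, mx, my, *args), QuadTree(mx, y0, x1, my, *args),
            QuadTree(x0, my, mx, y1, *args), QuadTree(mx, my, x1, y1, *args),
        ]
        pending, self.bucket = self.bucket, []
        for x, y, key in pending:     # redistribute the bucket into the children
            self._child_for(x, y).insert(x, y, key)

    def _child_for(self, x, y):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        return self.children[(1 if x >= mx else 0) + (2 if y >= my else 0)]

    def query_rect(self, qx0, qy0, qx1, qy1):
        """Return the keys of all points inside the query rectangle."""
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or qx0 > x1 or qy1 < y0 or qy0 > y1:
            return []                 # this quadrant cannot contain any match
        if self.children is None:
            return [k for x, y, k in self.bucket
                    if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        hits = []
        for child in self.children:
            hits.extend(child.query_rect(qx0, qy0, qx1, qy1))
        return hits
```

Grouping points into buckets keeps the tree shallow: a node is only subdivided once it holds more than `bucket_size` points, and whole quadrants are skipped during a query.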
10. Layer types
• Flat: rectangular area (x0, y0) - (x1, y1)
• Flatwrap: rectangular area with wrapping.
• Spherical / geoidal - WGS84 - GPS and map
services friendly. Handles corner cases.
• Pick your function for non-Euclidean distance
computation: rhomboid, fast, great circle or
haversine.
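As a rough illustration of the trade-off between distance functions, the sketch below compares the haversine formula, the great-circle distance via the spherical law of cosines, and an equirectangular approximation. The last one is a generic cheap approximation chosen for the example; it is not claimed to be what Pincaster's "fast" or "rhomboid" options compute. The Earth radius constant and function names are assumptions.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (assumed constant)


def haversine(lat1, lon1, lat2, lon2):
    """Haversine distance in meters; numerically stable for short distances."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def great_circle(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters via the spherical law of cosines."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlmb)
    return EARTH_RADIUS_M * math.acos(max(-1.0, min(1.0, c)))  # clamp rounding errors


def equirectangular(lat1, lon1, lat2, lon2):
    """Cheap approximation: treat the area as flat, scaled by the mean latitude."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    x = math.radians(lon2 - lon1) * math.cos((p1 + p2) / 2)
    return EARTH_RADIUS_M * math.hypot(x, p2 - p1)
```

The cheaper the function, the larger the error over long distances, which is presumably why the choice is left to the user.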
11. Simple spatial queries
• Find points within a radius (Euclidean distance for
flat/flatwrap or meters for spherical/geoidal layers).
• Find points within a rectangle. Wraparound is
properly handled. You can directly query according
to the Google Maps viewing area.
• Overflow is reported.
• Clustering is on the way.
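A rectangle query with wraparound can be sketched as follows. This is a linear scan for clarity, not an indexed search, and it assumes that "overflow" means more matches than a requested limit; the function names and the `limit` parameter are hypothetical.

```python
def in_rect(lat, lon, south, west, north, east):
    """True if (lat, lon) lies inside the rectangle.

    west > east means the rectangle wraps around the +/-180 degree meridian.
    """
    if not (south <= lat <= north):
        return False
    if west <= east:
        return west <= lon <= east
    return lon >= west or lon <= east


def find_in_rect(points, south, west, north, east, limit=None):
    """Return (sorted matching keys, overflow flag).

    `points` maps a key to a (latitude, longitude) pair.
    """
    hits = sorted(key for key, (lat, lon) in points.items()
                  if in_rect(lat, lon, south, west, north, east))
    if limit is not None and len(hits) > limit:
        return hits[:limit], True    # more matches than requested
    return hits, False
```

A map viewport that straddles the antimeridian simply has its west edge numerically greater than its east edge, so the viewing area can be passed through unchanged.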
12. Points+hashes
Key
  Location
  Property: binary-safe value
  Property: binary-safe value
Spatial queries can optionally return properties.
13. Expirable records
• Records can automatically expire by setting an
_expires_at property.
• Expiration dates can be changed, removed and
re-added at will.
• Can act like a memcache speaking HTTP/JSON
with a set of properties per record. Also
useful for ephemeral geo data (e.g. when
storing location of online users).
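One common way to implement expiring records is a min-heap of expiration dates that is purged on access. The sketch below shows that technique only; it is not taken from Pincaster's source. It assumes `_expires_at` holds a Unix timestamp, and the `ExpiringStore` class and its methods are hypothetical.

```python
import heapq
import time


class ExpiringStore:
    """In-memory records that vanish once their _expires_at date has passed."""

    def __init__(self, clock=time.time):
        self.records = {}
        self._heap = []        # (expires_at, key), earliest expiration first
        self._clock = clock    # injectable so the behavior can be tested

    def set(self, key, props):
        self.records[key] = dict(props)
        expires_at = props.get("_expires_at")
        if expires_at is not None:
            heapq.heappush(self._heap, (expires_at, key))

    def purge(self):
        now = self._clock()
        while self._heap and self._heap[0][0] <= now:
            expires_at, key = heapq.heappop(self._heap)
            record = self.records.get(key)
            # Ignore stale heap entries: the date may have been changed or removed.
            if record is not None and record.get("_expires_at") == expires_at:
                del self.records[key]

    def get(self, key):
        self.purge()
        return self.records.get(key)
```

Changing or removing an expiration date just leaves a stale heap entry behind, which is skipped when it reaches the top.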
14. Range queries
• Keys are lexically sorted (red-black tree).
• Hence, range queries are cheap.
• Pincaster currently offers prefix-matching.
• Results of range queries can include keys,
keys + overview or keys + overview +
properties.
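Why sorted keys make prefix matching cheap can be shown with a sorted list standing in for the red-black tree: a binary search finds the first candidate, and all matches follow it contiguously. The function name is an assumption for the example.

```python
import bisect


def prefix_range(sorted_keys, prefix):
    """Return every key starting with `prefix` from a lexically sorted list."""
    start = bisect.bisect_left(sorted_keys, prefix)   # first key >= prefix
    end = start
    # Matching keys are contiguous, so stop at the first key that does not match.
    while end < len(sorted_keys) and sorted_keys[end].startswith(prefix):
        end += 1
    return sorted_keys[start:end]
```

The cost is a logarithmic search plus one step per result, regardless of how many other keys the layer holds.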
15. Linked records
• Symbolic links to other records with special properties starting
with $link:.
• Implements N:1 relations but 1:N and N:N can be represented
by multiple links. Records can have any number of links.
• No referential integrity.
• Useful for easy retrieval of related records. But nowhere near an
alternative to a graph database.
• Just add link=1 to any query in order to traverse links and
retrieve related records (duplicate records and loops are
tracked).
16. Linked records example (diagram)
Record Donald:
  Name: Donald Duck
  Location
  $link:favorite restaurant: Rest1234
Record Rest1234:
  Description: Mac Donald's
  Location
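Link traversal with loop tracking might look like the sketch below, using the Donald and Rest1234 records as sample data. It assumes the value of a `$link:` property is the key of the target record, as the example suggests; the function name and the in-memory dictionary are hypothetical.

```python
LINK_PREFIX = "$link:"


def fetch_with_links(records, key):
    """Return {key: properties} for `key` and every record reachable via links."""
    found = {}
    pending = [key]
    while pending:
        current = pending.pop()
        if current in found or current not in records:
            continue    # already returned (duplicate or loop), or a dangling link
        found[current] = records[current]
        for name, value in records[current].items():
            if name.startswith(LINK_PREFIX):
                pending.append(value)
    return found


records = {
    "Donald": {"Name": "Donald Duck", "$link:favorite restaurant": "Rest1234"},
    "Rest1234": {"Description": "Mac Donald's"},
}
related = fetch_with_links(records, "Donald")   # contains both records
```

Since there is no referential integrity, a link to a missing record is simply skipped rather than treated as an error.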
17. Public HTTP service
• Public data can be embedded in records, through
$content and $content_type properties.
• Especially useful to store JSON data and HTML
partials that can be directly served to browsers.
You can also think about it like memcache with
an embedded web server.
• Should be used behind a proxy or a filtering load
balancer, though.
18. Public HTTP service example (diagram)
Record Donald:
  Expiration
  Location
  Name: Donald Duck
  $content: <html>Quack quack!</html>
  $content_type: text/html
http://host/api/1.0/records/users/Donald.json
Public:
http://host/public/users/Donald.html
Quack quack!
19. Durability
• Similar to Redis AOF, but with a non-binary (human readable
and tweakable) journal.
• Data set and indices are kept in memory.
• But an append-only journal can log every query that is going
to change a database. Timestamps allow point-in-time
rollback.
• Configurable fsync() policy: after every commit, never, or every
x seconds.
• Combines efficiency of an in-memory database with durability.
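The journal idea can be sketched as an append-only file of timestamped operations, with a replay function that stops early for a point-in-time rollback. The one-JSON-object-per-line format, the `Journal` class, the policy names and the `replay` function are all assumptions made for illustration; Pincaster's actual journal format is not described here.

```python
import json
import os
import time


class Journal:
    """Append-only, human-readable log of write operations."""

    def __init__(self, path, fsync="always", interval=1.0):
        if fsync not in ("always", "never", "interval"):
            raise ValueError("unknown fsync policy")
        self.fsync, self.interval = fsync, interval
        self._last_sync = time.monotonic()
        self._file = open(path, "a", encoding="utf-8")

    def append(self, op, key, props=None, ts=None):
        entry = {"ts": time.time() if ts is None else ts,
                 "op": op, "key": key, "props": props}
        self._file.write(json.dumps(entry) + "\n")
        self._file.flush()                       # hand the data to the OS
        now = time.monotonic()
        due = self.fsync == "interval" and now - self._last_sync >= self.interval
        if self.fsync == "always" or due:
            os.fsync(self._file.fileno())        # force the data to disk
            self._last_sync = now

    def close(self):
        self._file.close()


def replay(path, until=None):
    """Rebuild the data set; pass `until` to stop at a given timestamp."""
    data = {}
    with open(path, encoding="utf-8") as journal:
        for line in journal:
            entry = json.loads(line)
            if until is not None and entry["ts"] > until:
                break
            if entry["op"] == "set":
                data.setdefault(entry["key"], {}).update(entry["props"])
            elif entry["op"] == "delete":
                data.pop(entry["key"], None)
    return data
```

With the "never" policy the operating system decides when data reaches the disk, trading durability for speed; "always" does the opposite.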
20. Journal rewrite
• Reduces startup time.
• Constructs a new journal with only the operations needed to
reconstruct the data set.
• Background operation happening in a child process with a low
priority.
• Takes advantage of copy-on-write through fork() and appends
missing records afterwards.
• The new journal atomically replaces the previous one after
successful completion.
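The core of a journal rewrite, writing a minimal journal and swapping it in atomically, can be sketched as below. The sketch deliberately leaves out the fork()ed child process and the step that appends records written during the rewrite. The function name, the `.rewrite` suffix and the JSON-lines format are assumptions.

```python
import json
import os


def rewrite_journal(path, data):
    """Replace the journal at `path` with one 'set' operation per live record."""
    tmp_path = path + ".rewrite"
    with open(tmp_path, "w", encoding="utf-8") as tmp:
        for key in sorted(data):
            entry = {"op": "set", "key": key, "props": data[key]}
            tmp.write(json.dumps(entry) + "\n")
        tmp.flush()
        os.fsync(tmp.fileno())       # make sure the new journal is on disk first
    # Atomic swap: readers see either the old journal or the complete new one.
    os.replace(tmp_path, path)
```

If anything fails before the final `os.replace`, the previous journal is left untouched.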
21. Coming soon
• Replication
• Clustering of spatial results (partly implemented).
• Optimization (distance computation, finer grained rwlocks, use slabs in
more areas).
• SpiderMonkey integration.
• Automatically move expired records to another layer.
• Observers (push a notification over an HTTP channel when there's an
update in a geographic zone).
• Client libraries and possibly a decent web site (help needed!)