2. Robert Friberg
• Design, build and maintain systems for clients at Devrex
• Trainer – Microsoft .NET, SQL Server, Java, Perl, Python, Linux
• Machine learning, AI
• Squash fanatic
• @robertfriberg, robert@devrexlabs.com
3. Why?
[Diagram: layered stack]
Service Layer
Domain Layer
Data Access Layer
Relational Model
Views/SPs
Build faster systems faster
4. What is OrigoDB?
• In-memory database toolkit
• Code and data in same process
• Write-ahead command logging and snapshots
• Open Source single DLL for .NET/Mono
• Commercial server with mirror replication
5. How does it work?
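[Diagram: the Engine and the in-memory model inside a .NET process, with Storage alongside]
• Client code passes commands and queries to the Engine
• The Engine handles queries and commands and guards the in-memory model
• For each command (e.g. PlaceOrderCommand) the Engine performs: 1. AppendToLog, 2. Execute command
• Storage holds snapshots and the command journal, ordered by time: PlaceOrderCommand, NewCustomerCommand, IncreaseInventoryLevelCommand, PlaceOrderCommand, PlaceOrderCommand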
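The append-then-execute flow on this slide can be sketched without OrigoDB at all. What follows is a minimal illustration of the idea, not OrigoDB's implementation: ToyEngine, InventoryModel and InventoryCommand are hypothetical names, and an in-memory list stands in for the journal that would normally live in storage.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model: inventory level per product id.
public class InventoryModel
{
    public readonly Dictionary<string, int> Levels = new Dictionary<string, int>();
}

// Hypothetical command abstraction: a description of one state change.
public abstract class InventoryCommand
{
    public abstract void Execute(InventoryModel model);
}

public class IncreaseInventoryLevelCommand : InventoryCommand
{
    public string ProductId;
    public int Quantity;

    public override void Execute(InventoryModel model)
    {
        int current;
        model.Levels.TryGetValue(ProductId, out current); // 0 when the key is missing
        model.Levels[ProductId] = current + Quantity;
    }
}

// Toy engine: journal first, then apply to the in-memory model.
public class ToyEngine
{
    private readonly List<InventoryCommand> _journal;
    private readonly InventoryModel _model = new InventoryModel();
    private readonly object _lock = new object();

    public ToyEngine(List<InventoryCommand> journal)
    {
        _journal = journal;
        // Restore on startup: replay every journaled command against an empty model.
        foreach (var command in _journal) command.Execute(_model);
    }

    public void Execute(InventoryCommand command)
    {
        lock (_lock) // commands run one at a time
        {
            _journal.Add(command);   // 1. AppendToLog
            command.Execute(_model); // 2. Execute command
        }
    }

    public T Query<T>(Func<InventoryModel, T> query)
    {
        lock (_lock) { return query(_model); }
    }
}
```

Constructing a second ToyEngine over the same journal list rebuilds the same model, which is the sense in which the in-memory state is a projection of the command sequence.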
6. Demand drives change
• Performance
• Data volume
• Scalability
• Availability
• Modeling
• NoSQL
• Big data
• Graph
• Real time analytics
• In-memory computing
• Column stores
One size (RDBMS) no longer fits all
Polyglot Persistence
7. B-trees and Transactions
LOG
• Fill factor
• Page splits
• Clustered index
• Checkpoint
DATA: 64 KB blocks of 8 x 8 KB pages
Logical B-tree of 8 KB data pages in the buffer pool (cache)
Buffer Manager
Transactions append inserted, deleted, original and modified pages to the LOG
8. When?
• Whenever data fits in RAM
• Alternative to general RDBMS OLAP/OLTP
• Complex models and transactions
• In-memory analytics
• Traceability requirements (complete history of events)
17. Silly Relational
[Serializable]
public class RelationalModel : Model
{
    private DataSet _dataset;

    public RelationalModel()
    {
        _dataset = new DataSet();
    }
    //... ExecuteQuery and ExecuteCommand omitted
}
19. Domain specific relational
[Serializable]
public class CommerceModel : Model // any name other than Model; a class cannot derive from itself
{
    SortedDictionary<int,Customer> _customers;
    SortedDictionary<int,Order> _orders;
    SortedDictionary<int,Product> _products;
    SortedDictionary<string,Customer> _customersByName;
}

[Serializable]
public class Order
{
    public int CustomerId; //foreign key vs. reference
}
20. JS interpreter hosting (Jurassic)
[Serializable]
public class JurassicModel : Model
{
    private ScriptEngine _scriptEngine;

    public JurassicModel()
    {
        _scriptEngine = new ScriptEngine();
        _scriptEngine.Execute("var model = {}");
    }
    //... ExecuteQuery and ExecuteCommand omitted
}
21. Commands
• Serial execution
• Exclusive access to the model
• Transition the model from one valid state to the next
s0 --t1--> s1 --t2--> s2
22. Command guidelines
• No side effects or external actions
• No external dependencies
• Unhandled exceptions trigger rollback (full restore)
• Call Command.Abort() to signal exception
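Why external dependencies matter becomes clearer with a small example. The sketch below is illustrative only and does not use OrigoDB types: OrderLog, PlaceOrderReadingClock and PlaceOrderWithTimestamp are hypothetical names. It contrasts a command that reads the system clock while executing with one that carries its timestamp as data, so that replaying the journal always rebuilds the same state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model: just the timestamps of placed orders.
public class OrderLog
{
    public readonly List<DateTime> PlacedAt = new List<DateTime>();
}

// Breaks the guideline: depends on the system clock at execution time,
// so replaying the journal later yields a different model.
public class PlaceOrderReadingClock
{
    public void Execute(OrderLog model) { model.PlacedAt.Add(DateTime.UtcNow); }
}

// Follows the guideline: the caller captures the time once and the value
// travels with the command, so every replay produces the same state.
public class PlaceOrderWithTimestamp
{
    public readonly DateTime Timestamp;

    public PlaceOrderWithTimestamp(DateTime timestamp) { Timestamp = timestamp; }

    public void Execute(OrderLog model) { model.PlacedAt.Add(Timestamp); }
}
```

The same reasoning applies to random numbers, GUIDs and calls to other services: resolve them before creating the command and pass the results in as fields.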
23. The model is an object graph
[Diagram: a TaskModel object graph made up of TaskList, Task and Category objects]
28. Automated Test alternatives
• Transparent testing of domain behavior: entities, model
• Test commands, queries, entities on model without engine
• Test with in-memory storage
• Full stack testing – slow, requires cleanup
29. In-memory storage
• Non persistent command journal and snapshots
• Mimics FileStore using MemoryStreams
• Tests serialization/identity issues
var config = EngineConfiguration.Create().ForIsolatedTest();
OR:
config.SetCommandStoreFactory(cfg => new InMemoryCommandStore(cfg));
config.SetSnapshotStoreFactory(cfg => new InMemorySnapshotStore(cfg));
33. Engine.For<M>()
• Returns IEngine<M> or derivative
• Reuse based on EngineConfiguration.Location property
• Remote or in-process
• ILocalEngineClient<M>
• IRemoteEngineClient<M>
• Running engines are tracked by Config.Engines
35. x64 vs. x86
• Core Library compiled with AnyCPU
• x86 = 32-bit pointers, max 4 GB address space
• x64 = 64-bit pointers
• Server ships with x64 and x86 binaries
36. IIS Hosting
• Disable application pool recycling
• Ensure single process, no farming or LB
• Litter controllers with Db.For<M>() / Engine.For<M>()
• Or put a static ref somewhere, e.g. Global.asax
37. Proxy
• Proxy has same interface as the Model
• Method calls are intercepted
• void methods interpreted as commands (and logged)
• other methods interpreted as queries
• Can be overridden with attributes
• Local or remote, cluster
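The void-means-command convention can be illustrated with a tiny interception sketch. This is not how OrigoDB builds its proxy; it uses System.Reflection.DispatchProxy (available on .NET Core and later) as a stand-in, and ICounterModel, CounterModel, LoggingProxy and ProxyDemo are hypothetical names. Calls to void methods are recorded in a journal before being forwarded; all other calls are forwarded as queries.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public interface ICounterModel
{
    void Increment(); // void: treated as a command
    int GetValue();   // non-void: treated as a query
}

public class CounterModel : ICounterModel
{
    private int _value;
    public void Increment() { _value++; }
    public int GetValue() { return _value; }
}

// Intercepts every call made through the ICounterModel interface.
public class LoggingProxy : DispatchProxy
{
    public ICounterModel Target;
    public readonly List<string> Journal = new List<string>();

    protected override object Invoke(MethodInfo targetMethod, object[] args)
    {
        // Commands are logged before they run; queries are passed straight through.
        if (targetMethod.ReturnType == typeof(void))
            Journal.Add(targetMethod.Name);
        return targetMethod.Invoke(Target, args);
    }
}

public static class ProxyDemo
{
    public static ICounterModel Create(ICounterModel target, out LoggingProxy proxy)
    {
        // The generated object both implements ICounterModel and derives from LoggingProxy.
        ICounterModel wrapped = DispatchProxy.Create<ICounterModel, LoggingProxy>();
        proxy = (LoggingProxy)(object)wrapped;
        proxy.Target = target;
        return wrapped;
    }
}
```

Calling Increment() twice through the proxy leaves two entries in Journal, while GetValue() leaves none. A real proxy would log the arguments as well and would honor the attribute overrides mentioned on the slide.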
38. Demo: Let’s build a key/value store
• New project
• Reference origodb.core
• Define KeyValueStoreModel
• void Put(key, value)
• object Get(key)
• bool Exists(key)
• bool Remove(key)
• Add some unit tests
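One possible shape for the demo model is sketched below. It is deliberately kept as a plain class so that it compiles without any package reference; in the actual demo it would presumably derive from OrigoDB's Model base class, as the other model examples in this deck do. String keys, the Dictionary backing store, single-argument Get/Exists/Remove and the choice to throw on a missing key are all assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;

[Serializable]
public class KeyValueStoreModel
{
    private readonly Dictionary<string, object> _store = new Dictionary<string, object>();

    // Inserts the value, or overwrites the existing value for the key.
    public void Put(string key, object value)
    {
        _store[key] = value;
    }

    // Returns the stored value; throws if the key is unknown.
    public object Get(string key)
    {
        object value;
        if (!_store.TryGetValue(key, out value)) throw new KeyNotFoundException(key);
        return value;
    }

    public bool Exists(string key)
    {
        return _store.ContainsKey(key);
    }

    // Returns true if the key was present and has been removed.
    public bool Remove(string key)
    {
        return _store.Remove(key);
    }
}
```

Because Remove returns bool rather than void, a proxy that follows the void-means-command convention would treat it as a query, so it is a candidate for the attribute override mentioned on the Proxy slide.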
41. EngineConfiguration.Location property
• File location for FileStorage
• Connection string when SqlStorage
• Defaults (when null)
• Will look in app.config for connection string
• Type name of model
• Current working directory
• App_Data in web context
• Magic
• mode=remote;host=10.0.0.20;port=3001
42. Persistence modes
• Journaling (default)
• SnapshotPerTransaction
• ManualSnapshots
var config = EngineConfiguration.Create();
config.PersistenceMode = PersistenceMode.SnapshotPerTransaction;
var db = Engine.For<MyModel>(config);
43. Kernels
• OptimisticKernel (default)
• RoyalFoodTaster
var config = EngineConfiguration.Create();
config.Kernel = Kernels.RoyalFoodTaster;
var db = Db.For<MyModel>(config);
46. ProtobufFormatter
• Protocol Buffers by Google
• IFormatter wrapper around protobuf-net by @marcgravell
• Contract based as opposed to embedded metadata
• Compact, fast, loose coupling, cross platform
• Configure with attributes or code
• Use it!
47. Protobuf: Attribute based configuration
[ProtoContract]
public class Company
{
    [ProtoMember(1)]
    public string Name { get; set; }

    [ProtoMember(2)]
    public List<Employee> Employees { get; set; }
}
58. OrigoDB Server
• Console Application or Windows Service
• Process hosting single Engine / Model
• Ad-hoc LINQ / Razor queries
• JavaScript API
• Primitive web based UI
• Commercial License
• Multiserver replication
60. Lab M6
• Follow the OrigoDB Server online tutorial
http://origodb.com/
61. Thank you for listening!
• http://origodb.com
• http://dev.origodb.com
• http://github.com/devrexlabs
• http://geekstream.devrexlabs.com
• @robertfriberg, @devrexlabs
Editor's Notes
What is it? How does it work? Who built it and why? When does it shine? When does it suck?
What if we just keep all the data in RAM?
Moving back and forth and mapping is silly.
Code and data in same process.
Productivity
Simplicity
Consistency
Testability
Strongly typed, compile time checked
Operations
Capturing the essence..
In-memory
In-memory object graph, user defined. Probably collections, entities and references.
Your choice.
Is it a database? Is it an object database? Linq queries.
Toolkit
Flexible, configurable, kernels, storage, data model, persistence modes, formatting
Bring your own model. – this is key.
Usually a product based on a specific data model. VoltDB, Raven
Naming. LiveDomain -> LiveDB -> OrigoDB
Code and data in same process
Don’t do CRUD. It’s silly. ORMs are based on CRUD.
One of the first things you learn is don’t do SELECT *. EF
Command logging
The in-memory data is a projection of the commands,
compare ES with a single aggregate. Same benefits as ES.
Requires .NET 4.0
What is OrigoDB?
OrigoDB is an in-memory database toolkit. The core component is the Engine. The engine is 100% ACID, runs in-process and hosts a user defined data model. The data model can be domain specific or generic and is defined using plain old .NET types. Persistence is based on snapshots and write-ahead command logging to the underlying storage.
The Model
• is an instance of the user defined data model
• lives in RAM only
• is the data
• is a projection of the entire sequence of commands applied to the initial model, usually empty
• can only be accessed through the engine
The Client
• has no direct reference to the model
• interacts directly with the Engine, either in-process or remote
• or indirectly via a proxy with the same interface as the model
• passes query and command objects to the engine
The Engine
The Engine encapsulates an instance of the model and is responsible for atomicity, consistency, isolation and durability. It performs the following tasks:
• writes commands to the journal
• executes commands and queries
• reads and writes snapshots
• restores the model on startup
We call it a toolkit because you have a lot of options:
• Modelling - define your own model or use an existing one. Generic or domain specific. It’s up to you.
• Storage - Default is FileStore. Use SqlStore or write your own module.
• Data format - Choose wire and storage format by plugging in different IFormatter implementations. Binary, JSON, ProtoBuf, etc.
Read more in the docs on Extensibility
Design goals
Our initial design goals were focused on rapid development, testability, simplicity, correctness, modularity, flexibility and extensibility. Performance was never a goal, but running in-memory with memory-optimized data structures will generally outperform a disk-oriented system. Of course, a lot of further optimization is possible.
Performance -> in-memory, specialization
Data volume -> sharding, partitioning
Availability -> redundancy
Diversity, things are happening.
Fi
B-tree, 8 KB blocks, buffer pool, animation of a transaction. Column stores
Every table is a B-tree (unless it is a heap), and every index is a B-tree
Effect logging – log the effect of the transaction = modified pages, new pages
Support rollback by including deleted pages and original version of modified page.
Simplicity and all that comes with it. 40% less code. Reason enough.
Low latency queries, heavy load querying
With event sourcing, polyglot persistence. Build a read model from an event stream
Complex domain models: Because a trivial model is trivial to ORM, RDBMS. Difficult to model relationally
An in-memory instance of the Model is the data.
Commands are like Stored Procedures
Queries are like Views
Ad-hoc LINQ
A quick example
Show the web site, search, show about, mention github, show some code
approx 500’ articles
Not necessarily data
Sole purpose is to update the state of the model, from one state to another.
Avoid embedding entities and graphs, regard commands as simple messages.
Lambdas won’t serialize – local process only
Results get cloned unless otherwise specified.
Show the model, the entities, the commands. Queries/lambdas in controllers. Show hosting in global.asax
Describe the In-memory store and config for isolated test
Depends on where the logic is, commands or model or both
Test with storage exercises serialization, use to ensure serialization and isolation
Full stack as smoke tests and to verify configuration or framework related behavior
Considerations when hosting the in-process engine and model.
Lifecycle
Snapshots, direct interaction, Dispose, plenty of overloads taking config, config-string, initial model
Abstraction for convenience
Thread safe execute overloads
Or – demonstrate geekstream
Explain each briefly
Pretty messy, leaky abstractions, needs redesign. Any takers?
Authorizer uses IPrincipal and Roles of current thread identity.
Synchronizer unimportant, ReaderWriterLockSlim default
SqlStorage, EventStoreStore, Azure TableStorage for snapshots
BinaryFormatter, JsonFormatter, ProtobufFormatter
Compare with BinaryFormatter, Reflection
Use it: In production, let the design stabilize first
Write transactions are serialized without blocking readers. Reader gets the most recent state.
This is an animated slide
_tasks.ToArray() creates a new array, thus no way to modify the _tasks array.
AddTask() returns a new TodoModel instance, thus it is immutable
Strings are immutable. The strings are shared among consecutive instances of TodoModel
Could have had an IsCompleted() method and assert !IsCompleted() in the Complete method. But the point is that instances of Task are immutable.
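The slide these notes describe is not part of this transcript, so the following is only a guess at the kind of immutable model they refer to. TodoModel, AddTask, Complete and _tasks.ToArray() come from the notes; everything else is an assumption, including the name TodoTask, which stands in for the slide's Task to avoid confusion with System.Threading.Tasks.Task.

```csharp
using System;
using System.Linq;

// Immutable task: completing it yields a new instance.
public sealed class TodoTask
{
    public readonly string Title;
    public readonly bool Completed;

    public TodoTask(string title, bool completed = false)
    {
        Title = title;
        Completed = completed;
    }

    public TodoTask Complete()
    {
        return new TodoTask(Title, true); // the Title string is shared, not copied
    }
}

// Immutable model: every "modification" returns a new TodoModel.
public sealed class TodoModel
{
    private readonly TodoTask[] _tasks;

    public TodoModel() : this(new TodoTask[0]) { }

    private TodoModel(TodoTask[] tasks) { _tasks = tasks; }

    // ToArray() hands out a copy, so callers cannot modify the internal array.
    public TodoTask[] Tasks
    {
        get { return _tasks.ToArray(); }
    }

    public TodoModel AddTask(string title)
    {
        var extended = _tasks.Concat(new[] { new TodoTask(title) }).ToArray();
        return new TodoModel(extended); // existing TodoTask instances are shared
    }
}
```

Because no instance ever changes after construction, a reader holding an older TodoModel can keep using it safely while a writer produces the next one.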
Notice base class and out parameter on execute