SlideShare uma empresa Scribd logo
1 de 58
The Return of Big Iron?
Ben Stopford
Distinguished Engineer
RBS Markets
Much diversity
What does this mean?
• A change in what customers (we) value
• The mainstream is not serving customers
(us) sufficiently
The Database field has problems
We Lose: Joe Hellerstein (Berkeley) 2001
“Databases are commoditised and cornered
to slow-moving, evolving, structure
intensive, applications that require schema
evolution.“ …
“The internet companies are lost and we will
remain in the doldrums of the enterprise
space.” …
“As databases are black boxes which
require a lot of coaxing to get maximum
performance”
His question was how to win
them back?
These new technologies also
caused frustration
Backlash (2009)
Not novel (dates back to the 80’s)
Physical level not the logical level (messy?)
Incompatible with tooling
Lack of integrity (referential) & ACID
MR is brute force ignoring indexing, scew
All points are reasonable
And they proved it too!
“A comparison of Approaches to Large Scale
Data Analysis” – Sigmod 2009
• Vertica vs. DBMSX vs.
Hadoop
• Vertica up to 7 x faster than
Hadoop over benchmarks
Databases faster
than Hadoop
But possibly missed the point?
Databases were traditionally
designed to keep data safe
NoSQL grew from a need to scale
It’s more than just scale, they
facilitate different practices
A Better Fit
They better match the way software is
engineered today.
– Iterative development
– Fast feedback
– Frequent releases
Is NoSQL a Disruptive Technology?
Christensen’s observation:
Market leaders are displaced when markets
shift in ways that the incumbent leaders are
not prepared for.
Aside: MongoDB
• Impressive trajectory
• Slightly crappy product (from a traditional
database standpoint)
• Most closely related to relational DB (of
the NoSQLs)
• Plays to the agile mindset
Yet the NoSQL market is relatively
small
• Currently around $600 but projected to
grow strongly
• Database and systems management
market is worth around $34billion
Key Point

There is more to NoSQL than just
scale, it sits better with the way we
build software today
We have new building blocks to
play with!
My Problem
• Sprawling application space, built over
many years, grouped into both vertical and
horizontal silos
• Duplication of effort
• Data corruption & preventative measures
• Consolidation is costly, time consuming
and technically challenging.
Traditional solutions
(in chronological order)
– Messaging
– SOA
– Enterprise Data Warehouse
– Data virtualisation
Bringing data, applications, people
together is hard
A popular choice is an EDW
EDW pattern is workable, but tough
– As soon as you take a ‘view’ on what the
shape of the data is, it becomes harder to
change.
• Leave ‘taking a view” to the last responsible
moment

– Multifaceted: Shape, diversity of source,
diversity of population, temporal change
Harder to do iteratively
Is this the only way?
The Google Approach
MapReduce
Google Filesystem
BigTable
Tenzing
Megastore
F1
Dremel
Spanner
And just one code base!
So no enterprise schema secret
society!
The Ebay Approach
The Partial-Schematic
Approach
Often termed Clobs & Cracking
Problems with solidifying a
schematic representation
• Risk of throwing information away, keeping
only what you think you need.
– OK if you create data
– Bad if you got data from elsewhere

• Data tends to be poly-structured in
programs and on the wire
• Early-binding slows down development
But schemas are good
• They guarantee a contract
• That contract spans the whole dataset
– Similar to static typing in programming
languages.
Compromise positions
• Query schema can be a subset of data
schema.
• Use schemaless databases to capture
diversity early and evolve it as you build.
Common solutions today use
multiple technologies
M Re u
ap d ce

D a
at
W ho se
are u

?
Ke Vl u
y ae
St o
re

In- M mry/
eo
O
LTP D ba
ata se
We use an late-bound schema,
sitting over a schemaless store
S
tructured
S
tandardisation
Layer
Raw Data

Late Bound
Schema
Evolutionary Approach
• Late-binding makes consolidation
incremental
– Schematic representation delivered at the ‘last
responsible moment’ (schema on demand)
– A trade in this model has 4 mandatory nodes. A
fully modeled trade has around 800.

• The system of record is raw data, not our
‘view’ of it
• No schema migration! But this comes at a
price.
Scaling
Key based access always scales
Client
But queries (without the sharding key)
always broadcast
Client
As query complexity increases so does
the overhead
Client
Course grained shards
Client
Data Replicas provide hardware isolation
Client
Scaling
• Key based sharding is only sufficient very
simple workloads
• Course grained shards help (but suffer
from skew)
• Replication provides useful, if expensive,
hardware isolation
• Workload management is less useful in
my experience
Weak consistency forces the
problem onto the developer
Particularly bad for banks!
Scaling two phase commit is hard to
do efficiently
• Requires distributed lock/clock/counter
• Requires synchronisation of all readers &
writers
Alternatives to traditional 2PC
• MVCC over explicit locking
• Timestamp based strong consistency
– E.g. Granola

• Optimistic concurrency control
– Leverage short running transactions (avoid
cross-network transactions)
– Tolerate different temporal viewpoints to
reduce synchronization costs.
Immutable Data
•
•
•
•
•

Safety
‘As was’ view
Sits well with MVCC
Efficiency problems
Gaining popularity (e.g. Datomic)
Use joins to avoid ‘over aggregating’

Joins are ok, so long as they are
– Local
– via a unique key

Trade
r

Party
Trade
Memory/Disk Tradeoff
• Memory only (possibly overplayed)
• Pinned indexes (generally good idea if you
can afford the RAM)
• Disk resident (best general purpose
solution and for very large datasets)
Balance flexibility and complexity
Operational
(real time / MR)

Object/S
QL
S
tandardisation

Raw Data

Relational
Analytics
Supple at the front, more rigid at the back

Raw Access

Operational Access

Analytic Access

D

Looser

Tighter

L
M

Untyped

Object/S
QL

Reporting

Broad Data Coverage

Narrow Data Coverage

Narrow Query

Comprehensive Quer y
Principals
•
•
•
•

Record everything
Grow a schema, don’t do it upfront
Avoid using a ‘view’ as your system of record.
Differentiate between sourced data (out of
your control) and generated data (in your
control).
• Use automated replication (for isolation) as
well as sharding (for scale)
• Leverage asynchronicity to reduce
transaction overheads
Consolidation
means more trust,
less impedance
mismatches and
managing tighter
couplings
Target architectures are starting to
look more like large applications
of cloud enabled services than
heterogeneous application
conglomerates
Are we going back to the mainframe?
Thanks

http://www.benstopford.com

Mais conteúdo relacionado

Mais procurados

Lessons from lhc
Lessons from lhcLessons from lhc
Lessons from lhcdrsm79
 
SQL or NoSQL, that is the question!
SQL or NoSQL, that is the question!SQL or NoSQL, that is the question!
SQL or NoSQL, that is the question!Andraz Tori
 
Performance Considerations in Logical Data Warehouse
Performance Considerations in Logical Data WarehousePerformance Considerations in Logical Data Warehouse
Performance Considerations in Logical Data WarehouseDenodo
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture OverviewChristopher Foot
 
How Consistent Data Services Deliver Simplicity, Compatibility, And Lower Cost
How Consistent Data Services Deliver Simplicity, Compatibility, And Lower CostHow Consistent Data Services Deliver Simplicity, Compatibility, And Lower Cost
How Consistent Data Services Deliver Simplicity, Compatibility, And Lower CostDana Gardner
 
Data Warehouse in Cloud
Data Warehouse in CloudData Warehouse in Cloud
Data Warehouse in CloudPawan Bhargava
 
Distributed RDBMS: Challenges, Solutions & Trade-offs
Distributed RDBMS: Challenges, Solutions & Trade-offsDistributed RDBMS: Challenges, Solutions & Trade-offs
Distributed RDBMS: Challenges, Solutions & Trade-offsAhmed Magdy Ezzeldin, MSc.
 
Building a Digital Bank
Building a Digital BankBuilding a Digital Bank
Building a Digital BankDataStax
 
How to select a modern data warehouse and get the most out of it?
How to select a modern data warehouse and get the most out of it?How to select a modern data warehouse and get the most out of it?
How to select a modern data warehouse and get the most out of it?Slim Baltagi
 
O'Reilly ebook: Operationalizing the Data Lake
O'Reilly ebook: Operationalizing the Data LakeO'Reilly ebook: Operationalizing the Data Lake
O'Reilly ebook: Operationalizing the Data LakeVasu S
 
Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012Ben Stopford
 
Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...
Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...
Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...Sebastian Verheughe
 
So You Want to Build a Data Lake?
So You Want to Build a Data Lake?So You Want to Build a Data Lake?
So You Want to Build a Data Lake?David P. Moore
 
How much money do you lose every time your ecommerce site goes down?
How much money do you lose every time your ecommerce site goes down?How much money do you lose every time your ecommerce site goes down?
How much money do you lose every time your ecommerce site goes down?DataStax
 
Big Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingBig Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingDATAVERSITY
 

Mais procurados (20)

Lessons from lhc
Lessons from lhcLessons from lhc
Lessons from lhc
 
SQL or NoSQL, that is the question!
SQL or NoSQL, that is the question!SQL or NoSQL, that is the question!
SQL or NoSQL, that is the question!
 
Performance Considerations in Logical Data Warehouse
Performance Considerations in Logical Data WarehousePerformance Considerations in Logical Data Warehouse
Performance Considerations in Logical Data Warehouse
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture Overview
 
How Consistent Data Services Deliver Simplicity, Compatibility, And Lower Cost
How Consistent Data Services Deliver Simplicity, Compatibility, And Lower CostHow Consistent Data Services Deliver Simplicity, Compatibility, And Lower Cost
How Consistent Data Services Deliver Simplicity, Compatibility, And Lower Cost
 
Data Warehouse in Cloud
Data Warehouse in CloudData Warehouse in Cloud
Data Warehouse in Cloud
 
Distributed RDBMS: Challenges, Solutions & Trade-offs
Distributed RDBMS: Challenges, Solutions & Trade-offsDistributed RDBMS: Challenges, Solutions & Trade-offs
Distributed RDBMS: Challenges, Solutions & Trade-offs
 
Sql vs nosql
Sql vs nosqlSql vs nosql
Sql vs nosql
 
Disaster Recovery Site Implementation with MySQL
Disaster Recovery Site Implementation with MySQLDisaster Recovery Site Implementation with MySQL
Disaster Recovery Site Implementation with MySQL
 
SQL Server Disaster Recovery Implementation
SQL Server Disaster Recovery ImplementationSQL Server Disaster Recovery Implementation
SQL Server Disaster Recovery Implementation
 
Building a Digital Bank
Building a Digital BankBuilding a Digital Bank
Building a Digital Bank
 
How to select a modern data warehouse and get the most out of it?
How to select a modern data warehouse and get the most out of it?How to select a modern data warehouse and get the most out of it?
How to select a modern data warehouse and get the most out of it?
 
O'Reilly ebook: Operationalizing the Data Lake
O'Reilly ebook: Operationalizing the Data LakeO'Reilly ebook: Operationalizing the Data Lake
O'Reilly ebook: Operationalizing the Data Lake
 
Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012Where Does Big Data Meet Big Database - QCon 2012
Where Does Big Data Meet Big Database - QCon 2012
 
Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...
Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...
Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...
 
So You Want to Build a Data Lake?
So You Want to Build a Data Lake?So You Want to Build a Data Lake?
So You Want to Build a Data Lake?
 
Rdbms vs. no sql
Rdbms vs. no sqlRdbms vs. no sql
Rdbms vs. no sql
 
Data vault what's Next: Part 2
Data vault what's Next: Part 2Data vault what's Next: Part 2
Data vault what's Next: Part 2
 
How much money do you lose every time your ecommerce site goes down?
How much money do you lose every time your ecommerce site goes down?How much money do you lose every time your ecommerce site goes down?
How much money do you lose every time your ecommerce site goes down?
 
Big Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingBig Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data Modeling
 

Destaque

Streaming, Database & Distributed Systems Bridging the Divide
Streaming, Database & Distributed Systems Bridging the DivideStreaming, Database & Distributed Systems Bridging the Divide
Streaming, Database & Distributed Systems Bridging the DivideBen Stopford
 
Microservices for a Streaming World
Microservices for a Streaming WorldMicroservices for a Streaming World
Microservices for a Streaming WorldBen Stopford
 
Data Pipelines with Apache Kafka
Data Pipelines with Apache KafkaData Pipelines with Apache Kafka
Data Pipelines with Apache KafkaBen Stopford
 
A little bit of clojure
A little bit of clojureA little bit of clojure
A little bit of clojureBen Stopford
 
The Power of the Log
The Power of the LogThe Power of the Log
The Power of the LogBen Stopford
 
Linux Performance Tools
Linux Performance ToolsLinux Performance Tools
Linux Performance ToolsBrendan Gregg
 
Ideas for Distributing Skills Across a Continental Divide
Ideas for Distributing Skills Across a Continental DivideIdeas for Distributing Skills Across a Continental Divide
Ideas for Distributing Skills Across a Continental DivideBen Stopford
 
Test-Oriented Languages: Is it time for a new era?
Test-Oriented Languages: Is it time for a new era?Test-Oriented Languages: Is it time for a new era?
Test-Oriented Languages: Is it time for a new era?Ben Stopford
 
Refactoring tested code - has mocking gone wrong?
Refactoring tested code - has mocking gone wrong?Refactoring tested code - has mocking gone wrong?
Refactoring tested code - has mocking gone wrong?Ben Stopford
 
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBeyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBen Stopford
 
Coherence Implementation Patterns - Sig Nov 2011
Coherence Implementation Patterns - Sig Nov 2011Coherence Implementation Patterns - Sig Nov 2011
Coherence Implementation Patterns - Sig Nov 2011Ben Stopford
 
Top Mistakes When Writing Reactive Applications - Scala by the Bay 2016
Top Mistakes When Writing Reactive Applications - Scala by the Bay 2016Top Mistakes When Writing Reactive Applications - Scala by the Bay 2016
Top Mistakes When Writing Reactive Applications - Scala by the Bay 2016Petr Zapletal
 
Time Manager Workshop at #QGIS2015 Conference in Nodebo
Time Manager Workshop at #QGIS2015 Conference in NodeboTime Manager Workshop at #QGIS2015 Conference in Nodebo
Time Manager Workshop at #QGIS2015 Conference in NodeboAnita Graser
 
User Focused Security at Netflix: Stethoscope
User Focused Security at Netflix: StethoscopeUser Focused Security at Netflix: Stethoscope
User Focused Security at Netflix: StethoscopeJesse Kriss
 
forward and backward chaining
forward and backward chainingforward and backward chaining
forward and backward chainingRado Sianipar
 
Taking the friction out of microservice frameworks with Lagom
Taking the friction out of microservice frameworks with LagomTaking the friction out of microservice frameworks with Lagom
Taking the friction out of microservice frameworks with LagomMarkus Eisele
 
Stay productive while slicing up the monolith
Stay productive while slicing up the monolith Stay productive while slicing up the monolith
Stay productive while slicing up the monolith Markus Eisele
 
Modernizing Applications with Microservices
Modernizing Applications with MicroservicesModernizing Applications with Microservices
Modernizing Applications with MicroservicesMarkus Eisele
 

Destaque (20)

JAX London Slides
JAX London SlidesJAX London Slides
JAX London Slides
 
Streaming, Database & Distributed Systems Bridging the Divide
Streaming, Database & Distributed Systems Bridging the DivideStreaming, Database & Distributed Systems Bridging the Divide
Streaming, Database & Distributed Systems Bridging the Divide
 
Microservices for a Streaming World
Microservices for a Streaming WorldMicroservices for a Streaming World
Microservices for a Streaming World
 
Data Pipelines with Apache Kafka
Data Pipelines with Apache KafkaData Pipelines with Apache Kafka
Data Pipelines with Apache Kafka
 
A little bit of clojure
A little bit of clojureA little bit of clojure
A little bit of clojure
 
The Power of the Log
The Power of the LogThe Power of the Log
The Power of the Log
 
Linux Performance Tools
Linux Performance ToolsLinux Performance Tools
Linux Performance Tools
 
Ideas for Distributing Skills Across a Continental Divide
Ideas for Distributing Skills Across a Continental DivideIdeas for Distributing Skills Across a Continental Divide
Ideas for Distributing Skills Across a Continental Divide
 
Test-Oriented Languages: Is it time for a new era?
Test-Oriented Languages: Is it time for a new era?Test-Oriented Languages: Is it time for a new era?
Test-Oriented Languages: Is it time for a new era?
 
Refactoring tested code - has mocking gone wrong?
Refactoring tested code - has mocking gone wrong?Refactoring tested code - has mocking gone wrong?
Refactoring tested code - has mocking gone wrong?
 
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear ScalabilityBeyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
Beyond The Data Grid: Coherence, Normalisation, Joins and Linear Scalability
 
Coherence Implementation Patterns - Sig Nov 2011
Coherence Implementation Patterns - Sig Nov 2011Coherence Implementation Patterns - Sig Nov 2011
Coherence Implementation Patterns - Sig Nov 2011
 
Top Mistakes When Writing Reactive Applications - Scala by the Bay 2016
Top Mistakes When Writing Reactive Applications - Scala by the Bay 2016Top Mistakes When Writing Reactive Applications - Scala by the Bay 2016
Top Mistakes When Writing Reactive Applications - Scala by the Bay 2016
 
MessageBus vs MessageBus
MessageBus vs MessageBusMessageBus vs MessageBus
MessageBus vs MessageBus
 
Time Manager Workshop at #QGIS2015 Conference in Nodebo
Time Manager Workshop at #QGIS2015 Conference in NodeboTime Manager Workshop at #QGIS2015 Conference in Nodebo
Time Manager Workshop at #QGIS2015 Conference in Nodebo
 
User Focused Security at Netflix: Stethoscope
User Focused Security at Netflix: StethoscopeUser Focused Security at Netflix: Stethoscope
User Focused Security at Netflix: Stethoscope
 
forward and backward chaining
forward and backward chainingforward and backward chaining
forward and backward chaining
 
Taking the friction out of microservice frameworks with Lagom
Taking the friction out of microservice frameworks with LagomTaking the friction out of microservice frameworks with Lagom
Taking the friction out of microservice frameworks with Lagom
 
Stay productive while slicing up the monolith
Stay productive while slicing up the monolith Stay productive while slicing up the monolith
Stay productive while slicing up the monolith
 
Modernizing Applications with Microservices
Modernizing Applications with MicroservicesModernizing Applications with Microservices
Modernizing Applications with Microservices
 

Semelhante a Big iron 2 (published)

SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureVenu Anuganti
 
What ya gonna do?
What ya gonna do?What ya gonna do?
What ya gonna do?CQD
 
Big Data Platforms: An Overview
Big Data Platforms: An OverviewBig Data Platforms: An Overview
Big Data Platforms: An OverviewC. Scyphers
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databasesJames Serra
 
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInLinkedIn
 
One Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database RevolutionOne Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database Revolutionmark madsen
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabasesAdi Challa
 
NO SQL: What, Why, How
NO SQL: What, Why, HowNO SQL: What, Why, How
NO SQL: What, Why, HowIgor Moochnick
 
MongoDB: What, why, when
MongoDB: What, why, whenMongoDB: What, why, when
MongoDB: What, why, whenEugenio Minardi
 
Building FoundationDB
Building FoundationDBBuilding FoundationDB
Building FoundationDBFoundationDB
 
Everything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBEverything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBjhugg
 
Database revolution opening webcast 01 18-12
Database revolution opening webcast 01 18-12Database revolution opening webcast 01 18-12
Database revolution opening webcast 01 18-12mark madsen
 
Database Revolution - Exploratory Webcast
Database Revolution - Exploratory WebcastDatabase Revolution - Exploratory Webcast
Database Revolution - Exploratory WebcastInside Analysis
 
JasperWorld 2012: Reinventing Data Management by Max Schireson
JasperWorld 2012: Reinventing Data Management by Max SchiresonJasperWorld 2012: Reinventing Data Management by Max Schireson
JasperWorld 2012: Reinventing Data Management by Max SchiresonMongoDB
 
When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...MongoDB
 
SpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesshnkr_rmchndrn
 
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDBBig Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDBBigDataCloud
 
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,..."Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...lisapaglia
 

Semelhante a Big iron 2 (published) (20)

SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data Architecture
 
What ya gonna do?
What ya gonna do?What ya gonna do?
What ya gonna do?
 
Big Data Platforms: An Overview
Big Data Platforms: An OverviewBig Data Platforms: An Overview
Big Data Platforms: An Overview
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
 
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
 
One Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database RevolutionOne Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database Revolution
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabases
 
NO SQL: What, Why, How
NO SQL: What, Why, HowNO SQL: What, Why, How
NO SQL: What, Why, How
 
MongoDB: What, why, when
MongoDB: What, why, whenMongoDB: What, why, when
MongoDB: What, why, when
 
Building FoundationDB
Building FoundationDBBuilding FoundationDB
Building FoundationDB
 
Everything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBEverything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDB
 
Database revolution opening webcast 01 18-12
Database revolution opening webcast 01 18-12Database revolution opening webcast 01 18-12
Database revolution opening webcast 01 18-12
 
Database Revolution - Exploratory Webcast
Database Revolution - Exploratory WebcastDatabase Revolution - Exploratory Webcast
Database Revolution - Exploratory Webcast
 
JasperWorld 2012: Reinventing Data Management by Max Schireson
JasperWorld 2012: Reinventing Data Management by Max SchiresonJasperWorld 2012: Reinventing Data Management by Max Schireson
JasperWorld 2012: Reinventing Data Management by Max Schireson
 
When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...
 
Destroying Data Silos
Destroying Data SilosDestroying Data Silos
Destroying Data Silos
 
SpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud Computing
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skies
 
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDBBig Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB
 
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,..."Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr,...
 

Mais de Ben Stopford

10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache Kafka10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache KafkaBen Stopford
 
10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven Microservices10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven MicroservicesBen Stopford
 
The Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and ServerlessThe Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and ServerlessBen Stopford
 
A Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices GenerationA Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices GenerationBen Stopford
 
Building Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka StreamsBuilding Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka StreamsBen Stopford
 
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017  - The Data Dichotomy- Rethinking Data and Services with StreamsNDC London 2017  - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with StreamsBen Stopford
 
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...Ben Stopford
 
Building Event Driven Services with Stateful Streams
Building Event Driven Services with Stateful StreamsBuilding Event Driven Services with Stateful Streams
Building Event Driven Services with Stateful StreamsBen Stopford
 
Devoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful StreamsDevoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful StreamsBen Stopford
 
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2:  Building Event-Driven Services with Apache KafkaEvent Driven Services Part 2:  Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2: Building Event-Driven Services with Apache KafkaBen Stopford
 
Event Driven Services Part 1: The Data Dichotomy
Event Driven Services Part 1: The Data Dichotomy Event Driven Services Part 1: The Data Dichotomy
Event Driven Services Part 1: The Data Dichotomy Ben Stopford
 
Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...Ben Stopford
 
Strata Software Architecture NY: The Data Dichotomy
Strata Software Architecture NY: The Data DichotomyStrata Software Architecture NY: The Data Dichotomy
Strata Software Architecture NY: The Data DichotomyBen Stopford
 
Advanced databases ben stopford
Advanced databases   ben stopfordAdvanced databases   ben stopford
Advanced databases ben stopfordBen Stopford
 
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...Ben Stopford
 
Balancing Replication and Partitioning in a Distributed Java Database
Balancing Replication and Partitioning in a Distributed Java DatabaseBalancing Replication and Partitioning in a Distributed Java Database
Balancing Replication and Partitioning in a Distributed Java DatabaseBen Stopford
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle CoherenceBen Stopford
 
Architecting for Change: An Agile Approach
Architecting for Change: An Agile ApproachArchitecting for Change: An Agile Approach
Architecting for Change: An Agile ApproachBen Stopford
 

Mais de Ben Stopford (18)

10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache Kafka10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache Kafka
 
10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven Microservices10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven Microservices
 
The Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and ServerlessThe Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and Serverless
 
A Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices GenerationA Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices Generation
 
Building Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka StreamsBuilding Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka Streams
 
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017  - The Data Dichotomy- Rethinking Data and Services with StreamsNDC London 2017  - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
 
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
Building Event Driven Services with Apache Kafka and Kafka Streams - Devoxx B...
 
Building Event Driven Services with Stateful Streams
Building Event Driven Services with Stateful StreamsBuilding Event Driven Services with Stateful Streams
Building Event Driven Services with Stateful Streams
 
Devoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful StreamsDevoxx London 2017 - Rethinking Services With Stateful Streams
Devoxx London 2017 - Rethinking Services With Stateful Streams
 
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2:  Building Event-Driven Services with Apache KafkaEvent Driven Services Part 2:  Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
 
Event Driven Services Part 1: The Data Dichotomy
Event Driven Services Part 1: The Data Dichotomy Event Driven Services Part 1: The Data Dichotomy
Event Driven Services Part 1: The Data Dichotomy
 
Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...
 
Strata Software Architecture NY: The Data Dichotomy
Strata Software Architecture NY: The Data DichotomyStrata Software Architecture NY: The Data Dichotomy
Strata Software Architecture NY: The Data Dichotomy
 
Advanced databases ben stopford
Advanced databases   ben stopfordAdvanced databases   ben stopford
Advanced databases ben stopford
 
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
 
Balancing Replication and Partitioning in a Distributed Java Database
Balancing Replication and Partitioning in a Distributed Java DatabaseBalancing Replication and Partitioning in a Distributed Java Database
Balancing Replication and Partitioning in a Distributed Java Database
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle Coherence
 
Architecting for Change: An Agile Approach
Architecting for Change: An Agile ApproachArchitecting for Change: An Agile Approach
Architecting for Change: An Agile Approach
 

Último

Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 

Último (20)

Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

Big iron 2 (published)

  • 1. The Return of Big Iron? Ben Stopford Distinguished Engineer RBS Markets
  • 3. What does this mean? • A change in what customers (we) value • The mainstream is not serving customers (us) sufficiently
  • 4. The Database field has problems
  • 5. We Lose: Joe Hellerstein (Berkeley) 2001 “Databases are commoditised and cornered to slow-moving, evolving, structure intensive, applications that require schema evolution.“ … “The internet companies are lost and we will remain in the doldrums of the enterprise space.” … “As databases are black boxes which require a lot of coaxing to get maximum performance”
  • 6. His question was how to win them back?
  • 7. These new technologies also caused frustration
  • 8. Backlash (2009) Not novel (dates back to the 80’s) Physical level not the logical level (messy?) Incompatible with tooling Lack of integrity (referential) & ACID MR is brute force ignoring indexing, scew
  • 9. All points are reasonable
  • 10. And they proved it too! “A comparison of Approaches to Large Scale Data Analysis” – Sigmod 2009 • Vertica vs. DBMSX vs. Hadoop • Vertica up to 7 x faster than Hadoop over benchmarks Databases faster than Hadoop
  • 11. But possibly missed the point?
  • 13. NoSQL grew from a need to scale
  • 14.
  • 15. It’s more than just scale, they facilitate different practices
  • 16. A Better Fit They better match the way software is engineered today. – Iterative development – Fast feedback – Frequent releases
  • 17. Is NoSQL a Disruptive Technology? Christensen’s observation: Market leaders are displaced when markets shift in ways that the incumbent leaders are not prepared for.
  • 18. Aside: MongoDB • Impressive trajectory • Slightly crappy product (from a traditional database standpoint) • Most closely related to relational DB (of the NoSQLs) • Plays to the agile mindset
  • 19. Yet the NoSQL market is relatively small • Currently around $600 but projected to grow strongly • Database and systems management market is worth around $34billion
  • 20. Key Point There is more to NoSQL than just scale, it sits better with the way we build software today
  • 21. We have new building blocks to play with!
  • 22. My Problem • Sprawling application space, built over many years, grouped into both vertical and horizontal silos • Duplication of effort • Data corruption & preventative measures • Consolidation is costly, time consuming and technically challenging.
  • 23. Traditional solutions (in chronological order) – Messaging – SOA – Enterprise Data Warehouse – Data virtualisation
  • 24. Bringing data, applications, people together is hard
  • 25. A popular choice is an EDW
  • 26. EDW pattern is workable, but tough – As soon as you take a ‘view’ on what the shape of the data is, it becomes harder to change. • Leave ‘taking a view” to the last responsible moment – Multifaceted: Shape, diversity of source, diversity of population, temporal change
  • 27. Harder to do iteratively
  • 28. Is this the only way?
  • 29. The Google Approach MapReduce Google Filesystem BigTable Tenzing Megastore F1 Dremel Spanner
  • 30. And just one code base! So no enterprise schema secret society!
  • 33. Problems with solidifying a schematic representation • Risk of throwing information away, keeping only what you think you need. – OK if you create data – Bad if you got data from elsewhere • Data tends to be poly-structured in programs and on the wire • Early-binding slows down development
  • 34. But schemas are good • They guarantee a contract • That contract spans the whole dataset – Similar to static typing in programming languages.
  • 35. Compromise positions • Query schema can be a subset of data schema. • Use schemaless databases to capture diversity early and evolve it as you build.
  • 36. Common solutions today use multiple technologies M Re u ap d ce D a at W ho se are u ? Ke Vl u y ae St o re In- M mry/ eo O LTP D ba ata se
  • 37. We use an late-bound schema, sitting over a schemaless store S tructured S tandardisation Layer Raw Data Late Bound Schema
  • 38. Evolutionary Approach • Late-binding makes consolidation incremental – Schematic representation delivered at the ‘last responsible moment’ (schema on demand) – A trade in this model has 4 mandatory nodes. A fully modeled trade has around 800. • The system of record is raw data, not our ‘view’ of it • No schema migration! But this comes at a price.
  • 40. Key based access always scales Client
  • 41. But queries (without the sharding key) always broadcast Client
  • 42. As query complexity increases so does the overhead Client
  • 44. Data Replicas provide hardware isolation Client
  • 45. Scaling • Key based sharding is only sufficient very simple workloads • Course grained shards help (but suffer from skew) • Replication provides useful, if expensive, hardware isolation • Workload management is less useful in my experience
  • 46. Weak consistency forces the problem onto the developer Particularly bad for banks!
  • 47. Scaling two phase commit is hard to do efficiently • Requires distributed lock/clock/counter • Requires synchronisation of all readers & writers
  • 48. Alternatives to traditional 2PC • MVCC over explicit locking • Timestamp based strong consistency – E.g. Granola • Optimistic concurrency control – Leverage short running transactions (avoid cross-network transactions) – Tolerate different temporal viewpoints to reduce synchronization costs.
  • 49. Immutable Data • • • • • Safety ‘As was’ view Sits well with MVCC Efficiency problems Gaining popularity (e.g. Datomic)
  • 50. Use joins to avoid ‘over aggregating’ Joins are ok, so long as they are – Local – via a unique key Trade r Party Trade
  • 51. Memory/Disk Tradeoff • Memory only (possibly overplayed) • Pinned indexes (generally good idea if you can afford the RAM) • Disk resident (best general purpose solution and for very large datasets)
  • 52. Balance flexibility and complexity Operational (real time / MR) Object/S QL S tandardisation Raw Data Relational Analytics
  • 53. Supple at the front, more rigid at the back Raw Access Operational Access Analytic Access D Looser Tighter L M Untyped Object/S QL Reporting Broad Data Coverage Narrow Data Coverage Narrow Query Comprehensive Quer y
  • 54. Principals • • • • Record everything Grow a schema, don’t do it upfront Avoid using a ‘view’ as your system of record. Differentiate between sourced data (out of your control) and generated data (in your control). • Use automated replication (for isolation) as well as sharding (for scale) • Leverage asynchronicity to reduce transaction overheads
  • 55. Consolidation means more trust, less impedance mismatches and managing tighter couplings
  • 56. Target architectures are starting to look more like large applications of cloud enabled services than heterogeneous application conglomerates
  • 57. Are we going back to the mainframe?

Notas do Editor

  1. Think about the systems you built five or ten years ago. Who was involved in the building of a new system in the early 2000s? Who used a relational DB? Who seriously considered using anything else?
  2. Retrospective
  3. (no schema or high level languages)
  4. Companies that grew up around technology.