GaianDB
A dynamic distributed
federated database
Dale Lane
@dalelane
A massively over-simplified view of
data-warehousing...
The “Internet of Things”
GaianDB: a dynamic distributed federated database
Federated data
Network of distributed databases
A dynamic network
Biologically-Inspired Self-Organisation
Exploit natural selection in nature to build better networks
Robust self-organizing network architectures
Frameworks and algorithms for robust fault-tolerant information dissemination
Robust communications with minimal complexity or human control
Gaian database
[Diagram, four frames: a network of twelve nodes (N0 to N11); the first three frames are labelled "SQL Query" and the fourth "SQL Queries"]
Queries are routed to all database nodes: a flood query, but one that retrieves only the data required to satisfy the query
Exchanges query traffic in the network for data traffic, aiming to minimize total traffic
Predicated on a ‘store data locally, read data from anywhere’ paradigm
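As a rough illustration of "store data locally, read data from anywhere", the sketch below sends the same SQL to several independent stores and merges only the rows each one returns. It is not GaianDB code: Python's built-in sqlite3 stands in for the per-node Derby engine, and the `readings` table, `make_node` and `federated_query` are hypothetical names invented for this example.

```python
import sqlite3

def make_node(rows):
    """Create one in-memory database standing in for a node's local store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (meter_id TEXT, kwh REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    return conn

def federated_query(nodes, sql, params=()):
    """Run the same SQL on every node and concatenate the rows returned."""
    results = []
    for conn in nodes:
        # The WHERE clause is evaluated at the node, so only matching
        # rows travel back to the caller.
        results.extend(conn.execute(sql, params).fetchall())
    return results

# Three hypothetical nodes, each holding only its own data.
nodes = [
    make_node([("m1", 1.5), ("m2", 9.0)]),
    make_node([("m3", 12.25)]),
    make_node([("m4", 0.5)]),
]
rows = federated_query(
    nodes, "SELECT meter_id, kwh FROM readings WHERE kwh > ?", (5.0,)
)
print(sorted(rows))
```

Every node receives the query, but only two of the four stored rows are sent back, which is the trade of query traffic for data traffic described above.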
Architecture
GaianDB
Derby Engine: parsing, compilation, execution
GaianPStmtNode VTI: executes queries on physical leaf nodes and propagates the original SQL (plus queryID and steps state info) to linked Gaian nodes
Diagram labels: Instantiates; Invokes costing methods; Pushes columns and ‘where’ clause in a structure
Data sources shown: MQ(tt) stream data, DB2, Oracle, MS SQLServer, Sybase, MySQL, flat files, in-memory tables, Derby, text index, Derby tables
Original SQL is propagated to other GaianDB nodes
[Diagram, two frames: the twelve-node network (N0 to N11) receiving an SQL query, with one node shown as the "Expanded Node"]
Multithreaded, breadth-first query propagation
Loop detection/handling – no duplicates
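The propagation idea can be sketched as a breadth-first traversal that remembers which nodes have already seen the query. This is a single-threaded simplification, not GaianDB's implementation: the slide describes multithreaded propagation, and in a real node the queryID carried with the SQL would presumably play the role of the `hops` dictionary used here. The topology and the function name `flood_query` are hypothetical.

```python
from collections import deque

def flood_query(links, origin):
    """Breadth-first propagation; returns {node: hop count from origin}."""
    hops = {origin: 0}          # nodes that have already seen this query
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbour in links[node]:
            if neighbour in hops:
                continue        # loop detected: query already seen here
            hops[neighbour] = hops[node] + 1
            queue.append(neighbour)
    return hops

# Hypothetical five-node topology containing a cycle (N0-N1-N2-N0).
links = {
    "N0": ["N1", "N2"],
    "N1": ["N0", "N2", "N3"],
    "N2": ["N0", "N1"],
    "N3": ["N1", "N4"],
    "N4": ["N3"],
}
hops = flood_query(links, "N0")
print(hops)
```

Each node appears exactly once in the result despite the cycle, and the largest hop count is the eccentricity of the querying node.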
Performance – with 1,250 nodes
[Chart: "Query time for 1025 nodes, fetching up to 1025 rows from each". x-axis: rows fetched per node (0 to 1200); y-axis: time in milliseconds (0 to 6000). Series: Query Execute Time, Total Query Time, and a linear fit to Total Query Time, y = 4.217x + 349.251.]
[Chart: "Query Performance". x-axis: number of nodes (0 to 1200); y-axis: query time in milliseconds (0 to 539.0, in steps of 53.9). Series: Average Query Time, Predicted Max (Layers), Predicted Min (Layers).]
Performance questions
The time to propagate a query to all of the nodes in the database, as a function of the number of database nodes (N);
The time to fetch data from across the nodes of the database to a single node, as a function of the volume of data;
The time to fetch data from across the database to multiple nodes concurrently querying, as a function of the number of nodes concurrently querying.
Graph metrics
The eccentricity ε(νi) of a graph vertex νi is the maximum graph distance between νi and any other vertex νj of G, i.e. the "longest shortest path" from νi to any other vertex of the graph.
The maximum eccentricity is the graph diameter Gd. The minimum graph eccentricity is the graph radius Gr.
We define the size of G as the number of vertices N and the number of connections at each vertex as the vertex degree δi (1 ≤ i ≤ N).
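These definitions can be checked on a small example. The sketch below computes eccentricity by breadth-first search on an unweighted, connected graph and derives the diameter, radius and vertex degrees from it. The four-vertex path graph and the function name `eccentricity` are chosen purely for illustration.

```python
from collections import deque

def eccentricity(graph, start):
    """Longest shortest-path distance from start to any other vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())

# Illustrative graph: a simple path 0 - 1 - 2 - 3 (assumed connected).
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

ecc = {v: eccentricity(graph, v) for v in graph}
diameter = max(ecc.values())   # Gd: maximum eccentricity
radius = min(ecc.values())     # Gr: minimum eccentricity
degree = {v: len(graph[v]) for v in graph}

print(ecc, diameter, radius, degree)
```

For the path graph the end vertices have eccentricity 3 and the inner vertices 2, so the diameter is 3 and the radius is 2.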
Biologically inspired self-organisation
[Chart: graph dimension (edges, 0 to 10) against number of nodes N (0 to 1000). Series: Radius, Diameter, (1+e)ln(N), (1-e)ln(N).]
Network growth by preferential attachment
Using a fitness function at each node
Limit maximum vertex degree = 10
Gd = nint[(1+e) * ln(N)]
Gr = nint[(1-e) * ln(N)]
e = 0.24
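A small worked example of the two expressions, assuming that nint means "round to the nearest integer" (the slide does not define it) and using e = 0.24 as given. The helper names are hypothetical.

```python
import math

E = 0.24  # value of e given on the slide

def nint(x):
    """Nearest integer, with halves rounded up (assumed meaning of nint)."""
    return math.floor(x + 0.5)

def predicted_diameter(n, e=E):
    """Gd = nint[(1+e) * ln(N)]"""
    return nint((1 + e) * math.log(n))

def predicted_radius(n, e=E):
    """Gr = nint[(1-e) * ln(N)]"""
    return nint((1 - e) * math.log(n))

for n in (10, 100, 1025):
    print(n, predicted_radius(n), predicted_diameter(n))
```

For N = 1025, ln(N) is about 6.93, so the predicted radius is 5 and the predicted diameter is 9. Both grow only logarithmically with the number of nodes.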
Query propagation time
The predicted maximum (Tmax) and minimum (Tmin) times to execute the flood query are:
Tmax = (Gd + 1)(TL + Tp)
Tmin = (Gr + 1)(TL + Tp)
where TL = link latency and Tp = processor delay,
with the predicted query execution time from any node ν (Tν) being:
Tν = (ε(ν) + 1)(TL + Tp)
Hence, substituting ε(ν) = nint[B * ln(N)], with B between (1-e) and (1+e):
Tν = (nint[B * ln(N)] + 1)(TL + Tp)
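The bounds Tmax = (Gd + 1)(TL + Tp) and Tmin = (Gr + 1)(TL + Tp) can be evaluated numerically once Gd and Gr are taken from nint[(1±e) ln(N)] with e = 0.24. In the sketch below the per-hop cost TL + Tp = 53.9 ms is an assumption read off the tick spacing of the deck's query-time charts, and the split between link latency and processor delay is invented for illustration. nint is again assumed to mean nearest integer.

```python
import math

E = 0.24  # value of e given in the deck

def nint(x):
    """Nearest integer, with halves rounded up (assumed meaning of nint)."""
    return math.floor(x + 0.5)

def t_max_ms(n, link_latency_ms, processor_delay_ms, e=E):
    """Tmax = (Gd + 1)(TL + Tp), with Gd = nint[(1+e) ln N]."""
    gd = nint((1 + e) * math.log(n))
    return (gd + 1) * (link_latency_ms + processor_delay_ms)

def t_min_ms(n, link_latency_ms, processor_delay_ms, e=E):
    """Tmin = (Gr + 1)(TL + Tp), with Gr = nint[(1-e) ln N]."""
    gr = nint((1 - e) * math.log(n))
    return (gr + 1) * (link_latency_ms + processor_delay_ms)

# Hypothetical split of an assumed 53.9 ms per-hop cost.
TL_MS, TP_MS = 50.0, 3.9
N = 1025

t_max = t_max_ms(N, TL_MS, TP_MS)
t_min = t_min_ms(N, TL_MS, TP_MS)
print(round(t_min, 1), round(t_max, 1))
```

Under these assumptions a 1025-node network gives roughly 323.4 ms for the lower bound (6 hops) and 539.0 ms for the upper bound (10 hops).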
Measured query propagation
[Chart: "Individual Query Time Scalability". x-axis: number of nodes (0 to 1200); y-axis: query time in ms (0 to 592.9, in steps of 53.9). Series: Average Query Time, Predicted Max (Diameter+1), Predicted Min (Radius+1), Queried node eccentricity+1.]
[Chart: "Individual Query Time Scalability". x-axis: number of nodes (0 to 100); y-axis: query time in ms (0 to 323.4, in steps of 53.9). Series: Individual Query Times, Average Query Time, Queried node eccentricity+1.]
Measured data fetch
[Chart: "Query time to fetch 1 million rows". x-axis: total rows fetched (0 to 1,200,000); y-axis: time in milliseconds (0 to 6000). Series: Total Query Time 1025 nodes, Total Query Time 1 node, Total Query Time 1 node indexed, with linear fits for the 1025-node and 1-node series. Fit equations shown: y = 4.217x + 349.251 and y = 1.7383x + 678.141.]
Example uses
Smart Metering: centralised write
Smart Metering: centralised read
Smart Metering: distributed federated write
Smart Metering: distributed federated read
Other uses...
http://www.alphaworks.ibm.com/tech/gaiandb
Image credits
Background: YouTube video “The Internet of Things”, IBM
http://www.youtube.com/watch?v=sfEbMV295Kk
Icons: DB and envelope icons, Tim Morgan
http://flickr.com/photos/timothymorgan/sets/1615269
Microsoft Excel icon, Vincent Garnier (courtesy of IconArchive)
http://iconarchive.com/show/softdimension-icons-by-benjigarner/Excel-icon.html
Photo of car mechanics, Tomas
http://flickr.com/photos/tma/2264878
All other images original from GaianDB work

