SlideShare a Scribd company logo
1 of 23
Reconceiving the Web as a
Distributed (NoSQL) Data
System
Daniel Austin
PayPal, Inc.
NoSQL Now! Conference
August 22, 2013
V1.2
The Big Idea
“The World-Wide Web is the World’s
Largest NoSQL Distributed Data
System”
The Mind Map
History
• DNS (1983)
The first large-scale
DDS, using Flat files
• WWW (1989)
“a single user-interface to many
large classes of stored information
such as reports, notes, data-
bases, computer documentation
and on-line systems help”
Berners-Lee & Cailliau,
1989
But Why NoSQL?
WWWDB: Anatomy
WWW
HTML
(Presentation)
URI
(Addressing)
HTTP
(Transport)
Typology of Hyperlink Queries
• Hypertext links come in two flavors:
transitive and intransitive
• Transitive queries are usually for inactive
content – presentation material to
supplement the user’s queried data
• Intransitive queries are user-actuated
and usually provide navigation and
business logic for the query
Data Clients Query Data
Sources
What Do HTTP URIs Identify?
• Not a single resource
• WWWDB query syntax is split between
HTTP ‘verbs’ (POST, GET, PUT,
DELETE) and their objects, addressed
by URIs
• URI encapsulates a resource as the
object identified by a query
(Note that transitive and intransitive
hyperlinks almost always go to different
locations)
CDN as a Caching Mechanism
• CDNs such as Akamai and Cloudfront
provide local caching services for
WWWDB, mostly for static, presentation-
related objects
– Frequency-based caching for transitive
hyperlinks
– Most secondary queries go to the CDN
– 95%+ of all the bytes transported over the
Web
– ~90% of all WWWDB queries (HTTP
requests/responses)
APIs as Secondary Queries
• Active Subqueries
• Usually dynamic
• URIs function as a selection mechanism
• Often User-Actuated, Intransitive Events
• Query results often modify the display
REST as a Query Syntax
Mechanism
• Common Semantics
– REST provides a
means of specifying
the proper query for
an object in a specific
state
• Demands NoSQL
due to state
constraints
• Uses query strings
for ranged searches
Image courtesy IBM
Indexing WWWDB
• Google, Bing, Yahoo! and other ‘index
searches’ on WWWDB
– Inconsistent results are accepted
• Query Cache or a Data Cache?
• Secondary Query Routing
• Alternative query indices – Wolfram
Alpha, Index Mundi, Twitter act as
‘almanacs’
Does the CAP Theorem Apply?
Yes, It Does, But Only Partially
• Partition and Availability – 404’s, DDOS
• WWWDB Relaxes the Consistency
Constraint
• We accept inconsistent queries and
broken links as a tradeoff for real-time
availability and high-velocity updates
But We Can Do Better!
Drawbacks of the CAP Model
• Caching – All data is Not cached
everywhere
– Some sites are single-location/single source
– Hard (static) assets are far more widely
cached
• What does CAP mean when data is only
partially distributed?
– Very little – consistency only applies to part
of the queries
Improving WWWDB
• Better Data Clients
– HTML5 provides new query mechanism via
Web Sockets, WebStorage, and other
means
– Still mostly presentation-level improvments
• Better Caching, Distribution & Tranport
– Work currently being done at IETF on HTTP
2.0
• Better Queries
– Very little work being done – more on this
RDF and the Semantic Web
• Changes query patterns but not storage
– Queries based on semantic ID of resource
• Requires content to be semantically
labeled
• Work on Sparql reduces query
limitations
– But may also make things slower (!)
• Cloud computing and query distribution
will prove a more powerful force for
improving WWWDB than semantic
Browsers as Data Clients
• Presentation First!
– Data is treated as secondary
• Designed for Browsing Not Querying
– Query patterns are inefficient
– Semi-stateful nature of Web sessions
• Bedeviled with Legacy Issues
Optimizing Web Queries
• REST doesn’t imply FAST
– Use a domain model to limit query
endpoints
– May require unnecessary requests
• Query-string semantics allows for joins,
arbitrary comparison
• Recognize that some queries require
state and use it
• Distribute intransitive queries more
widely
Reforming Hypertext for
Querying WWWDB
• Enlarge the number of link types
• Distinguish transitive links
• Add bidirectional linking
• Enhance the semantics of the query
string
• Make hypertext more useful for mobile
and devices
IPv6 and Query Routing for
WWWDB
• The IPv6 space is large enough to allow
for multiple query addressing schemes:
– Semantic addressing of objects by type
– Objects in the Internet of Things
– Dynamic, context driven addressing
Scaling the WWWDB
• This may require
expanding our notions
of URIs and links
(queries)
• Semantic mapping of
resources requires
additional complexity for
queries
• Explicit state
management for
efficiency
Every system has a
scaling limit
Final Thoughts
• The Web is the largest NoSQL Distributed
Data System
– URIs address the resultset of a NoSQL query
– Transitive and Intransitive hyperlinks
• We can add power and simplicity to our
queries by carefully reforming the URI
syntax and the current implementations of
hypertext
• HTTP and HTML are undergoing
significant evolution – now it’s time for
URIs!
Reconceiving the Web as a
Distributed Data System
Thank You!
Daniel Austin
PayPal, Inc.
NoSQL Now! Conference
August 22, 2013
V1.2
@daniel_b_austin

More Related Content

What's hot

No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageBethmi Gunasekara
 
Appache Cassandra
Appache Cassandra  Appache Cassandra
Appache Cassandra nehabsairam
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and UsesSuvradeep Rudra
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Databasenehabsairam
 
Big Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingBig Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingDATAVERSITY
 
Strata+Hadoop World NY 2016 - Avinash Ramineni
Strata+Hadoop World NY 2016 - Avinash RamineniStrata+Hadoop World NY 2016 - Avinash Ramineni
Strata+Hadoop World NY 2016 - Avinash RamineniAvinash Ramineni
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sqlRam kumar
 
NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and HowBigBlueHat
 
Big Data and NoSQL in Microsoft-Land
Big Data and NoSQL in Microsoft-LandBig Data and NoSQL in Microsoft-Land
Big Data and NoSQL in Microsoft-LandAndrew Brust
 
Thu 1400 cagle_kurt_color
Thu 1400 cagle_kurt_colorThu 1400 cagle_kurt_color
Thu 1400 cagle_kurt_colorDATAVERSITY
 
Introduction to nosql | NoSQL databases
Introduction to nosql | NoSQL databasesIntroduction to nosql | NoSQL databases
Introduction to nosql | NoSQL databasesShilpaKrishna6
 
Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"
Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"
Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"Fwdays
 

What's hot (20)

No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data Storage
 
Appache Cassandra
Appache Cassandra  Appache Cassandra
Appache Cassandra
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and Uses
 
NoSql Brownbag
NoSql BrownbagNoSql Brownbag
NoSql Brownbag
 
Selecting best NoSQL
Selecting best NoSQL Selecting best NoSQL
Selecting best NoSQL
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Database
 
Big Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data ModelingBig Challenges in Data Modeling: NoSQL and Data Modeling
Big Challenges in Data Modeling: NoSQL and Data Modeling
 
Strata+Hadoop World NY 2016 - Avinash Ramineni
Strata+Hadoop World NY 2016 - Avinash RamineniStrata+Hadoop World NY 2016 - Avinash Ramineni
Strata+Hadoop World NY 2016 - Avinash Ramineni
 
NoSql
NoSqlNoSql
NoSql
 
Nosql data models
Nosql data modelsNosql data models
Nosql data models
 
RDBMS vs NoSQL
RDBMS vs NoSQLRDBMS vs NoSQL
RDBMS vs NoSQL
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sql
 
NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and How
 
Big Data and NoSQL in Microsoft-Land
Big Data and NoSQL in Microsoft-LandBig Data and NoSQL in Microsoft-Land
Big Data and NoSQL in Microsoft-Land
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
Thu 1400 cagle_kurt_color
Thu 1400 cagle_kurt_colorThu 1400 cagle_kurt_color
Thu 1400 cagle_kurt_color
 
Introduction to nosql | NoSQL databases
Introduction to nosql | NoSQL databasesIntroduction to nosql | NoSQL databases
Introduction to nosql | NoSQL databases
 
Mongo db
Mongo dbMongo db
Mongo db
 
Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"
Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"
Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 

Similar to Reconceiving the Web as a Distributed (NoSQL) Data System

UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxRahul Borate
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxRahul Borate
 
Oracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data ArchitectureOracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data ArchitectureArthur Gimpel
 
Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL David Smelker
 
Dropping ACID: Wrapping Your Mind Around NoSQL Databases
Dropping ACID: Wrapping Your Mind Around NoSQL DatabasesDropping ACID: Wrapping Your Mind Around NoSQL Databases
Dropping ACID: Wrapping Your Mind Around NoSQL DatabasesKyle Banerjee
 
NoSql - mayank singh
NoSql - mayank singhNoSql - mayank singh
NoSql - mayank singhMayank Singh
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...Joaquin Delgado PhD.
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...S. Diana Hu
 
NoSQL in the context of Social Web
NoSQL in the context of Social WebNoSQL in the context of Social Web
NoSQL in the context of Social WebBogdan Gaza
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...Institute of Contemporary Sciences
 

Similar to Reconceiving the Web as a Distributed (NoSQL) Data System (20)

NOsql Presentation.pdf
NOsql Presentation.pdfNOsql Presentation.pdf
NOsql Presentation.pdf
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
 
NoSQL
NoSQLNoSQL
NoSQL
 
NoSQL.pptx
NoSQL.pptxNoSQL.pptx
NoSQL.pptx
 
Oracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data ArchitectureOracle Week 2016 - Modern Data Architecture
Oracle Week 2016 - Modern Data Architecture
 
Web Mining
Web MiningWeb Mining
Web Mining
 
Web mining
Web miningWeb mining
Web mining
 
BigData, NoSQL & ElasticSearch
BigData, NoSQL & ElasticSearchBigData, NoSQL & ElasticSearch
BigData, NoSQL & ElasticSearch
 
Apache drill
Apache drillApache drill
Apache drill
 
Apache Drill at ApacheCon2014
Apache Drill at ApacheCon2014Apache Drill at ApacheCon2014
Apache Drill at ApacheCon2014
 
Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL
 
Foundations of business intelligence databases and information management
Foundations of business intelligence databases and information managementFoundations of business intelligence databases and information management
Foundations of business intelligence databases and information management
 
Revision
RevisionRevision
Revision
 
Dropping ACID: Wrapping Your Mind Around NoSQL Databases
Dropping ACID: Wrapping Your Mind Around NoSQL DatabasesDropping ACID: Wrapping Your Mind Around NoSQL Databases
Dropping ACID: Wrapping Your Mind Around NoSQL Databases
 
NoSql - mayank singh
NoSql - mayank singhNoSql - mayank singh
NoSql - mayank singh
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 
NoSQL in the context of Social Web
NoSQL in the context of Social WebNoSQL in the context of Social Web
NoSQL in the context of Social Web
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 

More from Daniel Austin

Next generation web protocols
Next generation web protocolsNext generation web protocols
Next generation web protocolsDaniel Austin
 
Always Offline: Delay-Tolerant Networking for the Internet of Things
Always Offline: Delay-Tolerant Networking for the Internet of ThingsAlways Offline: Delay-Tolerant Networking for the Internet of Things
Always Offline: Delay-Tolerant Networking for the Internet of ThingsDaniel Austin
 
Performance: How Fast is Fast Enough?
Performance: How Fast is Fast Enough?Performance: How Fast is Fast Enough?
Performance: How Fast is Fast Enough?Daniel Austin
 
Big Data and the Future of Money 2014
Big Data and the Future of Money 2014Big Data and the Future of Money 2014
Big Data and the Future of Money 2014Daniel Austin
 
Big data comes in small packages v1.2
Big data comes in small packages v1.2Big data comes in small packages v1.2
Big data comes in small packages v1.2Daniel Austin
 
Designing Delay-tolerant Data Services for the Network of Things
Designing Delay-tolerant Data Services for the Network of ThingsDesigning Delay-tolerant Data Services for the Network of Things
Designing Delay-tolerant Data Services for the Network of ThingsDaniel Austin
 
Web Performance Bootcamp 2014
Web Performance Bootcamp 2014Web Performance Bootcamp 2014
Web Performance Bootcamp 2014Daniel Austin
 
HTML5, HTTP2, and You 1.1
HTML5, HTTP2, and You 1.1HTML5, HTTP2, and You 1.1
HTML5, HTTP2, and You 1.1Daniel Austin
 
Managing Performance Globally with MySQL
Managing Performance Globally with MySQLManaging Performance Globally with MySQL
Managing Performance Globally with MySQLDaniel Austin
 
Web Performance BootCamp 2013
Web Performance BootCamp 2013Web Performance BootCamp 2013
Web Performance BootCamp 2013Daniel Austin
 
Perspectives on the Evolution of HTML
Perspectives on the Evolution of HTMLPerspectives on the Evolution of HTML
Perspectives on the Evolution of HTMLDaniel Austin
 
The Fastest Possible Search Algorithm: Grover's Search and the World of Quant...
The Fastest Possible Search Algorithm: Grover's Search and the World of Quant...The Fastest Possible Search Algorithm: Grover's Search and the World of Quant...
The Fastest Possible Search Algorithm: Grover's Search and the World of Quant...Daniel Austin
 
Quantum Computing in a Nutshell: Grover's Search and the World of Quantum Com...
Quantum Computing in a Nutshell: Grover's Search and the World of Quantum Com...Quantum Computing in a Nutshell: Grover's Search and the World of Quantum Com...
Quantum Computing in a Nutshell: Grover's Search and the World of Quantum Com...Daniel Austin
 
Big data and the Future of Money (World Big Data Congress 2013)
Big data and the Future of Money (World Big Data Congress 2013)Big data and the Future of Money (World Big Data Congress 2013)
Big data and the Future of Money (World Big Data Congress 2013)Daniel Austin
 
Big Data is a Big Scam Most of the Time! (MySQL Connect Keynote 2012)
Big Data is a Big Scam Most of the Time! (MySQL Connect Keynote 2012)Big Data is a Big Scam Most of the Time! (MySQL Connect Keynote 2012)
Big Data is a Big Scam Most of the Time! (MySQL Connect Keynote 2012)Daniel Austin
 
Performance analysisclass
Performance analysisclassPerformance analysisclass
Performance analysisclassDaniel Austin
 
Yes sql08 inmemorydb
Yes sql08 inmemorydbYes sql08 inmemorydb
Yes sql08 inmemorydbDaniel Austin
 
The Fastest Possible Search Algorithm
The Fastest Possible Search AlgorithmThe Fastest Possible Search Algorithm
The Fastest Possible Search AlgorithmDaniel Austin
 
A Global In-memory Data System for MySQL
A Global In-memory Data System for MySQLA Global In-memory Data System for MySQL
A Global In-memory Data System for MySQLDaniel Austin
 
Notes on a High-Performance JSON Protocol
Notes on a High-Performance JSON ProtocolNotes on a High-Performance JSON Protocol
Notes on a High-Performance JSON ProtocolDaniel Austin
 

More from Daniel Austin (20)

Next generation web protocols
Next generation web protocolsNext generation web protocols
Next generation web protocols
 
Always Offline: Delay-Tolerant Networking for the Internet of Things
Always Offline: Delay-Tolerant Networking for the Internet of ThingsAlways Offline: Delay-Tolerant Networking for the Internet of Things
Always Offline: Delay-Tolerant Networking for the Internet of Things
 
Performance: How Fast is Fast Enough?
Performance: How Fast is Fast Enough?Performance: How Fast is Fast Enough?
Performance: How Fast is Fast Enough?
 
Big Data and the Future of Money 2014
Big Data and the Future of Money 2014Big Data and the Future of Money 2014
Big Data and the Future of Money 2014
 
Big data comes in small packages v1.2
Big data comes in small packages v1.2Big data comes in small packages v1.2
Big data comes in small packages v1.2
 
Designing Delay-tolerant Data Services for the Network of Things
Designing Delay-tolerant Data Services for the Network of ThingsDesigning Delay-tolerant Data Services for the Network of Things
Designing Delay-tolerant Data Services for the Network of Things
 
Web Performance Bootcamp 2014
Web Performance Bootcamp 2014Web Performance Bootcamp 2014
Web Performance Bootcamp 2014
 
HTML5, HTTP2, and You 1.1
HTML5, HTTP2, and You 1.1HTML5, HTTP2, and You 1.1
HTML5, HTTP2, and You 1.1
 
Managing Performance Globally with MySQL
Managing Performance Globally with MySQLManaging Performance Globally with MySQL
Managing Performance Globally with MySQL
 
Web Performance BootCamp 2013
Web Performance BootCamp 2013Web Performance BootCamp 2013
Web Performance BootCamp 2013
 
Perspectives on the Evolution of HTML
Perspectives on the Evolution of HTMLPerspectives on the Evolution of HTML
Perspectives on the Evolution of HTML
 
The Fastest Possible Search Algorithm: Grover's Search and the World of Quant...
The Fastest Possible Search Algorithm: Grover's Search and the World of Quant...The Fastest Possible Search Algorithm: Grover's Search and the World of Quant...
The Fastest Possible Search Algorithm: Grover's Search and the World of Quant...
 
Quantum Computing in a Nutshell: Grover's Search and the World of Quantum Com...
Quantum Computing in a Nutshell: Grover's Search and the World of Quantum Com...Quantum Computing in a Nutshell: Grover's Search and the World of Quantum Com...
Quantum Computing in a Nutshell: Grover's Search and the World of Quantum Com...
 
Big data and the Future of Money (World Big Data Congress 2013)
Big data and the Future of Money (World Big Data Congress 2013)Big data and the Future of Money (World Big Data Congress 2013)
Big data and the Future of Money (World Big Data Congress 2013)
 
Big Data is a Big Scam Most of the Time! (MySQL Connect Keynote 2012)
Big Data is a Big Scam Most of the Time! (MySQL Connect Keynote 2012)Big Data is a Big Scam Most of the Time! (MySQL Connect Keynote 2012)
Big Data is a Big Scam Most of the Time! (MySQL Connect Keynote 2012)
 
Performance analysisclass
Performance analysisclassPerformance analysisclass
Performance analysisclass
 
Yes sql08 inmemorydb
Yes sql08 inmemorydbYes sql08 inmemorydb
Yes sql08 inmemorydb
 
The Fastest Possible Search Algorithm
The Fastest Possible Search AlgorithmThe Fastest Possible Search Algorithm
The Fastest Possible Search Algorithm
 
A Global In-memory Data System for MySQL
A Global In-memory Data System for MySQLA Global In-memory Data System for MySQL
A Global In-memory Data System for MySQL
 
Notes on a High-Performance JSON Protocol
Notes on a High-Performance JSON ProtocolNotes on a High-Performance JSON Protocol
Notes on a High-Performance JSON Protocol
 

Recently uploaded

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 

Recently uploaded (20)

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 

Reconceiving the Web as a Distributed (NoSQL) Data System

  • 1. Reconceiving the Web as a Distributed (NoSQL) Data System Daniel Austin PayPal, Inc. NoSQL Now! Conference August 22, 2013 V1.2
  • 2. The Big Idea “The World-Wide Web is the World’s Largest NoSQL Distributed Data System”
  • 4. History • DNS (1983) The first large-scale DDS, using Flat files • WWW (1989) “a single user-interface to many large classes of stored information such as reports, notes, data- bases, computer documentation and on-line systems help” Berners-Lee & Cailliau, 1989 But Why NoSQL?
  • 6. Typology of Hyperlink Queries • Hypertext links come in two flavors: transitive and intransitive • Transitive queries are usually for inactive content – presentation material to supplement the user’s queried data • Intransitive queries are user-actuated and usually provide navigation and business logic for the query
  • 7. Data Clients Query Data Sources
  • 8. What Do HTTP URIs Identify? • Not a single resource • WWWDB query syntax is split between HTTP ‘verbs’ (POST, GET, PUT, DELETE) and their objects, addressed by URIs • URI encapsulates a resource as the object identified by a query (Note that transitive and intransitive hyperlinks almost always go to different locations)
  • 9. CDN as a Caching Mechanism • CDNs such as Akamai and Cloudfront provide local caching services for WWWDB, mostly for static, presentation- related objects – Frequency-based caching for transitive hyperlinks – Most secondary queries go to the CDN – 95%+ of all the bytes transported over the Web – ~90% of all WWWDB queries (HTTP requests/responses)
  • 10. APIs as Secondary Queries • Active Subqueries • Usually dynamic • URIs function as a selection mechanism • Often User-Actuated, Intransitive Events • Query results often modify the display
  • 11. REST as a Query Syntax Mechanism • Common Semantics – REST provides a means of specifying the proper query for an object in a specific state • Demands NoSQL due to state constraints • Uses query strings for ranged searches Image courtesy IBM
  • 12. Indexing WWWDB • Google, Bing, Yahoo! and other ‘index searches’ on WWWDB – Inconsistent results are accepted • Query Cache or a Data Cache? • Secondary Query Routing • Alternative query indices – Wolfram Alpha, Index Mundi, Twitter act as ‘almanacs’
  • 13. Does the CAP Theorem Apply? Yes, It Does, But Only Partially • Partition and Availability – 404’s, DDOS • WWWDB Relaxes the Consistency Constraint • We accept inconsistent queries and broken links as a tradeoff for real-time availability and high-velocity updates But We Can Do Better!
  • 14. Drawbacks of the CAP Model • Caching – All data is Not cached everywhere – Some sites are single-location/single source – Hard (static) assets are far more widely cached • What does CAP mean when data is only partially distributed? – Very little – consistency only applies to part of the queries
  • 15. Improving WWWDB • Better Data Clients – HTML5 provides new query mechanism via Web Sockets, WebStorage, and other means – Still mostly presentation-level improvments • Better Caching, Distribution & Tranport – Work currently being done at IETF on HTTP 2.0 • Better Queries – Very little work being done – more on this
  • 16. RDF and the Semantic Web • Changes query patterns but not storage – Queries based on semantic ID of resource • Requires content to be semantically labeled • Work on Sparql reduces query limitations – But may also make things slower (!) • Cloud computing and query distribution will prove a more powerful force for improving WWWDB than semantic
  • 17. Browsers as Data Clients • Presentation First! – Data is treated as secondary • Designed for Browsing Not Querying – Query patterns are inefficient – Semi-stateful nature of Web sessions • Bedeviled with Legacy Issues
  • 18. Optimizing Web Queries • REST doesn’t imply FAST – Use a domain model to limit query endpoints – May require unnecessary requests • Query-string semantics allows for joins, arbitrary comparison • Recognize that some queries require state and use it • Distribute intransitive queries more widely
  • 19. Reforming Hypertext for Querying WWWDB • Enlarge the number of link types • Distinguish transitive links • Add bidirectional linking • Enhance the semantics of the query string • Make hypertext more useful for mobile and devices
  • 20. IPv6 and Query Routing for WWWDB • The IPv6 space is large enough to allow for multiple query addressing schemes: – Semantic addressing of objects by type – Objects in the Internet of Things – Dynamic, context driven addressing
  • 21. Scaling the WWWDB • This may require expanding our notions of URIs and links (queries) • Semantic mapping of resources requires additional complexity for queries • Explicit state management for efficiency Every system has a scaling limit
  • 22. Final Thoughts • The Web is the largest NoSQL Distributed Data System – URIs address the resultset of a NoSQL query – Transitive and Intransitive hyperlinks • We can add power and simplicity to our queries by carefully reforming the URI syntax and the current implementations of hypertext • HTTP and HTML are undergoing significant evolution – now it’s time for URIs!
  • 23. Reconceiving the Web as a Distributed Data System Thank You! Daniel Austin PayPal, Inc. NoSQL Now! Conference August 22, 2013 V1.2 @daniel_b_austin

Editor's Notes

  1. We’ll use this idea to suggest improvments.
  2. It has to be a ‘NoSQL’ system because it’s stateless and SQL is inherently stateful.
  3. Note on CAP not applying goes here.