Database
Survival…
Robin Bloor, Ph D
Thanks to our Sponsors!
Database Disruption
The forces of nature
often converge to
transform the very
foundations of our
infrastructure.
In the database
landscape, recent
developments have
resulted in a massive
transformation of
the DBMS market.
Understanding your
requirements is key
to success these days.
Presentation Sequence
1 What is a Database
exactly?
2 The Database
Landscape
3 The Data Lake
Phenomenon
What is a
Database?
Database Fundamentals
• Built for a collection of resources – which could be engineered for the application
• Shares data among multiple concurrent users
• Optimizes performance
• Handles resilience
• Provides ACID properties to some degree
Multiple Database Roles
Scale is a factor!
Hardware Factors
• CPUs, GPUs & FPGAs
• Cross breeding
• 3D XPoint and PCM (and Memristor?)
• SSDs & parallel access
• Parallel hardware architectures
Performance is accelerating
and costs continue to fall.
The Cloud
• In theory, a cloud database is no different from an on-premises one
• Most databases are now available in the cloud
• Some databases are cloud focused (Snowflake, Redshift)
• Some are hybrid (NuoDB is a good example)
Data Growth
Corporate
Databases
+ Unstructured Data
+ Partner & Customer Data
+ Web Data
+ Social Network Data
+ Streaming Data
+ IoT Data
+ Personal Data
+ Log File Data
Data growth is roughly 55% pa. Always has been.
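To put the 55% figure in perspective, the arithmetic of compound growth can be sketched as follows. The function names are illustrative, and the rate is the slide's rough estimate rather than a measured value.

```python
import math

def growth_factor(annual_rate: float, years: float) -> float:
    """Multiplicative growth after `years` of compounding at `annual_rate`."""
    return (1.0 + annual_rate) ** years

def years_to_multiply(annual_rate: float, factor: float) -> float:
    """Years needed for the data volume to grow by `factor`."""
    return math.log(factor) / math.log(1.0 + annual_rate)

RATE = 0.55  # the slide's rough annual growth estimate (an assumption here)

doubling_time = years_to_multiply(RATE, 2)    # about 1.58 years
tenfold_time = years_to_multiply(RATE, 10)    # about 5.25 years
five_year_factor = growth_factor(RATE, 5)     # about 8.9x
```

At this rate, volumes double roughly every 19 months and grow tenfold in a little over five years, which is why capacity plans based on linear growth tend to fall short.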
The Global Map and Data Options
• Move the data to the processing
• Move the processing to the data
• Move the processing and the data
• Shard
There will not be a single physical database (or data lake) for a
multitude of reasons.
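The "shard" option above can be illustrated with a minimal sketch of hash-based shard routing. This is one common technique, not something the slides prescribe; the `customer-N` key format and the four-shard layout are hypothetical.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer so the result is the same
    # across processes and machines (unlike Python's built-in hash()).
    return int.from_bytes(digest[:8], "big") % num_shards

def partition(keys, num_shards):
    """Group keys by the shard that would own them."""
    shards = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards

# Hypothetical keys, for illustration only.
keys = [f"customer-{n}" for n in range(1000)]
shards = partition(keys, num_shards=4)
```

Simple modulo routing like this moves most keys when the shard count changes; schemes such as consistent hashing are often used to limit that reshuffling.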
The
Database
Landscape
Everything in flux
• Hardware (network, storage, servers)
• Data Sources
• Data Staging
• Data Volumes
• Data Flow
• Data Governance
• Query Languages
• Data Usage
• Data Structures
• Schema definition
• Ingest speeds
• Data Workloads
• Applications
NoSQL Confusion
As the graph indicates,
there is some overlap
between SQL databases
and other databases.
What to choose is a use-
case driven decision.
There never was a
“universal database”
and probably there
never will be.
NoSQL World
• Some NDBMS do not attempt to provide all ACID properties.
• Some NDBMS use a distributed scale-out architecture with data redundancy.
• XML DBMS using XQuery are NDBMS.
• Some document stores are NDBMS
• Object databases are NDBMS (Gemstone, Objectivity, ObjectStore, etc.)
• Key value stores
• Graph DBMS are NDBMS
• Large data pools (BigTable, HBase, Mnesia, etc.) are NDBMS
Columnar Database
SQL Merits and Demerits
• SQL: very good for set manipulation.
• Works for OLTP and many query environments.
• Not good for nested data structures (documents, web pages, etc.)
• Not good for ordered data sets
• Not good for data graphs (networks of values)
Not a Swiss Army Knife!
The Impedance Mismatch
• The RDBMS stores data organized according to table structures
• The OO programmer manipulates data organized according to complex object structures, which may have specific methods associated with them.
• The data does not simply map to the structure it has within the database
• Consequently a mapping activity is necessary to get and put data
• Basically: hierarchies, types, result sets, crappy APIs, language bindings, tools.
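The mapping activity described above can be sketched with Python's built-in `sqlite3` module. The `Order` and `LineItem` classes and the two-table layout are hypothetical, chosen only to show one nested object being flattened into rows and rebuilt again.

```python
import sqlite3
from dataclasses import dataclass, field

@dataclass
class LineItem:
    sku: str
    quantity: int

@dataclass
class Order:
    order_id: int
    customer: str
    items: list = field(default_factory=list)

def save_order(conn, order):
    """Flatten one nested object into rows in two tables."""
    conn.execute("INSERT INTO orders VALUES (?, ?)",
                 (order.order_id, order.customer))
    conn.executemany(
        "INSERT INTO line_items VALUES (?, ?, ?)",
        [(order.order_id, i.sku, i.quantity) for i in order.items],
    )

def load_order(conn, order_id):
    """Rebuild the nested object from two separate result sets."""
    row = conn.execute(
        "SELECT order_id, customer FROM orders WHERE order_id = ?",
        (order_id,),
    ).fetchone()
    items = [
        LineItem(sku, qty)
        for sku, qty in conn.execute(
            "SELECT sku, quantity FROM line_items "
            "WHERE order_id = ? ORDER BY rowid",
            (order_id,),
        )
    ]
    return Order(row[0], row[1], items)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE line_items (order_id INTEGER, sku TEXT, quantity INTEGER)")

original = Order(1, "Acme", [LineItem("A-100", 2), LineItem("B-200", 1)])
save_order(conn, original)
restored = load_order(conn, 1)
```

Even for this small object, the application needs two tables, two queries, and hand-written conversion code in each direction.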
The SQL Barrier
• SQL has:
    • DDL (for data definition)
    • DML (for Select, Project and Join)
• But it has little MML (Math) or TML (Time)
• Usually result sets are brought to the client for further analytical manipulation, but this creates problems
• Alternatively, doing all analytical manipulation in the database creates problems
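A small sketch of the barrier, using Python's `sqlite3` and `statistics` modules: the average is a set operation the database handles itself, while the median is computed on the client after fetching every row. The `readings` table is hypothetical, and the sketch assumes an engine with no built-in median aggregate.

```python
import sqlite3
import statistics

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 1.0), ("a", 3.0), ("a", 8.0), ("b", 2.0), ("b", 4.0)],
)

# Set-oriented work stays in the database: one row per sensor comes back.
avg_in_db = dict(
    conn.execute("SELECT sensor, AVG(value) FROM readings GROUP BY sensor")
)

# For the median, every row is shipped to the client and grouped there
# (assumption: the engine offers no median aggregate).
by_sensor = {}
for sensor, value in conn.execute("SELECT sensor, value FROM readings"):
    by_sensor.setdefault(sensor, []).append(value)
median_on_client = {s: statistics.median(v) for s, v in by_sensor.items()}
```

With five rows the difference is invisible; with billions, moving the full result set to the client becomes the dominant cost.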
The Analytics Apps
Advanced Analytic Methods:
• Machine learning
• Statistics
• Numerical methods
• Text mining & text analytics
• Rules engines & constraint programming
• Information theory & IR
• Visualization
• GIS
Database Mismatch
A key problem is that we talk
mostly about computation over data
when we talk about “big data” and
analytics, a potential mismatch for
both relational and NoSQL
Database Workload Parameters
• Read-intensive vs. write-intensive
• Mutable vs. immutable data
• Immediate vs. eventual consistency
• Short vs. long data latency
• Predictable vs. unpredictable data access patterns
• Simple vs. complex data types
Horses for Courses
• Relational row store databases for conventionally tooled low to mid-scale OLTP
• Relational databases for ACID requirements
• Parallel databases (row or column) for unpredictable or variable query workloads
• Specialized databases for complex data query workloads
• NoSQL (KVS, DHT) for high scale OLTP
• NoSQL (KVS, DHT) for low latency read-mostly data access
• Parallel databases (row or column) for analytic workloads over tabular data
• NoSQL / Hadoop for batch analytic workloads over large data volumes
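The pairings above can be read as a rough decision table. The sketch below encodes them as a toy rule function; the `Workload` fields, the rule ordering, and the fallback branches are assumptions made for illustration, not a selection method given in the slides.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    kind: str                       # "oltp", "query", or "analytic"
    scale: str = "mid"              # "low", "mid", or "high"
    needs_acid: bool = False
    tabular: bool = True            # False for complex data types
    batch: bool = False
    predictable: bool = True        # predictable access patterns
    read_mostly_low_latency: bool = False

def suggest_category(w: Workload) -> str:
    """Return a database category following the slide's pairings."""
    if w.needs_acid:
        return "relational database"
    if w.read_mostly_low_latency:
        return "NoSQL (KVS, DHT)"
    if w.kind == "oltp":
        return "NoSQL (KVS, DHT)" if w.scale == "high" else "relational row store"
    if w.kind == "analytic":
        if w.batch and w.scale == "high":
            return "NoSQL / Hadoop"
        if w.tabular:
            return "parallel database (row or column)"
        return "specialized database"   # assumption: non-tabular analytics
    # Remaining case: query workloads.
    if not w.tabular:
        return "specialized database"
    if not w.predictable:
        return "parallel database (row or column)"
    return "relational row store"       # fallback, not taken from the slide

choice = suggest_category(Workload(kind="oltp", scale="high"))
```

Real selections weigh many more factors (cost, skills, tooling, existing estate), so a function like this is at best a starting point for discussion.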
Database Tools: A Call Out
• Have you noticed how databases are not self-running?
• DBAs are in short supply and the need for them is increasing
• Database diversity doesn’t help in this area.
• DBA Tools:
    • SQL analysis
    • Performance analysis
    • Security management
    • Capacity planning
    • Database deployment
• We meet the same problem with data lakes – except that there are very few tools
The Impact of Parallelism
We used to see 10x performance
improvement every 6 years, now we
see 1000x (and that’s just an
approximation) regularly
The Data
Lake
Phenomenon
The Perfect Storm – The Data Lake
• The triumph of Open Source as a business model
• The dominance of Apache
    • Hadoop, the platform for data
    • Spark, for speed
    • Kafka & NiFi for data flow
• The triumph of the cloud and its dominance
• Cost collapse
The Primary Role of the Data Lake
System of Record
Data Governance
Application Platform
The Evolved Conception
[Diagram labels: Static Data Sources; Data Streams; Ingest; Data Lake; Data Governance; Data Lake Mgt; Analytics or BI Apps; ETL; To Databases, Data Marts, Other Apps]
• Static data and data streams
• Real-time data ingest
• Data Governance
• Data Lake Mgt
• Analytics & BI
• Extracts
The data lake becomes
the system of record
Data Bus Processing
[Diagram labels: Metadata Mgt; Data Cleansing; Data Transforms; Data Aggregation; Data Security]
Where feasible, it will be preferable to complete governance processing on the bus, so that it is done at memory speeds rather than disk speeds.
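As a rough illustration of governance steps applied "on the bus", the sketch below chains cleansing, security masking, transformation, and aggregation as Python generators, so each record is processed in memory as it flows past. The record fields (`device`, `value`, `owner_email`) and the Fahrenheit-to-Celsius transform are hypothetical.

```python
from collections import defaultdict

def cleanse(records):
    """Data cleansing: drop records that lack a required field."""
    for r in records:
        if r.get("device") and r.get("value") is not None:
            yield r

def mask(records):
    """Data security: strip a sensitive field before data lands anywhere."""
    for r in records:
        yield {k: v for k, v in r.items() if k != "owner_email"}

def transform(records):
    """Data transform: normalize units (hypothetical: Fahrenheit to Celsius)."""
    for r in records:
        yield {**r, "value": round((r["value"] - 32) * 5 / 9, 2)}

def aggregate(records):
    """Data aggregation: mean value per device."""
    values = defaultdict(list)
    for r in records:
        values[r["device"]].append(r["value"])
    return {d: sum(v) / len(v) for d, v in values.items()}

# Hypothetical stream contents, for illustration only.
stream = [
    {"device": "t1", "value": 212.0, "owner_email": "x@example.com"},
    {"device": "t1", "value": 32.0, "owner_email": "x@example.com"},
    {"device": None, "value": 50.0},
    {"device": "t2", "value": 50.0},
]

result = aggregate(transform(mask(cleanse(iter(stream)))))
```

Because each stage is a generator, no intermediate data set is written out between steps; only the final aggregate is materialized.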
The Full Picture
[Diagram labels: Ingest; Data Cleansing; Data Security; Metadata Mgt; Real-Time Apps; Transform & Aggregate; Search & Query; BI, Visualization & Analytics; Other Apps; Data Lake Mgt; Data Governance; DATA LAKE; To Databases, Data Marts, Other Apps; Archive; Life Cycle Mgt; Extracts]
Servers, Desktops, Mobile, Network Devices, Embedded
Chips, RFID, IoT, The Cloud, OSes, VMs, Log Files, Sys
Mgt Apps, ESBs, Web Services, SaaS, Business Apps,
Office Apps, BI Apps, Workflow, Data Streams, Social...
Data Governance
If data governance was important
before Big Data (and it was), it is
far more important in the era of
Data Lakes
Data Governance
System of record
Data provenance & lineage
Data cleansing
Data security
Data compliance
Data integrity
Data audit record
Data life-cycle mgt
Data meaning
Data Governance is a perpetual
process
The Event-based World
The event-based world is real-
time. The architecture must thus
be real-time.
A TRANSACTION is a
MOLECULE of ATOMIC EVENTS
The ATOM of data has
become the EVENT
Events: Atoms and Molecules
Events
Think of events as drops of water.
They can live in streams, and they
can also live in data pools and data
lakes and databases.
Event Types
q Instantiation Event
q A State Report
q A Trigger Event
q A Correction Event
We also need to consider:
Data Refinement
Aggregations
Homogeneous Collections
Derived Data
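One way to picture the four event types, and a transaction as a "molecule" of atomic events, is the sketch below. The class names, the `payload` field, and the replay rule (later state reports and corrections overwrite earlier values) are assumptions made for illustration; the slides do not define these semantics.

```python
from dataclasses import dataclass
from enum import Enum

class EventType(Enum):
    INSTANTIATION = "instantiation"
    STATE_REPORT = "state_report"
    TRIGGER = "trigger"
    CORRECTION = "correction"

@dataclass
class Event:
    """The 'atom': a single thing that happened to one entity."""
    event_type: EventType
    entity: str
    payload: dict

@dataclass
class Transaction:
    """The 'molecule': an ordered group of atomic events."""
    events: list

def current_state(events):
    """Replay events in order to derive the latest state per entity."""
    state = {}
    for e in events:
        if e.event_type is EventType.TRIGGER:
            continue  # assumption: triggers start work but carry no state
        state.setdefault(e.entity, {}).update(e.payload)
    return state

# Hypothetical order lifecycle, for illustration only.
tx = Transaction([
    Event(EventType.INSTANTIATION, "order-1", {"qty": 2}),
    Event(EventType.TRIGGER, "order-1", {"action": "ship"}),
    Event(EventType.STATE_REPORT, "order-1", {"status": "shipped"}),
    Event(EventType.CORRECTION, "order-1", {"qty": 3}),
])
state = current_state(tx.events)
```

Keeping the raw events and deriving state from them means aggregations and derived data can be recomputed later from the same history.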
§ The pulse and the
threshold alert
§ Some of this involves
distributed processing
§ There are known apps
and unknown apps, so
analytical exploration
needs to be enabled
§ Only aggregations will
migrate
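The points above can be sketched as processing done close to the source: raw readings are checked against a threshold for alerts, and only per-window aggregates are passed upstream. The function name, the fixed-size window, and the sample readings are hypothetical.

```python
def process_at_source(readings, threshold, window):
    """Return (alerts, aggregates) for a list of numeric readings.

    Alerts flag individual readings above `threshold`; aggregates
    summarize each fixed-size window so raw data can stay local.
    """
    alerts = [(i, v) for i, v in enumerate(readings) if v > threshold]
    aggregates = []
    for start in range(0, len(readings), window):
        chunk = readings[start:start + window]
        aggregates.append({
            "start": start,
            "count": len(chunk),
            "min": min(chunk),
            "max": max(chunk),
            "mean": sum(chunk) / len(chunk),
        })
    return alerts, aggregates

# Hypothetical sensor readings, for illustration only.
readings = [20.0, 21.0, 19.0, 35.0, 22.0, 20.0]
alerts, aggregates = process_at_source(readings, threshold=30.0, window=3)
```

Here six raw readings reduce to one alert and two summary records, which is the volume that would migrate toward the central hub.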
[Diagram labels: Sensors, controllers, CPUs; Source Proc.; Depot (x2); Depot Proc.; Central Hub; Central Proc.; Data (x3)]
Event Based IoT Architecture
• Time
• Geographic location
• Virtual/logical location
• Source device & SW
• Device ID
• Derivation (if derived)
• Creator
• Owner
• Permissions
• Status (for replication)
• Metadata
• Audit Trail
• Archive flag
Self-defining data
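A self-defining event record carrying the attributes listed above might look like the following sketch. The field names, types, defaults, and sample values are assumptions; the slide lists only the attribute categories.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional

@dataclass
class SelfDefiningEvent:
    time: str                         # ISO 8601 timestamp
    geo_location: Optional[list]      # [latitude, longitude], if known
    logical_location: str             # virtual/logical location
    source_device: str
    source_software: str
    device_id: str
    creator: str
    owner: str
    permissions: list
    payload: dict                     # the event's actual content
    derivation: Optional[str] = None  # set only for derived data
    status: str = "primary"           # replication status (hypothetical values)
    metadata: dict = field(default_factory=dict)
    audit_trail: list = field(default_factory=list)
    archive: bool = False

# Sample values are invented for illustration.
event = SelfDefiningEvent(
    time="2024-01-01T00:00:00+00:00",
    geo_location=[51.5, -0.1],
    logical_location="plant-1/line-3",
    source_device="thermostat",
    source_software="fw-1.0",
    device_id="sensor-42",
    creator="sensor-42",
    owner="ops-team",
    permissions=["ops-team:read"],
    payload={"temperature_c": 21.5},
)

# The record travels with its own description, so any consumer can
# interpret it without consulting an external schema.
wire = json.dumps(asdict(event), sort_keys=True)
decoded = json.loads(wire)
```

Carrying this context with every event costs bytes on the wire, but it lets governance, replication, and archiving decisions be made wherever the event happens to be.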
Database Survival Guide: Exploratory Webcast
 
A Winning Strategy for the Digital Economy
A Winning Strategy for the Digital EconomyA Winning Strategy for the Digital Economy
A Winning Strategy for the Digital Economy
 
Discovering Big Data in the Fog: Why Catalogs Matter
 Discovering Big Data in the Fog: Why Catalogs Matter Discovering Big Data in the Fog: Why Catalogs Matter
Discovering Big Data in the Fog: Why Catalogs Matter
 
Health Check: Maintaining Enterprise BI
Health Check: Maintaining Enterprise BIHealth Check: Maintaining Enterprise BI
Health Check: Maintaining Enterprise BI
 
Rapid Response: Debugging and Profiling to the Rescue
Rapid Response: Debugging and Profiling to the RescueRapid Response: Debugging and Profiling to the Rescue
Rapid Response: Debugging and Profiling to the Rescue
 
Solving the Really Big Tech Problems with IoT
 Solving the Really Big Tech Problems with IoT Solving the Really Big Tech Problems with IoT
Solving the Really Big Tech Problems with IoT
 
Beyond the Platform: Enabling Fluid Analysis
Beyond the Platform: Enabling Fluid AnalysisBeyond the Platform: Enabling Fluid Analysis
Beyond the Platform: Enabling Fluid Analysis
 
Protect Your Database: High Availability for High Demand Data
 Protect Your Database: High Availability for High Demand Data Protect Your Database: High Availability for High Demand Data
Protect Your Database: High Availability for High Demand Data
 
A Better Understanding: Solving Business Challenges with Data
A Better Understanding: Solving Business Challenges with DataA Better Understanding: Solving Business Challenges with Data
A Better Understanding: Solving Business Challenges with Data
 
The Key to Effective Analytics: Fast-Returning Queries
The Key to Effective Analytics: Fast-Returning QueriesThe Key to Effective Analytics: Fast-Returning Queries
The Key to Effective Analytics: Fast-Returning Queries
 
A Tight Ship: How Containers and SDS Optimize the Enterprise
 A Tight Ship: How Containers and SDS Optimize the Enterprise A Tight Ship: How Containers and SDS Optimize the Enterprise
A Tight Ship: How Containers and SDS Optimize the Enterprise
 
Application Acceleration: Faster Performance for End Users
Application Acceleration: Faster Performance for End Users	Application Acceleration: Faster Performance for End Users
Application Acceleration: Faster Performance for End Users
 
Time's Up! Getting Value from Big Data Now
Time's Up! Getting Value from Big Data NowTime's Up! Getting Value from Big Data Now
Time's Up! Getting Value from Big Data Now
 

Último

Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfCionsystems
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 

Último (20)

Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdf
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 

Database Survival Guide: Exploratory Webcast

  • 2. Thanks to our Sponsors!
  • 3. Database Disruption: The forces of nature often converge to transform the very foundations of our infrastructure. In the database landscape, recent developments have resulted in a massive transformation of the DBMS market. Understanding your requirements is key to success these days.
  • 4. Presentation Sequence: (1) What is a Database exactly? (2) The Database Landscape (3) The Data Lake Phenomenon
  • 6. Database Fundamentals: built for a collection of resources, which could be engineered for the application; shares data among multiple concurrent users; optimizes performance; handles resilience; provides ACID properties to some degree.
  • 8. Hardware Factors: CPUs, GPUs & FPGAs; cross breeding; 3D XPoint and PCM (and memristor?); SSDs & parallel access; parallel hardware architectures. Performance is accelerating and costs continue to fall.
  • 9. The Cloud: a cloud database is no different from an on-prem one, in theory; most databases are now available in the cloud; some databases are cloud focused (Snowflake, Redshift); some are hybrid (NuoDB is a good example).
  • 10. Data Growth: Corporate Databases + Unstructured Data + Partner & Customer Data + Web Data + Social Network Data + Streaming Data + IoT Data + Personal Data + Log File Data. Data growth is roughly 55% pa. Always has been.
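To make the 55% figure concrete, the short sketch below compounds it over several years and derives the implied doubling time. It takes the slide's rough rate at face value; the function names are illustrative and not from the deck.

```python
import math

# The slide's rough figure: data grows about 55% per annum.
ANNUAL_GROWTH = 0.55


def growth_factor(years: float, rate: float = ANNUAL_GROWTH) -> float:
    """Return how many times larger the data is after `years` of compounding."""
    return (1.0 + rate) ** years


def doubling_time_years(rate: float = ANNUAL_GROWTH) -> float:
    """Return the number of years needed for the data volume to double."""
    return math.log(2.0) / math.log(1.0 + rate)


if __name__ == "__main__":
    print(f"doubling time: {doubling_time_years():.2f} years")
    for years in (1, 5, 10):
        print(f"after {years} years: x{growth_factor(years):.1f}")
```

At this rate the volume doubles in a little over a year and a half, and grows roughly eightyfold in a decade.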
  • 11. The Global Map and Data Options: move the data to the processing; move the processing to the data; move the processing and the data; shard. There will not be a single physical database (or data lake) for a multitude of reasons.
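Of the four options, sharding is the easiest to show in a few lines. The sketch below assumes simple hash-based sharding, which is only one of several possible schemes (range and directory-based sharding are others); `shard_for` and `partition` are hypothetical names, not something the deck specifies.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard number using a stable hash.

    hashlib is used instead of the built-in hash() so the mapping is the
    same across processes and restarts.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(keys, num_shards: int) -> dict:
    """Group keys by the shard they would be stored on."""
    shards = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


if __name__ == "__main__":
    customers = [f"customer-{n}" for n in range(10)]
    for shard, members in partition(customers, 3).items():
        print(shard, members)
```

A plain modulo scheme like this one reassigns most keys when the shard count changes, which is one reason production systems often prefer consistent hashing.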
  • 13. Everything in Flux: hardware (network, storage, servers); data sources; data staging; data volumes; data flow; data governance; query languages; data usage; data structures; schema definition; ingest speeds; data workloads; applications.
  • 14. NoSQL Confusion: As the graph indicates, there is some overlap between SQL databases and other databases. What to choose is a use-case driven decision. There never was a “universal database” and probably there never will be.
  • 15. NoSQL World: some NDBMS do not attempt to provide all ACID properties; some NDBMS use a distributed scale-out architecture with data redundancy; XML DBMS using XQuery are NDBMS; some document stores are NDBMS; object databases are NDBMS (Gemstone, Objectivity, ObjectStore, etc.); key-value stores; graph DBMS are NDBMS; large data pools (BigTable, HBase, Mnesia, etc.) are NDBMS.
  • 17. SQL Merits and Demerits: SQL is very good for set manipulation; works for OLTP and many query environments; not good for nested data structures (documents, web pages, etc.); not good for ordered data sets; not good for data graphs (networks of values). Not a Swiss Army Knife!
  • 18. The Impedance Mismatch: The RDBMS stores data organized according to table structures. The OO programmer manipulates data organized according to complex object structures, which may have specific methods associated with them. The data does not simply map to the structure it has within the database. Consequently, a mapping activity is necessary to get and put data. Basically: hierarchies, types, result sets, crappy APIs, language bindings, tools.
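The mapping activity described in slide 18 can be illustrated with a small sketch. It uses Python's standard `sqlite3` module with an in-memory database; the `orders` and `order_lines` tables and the `Order` and `OrderLine` classes are invented for illustration and do not come from the deck.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class OrderLine:
    product: str
    quantity: int


@dataclass
class Order:
    """The nested object the programmer wants to work with."""
    order_id: int
    customer: str
    lines: list = field(default_factory=list)

    def total_quantity(self) -> int:
        return sum(line.quantity for line in self.lines)


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory database with two flat tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
        CREATE TABLE order_lines (order_id INTEGER, product TEXT, quantity INTEGER);
        INSERT INTO orders VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO order_lines VALUES (1, 'bolt', 10), (1, 'nut', 5), (2, 'gear', 2);
        """
    )
    return conn


def load_orders(conn: sqlite3.Connection) -> list:
    """The mapping step: fold a flat join result into nested Order objects."""
    rows = conn.execute(
        "SELECT o.id, o.customer, l.product, l.quantity "
        "FROM orders o JOIN order_lines l ON l.order_id = o.id "
        "ORDER BY o.id, l.product"
    )
    orders = {}
    for order_id, customer, product, quantity in rows:
        if order_id not in orders:
            orders[order_id] = Order(order_id, customer)
        orders[order_id].lines.append(OrderLine(product, quantity))
    return list(orders.values())


if __name__ == "__main__":
    for order in load_orders(demo_connection()):
        print(order.customer, order.total_quantity())
```

Even in this toy case the join repeats the order columns once per line, and the loop has to rebuild the hierarchy by hand; object-relational mappers exist largely to automate this step.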
  • 19. The SQL Barrier: SQL has DDL (for data definition) and DML (for Select, Project and Join), but it has little MML (Math) or TML (Time). Usually result sets are brought to the client for further analytical manipulation, but this creates problems. Alternatively, doing all analytical manipulation in the database creates problems.
  • 21. Database Mismatch: A key problem is that we talk mostly about computation over data when we talk about “big data” and analytics, a potential mismatch for both relational and NoSQL.
  • 22. Database Workload Parameters: read-intensive vs. write-intensive; mutable vs. immutable data; immediate vs. eventual consistency; short vs. long data latency; predictable vs. unpredictable data access patterns; simple vs. complex data types.
  • 23. Horses for Courses: relational row store databases for conventionally tooled low to mid-scale OLTP; relational databases for ACID requirements; parallel databases (row or column) for unpredictable or variable query workloads; specialized databases for complex data query workloads; NoSQL (KVS, DHT) for high scale OLTP; NoSQL (KVS, DHT) for low latency read-mostly data access; parallel databases (row or column) for analytic workloads over tabular data; NoSQL / Hadoop for batch analytic workloads over large data volumes.
  • 24. Database Tools: A Call Out: Have you noticed how databases are not self-running? DBAs are in short supply and the need for them is increasing. Database diversity doesn’t help in this area. DBA tools: SQL analysis; performance analysis; security management; capacity planning; database deployment. We meet the same problem with data lakes, except that there are very few tools.
  • 25. The Impact of Parallelism: We used to see a 10x performance improvement every 6 years; now we regularly see 1000x (and that’s just an approximation).
  • 27. The Perfect Storm – The Data Lake: the triumph of Open Source as a business model; the dominance of Apache; Hadoop, the platform for data; Spark, for speed; Kafka & NiFi for data flow; the triumph of the cloud and its dominance; cost collapse.
  • 28. The Primary Role of the Data Lake: System of Record; Data Governance; Application Platform.
  • 29. The Evolved Conception. Diagram labels: Analytics or BI Apps; Data Governance; Data Lake Mgt; Static Data Sources; Data Streams; To Databases, Data Marts, Other Apps; ETL; Data Lake; Ingest. Key points: static data and data streams; real-time data ingest; data governance; data lake management; analytics & BI; extracts. The data lake becomes the system of record.
  • 30. Data Bus Processing: Metadata Mgt; Data Cleansing; Data Transforms; Data Aggregation; Data Security. It will be preferable to complete governance processing on the bus where feasible. Then it will be done at memory speeds rather than disk speeds.
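As a rough illustration of governance processing on the bus, the sketch below applies a chain of in-memory stages to each record while it is in flight, before anything is written to storage. The stage functions (`cleanse`, `tag_metadata`, `mask_email`) and the record layout are assumptions made for the example; the deck does not prescribe an implementation.

```python
def cleanse(record: dict) -> dict:
    """Data cleansing: trim stray whitespace from string values."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def tag_metadata(record: dict) -> dict:
    """Metadata management: note that the record passed through the bus."""
    return {**record, "_processed_on": "bus"}


def mask_email(record: dict) -> dict:
    """Data security: hide the local part of an email address."""
    email = record.get("email")
    if isinstance(email, str) and "@" in email:
        _, _, domain = email.partition("@")
        return {**record, "email": "***@" + domain}
    return record


def run_on_bus(records, stages):
    """Apply every stage to each record in memory, yielding results one by one."""
    for record in records:
        current = dict(record)  # work on a copy, leave the input untouched
        for stage in stages:
            current = stage(current)
        yield current


if __name__ == "__main__":
    incoming = [{"name": "  Ann ", "email": "ann@example.com"}]
    for out in run_on_bus(incoming, [cleanse, tag_metadata, mask_email]):
        print(out)
```

Because `run_on_bus` is a generator, records flow through the stages one at a time without an intermediate copy on disk, which is the point the slide makes about memory speeds.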
  • 31. The Full Picture. Diagram labels: Data Cleansing; Data Security; Ingest; Metadata Mgt; Real-Time Apps; Transform & Aggregate; Search & Query; BI, Visualization & Analytics; Other Apps; Data Lake Mgt; Data Governance; DATA LAKE; To Databases, Data Marts, Other Apps; Archive; Life Cycle Mgt; Extracts. Data sources: Servers, Desktops, Mobile, Network Devices, Embedded Chips, RFID, IoT, The Cloud, OSes, VMs, Log Files, Sys Mgt Apps, ESBs, Web Services, SaaS, Business Apps, Office Apps, BI Apps, Workflow, Data Streams, Social...
  • 32. Data Governance: If data governance was important before Big Data (and it was), it is far more important in the era of Data Lakes.
  • 33. Data Governance: system of record; data provenance & lineage; data cleansing; data security; data compliance; data integrity; data audit record; data life-cycle mgt; data meaning. Data Governance is a perpetual process.
  • 34. The Event-based World: The event-based world is real-time. The architecture must thus be real-time.
  • 35. Events: Atoms and Molecules. The ATOM of data has become the EVENT. A TRANSACTION is a MOLECULE of ATOMIC EVENTS.
  • 36. Events: Think of events as drops of water. They can live in streams, and they can also live in data pools and data lakes and databases.
  • 37. Event Types: an instantiation event; a state report; a trigger event; a correction event. We also need to consider: data refinement; aggregations; homogeneous collections; derived data.
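One way to picture the event model is as a small data structure: each event is an immutable atom carrying one of the four types listed in slide 37, and a transaction is a container of such atoms, as slide 35 puts it. The field names (`source`, `payload`, `timestamp`) and the `Transaction` class are assumptions for this sketch, not definitions from the deck.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class EventType(Enum):
    """The four event types named on the slide."""
    INSTANTIATION = "instantiation"
    STATE_REPORT = "state_report"
    TRIGGER = "trigger"
    CORRECTION = "correction"


@dataclass(frozen=True)
class Event:
    """An atomic, immutable event."""
    event_type: EventType
    source: str
    payload: dict
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


@dataclass
class Transaction:
    """A 'molecule': a group of atomic events that belong together."""
    transaction_id: str
    events: list = field(default_factory=list)

    def add(self, event: Event) -> None:
        self.events.append(event)

    def count_by_type(self) -> Counter:
        return Counter(event.event_type for event in self.events)


if __name__ == "__main__":
    tx = Transaction("order-42")
    tx.add(Event(EventType.INSTANTIATION, "pos-1", {"item": "bolt", "qty": 10}))
    tx.add(Event(EventType.CORRECTION, "pos-1", {"item": "bolt", "qty": 12}))
    print(tx.count_by_type())
```

Modelling a correction as a new event, rather than an update in place, keeps the earlier event intact for audit purposes.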
  • 38. Event Based IoT Architecture: the pulse and the threshold alert; some of this involves distributed processing; there are known apps and unknown apps, so analytical exploration needs to be enabled; only aggregations will migrate. Diagram labels: Sensors, controllers, CPUs; Source Proc.; Depot; Depot Proc.; Central Hub; Central Proc.; Data.
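Two of these points, the threshold alert and the rule that only aggregations migrate, can be sketched as processing done at the source. The function below is a hypothetical illustration: raw readings stay local, out-of-range values raise alerts immediately, and only a compact summary is passed up towards the depot or central hub. It does not attempt to model the pulse.

```python
from statistics import mean


def process_at_source(readings, threshold: float):
    """Check raw sensor readings locally and build a summary to send upstream.

    Returns a pair (alerts, summary): `alerts` holds every reading above the
    threshold, and `summary` is the only data intended to leave the source.
    """
    if not readings:
        raise ValueError("no readings to process")
    alerts = [value for value in readings if value > threshold]
    summary = {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }
    return alerts, summary


if __name__ == "__main__":
    alerts, summary = process_at_source([20.0, 21.0, 95.0, 22.0], threshold=90.0)
    print("alerts:", alerts)
    print("summary sent upstream:", summary)
```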
  • 39. Self-defining data: time; geographic location; virtual/logical location; source device & SW; device ID; derivation (if derived); creator; owner; permissions; status (for replication); metadata; audit trail; archive flag.
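The attribute list in slide 39 maps naturally onto a record type that carries its own context. The sketch below is one possible shape, with field names and types chosen for illustration; only the list of attributes comes from the slide.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class SelfDefiningRecord:
    """A data item that travels with the attributes listed on the slide."""
    payload: Any
    time: str                                  # e.g. an ISO 8601 timestamp
    geo_location: Optional[tuple] = None       # (latitude, longitude)
    logical_location: Optional[str] = None     # virtual/logical location
    source_device: Optional[str] = None
    source_software: Optional[str] = None
    device_id: Optional[str] = None
    derivation: Optional[str] = None           # set only if the data is derived
    creator: Optional[str] = None
    owner: Optional[str] = None
    permissions: list = field(default_factory=list)
    status: Optional[str] = None               # replication status
    metadata: dict = field(default_factory=dict)
    audit_trail: list = field(default_factory=list)
    archived: bool = False                     # archive flag


if __name__ == "__main__":
    record = SelfDefiningRecord(
        payload={"temperature_c": 21.5},
        time="2024-01-01T00:00:00Z",
        device_id="sensor-7",
        owner="facilities",
    )
    record.audit_trail.append("created at source")
    print(asdict(record))
```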
  • 40. Presentation Sequence: (1) What is a Database exactly? (2) The Database Landscape (3) The Data Lake Phenomenon