Determine the Right Analytic Database: A Survey of New Data Technologies

O’Reilly Strata Conference
February 1, 2011

Mark R. Madsen
http://ThirdNature.net
Twitter: @markmadsen




Atomic Avenue #1 by Glen Orbik
Key Questions
 ▪ What technologies are available?
 ▪ What are they good for?
 ▪ How do you decide which to use?




But first: why are analytic databases available now?

Consequences of Commoditization: Data Volume

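[Chart: data generated vs. time. Sources labeled along the rising curve, from earliest to latest: RFID, GPS, sensors, chipping, spimes. “You are here” is marked near RFID.]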
An Unexpected Consequence of Data Volumes




Sums, counts and sorted results only get you so far.
An Unexpected Consequence of Data Volumes




Our ability to collect data is still outpacing our ability to derive meaning from it.
Don’t worry about it. We’ll just buy more hardware.


      CPUs, memory and 
      storage track to very 
      similar curves
RIP Moore’s Law: it nearly ground to a halt for 
silicon integrated circuits about four years ago.
Technology Has Changed (a lot) But We Haven’t
[Chart: calculations per second per $1000 on a log scale from 10^-6 to 10^10, plotted from 1900 to 2000 across the mechanical, relay, vacuum tube, transistor and integrated circuit eras. Annotations: “10,000 X improvement”; “Current DW architecture and methods start here in the mid-1980s”. Data: Ray Kurzweil, 2001]
Moore’s Law via the Lens of the Industry Analyst




[Chart: CPU speed vs. time]
Moore’s Law: Power Consumption




[Chart: power use vs. time, extended to 2017]
Moore’s Law: Heat Generation




[Chart: CPU temperature vs. time, extended to 2017]
Conclusion #1: Your own nuclear reactor by 2017




[Chart: power use vs. time, extended to 2017]
Conclusion #2: You Will Need a New Desk in 2017




[Chart: power use vs. time, extended to 2017]
Problem: linear extrapolation



“If the automobile had followed the same development as the computer, a Rolls-Royce would today cost $100, get a million miles per gallon, and explode once a year killing everyone inside.”
Robert Cringely

[Chart: “Anything” vs. time, with a curve labeled “Reality”]
Multicore performance is not a linear extrapolation.
New Technology Evolution Means New Problems
[Chart: log scale from 10^-6 to 10^10 against technology maturity (time + engineering effort), 1970 to 2020. Era labels: uniprocessor and custom CPU era; symmetric multi‐processing era; massively parallel era. Phase labels: early engineering phase (exploring, learning, inventing); core problems solved; investment phase (improving, perfecting, applying).]
What’s different?
Parallelism
We’re not getting more CPU 
power, but more CPUs.
There are too many CPUs 
relative to other resources, 
creating an imbalance in 
hardware platforms.
Most software is designed 
for a single worker, not  
high degrees of parallelism 
and won’t scale well.
Core problem: software is not designed for parallel work




Databases must be designed to permit local work with
minimal global coordination and data redistribution.
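
As a rough illustration of "local work with minimal global coordination", the sketch below aggregates each partition independently and merges only the small per-partition summaries. It is a single-process toy, not any vendor's engine; the partition contents and the function names `local_aggregate` and `global_merge` are made up for the example.

```python
from collections import Counter

def local_aggregate(partition):
    """Work one node can do alone: total the amounts per region in its own data."""
    totals = Counter()
    for region, amount in partition:
        totals[region] += amount
    return totals

def global_merge(partials):
    """The only coordinated step: combine the small per-node summaries."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts key by key
    return merged

# Hypothetical data, already distributed across three nodes.
partitions = [
    [("east", 10), ("west", 5)],
    [("east", 7), ("north", 3)],
    [("west", 8), ("north", 1)],
]

partials = [local_aggregate(p) for p in partitions]
result = global_merge(partials)
print(dict(result))
```

Only the summaries cross node boundaries here. A query that needs rows from different partitions to meet (a join on a non-partitioning key, for instance) would force the data redistribution the slide warns about.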
SOME TECHNOLOGY INNOVATIONS
Storage Improvements
For data workloads, disk 
throughput still key.
Improvements:
▪ Spinning disks at $0.05/GB
▪ Solid state disks remove 
  some latencies, read speed 
  of ~250MB/sec
▪ SSD capacity still rising
▪ Card storage (PCI), e.g. 
  FusionIO at 1.5GB/sec
▪ SSD is still costly at $2/GB 
  up to $30/GB
Compression Applied to Stored Data
10x compression means 1 disk I/O can read 
10x as much data, stretching your current 
hardware investment
But it eats CPU and
memory.
YMMV
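
A quick way to see the trade is to measure a compression ratio directly. The sketch below uses Python's standard `zlib` on a made-up low-cardinality column; real data will usually compress far less than this artificial sample, and the ratio you get is the factor by which one I/O of fixed size covers more logical data. The CPU cost is the `compress` and `decompress` work itself.

```python
import zlib

def compression_ratio(raw: bytes, level: int = 6) -> float:
    """Return uncompressed size divided by compressed size."""
    compressed = zlib.compress(raw, level)
    return len(raw) / len(compressed)

# Hypothetical column: 30,000 status values drawn from only three distinct strings.
statuses = (b"shipped\n", b"pending\n", b"returned\n")
low_cardinality = b"".join(statuses[i % 3] for i in range(30_000))

ratio = compression_ratio(low_cardinality)
round_trip_ok = zlib.decompress(zlib.compress(low_cardinality)) == low_cardinality

print(f"raw bytes: {len(low_cardinality)}, ratio: {ratio:.0f}x, lossless: {round_trip_ok}")
```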
Scale‐up vs. Scale‐out Parallelism
 Uniprocessor environments required chip upgrades.
 SMP servers can grow to a point, then it’s a forklift 
 upgrade to a bigger box.
 MPP servers grow by adding more nodes.
Database and Hardware Deployment Models
Three levels of software‐hardware integration:
 ▪ Database appliance (specialized hardware and software)
 ▪ Preconfigured (commodity) hardware with software
 ▪ Software on generic hardware
Then there are the hardware‐database parallel models:

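[Diagram: three models side by side]
 ▪ Shared Everything: one database on one OS
 ▪ Shared Disk: one DB per OS (two shown)
 ▪ Shared Nothing: one database spread across several OS instances (three shown)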
In‐Memory Processing
1. Maybe not as fast as you think. Depends entirely on 
   the database (e.g. VectorWise)
2. So far, applied mainly to shared‐nothing models
3. Very large memories are more applicable to 
   shared‐nothing than shared‐memory systems
   ▪ Shared memory: box‐limited, e.g. 2 TB max
   ▪ Shared nothing: limited by node scaling, e.g. 16 nodes, 512GB per = 8TB
4. Still an expensive way to get performance
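
The node-scaling arithmetic can be checked in a couple of lines. This is only a worked calculation under the assumption of 512 GB of memory per node; the function name `cluster_memory_tb` is made up for the example.

```python
def cluster_memory_tb(nodes: int, gb_per_node: int) -> float:
    """Aggregate memory of a shared-nothing cluster, in TB (1 TB = 1024 GB)."""
    return nodes * gb_per_node / 1024

single_box_tb = 2.0                      # box-limited example figure
cluster_tb = cluster_memory_tb(16, 512)  # 16 nodes at 512 GB each

print(f"cluster: {cluster_tb} TB vs. single box: {single_box_tb} TB")
```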
Columnar Databases
  ID   Name               Salary
   1   Marge Inovera       $50,000
   2   Anita Bath         $120,000
   3   Nadia Geddit        $36,000

 In a row-store model these three rows would be stored in sequential
 order as shown here, packed into a block.

 In a column-store model they would be divided by columns and stored
 in different blocks.
 Not just changing the storage layout. Also involves changes to the
 execution engine and query optimizer.
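
To make the layout difference concrete, here is a small sketch using the three sample rows. Python lists stand in for disk blocks, which is a simplification: it shows only the storage layout, not the execution-engine or optimizer changes mentioned above.

```python
rows = [
    (1, "Marge Inovera", 50_000),
    (2, "Anita Bath", 120_000),
    (3, "Nadia Geddit", 36_000),
]

# Row store: all fields of a record sit together, records packed one after another.
row_block = [field for row in rows for field in row]

# Column store: all values of one column sit together in their own block.
ids, names, salaries = (list(col) for col in zip(*rows))
column_blocks = {"ID": ids, "Name": names, "Salary": salaries}

# SELECT SUM(Salary): the column store reads only the Salary block.
total_from_columns = sum(column_blocks["Salary"])
values_read_columns = len(column_blocks["Salary"])

# The row store has to pass over every field of every row to pick out salaries.
total_from_rows = sum(row_block[2::3])
values_read_rows = len(row_block)

print(total_from_columns, values_read_columns, values_read_rows)
```

Both layouts give the same answer; the column layout touches 3 values where the row layout touches 9, and the gap widens with every extra column the query does not need.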
Column Stores Rule the TPC‐H Benchmark
Columnar Advantages and Disadvantages
+ Reduced I/O for queries not reading all columns
+ Better compression characteristics, meaning database 
  size < raw data size (unlike row store) and less I/O
+ Ability to operate on compressed data, improving 
  overall system performance
+ Less manual tuning
‐ Slower inserts and updates (causing ELT and trickle‐
  feed problems*)
‐ Worse for small retrievals and random I/O
‐ Uses more system memory and CPU
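
One way to picture "operating on compressed data" is run-length encoding, a scheme often associated with sorted columns. The sketch below is an illustration only, not a description of any particular product: it sums and filters a column by working on (value, run length) pairs without expanding them. The helper names are made up.

```python
from itertools import groupby

def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]

def rle_sum(runs):
    """SUM over the column, computed from the runs without decompressing."""
    return sum(value * count for value, count in runs)

def rle_count_where(runs, predicate):
    """COUNT(*) WHERE predicate(value), evaluated once per run rather than per row."""
    return sum(count for value, count in runs if predicate(value))

# Hypothetical sorted column of 12 values with only 3 distinct ones.
column = [10] * 4 + [20] * 3 + [30] * 5
runs = rle_encode(column)

print(runs, rle_sum(runs), rle_count_where(runs, lambda v: v >= 20))
```

The same property hints at the insert and update cost listed above: changing one value in the middle of a run means splitting and rewriting runs rather than overwriting a single slot.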
Explosion of Analytic Techniques

[Diagram: Advanced Analytic Methods at the center, surrounded by:]
 ▪ Machine learning
 ▪ Statistics
 ▪ Numerical methods
 ▪ Text mining & text analytics
 ▪ Rules engines & constraint programming
 ▪ Information theory & IR
 ▪ GIS
 ▪ Visualization
Map‐Reduce is a parallel programming framework 
 that allows one to code more easily across a 
 distributed computing environment, not a database.




[Cartoon dialogue]
 “So how do I query the database?”
 “It’s not a database, it’s a key-value store!”
 “Ok, it’s not a database. How do I query it?”
 “You write a distributed mapreduce function in erlang.”
 “Did you just tell me to go to hell?”
 “I believe I did, Bob.”
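
For readers who have not written one, the shape of a map-reduce job can be sketched in a few lines. This is a single-process toy in Python, not Hadoop's API and not Erlang; the phase names `map_phase`, `shuffle` and `reduce_phase` are illustrative. In a real framework the map and reduce calls run on many machines and the shuffle moves data between them.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Emit a (word, 1) pair for every word; each document can be mapped independently."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine all values for one key; each key can be reduced independently."""
    return key, sum(values)

documents = ["the quick brown fox", "the lazy dog", "the fox"]

mapped = chain.from_iterable(map_phase(d) for d in documents)
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```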
What’s Different
No database
No schema
No metadata
No query language*

Good for:
  ▪ Processing lots of complex 
    or non‐relational data
  ▪ Batch processing for very 
    large amounts of data
* Hive, HBase, Pig, others
Using MapReduce / Hadoop
Hadoop is one implementation of MapReduce. There are 
different variations with different performance and resource 
characteristics e.g. Dryad, CGL‐MR, MPI variants
Hadoop is only part of the solution. You need more for 
enterprise deployment. Cloudera’s distribution for Hadoop
shows what a complete environment could look like. 




                                            Image: Cloudera
How Hadoop fits into a traditional BI environment

[Diagram, top to bottom]
 ▪ Users and tools: Developers (development tools and IDEs), Analysts (analysis tools, BI), End Users (BI, applications)
 ▪ Middle: Data Warehouse
 ▪ Feeds: file loads, ETL
 ▪ Source environments: databases, documents, flat files, XML, queues, ERP, applications
NoSQL theoretically = “not only sql”, in reality…
Data stores that augment or replace relational access
and storage models with other methods.
                         Different storage models:
                         • Key‐value stores
                         • Column families
                         • Object / document stores
                         • Graphs
                         Different access models:
                         • SQL (rarely)
                         • programming API
                         • get/put
                         Reality: mostly suck for BI & analytics
Analytic DB vendors are coming from the other direction:
  • Aster Data – SQL wrapped around MR
  • EMC (Greenplum) – MR on top of the database
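
The get/put access model, and why it is awkward for analytics, can be shown with a toy store. The `KeyValueStore` class below is hypothetical and in-memory only; it does not mirror the API of any real product. Point lookups are trivial, while an aggregate has to be written as a scan in application code.

```python
class KeyValueStore:
    """Toy in-memory key-value store with the basic get/put access model."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def scan(self):
        """Iterate over every (key, value) pair; there is no query language."""
        return iter(self._data.items())

store = KeyValueStore()
store.put("order:1", {"region": "east", "amount": 10})
store.put("order:2", {"region": "west", "amount": 5})
store.put("order:3", {"region": "east", "amount": 7})

# Operational access: fetch one record by key.
one_order = store.get("order:2")

# Analytic access: total by region means reading every value and grouping by hand.
totals = {}
for _key, order in store.scan():
    totals[order["region"]] = totals.get(order["region"], 0) + order["amount"]

print(one_order, totals)
```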
Some realities to consider
  Cheap performance?
      ▪ Do you have 20 blades 
        lying around unused?
      ▪ How much concurrency?
      ▪ How much effort to write 
        queries? Debug them?
      ▪ Performance comparisons: 
        10x slower on the same 
        hardware?
  The key is the workload type 
  and the scale of it.
Do you really need a rack of blades for computing?
                         Graphics co‐processors have 
                         been used for certain problems 
                         for years.
                         Offer single‐system solution to 
                         offload very large compute‐
                         intensive problems.
                         Order of magnitude cost 
                         reduction, order of magnitude 
                         performance increase with 
                         current technology today (for 
                         compute‐intensive problems).
                         We’ve barely started with this.
Other Options for analytic software deployment
                      The basic models.
                      1. Separate tools and systems 
                         (MapReduce and NoSQL are a 
                         simple variation on this theme)
                      2. Integrated with a database
                      3. Embedded in a database

                      The primary arguments about 
                      deployment models center on 
                      whether to take data to the 
                      code or code to the data.


Leveraging the Database
Levels of database integration:
  ▪   Native DB connector
  ▪   External integration
  ▪   Internal integration
  ▪   Embedded

+ Less data movement
+ Possible dev process support
+ Hardware / environment 
  savings
+ Possible “sandboxing” support
‐ Limitations on techniques 
In‐database Execution
You can do a lot with standards‐
compliant SQL
If the database has UDFs, you 
can code too (but it’s harder)
Parallel support for UDFs varies
Some vendors build functions 
directly into the database, 
(usually scalar)
Iterative algorithms (ones that 
converge on a solution) are 
problematic, more so in MPP
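
A small example of the UDF route, using SQLite through Python's standard `sqlite3` module as a stand-in. SQLite is not an analytic database; the table, the data and the `salary_band` function are made up, and registration syntax differs from vendor to vendor. The point is only that a scalar function written outside SQL runs inside the query, next to the data.

```python
import sqlite3

def salary_band(salary):
    """Scalar UDF: classify one salary value. Called by the database per row."""
    if salary >= 100_000:
        return "high"
    if salary >= 40_000:
        return "mid"
    return "low"

conn = sqlite3.connect(":memory:")
conn.create_function("salary_band", 1, salary_band)  # name, argument count, callable

conn.execute("CREATE TABLE staff (id INTEGER, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(1, "Marge Inovera", 50_000), (2, "Anita Bath", 120_000), (3, "Nadia Geddit", 36_000)],
)

# Plain SQL does the grouping and aggregation; the UDF supplies the custom logic.
bands = conn.execute(
    "SELECT salary_band(salary), COUNT(*), AVG(salary) "
    "FROM staff GROUP BY salary_band(salary) ORDER BY 1"
).fetchall()
conn.close()

print(bands)
```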
What are factors in the decision?
User concurrency: one job or many 
Repetition is a key element:
  ▪ Execute once and apply (build a response 
    or mortality model)
  ▪ Many executions daily (web cross‐sells)
In‐process or Batch?
  ▪ Batch and use results – segment, score
  ▪ In‐process reacts on demand – detect 
    fraud, recommend
In‐process requires thinking about how it 
integrates with the calling application. (SQL 
sometimes not your friend)
MATCHING THE PROBLEMS TO 
TECHNOLOGIES
The problem of size is three problems of volume.

[Diagram: three dimensions of volume: Computations! Amount of data! Number of users!]
H
Lots of H




“More” can become a qualitative rather than quantitative difference
Really lots of  H




“Databases are dead!” – famous last words
Hardware Architectures and Deployment
   Compute and data sizes are the key requirements
[Chart: computations (MF, GF, TF, PF) vs. data volume (<10s GB, 100s GB, 1s TB, 10s TB, 100s TB, PB). Regions, from the low end of both axes to the high end: PC; shared everything or shared disk; shared nothing; MR and related.]
Hardware Architectures and Deployment
Today’s reality, and true for a while in most businesses.
[Chart: computations (MF, GF, TF, PF) vs. data volume (<10s GB, 100s GB, 1s TB, 10s TB, 100s TB, PB). Callout near the low end of the computations axis: “The bulk of the market resides here!”]
Hardware Architectures and Deployment
Today’s reality, and true for a while in most businesses.
[Chart: computations (MF, GF, TF, PF) vs. data volume (<10s GB, 100s GB, 1s TB, 10s TB, 100s TB, PB). Callout near the low end of the computations axis: “The bulk of the market resides here!” Callout higher up: “…but analytics pushes many things into the MPP zone.”]
The real question: why do you want a new platform?
 Trouble doing what you already do today
   ▪ Poor response times
   ▪ Not meeting availability deadlines
 Doing more of what you do today
   ▪ Adding users, mining more data
 Doing something new with your data
   ▪ Data mining, recommendations, embedded real‐time 
     process support


 What’s desired is possible but limited by the cost of 
 supporting or growing the existing environment.
The World According to Gartner: One Magical Quadrant 
 ▪ SQL Server 2008 R2 (PDW): Official production customers?
 ▪ EMC / Greenplum: SQL limitations; memory / concurrency issues
 ▪ Ingres: OLTP database
 ▪ Illuminate: SQL limitations; very limited scalability
 ▪ Sun: MySQL for a DW, is this a joke?

[Figure: Magic Quadrant for Data Warehouse Database Management Systems]
The assumption of the warehouse as a database is gone



                              Data at rest                     Data in motion
 Non-traditional data         Parallel programming platforms   Message streams
 (logs, audio, documents)
 Traditional tabular or       Databases                        Streaming DBs/engines
 structured data

Copyright Third Nature, Inc.
Data Access Differences
Basic data access styles:
▪ Standard BI and reporting
▪ Dashboards / scorecards
▪ Operational BI
▪ Ad‐hoc query and analysis
▪ Batch analytics
▪ Embedded analytics
Data loading styles:
▪ Refresh
▪ Incremental
▪ Constant
Evaluating ADB Options
Storage style:
  ▪ Files, tables, columns, cubes, KV
Storage type:
  ▪ Memory, disk, hybrid, compressed
Scaling model:
  ▪ SMP, clustered, MPP, distributed
Deployment model:
  ▪ Appliance, cloud, SaaS, on‐premise
Data access model:
  ▪ SQL, MapReduce, R, languages, etc.
License options:
  ▪ CPU, data size, subscription

What’s it going to cost? A small sample at list:
 Solution                     Pricing model          Price/unit                 1 TB solution   Remarks
 Dataupia                     Node                   $19,500 / 2 TB             $19,500         You can’t buy a 1 TB Satori server
 Kickfire (out of business)   Data volume (raw)      $50,000 / TB               $50,000         Includes MySQL 5.1 Enterprise
 Vertica                      Data volume (raw)      $100,000 / TB              $200,000        Based on 5 nodes, $20,000 each
 ParAccel                     Data volume (raw)      $100,000 / TB              $200,000        Based on 5 nodes, $20,000 each
 EXASOL                       Data volume (active)   $1,350 / GB (€1,000 / GB)  $350,000*       Based on 4 nodes, $20,000 each
 Teradata                     Node                   $99,000 / TB               $99,000**       Based on 2550 base configuration

* 1 TB raw ≈ 200 GB active
** Realistic configuration likely 2x this price
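
The Vertica and EXASOL totals can be reproduced if the "1 TB solution" figure is read as license cost plus hardware nodes. That reading is an inference from the Remarks column, not something the table states, and the function name `one_tb_price` is made up for this worked calculation.

```python
def one_tb_price(license_per_unit, units, nodes=0, node_price=0):
    """License cost plus hardware cost, in dollars (assumed pricing structure)."""
    return license_per_unit * units + nodes * node_price

# Vertica: $100,000 per TB of raw data, 1 TB, on 5 nodes at $20,000 each.
vertica = one_tb_price(100_000, 1, nodes=5, node_price=20_000)

# EXASOL: $1,350 per GB of active data, roughly 200 GB active for 1 TB raw,
# on 4 nodes at $20,000 each.
exasol = one_tb_price(1_350, 200, nodes=4, node_price=20_000)

print(f"Vertica: ${vertica:,}  EXASOL: ${exasol:,}")
```

The calculation also shows why the pricing model matters as much as the unit price: EXASOL's per-GB figure looks far higher, but it is charged on active rather than raw data.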
Factors and Tradeoffs
The core tradeoff is not always 
money for performance.

What else do you trade?
• Load time
• Trickle feeds
• New ETL tools
• New BI tools
• Operational complexity:
 • Data integration and 
   management
 • Backups
 • Hardware maintenance
The Path to Performance
  1. Laborware – tuning
  2. Upgrade – try to solve the 
     problem without changing 
     out the database
  3. Extend – add an ADB or 
     Hadoop cluster to the 
     environment to offload a 
     specific workload
  4. Replace – out with the old, 
     in with the new


One Word: PoC!
The Future
Assuming the database market 
embraces MPP, you have 
compute power that exceeds 
what the DB itself needs.
Why not execute the code at 
the data?
Even without MPP, moving  
to in‐database analytic 
processing is a future 
direction and is workable for 
a large number of people.
Thank you!
Image Attributions
Thanks to the people who supplied the images used in this presentation:

Atomic Avenue #1 by Glen Orbik http://www.orbikart.com/gallery/displayimage.php?album=4&pos=5
spices.jpg ‐ http://flickr.com/photos/oberazzi/387992959/
Black hole galaxy ‐ http://www.flickr.com/photos/badastronomy/3176565627/
weaver peru.jpg ‐ http://flickr.com/photos/slack12/442373910/
rc toy truck.jpg ‐ http://flickr.com/photos/texas_hillsurfer/2683650363/
automat purple2.jpg ‐ http://flickr.com/photos/alaina/288199169/
open_air_market_bologna ‐ http://flickr.com/photos/pattchi/181259150/
bored_girl.jpg ‐ http://www.flickr.com/photos/alejandrosandoval/280691168/
path_vecchia.jpg ‐ http://www.flickr.com/photos/funadium/2320388358/
fast kids truck peru.jpg ‐ http://flickr.com/photos/zerega/1029076197/
What’s best for which types of problems?*
Shared nothing will be best for solving large data problems, regardless 
of workload or concurrency.
Column‐stores will improve query response time problems for most 
traditional query and aggregation workloads.
Row‐stores will be better for operational BI or embedded BI.
Fast storage always makes things better, but is only cost‐effective for 
medium scale or smaller data.
Compression will help everyone, but column‐stores more than row 
stores because of how the engines work.
Map‐Reduce and distributed filesystems offer advantages of a schema‐
less storage & analytic layer that can process into relational databases.
SMP and in‐memory will be better for high complexity problems under 
moderate data scale, shared‐nothing and MR for large data scale.
*The answer is always “it depends”
About the Presenter
                      Mark Madsen is president of Third
                      Nature, a technology research and
                      consulting firm focused on business
                      intelligence, analytics and
                      performance management. Mark is
                      an award-winning author, architect
                      and former CTO whose work has
                      been featured in numerous industry
                      publications. During his career Mark
                      received awards from the American
                      Productivity & Quality Center, TDWI,
                      Computerworld and the Smithsonian
                      Institute. He is an international
                      speaker, contributing editor at
                      Intelligent Enterprise, and manages
                      the open source channel at the
                      Business Intelligence Network. For
                      more information or to contact Mark,
                      visit http://ThirdNature.net.
About Third Nature


Third Nature is a research and consulting firm focused on new and
emerging technology and practices in business intelligence, data
integration and information management. If your question is related to BI,
open source, web 2.0 or data integration then you’re at the right place.
Our goal is to help companies take advantage of information-driven
management practices and applications. We offer education, consulting
and research services to support business and IT organizations as well as
technology vendors.
We fill the gap between what the industry analyst firms cover and what IT
needs. We specialize in product and technology analysis, so we look at
emerging technologies and markets, evaluating the products rather than
vendor market positions.

Mais conteúdo relacionado

Destaque

Everything has changed except us
Everything has changed except usEverything has changed except us
Everything has changed except usmark madsen
 
How to Adopt Agile at Your Organization
How to Adopt Agile at Your OrganizationHow to Adopt Agile at Your Organization
How to Adopt Agile at Your OrganizationRaimonds Simanovskis
 
The State of Open Source BI Adoption
The State of Open Source BI AdoptionThe State of Open Source BI Adoption
The State of Open Source BI Adoptionmark madsen
 
Bi isn't big data and big data isn't BI (updated)
Bi isn't big data and big data isn't BI (updated)Bi isn't big data and big data isn't BI (updated)
Bi isn't big data and big data isn't BI (updated)mark madsen
 
On the edge: analytics for the modern enterprise (analyst comments)
On the edge: analytics for the modern enterprise (analyst comments)On the edge: analytics for the modern enterprise (analyst comments)
On the edge: analytics for the modern enterprise (analyst comments)mark madsen
 
Disruptive Innovation: how do you use these theories to manage your IT?
Disruptive Innovation: how do you use these theories to manage your IT?Disruptive Innovation: how do you use these theories to manage your IT?
Disruptive Innovation: how do you use these theories to manage your IT?mark madsen
 
Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecturemark madsen
 
6 TIPS to SURVIVE the 2nd MACHINE AGE
6 TIPS to SURVIVE the 2nd MACHINE AGE6 TIPS to SURVIVE the 2nd MACHINE AGE
6 TIPS to SURVIVE the 2nd MACHINE AGEFloown
 
Third Nature - Open Source Data Warehousing
Third Nature - Open Source Data WarehousingThird Nature - Open Source Data Warehousing
Third Nature - Open Source Data Warehousingmark madsen
 

Destaque (10)

Rails on Oracle 2011
Rails on Oracle 2011Rails on Oracle 2011
Rails on Oracle 2011
 
Everything has changed except us
Everything has changed except usEverything has changed except us
Everything has changed except us
 
How to Adopt Agile at Your Organization
How to Adopt Agile at Your OrganizationHow to Adopt Agile at Your Organization
How to Adopt Agile at Your Organization
 
The State of Open Source BI Adoption
The State of Open Source BI AdoptionThe State of Open Source BI Adoption
The State of Open Source BI Adoption
 
Bi isn't big data and big data isn't BI (updated)
Bi isn't big data and big data isn't BI (updated)Bi isn't big data and big data isn't BI (updated)
Bi isn't big data and big data isn't BI (updated)
 
On the edge: analytics for the modern enterprise (analyst comments)
On the edge: analytics for the modern enterprise (analyst comments)On the edge: analytics for the modern enterprise (analyst comments)
On the edge: analytics for the modern enterprise (analyst comments)
 
Disruptive Innovation: how do you use these theories to manage your IT?
Disruptive Innovation: how do you use these theories to manage your IT?Disruptive Innovation: how do you use these theories to manage your IT?
Disruptive Innovation: how do you use these theories to manage your IT?
 
Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecture
 
6 TIPS to SURVIVE the 2nd MACHINE AGE
6 TIPS to SURVIVE the 2nd MACHINE AGE6 TIPS to SURVIVE the 2nd MACHINE AGE
6 TIPS to SURVIVE the 2nd MACHINE AGE
 
Third Nature - Open Source Data Warehousing
Third Nature - Open Source Data WarehousingThird Nature - Open Source Data Warehousing
Third Nature - Open Source Data Warehousing
 

Semelhante a Determine the Right Analytic Database: A Survey of New Data Technologies

Open Networking: Investing for Fun and Profit
Open Networking: Investing for Fun and ProfitOpen Networking: Investing for Fun and Profit
Open Networking: Investing for Fun and ProfitOpen Networking Summits
 
Tsinghua University: Two Exemplary Applications in China
Tsinghua University: Two Exemplary Applications in ChinaTsinghua University: Two Exemplary Applications in China
Tsinghua University: Two Exemplary Applications in ChinaDataStax Academy
 
Complicating Complexity: Performance in a New Machine Age
Complicating Complexity: Performance in a New Machine AgeComplicating Complexity: Performance in a New Machine Age
Complicating Complexity: Performance in a New Machine AgeMaurice Naftalin
 
invited speech at Ge2013, Udine 2013
invited speech at Ge2013, Udine 2013 invited speech at Ge2013, Udine 2013
invited speech at Ge2013, Udine 2013 Roberto Siagri
 
Everything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBEverything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBjhugg
 
Moore's Law Observations from 2009
Moore's Law Observations from 2009Moore's Law Observations from 2009
Moore's Law Observations from 2009Sameħ Galal
 
DARPA ERI Summit 2018: The End of Moore’s Law & Faster General Purpose Comput...
DARPA ERI Summit 2018: The End of Moore’s Law & Faster General Purpose Comput...DARPA ERI Summit 2018: The End of Moore’s Law & Faster General Purpose Comput...
Determine the Right Analytic Database: A Survey of New Data Technologies