Big “Unstructured” Data
                        A Case for Optimized Object Storage

                                  Paul Speciale




Friday, July 27, 2012
Storage facts and trends

       Recent studies estimate that data storage capacities will likely increase by over
         30X in the coming decade to over 35 Zettabytes



       [Chart: storage consumption over time, growing 30X to 35ZB by 2020. Callouts: high-capacity drives, less staff / TB, unstructured data]


Storage facts and trends


       But the number of qualified people available to manage this huge volume
             of data will stay nearly flat (~1.5X)
       Each administrator will be expected to manage 20X more data
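
The 20X figure follows from dividing the two growth factors quoted on these slides. A quick check, treating 30X and 1.5X as exact purely for illustration (the variable names are not from the deck):

```python
# Growth factors quoted on the slides (treated as exact for illustration).
data_growth = 30.0    # growth in storage capacity over the decade
staff_growth = 1.5    # growth in qualified storage administrators

# Data each administrator must manage, relative to today.
data_per_admin_growth = data_growth / staff_growth
print(f"Data per administrator grows about {data_per_admin_growth:.0f}X")
```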

       [Chart: capacity / budget vs. time, with curves for storage requirements and storage budget. Efficiency: automate & reduce overhead]

Storage facts and trends

        •   Much of that growth (80%) is driven by unstructured data
        •   Billions of large objects and files


             Examples: media archives, online images, large files, medical images, online storage, online movies




Storage facts and trends:
    Media & Entertainment Industry Example
    M&E is driving huge capacity requirements through larger files, more files,
      and more storage capacity in use, all driven by HD and 3D video formats:


        “Petabytes are peanuts”
        3TB per hour for 4K video
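
To put the 3TB-per-hour figure in perspective, the sketch below estimates how much 4K footage fits in one petabyte. It assumes decimal units (1 PB = 1000 TB) and takes the slide's rate at face value; the function name is illustrative only.

```python
# Figure quoted on the slide: roughly 3 TB of storage per hour of 4K video.
TB_PER_HOUR_4K = 3.0
TB_PER_PB = 1000.0  # assumption: decimal units (1 PB = 1000 TB)


def hours_of_4k_per_capacity(petabytes: float) -> float:
    """Return how many hours of 4K footage fit in the given capacity."""
    return petabytes * TB_PER_PB / TB_PER_HOUR_4K


hours = hours_of_4k_per_capacity(1.0)
print(f"1 PB holds about {hours:.0f} hours of 4K video")
```

Under these assumptions a full petabyte holds only a few hundred hours of footage, which is the sense in which "petabytes are peanuts".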




Big Data for Analytics vs.
                        Big “Unstructured” Data




Big Data for Analytics


                             •   In the 1990s, we experienced an explosion of
                                 data captured for analytics purposes:
                                 •   Academic Research
                                 •   Chemical R&D facilities
                                 •   Travel industry
                                 •   Geo-industry, oil & gas
                                 •   Financial / Trading
                                 •   Agriculture
                             •   In the 2000s, online applications &
                                 social media triggered a flood of trend
                                 data




Big Data for Analytics


        • Data is captured as many small log files
          & concatenated as “Big Data”

        • Relational databases were not optimal:
             •   Too much data, too big
             •   Insufficient performance for analytics


        • This stimulated innovations:
             •   Hadoop, MapReduce, GFS
             •   XML databases


        • => This is Big Data for Analytics



Big Data Evolution	

    •   Today, the Big Data trend covers both Big Data for
        Analytics & Big Unstructured Data:
         • Media
         • Streaming
         • Business
         • Scientific

    •   Fundamentally different data, but with many
        similarities:
          • Immense capacities
          • Number of transactions or objects

    •   Unstructured data is traditionally stored on host
        file systems, but:
          • Host file systems impose fixed limits and do not
             scale up to the size we need
          • File systems do not meet performance
             requirements because the host limits access

Big Unstructured Data

       •   Most unstructured data is archived, often to tape for cost reasons,
           and is then difficult to access

       •   Volumes are increasing exponentially

       •   Data archives are an organization & management burden
           (Grandma’s Attic)




Big Unstructured Data

    • Companies are starting to see the value of the
      data in their archives:
         •   Documents of individuals can be valuable for others
         •   Some companies have legal reasons to keep data available
         •   Unexplored analytics opportunities
         •   This data can be mined and monetized




Big Unstructured Data




                            But how do we store all this data in a
                             cost-efficient way?

                            “Building cost-efficient Live
                              Archives”




Big Unstructured Data

                        What are the requirements?

                          •   Tape is a difficult option: access latency is key
                          •   Data has to be always available online
                          •   Direct interface to the applications
                          •   Petabyte scalability
                          •   Extreme reliability, integrity
                          •   Cost-efficient
                          •   Security

                            Disk Storage (online, low-latency access)
                          + Open application APIs (App & Cloud-enabled)
                          + Ultra-high data durability (Erasure Coding)
                          = Optimized Object Storage


Disk vs. Tape

                        Tape has several obvious advantages over disk
                          & there will always be use cases for tape

                        But disks enable live archives with instant data
                          accessibility

                        More arguments for disk-based archives
                            •   Disks can be powered down
                            •   Tape requires replication to protect against media errors
                            •   Data integrity checking
                            •   Massive migration projects
                            •   …




Object Storage Simplifies this Problem

       •   File System organization of data becomes a burden
              • File systems impose limitations on numbers of files & directories
              • Very time-consuming to organize data
       •   Object Storage simplifies this problem
              • Flat “Namespaces” (not file systems) - without storage limits
              • Lets the applications talk directly to the Storage
              • Use “Object” application APIs to let applications directly manage
                 objects & metadata
       •   File Gateways can be used as a transition bridge
              • Bring legacy data and apps into Object Storage

       [Diagram: several applications connect to the storage through an Object API]
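
As a rough illustration of a flat namespace with application-managed objects and metadata, here is a minimal in-memory sketch. The class `FlatObjectStore` and its `put`/`get`/`head` methods are hypothetical names chosen for this example; they are not Amplidata's API or any real product's interface.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class StoredObject:
    data: bytes
    metadata: Dict[str, str] = field(default_factory=dict)
    checksum: str = ""


class FlatObjectStore:
    """Hypothetical in-memory object store with a single flat namespace."""

    def __init__(self) -> None:
        # One dictionary: keys are opaque names, there is no directory tree.
        self._objects: Dict[str, StoredObject] = {}

    def put(self, key: str, data: bytes,
            metadata: Optional[Dict[str, str]] = None) -> str:
        """Store an object under a key and return its SHA-256 checksum."""
        checksum = hashlib.sha256(data).hexdigest()
        self._objects[key] = StoredObject(data, dict(metadata or {}), checksum)
        return checksum

    def get(self, key: str) -> bytes:
        """Return the object's bytes; raises KeyError if the key is unknown."""
        return self._objects[key].data

    def head(self, key: str) -> Dict[str, str]:
        """Return a copy of the metadata without touching the payload."""
        return dict(self._objects[key].metadata)


store = FlatObjectStore()
# The slash in the key is just a character; it implies no directory.
store.put("scans/patient-0042.dcm", b"\x00\x01\x02", {"type": "medical-image"})
print(store.head("scans/patient-0042.dcm"))
```

The application addresses each object by key and attaches its own metadata, so there is no directory hierarchy to plan or reorganize as the number of objects grows.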

Petabyte Scalability and Beyond

  Systems should scale BIG
     • Beyond petabytes of data – no built-in limits
     • Beyond billions of data objects

  Systems should scale uniformly
     • Add resources incrementally and grow as a Single System View
     • Manage from a “Single Pane of Glass”
     • Scale performance and capacity separately
     • Migration and seamless growth across newer generations of component
       technologies (processors, disk densities)




Ultra-High Levels of Data Integrity

   • Data needs to be archived for lifetimes
      • Expect “bit perfect” integrity to store gold-copy of critical assets
      • Consolidate multiple copies of data into a single highly-durable tier
   • Ensuring the integrity of a long-term unstructured data archive requires
     new data protection algorithms, to:
      • Address the increasing capacity of disk drives
      • Solve issues related to long RAID rebuild windows
   “Object storage systems based on erasure-coding can not only protect data from
     higher numbers of drive failures, but also against the failure of entire storage
     modules.”
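
The trade-off behind that claim can be sketched numerically. The example assumes an erasure code in which any k of the k+m fragments are enough to rebuild an object (as with Reed-Solomon style codes), with each fragment on a separate drive. The 16+4 policy and the three-copy comparison are hypothetical values chosen for illustration, not figures from the deck.

```python
def erasure_profile(data_fragments: int, parity_fragments: int) -> dict:
    """Summarize a k+m erasure code where any k fragments rebuild the object."""
    total = data_fragments + parity_fragments
    return {
        "fragments": total,
        "tolerated_failures": parity_fragments,
        "raw_per_usable": total / data_fragments,
    }


def replication_profile(copies: int) -> dict:
    """Summarize plain replication: one surviving copy is enough."""
    return {
        "fragments": copies,
        "tolerated_failures": copies - 1,
        "raw_per_usable": float(copies),
    }


# Hypothetical policies, for comparison only.
ec = erasure_profile(data_fragments=16, parity_fragments=4)
rep = replication_profile(copies=3)
print("erasure coding:", ec)
print("replication:   ", rep)
```

Under these assumptions the erasure-coded layout survives more simultaneous drive failures than three-way replication while consuming far less raw capacity per usable byte.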




Big Unstructured Data

                        What are the requirements?

                          •   Tape is a difficult option: access latency is key
                          •   Data has to be always available online
                          •   Direct interface to the applications
                          •   Petabyte scalability
                          •   Extreme reliability, integrity
                          •   Cost-efficient
                          •   Security

                            Disk Storage (online, low-latency access)
                          + Open application APIs (App & Cloud-enabled)
                          + Ultra-high data durability (Erasure Coding)
                          = Optimized Object Storage



Thank You!


     Paul Speciale, VP Products, Amplidata Inc.	
                                                	
   www.amplidata.com



Sponsored Workshop

25 Favorite Experiences in Tech - from Roadmap 2013Gigaom
 
How Moore’s Law is Influencing Design - from Roadmap 2013
How Moore’s Law is Influencing Design - from Roadmap 2013How Moore’s Law is Influencing Design - from Roadmap 2013
How Moore’s Law is Influencing Design - from Roadmap 2013Gigaom
 



SPONSORED WORKSHOP by Amplidata from Structure:Data 2012:

  • 1. Big “Unstructured” Data: A Case for Optimized Object Storage. Paul Speciale. Friday, July 27, 2012
  • 2. Storage facts and trends: Recent studies estimate that data storage capacities will likely increase by over 30X in the coming decade, to over 35 zettabytes. [Chart: Storage Consumption vs. Time, growing 30X to 35ZB by 2020; labels: High-capacity drives, Less Staff / TB, Unstructured Data]
  • 3. Storage facts and trends: But the number of qualified people to manage this huge volume of data will stay flat (~1.5X). Administrators will be expected to manage 20X more data each. Efficiency: automate & reduce overhead. [Chart: Capacity / Budget vs. Time; curves: Storage Requirements, Storage Budget]
  • 4. Storage facts and trends • Much of that growth (80%) is driven by unstructured data • Billions of large objects and files: media archives, online images, large files, medical images, online storage, online movies
  • 5. Storage facts and trends: Media & Entertainment Industry Example. M&E is driving huge capacity requirements in file sizes, file counts, and storage capacity in use, driven by HD and 3D video formats: “Petabytes are peanuts.” 3TB per hour for 4K video.
  • 6. Big Data for Analytics vs. Big “Unstructured” Data
  • 7. Big Data for Analytics • In the 1990s, we experienced an explosion of data captured for analytics purposes: • Academic Research • Chemical R&D facilities • Travel industry • Geo-industry, oil & gas • Financial / Trading • Agriculture • In the 2000s, online applications & social media triggered a flood of trend data
  • 8. Big Data for Analytics • Data is captured as many small log files & concatenated as “Big Data” • Relational databases were not optimal: • Too much data, too big • Insufficient performance for analytics • This stimulated innovations: • Hadoop, MapReduce, GFS • XML databases • This is Big Data for Analytics
  • 9. Big Data Evolution • Today, the Big Data trend refers to both Big Data for Analytics & Big Unstructured Data: • Media • Streaming • Business • Scientific • Fundamentally different data, but with many similarities: • Immense capacities • Number of transactions or objects • Unstructured data is traditionally stored on host file systems, but: • Host file systems impose fixed limits and do not scale up to the size we need • File systems do not meet performance requirements because the host limits access
  • 10. Big Unstructured Data • Most unstructured data is archived, often to tape (for cost reasons), and is then difficult to access • Volumes are increasing exponentially • Data archives are an organization & management burden (Grandma’s Attic)
  • 11. Big Unstructured Data • Companies are starting to see the value of the data in their archives: • Documents of individuals can be valuable for others • Some companies have legal reasons to keep data available • Unexplored analytics opportunities • This data can be mined and monetized
  • 12. Big Unstructured Data: But how do we store all this data in a cost-efficient way? “Building cost-efficient Live Archives”
  • 13. Big Unstructured Data: What are the requirements? • Tape is a difficult option: access latency is key • Data has to be always available online • Direct interface to the applications • Petabyte scalability • Extreme reliability, integrity • Cost-efficient • Security. Disk Storage (online, low-latency access) + Open application APIs (App & Cloud-enabled) + Ultra-high data durability (Erasure Coding) = Optimized Object Storage
  • 14. Disk vs. Tape: Tape has several obvious advantages over disk & there will always be use cases for tape. But disks enable live archives with instant data accessibility. More arguments for disk-based archives: • Disks can be powered down • Tape requires replication to protect against media errors • Data integrity checking • Massive migration projects • …
  • 15. Object Storage Simplifies this Problem • File system organization of data becomes a burden • File systems impose limitations on numbers of files & directories • Very time-consuming to organize data • Object Storage simplifies this problem • Flat “Namespaces” (not file systems), without storage limits • Lets the applications talk directly to the storage • Use “Object” application APIs to let applications directly manage objects & metadata • File Gateways can be used as a transition bridge • Bring legacy data and apps into Object Storage [Diagram: several applications connected to the storage through an Object API]
  • 16. Petabyte Scalability and Beyond. Systems should scale BIG: • Beyond petabytes of data, with no built-in limits • Beyond billions of data objects. Systems should scale uniformly: • Add resources incrementally and grow as a Single System View • Manage from a “Single Pane of Glass” • Scale performance and capacity separately • Migration and seamless growth across newer generations of component technologies (processors, disk densities)
  • 17. Ultra-High Levels of Data Integrity • Data needs to be archived for lifetimes • Expect “bit perfect” integrity to store the gold copy of critical assets • Consolidate multiple copies of data into a single highly durable tier • Ensuring the integrity of a long-term unstructured data archive requires new data protection algorithms, to: • Address the increasing capacity of disk drives • Solve issues related to long RAID rebuild windows. “Object storage systems based on erasure-coding can not only protect data from higher numbers of drive failures, but also against the failure of entire storage modules.”
  • 18. Big Unstructured Data: What are the requirements? • Tape is a difficult option: access latency is key • Data has to be always available online • Direct interface to the applications • Petabyte scalability • Extreme reliability, integrity • Cost-efficient • Security. Disk Storage (online, low-latency access) + Open application APIs (App & Cloud-enabled) + Ultra-high data durability (Erasure Coding) = Optimized Object Storage
  • 19. Thank You! Paul Speciale, VP Products, Amplidata Inc. www.amplidata.com
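
The erasure-coding idea from slide 17 can be illustrated with a minimal sketch. The deck does not say which code Amplidata uses, so this example assumes the simplest possible scheme: k data fragments plus a single XOR parity fragment, which can rebuild only one lost fragment. Systems of the kind the deck describes tolerate more failures than this. All function names here (`encode`, `recover`, `decode`, `storage_overhead`) are hypothetical and chosen for illustration only.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal fragments (zero-padded) plus one XOR parity fragment.

    Assumption: single parity is a stand-in for a real erasure code.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def recover(fragments: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing fragment (marked None) from the survivors."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild only one lost fragment")
    rebuilt = list(fragments)
    if missing:
        survivors = [f for f in fragments if f is not None]
        # XOR of all surviving fragments equals the missing one.
        rebuilt[missing[0]] = reduce(xor_bytes, survivors)
    return rebuilt


def decode(fragments: List[Optional[bytes]], original_len: int) -> bytes:
    """Recover if needed, drop the parity fragment, and strip the padding."""
    full = recover(fragments)
    return b"".join(full[:-1])[:original_len]


def storage_overhead(k: int, m: int) -> float:
    """Raw capacity consumed per unit of user data with k data and m parity fragments."""
    if k <= 0 or m < 0:
        raise ValueError("need k > 0 and m >= 0")
    return (k + m) / k


if __name__ == "__main__":
    payload = b"unstructured data"
    stored = encode(payload, k=4)
    stored[2] = None  # simulate one failed drive
    assert decode(stored, len(payload)) == payload
    print(storage_overhead(4, 1))
```

In this sketch, losing any one of the five fragments leaves the data readable at a raw-capacity cost of 1.25 times the user data, whereas keeping two full copies to survive one loss would cost 2 times. That trade-off is the reason the deck pairs erasure coding with cost-efficient live archives.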