Big Data overview
ESCEN
Alexis Roos
Senior Sales Engineer / Architect
© Copyright 2013, Alexis Roos, alexis.roos@gmail.com
Course objectives
● Give you a map / big picture and pointers to be
able to drill down as you need
● Will cover the business side but also the
technology: without a good technical
understanding, it is not possible to grasp the
business side
● Will go over landscape and possibilities and
illustrate with a good number of use cases
Proposed Agenda
● What is Big Data?
● Big Data landscape (Tech heavy)
● Business / Use cases
● Discussion
Big Data
Data and Big Data
● Data is the basis for Information
Economics now make it possible to store virtually
unlimited data
● "'Big data' is high-volume, -velocity and -variety
information assets that demand cost-effective,
innovative forms of information processing for
enhanced insight and decision making."
Gartner's definition.
http://www.youtube.com/watch?v=ah14LEFKe8Q
Year    Cost of 1 GB
1980    $3,000,000
1990    $8,000
2000    $30
2010    $0.08
Data – Information processing 1/2
● Through processing, data becomes
information (knowledge); knowledge
creates insight, and insight = success.
● Transaction processing:
A sequence of information exchange and related work
that is treated as a unit for the purposes of satisfying a
request (usually human but not exclusively)
Also known as Online Transaction Processing (OLTP)
Example: you buy an item on Amazon:
. Item is placed on hold in Inventory system
. Item is placed in shopping cart
. System requests CC payment authorization for item
. If payment is approved, CC is charged, item is removed from
inventory and shipped.
-> all of the above or nothing (roll back)
Data – Information processing 2/2
● Real Time processing
Perceived as "immediate" from the originator
Ex: trading, payment, online booking, "right" ad
delivery, gaming, etc.
● Batch processing:
Delayed Execution of a series of programs ("jobs") on
a computer without manual intervention.
Ex: billing, virus scanning, web indexing, data mining,
analytics, etc.
Data – ACID Transaction
● Technical definition:
● Atomicity: each transaction is all or nothing
● Consistency: each transaction leaves the data
in a state that satisfies the data rules
● Isolation: Ensures that each transaction is
kept isolated from others
● Durability: Once a transaction has been
committed, it will remain so, even in the event
of power loss, crashes, or errors
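The "all of the above or nothing" behaviour of the purchase example can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the table names, the `buy_item` helper and the `card_approved` flag are assumptions made up for the example.

```python
import sqlite3

def make_shop():
    """Create a hypothetical in-memory shop with 5 units of item 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory (item_id INTEGER PRIMARY KEY, stock INTEGER)")
    conn.execute("CREATE TABLE orders (item_id INTEGER)")
    conn.execute("INSERT INTO inventory VALUES (1, 5)")
    conn.commit()
    return conn

def buy_item(conn, item_id, card_approved):
    """Reserve stock and record the order as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE inventory SET stock = stock - 1 WHERE item_id = ?",
                (item_id,),
            )
            if not card_approved:
                raise RuntimeError("payment declined")
            conn.execute("INSERT INTO orders (item_id) VALUES (?)", (item_id,))
        return True
    except RuntimeError:
        return False

def stock(conn, item_id):
    """Return the current stock level for an item."""
    row = conn.execute(
        "SELECT stock FROM inventory WHERE item_id = ?", (item_id,)
    ).fetchone()
    return row[0]
```

When the payment is declined, the stock decrement is rolled back together with everything else in the transaction, so the inventory never shows an item held for an order that does not exist.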
Big Data - Applications
● Find deeper insight in data:
customers, partners and business.
All industries will be affected.
"Software is eating the world"
● Retail: buying patterns, store traffic, etc
● Logistics: track and optimize shipments, etc
● Healthcare: preventive medicine, disease
management, etc.
● Social media: optimize usage, ads, etc.
● Finance: buying patterns, portfolio optimization
http://www.youtube.com/watch?v=7D1CQ_LOizA
http://online.wsj.com/article/SB10001424053111903480904576512250915629460.html
Big Data – Three dimensions
● Volume
● Amount of data
● Velocity
● Speed at which it arrives
● Variety
● Types of data
Big Data – Volume/Size matters
Name             Value   Example
kilobyte (kB)    10^3    Email (7 kB), images, web pages
megabyte (MB)    10^6    Ebooks, MP3, SD video, etc.
gigabyte (GB)    10^9    HD movie
terabyte (TB)    10^12   For a single journey across the Atlantic Ocean, a four-engine jumbo jet can create 640 terabytes of data
petabyte (PB)    10^15   FB has over 1.5 PB of stored photos
exabyte (EB)     10^18   Seagate Technology reported selling 330 exabytes worth of hard drives during the 2011 fiscal year
zettabyte (ZB)   10^21   Worldwide production and consumption of data. According to International Data Corporation, the total amount of global data is expected to grow to 2.7 zettabytes during 2012
yottabyte (YB)   10^24   Not there yet
...
http://en.wikipedia.org/wiki/Zettabyte http://www.youtube.com/watch?v=CsVYID9rMGE
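To get a feel for the scale of these units, a small helper can convert raw byte counts into the decimal prefixes used in the table. The function name `human_size` is hypothetical; the sketch assumes powers of 1000, as in the table, rather than powers of 1024.

```python
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def human_size(num_bytes):
    """Format a byte count using decimal (power-of-1000) prefixes."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value drops below 1000,
        # or at the largest unit we know about.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000

print(human_size(7000))           # a typical email
print(human_size(640 * 10**12))   # the jumbo jet example
print(human_size(27 * 10**20))    # the IDC estimate for 2012
```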
Big Data - Speed
● How fast is new data coming?
● How does this data need to be used or
correlated?
● How long is data valuable?
● How fast does data need to be
processed?
● This dimension in particular will affect the
system architecture
Big Data - Variety
● What type(s) / format(s)?
● Human or machine generated
● Text, location, document, picture, video, click streams,
log file, event, etc.
● Is it structured or unstructured?
● Static vs dynamic
● What are relationships/dependencies
within data elements?
Proposed Agenda
● What is Big Data?
● Big Data landscape
● Business / Use cases
● Discussion
Big Data landscape
Big Data applications are roughly built using
three technology layers:
● Storage
● Analytics
● Visualization
For whom?
Big Data landscape
● Storage
● Analytics
● Visualization
Big Data – Storage
● Main logical data models:
● Tabular (represented by rows and
columns) - Relational model
● Tree (a set of nodes with parent-children
relationship)
● Graph structure (a set of interconnected
nodes)
● Document (free structure /
unstructured / schema less)
Big Data – Storage
● Physical data models:
● Relational Database Management Systems (RDBMS)
support ACID transactions and joins. They use the
SQL language as API.
● Key-value systems basically support get, put, and delete
operations based on a primary key.
● Column-oriented systems still use tables but have no joins
(joins must be handled within your application). They store
data by column, as opposed to traditional row-oriented
databases, which makes aggregations over a column much easier.
● Document-oriented systems store structured "documents"
such as JSON or XML but have no joins (joins must be
handled within your application). It's very easy to map data
from object-oriented software to these systems.
http://nosql-database.org/
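Two of these physical models can be sketched in a few lines. The `KeyValueStore` class and the sample records below are made up for illustration and do not mirror any particular product; the point is the shape of the API (get, put, delete by key) and why a column layout helps aggregations.

```python
class KeyValueStore:
    """Toy in-memory key-value store: get, put and delete by primary key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)

# The same three records in a row-oriented layout...
rows = [
    {"id": 1, "city": "Paris", "amount": 30},
    {"id": 2, "city": "Lyon", "amount": 12},
    {"id": 3, "city": "Paris", "amount": 8},
]

# ...and in a column-oriented layout.
columns = {
    "id": [1, 2, 3],
    "city": ["Paris", "Lyon", "Paris"],
    "amount": [30, 12, 8],
}

total_from_rows = sum(row["amount"] for row in rows)  # walks every record
total_from_columns = sum(columns["amount"])           # reads a single column
```

Both totals are identical, but the column layout only has to read the one column being aggregated, which is the property the slide refers to.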
Big Data - Storage
● Not practical to store data on 1 system,
but distributing data creates complexity:
● Consistency: means that each client always has the
same view of the data.
● Availability: means that all clients can always read
and write.
● Partition tolerance: means that the system works
well across physical network partitions.
● If the data is distributed, it is only possible
to guarantee 2 out of the 3 properties (known as
the CAP theorem): CA, AP or CP.
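The trade-off can be illustrated with a toy model of two replicas whose link may be cut. This is only a sketch under simplifying assumptions (two nodes, synchronous replication, a `partitioned` flag set by hand); the class names are hypothetical and no real database works this simply.

```python
class Replica:
    """One node holding a copy of the data."""

    def __init__(self):
        self.data = {}

class TwoNodeStore:
    """Two replicas joined by a link that can be cut (a network partition)."""

    def __init__(self, mode):
        assert mode in ("CP", "AP")
        self.mode = mode
        self.nodes = [Replica(), Replica()]
        self.partitioned = False

    def write(self, node, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Refuse the write: data stays consistent, but the
                # system is not available for writes.
                return False
            # AP: accept the write locally; the replicas now diverge.
            self.nodes[node].data[key] = value
            return True
        # No partition: replicate to every node.
        for replica in self.nodes:
            replica.data[key] = value
        return True

    def read(self, node, key):
        return self.nodes[node].data.get(key)
```

In CP mode a client is turned away during the partition; in AP mode every client is served, but two clients talking to different nodes may see different values until the partition heals.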
Big data - Storage
Source: http://blog.nahurst.com/visual-guide-to-nosql-systems
Big Data - Storage
Source: A.T. Kearney
Big Data landscape
● Storage
● Analytics
● Visualization
Big Data – Analytics
● Process of examining large amounts of
data of a variety of types to uncover
hidden patterns, unknown correlations and
other useful information resulting in
business benefits, such as more effective
marketing or increased revenue.
● Can work on all forms of data as
described before
● Can involve Transactions, Real Time
and/or Batch Oriented
Big Data – Analytics
● "Stages" of analytics:
● Business monitoring: traditional BI,
Charting, Key Performance Indicator,
etc.
● Business insights: uses statistics, data
mining, predictive analysis to generate
actionable insights: "Intelligent
dashboards". Leverages trending,
classification, optimization, simulations.
● Business transformation based on data
Big Data – Analytics
● Anticipate and predict
Big Data – Analytics
● Traditional predictive analytics and data
mining are designed for relational data or
structured data so a whole set of new
technologies have evolved for
unstructured data.
● Hadoop (batch oriented): "brute force"
● Real Time processing (new trend):
optimized for specific use cases
● Machine learning: data intensive
Big Data – Hadoop
● Designed for large-scale (hundreds of
terabytes of data) batch-oriented
information processing: archiving,
transformation, exploration, etc.
● Reliable while using commodity HW and
open source
● Main components:
● Distributed File System (HDFS)
● Map Reduce: distributed data processing
● Associated infrastructure components, query
mechanisms and machine learning
Big Data – Hadoop Example
● Derive meaning from logs:
● Who is using the web site?
IP, location, device, etc.
● What pages are they looking at?
How long, how often?
● Are they buying?
Adding products to cart?
Checking out?
● What are the trends?
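The Map Reduce idea behind such a log analysis can be sketched in plain Python, without Hadoop itself. The log format ("IP path") and the sample lines are assumptions made up for the example; on a real cluster the map and reduce steps would run in parallel across many machines over files stored in HDFS.

```python
from collections import defaultdict

# Hypothetical, simplified log lines: "<ip> <path>"
LOG_LINES = [
    "10.0.0.1 /home",
    "10.0.0.2 /cart",
    "10.0.0.1 /cart",
    "10.0.0.3 /home",
    "10.0.0.1 /checkout",
]

def map_phase(line):
    """Map: emit a (page, 1) pair for every log line."""
    _ip, path = line.split()
    yield path, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single count."""
    return key, sum(values)

def page_views(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

print(page_views(LOG_LINES))
```

Answering "what pages are they looking at?" is then a matter of choosing the key emitted by the map step; emitting the IP address instead of the path would count visits per user.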
Big Data – Real-Time
● Goal is to process data from highly
dynamic sources in real time
● Data is typically streaming to the
processing system and stored / processed
directly into memory
● Complex Event Processing has been
around for years, but Big Data scale and
distributed processing need a new
architecture: Storm and Kafka are among the
frameworks that could become the "Hadoop"
of Real-Time
Big Data – Real-Time Example
● Derive meaning from tweets:
● How well is the brand trending?
● By time, category?
● Compared to competitors
● Sentiment?
● etc
http://www.filtize.com/
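"Trending by time" usually means counting events inside a moving time window held in memory. The sketch below is one simple way to do that, assuming events arrive in timestamp order; the class name and the brand labels are hypothetical, and frameworks such as Storm provide their own, distributed versions of this idea.

```python
from collections import Counter, deque

class SlidingWindowCounter:
    """Counts keys seen during the last `window` seconds of a stream."""

    def __init__(self, window):
        self.window = window
        self.events = deque()     # (timestamp, key), oldest first
        self.counts = Counter()

    def add(self, timestamp, key):
        """Record one event, then drop events that left the window."""
        self.events.append((timestamp, key))
        self.counts[key] += 1
        self._expire(timestamp)

    def _expire(self, now):
        while self.events and self.events[0][0] <= now - self.window:
            _, old_key = self.events.popleft()
            self.counts[old_key] -= 1

    def count(self, key):
        return self.counts[key]

mentions = SlidingWindowCounter(window=60)
mentions.add(0, "brandA")
mentions.add(10, "brandB")
mentions.add(30, "brandA")
mentions.add(65, "brandA")   # the mention at t=0 has now expired
```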
Big Data – Machine learning
● What is Machine learning?
Big Data – Machine learning
● "A branch of artificial intelligence, that is
about the construction and study of
systems that can learn from data."
Supports Predictive Analytics
● Can perform tasks that are too difficult to
specify algorithmically
● Example of applications:
● Computer vision, Natural language processing,
Fraud detection, Game playing, Robot locomotion,
Sentiment analysis, Adaptive systems, scientific
applications, anomaly detection, recommendation
engine, personal assistant, etc
Big Data – Example
● Handwriting recognition
● Handcrafting rules results in a large
number of rules and exceptions. It is better to
have a machine that learns from a large
training set.
Big Data – Example
● Computer vision: car detection
● First Learning
● Then Testing: Is this a car?
[Figure: example images labeled "Cars" and "Not a car"]
Big Data – Machine learning
● Supervised or unsupervised learning:
whether we train the model on labeled examples
or the system finds structure on its own
● Types of information processing:
● Supervised
– Classification (discrete)
– Regression (continuous)
● Unsupervised
– Clustering (discrete)
Big Data – Machine learning
Supervised – Classification / Regression
● First teach the model
● Then verify against the model
Big Data – Machine learning
Classification
● Classifier (single or multi class): given some set of
features with corresponding labels, learn a function to
predict the labels from the features
[Figure: scatter plot of two classes, x and o, over features x1 and x2]
Big Data – Machine learning
Classification
Many algorithms to choose from:
● SVM
● Neural networks
● Naïve Bayes
● Bayesian network
● Logistic regression
● Random forests
● Boosted Decision Trees
● K-nearest neighbor
● RBMs
● Etc.
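As one concrete instance from this list, k-nearest neighbor is short enough to write out in full. The training points below are invented to resemble the two-class x/o picture; the sketch uses Euclidean distance and a majority vote, which is a common but not the only way to set it up.

```python
from collections import Counter
import math

def knn_predict(train, point, k=3):
    """Return the majority label among the k training examples nearest to point."""
    nearest = sorted(train, key=lambda example: math.dist(example[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled examples: ((x1, x2), label)
TRAIN = [
    ((1.0, 1.0), "o"), ((1.5, 2.0), "o"), ((2.0, 1.5), "o"),
    ((6.0, 6.5), "x"), ((7.0, 6.0), "x"), ((6.5, 7.5), "x"),
]

print(knn_predict(TRAIN, (1.2, 1.4)))
print(knn_predict(TRAIN, (6.8, 6.9)))
```

There is no separate training step here: the "model" is the labeled data itself, which is one reason this algorithm benefits directly from having more data.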
Big Data – Machine learning
Regression
● Regression fits an equation to a dataset in order to
predict values for new data
Example: calculate the price of a house. In reality there is much more
than 1 variable: size, number of floors, # of rooms, age, location, etc
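With a single variable, the house price example reduces to fitting a straight line by least squares. The sizes and prices below are made-up numbers chosen to lie exactly on a line; real data would be noisy and would need the additional variables listed above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: size in square meters, price in thousands
sizes = [50, 80, 100, 120]
prices = [160, 250, 310, 370]

slope, intercept = fit_line(sizes, prices)

def predict_price(size):
    """Predict the price of a house of a size not seen in training."""
    return slope * size + intercept

print(predict_price(90))
```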
Big Data – Machine learning
Clustering
● Clustering places data elements into related
groups without advance knowledge of the group
definitions.
● Example: grouping similar profiles in a social network
● K-means is a popular algorithm for clustering
http://en.wikipedia.org/wiki/K-means_clustering
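K-means alternates two steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. The sketch below is a bare-bones version with a fixed number of iterations and a naive initialization (the first k points), both simplifying assumptions; the sample "profiles" are invented two-dimensional points.

```python
import math

def kmeans(points, k, iterations=10):
    """Plain k-means; returns (centroids, clusters)."""
    centroids = list(points[:k])  # naive, deterministic initialization
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical profiles described by two numeric features
PROFILES = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]

centroids, clusters = kmeans(PROFILES, k=2)
```

No labels are supplied anywhere: the two groups emerge from the distances alone, which is what makes this unsupervised.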
Big Data – Machine learning
● Predictive analytics techniques usage
Big Data – Machine learning
● Designing a high accuracy learning system
“It’s not who has the best algorithm that wins.
It’s who has the most data.”
Ex: Classify between confusable words.
{to, two, too}, {then, than}
For breakfast I ate _____ eggs.
[Figure: accuracy vs. training set size (millions) for four algorithms:
Perceptron (Logistic regression), Winnow, Memory-based, Naïve Bayes]
Big Data landscape
● Storage
● Analytics
● Visualization
Big Data – Visualization
● Help overcome information overload
● Makes it possible to see patterns and connections,
both instantly and over time
● Focus on specific parts of data but also in
relation to other parts: data is relative
● Many different tools and techniques can
be used based on data sets
http://www.ted.com/talks/david_mccandless_the_beauty_of_data_visualization.html
http://www.ted.com/talks/joann_kuchera_morin_tours_the_allosphere.html
Big Data – Visualization
● Many different types available:
● 1D, 2D, 3D
● Temporal: timeline, time series, etc
● Advanced types: tag cloud, bubble
chart, network graph, rose chart,
spider chart, heatmap, tree map,
dependency graph, etc.
● Can allow interactivity (navigate, zoom
in/out, slice and dice, etc).
http://guides.library.duke.edu/vis_types
Big Data – Visualization Examples
https://developers.google.com/maps/tutorials/visualizing/earthquakes
http://www.webdesignerdepot.com/2009/06/50-great-examples-of-data-visualization/
Proposed Agenda
● What is Big Data?
● Big Data landscape
● Business / Use cases
● Discussion
Big Data – A Word on Privacy
● Currently mostly ignored: Big Brother?
● Everything is being stored (data retention)
– Location, calls, SMS, searches, web access,
transactions, applications used, contacts, calendar, etc.
● Data doesn't belong to you (Facebook, etc)
and may be resold (based on privacy policy)
● Apps can read your calendar, contacts, etc.
and upload data on their server
● For now users do not seem to care:
they care about the service and about it being free (as in $).
Your phone company is watching
Google's drive privacy article
Who's afraid of the bad, big data?
Big Data – And Social Media
An opportunity to:
● Identify trends: tweets, likes, blogs, page
views, etc
● Pinpoint problems: social media data can be
used to get sentiment / feedback on products
/ brands / events (even real-time)
● Predict behavior: what is the trend over time and
how does it correlate to particular events?
Big Data – Not just 1 device
http://www.smartinsights.com/mobile-marketing/mobile-marketing-analytics/mobile-marketing-statistics/
Big Data – Mobile is growing faster
http://www.smartinsights.com/mobile-marketing/mobile-marketing-analytics/mobile-marketing-statistics/
Big Data – Shopping habits
Big Data - Business models
● Data is the "new oil"
● Every day, 2.5 quintillion bytes of data are created, with 90
percent of the world's data created in the past two years alone.
● Data production will be 44 times greater in 2020 than in 2009.
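These figures can be turned into more familiar terms with a little arithmetic. The sketch assumes steady compound growth between 2009 and 2020, which the forecast itself does not state, so the resulting yearly rate is only an illustration.

```python
def annual_growth_rate(factor, years):
    """Yearly growth rate that compounds to `factor` over `years` years."""
    return factor ** (1 / years) - 1

# 44 times more data in 2020 than in 2009, i.e. over 11 years
rate = annual_growth_rate(44, 2020 - 2009)
print(f"implied growth: about {rate:.0%} per year")

# 2.5 quintillion bytes per day, expressed in exabytes (10^18 bytes)
bytes_per_day = 2.5e18
exabytes_per_day = bytes_per_day / 10**18
print(f"{exabytes_per_day} EB per day")
```

Under that assumption, the volume of data produced grows by roughly 41 percent every year.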
Big Data - Business models
● Data is the new business model as:
● The cost of the HW, SW and networks required to
produce and transport data continues to
approach zero
● Even in the physical manufacturing world,
cost will go down: robotics, 3D printing, etc.
● Data creates insight, which makes it possible to
enhance and disrupt existing business models
Big Data - Business models
● Opportunities for:
● Web businesses
To increase ARPU (average revenue per user)
● Enterprises
Serve their customers better and improve management of
suppliers and partners
● IoT
Internet of Things (IoT) or M2M (Machine To Machine) for
instance will allow brand new capabilities and services
Big Data - Business models
● Already used by web businesses (Google,
Facebook, etc.) and moving to enterprises
Big Data - Web
● More data can yield more insight, which leads to
increased ARPU
● Ex: Ad platform
Advertisers define ads and campaigns available
across web, mobile, TV, etc.
On Google properties, Google makes money each
time an ad is clicked (CPC, cost per click). On network
members' and content providers' sites, Google makes money
each time an ad is clicked or displayed (CPM, cost per
thousand impressions)
-> Increased relevance and knowledge of the user lead to
increased revenues
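The two pricing models combine into a simple revenue formula. The function name and all the numbers below are made up for illustration; the only fixed convention is that a CPM price applies per 1,000 impressions.

```python
def ad_revenue(clicks, cpc, impressions, cpm):
    """Revenue from click-priced ads plus impression-priced ads.

    cpc: price paid per click
    cpm: price paid per 1,000 impressions
    """
    return clicks * cpc + (impressions / 1000) * cpm

# Hypothetical campaign: 200 clicks at 0.50 and 100,000 impressions at a CPM of 2.00
baseline = ad_revenue(clicks=200, cpc=0.5, impressions=100_000, cpm=2.0)

# Better targeting doubles the clicks for the same number of impressions
better_targeting = ad_revenue(clicks=400, cpc=0.5, impressions=100_000, cpm=2.0)
```

The click-priced part of the revenue scales directly with how often users find an ad relevant enough to click, which is why more knowledge about the user translates into more revenue.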
Big Data - Enterprises
● All Industries are being disrupted
Big Data - Enterprises
● Differentiation: satisfy customers, improve
existing services and create new service
offerings
● Improve processes: merchandising,
forecasting, and purchasing to distribution,
allocation, and transportation, etc.
● Data as a service: resell information,
analysis and insights
Big Data - Enterprises
Big Data - IoT
More and more machines are connecting
and generating data
Big Data - IoT
http://harborresearch.com/wp-content/uploads/2012/05/HarborResearch-nPhase_Paper_March-2011.pdf
Big Data - IoT
http://www.slideshare.net/harborresearch/harbor-research-introduction-to-smart-business-m2-m
Big Data – IoT and Healthcare
Home Healthcare / Tele-Health
● Business and Technology trends
● Aging Population
● Increase in Chronic Illnesses
● Demand from patients for home environment and
independence
● Cost pressure and scarcity of hospital beds
● Affordable and available telecommunications
● Computing advances: cost, size, power,
performance, imaging, etc.
Proposed Agenda
● What is Big Data?
● Big Data landscape
● Business / Use cases
● Discussion

Mais conteúdo relacionado

Mais procurados

Data science Applications in the Enterprise
Data science Applications in the EnterpriseData science Applications in the Enterprise
Data science Applications in the EnterpriseSrinath Perera
 
Big data, Machine learning and the Auditor
Big data, Machine learning and the AuditorBig data, Machine learning and the Auditor
Big data, Machine learning and the AuditorBharath Rao
 
intro_to_business_analytics_and_data_science_ver 1.0
intro_to_business_analytics_and_data_science_ver 1.0intro_to_business_analytics_and_data_science_ver 1.0
intro_to_business_analytics_and_data_science_ver 1.0Anthony Paulus
 
Importance of data analytics for business
Importance of data analytics for businessImportance of data analytics for business
Importance of data analytics for businessBranliticSocial
 
Business intelligence concepts & application
Business intelligence concepts & applicationBusiness intelligence concepts & application
Business intelligence concepts & applicationnandini patil
 
Data science and business analytics
Data  science and business analyticsData  science and business analytics
Data science and business analyticsInbavalli Valli
 
Big Data Analytics and a Chartered Accountant
Big Data Analytics and a Chartered AccountantBig Data Analytics and a Chartered Accountant
Big Data Analytics and a Chartered AccountantBharath Rao
 
Business intelligence data analytics-visualization
Business intelligence data analytics-visualizationBusiness intelligence data analytics-visualization
Business intelligence data analytics-visualizationMuthu Natarajan
 
Significance of Data Mining
Significance of Data MiningSignificance of Data Mining
Significance of Data Mining8trackweb
 
Data warehouse and data mining
Data warehouse and data miningData warehouse and data mining
Data warehouse and data miningRohit Kumar
 
Fight Fraud with Big Data Analytics
Fight Fraud with Big Data AnalyticsFight Fraud with Big Data Analytics
Fight Fraud with Big Data AnalyticsDatameer
 
Basic analtyics & advanced analtyics
Basic analtyics & advanced analtyicsBasic analtyics & advanced analtyics
Basic analtyics & advanced analtyicsDEEPIKA T
 
Business Intelligence Module 1
Business Intelligence Module 1Business Intelligence Module 1
Business Intelligence Module 1Home
 
The Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningThe Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningSalford Systems
 

Mais procurados (20)

Data science Applications in the Enterprise
Data science Applications in the EnterpriseData science Applications in the Enterprise
Data science Applications in the Enterprise
 
Big data, Machine learning and the Auditor
Big data, Machine learning and the AuditorBig data, Machine learning and the Auditor
Big data, Machine learning and the Auditor
 
intro_to_business_analytics_and_data_science_ver 1.0
intro_to_business_analytics_and_data_science_ver 1.0intro_to_business_analytics_and_data_science_ver 1.0
intro_to_business_analytics_and_data_science_ver 1.0
 
Business intelligence
Business intelligenceBusiness intelligence
Business intelligence
 
Importance of data analytics for business
Importance of data analytics for businessImportance of data analytics for business
Importance of data analytics for business
 
Data Mining
Data MiningData Mining
Data Mining
 
Business intelligence concepts & application
Business intelligence concepts & applicationBusiness intelligence concepts & application
Business intelligence concepts & application
 
Data science and business analytics
Data  science and business analyticsData  science and business analytics
Data science and business analytics
 
Big Data Analytics and a Chartered Accountant
Big Data Analytics and a Chartered AccountantBig Data Analytics and a Chartered Accountant
Big Data Analytics and a Chartered Accountant
 
Data Mining
Data MiningData Mining
Data Mining
 
Road Map for Careers in Big Data
Road Map for Careers in Big DataRoad Map for Careers in Big Data
Road Map for Careers in Big Data
 
Business intelligence data analytics-visualization
Business intelligence data analytics-visualizationBusiness intelligence data analytics-visualization
Business intelligence data analytics-visualization
 
Significance of Data Mining
Significance of Data MiningSignificance of Data Mining
Significance of Data Mining
 
Data warehouse and data mining
Data warehouse and data miningData warehouse and data mining
Data warehouse and data mining
 
Fight Fraud with Big Data Analytics
Fight Fraud with Big Data AnalyticsFight Fraud with Big Data Analytics
Fight Fraud with Big Data Analytics
 
Basic analtyics & advanced analtyics
Basic analtyics & advanced analtyicsBasic analtyics & advanced analtyics
Basic analtyics & advanced analtyics
 
Business Intelligence Module 1
Business Intelligence Module 1Business Intelligence Module 1
Business Intelligence Module 1
 
Seminario Big Data
Seminario Big DataSeminario Big Data
Seminario Big Data
 
The Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningThe Do's and Don'ts of Data Mining
The Do's and Don'ts of Data Mining
 
Class1
Class1Class1
Class1
 

Semelhante a Big Data Overview Guide

BigDataFinal.pptx
BigDataFinal.pptxBigDataFinal.pptx
BigDataFinal.pptxPentaTech
 
The ABCs of Treating Data as Product
The ABCs of Treating Data as ProductThe ABCs of Treating Data as Product
The ABCs of Treating Data as ProductDATAVERSITY
 
Data_and_Analytics_Industry_IESE_v3.pdf
Data_and_Analytics_Industry_IESE_v3.pdfData_and_Analytics_Industry_IESE_v3.pdf
Data_and_Analytics_Industry_IESE_v3.pdfprevota
 
Data analytics career path
Data analytics career pathData analytics career path
Data analytics career pathRubikal
 
Gse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-sharedGse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-sharedcedrinemadera
 
It’s Not About Big Data – It’s About Big Insights - SAP Webinar - 20 Aug 201...
 It’s Not About Big Data – It’s About Big Insights - SAP Webinar - 20 Aug 201... It’s Not About Big Data – It’s About Big Insights - SAP Webinar - 20 Aug 201...
It’s Not About Big Data – It’s About Big Insights - SAP Webinar - 20 Aug 201...Edgar Alejandro Villegas
 
How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)Denodo
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...Denodo
 
March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...
March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...
March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...Experfy
 
A technical Introduction to Big Data Analytics
A technical Introduction to Big Data AnalyticsA technical Introduction to Big Data Analytics
A technical Introduction to Big Data AnalyticsPethuru Raj PhD
 
Business intelligence- Components, Tools, Need and Applications
Business intelligence- Components, Tools, Need and ApplicationsBusiness intelligence- Components, Tools, Need and Applications
Business intelligence- Components, Tools, Need and Applicationsraj
 
Choosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your BusinessChoosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your BusinessChicago Hadoop Users Group
 

Semelhante a Big Data Overview Guide (20)

BigDataFinal.pptx
BigDataFinal.pptxBigDataFinal.pptx
BigDataFinal.pptx
 
The ABCs of Treating Data as Product
The ABCs of Treating Data as ProductThe ABCs of Treating Data as Product
The ABCs of Treating Data as Product
 
Data_and_Analytics_Industry_IESE_v3.pdf
Data_and_Analytics_Industry_IESE_v3.pdfData_and_Analytics_Industry_IESE_v3.pdf
Data_and_Analytics_Industry_IESE_v3.pdf
 
Big data Analytics
Big data AnalyticsBig data Analytics
Big data Analytics
 
Data analytics career path
Data analytics career pathData analytics career path
Data analytics career path
 
Data Analytics Career Paths
Data Analytics Career PathsData Analytics Career Paths
Data Analytics Career Paths
 
Machine Data Analytics
Machine Data AnalyticsMachine Data Analytics
Machine Data Analytics
 
Taming Big Data With Modern Software Architecture
Taming Big Data  With Modern Software ArchitectureTaming Big Data  With Modern Software Architecture
Taming Big Data With Modern Software Architecture
 
Gse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-sharedGse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-shared
 
Seminario Big Data - 27/11/2017
Seminario Big Data - 27/11/2017Seminario Big Data - 27/11/2017
Seminario Big Data - 27/11/2017
 
Trends in data analytics
Trends in data analyticsTrends in data analytics
Trends in data analytics
 
Sea of Data
Sea of DataSea of Data
Sea of Data
 
It’s Not About Big Data – It’s About Big Insights - SAP Webinar - 20 Aug 201...
 It’s Not About Big Data – It’s About Big Insights - SAP Webinar - 20 Aug 201... It’s Not About Big Data – It’s About Big Insights - SAP Webinar - 20 Aug 201...
It’s Not About Big Data – It’s About Big Insights - SAP Webinar - 20 Aug 201...
 
How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
 
Demystifying Data Science
Demystifying Data ScienceDemystifying Data Science
Demystifying Data Science
 
March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...
March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...
March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...
 
A technical Introduction to Big Data Analytics
A technical Introduction to Big Data AnalyticsA technical Introduction to Big Data Analytics
A technical Introduction to Big Data Analytics
 
Business intelligence- Components, Tools, Need and Applications
Business intelligence- Components, Tools, Need and ApplicationsBusiness intelligence- Components, Tools, Need and Applications
Business intelligence- Components, Tools, Need and Applications
 
Choosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your BusinessChoosing the Right Big Data Architecture for your Business
Choosing the Right Big Data Architecture for your Business
 

Último

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Big Data Overview Guide

  • 1. Big Data overview ESCEN Alexis Roos Senior Sales Engineer / Architect © Copyright 2013, Alexis Roos, alexis.roos@gmail.com
  • 2. Course objectives ● Give you a map / big picture and pointers so you can drill down as needed ● Will cover the business side and also the technology, because the business side cannot be grasped without a good technical understanding ● Will go over the landscape and possibilities, illustrated with a good number of use cases
  • 3. Proposed Agenda ● What is Big Data? ● Big Data landscape (Tech heavy) ● Business / Use cases ● Discussion
  • 4. Proposed Agenda ● What is Big Data? ● Big Data landscape ● Business / Use cases ● Discussion
  • 6. Data and Big Data ● Data is the basis for Information. Economics now make it possible to store virtually unlimited data ● "“Big data” is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making." (Gartner's definition) http://www.youtube.com/watch?v=ah14LEFKe8Q ● Cost of 1 GB by year: 1980: $3,000,000; 1990: $8,000; 2000: $30; 2010: $0.08
  • 7. Data – Information processing 1/2 ● Through processing, data becomes Information (knowledge); knowledge creates insight, and insight = success. ● Transaction processing: A sequence of information exchange and related work that is treated as a unit for the purposes of satisfying a request (usually human but not exclusively), also known as Online Transaction Processing or OLTP. Example: you buy an item on Amazon: (1) Item is placed on hold in the inventory system (2) Item is placed in the shopping cart (3) System requests credit card (CC) payment authorization for the item (4) If payment is approved, the CC is charged and the item is removed from inventory and shipped. -> all of the above or nothing (roll back)
  • 8. Data – Information processing 2/2 ● Real Time processing: Perceived as "immediate" by the originator. Ex: trading, payment, online booking, "right" ad delivery, gaming, etc. ● Batch processing: Delayed execution of a series of programs ("jobs") on a computer without manual intervention. Ex: billing, virus scanning, web indexing, data mining, analytics, etc.
  • 9. Data – ACID Transaction ● Technical definition: ● Atomicity: each transaction is all or nothing ● Consistency: each transaction leaves the data consistent with the defined data rules ● Isolation: each transaction is kept isolated from the others ● Durability: once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors
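The "all or nothing" behaviour of the purchase example can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the table names, the `buy_item` helper and the `payment_approved` flag are assumptions made for the sketch, not part of the original slides.

```python
import sqlite3


def make_shop():
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory (item_id INTEGER PRIMARY KEY, stock INTEGER)")
    conn.execute("CREATE TABLE orders (item_id INTEGER)")
    conn.execute("INSERT INTO inventory VALUES (1, 5)")
    conn.commit()
    return conn


def buy_item(conn, item_id, payment_approved):
    """Take one unit out of stock and record the order as a single unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE inventory SET stock = stock - 1 WHERE item_id = ?",
                (item_id,),
            )
            if not payment_approved:
                # Simulated payment failure: the stock update must be undone.
                raise RuntimeError("payment declined")
            conn.execute("INSERT INTO orders (item_id) VALUES (?)", (item_id,))
        return True
    except RuntimeError:
        return False


def stock(conn, item_id):
    row = conn.execute(
        "SELECT stock FROM inventory WHERE item_id = ?", (item_id,)
    ).fetchone()
    return row[0]
```

When the payment is declined, the stock decrement is rolled back, so the inventory never reflects a half-finished purchase.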
  • 10. Big Data - Applications ● Find deeper insight in data: customers, partners and business. All industries will be affected. "Software is eating the world" ● Retail: buying patterns, store traffic, etc. ● Logistics: track and optimize shipments, etc. ● Healthcare: preventive medicine, disease management, etc. ● Social media: optimize usage, ads, etc. ● Finance: buying patterns, portfolio optimization http://www.youtube.com/watch?v=7D1CQ_LOizA http://online.wsj.com/article/SB10001424053111903480904576512250915629460.html
  • 11. Big Data – Three dimensions ● Volume ● Amount of data ● Velocity ● Speed at which it arrives ● Variety ● Types of data
  • 12. Big Data – Volume/Size matters (Name: Value in bytes. Example)
      kilobyte (kB): 10^3. Email (7 kB), images, web pages
      megabyte (MB): 10^6. Ebooks, MP3, SD video, etc.
      gigabyte (GB): 10^9. HD movie
      terabyte (TB): 10^12. For a single journey across the Atlantic Ocean, a four-engine jumbo jet can create 640 terabytes of data
      petabyte (PB): 10^15. FB has over 1.5 PB of stored photos
      exabyte (EB): 10^18. Seagate Technology reported selling 330 exabytes worth of hard drives during the 2011 fiscal year
      zettabyte (ZB): 10^21. Worldwide production and consumption of data. According to International Data Corporation, the total amount of global data is expected to grow to 2.7 zettabytes during 2012
      yottabyte (YB): 10^24. Not there yet...
      http://en.wikipedia.org/wiki/Zettabyte http://www.youtube.com/watch?v=CsVYID9rMGE
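To make the powers of ten above concrete, here is a minimal sketch that converts a raw byte count into the largest fitting decimal unit. The `human_size` helper is a hypothetical name introduced for illustration.

```python
# Decimal (SI) units, each 1000 times larger than the previous one.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def human_size(num_bytes):
    """Format a byte count using the largest unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000
```

For instance, `human_size(640 * 10**12)` expresses the jumbo jet figure from the slide in terabytes.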
  • 13. Big Data - Speed ● How fast is new data coming? ● How does this data need to be used or correlated? ● How long is data valuable? ● How fast does data need to be processed? ● This dimension in particular will affect the system architecture
  • 14. Big Data - Variety ● What type(s) / format(s)? ● Human or machine generated ● Text, location, document, picture, video, click streams, log file, event, etc. ● Is it structured or unstructured? ● Static vs dynamic ● What are relationships/dependencies within data elements?
  • 15. Proposed Agenda ● What is Big Data? ● Big Data landscape ● Business / Use cases ● Discussion
  • 16. Big Data landscape Big Data applications are roughly built using three technology layers: ● Storage ● Analytics ● Visualization
  • 18. Big Data landscape ● Storage ● Analytics ● Visualization
  • 19. Big Data – Storage ● Main logical data models: ● Tabular (represented by rows and columns) - Relational model ● Tree (a set of nodes with parent-children relationship) ● Graph structure (a set of interconnected nodes) ● Document (free structure / unstructured / schema less)
  • 20. Big Data – Storage ● Physical data models: ● Relational Database Management Systems (RDBMS) support ACID transactions and joins, and use SQL as their API. ● Key-value systems basically support get, put, and delete operations based on a primary key. ● Column-oriented systems still use tables but have no joins (joins must be handled within your application). They store data by column, as opposed to traditional row-oriented databases, which makes aggregations over a single column much cheaper. ● Document-oriented systems store structured "documents" such as JSON or XML but have no joins (joins must be handled within your application). It is very easy to map data from object-oriented software to these systems. http://nosql-database.org/
  • 21. Big Data - Storage ● It is not practical to store all data on one system, but distributing data creates complexity: ● Consistency: each client always has the same view of the data. ● Availability: all clients can always read and write. ● Partition tolerance: the system keeps working across physical network partitions. ● A distributed system can guarantee at most 2 of these 3 properties at the same time (known as the CAP theorem): CA, AP or CP.
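The difference between row-oriented and column-oriented layouts can be sketched in a few lines of plain Python. The sample orders and the `to_columns` helper are made up for illustration; real column stores add compression, indexing and on-disk formats that this toy version ignores.

```python
# Row-oriented layout: one record per entry, all fields kept together.
rows = [
    {"order_id": 1, "customer": "ann", "amount": 30.0},
    {"order_id": 2, "customer": "bob", "amount": 12.5},
    {"order_id": 3, "customer": "ann", "amount": 7.5},
]


def to_columns(records):
    """Pivot row records into a column-oriented layout: one list per field."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row store: every full record is visited just to read one field.
total_from_rows = sum(row["amount"] for row in rows)

# Column store: the aggregation reads a single contiguous list.
total_from_columns = sum(columns["amount"])
```

Both totals are identical; the point is that the column layout only needs to read the `amount` values.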
  • 22. Big data - Storage Source: http://blog.nahurst.com/visual-guide-to-nosql-systems
  • 23. Big Data - Storage Source: A.T. Kearney
  • 24. Big Data landscape ● Storage ● Analytics ● Visualization
  • 25. Big Data – Analytics ● Process of examining large amounts of data of a variety of types to uncover hidden patterns, unknown correlations and other useful information resulting in business benefits, such as more effective marketing or increased revenue. ● Can work on all forms of data as described before ● Can involve Transactions, Real Time and/or Batch Oriented
  • 26. Big Data – Analytics ● "Stages" of analytics: ● Business monitoring: traditional BI, Charting, Key Performance Indicator, etc. ● Business insights: uses statistics, data mining, predictive analysis to generate actionable insights: "Intelligent dashboards". Leverages trending, classification, optimization, simulations. ● Business transformation based on data
  • 27. Big Data – Analytics ● Anticipate and predict
  • 28. Big Data – Analytics ● Traditional predictive analytics and data mining are designed for relational or structured data, so a whole set of new technologies has evolved for unstructured data. ● Hadoop (batch oriented): "brute force" ● Real Time processing (new trend): optimized for specific use cases ● Machine learning: data intensive
  • 29. Big Data – Hadoop ● Designed for large scale (hundreds of terabytes of data), batch-oriented information processing: archiving, transformation, exploration, etc. ● Reliable while using commodity HW and open source ● Main components: ● Distributed File System (HDFS) ● MapReduce: distributed data processing ● Associated infrastructure components, query mechanisms and machine learning
  • 30. Big Data – Hadoop Example ● Derive meaning from logs: ● Who is using the web site? IP, location, device, etc. ● What pages are they looking at? How long, how often? ● Are they buying? Adding products to cart? Checking out? ● What are the trends?
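The MapReduce idea behind this kind of log analysis can be sketched in plain Python, without Hadoop itself. The log format (`IP page`) and the function names are assumptions for the sketch; a real Hadoop job would distribute the map and reduce phases across many machines.

```python
from collections import defaultdict

# Hypothetical, simplified web log: "<client IP> <requested page>".
LOG_LINES = [
    "10.0.0.1 /home",
    "10.0.0.2 /cart",
    "10.0.0.1 /cart",
    "10.0.0.3 /home",
    "10.0.0.1 /checkout",
]


def map_phase(line):
    """Map: emit a (page, 1) pair for every log line."""
    _ip, page = line.split()
    yield page, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single count."""
    return key, sum(values)


def page_views(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Calling `page_views(LOG_LINES)` gives the number of views per page, which answers the "what pages are they looking at, how often?" question from the slide on a toy scale.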
  • 31. Big Data – Real-Time ● Goal is to process data from highly dynamic sources in real time ● Data is typically streamed to the processing system and stored / processed directly in memory ● Complex Event Processing has been around for years but needs a new architecture for Big Data scale and distributed processing: Storm and Kafka are among the frameworks that could become the "Hadoop" of Real-Time
  • 32. Big Data – Real-Time Example ● Derive meaning from tweets: ● How well is the brand trending? ● By time, category? ● Compared to competitors ● Sentiment? ● etc. http://www.filtize.com/
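In-memory stream processing of this kind can be sketched as a sliding-window counter. The class below is a hypothetical single-process illustration (it is not Storm or Kafka code) and assumes that events arrive in timestamp order.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Count keys (for example brand mentions) seen in the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, key) pairs, oldest first
        self.counts = Counter()

    def add(self, timestamp, key):
        """Record one event, then drop everything that fell out of the window."""
        self.events.append((timestamp, key))
        self.counts[key] += 1
        self._expire(timestamp)

    def _expire(self, now):
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, old_key = self.events.popleft()
            self.counts[old_key] -= 1
            if self.counts[old_key] == 0:
                del self.counts[old_key]

    def top(self, n=1):
        """Return the n most frequent keys currently inside the window."""
        return self.counts.most_common(n)
```

Each new event updates the counts immediately, so "what is trending right now" can be read at any time without re-scanning stored data.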
  • 33. Big Data – Machine learning ● What is Machine learning?
  • 34. Big Data – Machine learning ● "A branch of artificial intelligence, that is about the construction and study of systems that can learn from data." Supports Predictive Analytics ● Can perform tasks that are too difficult to specify algorithmically ● Example of applications: ● Computer vision, Natural language processing, Fraud detection, Game playing, Robot locomotion, Sentiment analysis, Adaptive systems, scientific applications, anomaly detection, recommendation engine, personal assistant, etc
  • 35. Big Data – Example ● Handwriting recognition ● Handcrafted rules result in a large number of rules and exceptions. It is better to have a machine that learns from a large training set.
  • 36. Big Data – Example ● Computer vision: car detection ● First learning, from labeled examples (cars / not a car) ● Then testing: is this a car?
  • 37. Big Data – Machine learning ● Supervised or unsupervised learning: whether the model is trained on labeled examples or the system finds structure in the data on its own ● Types of information processing: ● Supervised – Classification (discrete) – Regression (continuous) ● Unsupervised – Clustering (discrete)
  • 38. Big Data – Machine learning Supervised – Classification / Regression ● First teach the model ● Then verify against the model
  • 39. Big Data – Machine learning Classification ● Classifier (single or multi class): given some set of features with corresponding labels, learn a function to predict the labels from the features [Figure: scatter plot of two classes, x and o, over features x1 and x2]
  • 40. Big Data – Machine learning Classification Many algorithms to choose from: ● SVM ● Neural networks ● Naïve Bayes ● Bayesian network ● Logistic regression ● Random forests ● Boosted decision trees ● K-nearest neighbor ● RBMs ● Etc.
  • 41. Big Data – Machine learning Regression ● Regression fits an equation to a dataset in order to predict values for new data. Example: calculate the price of a house. In reality there is much more than 1 variable: size, number of floors, number of rooms, age, location, etc.
  • 42. Big Data – Machine learning Clustering ● Clustering places data elements into related groups without advance knowledge of the group definitions. ● Example: grouping similar profiles in a social network ● K-means is a popular algorithm for clustering http://en.wikipedia.org/wiki/K-means_clustering
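As one concrete instance from the list above, a k-nearest-neighbor classifier can be sketched in a few lines. The training points and labels are made up for illustration, and `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter

# Made-up labeled training data: ((x1, x2), label).
TRAINING = [
    ((1.0, 1.0), "o"), ((1.5, 2.0), "o"), ((2.0, 1.0), "o"),
    ((6.0, 6.0), "x"), ((7.0, 5.5), "x"), ((6.5, 7.0), "x"),
]


def knn_predict(training, point, k=3):
    """Predict a label by majority vote among the k closest training points."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], point))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]
```

There is no explicit training step here: the "model" is simply the labeled data, and the prediction for a new point is decided by its nearest neighbors.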
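A minimal sketch of the single-variable case, using ordinary least squares to fit a line through made-up (size, price) pairs. The numbers are invented for illustration and happen to lie exactly on a line.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    slope, intercept = model
    return slope * x + intercept


# Made-up data: house size in square meters, price in thousands.
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

model = fit_line(sizes, prices)
```

With more variables (floors, rooms, age, location) the same idea extends to multiple linear regression, which is usually done with a numerical library rather than by hand.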
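A bare-bones sketch of k-means on two-dimensional points. To keep the result reproducible it uses the first k points as initial centroids, which is a simplification chosen for this sketch; practical implementations use randomized or smarter initialization.

```python
import math


def kmeans(points, k, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    centroids = list(points[:k])  # naive deterministic initialization (assumption)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


POINTS = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
```

Calling `kmeans(POINTS, 2)` separates the points near (1, 1) from those near (9, 9) without any labels being supplied.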
  • 43. Big Data – Machine learning ● Predictive analytics techniques usage
  • 44. Big Data – Machine learning ● Designing a high accuracy learning system “It’s not who has the best algorithm that wins. It’s who has the most data.” Ex: Classify between confusable words: {to, two, too}, {then, than}. "For breakfast I ate _____ eggs." ● Algorithms ● Perceptron (Logistic regression) ● Winnow ● Memory-based ● Naïve Bayes [Figure: accuracy vs. training set size (millions) for each algorithm]
  • 45. Big Data landscape ● Storage ● Analytics ● Visualization
  • 46. Big Data – Visualization ● Helps overcome information overload ● Makes it possible to see patterns and connections, both instantly and over time ● Focus on specific parts of data but also in relation to other parts: data is relative ● Many different tools and techniques can be used based on data sets http://www.ted.com/talks/david_mccandless_the_beauty_of_data_visualization.html http://www.ted.com/talks/joann_kuchera_morin_tours_the_allosphere.html
  • 47. Big Data – Visualization ● Many different types available: ● 1D, 2D, 3D ● Temporal: timeline, time series, etc. ● Advanced types: tag cloud, bubble chart, network graph, rose chart, spider chart, heatmap, tree map, dependency graph, etc. ● Can allow interactivity (navigate, zoom in/out, slice and dice, etc.). http://guides.library.duke.edu/vis_types
  • 48. Big Data – Visualization Examples https://developers.google.com/maps/tutorials/visualizing/earthquakes http://www.webdesignerdepot.com/2009/06/50-great-examples-of-data-visualization/
  • 49. Proposed Agenda ● What is Big Data? ● Big Data landscape ● Business / Use cases ● Discussion
  • 50. Big Data – A Word on Privacy ● Currently mostly ignored: Big Brother? ● Everything is being stored (data retention) – Location, calls, SMS, searches, web access, transactions, applications used, contacts, calendar, etc. ● Data doesn't belong to you (Facebook, etc.) and may be resold (based on privacy policy) ● Apps can read your calendar, contacts, etc. and upload data to their servers ● For now users do not seem to care: they care about the service and that it is free (as in $). Further reading: "Your phone company is watching"; Google's Drive privacy article; "Who's afraid of the bad, big data?"
  • 51. Big Data – And Social Media An opportunity to: ● Identify trends: tweets, likes, blogs, page views, etc. ● Pinpoint problems: social media data can be used to get sentiment / feedback on products / brands / events (even in real time) ● Predict behavior: what is the trend over time and how does it correlate to particular events?
  • 52. Big Data – Not just 1 device http://www.smartinsights.com/mobile-marketing/mobile-marketing-analytics/mobile-marketing-statistics/
  • 53. Big Data – Mobile is growing faster http://www.smartinsights.com/mobile-marketing/mobile-marketing-analytics/mobile-marketing-statistics/
  • 54. Big Data – Shopping habits
  • 55. Big Data - Business models ● Data is the "new oil" ● Every day, 2.5 quintillion bytes of data are created, with 90 percent of the world's data created in the past two years alone. ● Data production will be 44 times greater in 2020 than in 2009.
  • 56. Big Data - Business models ● Data is the new business model as: ● Cost of the HW, SW and networks required to produce and transport data continues to approach an effective cost of zero ● Even in the physical manufacturing world, cost will go down: robotics, 3D printing, etc. ● Data creates insight, which makes it possible to enhance and disrupt existing business models
  • 57. Big Data - Business models ● Opportunities for: ● Web businesses: to increase ARPU (average revenue per user) ● Enterprises: serve their customers better and improve management of suppliers and partners ● IoT: the Internet of Things (IoT), or M2M (Machine to Machine), will for instance allow brand new capabilities and services
  • 58. Big Data - Business models ● Already used by web businesses (Google, Facebook, etc.) and moving to enterprises
  • 59. Big Data - Web ● More data can yield more insight, which leads to increased ARPU ● Ex: Ad platform. Advertisers define ads and campaigns available across web, mobile, TV, etc. On Google properties, Google makes money each time an ad is clicked (CPC). On network members and content providers, Google makes money each time an ad is clicked or displayed (CPM) -> Increased relevance and knowledge of the user lead to increased revenues
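The arithmetic behind the two pricing models can be sketched as follows. CPC is a price per click and CPM a price per thousand impressions; the function names and all figures used with them are made up for illustration.

```python
def cpc_revenue(clicks, cost_per_click):
    """Cost-per-click model: revenue grows with the number of clicks."""
    return clicks * cost_per_click


def cpm_revenue(impressions, cost_per_mille):
    """Cost-per-mille model: revenue per thousand ad impressions."""
    return impressions / 1000 * cost_per_mille


def expected_cpc_revenue(impressions, click_through_rate, cost_per_click):
    """Expected CPC revenue when a given fraction of impressions is clicked."""
    return cpc_revenue(impressions * click_through_rate, cost_per_click)
```

Under the CPC model, doubling the click-through rate for the same number of impressions doubles revenue, which is why better knowledge of the user (more relevant ads) translates directly into money.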
  • 60. Big Data - Enterprises ● All Industries are being disrupted
  • 61. Big Data - Enterprises
  • 62. Big Data - Enterprises ● Differentiation: satisfy customers, improve existing services and create new service offerings ● Improve processes: merchandising, forecasting, and purchasing to distribution, allocation, and transportation, etc. ● Data as a service: resell information, analysis and insights
  • 63. Big Data - Enterprises
  • 64. Big Data - IoT More and more machines are connecting and generating data
  • 65. Big Data - IoT http://harborresearch.com/wp-content/uploads/2012/05/HarborResearch-nPhase_Paper_March-2011.pdf
  • 66. Big Data - IoT http://www.slideshare.net/harborresearch/harbor-research-introduction-to-smart-business-m2-m
  • 67. Big Data – IoT and Healthcare Home Healthcare / Tele-Health ● Business and Technology trends ● Aging population ● Increase in chronic illnesses ● Demand from patients for a home environment and independence ● Cost pressure and scarcity of hospital beds ● Affordable and available telecommunications ● Computing advances: cost, size, power, performance, imaging, etc.
  • 68. Proposed Agenda ● What is Big Data? ● Big Data landscape ● Business / Use cases ● Discussion