Big Data
Valeri Kopaleishvili
Outline
◦ What is Big Data?
◦ Where does this Big Data come from?
◦ 4 Vs analysis
◦ When is dealing with Big Data hard?
◦ Example: Google
What is big data?
“Every day, we create 2.5 quintillion bytes of data — so
much that 90% of the data in the world today has been
created in the last two years alone. This data comes from
everywhere: sensors used to gather climate information,
posts to social media sites, digital pictures and videos,
purchase transaction records, and cell phone GPS signals to
name a few.
This data is ‘big data.’”
Where Is This “Big Data” Coming From?
◦ 12+ TB of tweet data every day
◦ 25+ TB of log data every day
◦ ? TB of data every day
◦ 2+ billion people on the Web by the end of 2011
◦ 30 billion RFID tags today (1.3 billion in 2005)
◦ 4.6 billion camera phones worldwide
◦ 100s of millions of GPS-enabled devices sold annually
◦ 76 million smart meters in 2009, 200 million by 2014
With Big Data, We’ve Moved to 4 Vs Analytics
◦ Volume: 12+ terabytes of tweets created daily
◦ Variety: 100s of different types of data
◦ Velocity: 5+ million trade events per second
◦ Value
Volume (Scale)
Data Volume
◦ 44x increase from 2009 to 2020
◦ From 0.8 zettabytes to 35 zettabytes
Data volume is increasing exponentially
Volume refers to the vast amounts of data generated every second. We are not talking
terabytes but petabytes. If we take all the data generated in the world between the
beginning of time and 2008, the same amount of data will soon be generated every
minute. This makes most data sets too large to store and analyze using traditional
database technology. New big data tools use distributed systems so that we can store
and analyze data across databases located anywhere in the world.
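As a rough check on the volume figures above, the overall growth factor and the yearly growth rate it implies can be computed directly. This is a minimal sketch that assumes steady compound growth between 2009 and 2020, which the slide does not state; the variable names are illustrative.

```python
# Figures taken from the slide: 0.8 ZB in 2009, 35 ZB projected for 2020.
start_zb = 0.8
end_zb = 35.0
years = 2020 - 2009  # 11 years

# Overall growth factor (the slide rounds this to 44x).
growth_factor = end_zb / start_zb

# Implied compound annual growth rate, ASSUMING steady exponential growth.
annual_rate = growth_factor ** (1 / years) - 1

print(f"growth factor: {growth_factor:.2f}x")
print(f"implied annual growth: {annual_rate:.1%}")
```

Under that assumption, the data volume would have to grow by roughly two fifths every year to reach the projected figure.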
Variety (Complexity)
Variety refers to the different types of data we can now use. In the past we
focused only on structured data that fitted neatly into tables or relational
databases, such as financial data.
In fact, 80% of the world’s data is unstructured (text, images, video,
voice, etc.). With big data technology we can now analyze and bring
together data of different types, such as messages, social media
conversations, photos, sensor data, and video or voice recordings.
To extract knowledge, all these types of data need to be linked together.
Velocity (Speed)
Velocity refers to the speed at which new data is generated and the speed at
which data moves around. Just think of social media messages going viral in
seconds. Technology now allows us to analyze data while it is being
generated (sometimes referred to as in-memory analytics), without ever
putting it into databases.
Examples
◦ E-promotions: based on your current location, your purchase history, and what
you like → send promotions right now for the store next to you
◦ Healthcare monitoring: sensors monitor your activities and body → any
abnormal measurement requires an immediate reaction
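The healthcare monitoring example can be sketched as a small streaming check. This is an illustration only: the thresholds, the function name `abnormal_readings`, and the sample feed are all assumptions, and a real system would read from a live sensor stream rather than a list.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical safe range for a heart-rate reading, in beats per minute.
LOW_BPM = 40
HIGH_BPM = 140


def abnormal_readings(stream: Iterable[Tuple[str, int]]) -> Iterator[Tuple[str, int]]:
    """Yield (timestamp, bpm) pairs that fall outside the safe range.

    Each reading is inspected once as it arrives and then dropped,
    so nothing is stored in a database.
    """
    for timestamp, bpm in stream:
        if bpm < LOW_BPM or bpm > HIGH_BPM:
            yield timestamp, bpm


# Made-up sample readings standing in for a live sensor feed.
sample_feed = [("09:00:01", 72), ("09:00:02", 75), ("09:00:03", 161), ("09:00:04", 38)]

alerts = list(abnormal_readings(sample_feed))
for timestamp, bpm in alerts:
    print(f"ALERT at {timestamp}: {bpm} bpm")
```

Because the function is a generator, it reacts to each reading as soon as it is produced instead of waiting for a batch to be collected.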
Real-time/Fast Data
Progress and innovation are no longer hindered by the ability to collect data,
but by the ability to manage, analyze, summarize, visualize, and discover
knowledge from the collected data in a timely manner and in a scalable fashion.
◦ Social media and networks (all of us are generating data)
◦ Scientific instruments (collecting all sorts of data)
◦ Mobile devices (tracking all objects all the time)
◦ Sensor technology and networks (measuring all kinds of data)
Value
Then there is another V to take into account when looking at Big
Data: Value! Having access to big data is no good unless we can turn it
into value. Companies are starting to generate amazing value from their
big data.
We are currently seeing only the beginnings of a transformation into a big data
economy. Any business that doesn’t seriously consider the implications
of Big Data runs the risk of being left behind.
Big Data Exploration: Value & Diagram
[Diagram] Data sources (File Systems, Relational Data, Content Management, Email,
CRM, Supply Chain, ERP, RSS Feeds, Cloud, Custom Sources) feed DataExplorer,
which serves Application/Users.
Find, visualize, and understand all big data to improve business knowledge
• Greater efficiencies in business processes
• New insights from combining and analyzing data types in new ways
• Develop new business models with resulting increased market presence and revenue
Applications for Big Data Analytics
◦ Homeland Security
◦ Finance
◦ Smarter Healthcare
◦ Telecom
◦ Manufacturing
◦ Traffic Control
◦ Trading Analytics
◦ Log Analysis
◦ Search Quality
When Dealing with Big Data Is Hard
When the operations on data are complex:
◦ E.g., simple counting is not a complex problem.
◦ Modeling and reasoning with data of different kinds can
get extremely complex.
Good news with big data:
◦ Often, because of the vast amount of data, modeling
techniques can get simpler (e.g., smart counting can
replace complex model-based analytics)…
◦ …as long as we deal with the scale.
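One way to picture "smart counting" is next-word prediction done purely by counting which word most often follows another, with no language model involved. The sketch below is an assumed illustration, not a method taken from the slides; the corpus and the name `predict_next` are made up.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real system would count over a very large one.
corpus = [
    "big data is big",
    "big data is everywhere",
    "data is valuable",
]

# Count how often each word follows each other word.
following = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1


def predict_next(word):
    """Return the most frequently observed next word, or None if unseen."""
    counts = following.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


print(predict_next("big"))   # "data" follows "big" twice in the corpus
print(predict_next("data"))  # "is" follows "data" three times
```

The counting itself stays trivial; the hard part, as the slide notes, is doing it at scale.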
Hadoop is an open-source software framework for storing and processing big data in a
distributed fashion on large clusters of commodity hardware.
It is suitable for extremely large databases (billions of rows, millions of columns) distributed
across thousands of nodes.
Hadoop Distributed File System (HDFS) is a Java-based file system that provides
scalable and reliable data storage and is designed to span large clusters of commodity
servers.
MapReduce is a programming model and an associated implementation for processing and generating
large data sets with a parallel, distributed algorithm on a cluster.
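The MapReduce model can be sketched in a single process as a word count, the usual introductory example. This is a minimal illustration of the map, shuffle, and reduce phases and does not use Hadoop's API; on a real cluster the phases run in parallel across nodes, and the function names here are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in order over the input lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Made-up input; on a cluster each line could live on a different node.
counts = word_count(["big data", "Big Data tools", "data"])
print(counts)
```

Because each call to `map_phase` and `reduce_phase` depends only on its own input, the work can be split across many machines without coordination beyond the shuffle.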
We first wrote the data into HDFS, then created a Hive table and loaded the data from the HDFS
files into it.
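The three steps can be sketched as the commands one might issue. The slides do not show the actual commands, so everything below is an assumption: the file name, HDFS path, table name, columns, and the comma-separated format are hypothetical placeholders. The helper only builds the command strings and executes nothing.

```python
def hive_load_steps(local_file, hdfs_dir, table, columns):
    """Return the shell command and HiveQL statements for the three steps.

    All arguments are hypothetical placeholders; nothing is executed here.
    """
    column_defs = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return [
        # 1. Copy the local file into HDFS.
        f"hdfs dfs -put {local_file} {hdfs_dir}",
        # 2. Create a Hive table for comma-separated text (assumed format).
        f"CREATE TABLE {table} ({column_defs}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';",
        # 3. Move the HDFS file(s) into the table.
        f"LOAD DATA INPATH '{hdfs_dir}' INTO TABLE {table};",
    ]


steps = hive_load_steps(
    local_file="tweets.csv",          # hypothetical file name
    hdfs_dir="/user/demo/tweets",     # hypothetical HDFS path
    table="tweets",                   # hypothetical table name
    columns=[("id", "BIGINT"), ("text", "STRING")],
)
for step in steps:
    print(step)
```

The first string is a shell command, while the other two would be run from a Hive client.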
Thanks!
