Handling Big Data
Deddy Setyadi
www.elakiri.com
Time taken to generate 5 billion GB of data:
... - 2003
2 days in 2011
10 minutes in 2013
Live stats
2016
Where does the data come from?
Activity: Listening to music, reading a book, searching, shopping, etc.
Conversation: Our conversations on social media are now digitally recorded.
Photo and Video: We upload and share hundreds of thousands of them on social media sites every second.
Sensor: We are increasingly surrounded by sensors that collect and share data.
The Internet of Things: We now have smart TVs that are able to collect and process data.
The basic idea behind the phrase
'Big Data' is that everything we
do is increasingly leaving a digital
trace (or data), which we (and
others) can use and analyse
Big data:
a collection of large datasets that cannot be
processed using traditional computing techniques.
Big Data includes huge volume, high velocity,
and an extensible variety of data.
Structured
● Database
● Census records
● Economic data
● Phone numbers
Semi Structured
● JSON
● XML
Unstructured
● Word
● PDF
● Text
● Media Logs
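The three categories above can be illustrated with a small sketch. The sample names and phone numbers are made up for illustration, and the regular expression is just one possible way to pull structure out of free text.

```python
import csv
import io
import json
import re

# Structured: fixed columns, every row has the same fields (hypothetical data).
structured_csv = "name,phone\nAni,555-0101\nBudi,555-0102\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: self-describing keys, but fields may differ per record.
semi_structured = json.loads(
    '[{"name": "Ani", "phone": "555-0101"},'
    ' {"name": "Budi", "phones": ["555-0102", "555-0103"]}]'
)

# Unstructured: free text; any structure has to be extracted by the reader.
unstructured = "Call Ani at 555-0101, or try Budi on 555-0102 after lunch."
phones_in_text = re.findall(r"\d{3}-\d{4}", unstructured)

print(rows[0]["phone"])
print(sorted(semi_structured[1].keys()))
print(phones_in_text)
```

The structured rows can be queried by column name right away, the JSON records need a check for which keys are present, and the free text needs pattern matching before it yields any fields at all.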
Benefits of Big Data
https://www.youtube.com/watch?v=HqsBensINkE
Big Data Technologies
Operational Big Data
This includes systems like MongoDB that
provide operational capabilities for real-time,
interactive workloads where data is
primarily captured and stored.
NoSQL Big Data systems are designed to
allow massive computations to be run
inexpensively and efficiently. This makes
operational big data workloads much
easier to manage, cheaper, and faster to
implement.
Analytical Big Data
This includes systems like Massively
Parallel Processing (MPP) database
systems and MapReduce that provide
analytical capabilities for retrospective
and complex analysis.
A system based on MapReduce can be
scaled up from single servers to
thousands of high and low end machines.
Big Data Solutions
Traditional Approach
In this approach, an enterprise has a
computer to store and process big data. Here,
data is stored in an RDBMS, then the required
data is processed and presented to users for
analysis.
tutorialspoint.com
Google’s Solution
Google solved this problem using an
algorithm called MapReduce. This
algorithm divides the task into small
parts and assigns those parts to
many computers connected over
the network, and collects the results
to form the final result dataset.
tutorialspoint.com
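The divide, assign, and collect idea described above can be sketched in a few lines. This is only an illustration under simple assumptions: local threads stand in for the networked computers, the task is a plain sum, and the function names (split_into_parts, process_part, run_distributed) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_parts(data, n_parts):
    """Divide the input into roughly equal contiguous parts."""
    size = max(1, -(-len(data) // n_parts))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

def process_part(part):
    """Work done independently on one part (here: a partial sum)."""
    return sum(part)

def run_distributed(data, n_workers=4):
    parts = split_into_parts(data, n_workers)
    # Threads stand in for separate machines in this sketch.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(process_part, parts))
    # Collect the partial results into the final result.
    return sum(partial_results)

total = run_distributed(list(range(1, 101)))
print(total)
```

Each part is processed without knowing about the others, which is what lets the work spread across many machines.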
Hadoop
Hadoop runs applications using the
MapReduce algorithm, where the
data is processed in parallel on
different CPU nodes. In short, the
Hadoop framework makes it possible
to develop applications that run on
clusters of computers and perform
complete statistical analysis of
huge amounts of data.
tutorialspoint.com
Hadoop
Hadoop Architecture
tutorialspoint.com
MapReduce
  1. Data
  2. Map
Converts a set of data into another set of
data, in which individual elements are broken
down into tuples (key/value pairs).
  3. Reduce
Combines the Shuffle stage and the Reduce
stage, and produces a new set of output that
is stored in HDFS.
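A minimal in-memory sketch of the Map, Shuffle, and Reduce stages, using word counting as the example task. It runs in a single process and does not use Hadoop or HDFS. The function names and sample lines are assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(line):
    """Map: break one input line into (key, value) tuples."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: combine each key's values into one output value."""
    return {key: sum(values) for key, values in grouped.items()}

lines = ["big data is big", "data is everywhere"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle_phase(mapped))
print(counts)
```

In a real cluster the map calls would run on different nodes and the shuffle would move data across the network. Here everything happens in one list and one dictionary.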
MapReduce
http://mm-tom.s3.amazonaws.com/blog/MapReduce.png
MapReduce
noviardisyamsuir.blogspot.com
HDFS
● Fault detection and recovery : HDFS
should have mechanisms for quick and
automatic fault detection and recovery.
● Huge datasets : HDFS should have
hundreds of nodes per cluster to manage
the applications having huge data sets.
● Hardware at data : A requested task can
be done efficiently when the computation
takes place near the data.
tutorialspoint.com
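The "hardware at data" goal can be sketched as a toy scheduling rule: run the task on a node that already stores the block, and fall back to another node only if none of those is free. This is an illustration of the idea, not Hadoop's actual scheduler. The node names, block names, and the pick_node function are hypothetical.

```python
# Hypothetical block-to-node placement; node and block names are made up.
block_locations = {
    "block-1": ["node-a", "node-b"],
    "block-2": ["node-b", "node-c"],
    "block-3": ["node-c", "node-a"],
}

def pick_node(block_id, locations, busy_nodes):
    """Prefer an idle node that already stores the block; else any idle node."""
    all_nodes = sorted({n for nodes in locations.values() for n in nodes})
    for node in locations.get(block_id, []):
        if node not in busy_nodes:
            return node, True   # local: no data transfer needed
    for node in all_nodes:
        if node not in busy_nodes:
            return node, False  # remote: block must be read over the network
    return None, False          # every node is busy

print(pick_node("block-1", block_locations, busy_nodes={"node-a"}))
print(pick_node("block-2", block_locations, busy_nodes={"node-b", "node-c"}))
```

The second element of the result says whether the task runs next to its data. The fewer remote placements there are, the less network traffic a job generates.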
Demo
Closing ...
blog.cloudera.com
References & Source
http://www.tutorialspoint.com/hadoop/
http://www.wired.com/2013/02/the-decades-that-invented-the-future-part-11-2001-2010/
http://www.slideshare.net/BernardMarr/140228-big-data-slide-share/3-The_basic_idea_behind_the
https://www.youtube.com/watch?v=HqsBensINkE
http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php
http://noviardisyamsuir.blogspot.co.id/2016/03/hadoop-mapreduce-adalah.html
http://www.slideshare.net/lynnlangit/hadoop-mapreduce-fundamentals-21427224/5-What_types_of_business_problems
https://blog.cloudera.com/blog/2013/08/how-to-select-the-right-hardware-for-your-new-hadoop-cluster/
Thank you!


Editor's Notes

  1. The increasingly rapid development of technology, devices, and communication media is directly proportional to the amount of data humanity produces. From the formation of the Earth until 2003, when internet cafe booths were still quiet and the internet was still something unfamiliar, humanity produced 5 billion GB of data. In the following years, Friendster, Facebook, and Twitter appeared, and new devices began to emerge, such as the iPod and Nokia phones equipped with GPRS, so people began using the internet. Eight years later, BlackBerry started to boom, along with WhatsApp and Twitter, and 5 billion GB could be produced in 2 days, even though people were still frugal with mobile data plans at the time. A few years after that, Android began to spread widely, the number of users grew, people became accustomed to data plans, and eventually 5 billion GB of data could be produced in just 10 minutes.
  2. Simple activities like listening to music or reading a book are now generating data. Digital music players and eBooks collect data on our activities. Your smart phone collects data on how you use it and your web browser collects information on what you are searching for. Your credit card company collects data on where you shop and your shop collects data on what you buy. It is hard to imagine any activity that does not generate data. Our conversations are now digitally recorded. It all started with emails but nowadays most of our conversations leave a digital trail. Just think of all the conversations we have on social media sites like Facebook or Twitter. Even many of our phone conversations are now digitally recorded. Just think about all the pictures we take on our smart phones or digital cameras. We upload and share hundreds of thousands of them on social media sites every second. A growing number of CCTV cameras capture video, and we upload hundreds of hours of video to YouTube and other sites every minute. We are increasingly surrounded by sensors that collect and share data. Take your smart phone: it contains a global positioning sensor to track exactly where you are every second of the day, and it includes an accelerometer to track the speed and direction at which you are travelling. We now have sensors in many devices and products. We now have smart TVs that are able to collect and process data, and we have smart watches, smart fridges, and smart alarms. The Internet of Things, or Internet of Everything, connects these devices so that, for example, the traffic sensors on the road send data to your alarm clock, which will wake you up earlier than planned because the blocked road means you have to leave earlier to make your 9am meeting.
  3. Volume refers to the vast amounts of data generated every second. We are not talking Terabytes but Zettabytes or Brontobytes. If we take all the data generated in the world between the beginning of time and 2008, the same amount of data will soon be generated every minute. New big data tools use distributed systems so that we can store and analyse data across databases that are dotted around anywhere in the world. Velocity refers to the speed at which new data is generated and the speed at which data moves around. Just think of social media messages going viral in seconds. Technology now allows us to analyse the data while it is being generated (sometimes referred to as in-memory analytics), without ever putting it into databases. Variety refers to the different types of data we can now use. In the past we only focused on structured data that neatly fitted into tables or relational databases, such as financial data. In fact, 80% of the world’s data is unstructured (text, images, video, voice, etc.). With big data technology we can now analyse and bring together data of different types such as messages, social media conversations, photos, sensor data, video or voice recordings.
  4. Limitation: This approach works well when the volume of data is small enough to be accommodated by standard database servers, or stays within the limit of the processor that is processing the data. But when it comes to huge amounts of data, processing it through a traditional database server becomes a tedious task.