Mohammad Reza Gerami 
gerami@aryatadbir.com 
mrgerami@aut.ac.ir 
•‘Big Data’ is similar to ‘small data’, but bigger
•…but bigger data requires different approaches:
•Techniques, tools and architecture 
•…with an aim to solve new problems 
•…or old problems in a better way 
Characteristics of Big Data: 1-Scale (Volume) 
•Data Volume 
Exponential increase in collected/generated data 
Big Data in Today’s Business and Technology Environment 
2.7 zettabytes of data exist in the digital universe today. (Source)
235 terabytes of data had been collected by the U.S. Library of Congress as of April 2011. (Source)
The Obama administration is investing $200 million in big data research projects. (Source)
IDC estimates that by 2020, business transactions on the internet (business-to-business and business-to-consumer) will reach 450 billion per day. (Source)
Facebook stores, accesses, and analyzes 30+ petabytes of user-generated data. (Source)
Akamai analyzes 75 million events per day to better target advertisements. (Source)
94% of Hadoop users perform analytics on large volumes of data that was not possible before; 88% analyze data in greater detail; and 82% can now retain more of their data. (Source)
Walmart handles more than 1 million customer transactions every hour, which is imported into databases estimated to contain more than 2.5 petabytes of data. (Source) 
More than 5 billion people are calling, texting, tweeting and browsing on mobile phones worldwide. (Source) 
Decoding the human genome originally took 10 years to process; now it can be achieved in one week. (Source)
In 2008, Google was processing 20,000 terabytes of data (20 petabytes) a day. (Source)
The largest AT&T database boasts titles including the largest volume of data in one unique database (312 terabytes) and the second largest number of rows in a unique…
The Rapid Growth of Unstructured Data 
YouTube users upload 48 hours of new video every minute of the day. (Source) 
571 new websites are created every minute of the day. (Source) 
Brands and organizations on Facebook receive 34,722 Likes every minute of the day. (Source) 
100 terabytes of data are uploaded daily to Facebook. (Source)
According to Twitter’s own research in early 2012, it sees roughly 175 million tweets every day, and has more than 465 million accounts. (Source)
30 billion pieces of content are shared on Facebook every month. (Source)
Data production will be 44 times greater in 2020 than it was in 2009. (Source)
The Rapid Growth of Unstructured Data 
In late 2011, IDC Digital Universe published a report indicating that some 1.8 zettabytes of data would be created that year. (Source) In other words, the amount of data in the world today is equal to:
Every person in the US tweeting three tweets per minute for 26,976 years.
Every person in the world having more than 215m high-resolution MRI scans a day.
More than 200bn HD movies, which would take a person 47m years to watch.
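The movie comparison above can be sanity-checked with a little arithmetic. The sketch below is only a rough check, assuming decimal (SI) units, an average calendar year of 365.25 days, and nonstop viewing; none of these assumptions come from the slides.

```python
# Back-of-the-envelope check of the "200bn HD movies" comparison.
ZETTABYTE = 1000 ** 7            # bytes, decimal (SI) definition

total_bytes = 1.8 * ZETTABYTE    # IDC figure quoted above
movies = 200e9                   # "more than 200bn HD movies"
watch_years = 47e6               # "47m years to watch"

HOURS_PER_YEAR = 365.25 * 24     # assumption: average year, watching around the clock

bytes_per_movie = total_bytes / movies
hours_per_movie = watch_years * HOURS_PER_YEAR / movies

print(f"{bytes_per_movie / 1000**3:.1f} GB per movie")
print(f"{hours_per_movie:.2f} hours per movie")
```

Under these assumptions the figures imply roughly 9 GB and about two hours per movie, which is plausible for an HD film, so the two numbers on the slide are at least consistent with each other.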
Decimal units of data

Value     Metric symbol   Metric name
1000      kB              kilobyte
1000^2    MB              megabyte
1000^3    GB              gigabyte
1000^4    TB              terabyte
1000^5    PB              petabyte
1000^6    EB              exabyte
1000^7    ZB              zettabyte
1000^8    YB              yottabyte
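The decimal prefixes above map directly onto a small conversion helper. This is a minimal sketch; the names UNITS, FACTOR, and convert are illustrative and not part of the original slides.

```python
# Decimal (SI) byte units: each step up is a factor of 1000.
UNITS = ["kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
FACTOR = {unit: 1000 ** (i + 1) for i, unit in enumerate(UNITS)}

def convert(value, from_unit, to_unit):
    """Convert a quantity between two decimal byte units."""
    return value * FACTOR[from_unit] / FACTOR[to_unit]

# Google in 2008: 20,000 TB per day expressed in petabytes.
print(convert(20_000, "TB", "PB"))
# The 2.7 ZB digital universe expressed in exabytes.
print(convert(2.7, "ZB", "EB"))
```

The first call reproduces the slide's own equivalence of 20,000 terabytes and 20 petabytes.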
Social media and networks 
(all of us are generating data) 
Scientific instruments 
(collecting all sorts of data) 
Mobile devices 
(tracking all objects all the time) 
Sensor technology and networks 
(measuring all kinds of data) 
•No single standard definition… Big Data 
What to do with these data? 
How much data? 
640K ought to be enough for anybody.
Why Big Data 
•Key enablers of the emergence and growth of Big Data are:
–Increase in storage capacity
–Increase in processing power
–Availability of data
–Every day we create 2.5 quintillion bytes of data; 90% of the data in the world today has been created in the last two years alone
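The two figures in the last bullet can be turned into more familiar units. The sketch below assumes the short-scale quintillion (10^18), a 365-day year, and, for the growth estimate, that data accumulates at a steady rate and nothing is deleted; these assumptions are illustrative and not stated in the slides.

```python
EXABYTE = 1000 ** 6
ZETTABYTE = 1000 ** 7

bytes_per_day = 2.5e18                          # "2.5 quintillion bytes"
per_day_eb = bytes_per_day / EXABYTE            # exabytes created per day
per_year_zb = bytes_per_day * 365 / ZETTABYTE   # zettabytes created per year

# If 90% of all data is at most two years old, the older 10% is what existed
# two years ago, so the total has grown by a factor of 1 / (1 - 0.9).
share_new = 0.90
two_year_factor = 1 / (1 - share_new)
annual_factor = two_year_factor ** 0.5          # steady-growth assumption

print(f"{per_day_eb:.1f} EB per day, {per_year_zb:.2f} ZB per year")
print(f"about {two_year_factor:.0f}x in two years, {annual_factor:.2f}x per year")
```

Read this way, the claim implies roughly 2.5 exabytes per day and a total that grows about tenfold every two years.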
Big Data Analytics 
•Examining large amounts of data
•Appropriate information
•Identification of hidden patterns, unknown correlations
•Competitive advantage
•Better business decisions: strategic and operational
•Effective marketing, customer satisfaction, increased revenue
Applications for Big Data Analytics 
Homeland Security 
Finance 
Smarter Healthcare 
Multi-channel sales 
Telecom 
Manufacturing 
Traffic Control 
Trading Analytics 
Fraud and Risk 
Log Analysis
Search Quality 
Retail: Churn, NBO 
Healthcare 
•80% of medical data is unstructured and is clinically relevant 
•Data resides in multiple places, such as individual EMRs, lab and imaging systems, physician notes, medical correspondence, claims, etc.
•Leveraging Big Data 
•Build sustainable healthcare systems 
•Collaborate to improve care and outcomes 
•Increase access to healthcare 
Market Size 
Source: Wikibon, Taming Big Data
By 2015, 4.4 million IT jobs in Big Data; 1.9 million of them in the US alone
Potential Talent Pool - Big Data
India will require a minimum of 1 lakh (100,000) data scientists in the next couple of years, in addition to data analysts and data managers, to support the Big Data space.
Future of Big Data 
Big Data Analytics Technologies 
NoSQL: non-relational, or at least non-SQL, database solutions such as HBase (also part of the Hadoop ecosystem), Cassandra, MongoDB, Riak, CouchDB, and many others.
Hadoop: an ecosystem of software packages, including MapReduce, HDFS, and many others.
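To give a feel for the MapReduce model mentioned above, here is a minimal single-process sketch of a word count. It is plain Python and does not use Hadoop's actual API; the function names map_phase, shuffle, and reduce_phase are hypothetical and only mirror the three conceptual stages.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

documents = ["big data is big", "data about data"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)
```

In a real cluster the map and reduce calls would run in parallel on many machines, with the input typically read from a distributed file system such as HDFS; the sketch only shows the data flow.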
Main Big Data Technologies 
Hadoop 
NoSQL Databases 
Analytic Databases 
Hadoop 
•Low cost, reliable scale-out architecture 
•Distributed computing
•Proven success in Fortune 500 companies
•Exploding interest 
NoSQL Databases 
•Huge horizontal scaling and high availability 
•Highly optimized for retrieval and appending 
•Types 
•Document stores 
•Key Value stores 
•Graph databases 
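The three NoSQL types differ mainly in how a record is shaped and queried. The sketch below models each one with plain Python structures; the keys, field names, and the followers_of helper are made up for illustration and do not correspond to any particular product.

```python
# Key-value store: an opaque value looked up by a single key.
kv_store = {"user:1": "Ada", "user:2": "Grace"}

# Document store: each key maps to a self-describing, nested record.
doc_store = {
    "user:1": {"name": "Ada", "follows": ["user:2"]},
    "user:2": {"name": "Grace", "follows": []},
}

# Graph database: relationships are stored explicitly and can be traversed.
edges = {("user:1", "FOLLOWS", "user:2")}

def followers_of(user):
    """Walk the FOLLOWS edges backwards to find who follows `user`."""
    return sorted(src for src, rel, dst in edges
                  if rel == "FOLLOWS" and dst == user)

print(kv_store["user:1"])
print(doc_store["user:1"]["follows"])
print(followers_of("user:2"))
```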
Analytic RDBMS 
•Optimized for bulk-load and fast aggregate query workloads 
•Types 
•Column-oriented 
•MPP 
•In-memory 
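One way to see why a column-oriented layout suits aggregate queries is to store the same small table both ways. This is a toy sketch with invented sales data, not a model of any specific analytic database.

```python
# Row-oriented layout: one record per row, all fields stored together.
rows = [
    {"store": "A", "item": "milk", "amount": 3.0},
    {"store": "B", "item": "bread", "amount": 2.5},
    {"store": "A", "item": "eggs", "amount": 4.5},
]

# Column-oriented layout: one contiguous list per field.
columns = {
    "store": [r["store"] for r in rows],
    "item": [r["item"] for r in rows],
    "amount": [r["amount"] for r in rows],
}

# The same aggregate computed on each layout.
total_from_rows = sum(r["amount"] for r in rows)   # touches every full record
total_from_columns = sum(columns["amount"])        # reads only one column

print(total_from_rows, total_from_columns)
```

Both layouts give the same total, but the column version reads only the one field the query needs, which is the property that bulk-load and aggregate workloads benefit from.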
Thank you  
More info: 
www.aryatadbir.com

Big Data - Gerami
