What is Big Data
• From the beginning of human civilization until 2003, the entire world generated 5 exabytes of data.
• In 2004, the US alone produced 5 exabytes of data every two days, and the rate of growth is accelerating rapidly.
• 1 exabyte = 1 million terabytes
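The unit conversion above can be checked with a few lines of Python. This is a minimal sketch that assumes decimal (SI) units, where 1 terabyte is 10^12 bytes and 1 exabyte is 10^18 bytes; the constant and function names are illustrative and do not come from the slides.

```python
# Decimal (SI) byte units. Binary units (TiB, EiB) would give slightly different figures.
TERABYTE = 10**12
EXABYTE = 10**18


def exabytes_to_terabytes(exabytes):
    """Convert a whole number of exabytes to terabytes using SI units."""
    return exabytes * EXABYTE // TERABYTE


print(exabytes_to_terabytes(1))  # 1000000
print(exabytes_to_terabytes(5))  # 5000000
```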
What’s Big Data?
The IT industry defines Big Data using 4 Vs:
• Volume: amount of data
• Velocity: speed of data arrival
• Variety: text, image, video
• Veracity: trustworthiness
What is Big Data
• Volume – petabytes, exabytes, etc.
• Variety – structured and unstructured data, such as video, Twitter trends, and free-form text
• Velocity – streaming data arriving in real time
• Veracity – trustworthiness of data after removing biases, noise, and abnormalities
Initial Challenges of Big Data
• Prohibitively expensive hardware – computing, networking, and storage
• Small pool of Big Data experts
• Lack of awareness about the benefits of data collected from different sources
• Analytics to process real-time data in milliseconds
Tools for Big Data
• Clusters of distributed computing nodes built from commodity hardware
• MapReduce framework to run parallel computation on a Hadoop cluster
• NoSQL databases – for example, key-value and column-oriented stores, for very fast data retrieval
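To illustrate why key-value storage supports fast retrieval, here is a toy in-memory sketch. It is not how any particular NoSQL product is implemented; it only shows the access pattern, where a value is fetched directly by its key through a hash table instead of by scanning rows. The class name, keys, and values are hypothetical.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a Python dict (a hash table)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Store a value under a key, overwriting any previous value."""
        self._data[key] = value

    def get(self, key, default=None):
        """Look a value up by key; average cost does not grow with store size."""
        return self._data.get(key, default)


store = KeyValueStore()
store.put("sensor:17:temperature", 21.5)  # hypothetical key and reading
print(store.get("sensor:17:temperature"))  # 21.5
print(store.get("sensor:99:temperature"))  # None
```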
Tools For Analyzing Big Data
Hadoop
• Batch-processing framework that uses a distributed cluster running on low-cost multi-core computers.
Tools For Analyzing Big Data
Splunk
• Real-time analytics software to process streaming data from millions of sensors.
Tools For Analyzing Big Data
• MapReduce programming model that distributes small chunks of data across thousands of nodes for parallel processing and combines the output from each node to solve the Big Data problem
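The model described above can be sketched as a word count, the usual introductory MapReduce example. This is a single-process illustration, not Hadoop code: the function names are assumptions made for the sketch, and the chunks are processed one after another rather than on separate nodes.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    # On a real cluster each chunk would be mapped on a different node;
    # here the chunks are handled sequentially in one process.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


chunks = ["big data is big", "data arrives fast"]
counts = word_count(chunks)
print(counts["big"], counts["data"])  # 2 2
```

Because each call to `map_phase` depends only on its own chunk, the map step can run in parallel; only the shuffle and reduce steps need to see results from more than one chunk.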
Big Data Impact Across All Industries
Health Care
• MIT Technology Review report on data-driven health care: “The New Medical Data Eco System”
• Analytics and predictive modeling of medical data captured from many sources (insurance claims, public health data, mobile health data, and electronic medical records) to provide personalized patient care and help doctors quickly decide on the best treatments
Insurance Claims Data
• Trends in drug and treatment usage
Environmental Data
• “Sensors can pick up behavioral information
• Ex: Mapping, Location, and Weather Data.”
Genomic Data
“Less expensive genome
sequencing offers insight into
the role genetics may play”
Public Health Data
• “Insight into community health patterns from federal and state data.”
Mobile Health Data
• “100,000-plus mobile health apps, plus wearable devices that measure activity and bodily function, offer a constant read on patients.”
Electronic Medical Records
• Digital records include lab and test results, drug prescriptions, and physicians’ reports.
• Together, these records create a family health history
Outcome
• “Analytic algorithms and predictive modeling mine the layers of data for patterns and insight” (MIT Technology Review)
Outcome
• Patients
“More precise and personalized diagnosis and care based on a holistic view may become possible.”
• Doctors
“Decision-support tools could help quickly evaluate the best treatments”
Outcome
• Researchers
“Detailed information from many patients, along with other data, could lead to new insights into disease and treatment.”
Use Cases Across All Industries
• Recommendation Engine
• Customer Sentiment Analysis
• Marketing Campaign Analysis
• Social Network Analysis
• Fraud Detection
• Risk Analysis
Retail
• Macy’s Inc.
Optimizing pricing of 73 million items based on real-time market data.
Retail
• Wal-Mart
1. Displaying search results based on the semantics of search terms
2. Customer behavior prediction
3. Supply chain management: analysis of millions of point-of-sale records in real time.
Future Trends
Health Care
• Important tool for cost reduction
• Adoption of EMR (Electronic Medical Records) for better patient treatment plans
Agriculture
• Real-time tracking of farm machinery
Future Trends
• Internet Protocol becoming standard in the electricity grid, the oil industry, etc.
• IPv6, with its 128-bit addresses, will theoretically allow trillions of trillions of sensors to connect previously unconnected places, people, and things
• Digitization of massive amounts of data currently stored in non-digital form
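The size of the 128-bit IPv6 address space mentioned above can be worked out directly. The sketch below is a rough check, not a statement about how many devices can really be addressed, since parts of the space are reserved. It uses Python's standard `ipaddress` module as a cross-check; the variable names are illustrative.

```python
import ipaddress

ADDRESS_BITS = 128
total_addresses = 2 ** ADDRESS_BITS
TRILLION = 10 ** 12  # short-scale trillion

print(total_addresses)  # 340282366920938463463374607431768211456

# "Trillions of trillions" (10**24) is a tiny fraction of the full space.
print(total_addresses > TRILLION * TRILLION)  # True

# Cross-check: the network "::/0" covers every IPv6 address.
print(ipaddress.ip_network("::/0").num_addresses == total_addresses)  # True
```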
Citation
• Big Data@Work, Thomas H. Davenport
• MIT Technology Review
• Harness the Power of Big Data, Paul Zikopoulos, Dirk deRoos, Krishnan Parasuraman, Thomas Deutsch, David Corrigan, James Giles
• The Human Face of Big Data, Rick Smolan and Jennifer Erwitt

Editor's Notes

  1. Keep your data clean, and put a process in place to keep “dirty data” from accumulating in your system.