Understanding Big Data
Overview 
Why Big Data 
Big Data Users 
What is Big Data
Big Data 
Big data is a popular term for the exponential growth and availability of data, both structured and
unstructured. Big data may become as important to business, and to society, as the Internet has become. Why?
More data may lead to more accurate analyses.
More accurate analyses may lead to more confident decision making, and better decisions can mean greater
operational efficiency, lower costs and reduced risk.
Big Data 
The mainstream definition of big data rests on the three V's: volume, velocity and variety.
Big Data 
• Volume: Many factors contribute to the increase in data volume: transaction-based data stored through
the years, unstructured data streaming in from social media, and increasing amounts of sensor and
machine-to-machine data being collected.
• Velocity: Data is streaming in at unprecedented speed and must be dealt with in a timely manner. RFID
tags, sensors and smart metering are driving the need to deal with torrents of data in near-real time.
Reacting quickly enough to deal with data velocity is a challenge for most organizations.
• Variety: Data today comes in all types of formats: structured, numeric data in traditional databases;
information created from line-of-business applications; and unstructured text documents, email, video, audio,
stock ticker data and financial transactions. Managing, merging and governing different varieties of data is
something many organizations still grapple with.
Big Data 
Let us consider two additional dimensions when thinking about big data: 
• Variability: In addition to the increasing velocities and varieties of data, data flows can be highly
inconsistent, with periodic peaks (for example, when something is trending in social media). Daily, seasonal
and event-triggered peak data loads can be challenging to manage, even more so when unstructured data is involved.
• Complexity: Today's data comes from multiple sources, and it is still an undertaking to link, match, cleanse
and transform data across systems. It is nevertheless necessary to connect and correlate relationships,
hierarchies and multiple data linkages, or your data can quickly spiral out of control.
Why Big Data?
The real issue is not that you are acquiring large amounts of data; it is what you do with the data that counts. The
hopeful vision is that organizations will be able to take data from any source, harness the relevant data and analyse
it to find answers that enable:
1) cost reductions 
2) time reductions 
3) new product development and optimized offerings 
4) smarter business decision making.
Big Data Ecosystem
A big data platform typically works by first storing data in clusters and then processing it through
MapReduce workflows. In the Map phase, the input data is split into independent chunks, each processed
by an appropriate algorithm. The output of the Map phase then moves to the Shuffle/Sort phase, and finally
the output of the Shuffle phase becomes the input to the Reduce phase.
Typical Big Data MapReduce workflow:
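The Map, Shuffle/Sort and Reduce phases can be illustrated with a minimal single-process sketch in Python. This is not Hadoop code and does not come from the slides: the word-count task and the function names map_phase, shuffle_phase, reduce_phase and word_count are assumptions chosen purely for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle/Sort: group the emitted values by key and order the keys."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped}


def word_count(chunks):
    # Each chunk is mapped independently; on a real cluster these calls
    # could run in parallel on different nodes (here they run in sequence).
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    # Two hypothetical input chunks standing in for blocks of a large file.
    print(word_count(["big data is big", "data about data"]))
```

Because every chunk is mapped without reference to any other chunk, the Map step is the part that a cluster can spread across machines; only the Shuffle/Sort step needs to bring values for the same key together.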
Big Data users in Next Five Years
Big Data users in Private Sector 
• eBay.com uses two data warehouses, at 7.5 petabytes and 40 PB, as well as a 40 PB Hadoop cluster for search,
consumer recommendations, and merchandising.
• Amazon.com handles millions of back-end operations every day, as well as queries from more than half a 
million third-party sellers. The core technology that keeps Amazon running is Linux-based and as of 2005 
they had the world’s three largest Linux databases, with capacities of 7.8 TB, 18.5 TB, and 24.7 TB. 
• Walmart handles more than 1 million customer transactions every hour, which are imported into databases 
estimated to contain more than 2.5 petabytes (2560 terabytes) of data – the equivalent of 167 times the 
information contained in all the books in the US Library of Congress. 
• Facebook handles 50 billion photos from its user base.
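To put the Walmart figures above on a more familiar scale, the short Python sketch below converts them to per-second and terabyte units. The input numbers are the ones quoted in the slide; the use of binary prefixes (1 PB = 1024 TB) is an assumption, chosen because it reproduces the slide's 2560 TB.

```python
# Figures quoted in the slide for Walmart.
transactions_per_hour = 1_000_000
petabytes = 2.5

# Assumption: binary prefixes, so 1 PB = 1024 TB.
TB_PER_PB = 1024

transactions_per_second = transactions_per_hour / 3600
terabytes = petabytes * TB_PER_PB

print(f"about {transactions_per_second:.0f} transactions per second")
print(f"{terabytes:.0f} TB")
```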
Big Data users in Private Sector 
• FICO Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts worldwide.
• The volume of business data worldwide, across all companies, doubles every 1.2 years, according to estimates.
• Windermere Real Estate uses anonymous GPS signals from nearly 100 million drivers to help new home 
buyers determine their typical drive times to and from work throughout various times of the day.
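The doubling estimate quoted above can be turned into a growth factor with a one-line formula. The Python sketch below assumes the 1.2-year doubling time stays constant, which the slide does not claim; the function name growth_factor is a hypothetical helper introduced for illustration.

```python
DOUBLING_TIME_YEARS = 1.2  # estimate quoted in the slide


def growth_factor(years, doubling_time=DOUBLING_TIME_YEARS):
    """Factor by which the data volume grows over `years`,
    assuming it keeps doubling every `doubling_time` years."""
    return 2 ** (years / doubling_time)


if __name__ == "__main__":
    for years in (1.2, 2.4, 6):
        print(f"after {years} years: x{growth_factor(years):.1f}")
```

Under that assumption, six years correspond to five doublings, so the volume would grow by a factor of 2^5 = 32.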
Thank you.
Queries are welcome 
Praneet Samaiya
