SlideShare uma empresa Scribd logo
1 de 14
1© Copyright 2013 EMC Corporation. All rights reserved.
Big Data – General
Introduction
- Vignesh Gopalan , IIG
2© Copyright 2013 EMC Corporation. All rights reserved.
Agenda
Big Data – Definition
Importance of Big Data
Technologies used in Big Data Analysis
3© Copyright 2013 EMC Corporation. All rights reserved.
Big Data – A Definition
Volume
Variety
Velocity
Veracity
The ‘V’s of Big Data
4© Copyright 2013 EMC Corporation. All rights reserved.
Why is Big Data Important?
Business Analytics
Big Science like LHC, Gene Sequencing Programs
Big Government
5© Copyright 2013 EMC Corporation. All rights reserved.
Big Data – Technologies Primer
MapReduce computation framework and Hadoop
Distributed File System
Distributed databases
NoSQL technologies
6© Copyright 2013 EMC Corporation. All rights reserved.
MapReduce
Published by Google
Scalable
Fault-Tolerant
Batch Computation in parallel
A distributed computation framework
7© Copyright 2013 EMC Corporation. All rights reserved.
MapReduce
Consists of two functions operating
on key-value pairs.
Map – performs filtering and sorting
Reduce - performs summary operation on
Map step results.
… continued
8© Copyright 2013 EMC Corporation. All rights reserved.
Map Reduce…
Image Courtesy – Big Data by Nathan Marz , James Warren, Manning Publications
9© Copyright 2013 EMC Corporation. All rights reserved.
Map Reduce…
Image Courtesy – Big Data by Nathan Marz , James Warren, Manning Publications
10© Copyright 2013 EMC Corporation. All rights reserved.
Distributed File System
Distributed and scalable file system
Highly Available
Intrinsically aware of Map and Reduce jobs
Supports horizontal and vertical partitioning
HDFS – Hadoop Distributed File System
11© Copyright 2013 EMC Corporation. All rights reserved.
HDFS Architecture
Image Courtesy – Big Data by Nathan Marz , James Warren, Manning Publications
12© Copyright 2013 EMC Corporation. All rights reserved.
Apache Hadoop
Open Source implementation of MapReduce + DFS
Image Courtesy – Wikipedia
13© Copyright 2013 EMC Corporation. All rights reserved.
NoSQL Databases
Highly optimized key-value stores
No ACID Guarantees. Eventual consistency
Fault-Tolerant, Distributed architecture.
Amazon Dynamo, Redis are examples.
A distributed computation framework
Big Data – General Introduction

Mais conteúdo relacionado

Mais procurados

EMC Isilon Solutions for Data Archives
EMC Isilon Solutions for Data ArchivesEMC Isilon Solutions for Data Archives
EMC Isilon Solutions for Data Archivessolarisyougood
 
Transform Your Business with Big Data Storage
Transform Your Business with Big Data StorageTransform Your Business with Big Data Storage
Transform Your Business with Big Data StorageEMC
 
EMC isilon for -media-and-entertainment-sales-deck
EMC isilon for -media-and-entertainment-sales-deckEMC isilon for -media-and-entertainment-sales-deck
EMC isilon for -media-and-entertainment-sales-decksolarisyougood
 
White Paper: Best Practices for Data Replication with EMC Isilon SyncIQ
White Paper: Best Practices for Data Replication with EMC Isilon SyncIQ   White Paper: Best Practices for Data Replication with EMC Isilon SyncIQ
White Paper: Best Practices for Data Replication with EMC Isilon SyncIQ EMC
 
7. emc isilon hdfs enterprise storage for hadoop
7. emc isilon hdfs   enterprise storage for hadoop7. emc isilon hdfs   enterprise storage for hadoop
7. emc isilon hdfs enterprise storage for hadoopTaldor Group
 
White Paper: EMC Isilon OneFS Operating System
White Paper: EMC Isilon OneFS Operating System  White Paper: EMC Isilon OneFS Operating System
White Paper: EMC Isilon OneFS Operating System EMC
 
EMC IT's Journey to the Private Cloud: A Practitioner's Guide
EMC IT's Journey to the Private Cloud: A Practitioner's Guide EMC IT's Journey to the Private Cloud: A Practitioner's Guide
EMC IT's Journey to the Private Cloud: A Practitioner's Guide EMC
 
White Paper: EMC Isilon OneFS — A Technical Overview
White Paper: EMC Isilon OneFS — A Technical Overview   White Paper: EMC Isilon OneFS — A Technical Overview
White Paper: EMC Isilon OneFS — A Technical Overview EMC
 
EMC Academic Alliance Presentation
EMC Academic Alliance PresentationEMC Academic Alliance Presentation
EMC Academic Alliance PresentationHaitham El-Ghareeb
 
The Future of Storage : EMC Software Defined Solution
The Future of Storage : EMC Software Defined Solution The Future of Storage : EMC Software Defined Solution
The Future of Storage : EMC Software Defined Solution RSD
 
EMC ScaleIO Overview
EMC ScaleIO OverviewEMC ScaleIO Overview
EMC ScaleIO Overviewwalshe1
 
Emc vi pr data services
Emc vi pr data servicesEmc vi pr data services
Emc vi pr data servicessolarisyougood
 
Emc isilon technical deep dive workshop
Emc isilon technical deep dive workshopEmc isilon technical deep dive workshop
Emc isilon technical deep dive workshopsolarisyougood
 
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...EMC
 
Transforming Mission Critical Applications
Transforming Mission Critical ApplicationsTransforming Mission Critical Applications
Transforming Mission Critical ApplicationsCenk Ersoy
 
EMC ViPR Services Storage Engine Architecture
EMC ViPR Services Storage Engine ArchitectureEMC ViPR Services Storage Engine Architecture
EMC ViPR Services Storage Engine ArchitectureEMC
 
S104875 nightmares-dreams-spectrum-control-jburg-v1809h
S104875 nightmares-dreams-spectrum-control-jburg-v1809hS104875 nightmares-dreams-spectrum-control-jburg-v1809h
S104875 nightmares-dreams-spectrum-control-jburg-v1809hTony Pearson
 
EMC-ISILON_MphasiS_Walk_through
EMC-ISILON_MphasiS_Walk_throughEMC-ISILON_MphasiS_Walk_through
EMC-ISILON_MphasiS_Walk_throughprakashjjaya
 
Journey to the Software Defined Data Center: EMA Research Results Revealed
Journey to the Software Defined Data Center: EMA Research Results Revealed Journey to the Software Defined Data Center: EMA Research Results Revealed
Journey to the Software Defined Data Center: EMA Research Results Revealed Enterprise Management Associates
 

Mais procurados (20)

EMC Isilon Solutions for Data Archives
EMC Isilon Solutions for Data ArchivesEMC Isilon Solutions for Data Archives
EMC Isilon Solutions for Data Archives
 
Transform Your Business with Big Data Storage
Transform Your Business with Big Data StorageTransform Your Business with Big Data Storage
Transform Your Business with Big Data Storage
 
EMC isilon for -media-and-entertainment-sales-deck
EMC isilon for -media-and-entertainment-sales-deckEMC isilon for -media-and-entertainment-sales-deck
EMC isilon for -media-and-entertainment-sales-deck
 
White Paper: Best Practices for Data Replication with EMC Isilon SyncIQ
White Paper: Best Practices for Data Replication with EMC Isilon SyncIQ   White Paper: Best Practices for Data Replication with EMC Isilon SyncIQ
White Paper: Best Practices for Data Replication with EMC Isilon SyncIQ
 
7. emc isilon hdfs enterprise storage for hadoop
7. emc isilon hdfs   enterprise storage for hadoop7. emc isilon hdfs   enterprise storage for hadoop
7. emc isilon hdfs enterprise storage for hadoop
 
White Paper: EMC Isilon OneFS Operating System
White Paper: EMC Isilon OneFS Operating System  White Paper: EMC Isilon OneFS Operating System
White Paper: EMC Isilon OneFS Operating System
 
EMC IT's Journey to the Private Cloud: A Practitioner's Guide
EMC IT's Journey to the Private Cloud: A Practitioner's Guide EMC IT's Journey to the Private Cloud: A Practitioner's Guide
EMC IT's Journey to the Private Cloud: A Practitioner's Guide
 
White Paper: EMC Isilon OneFS — A Technical Overview
White Paper: EMC Isilon OneFS — A Technical Overview   White Paper: EMC Isilon OneFS — A Technical Overview
White Paper: EMC Isilon OneFS — A Technical Overview
 
EMC Academic Alliance Presentation
EMC Academic Alliance PresentationEMC Academic Alliance Presentation
EMC Academic Alliance Presentation
 
The Future of Storage : EMC Software Defined Solution
The Future of Storage : EMC Software Defined Solution The Future of Storage : EMC Software Defined Solution
The Future of Storage : EMC Software Defined Solution
 
EMC ScaleIO Overview
EMC ScaleIO OverviewEMC ScaleIO Overview
EMC ScaleIO Overview
 
Emc isilon overview
Emc isilon overview Emc isilon overview
Emc isilon overview
 
Emc vi pr data services
Emc vi pr data servicesEmc vi pr data services
Emc vi pr data services
 
Emc isilon technical deep dive workshop
Emc isilon technical deep dive workshopEmc isilon technical deep dive workshop
Emc isilon technical deep dive workshop
 
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
 
Transforming Mission Critical Applications
Transforming Mission Critical ApplicationsTransforming Mission Critical Applications
Transforming Mission Critical Applications
 
EMC ViPR Services Storage Engine Architecture
EMC ViPR Services Storage Engine ArchitectureEMC ViPR Services Storage Engine Architecture
EMC ViPR Services Storage Engine Architecture
 
S104875 nightmares-dreams-spectrum-control-jburg-v1809h
S104875 nightmares-dreams-spectrum-control-jburg-v1809hS104875 nightmares-dreams-spectrum-control-jburg-v1809h
S104875 nightmares-dreams-spectrum-control-jburg-v1809h
 
EMC-ISILON_MphasiS_Walk_through
EMC-ISILON_MphasiS_Walk_throughEMC-ISILON_MphasiS_Walk_through
EMC-ISILON_MphasiS_Walk_through
 
Journey to the Software Defined Data Center: EMA Research Results Revealed
Journey to the Software Defined Data Center: EMA Research Results Revealed Journey to the Software Defined Data Center: EMA Research Results Revealed
Journey to the Software Defined Data Center: EMA Research Results Revealed
 

Destaque

EMC & OpenStack: A View From Within
EMC & OpenStack: A View From WithinEMC & OpenStack: A View From Within
EMC & OpenStack: A View From WithinEMC
 
Integumentary Terms
Integumentary TermsIntegumentary Terms
Integumentary Termstahearn40
 
Recovered file 1
Recovered file 1Recovered file 1
Recovered file 1vicolombia
 
Insaat kursu-pendik
Insaat kursu-pendikInsaat kursu-pendik
Insaat kursu-pendiksersld54
 
What a dude born in 1888 taught me about design.
What a dude born in 1888 taught me about design. What a dude born in 1888 taught me about design.
What a dude born in 1888 taught me about design. Kylie Timpani
 
Building the Case for New Technology Have Inspiration, Will Travel ...
Building the Case for New Technology Have Inspiration, Will Travel ...Building the Case for New Technology Have Inspiration, Will Travel ...
Building the Case for New Technology Have Inspiration, Will Travel ...Society of Women Engineers
 
EMC World 2016 - code.01 Everything as Code - How did we get here?
EMC World 2016 - code.01 Everything as Code - How did we get here?EMC World 2016 - code.01 Everything as Code - How did we get here?
EMC World 2016 - code.01 Everything as Code - How did we get here?{code}
 
iNARTE Presentation to EMC Symposium 2016
iNARTE Presentation to EMC Symposium 2016iNARTE Presentation to EMC Symposium 2016
iNARTE Presentation to EMC Symposium 2016Scott Paton
 
EMC Documentum Enterprise Content Management 6.5
EMC Documentum Enterprise Content Management 6.5EMC Documentum Enterprise Content Management 6.5
EMC Documentum Enterprise Content Management 6.5Emirates Computers
 

Destaque (17)

EMC & OpenStack: A View From Within
EMC & OpenStack: A View From WithinEMC & OpenStack: A View From Within
EMC & OpenStack: A View From Within
 
Integumentary Terms
Integumentary TermsIntegumentary Terms
Integumentary Terms
 
Recovered file 1
Recovered file 1Recovered file 1
Recovered file 1
 
Prototipo
Prototipo Prototipo
Prototipo
 
Conclave
ConclaveConclave
Conclave
 
Insaat kursu-pendik
Insaat kursu-pendikInsaat kursu-pendik
Insaat kursu-pendik
 
InDA Brochure India
InDA Brochure IndiaInDA Brochure India
InDA Brochure India
 
What a dude born in 1888 taught me about design.
What a dude born in 1888 taught me about design. What a dude born in 1888 taught me about design.
What a dude born in 1888 taught me about design.
 
Spain
SpainSpain
Spain
 
SLG_EMC
SLG_EMCSLG_EMC
SLG_EMC
 
Building the Case for New Technology Have Inspiration, Will Travel ...
Building the Case for New Technology Have Inspiration, Will Travel ...Building the Case for New Technology Have Inspiration, Will Travel ...
Building the Case for New Technology Have Inspiration, Will Travel ...
 
EMC World 2016 - code.01 Everything as Code - How did we get here?
EMC World 2016 - code.01 Everything as Code - How did we get here?EMC World 2016 - code.01 Everything as Code - How did we get here?
EMC World 2016 - code.01 Everything as Code - How did we get here?
 
iNARTE Presentation to EMC Symposium 2016
iNARTE Presentation to EMC Symposium 2016iNARTE Presentation to EMC Symposium 2016
iNARTE Presentation to EMC Symposium 2016
 
Dell corporation ltd
Dell corporation ltdDell corporation ltd
Dell corporation ltd
 
EMC World 2016 Summary (Part 1)
EMC World 2016 Summary (Part 1)EMC World 2016 Summary (Part 1)
EMC World 2016 Summary (Part 1)
 
EMC Documentum Enterprise Content Management 6.5
EMC Documentum Enterprise Content Management 6.5EMC Documentum Enterprise Content Management 6.5
EMC Documentum Enterprise Content Management 6.5
 
Unit 7 Book
Unit 7 Book Unit 7 Book
Unit 7 Book
 

Semelhante a Big Data – General Introduction

Big data with hadoop
Big data with hadoopBig data with hadoop
Big data with hadoopAnusha sweety
 
hadoop seminar training report
hadoop seminar  training reporthadoop seminar  training report
hadoop seminar training reportSarvesh Meena
 
Cloud Models, Considerations, & Adoption Techniques
Cloud Models, Considerations, & Adoption TechniquesCloud Models, Considerations, & Adoption Techniques
Cloud Models, Considerations, & Adoption TechniquesEMC
 
Big Data Performance and Capacity Management
Big Data Performance and Capacity ManagementBig Data Performance and Capacity Management
Big Data Performance and Capacity Managementrightsize
 
True Storage Virtualization with Software-Defined Storage
True Storage Virtualization with Software-Defined StorageTrue Storage Virtualization with Software-Defined Storage
True Storage Virtualization with Software-Defined StorageCloudOps Summit
 
Hadoop_Its_Not_Just_Internal_Storage_V14
Hadoop_Its_Not_Just_Internal_Storage_V14Hadoop_Its_Not_Just_Internal_Storage_V14
Hadoop_Its_Not_Just_Internal_Storage_V14John Sing
 
A Survey on Big Data Analysis Techniques
A Survey on Big Data Analysis TechniquesA Survey on Big Data Analysis Techniques
A Survey on Big Data Analysis Techniquesijsrd.com
 
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...EMC
 
Hadoop Overview
Hadoop Overview Hadoop Overview
Hadoop Overview EMC
 
Big Data using NoSQL Technologies
Big Data using NoSQL TechnologiesBig Data using NoSQL Technologies
Big Data using NoSQL TechnologiesAmit Singh
 
Pivotal: Virtualize Big Data to Make the Elephant Dance
Pivotal: Virtualize Big Data to Make the Elephant DancePivotal: Virtualize Big Data to Make the Elephant Dance
Pivotal: Virtualize Big Data to Make the Elephant DanceEMC
 
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...Mahantesh Angadi
 
Cloud Computing Ambiance using Secluded Access Control Method
Cloud Computing Ambiance using Secluded Access Control MethodCloud Computing Ambiance using Secluded Access Control Method
Cloud Computing Ambiance using Secluded Access Control MethodIRJET Journal
 
Hadoop by kamran khan
Hadoop by kamran khanHadoop by kamran khan
Hadoop by kamran khanKamranKhan587
 
Emc vi pr global data services
Emc vi pr global data servicesEmc vi pr global data services
Emc vi pr global data servicessolarisyougood
 

Semelhante a Big Data – General Introduction (20)

Greenplum feature
Greenplum featureGreenplum feature
Greenplum feature
 
Big Data and its Possibilities in the Cloud
Big Data and its Possibilities in the CloudBig Data and its Possibilities in the Cloud
Big Data and its Possibilities in the Cloud
 
Big data with hadoop
Big data with hadoopBig data with hadoop
Big data with hadoop
 
hadoop seminar training report
hadoop seminar  training reporthadoop seminar  training report
hadoop seminar training report
 
Cloud Models, Considerations, & Adoption Techniques
Cloud Models, Considerations, & Adoption TechniquesCloud Models, Considerations, & Adoption Techniques
Cloud Models, Considerations, & Adoption Techniques
 
Big Data Performance and Capacity Management
Big Data Performance and Capacity ManagementBig Data Performance and Capacity Management
Big Data Performance and Capacity Management
 
True Storage Virtualization with Software-Defined Storage
True Storage Virtualization with Software-Defined StorageTrue Storage Virtualization with Software-Defined Storage
True Storage Virtualization with Software-Defined Storage
 
Hadoop_Its_Not_Just_Internal_Storage_V14
Hadoop_Its_Not_Just_Internal_Storage_V14Hadoop_Its_Not_Just_Internal_Storage_V14
Hadoop_Its_Not_Just_Internal_Storage_V14
 
Hadoop technology
Hadoop technologyHadoop technology
Hadoop technology
 
Hadoop map reduce for mobile clouds
Hadoop map reduce for mobile cloudsHadoop map reduce for mobile clouds
Hadoop map reduce for mobile clouds
 
Hadoop Everywhere
Hadoop EverywhereHadoop Everywhere
Hadoop Everywhere
 
A Survey on Big Data Analysis Techniques
A Survey on Big Data Analysis TechniquesA Survey on Big Data Analysis Techniques
A Survey on Big Data Analysis Techniques
 
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
 
Hadoop Overview
Hadoop Overview Hadoop Overview
Hadoop Overview
 
Big Data using NoSQL Technologies
Big Data using NoSQL TechnologiesBig Data using NoSQL Technologies
Big Data using NoSQL Technologies
 
Pivotal: Virtualize Big Data to Make the Elephant Dance
Pivotal: Virtualize Big Data to Make the Elephant DancePivotal: Virtualize Big Data to Make the Elephant Dance
Pivotal: Virtualize Big Data to Make the Elephant Dance
 
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
 
Cloud Computing Ambiance using Secluded Access Control Method
Cloud Computing Ambiance using Secluded Access Control MethodCloud Computing Ambiance using Secluded Access Control Method
Cloud Computing Ambiance using Secluded Access Control Method
 
Hadoop by kamran khan
Hadoop by kamran khanHadoop by kamran khan
Hadoop by kamran khan
 
Emc vi pr global data services
Emc vi pr global data servicesEmc vi pr global data services
Emc vi pr global data services
 

Mais de EMC

Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote EMC
 
EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC
 
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOTransforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOEMC
 
Citrix ready-webinar-xtremio
Citrix ready-webinar-xtremioCitrix ready-webinar-xtremio
Citrix ready-webinar-xtremioEMC
 
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC
 
EMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC
 
Modern infrastructure for business data lake
Modern infrastructure for business data lakeModern infrastructure for business data lake
Modern infrastructure for business data lakeEMC
 
Force Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereForce Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereEMC
 
Pivotal : Moments in Container History
Pivotal : Moments in Container History Pivotal : Moments in Container History
Pivotal : Moments in Container History EMC
 
Data Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewData Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewEMC
 
Mobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeMobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeEMC
 
Virtualization Myths Infographic
Virtualization Myths Infographic Virtualization Myths Infographic
Virtualization Myths Infographic EMC
 
Intelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityIntelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityEMC
 
The Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeThe Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeEMC
 
EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC
 
EMC Academic Summit 2015
EMC Academic Summit 2015EMC Academic Summit 2015
EMC Academic Summit 2015EMC
 
Data Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesData Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesEMC
 
Using EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsUsing EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsEMC
 
Using EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookUsing EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookEMC
 
2014 Cybercrime Roundup: The Year of the POS Breach
2014 Cybercrime Roundup: The Year of the POS Breach2014 Cybercrime Roundup: The Year of the POS Breach
2014 Cybercrime Roundup: The Year of the POS BreachEMC
 

Mais de EMC (20)

Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote
 
EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX
 
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOTransforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
 
Citrix ready-webinar-xtremio
Citrix ready-webinar-xtremioCitrix ready-webinar-xtremio
Citrix ready-webinar-xtremio
 
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
 
EMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC with Mirantis Openstack
EMC with Mirantis Openstack
 
Modern infrastructure for business data lake
Modern infrastructure for business data lakeModern infrastructure for business data lake
Modern infrastructure for business data lake
 
Force Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereForce Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop Elsewhere
 
Pivotal : Moments in Container History
Pivotal : Moments in Container History Pivotal : Moments in Container History
Pivotal : Moments in Container History
 
Data Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewData Lake Protection - A Technical Review
Data Lake Protection - A Technical Review
 
Mobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeMobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or Foe
 
Virtualization Myths Infographic
Virtualization Myths Infographic Virtualization Myths Infographic
Virtualization Myths Infographic
 
Intelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityIntelligence-Driven GRC for Security
Intelligence-Driven GRC for Security
 
The Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeThe Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure Age
 
EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015
 
EMC Academic Summit 2015
EMC Academic Summit 2015EMC Academic Summit 2015
EMC Academic Summit 2015
 
Data Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesData Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education Services
 
Using EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsUsing EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere Environments
 
Using EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookUsing EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBook
 
2014 Cybercrime Roundup: The Year of the POS Breach
2014 Cybercrime Roundup: The Year of the POS Breach2014 Cybercrime Roundup: The Year of the POS Breach
2014 Cybercrime Roundup: The Year of the POS Breach
 

Último

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 

Último (20)

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

Big Data – General Introduction

  • 1. 1© Copyright 2013 EMC Corporation. All rights reserved. Big Data – General Introduction - Vignesh Gopalan , IIG
  • 2. 2© Copyright 2013 EMC Corporation. All rights reserved. Agenda Big Data – Definition Importance of Big Data Technologies used in Big Data Analysis
  • 3. 3© Copyright 2013 EMC Corporation. All rights reserved. Big Data – A Definition Volume Variety Velocity Veracity The ‘V’s of Big Data
  • 4. 4© Copyright 2013 EMC Corporation. All rights reserved. Why is Big Data Important? Business Analytics Big Science like LHC, Gene Sequencing Programs Big Government
  • 5. 5© Copyright 2013 EMC Corporation. All rights reserved. Big Data – Technologies Primer MapReduce computation framework and Hadoop Distributed File System Distributed databases NoSQL technologies
  • 6. 6© Copyright 2013 EMC Corporation. All rights reserved. MapReduce Published by Google Scalable Fault-Tolerant Batch Computation in parallel A distributed computation framework
  • 7. 7© Copyright 2013 EMC Corporation. All rights reserved. MapReduce Consists of two functions operating on key-value pairs. Map – performs filtering and sorting Reduce - performs summary operation on Map step results. … continued
  • 8. 8© Copyright 2013 EMC Corporation. All rights reserved. Map Reduce… Image Courtesy – Big Data by Nathan Marz , James Warren, Manning Publications
  • 9. 9© Copyright 2013 EMC Corporation. All rights reserved. Map Reduce… Image Courtesy – Big Data by Nathan Marz , James Warren, Manning Publications
  • 10. 10© Copyright 2013 EMC Corporation. All rights reserved. Distributed File System Distributed and scalable file system Highly Available Intrinsically aware of Map and Reduce jobs Supports horizontal and vertical partitioning HDFS – Hadoop Distributed File System
  • 11. 11© Copyright 2013 EMC Corporation. All rights reserved. HDFS Architecture Image Courtesy – Big Data by Nathan Marz , James Warren, Manning Publications
  • 12. 12© Copyright 2013 EMC Corporation. All rights reserved. Apache Hadoop Open Source implementation of MapReduce + DFS Image Courtesy – Wikipedia
  • 13. 13© Copyright 2013 EMC Corporation. All rights reserved. NoSQL Databases Highly optimized key-value stores No ACID Guarantees. Eventual consistency Fault-Tolerant, Distributed architecture. Amazon Dynamo, Redis are examples. A distributed computation framework