What is Big Data?
• Huge amounts of data (terabytes or petabytes)
• Big data is the term for a collection of data sets
so large and complex that it becomes difficult to
process using on-hand database management
tools or traditional data processing applications
• The challenges include capture, curation, storage,
search, sharing, transfer, analysis, and
visualization
"Let's solve this problem by using the big data none of us have the slightest idea what to do with."
What is Big Data? (Cont’d)
• Every day we create roughly 2.5 quintillion bytes of data;
90% of the world's collected data has been generated in
the last 2 years alone
• Data sizes are now measured in terabytes, petabytes, exabytes and
zettabytes
• Data is of various types – user content (posts, tweets),
images, audio and video files, chat logs, machine-
generated data
• The speed of data generation has increased –
2.5 quintillion bytes is 2,500,000,000,000,000,000
bytes of data EVERY DAY
• The data sources have multiplied many times over – social
media, banking, government, e-commerce etc.
Purpose of Big Data or Why Big Data?
• In 2015, we created more data than in all previous years combined, but 90% of that data is not
sorted.
• Big Data gives us a better and different picture of the data collected, but the problem is not solved yet. We still
need to find better ways of using it.
• And for all of this you need the best data scientists you can possibly get your hands on.
• Companies love data because it provides very lucrative insights to businesses and their clients. Data today
is the most important commodity. If companies can harness and analyse data, it gives them an unmatched
competitive advantage
• New-age tools like Hadoop have made data handling and processing easier and cheaper
• Cloud computing, hardware and memory are getting cheaper, so data storage is no longer a problem
• New analytical software provides real-time analysis of data for business decisions
Big Data Market Trend
• Market research firm IDC forecasts a 50% increase in revenues from the sale of big data and
business analytics software, hardware, and services between 2015 and 2019.
• IDC says services will account for the biggest chunk of revenue, with banking and
manufacturing-led industries poised to spend the most.
• IDC expects revenue generated by the US market for big data and business
analytics solutions to exceed $98 billion by 2019.
• Edge analytics is the next big thing in Big Data technology for companies that want to
gain real-time insights and impact business through IoT use cases. It will allow companies to
derive the most actionable value from their data.
• The key trend in Big Data is the transition of analytics solutions into the cloud. The cloud
enables vast amounts of computing resources to be applied to data analysis and lets that
computing scale based on need.
Big Data Learning Challenges
• Constant Upgrades: Every 3 to 6 months a new version of a technology is introduced. So by the time you have actually
learned something, a newer technology may have emerged, making your knowledge outdated.
• Data Extraction: The challenge is to extract the most important information out of massive volumes of data.
• Lack of Talent: A successful Big Data project requires a sophisticated team of developers, data
scientists and analysts with sufficient knowledge of Big Data, but people with these skills are in short supply.
• Data Quality: Dirty data costs the United States $600 billion every year. The common causes of dirty
data are user input errors, data duplication and incorrect data linking.
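As a rough illustration of the Data Quality point, the sketch below flags two of the causes named above: exact duplicates and empty fields left by user input errors. The function name, the field names and the sample records are all hypothetical, and incorrect data linking is not covered, since detecting it depends on the specific data model.

```python
from collections import Counter

def find_quality_issues(records, required_fields):
    """Return (duplicates, incomplete) for a list of dict records.

    duplicates: records whose full content appears more than once
    incomplete: records missing a required field or holding an empty value
    """
    # A sorted tuple of (key, value) pairs is a hashable fingerprint of a record.
    fingerprints = Counter(tuple(sorted(r.items())) for r in records)
    duplicates = [dict(fp) for fp, count in fingerprints.items() if count > 1]

    incomplete = [
        r for r in records
        if any(str(r.get(field, "")).strip() == "" for field in required_fields)
    ]
    return duplicates, incomplete


# Hypothetical sample data, invented for illustration only.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 1, "email": "a@example.com"},   # exact duplicate
    {"id": 2, "email": ""},                # user input error: empty field
    {"id": 3, "email": "c@example.com"},
]

duplicates, incomplete = find_quality_issues(records, ["id", "email"])
print("duplicates:", duplicates)
print("incomplete:", incomplete)
```

On real Big Data volumes the same checks would normally run inside a distributed engine rather than in a single Python process; the logic is what matters here.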
Metrics of Data Size
Until now, most of us have only been familiar with data measured in GB. The world is changing now:
Data Metrics Hierarchy
1024 BYTES = 1 KB
1024 KILOBYTES = 1 MB
1024 MEGABYTES = 1 GB
1024 GIGABYTES = 1 TB
1024 TERABYTES = 1 PB
1024 PETABYTES = 1 EB
1024 EXABYTES = 1 ZB
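The hierarchy above can be sketched as a small conversion helper. This is a minimal example that assumes the same factor of 1024 between neighbouring units as the table; the function name `human_readable` is made up for illustration.

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def human_readable(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1024."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop when the value fits this unit, or when no larger unit is left.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024

print(human_readable(1024 ** 3))   # 1.00 GB
print(human_readable(2.5e18))      # 2.17 EB
```

With this ladder, the 2.5 quintillion bytes generated per day quoted earlier in the deck comes out at roughly 2.17 EB.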
Why Has Data Generation Increased?
The advent of the internet is the primary reason for the data explosion that has taken place over the last 15 years
Every minute:
• Facebook - more than 2.5 million pieces of content
• Twitter - 300 thousand tweets
• Instagram - 250,000 new pictures
• YouTube - 75 hours of new video content
• Email - 400 million messages
• WhatsApp - 400,000 pictures
• Google - 5 million search requests
Other sources of Data generation
• E-commerce – More than 1.3 billion transactions per day
• Data logging devices – Wearables (e.g. Fitbit), healthcare monitoring, GPS systems, data sensors
• Financial Data – Stock exchanges, Banking transactions
• Aviation Industry – A typical flight generates half a TB of data
• Governments – Citizen data, Tax records etc.
Characteristics of Big Data
• Big Data is characterized by 4 V’s: Volume, Velocity, Variety and Veracity
Characteristics of Big Data – Volume
Volume: Refers to the enormous volumes of data
Data Warehouses
Characteristics of Big Data – Velocity
Velocity: It refers to the pace at which data is being generated, processed and consumed
Characteristics of Big Data – Variety
Variety: Data can be gathered from countless sources
Characteristics of Big Data – Veracity
Veracity: The quality and authenticity of the data being captured can vary greatly
Scope: Testing Aspects in Big Data
• Validation of Structured and Unstructured Data: Data needs to be classified into structured
and unstructured parts.
(i) Structured Data: Data that can be stored in tables (rows and columns) without any
processing, for example databases, call details and Excel sheets.
(ii) Unstructured Data: Data that does not have a predefined data model or structure, for example
weblogs, audio, tweets and comments.
Adequate time needs to be spent on validating the data at an initial stage, because this is the point
where we encounter an abundance of bad data from various sources.
• Execution of Non-Functional Testing: Non-functional testing plays a vital role in ensuring the
scalability of the process. Functional testing focuses on coding and requirement-related issues,
whereas non-functional testing identifies the performance bottlenecks and validates the
non-functional requirements.
• Handling Non-Relational Databases: Non-relational databases form the backbone of Big Data
storage. Since they are the main sources of data retrieval, they require a good portion of the testing effort to maintain
the accuracy of the system. Commonly known as NoSQL databases, they are designed to
handle Big Data easily and differ from traditional RDBMSs, which are designed on a
table/key model.
• Ace Test Environment: An efficient test environment ensures that data from multiple sources is
of acceptable quality for accurate analysis. Because replicating the complete set of big data
in the test environment is next to impossible, a small subset of the data is created for
the test environment to verify the behavior. Careful planning is required to exercise all paths
with subsets of data in a manner that fully verifies the application.
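One way to build the small test subset mentioned in the last bullet is reservoir sampling, which draws a fixed-size uniform sample from a stream without loading it all into memory. The technique is chosen here for illustration and is not prescribed by the slides; the function name and parameters are hypothetical.

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Pick k items uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for index, item in enumerate(stream):
        if index < k:
            sample.append(item)          # fill the reservoir first
        else:
            # Keep the new item with probability k / (index + 1),
            # overwriting a randomly chosen slot.
            slot = rng.randint(0, index)
            if slot < k:
                sample[slot] = item
    return sample

# Stand-in for a large data source: one million record ids.
subset = reservoir_sample(range(1_000_000), k=5, seed=42)
print(subset)
```

A uniform sample alone does not guarantee that every code path is exercised, so in practice it would likely be combined with hand-picked edge cases, in line with the careful planning the slide calls for.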
Phases of testing in Big Data and Hadoop
Testing of Big Data and Hadoop is an enormous and complex process, which is divided into
four phases to get the best results from testing. These phases are as follows:
1. Pre-Hadoop Processing: It includes the validation of the data collated from various sources
before Hadoop processing. This is the phase where we get rid of unwanted data.
2. Processing of MapReduce Job: A MapReduce job in Hadoop is the code, typically Java, used to fetch
the data according to the preconditions provided. The MapReduce job is verified to
check the accuracy of the data fetched.
3. Data Extraction and Loading: This phase includes the validation of the data being loaded into and
extracted from HDFS (Hadoop Distributed File System) to ensure that no corrupt data resides in
HDFS.
4. Report Validation: This is the last phase of testing, which ensures that the output we deliver
meets the accuracy standards and that there is no redundant data in the reports.
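To make phase 2 more concrete, here is a toy word count written in plain Python that imitates the map, shuffle and reduce steps and then checks the result against a known expected output. It does not use the Hadoop API; the function names and the input lines are assumptions made for this sketch.

```python
from collections import defaultdict

def map_phase(lines):
    """Emit (word, 1) pairs, as a word-count mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def shuffle(pairs):
    """Group values by key, standing in for Hadoop's shuffle/sort step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

# Hypothetical input, small enough to verify by hand.
lines = ["big data is big", "data is everywhere"]
result = reduce_phase(shuffle(map_phase(lines)))

# Verification step: compare the job output with a hand-computed expected result.
expected = {"big": 2, "data": 2, "is": 2, "everywhere": 1}
assert result == expected
print(result)
```

The same idea carries over to a real job: run it on a small input whose correct output is known, then compare the two.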
Become a Big Data expert with Hortonworks certification only at SpringPeople!
Get updates on upcoming classes and webinars at www.springpeople.com

ALSO dropshipping via API with DroFx.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 

Introduction to Big Data

  • 1.
  • 2. What is Big Data?
      • Huge amounts of data (terabytes or petabytes)
      • Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications
      • The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization
      • "Let's solve this problem by using the big data none of us have the slightest idea what to do with"
  • 3. What is Big Data? (Cont'd)
      • Every day we create roughly 2.5 quintillion bytes of data; 90% of the world's collected data has been generated in the last 2 years alone
      • Data sizes now run to terabytes, petabytes, exabytes and zettabytes
      • Data comes in various types: user content (posts, tweets), images, audio and video files, chat logs, machine-generated data
      • The speed of data generation has increased: every day we generate 2,500,000,000,000,000,000 (2.5 quintillion) bytes of data
      • The data sources have multiplied many times over: social media, banking, government, e-commerce, etc.
  • 4. Purpose of Big Data or Why Big Data?
      • In 2015, we created more data than in all previous years combined, but 90% of that data is not sorted.
      • Big Data gives us a better and different picture of the data collected, but the problem is not solved yet: we still need to find better ways of using it.
      • For all of this, you need the best data scientists you can possibly get your hands on.
      • Companies love data because it provides very lucrative insights to businesses and their clients. Data today is the most important commodity. If companies can harness and analyse data, it provides an unmatchable competitive advantage
      • New-age tools like Hadoop have made data handling and processing easier and cheaper
      • Cloud computing, hardware and memory are getting cheaper, so data storage is not a problem
      • New analytical software provides real-time analysis of data for business decisions
  • 5. Big Data Market Trend
      • Market research firm IDC forecasts a 50% increase in revenues from the sale of big data and business analytics software, hardware, and services between 2015 and 2019.
      • It says services will account for the biggest chunk of revenue, with banking and manufacturing-led industries poised to spend the most.
      • IDC said it expects revenue generated by the US market for big data and business analytics solutions to exceed $98 billion by 2019.
      • Edge analytics is the next big thing in Big Data technology for companies that want to gain real-time insights and impact business through IoT use cases. It will allow companies to derive the most actionable value from their data.
      • The key trend in Big Data is the transition of analytics solutions into the cloud. The cloud enables vast amounts of computing resources to be applied to data analysis and lets that computing scale based on need.
  • 6. Big Data Learning Challenges
      • Upgrades: A new version of the technology is introduced every 3 to 6 months. By the time you have actually learned something, a new technology may have emerged, making your knowledge outdated.
      • Data Extraction: The challenge is to extract the most important information out of the massive volume of data.
      • Lack of Talent: A successful Big Data project requires a sophisticated team of developers, data scientists and analysts with sufficient knowledge of Big Data, but people with these skills are in short supply.
      • Data Quality: Dirty data costs the United States $600 billion every year. The common causes of dirty data are user input errors, data duplication and incorrect data linking.
  • 7. Metrics of Data Size
      To date, we've only been familiar with data in GB. The world is changing now.
      Data Metrics Hierarchy:
      1024 bytes     = 1 KB (kilobyte)
      1024 kilobytes = 1 MB (megabyte)
      1024 megabytes = 1 GB (gigabyte)
      1024 gigabytes = 1 TB (terabyte)
      1024 terabytes = 1 PB (petabyte)
      1024 petabytes = 1 EB (exabyte)
      1024 exabytes  = 1 ZB (zettabyte)
  • 8. Why Has Data Generation Increased?
      The advent of the internet is the primary reason for the data explosion that has taken place over the last 15 years. Every minute:
      • Facebook: more than 2.5 million pieces of content
      • Twitter: 300 thousand tweets
      • Instagram: 250,000 new pictures
      • YouTube: 75 hours of new video content
      • Email: 400 million messages
      • WhatsApp: 400,000 pictures
      • Google: 5 million search requests
  • 9. Other Sources of Data Generation
      • E-commerce: more than 1.3 billion transactions per day
      • Data logging devices: wearables (e.g. FitBit), healthcare monitoring, GPS systems, data sensors
      • Financial data: stock exchanges, banking transactions
      • Aviation industry: a typical flight generates half a TB of data
      • Governments: citizen data, tax records, etc.
  • 10. Companies Want More Data!
      • Companies love data because it provides very lucrative insights to businesses and their clients. Data today is the most important commodity. If companies can harness and analyze data, it provides an unmatchable competitive advantage
      • New-age tools like Hadoop have made data handling and processing easier and cheaper
      • Cloud computing, hardware and memory are getting cheaper, so data storage is not a problem
      • New analytical software provides real-time analysis of data for business decisions
  • 11. Characteristics of Big Data • Big Data is characterized by 4 V's: Volume, Velocity, Variety and Veracity
  • 12. Characteristics of Big Data: Volume. Volume refers to the enormous volumes of data being generated and stored. [Figure: data warehouses]
  • 13. Characteristics of Big Data: Velocity. Velocity refers to the pace at which data is being generated, processed and consumed
  • 14. Characteristics of Big Data: Variety. Data can be gathered from infinite sources
  • 15. Characteristics of Big Data: Veracity. The quality and authenticity of the data being captured can vary greatly
  • 16. Scope: Testing Aspects in Big Data
      • Validation of Structured and Unstructured Data: Data needs to be classified into its structured and unstructured parts.
        (i) Structured Data: data that can be stored in tables (rows and columns) without any processing, for example databases, call details and Excel sheets.
        (ii) Unstructured Data: data that does not have a predefined data model or structure, for example weblogs, audio, tweets and comments.
        Adequate time needs to be spent on validating the data at the initial stage, because this is the point where we encounter an abundance of bad data from various sources.
      • Execution of Non-Functional Testing: Non-functional testing plays a vital role in ensuring the scalability of the process. Functional testing focuses on coding and requirement-related issues, whereas non-functional testing identifies performance bottlenecks and validates the non-functional requirements.
      • Handling Non-Relational Databases: Non-relational databases form the backbone of Big Data storage. Since they are the main sources of data retrieval, they require a good portion of testing to maintain the accuracy of the system. Commonly known as NoSQL databases, they are designed to handle Big Data easily and differ from traditional RDBMSs, which are designed around a table/key model.
  • 17. • Ace Test Environment: An efficient test environment ensures that data from multiple sources is of acceptable quality for accurate analysis. Because replicating the complete set of big data in the test environment is next to impossible, a small subset of the data is created for the test environment to verify the behavior. Careful planning is required to exercise all paths with subsets of data in a manner that fully verifies the application.
  • 18. Phases of Testing in Big Data and Hadoop
      Testing Big Data and Hadoop is an enormous and complex process, which is divided into four phases to get the best results from testing:
      1. Pre-Hadoop Processing: Validation of the data collated from various sources before Hadoop processing. This is the phase where we get rid of unwanted data.
      2. Processing of the MapReduce Job: A MapReduce job in Hadoop is the Java code used to fetch the data according to the preconditions provided. The MapReduce job is verified to check the accuracy of the data fetched.
      3. Data Extraction and Loading: Validation of the data being loaded into and extracted from HDFS (Hadoop Distributed File System), to ensure that no corrupt data ends up in HDFS.
      4. Report Validation: The last phase of testing, which ensures that the output being delivered meets the accuracy standards and that there is no redundant data in the reports.
  • 19. Phases of testing in Big data and Hadoop
  • 20. Become a Big Data expert with Hortonworks certification only at SpringPeople! Get updates on upcoming classes and webinars at www.springpeople.com
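
The testing phases on slide 18 can be illustrated with a small, single-machine sketch in Python. This is not Hadoop's Java MapReduce API: the record layout, the sample data and the function names (`validate`, `map_phase`, `shuffle`, `reduce_phase`, `verify`) are all hypothetical, chosen only to show the shape of pre-processing validation, a map/shuffle/reduce job, and a check of the job's output against an independent count.

```python
from collections import Counter, defaultdict

# Hypothetical raw records: (user, event) pairs collated from several sources.
RAW_RECORDS = [
    ("alice", "click"),
    ("bob", "view"),
    ("alice", "click"),    # exact duplicate
    ("", "click"),         # missing user (input error)
    ("carol", "purchase"),
    ("bob", "click"),
    ("dave", None),        # missing event (input error)
]


def validate(records):
    """Phase 1: drop incomplete records and exact duplicates, keeping order."""
    seen = set()
    clean = []
    for user, event in records:
        if not user or not event:
            continue  # unwanted data: a required field is missing
        if (user, event) in seen:
            continue  # unwanted data: duplicate record
        seen.add((user, event))
        clean.append((user, event))
    return clean


def map_phase(records):
    """Map step: emit an (event, 1) pair for every record."""
    return [(event, 1) for _user, event in records]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def run_job(records):
    """Run map, shuffle and reduce in sequence on one machine."""
    return reduce_phase(shuffle(map_phase(records)))


def verify(records, result):
    """Phases 2 and 4: compare the job output with an independent count."""
    expected = Counter(event for _user, event in records)
    return result == dict(expected)


clean = validate(RAW_RECORDS)
counts = run_job(clean)
print(counts)                 # {'click': 2, 'view': 1, 'purchase': 1}
print(verify(clean, counts))  # True
```

In a real cluster the map and reduce steps run in parallel across nodes and the data lives in HDFS, so the independent count in `verify` would normally be computed on a small test subset, as slide 17 suggests.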

Editor's Notes

  1. Do you guys know what a tsunami is? Well, that is how Big Data hit the technology world over the last 8 years. In the last 2 years alone, we've generated 90% of the world's available data, and this is just the beginning. It is not a technology. It is not a tool. Big Data is extremely large volumes of data, mostly unstructured in nature, which cannot be stored, processed or managed by traditional RDBMS tools. To date, we've been very familiar with gigabytes and megabytes. However, Big Data runs to terabytes, petabytes and beyond.
  2. Just to give you a comparison, here is a table showing the difference between each data metric. Today, the data being generated runs even into zettabytes. To give you a little context, a zettabyte is a one followed by 21 zeroes. Now comes a very interesting question: what led to this data explosion over the last 10 years?
  3. There are 3 triggers for this. 1. Access to data became easier after the advent of the internet; never before did the human race have access to such in-depth information. 2. Big Data provided extremely lucrative insights to business organizations. 3. Due to this cycle of access, insight and results, investments were made to ensure that as much data as possible is collected. This led to the explosion of Big Data. Let's see a few examples of this in actual business.
  6. So far, we've established that Big Data is large, complex and diverse. Those qualities are broken down into the following 4 characteristics: Volume, Velocity, Variety and Veracity. Let's break down each one of them.
  7. As the name suggests, this characteristic refers to the sheer volume of Big Data generated in the world today from a multitude of sources. For example, an airplane collects 10 terabytes of sensor data for every 30 minutes of flying time. Today, data generation is reaching the scale of zettabytes. This is absolutely unprecedented.
  8. Next is Velocity. This refers to the speed at which data is generated, managed and processed. Big Data is characterized by constantly increasing velocity. This is because an increase in data generation has led to the creation of advanced distributed processing networks and even real-time analytics tools. Due to this, the velocity of Big Data activities is constantly increasing. Next.
  9. Big Data can be structured, semi-structured or unstructured, and can come from infinite sources. In addition, it can come from mediums such as geospatial, 3D, audio, video, log files, network algorithms and social media. Due to this, variety is a very crucial characteristic of Big Data.
  10. This refers to the biases, noise and abnormality in Big Data. Due to this, extreme caution has to be applied in the data-collection process. Companies spend billions of dollars either on penalties for false data collection or on data cleansing. These are the 4 characteristics of Big Data. Now let's move on to the types of Big Data.
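
The data-size hierarchy from slide 7 and note 2 can be checked with a few lines of Python. This is a minimal sketch that follows the slide's convention of 1024 per step (binary units), whereas the "one followed by 21 zeroes" in note 2 is the decimal definition of a zettabyte. The names `UNITS`, `human_readable` and `to_bytes` are hypothetical helpers, not part of any library.

```python
# Unit labels in ascending order; each step is a factor of 1024, as on slide 7.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def human_readable(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1024."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


def to_bytes(amount, unit):
    """Convert an amount in the given unit back to bytes."""
    return amount * 1024 ** UNITS.index(unit)


# The "2.5 quintillion bytes per day" figure quoted on slide 3.
daily_bytes = 2_500_000_000_000_000_000
print(human_readable(daily_bytes))  # 2.17 EB
print(to_bytes(1, "ZB"))            # 1180591620717411303424
```

Under the 1024-based convention, 2.5 quintillion bytes a day is a little over 2 exabytes, and one zettabyte is about 1.18 × 10^21 bytes rather than exactly 10^21.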