Big Data Mining
Overview
 Introduction
 Characteristics of Big Data
 Big Data and its challenges
 Big Data mining Tools
 Big Data mining algorithm
 Applications of Big Data
 References
 Q&A
Introduction
Interesting Facts
 The volume of business data worldwide, across all companies, doubles every 1.2 years (previously every 1.5 years).
 About 2,500 quadrillion (2.5 quintillion) bytes of data are produced daily, and more than 90 percent of all data was produced within the past two years.
 An ordinary person today processes more data in a day than a 16th-century individual did in an entire lifetime.
 In recent years, the cost of storage and processing power has dropped significantly.
 Bad data or poor data quality costs US businesses $600 billion annually.
 By 2015, 4.4 million IT jobs will be created globally to support big data (Gartner).
 Facebook processes 10 TB of data every day; Twitter processes 7 TB.
 Google has over 3 million servers and processed over 2 trillion searches per year in 2012 (only 22 million in 2000).
What is Big Data?
The term Big data is used to describe a massive
volume of both structured and unstructured data
that is so large that it's difficult to process using
traditional database and software techniques.
-Webopedia
Characteristics of Big Data
 Volume - the quantity of data
 Variety - the range of types and categories of data
 Velocity - the speed at which data is generated or processed
 Variability - inconsistency in the data
 Complexity - the difficulty of managing the data
DATA MINING CHALLENGES WITH BIG DATA
 The main challenge for an intelligent database is handling Big Data. The
important thing is to scale to the large amount of data; the HACE theorem
offers a framework for addressing these problems.
 Challenges
Location of Big Data sources - Big Data is commonly stored in different locations.
Volume of Big Data - the size of the data grows continuously.
Hardware resources - for example, limited RAM capacity.
Privacy - for example, medical reports and bank transactions.
Having domain knowledge
Getting meaningful information
 Solutions
Parallel computing programming
An efficient computing platform does not rely on centralized data
storage; instead, the data is distributed across large-scale storage.
Restricting access to the data
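The first two solutions above (parallel programming over data that is not held in one central store) can be sketched in a few lines of Python. This is only an illustration: the partitions are in-memory lists standing in for separately stored chunks of data, threads stand in for cluster nodes, and the names `count_large` and `parallel_count` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def count_large(partition: List[int], threshold: int) -> int:
    """Work done locally on one partition: count values above a threshold."""
    return sum(1 for value in partition if value > threshold)


def parallel_count(partitions: List[List[int]], threshold: int) -> int:
    """Process every partition in parallel and combine the small partial results."""
    # One worker per partition; max(1, ...) avoids an invalid pool size of zero.
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        partial_counts = pool.map(lambda p: count_large(p, threshold), partitions)
        return sum(partial_counts)


if __name__ == "__main__":
    # Three partitions, imagined as stored in three different locations.
    data = [[1, 5, 9], [12, 3], [7, 8]]
    print(parallel_count(data, threshold=6))
```

Only the per-partition counts travel to the combining step, not the raw records, which is the point of keeping computation next to the distributed storage.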
Big Data Mining Tools
 Hadoop
 Apache S4
 Storm
 Apache Mahout
 MOA
Hadoop
 Hadoop is an open-source software platform for scalable, distributed
computing, developed as an Apache Software Foundation project.
 The Apache Hadoop software library is a framework that allows for the distributed
processing of large data sets across clusters of computers using simple
programming models.
 Hadoop provides fast and reliable analysis of both structured and
unstructured data.
 It is designed to scale up from single servers to thousands of machines, each
offering local computation and storage.
 Hadoop uses the MapReduce programming model to mine data.
 A MapReduce job splits the input dataset into independent subsets, which
are processed in parallel by map tasks.
 Map() is a procedure that performs filtering and sorting.
 Reduce() is a procedure that performs a summary operation.
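To make the Map() and Reduce() roles concrete, here is a minimal word-count sketch of the MapReduce model in plain Python. It does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and everything runs in one process, whereas a real cluster would run the map tasks in parallel on separate input splits.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(split: str) -> Iterator[Tuple[str, int]]:
    """Map(): emit a (word, 1) pair for every word in one input split."""
    for word in split.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group intermediate values by key, as the framework does between the phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce(): summarize all values collected for one key (here, a sum)."""
    return key, sum(values)


def word_count(splits: List[str]) -> Dict[str, int]:
    """Run map over each independent split, then shuffle, then reduce per key."""
    intermediate = [pair for split in splits for pair in map_phase(split)]
    grouped = shuffle(intermediate)
    return dict(reduce_phase(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    print(word_count(["big data", "big mining"]))
```

Each split is handled independently by `map_phase`, which is what allows the map tasks to be distributed across machines.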
Big Data Mining Algorithm
 Big Data applications gather information from many sources.
 To mine the data in the conventional way, all distributed data would have to be
gathered at a centralized site. This is prohibitive because of high data
transmission costs and privacy concerns.
 Model mining and correlation analysis are therefore the key steps: they allow
patterns discovered from a variety of sources to be combined.
 Global data mining is done through a two-step process (local mining, then
global correlation), described at three levels:
 Data level
 Model level
 Knowledge level
 At the data level, each local site calculates statistics from its local data
and shares this information with the other sites in order to obtain the global
data distribution.
 At the model level, each site mines its local data to produce local
patterns.
 By sharing these local patterns with the other local sites, a single
global pattern can be produced.
 At the knowledge level, model correlation analysis investigates the relevance
between models generated from various data sources to determine how
closely the data sources are correlated with each other, and how to form
accurate decisions based on models built from autonomous sources.
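The data-level and model-level steps can be sketched as follows. This is an illustration under stated assumptions, not the algorithm from the cited paper: the "statistics" are a count, sum, and sum of squares; the "local pattern" is a table of item counts; and all names (`LocalStats`, `local_stats`, `global_distribution`, `local_pattern`, `global_pattern`) are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class LocalStats:
    """Summary a site shares instead of its raw records."""
    count: int
    total: float
    total_sq: float


def local_stats(values: List[float]) -> LocalStats:
    """Data level, local step: summarize one site's data."""
    return LocalStats(len(values), sum(values), sum(v * v for v in values))


def global_distribution(stats: List[LocalStats]) -> Tuple[float, float]:
    """Data level, global step: mean and population variance from shared summaries."""
    n = sum(s.count for s in stats)
    mean = sum(s.total for s in stats) / n
    # Sum-of-squares form: simple, but can lose precision for very large values.
    variance = sum(s.total_sq for s in stats) / n - mean ** 2
    return mean, variance


def local_pattern(transactions: List[Set[str]]) -> Tuple[Counter, int]:
    """Model level, local step: item counts plus the number of transactions."""
    counts = Counter(item for transaction in transactions for item in transaction)
    return counts, len(transactions)


def global_pattern(patterns: List[Tuple[Counter, int]], min_support: float) -> Dict[str, float]:
    """Model level, global step: items whose combined support meets the threshold."""
    total_counts: Counter = Counter()
    total_transactions = 0
    for counts, n in patterns:
        total_counts.update(counts)
        total_transactions += n
    return {
        item: count / total_transactions
        for item, count in total_counts.items()
        if count / total_transactions >= min_support
    }


if __name__ == "__main__":
    sites = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
    print(global_distribution([local_stats(s) for s in sites]))
    site_a = [{"milk", "bread"}, {"milk"}]
    site_b = [{"milk", "eggs"}, {"bread"}]
    print(global_pattern([local_pattern(site_a), local_pattern(site_b)], 0.5))
```

In both steps only compact summaries leave a site, which addresses the transmission-cost and privacy concerns raised above.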
Applications of Big Data
 Healthcare organizations can achieve better insight into disease trends and
patient treatments.
 Public sector agencies can catch fraud and other threats in real-time.
 Applications of multimedia data:
 Finding the travel patterns of travelers
 CCTV camera footage
 Photos and videos from social networks
 Recommender systems
 Integration and mining of biodata from various sources in biological networks
(NSF, National Science Foundation).
 Classifying Big Data streams at run time (Australian Research Council).
References
[1] IEEE, Data Mining with Big Data, January 2014
[2] McKinsey Global Institute, Big Data: The next frontier for innovation, competition and
productivity, May 2011
[3] Xindong Wu, Xingquan Zhu, Gong-Qing Wu, Wei Ding, 2013, Data Mining with Big Data
[4] Rezwan Ahmed, George Karypis, 2012, Algorithms for mining the evolution
of conserved relational states in dynamic network
[5] Wu X., 2000, Building Intelligent Learning Database Systems, AI Magazine
[6] Oracle, June 2013, Unstructured Data Management with Oracle Database 12c
[7] Valery A. Petrushin, Jia-Yu Pan, Cees G. M. Snoek, 2010, Tenth International Workshop on
Multimedia Data Mining
[8] Big data [Online]. Available: www.en.wikipedia.org/wiki/Big_data
[9] Big data [Online]. Available: www.webopedia.com/TERM/B/big_data.html
[10] IBM big data and information management [Online]. Available: www-01.ibm.com/software/data/bigdata
[11] Big data [Online]. Available: www.adainbigdata.com
[12] Big Data Explained [Online]. Available: www.mongodb.com/big-data-explained
[13] Big data [Online]. Available: www.sas.com/en_us/insights/big-data/what-is-big-data.html
[14] Big Data Mining Tools [Online]. Available: www.albertbifet.com/big-data-mining-tools
Cloud storage for Big Data
Processing


Editor's Notes

  1. In 2012, the presidential election debate between Obama and Mitt Romney triggered about 10 million tweets within 2 hours. Flickr, the well-known website for posting images, faced a similar problem: it receives 1.8 million photographs every day, each about 2 MB in size, so it needs approximately 3.6 TB of storage capacity per day. These situations show the reason for the rise of Big Data applications.
  2. Sources: social networks, satellite data, geographical data, live streaming data
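
The storage estimate in note 1 can be checked with a short calculation. The sketch below assumes decimal units (1 TB = 1,000,000 MB) and a uniform photo size of 2 MB; the function name `daily_storage_tb` is hypothetical.

```python
MB_PER_TB = 1_000_000  # decimal units: 1 TB = 1,000,000 MB


def daily_storage_tb(photos_per_day: int, mb_per_photo: float) -> float:
    """Storage needed per day, in terabytes, for uploads of a uniform size."""
    return photos_per_day * mb_per_photo / MB_PER_TB


if __name__ == "__main__":
    # 1.8 million photos per day at 2 MB each, as stated in note 1.
    print(daily_storage_tb(1_800_000, 2))
```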