Data Mining
With Big Data
Presented By:
Dinesh chandra yenduri
Rg.no : y15mc24095
Abstract
Big Data concern large-volume, complex,
growing data sets with multiple, autonomous
sources. With the fast development of
networking, data storage, and the data collection
capacity, Big Data are now rapidly expanding in
all science and engineering domains, including
physical, biological, and biomedical sciences.
2
Outline
• Introduction
• What is Data Mining With Big Data
• How To Produce The Big Data
• Big Data Characteristics
• The 4 Vs of Big Data
• Hadoop System Architecture
• Hadoop Framework
• Data Mining Challenges With Big Data
• Big Data Challenges and Solutions
• Advantages
• Conclusion
• References
3
Introduction
• The volume of business data worldwide, across all
companies, now doubles every 1.2 years (previously every 1.5 years)
• Every day, 2.5 quintillion bytes of data are produced, and more
than 90 percent of all data were produced within the past two
years
• Facebook processes 10 TB of data every day; Twitter processes 7
TB
• On 4 October 2012, the first presidential debate between
President Barack Obama and Governor Mitt Romney
triggered more than 10 million tweets within 2 hours
• Examples: Boeing jet data, scientific data, sensor data,
Internet data
4
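The doubling times quoted above can be turned into yearly growth rates with a little arithmetic. The sketch below assumes smooth exponential growth, which the slide does not state; the function name is illustrative only.

```python
def annual_growth_factor(doubling_time_years):
    """Factor by which volume grows in one year if it doubles every
    `doubling_time_years` years (assumes smooth exponential growth)."""
    return 2 ** (1 / doubling_time_years)

# Doubling every 1.2 years versus every 1.5 years.
current = annual_growth_factor(1.2)   # roughly 1.78, about 78% growth per year
previous = annual_growth_factor(1.5)  # roughly 1.59, about 59% growth per year
print(f"{current:.2f} {previous:.2f}")
```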
What is Data Mining With Big Data
5
How To Produce The Big Data
6
Big Data Characteristics
• Data has grown tremendously
• Big Data starts with large-volume, heterogeneous,
autonomous sources with distributed and
decentralized control
7
The 4 Vs of Big Data
Volume
• Data quantity
Velocity
• Data Speed
Variety
• Data Types
Veracity
• Data authenticity
8
How To Manage Big Data
• By using Hadoop
• It is an open-source framework
• It includes a distributed file system (HDFS)
9
Hadoop System Architecture
10
Hadoop Framework
11
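The architecture and framework figures are not reproduced in this transcript. As a rough illustration of the MapReduce programming model that Hadoop provides, here is a single-process word count in plain Python. It is only a sketch: the function names are made up for this example and are not Hadoop's API, and a real job would run the map and reduce steps on many nodes over data stored in the distributed file system.

```python
from collections import defaultdict
from itertools import chain

def map_phase(line):
    """Emit (word, 1) pairs for one input line, as a mapper would."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group values by key, mimicking the shuffle step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Sum the counts for one key, as a reducer would."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle, and reduce in sequence inside one process."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

counts = word_count(["big data mining", "big data platform"])
print(counts)
```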
Data Mining Challenges With Big Data
• Big Data Mining Platform
• Big Data Semantics and Application Knowledge
• Big Data Mining Algorithm
12
Big Data Mining Platform
• Data are typically too large to fit into
main memory
• Parallel programming is needed to carry out
the mining process
• A Big Data processing framework relies on
cluster computers, with a high-performance
computing platform spread over a large number of
computing nodes
13
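The idea of splitting data that is too large for memory into pieces and mining the pieces in parallel can be sketched as follows. This is a minimal illustration, not the platform the slides describe: worker threads stand in for cluster nodes, and the names `read_in_chunks`, `count_items`, and `parallel_count` are invented for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

def read_in_chunks(records, chunk_size):
    """Yield lists of at most `chunk_size` records from any iterable."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk

def count_items(chunk):
    """Partial result for one chunk: item frequencies."""
    return Counter(chunk)

def parallel_count(records, chunk_size=3, workers=2):
    """Count items chunk by chunk and merge the partial counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Note: Executor.map submits every chunk up front; a production
        # version would bound the number of chunks in flight.
        for partial in pool.map(count_items, read_in_chunks(records, chunk_size)):
            total.update(partial)
    return total

totals = parallel_count(iter(["a", "b", "a", "c", "a", "b", "c"]))
print(totals)
```

Each worker only ever sees one chunk, and only the small partial counts are merged centrally.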
Big Data Mining Platform (Cont…)
• Big Data mining offers opportunities to go
beyond traditional relational databases to rely
on less structured data: weblogs, social media,
e-mail, sensors, and photographs that can be
mined for useful information
14
Big Data Semantics and Application
Knowledge
The two most important issues in this area:
1) Data sharing and privacy
2) Domain and application knowledge
15
Data sharing and privacy
• Information sharing is an ultimate goal for all
systems involving multiple parties
• There are two common approaches:
1) Restrict access to the data, for example by adding
certification or access control to the data
entries, so that sensitive information is accessible
to a limited group of users only
2) Anonymize data fields so that sensitive
information cannot be pinpointed to an
individual record
16
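The second approach, anonymizing data fields, might look like the sketch below. It is an assumption-laden example rather than a vetted privacy scheme: the record layout, the field names, and `SECRET_KEY` are hypothetical, and keyed hashing plus age bucketing are just two common masking techniques chosen for illustration.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical key held by the data owner

def pseudonymize(value, key=SECRET_KEY):
    """Replace an identifier with a keyed hash: records stay linkable,
    but the original value cannot be read from the output."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]

def generalize_age(age, width=10):
    """Coarsen an exact age into a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def anonymize(record):
    """Return a copy of the record with direct identifiers masked or coarsened."""
    return {
        "patient": pseudonymize(record["name"]),
        "age_range": generalize_age(record["age"]),
        "diagnosis": record["diagnosis"],
    }

record = {"name": "Alice Example", "age": 34, "diagnosis": "flu"}
safe = anonymize(record)
print(safe)
```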
Domain and application knowledge
• Domain and application knowledge provides
essential information for designing Big Data
mining algorithms and systems
• The domain and application knowledge can also
help design achievable business objectives by
using Big Data analytical techniques
17
Big Data Mining Algorithm
I. Local Learning and Model Fusion for
Multiple Information Sources
II. Mining from Sparse, Uncertain, and
Incomplete Data
III. Mining Complex and Dynamic Data
18
Local Learning and Model Fusion for Multiple
Information Sources
Because Big Data applications feature
autonomous sources and decentralized control,
aggregating distributed data sources at a
centralized site for mining is systematically
prohibitive because of potential transmission
costs and privacy concerns
19
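One way to picture local learning with model fusion: each site fits a model on its own data and sends only the model parameters, never the raw records, to a coordinator that combines them. The sketch below uses one-variable least squares and a sample-size-weighted average of parameters. Both choices are assumptions made for this example, not the fusion method of the slides.

```python
def fit_local(xs, ys):
    """Ordinary least squares for y = slope * x + intercept on one site's data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x, "n": n}

def fuse(models):
    """Combine local models by a sample-size-weighted average of their parameters."""
    total = sum(m["n"] for m in models)
    return {
        "slope": sum(m["slope"] * m["n"] for m in models) / total,
        "intercept": sum(m["intercept"] * m["n"] for m in models) / total,
    }

# Two sites observe the same relation y = 2x + 1 on different data.
site_a = fit_local([0, 1, 2, 3], [1, 3, 5, 7])
site_b = fit_local([10, 11, 12], [21, 23, 25])
global_model = fuse([site_a, site_b])
print(global_model)
```

Only three numbers per site cross the network, which addresses both the transmission cost and the privacy concern.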
Mining from Sparse, Uncertain, and
Incomplete Data
• Sparse, uncertain, and incomplete data are
defining features of Big Data applications
20
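A simple way to handle incomplete records is to fill each missing value with the mean of the values that are present in that column. The sketch below shows this mean imputation on hypothetical sensor readings. It is one basic technique among many and is not taken from the slides.

```python
def column_means(rows):
    """Mean of each column, computed only over the values that are present."""
    sums, counts = {}, {}
    for row in rows:
        for key, value in row.items():
            if value is not None:
                sums[key] = sums.get(key, 0.0) + value
                counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}

def impute(rows, columns):
    """Fill absent or None entries with the column mean.
    Assumes every column has at least one observed value."""
    means = column_means(rows)
    return [
        {c: row[c] if row.get(c) is not None else means[c] for c in columns}
        for row in rows
    ]

# Hypothetical readings: one explicit None and one missing key.
readings = [
    {"temp": 20.0, "humidity": 0.5},
    {"temp": None, "humidity": 0.7},
    {"temp": 24.0},
]
filled = impute(readings, ["temp", "humidity"])
print(filled)
```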
Mining Complex and Dynamic Data
• The rise of Big Data is driven by the rapid
increase of complex data and by changes in their
volume and nature
• Documents posted on WWW servers, Internet
backbones, social networks, communication
networks, and transportation networks
all feature dynamic data
21
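When data keep arriving and changing, statistics have to be updated incrementally instead of being recomputed from scratch. As a small illustration, the sketch below uses Welford's online algorithm to maintain a running mean and variance. The class name and sample values are chosen for this example only.

```python
class RunningStats:
    """Welford's online algorithm: update mean and variance one value at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new observation into the running statistics."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance of the values seen so far (0.0 for fewer than two)."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0

stats = RunningStats()
for value in [2.0, 4.0, 4.0, 6.0]:
    stats.update(value)
print(stats.mean, stats.variance)
```

Each update costs constant time and memory, so the stream never needs to be stored.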
Big Data Challenges and Solutions
• Location of Big Data sources: Big Data are
commonly stored in different locations
• Volume of Big Data: the size of Big Data
grows continuously
• Hardware resources: RAM capacity
• Privacy
• Domain knowledge
• Getting meaningful information
23
Solutions
• Parallel programming
• An efficient computing platform does not
rely on centralized data storage;
instead, the data are distributed
across large-scale storage
• Restricting access to the data
24
Advantages
• Fast response
• Extraction of useful information
• Prediction of required data from large amounts of
data
• Better results, delivered in the form of
visualizations
25
Conclusion
We regard Big Data as an emerging trend, and the need for
Big Data mining is arising in all science and
engineering domains. With Big Data
technologies, we will hopefully be able to provide
the most relevant and most accurate social sensing
feedback to better understand our society in
real time.
26
References
• R. Ahmed and G. Karypis, “Algorithms for Mining
the Evolution of Conserved Relational States in
Dynamic Networks,” Knowledge and Information
Systems, vol. 33, no. 3, pp. 603-630, Dec. 2012.
• M.H. Alam, J.W. Ha, and S.K. Lee, “Novel
Approaches to Crawling Important Pages Early,”
Knowledge and Information Systems, vol. 33, no. 3,
pp. 707-734, Dec. 2012.
• S. Aral and D. Walker, “Identifying Influential and
Susceptible Members of Social Networks,” Science,
vol. 337, pp. 337-341, 2012.
27