Mining Big Data: Current
State of work and Challenges
Group members:
Misbah Rashid
Mariam Rashid
About the Article
• The article was published in 2015 in the International Journal of
Advanced Networking and Applications (IJANA).
• It was written by Kaushika Pal and Dr. Jatinderkumar R. Saini.
Brief Overview
• Introduction to big data
• Big data mining
• Importance of big data mining
Introduction To Big Data
• Huge amounts of data are generated and collected from various sources such as
sensors and devices, in different formats and from connected or independent
applications.
• This data has to be processed, investigated, stored and understood. Considering
internet data, the web pages indexed by Google numbered one million in 1998, one
billion in 2000 and one trillion in 2008.
• Examples come from social media: Facebook, Twitter, Google Plus, YouTube,
LinkedIn.
• Each of these sites receives a huge volume of data on a daily basis.
• Smartphones are now highly connected to the internet and use and store data on
the web, increasing web volume. Twitter processes around 400 million tweets
each day.
• Smartphones are the real producers of big data, and it is up to us how we
utilize that data to change our lives.
• Data created via smartphones can be put to good use. Smartphone
usage patterns helped researchers in Africa determine where malaria
outbreaks were occurring and where the affected people went [10].
This information can be used to determine where best to distribute
medicines. This is the power of big data analysis, which has a
positive impact on humanity.
Big Data Mining
• Big data mining refers to collective data mining.
• It covers extraction techniques performed on large volumes of data.
• We need new tools and new algorithms to deal with this huge amount of
data. When working with big data, seven V's have to be considered for big data
management:
• Volume: Every industry is flooded with data, which can be extremely
valuable if it can be used to retrieve important information.
• Variety: 90% of the data generated is amorphous, coming in all shapes and
forms. It is generated from geo-spatial sources, tweets, and photos and videos
uploaded to social networking sites, and can be analysed for content.
• Velocity: Velocity refers to the increasing speed at which this data is
created, and the increasing speed at which the data can be processed,
stored and analysed.
• Value: The potential value of big data is huge.
• Variability: Variability refers to data whose meaning is constantly changing.
Both the structure of the data and how users want to interpret that data
change over time.
• Veracity: Veracity refers to the noise and abnormality in data. When
scoping out a big data strategy, you need processes that keep your data clean
and keep ‘dirty data’ from accumulating in your systems.
• Visibility: Data from different sources should be visible to the technology
stack making up big data. Some crucial data is available but
not visible to that stack.
Literature Review
• Mining heterogeneous information networks is a new and promising
research frontier in big data mining. It treats interconnected data of many
different types, including relational database data, as
heterogeneous information networks.
• "Mining Big Data in Real Time" discusses the challenges in structured pattern
classification. Classification methods mostly deal with vector data. To
apply them to graph patterns, the graphs can be converted into vectors of
attributes. Each attribute indicates the presence or absence of a
sub pattern. An attribute is created for every frequent sub pattern, and the
number of such sub patterns can be very large.
• "Data Mining with Big Data" draws attention to the challenges of
mining big data at three levels: data, model, and system.
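The graph-to-vector conversion described in the second bullet above can be sketched as follows. This is a minimal illustration, not the method of the cited paper: it assumes each graph is given as a set of labelled edges and treats a single edge as a "sub pattern", whereas real frequent-subgraph mining considers larger structures. The names frequent_subpatterns, to_vectors and min_support are hypothetical.

```python
from collections import Counter

def frequent_subpatterns(graphs, min_support):
    """Return the sub patterns (here: single edges, an assumption)
    that occur in at least min_support graphs, in sorted order."""
    counts = Counter()
    for edges in graphs:
        counts.update(set(edges))  # count each edge once per graph
    return sorted(p for p, c in counts.items() if c >= min_support)

def to_vectors(graphs, patterns):
    """Encode each graph as a 0/1 vector: 1 if the pattern is present."""
    return [[1 if p in edges else 0 for p in patterns] for edges in graphs]

# Three toy graphs, each a set of (source, target) edges.
graphs = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("C", "D")},
    {("A", "B"), ("B", "C"), ("C", "D")},
]

patterns = frequent_subpatterns(graphs, min_support=2)
vectors = to_vectors(graphs, patterns)
```

Each row of vectors can then be passed to an ordinary vector-based classifier. The vector length equals the number of frequent sub patterns, which is why a very large number of sub patterns becomes a problem.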
Application Of Big Data Mining
• Business: expands customer intelligence, improves
operational efficiency, and enables customer personalization. To understand
customer requirements deeply, one needs strong personal connections
and, where possible, customized services, which will drive more
sales.
• Managing market demand: capturing external
market and retailer data in real time makes it possible to sense, evaluate, and
respond to demand indicators faster than ever before.
• Fraud detection: by analysing abnormal patterns across
various data sources, fraud can be detected in financial
transactions, health insurance, etc.
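As a rough illustration of the fraud detection bullet, the sketch below flags values that deviate strongly from the rest of a series. It assumes a simple z-score rule with a hypothetical threshold; the slides do not say which technique the article has in mind, and production fraud systems use far richer models and data.

```python
import statistics

def flag_outliers(amounts, threshold=2.0):
    """Return the indices of amounts whose z-score magnitude exceeds
    threshold. The z-score rule and the default threshold are
    assumptions made for illustration."""
    mean = statistics.fmean(amounts)
    spread = statistics.pstdev(amounts)  # population standard deviation
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [i for i, a in enumerate(amounts)
            if abs(a - mean) / spread > threshold]

# Toy transaction amounts; the last one is far from the others.
transactions = [20, 25, 22, 19, 24, 21, 500]
suspicious = flag_outliers(transactions)
```

A single extreme value inflates both the mean and the standard deviation, so on small samples a robust measure such as the median absolute deviation may be a better choice.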
Challenges
• Variety and Heterogeneity: Different sources generate big data, leading to great variety
or heterogeneity. Heterogeneous big data combines structured, semi-structured,
and even entirely unstructured data concurrently. The challenge is to unveil
or extract the hidden knowledge in such data sets.
• Scalability: The extraordinary volume requires high scalability of data management
and mining tools. However, most algorithms currently used in data mining do not scale
well when applied to very large data sets because they were initially developed
and tested on smaller data sets. We now have such large data sets that these algorithms
are no longer efficient enough for mining and analysis.
• Velocity/Speed: The ability to access and mine big data quickly is essential.
A mining task must be finished within a definite period of time; otherwise, the
results become less valuable or even worthless. The design of
new and more efficient indexing schemes is much desired, but remains one of the
greatest challenges for the research community.
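One common way to cope with the scalability and velocity challenges above is to use single-pass algorithms that never hold the whole data set in memory. The sketch below is an example of that idea, not something taken from the article: it uses Welford's online algorithm to maintain a running mean and variance over a stream. The class name RunningStats is hypothetical.

```python
class RunningStats:
    """Running mean and population variance over a stream, computed in
    one pass with constant memory (Welford's online algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / self.n if self.n else 0.0

# Values are consumed one at a time, as they would arrive from a stream.
stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(value)
```

Because each value is processed once and then discarded, the cost per item stays constant however large the stream grows.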
Challenges
• Privacy Crisis: Data privacy has always been an issue. The concern has become
extremely serious with big data mining, which often requires personal information in
order to produce relevant and accurate results, for example in location-based and
personalized services. Also, with huge volumes of big data such as social media containing
an incredible amount of highly interrelated personal information, every bit of information
can be mined. Every transaction in our daily lives is being pushed online
and leaves a trace there: we communicate with friends via email, instant messaging, blogs,
and Facebook; we shop and pay our bills online; credit card companies hold our
confidential identity information. As time goes on, your personal information will be
scattered here and there. Anyone could easily gain access to powerful
tools to extract your confidential information.
• Garbage Mining: As the volume of data increases day by day, the amount of
irrelevant and unnecessary data also increases. Garbage mining aims to extract this
hidden irrelevant data and clean it out of the important data. This is not easy, because
it is difficult to find hidden data in the bulk of the data and then remove it.
Garbage mining remains one of the greatest challenges.
Appreciation
• In this article, the authors fully explain the insights into the
mining of big data, including the main concerns and main challenges
for the future.
• The most positive aspect of this article is the clarity of its
research problem statement.
• The authors selected 14 relevant sources published between
2012 and 2014. Ten of these references were primary sources.
The authors did a reasonable job of highlighting previous research on
topics related to their own and even provided comparisons of the
literature where possible.
Critique
• The statement of the problem is implied in the abstract of
the article, but the specific problem is not addressed until after the
authors have described the usefulness of mining big data later in the
article.
• The authors have not clearly explained the applications of mining big
data in medicine, healthcare and engineering.
• The authors discuss big data mainly in terms of mobile phones. The
scope of big data is far broader than what they discuss.
Future work
• Techniques will be developed to overcome the challenges faced
in mining big data.
• Social media and big data can be used to understand public opinion
trends.
Thank You
Big data Mining

  • 1. Mining Big Data: Current State of work and Challenges Group members: Misbah Rashid Mariam Rashid
  • 2. About Journal • The journal is published in the year 2015 in (IJANA) International Journal of Advanced Networking and Applications • The journal was published by Kaushika Pal and Dr. Jatinderkumar R. Saini.
  • 3. Brief Overview • Introduction to big data • Big Data Mining • Big data mining importance
  • 4. Introduction To Big Data • Huge amount of data are generated and collected from various sources like sensors, devices etc. all are in different formats from connected or independent application. • This data has to be processed, investigated, stored and understood. Considering internet data the web pages indexed by Google were One million in 1998, One billion in 2000 and one trillion in 2008. • Examples are from social media- Facebook, Twitter, GooglePlus, YouTube, LinkedIn. • Each of these site receives huge volume of data on a daily basis. • Smartphones are now highly connected to internet and use and store data on web and thus increasing web volume.Twitter process around 400 millions tweets each day. • Smartphones are the real producer of big data, and it is up to us how we can utilize that data to change our lives.
  • 5. • Data created via smartphones can be put to good use. Smartphone usage patterns helped researchers in Africa determine where malaria outbreaks were occurring and where the affected people went [10]. This information can be used to determine where to best distribute medicines more efficiently. This is the power of big data analysis which has a positive impact on humanity.
  • 6. Big Data Mining • Big data mining is referred to the collective data miming. • Extraction techniques that are performed on large volume of data. • We need new tools and new algorithm to deal with all this huge amount of data. While working with Big Data 7 V’s have to be considered for Big Data Management • Volume:every industry is flooded with data, which can be extremely valuable, if it can be used to retrieve important information. • Variety:90% of data generated is amorphous coming in all shapes and forms-the data is generated from geo-spatial, tweets, photos and videos uploading on social networking sites, which can be analysed for content
  • 7. • Velocity:Velocity’ refers to the increasing speed at which this data is created, and the increasing speed at which the data can be processed, stored and analysed. • Value: The probable value of Big Data is huge. • Variability: Variability refers to data whose meaning is constantly changing. There are changes in the structure of data and how users want to interpret that data. • Veracity: Big Data Veracity refers to the noise and abnormality in data. In scoping out your big data strategy you need to help keep your data clean and processes to keep ‘dirty data’ from accumulating in your systems. • Visibility: Data from different sources should be visible to the technology stack making up Big Data.Certain data which are crucial are available but not visible to Big Data.
  • 8. Literature Review • Mining heterogeneous information networks is a new and promising research frontier in Big Data mining. It treats interconnected data of various types, including relational database data, as heterogeneous information networks. • Mining Big Data in Real Time discusses the challenges in structured pattern classification. Classification methods mostly deal with vector data. To apply them to graph pattern classification, graphs can be converted into vectors of attributes. Each attribute indicates the presence or absence of a sub-pattern. An attribute is created for every frequent sub-pattern, and the number of such sub-patterns can be very large. • Data Mining with Big Data drew attention to the challenges of mining big data at three levels: data, model, and system.
  • 9. Application Of Big Data Mining • Business: expands customer intelligence and improves operational efficiency and customer personalization. To understand customer requirements in depth, one needs strong personal connections and, where possible, customized services, which will drive more sales. • Managing demand in the market: by capturing external market and retailer data in real time, a business can sense, evaluate, and respond to demand indicators faster than ever before. • Fraud detection: by analysing abnormal patterns across various data sources, fraud can be detected in financial transactions, health insurance, etc.
  • 10. Challenges • Variety and Heterogeneity: Big Data is generated by different sources, which leads to great variety or heterogeneity. Heterogeneity in big data means dealing with structured, semi-structured, and even entirely unstructured data concurrently. The challenge is to unveil or extract the hidden knowledge in such data sets. • Scalability: The extraordinary volume requires highly scalable data management and mining tools. However, most algorithms currently used in data mining do not scale well to very large data sets, because they were initially developed and tested on smaller ones. Data sets are now so large that these algorithms are no longer efficient enough for mining and analysis. • Velocity/Speed: The capability to access and mine big data quickly is essential. A mining task must be finished within a definite period of time; otherwise the results become less valuable or even worthless. The design of new and more efficient indexing schemes is therefore much desired, but remains one of the greatest challenges for the research community.
  • 11. Challenges • Privacy Crisis: Data privacy has always been an issue. The concern has become extremely serious with big data mining, which often requires personal information in order to produce relevant and accurate results, as in location-based and personalized services. Also, with huge volumes of big data such as social media, which contain an incredible amount of highly interrelated personal information, every bit of information can be mined out. Every transaction in our daily lives is being pushed online and leaves a trace there: we communicate with friends via email, instant messaging, blogs, and Facebook; we shop and pay our bills online; credit card companies hold our confidential identity information. As time goes on, your personal information becomes scattered across many places, and anyone could easily gain access to powerful tools for extracting your confidential information. • Garbage Mining: As the volume of data increases day by day, the amount of irrelevant and unnecessary data also increases. Garbage mining means extracting this hidden irrelevant data and separating it from the important data. This is not easy, because it is difficult to find the hidden data in the bulk of the data and then clean it out. Garbage mining remains one of the greatest challenges.
  • 12. Appreciation • In this journal article, the author has fully explained the insights into mining big data, including the main concerns and main challenges for the future. • The most positive aspect of this article is the clarity of its statement of the research problem. • The author selected 14 relevant sources published between 2012 and 2014. Ten of these references were primary sources. The author did a reasonable job of highlighting the previous research on topics related to their own and even provided comparisons of the literature where possible.
  • 13. Critique • The statement of the problem is implied in the abstract of the article, but the specific problem is not addressed until after the author has described the usefulness of mining big data later in the article. • The author has not clearly explained the applications of mining big data in medicine, healthcare and engineering. • The author has discussed big data in terms of mobile phones. The scope of big data is far broader than what the author has discussed.
  • 14. Future work • Techniques will be developed to overcome the challenges faced in mining big data. • Social media and Big Data can be used to understand public opinion trends.
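
The Literature Review slide describes converting graph patterns into attribute vectors, where each attribute marks the presence or absence of a frequent sub-pattern. The following is a minimal sketch of that idea, not the method of the reviewed paper. It assumes each graph is given as a set of edges and treats single edges as the only sub-patterns, whereas real frequent-subgraph mining considers larger structures. The function names and the `min_support` parameter are hypothetical.

```python
from collections import Counter


def frequent_subpatterns(graphs, min_support):
    """Return the sub-patterns found in at least `min_support` graphs.

    Assumption: a sub-pattern is a single edge, written as a
    (source, target) tuple. Each edge is counted once per graph.
    """
    counts = Counter()
    for edges in graphs:
        counts.update(set(edges))
    return sorted(p for p, c in counts.items() if c >= min_support)


def to_binary_vectors(graphs, patterns):
    """Encode each graph as a 0/1 vector: 1 if the sub-pattern is present."""
    return [[1 if p in edges else 0 for p in patterns] for edges in graphs]


# Three toy graphs, each represented as a set of edges.
graphs = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("C", "D")},
    {("A", "B"), ("B", "C"), ("C", "D")},
]

patterns = frequent_subpatterns(graphs, min_support=2)
vectors = to_binary_vectors(graphs, patterns)

for graph_vector in vectors:
    print(graph_vector)
```

The resulting vectors can be passed to any classifier that expects vector data. Because one attribute is created per frequent sub-pattern, lowering `min_support` makes the vectors grow quickly, which is the scaling concern the slide raises.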

Editor's Notes

  1. When conducting research, it is easy to go to one source: Wikipedia. However, you need to include a variety of sources in your research. Consider the following sources: Who can I interview to get more information on the topic? Is the topic current and will it be relevant to my audience? What articles, blogs, and magazines may have something related to my topic? Is there a YouTube video on the topic? If so, what is it about? What images can I find related to the topic?