Big data Mining
1. Mining Big Data: Current State of Work and Challenges
Group members:
Misbah Rashid
Mariam Rashid
2. About the Journal Article
• The article was published in 2015 in the International Journal of
Advanced Networking and Applications (IJANA).
• It was written by Kaushika Pal and Dr. Jatinderkumar R. Saini.
4. Introduction To Big Data
• Huge amounts of data are generated and collected from various sources such as
sensors and devices, in different formats and from connected or independent
applications.
• This data has to be processed, investigated, stored and understood. Taking
internet data as an example, the number of web pages indexed by Google was one
million in 1998, one billion in 2000 and one trillion in 2008.
• Examples from social media include Facebook, Twitter, GooglePlus, YouTube and
LinkedIn.
• Each of these sites receives a huge volume of data on a daily basis.
• Smartphones are now highly connected to the internet; they use and store data
on the web and thus increase web volume. Twitter processes around 400 million
tweets each day.
• Smartphones are the real producers of big data, and it is up to us how we
utilize that data to change our lives.
5. • Data created via smartphones can be put to good use. Smartphone
usage patterns helped researchers in Africa determine where malaria
outbreaks were occurring and where the affected people went [10].
This information can be used to distribute medicines more
efficiently. This is the power of big data analysis, which can
have a positive impact on humanity.
6. Big Data Mining
• Big data mining refers collectively to the data mining and extraction
techniques that are performed on large volumes of data.
• We need new tools and new algorithms to deal with this huge amount of
data. When working with Big Data, seven V's have to be considered for Big Data
management:
• Volume: Every industry is flooded with data, which can be extremely
valuable if it can be used to retrieve important information.
• Variety: 90% of the data generated is amorphous, coming in all shapes and
forms. It is generated from geo-spatial sources, tweets, and photos and videos
uploaded to social networking sites, all of which can be analysed for content.
7. • Velocity: Velocity refers to the increasing speed at which this data is
created, and the increasing speed at which it can be processed,
stored and analysed.
• Value: The potential value of Big Data is huge.
• Variability: Variability refers to data whose meaning is constantly changing.
There are changes in the structure of data and how users want to interpret
that data.
• Veracity: Big Data veracity refers to the noise and abnormality in data. When
scoping out a big data strategy, you need processes that keep your data clean
and keep 'dirty data' from accumulating in your systems.
• Visibility: Data from different sources should be visible to the technology
stack making up Big Data. Certain crucial data is available but
not visible to Big Data.
8. Literature Review
• Mining heterogeneous information networks is a new and promising
research frontier in Big Data mining. It treats interconnected data of many
different types, including relational database data, as
heterogeneous information networks.
• Mining Big Data in Real Time discusses the challenges in structured pattern
classification. Classification methods mostly deal with vector data. To
apply them to graph patterns, the graphs can be converted into vectors of
attributes. Each attribute indicates the presence or absence of a
sub-pattern. Attributes are created for every frequent sub-pattern, and the
number of such sub-patterns can be very large.
• Data Mining with Big Data drew attention to the challenges of
mining big data at three levels: data, model, and system.
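The graph-to-vector conversion described in the second bullet can be sketched roughly as follows. This is only an illustration under simplifying assumptions: each graph is a set of labelled edges, and a "sub-pattern" is a single edge. Real frequent-subgraph mining considers larger sub-structures, which is why the number of attributes can grow so large. The data and the function names `frequent_subpatterns` and `to_vector` are hypothetical.

```python
from collections import Counter

# Toy graphs (hypothetical data): each graph is a set of labelled edges.
graphs = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("C", "D")},
    {("A", "B"), ("B", "C"), ("C", "D")},
]

def frequent_subpatterns(graphs, min_support):
    """Return the edges that occur in at least min_support graphs.

    Sorting gives a stable attribute (column) order.
    """
    counts = Counter(edge for graph in graphs for edge in graph)
    return sorted(edge for edge, n in counts.items() if n >= min_support)

def to_vector(graph, patterns):
    """One attribute per frequent sub-pattern: 1 if present, 0 if absent."""
    return [1 if pattern in graph else 0 for pattern in patterns]

patterns = frequent_subpatterns(graphs, min_support=2)
vectors = [to_vector(graph, patterns) for graph in graphs]
```

The resulting `vectors` can be passed to any classifier that expects fixed-length vector data.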
9. Applications of Big Data Mining
• Business: Big data mining expands customer intelligence, improves
operational efficiency and enables customer personalization. To
understand customer requirements in depth, one needs strong
personal connections and, where possible, customized services,
which will drive more sales.
• Managing market demand: External market and retailer data is
captured in real time to sense, evaluate, and respond to
demand indicators faster than ever before.
• Fraud detection: By analysing abnormal patterns across
various data sources, fraud can be detected in financial
transactions, health insurance, etc.
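As a small illustration of spotting an "abnormal pattern", the sketch below flags transaction amounts that lie far from the median, measured in multiples of the median absolute deviation (MAD). This is one simple outlier rule chosen for illustration, not the method used in the article. The function name `flag_abnormal`, the threshold `k` and the sample amounts are assumptions.

```python
from statistics import median

def flag_abnormal(amounts, k=5.0):
    """Return the amounts further than k * MAD away from the median."""
    med = median(amounts)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median([abs(a - med) for a in amounts])
    if mad == 0:
        # All typical values are identical; this simple rule cannot rank deviations.
        return []
    return [a for a in amounts if abs(a - med) > k * mad]

# Hypothetical card transactions with one suspicious amount.
amounts = [20, 25, 22, 19, 24, 21, 500]
suspicious = flag_abnormal(amounts)
```

A real fraud system would combine many signals (location, merchant, timing) from several data sources rather than a single amount column.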
10. Challenges
• Variety and Heterogeneity: Different sources generate Big Data, leading to great variety
or heterogeneity. Heterogeneity in big data means dealing with structured, semi-
structured, and even entirely unstructured data concurrently. The challenge is to unveil
or extract the hidden knowledge in such data sets.
• Scalability: The extraordinary volume requires high scalability of data management
and mining tools. However, most algorithms currently used in data mining do not scale
very well when applied to very large data sets because they were initially developed
and tested on smaller data sets. We now have such large data sets that these algorithms
are no longer efficient enough for mining and analysis.
• Velocity/Speed: The ability to access and mine big data quickly is essential.
A mining task must be finished within a definite period of time; otherwise, the
results become less valuable or even worthless. The design of
new and more efficient indexing schemes is much desired, but remains one of the
greatest challenges to the research community.
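One common way to cope with the scalability and velocity challenges is to use single-pass algorithms that keep only a small, fixed amount of state, so the data never has to fit in memory. The sketch below uses Welford's online algorithm to maintain a running mean and variance over a stream. It is offered as an example of the idea, not as a technique taken from the article; the class name `RunningStats` and the sample values are assumptions.

```python
class RunningStats:
    """Running mean and population variance in one pass with constant memory."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new value into the statistics (Welford's update)."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of the values seen so far (0.0 if none)."""
        return self._m2 / self.n if self.n else 0.0

# The stream is consumed one value at a time; nothing is stored.
stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(value)
```

Because each update costs constant time and memory, the same code works whether the stream holds eight values or billions.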
11. Challenges
• Privacy Crisis: Data privacy has always been an issue. The concern has become
extremely serious with big data mining, which often requires personal information in
order to produce relevant and accurate results, such as location-based and personalized
services. Also, with huge volumes of big data such as social media that contain an
incredible amount of highly interrelated personal information, every bit of information
can be mined out. Every transaction in our daily life is being pushed online
and leaves a trace there: we communicate with friends via email, instant message, blog,
and Facebook; we shop and pay our bills online; credit card companies hold our
confidential identity information. As time goes on, your personal information will be
scattered here and there, and anyone could easily gain access to powerful
tools to extract your confidential information.
• Garbage Mining: As the volume of data increases day by day, the amount of
irrelevant and unnecessary data also increases. Garbage mining means extracting this
hidden irrelevant data and cleaning it out of the important data. This is not easy,
because it is difficult to find hidden data in the bulk of the data and then clean it.
Garbage mining remains one of the greatest challenges.
12. Appreciation
• In this article, the authors have thoroughly explained the issues around
mining big data, including the main concerns and main challenges
for the future.
• The most positive aspect of this article is the clarity of its statement of
the research problem.
• The authors selected 14 relevant sources published between 2012
and 2014. Ten of these references were primary sources.
The authors did a reasonable job of highlighting previous research on
topics related to their own and even provided comparisons of
the literature where possible.
13. Critique
• The statement of the problem was implied in the abstract of
the article, but the specific problem is not addressed until after the
authors have described the usefulness of mining big data later in the
article.
• The authors have not clearly explained the applications of mining big
data in medicine, healthcare and engineering.
• The authors have discussed big data mainly in terms of mobile phones. The
scope of big data is far broader than what the authors have discussed.
14. Future work
• Techniques will be developed to overcome the challenges faced
in mining big data.
• Social media and Big Data can be used to understand public opinion
trends.