This presentation covers why data mining is important and the main techniques used for it, with example applications for each technique. It also explains how an enterprise can source web data via crawling services to bolster data mining models.
2. What is Data Mining?
The process of analysing large-scale data to identify hidden patterns and trends for understanding, guiding and forecasting future behaviour.
8. While your organisation might have a large volume of internal data, there is another source of data that you must not miss: web data.
9. In essence, web data can augment your existing data to provide a holistic view of your business, which should be the basis of any data mining project.
10. Leverage cloud-based, managed Data as a Service (DaaS) providers such as PromptCloud, which have already set up the Big Data infrastructure and technology stack required for custom web data extraction.
How to Source Web Data
11. This is important because a DaaS provider can eliminate the need for engineering talent at your end and take away the pain of maintaining data feeds (considering the frequent structural changes in web pages).
12. All you need to do is focus on consuming the data for business growth.
13. Get started by providing the following information:
Websites you’re looking to crawl
Relevant data fields
Desired frequency of the crawl (daily/weekly/monthly)
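The three inputs listed above can be captured in a small data structure. The sketch below is purely illustrative: `CrawlRequest`, its field names and the example URL are hypothetical and do not represent any provider's actual API.

```python
from dataclasses import dataclass
from typing import List

# Frequencies named on the slide; anything else is rejected.
ALLOWED_FREQUENCIES = {"daily", "weekly", "monthly"}


@dataclass
class CrawlRequest:
    """Hypothetical container for the information a crawl provider needs."""
    websites: List[str]      # sites to crawl
    data_fields: List[str]   # fields to extract from each page
    frequency: str           # how often the crawl should run

    def __post_init__(self):
        if not self.websites:
            raise ValueError("at least one website is required")
        if not self.data_fields:
            raise ValueError("at least one data field is required")
        if self.frequency not in ALLOWED_FREQUENCIES:
            raise ValueError(f"frequency must be one of {sorted(ALLOWED_FREQUENCIES)}")


# Example values are placeholders, not real requirements.
request = CrawlRequest(
    websites=["https://example.com"],
    data_fields=["product_name", "price"],
    frequency="weekly",
)
print(request)
```

Validating the request up front keeps incomplete specifications from reaching whoever sets up the crawl.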
16. Association
Can be useful for bundling products, in-store product placement and analysing imperfections. For example, you might find that when people buy bananas, they tend to buy milk as well, so you can suggest milk to them the next time they purchase bananas.
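The banana-and-milk example can be quantified with the usual association-rule measures: support, confidence and lift. The following is a minimal sketch on made-up transactions; the data and function names are assumptions for illustration only.

```python
# Toy shopping baskets (invented data).
transactions = [
    {"banana", "milk", "bread"},
    {"banana", "milk"},
    {"bread", "eggs"},
    {"banana", "milk", "eggs"},
    {"bread"},
    {"banana", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), transactions)
    confidence = both / support(antecedent, transactions)
    lift = confidence / support(consequent, transactions)
    return {"support": both, "confidence": confidence, "lift": lift}


metrics = rule_metrics({"banana"}, {"milk"}, transactions)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A lift above 1 suggests that milk appears with bananas more often than it would if the two purchases were independent, which is the signal behind a "suggest milk" recommendation.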
17. Classification
This technique is used for identifying specific classes of customers or products by using their associated attributes.
18. Classification
Can be useful for identifying customers who are likely or unlikely to purchase, customers who are most valuable, customers who respond to a specific type of advertisement, etc.
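As one possible illustration of classification, the sketch below labels a customer as likely or unlikely to purchase with a k-nearest-neighbours vote. The slides do not name a specific algorithm, so the choice of k-NN, the two attributes and the data are all assumptions.

```python
import math
from collections import Counter

# Invented training data: (visits_per_month, avg_basket_value) -> label.
training = [
    ((1, 10.0), "unlikely"),
    ((2, 15.0), "unlikely"),
    ((1, 20.0), "unlikely"),
    ((8, 60.0), "likely"),
    ((9, 55.0), "likely"),
    ((7, 70.0), "likely"),
]


def knn_predict(point, training, k=3):
    """Return the majority label among the k training rows closest to point."""
    # Attributes are used unscaled here; real data would usually be normalised first.
    nearest = sorted(training, key=lambda row: math.dist(point, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict((8, 65.0), training))
print(knn_predict((2, 12.0), training))
```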
19. Clustering
This technique is used for exploring data and applying one or more attributes for finding innate correlations among members of a cluster.
20. Clustering
The applications of clustering lie in identifying new customer segments, grouping of similar sites by search engines, recognising similarity in genetic data from population structure and more.
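Customer segmentation is often sketched with k-means, one common clustering algorithm (the slides do not prescribe a particular one). The example below uses invented two-attribute customer data and fixed starting centroids so the result is reproducible.

```python
import math


def assign(points, centroids):
    """Group each point with its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def mean_point(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


def kmeans(points, centroids, max_iterations=20):
    """Alternate assignment and centroid updates until the centroids stop moving."""
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        # An empty cluster keeps its previous centroid.
        updated = [mean_point(c) if c else old for c, old in zip(clusters, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Invented data: (orders_per_month, avg_order_value_in_tens).
points = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, [points[0], points[3]])
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

On this data the low-activity and high-activity customers end up in separate groups, which is the kind of segment a marketing team might then examine further.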
21. Outlier Detection
This technique is used for identifying unusual or suspicious cases that deviate from the projected pattern or expected norm.
Source: http://bit.ly/2sGcIum
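One simple way to flag cases that deviate from the expected norm is the modified z-score, which measures distance from the median in units of the median absolute deviation (MAD). This is a sketch under assumptions: the transaction amounts are invented, and the 3.5 cutoff is a commonly used rule of thumb rather than anything stated in the slides.

```python
from statistics import median


def find_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        # More than half the values are identical; this score is undefined.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [v for v in values if abs(0.6745 * (v - centre) / mad) > threshold]


# Invented transaction amounts with one suspicious entry.
amounts = [52, 48, 50, 51, 49, 47, 53, 500]
outliers = find_outliers(amounts)
print(outliers)
```

Using the median rather than the mean keeps the single extreme value from distorting the baseline it is being compared against.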
23. Regression Analysis
This technique is used for establishing the dependency between two variables so that the relationship can be used to predict the outcome of one variable from the other.
24. Regression Analysis
Examples of applications of regression analysis include predicting customer lifetime value resulting from loyalty, the effect of the real estate market on GDP, etc.
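The customer lifetime value example can be illustrated with simple linear regression fitted by ordinary least squares. The sketch below uses invented figures; the variable names and numbers are assumptions, not data from the presentation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: years as a loyalty member vs. lifetime value.
loyalty_years = [1, 2, 3, 4, 5]
lifetime_value = [120, 190, 310, 390, 490]

slope, intercept = fit_line(loyalty_years, lifetime_value)
predicted = intercept + slope * 6  # estimate for a six-year member
print(f"slope={slope:.1f}, intercept={intercept:.1f}, predicted={predicted:.1f}")
```

The fitted slope estimates how much lifetime value changes per additional year of loyalty in this sample; it describes an association in the data and does not by itself show that loyalty causes the increase.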
26. Attribute Importance
Examples of applications of attribute importance include finding the factors most strongly associated with customers who respond to a certain promotion and the factors most associated with high-performing employees.
Source: www.hrthatworks.com
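One way to rank the factors associated with promotion response is information gain: how much knowing an attribute reduces uncertainty about the outcome. The slides do not specify a scoring method, so this choice, the attribute names and the data are all assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy after splitting rows on attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Invented customer records: (channel, region, member, responded).
records = [
    ("email", "north", "yes", "yes"),
    ("email", "south", "yes", "yes"),
    ("email", "north", "yes", "yes"),
    ("email", "south", "no", "yes"),
    ("sms", "north", "yes", "no"),
    ("sms", "south", "no", "no"),
    ("sms", "north", "no", "no"),
    ("sms", "south", "no", "no"),
]
columns = ("channel", "region", "member", "responded")
rows = [dict(zip(columns, record)) for record in records]

attributes = ["channel", "region", "member"]
ranked = sorted(attributes, key=lambda a: information_gain(rows, a, "responded"), reverse=True)
for name in ranked:
    print(name, round(information_gain(rows, name, "responded"), 3))
```

In this toy data the promotion channel separates responders from non-responders completely, membership helps somewhat, and region carries no information, so the ranking reflects that order.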
28. Feature Selection
Examples of applications of feature selection include latent semantic analysis, data compression, pattern recognition and more.
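The text here lists applications without describing a selection method, so the sketch below picks one simple filter approach as an assumption: dropping features whose values barely vary, since near-constant columns carry little information for a model. The column names, data and threshold are invented.

```python
from statistics import pvariance


def select_features(columns, min_variance):
    """Keep the names of columns whose population variance exceeds min_variance."""
    return [name for name, values in columns.items() if pvariance(values) > min_variance]


# Invented dataset, one list of values per feature.
columns = {
    "age": [23, 35, 47, 52, 61],
    "country_code": [1, 1, 1, 1, 1],        # constant, so uninformative
    "monthly_spend": [20.0, 80.0, 45.0, 150.0, 60.0],
    "has_account": [1, 1, 1, 1, 0],          # almost constant
}

# The threshold is arbitrary here; features on different scales would
# normally be standardised before comparing variances.
kept = select_features(columns, min_variance=0.2)
print(kept)
```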