Data Mining


  1. VihangShah

     Introduction

     Data mining is the process of extracting useful information from very large databases. It searches large volumes of data automatically to discover patterns and trends that go beyond simple analysis. Data mining is also known as Knowledge Discovery in Data (KDD).

     Data Mining Process

     The process has four phases: Problem Definition, Data Gathering & Preparation, Model Building & Evaluation, and Knowledge Deployment.

     Problem Definition

     In this phase the need for the project, its objectives, and its requirements are defined, and a preliminary plan is drawn up from them.
  2. Data Gathering & Preparation

     With the requirements collected in the previous phase, this phase determines which additional data must be gathered and which data can be omitted from later phases. It is also the time to identify data quality problems. Once the data sources are identified, the data must be selected, cleaned, constructed, and formatted into the desired form. The outcome of data preparation is the final data set. Careful preparation can significantly improve the information that data mining discovers.

     Model Building & Evaluation

     In this phase various modeling techniques are selected and applied, and their parameters are tuned toward optimal values. One or more models are created and run on the prepared data set, and tests are generated to validate the quality and validity of each model.

     Knowledge Deployment

     The knowledge gained from the data mining process must be presented in a form that can be used whenever it is needed. In this phase the plans for deployment, maintenance, and monitoring are created to cover both implementation and future support.

     What Can Data Mining Do and Not Do?

     Do:
     • Data mining can help find patterns and relationships within your data.
     • Data mining can help you discover hidden information in your data.
     • Data mining can produce optimized results from huge databases.
     • Data mining can help you analyze data for future use.
  3. Not Do:
     • Data mining does not work entirely on its own; it still needs people to guide it.
     • Data mining cannot tell you how valuable the discovered information is to your organization.
     • Data mining does not eliminate the need to know your business and understand your data.

     Data Mining Techniques

     Data mining has six basic techniques: association, classification, clustering, prediction, sequential patterns, and decision trees.

     Association

     Association works on the relationships between items, which is why it is also called the relation technique. It is used in market analysis to identify sets of products that customers frequently purchase together. Retailers use association to study customers' buying habits. Based on historical sales data, a retailer might find that customers who buy bread also buy butter.

     Classification

     Classification assigns each item to one of a predefined set of classes or groups. For example, given the records of all employees who have left the company, classification can predict which current employees will probably leave in a future period.

     Clustering

     In clustering, the technique itself defines the classes and places the objects in them, whereas in classification objects are assigned to predefined classes. For example, consider book management in a library that holds a wide range of books on different topics. To make books on the same topic easy for readers to find, we place books that share some similarity in one cluster, or one shelf, and label it with a meaningful name.
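The bread-and-butter case from the association technique above can be made concrete with a small sketch. It uses support and confidence, two measures commonly applied to association rules; the slides do not name them, so treat their use here as an assumption. The transactions are invented for illustration, and `support` and `confidence` are hypothetical helper names, not part of any library.

```python
# Hypothetical shopping baskets, invented for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def count_containing(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for basket in transactions if itemset <= basket)


def support(itemset, transactions):
    """Fraction of all transactions that contain the whole itemset."""
    return count_containing(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Among baskets with the antecedent, the fraction that also hold the consequent."""
    with_antecedent = count_containing(antecedent, transactions)
    if with_antecedent == 0:
        return 0.0  # the rule never applies, so report no confidence
    both = count_containing(set(antecedent) | set(consequent), transactions)
    return both / with_antecedent


# How often bread and butter appear together, and how often
# a basket with bread also contains butter.
print(support({"bread", "butter"}, transactions))
print(confidence({"bread"}, {"butter"}, transactions))
```

A high confidence for "bread implies butter" is the kind of pattern a retailer would look for in historical sales data; a real analysis would scan many candidate item sets rather than one hand-picked rule.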
  4. Prediction

     Prediction discovers the relationships among independent variables and between dependent and independent variables. For instance, it can be used in sales to predict future profit: if sales is treated as the independent variable, profit can be the dependent variable.

     Sequential Patterns

     This technique seeks to discover or identify similar patterns, regular events, or trends in transaction data over a business period.

     Decision Tree

     The decision tree is one of the most widely used data mining techniques because it is easy to understand. The root of a decision tree is a simple question or condition that has multiple answers. Each answer leads to a further set of questions or conditions that help us narrow down the data.

     Note: Two or more data mining techniques are often combined to form a process that meets the business needs.

     Data Mining Applications

     • Marketing: data mining is used to analyze which products were bought together, when they were bought, and in what sequence. It also helps reveal customer behavior.
     • Banking and finance: data mining is used to identify customer loyalty by analyzing customers' purchasing activity. It also helps retain credit card customers.
     • Health care and insurance: data mining analyzes claims to find which medical procedures are claimed together, and it forecasts which customers are likely to purchase new policies.

     Note: Data mining is also used to analyze data in many other sectors.
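The prediction technique described earlier, with sales as the independent variable and profit as the dependent variable, can be sketched with a straight-line fit. This is a minimal illustration assuming ordinary least squares, which the slides do not specify; the figures are invented, and `fit_line` and `predict` are hypothetical helper names.

```python
# Hypothetical monthly figures, invented for illustration only.
sales = [100.0, 150.0, 200.0, 250.0, 300.0]   # independent variable
profit = [12.0, 20.0, 27.0, 36.0, 44.0]       # dependent variable


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted dependent value for a new independent value x."""
    return slope * x + intercept


slope, intercept = fit_line(sales, profit)
# Estimate profit for a sales level not present in the data.
print(predict(350.0, slope, intercept))
```

A straight line is only one possible model; if profit does not grow roughly in proportion to sales, a different prediction technique would be a better fit.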