Hands-On Lab
Lab Manual
SQL Server™ 2005:
   Data Mining


                     Information in this document is subject to change without notice. The example companies, organizations,
                     products, people, and events depicted herein are fictitious. No association with any real company,
                     organization, product, person or event is intended or should be inferred. Complying with all applicable
                     copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this
                     document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by
                     any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the
                     express written permission of Microsoft Corporation.

                     Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights
                     covering subject matter in this document. Except as expressly provided in any written license agreement
                     from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks,
                     copyrights, or other intellectual property.

                     © 2009 Microsoft Corporation. All rights reserved.

                     Microsoft, MS-DOS, MS, Windows, Windows NT, MSDN, Active Directory, BizTalk, SQL Server, SharePoint,
                     Outlook, PowerPoint, FrontPage, Visual Basic, Visual C++, Visual J++, Visual InterDev, Visual SourceSafe,
                     Visual C#, Visual J#, and Visual Studio are either registered trademarks or trademarks of Microsoft
                     Corporation in the U.S.A. and/or other countries.

                     Other product and company names herein may be the trademarks of their respective owners.
Data Mining

Objectives              After completing this lab, you will be able to:
                           Create Decision Tree and Naïve Bayes Data Mining Models.
                           View Mining Accuracy Charts.
                           Create a Prediction Query.
                           Model Time Series.


                        Note This lab focuses on the concepts in this module and as a result may not
                        comply with Microsoft security recommendations.


                        Note The SQL Server 2005 labs are based on beta builds of the product. The
                        intent of these labs is to provide you with a general feel of some of the planned
                        features for the next release of SQL Server. As with all software development
                        projects, the final version may differ from beta builds in both features and user
                        interface. For the latest details on SQL Server 2005, please visit
                        http://www.microsoft.com/sql/2005/.



Estimated time to complete this lab: 60 minutes


Exercise 0:
Lab Setup
                           In this part of the lab you will set up the views you will work with in the rest of
                           the lab.

                            Task 1: Log in.
                               Log on with user name Administrator and password pass@word1.

                            Task 2: Create the Views
                           1. From the Windows task bar, select Start | All Programs | Microsoft SQL
                              Server 2005 | SQL Server Management Studio.
                           2. In the Connect to Server dialog, make sure that in the Server type drop
                              down list-box SQL Server is selected. Enter localhost in the Server name
                              textbox and select Windows Authentication in the Authentication drop
                              down list-box, as in Figure 1. Click Connect.




                           Figure 1: Connect to Server Dialog
                           3. Select File | Open | File.
                           4. Navigate to the C:\SQL Labs\Lab Projects\Data Mining Lab\DM Setup
                              directory, and select the ViewCreation.sql file. Click Open.
                           5. Click Connect in the Connect to Server dialog.
                           6. Execute the script by pressing F5, or by clicking on the Execute icon in the
                              toolbar, as shown in Figure 2.




                Figure 2: Execute Script
                7. When the script has executed successfully, select the File | Exit menu item
                   to close the SQL Server Management Studio.




Exercise 1:
Creating Decision Tree and Naïve Bayes Data Mining Models
Overview        The management at Adventure Works wants to analyze purchasing decisions
                based on customer demographics. Analysis Services has improved data mining
                functionality, providing the following data mining techniques:
                   Microsoft Association Rules
                   Microsoft Clustering
                   Microsoft Decision Trees
                   Microsoft Naïve Bayes
                   Microsoft Neural Network
                   Microsoft Sequence Clustering
                   Microsoft Time Series
                In this exercise, you will develop an Analysis Services solution using the
                Microsoft Business Intelligence Development Studio environment. The
                Business Intelligence Development Studio is an environment based on the
                Microsoft Visual Studio 2005 environment.
                Business Intelligence Development Studio provides you with an integrated
                development environment for designing, testing, editing, and deploying projects
                to the Analysis Server. You will create and view a data mining structure with
                Decision Trees and Naïve Bayes data mining models using
                AdventureWorksDW customer data.
                To create and view data mining models, you will:
                              Create an Analysis Services project in the Business Intelligence
                               Development Studio environment.
                              Create a data source and data source view.
                              Create a data mining structure and decision trees data mining model using
                               the Mining Model Wizard.
                              Create a related mining model (Naïve Bayes) in the Mining Models view.
                              Deploy the Analysis Services solution.
                              Explore the data mining models using the Mining Model Viewer.

                           Task 1: Create an Analysis Services Project
                          1. From the Windows task bar, select Start | All Programs | Microsoft SQL
                             Server 2005 | Business Intelligence Development Studio.
                          2. Select File | New | Project.
                          3. In the New Project dialog box, in the Project Types pane, click the
                             Business Intelligence Projects folder.
                          4. In the Templates pane, click the Analysis Services Project icon.
                          5. In the Name text box, type DM Exercise 1.
                          6. In the Location text box, enter C:\SQL Labs\User Projects.
                          7. Uncheck the Create directory for Solution checkbox. Figure 1 shows how
                             the New Project dialog box should look once you're done.
                          8. Click OK.




                          Figure 1: New Project Dialog
                          The project is created in a new solution: the solution is the largest unit of
                          management in the Business Intelligence Development Studio environment.
                          Each solution contains one or more projects. An Analysis Services Project is a
                          group of related files containing the XML code for all of the objects in an
                          Analysis Services database.
                          You can view the solution and its projects in the Solution Explorer pane on the
                          right hand side in the Business Intelligence Development Studio. If the Solution
Explorer is not visible, you can view it by selecting the View | Solution
Explorer menu item (or by pressing Ctrl+Alt+L).

 Task 2: Set the Deployment Mode Property
1. In the Solution Explorer window, right-click the DM Exercise 1 project,
    and select Properties from the context menu.
2. In the DM Exercise 1 Property Pages dialog box, under the Configuration
    Properties folder, click Deployment.
3. In the right pane, click the Deployment Mode property. In the Deployment
    Mode drop-down list click DeployAll, and then click OK.
You can configure the build, debugging, and deployment properties of an
Analysis Services project.

 Task 3: Create a Data Source
1. In the Solution Explorer pane, under the DM Exercise 1 project, right-click
    the Data Sources folder, and then select New Data Source from the
    context menu.
2. In the Data Source Wizard dialog box, on the Welcome to the Data
    Source Wizard page, click Next.

Tip If the Data connections pane already includes
(local).AdventureWorksDW, skip to step 14.

3. On the Select how to define the connection page, make sure the Create a
   data source based on an existing or new connection radio button is
   chosen. Click New ….
4. In the Connection Manager dialog box, verify that Native OLE DB is
    selected in the Provider drop down list box at the top of the page.
5. Still in the Connection Manager, under the 1. Select an OLE DB provider
    header, ensure the Microsoft OLE DB Provider for SQL Server is
    chosen.
6. Under the header 2. Specify the following to connect to SQL Server in the
   Select or enter a server name drop-down list, type “(local)”.

Important In the Select or enter a server name drop-down list, type (local);
do not type localhost.

7. Under Enter information to log on to the server, click Use Windows NT
   Integrated Security.
8. Click Select the database on the server, and in the Select the database on
   the server drop-down list, click AdventureWorksDW.
9. Click the Data Links… button. This opens the Data Link Properties
   dialog box with (local), Use Windows NT Integrated security, and
   AdventureWorksDW already selected.
10. Click Test Connection.
11. In the Microsoft Data Link dialog box, click OK.
12. In the Data Link Properties dialog box, click OK.
13. In the Connection Manager dialog box, click OK.
                          14. In the Data Source Wizard dialog box, on the Select how to define the
                               connection page, verify that (local).AdventureWorksDW is selected, and
                               click Next.
                          15. On the Completing the Wizard page, leave the default Data source name
                              Adventure Works DW unchanged, and then click Finish.
                          You have now defined how to connect to the database you are working with.
                          Next, you define the schema information you want to use in the solution.
                          You do this through the Data Source View.

                           Task 4: Create a Data Source View
                          1. In the Solution Explorer pane, under the DM Exercise 1 folder, right-click
                             the Data Source Views folder, and then select New Data Source View
                             from the context menu.
                          2. In the Data Source View Wizard dialog box, on the Welcome to the Data
                             Source View Wizard page, click Next.
                          3. On the Select Data Source page, in the Relational data sources pane, verify
                             that Adventure Works DW is selected, and then click Next.

                          Note At this point, Analysis Services may take a few moments to read the
                          database schema.

                          4. In this project, your Data Source View is not going to be based on a table;
                             instead, it will be based on a view. On the Select Tables and Views page,
                             double-click dbo.vDMLabCustomerTrain to add this view to the Included
                             objects list.

                          Note You may need to expand the Name column, and/or the entire dialog box,
                          in order to be able to select dbo.vDMLabCustomerTrain.

                          5. Click Next.
                          6. On the Completing the Wizard page, in the Name text box, type
                             Customers and then click Finish. The Data Source View Designer will
                             open. The Data Source View Designer is a graphical representation of the
                             data schema you have defined.
                          7. Right-click the vDMLabCustomerTrain table and then click Explore
                             Data, as in Figure 2.




Figure 2: Explore Data

Note Analysis Services may take a few moments to read the data.

8. This opens a new window in which you can view the data for the table/view.
   If you like, you can move the window, or even dock it.
9. In the Explore vDMLabCustomerTrain Table window, scroll to view the
   data, and then click the X in the upper-right corner, as in Figure 3, to
   close the window.




Figure 3: Explore Table Window
A Data Source View contains data source schema information. As shown here,
you do not have to base the Data Source View on table(s): You can use views
as well.

 Task 5: Create a Data Mining Structure
1. In the Solution Explorer pane, under the DM Exercise 1 folder, right-click
    the Mining Models folder, and then select New Mining Model from the
    context menu.
2. In the Data Mining Wizard, on the Welcome to the Data Mining Wizard
    page, click Next.
The Data Mining Wizard is the starting point for all data mining operations.
3. On the Select the Definition Method page, click From existing relational
   database or data warehouse and then click Next.
4. On the Select the Data Mining Technique page, in the Which data
   mining technique do you want to use? drop-down list, verify that
   Microsoft Decision Trees is selected, and then click Next.
5. On the Select Data Source View page, in the Available data source views
   pane, verify that the Customers data source view is selected, and then click
   Next.
                          6. On the Specify Table Types page, in the Input tables pane, in the
                             vDMLabCustomerTrain row, verify that the Case check box is selected,
                             and then click Next.
                          7. On the Specify the Training Data page, in the Mining model structure
                             pane, select or deselect each cell by clicking on the check box as shown in
                             Figure 4.




                          Figure 4: Specifying Columns for Analysis
                          Because CustomerKey is the primary key of the source table, the Data Mining
                          Wizard has automatically selected it as the key. The key identifies the cases in
                          the mining model.

                          Important The CustomerKey, FirstName, and LastName columns should not
                          be selected as Input or Predictable columns.

                          8. Click Next.
                          9. On the Specify Columns’ Content and Data Type page, click Next.
                          10. On the Completing the Wizard page, in the Mining Structure Name text
                              box, type Customers, and then click Finish. The Mining Structure
                              designer will open as in Figure 5.




Figure 5: The Mining Structure
A data mining structure may contain multiple data mining models. Each data
mining model uses a subset of the data referenced by the data mining structure.
When the data mining structure is processed, the source data is queried once
and then all of the data mining models are processed in parallel.

 Task 6: Add and edit columns in the Mining Structure
1. In the Mining Structure tree view on the left side of the designer window,
   right-click Columns, and then click Add a Column.
2. In the Select a Column dialog box, in the Source column tree view, select
   the Age column, and then click OK.
3. An alert will appear indicating that you already have an Age column
   selected. Click Yes to approve and dismiss the dialog box.
4. In the Mining Structure tree view, right-click the Age 1 column, and then
   select Properties from the context menu.
5. In the Properties window, in the Content property drop-down list, select
   Discretized.
When the Content property is set to Discretized, the server automatically
determines discrete ranges for the column.
6. In the Properties window, in the Name property text box, type Age
   Discretized, and then press <Enter>.
7. An alert will appear confirming that you want to change the name for all
   related columns. Click Yes to approve and dismiss the dialog box.
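
To make the idea of discretizing the Age column concrete, here is a minimal Python sketch of one possible bucketing strategy (equal-frequency buckets). This is an illustration only: the method and the number of buckets that Analysis Services actually chooses may differ, and the function names and sample ages below are hypothetical.

```python
from bisect import bisect_right


def equal_frequency_cut_points(values, bucket_count):
    """Return up to bucket_count - 1 cut points that split the sorted
    values into groups of roughly equal size."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for i in range(1, bucket_count):
        cut = ordered[(i * n) // bucket_count]
        # Skip duplicate cut points so that no bucket is empty by construction.
        if not cuts or cut > cuts[-1]:
            cuts.append(cut)
    return cuts


def bucket_index(value, cuts):
    """Map a continuous value to the 0-based index of its bucket."""
    return bisect_right(cuts, value)


# Hypothetical ages; these are not taken from vDMLabCustomerTrain.
ages = [23, 25, 31, 34, 38, 41, 45, 47, 52, 58, 63, 70]
cuts = equal_frequency_cut_points(ages, 4)
labels = [bucket_index(age, cuts) for age in ages]
```

Each continuous age is replaced by a bucket index, which is the kind of discrete value that an algorithm limited to discrete inputs can consume.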

 Task 7: Rename the Mining Model
1. In the View pane at the top of the designer, select the Mining Models tab
   to view information about the model as in Figure 6.




                          Figure 6: The Mining Models View
                          2. In the Mining Models grid, right-click the Customers column heading, and
                             then click Properties.
                          3. In the Properties window, in the Name property text box, type Customers
                             DT to rename the mining model, and then press <Enter>.

                          Note Step 3 renames the Decision Trees mining model, but does not rename the
                          mining structure.


                           Task 8: Create a Related Mining Model
                          1. Click on the Create a Related Mining Model icon on the Mining Models
                             icon bar, as shown in Figure 7.




                          Figure 7: The Create a Related Mining Model icon
                          2. In the New Mining Model dialog box, in the Model Name text box, type
                              Customers NB.
                          3. In the Algorithm Name drop-down list, click Microsoft Naive Bayes, and
                              then click OK.
4. When the alert appears confirming that you want to use the Microsoft Naïve
   Bayes algorithm and that some columns will be ignored, click Yes to
   approve and dismiss the dialog box.
The Naïve Bayes algorithm does not support continuous columns. Therefore,
the Age column will be ignored in this mining model. Instead, you will use the
Age Discretized column.
5. Click the Age Discretized cell in the Customers NB column (its current
   value is Ignore), and in the cell's drop-down list, select Input, as in Figure 8.




Figure 8: Changing Usage of a Mining Model Column
6. You should now have an end result as shown in Figure 9.




Figure 9: The Customers Mining Model
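
The reason a Naïve Bayes model works with discrete columns can be seen in a small Python sketch: the model is built from counts of attribute values per class, which only makes sense when values fall into a finite set of states. This is a simplified illustration with add-one smoothing, not the Microsoft Naive Bayes implementation; the function names, column names, and training cases are hypothetical.

```python
from collections import Counter, defaultdict


def train_naive_bayes(cases, target):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(case[target] for case in cases)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    domains = defaultdict(set)           # attribute -> observed values
    for case in cases:
        for attr, value in case.items():
            if attr == target:
                continue
            value_counts[(attr, case[target])][value] += 1
            domains[attr].add(value)
    return class_counts, value_counts, domains


def predict(model, case):
    """Return normalized class probabilities for one case (add-one smoothing)."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n_cls in class_counts.items():
        score = n_cls / total  # prior probability of the class
        for attr, value in case.items():
            count = value_counts[(attr, cls)][value]
            score *= (count + 1) / (n_cls + len(domains[attr]))
        scores[cls] = score
    norm = sum(scores.values())
    return {cls: s / norm for cls, s in scores.items()}


# Hypothetical training cases with discrete attributes only.
cases = [
    {"AgeBucket": "young", "Cars": "0", "BikeBuyer": "Yes"},
    {"AgeBucket": "young", "Cars": "1", "BikeBuyer": "Yes"},
    {"AgeBucket": "old", "Cars": "2", "BikeBuyer": "No"},
    {"AgeBucket": "old", "Cars": "0", "BikeBuyer": "Yes"},
    {"AgeBucket": "young", "Cars": "2", "BikeBuyer": "No"},
    {"AgeBucket": "old", "Cars": "2", "BikeBuyer": "No"},
]
model = train_naive_bayes(cases, "BikeBuyer")
probs = predict(model, {"AgeBucket": "young", "Cars": "0"})
```

A continuous Age value would almost never repeat exactly, so its counts would be meaningless; a discretized age bucket behaves like any other categorical attribute.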

 Task 9: Deploy the Analysis Services Solution
1. Select the Build | Deploy Solution menu item.
                          An Output window will open showing you deployment progress, as shown in
                          Figure 10. The deployment progress is also shown in the Deployment Progress
                          pane on the right hand side of the Business Intelligence Development Studio, as
                          in Figure 11. The Deployment Progress pane can give you more detailed
                          information about what happens during deployment than what you see in the
                          Output window.




                          Figure 10: The Output Window during Deployment




                          Figure 11: The Deployment Progress Pane


Note Analysis Services may take quite a while to process the data mining
models.

2. Once deployment is complete (as indicated by the Output window as in
   Figure 12), close the Output and Deployment Progress windows by clicking
   the X in the upper-right corner of each window.




Figure 12: Successful Deployment

Note The Analysis Services project is automatically saved when you click
Deploy Solution.

In the above procedures, various wizards and editors have been creating XML
code based on your input. Deployment sends the XML code to the Analysis
Server and then processes the Analysis Services database.

 Task 10: View the Customers DT Mining Model Decision Tree
1. In the View pane at the top of the designer window, click the Mining
   Model Viewer tab.

Note If an alert appears indicating that changes have been made, click No.

2. In the Mining Model drop-down list, select Customers DT.
3. Press Shift+Alt+Enter to view the designer window full screen. (You can
   press Shift+Alt+Enter again to return to normal view later.)

Tip If accidentally closed, the Mining Model Viewer of the Mining Model
Designer can be re-opened. Select the View | Solution Explorer menu item. In
the Solution Explorer window, under the Mining Models folder, right-click
Customers.dmm, and then select Browse from the context menu.

4. In the Tree drop-down list, make sure Bike Buyer is selected; Figure 13
   shows the result.




                          Figure 13: Browsing the Mining Model
                          5. In the lower-right corner of the Mining Model Viewer, click and hold on the
                             small + icon. The mouse pointer will change to a cross-arrow icon and the
                             Navigation window will appear. You may drag the mouse to navigate
                             within the Mining Model Viewer.

                          Tip The Node Legend window on the right side of the display may be
                          relocated and resized to improve the display of the decision tree. If you
                          accidentally close the Node Legend window, select the Mining Model tab and
                          then reselect the Mining Model Viewer tab, and the Node Legend window will
                          re-appear when the viewer is redisplayed.

                          6. On the Show Level slider control, drag the pointer to the left so that only
                             one level of the decision tree is displayed.
                          7. Click the All node.
                          The All node contains a histogram with red representing bike buyers and yellow
                          representing non-bike buyers.
                          Information about all customers is displayed in the Node Legend window.
                          Notice that 49.39% of the 18484 customers are bike buyers. (You may need to
                          widen the Node Legend window in order to be able to see the percentages.)
                          8. On the Show Level slider control, drag the pointer to the right so that two
                             levels of the decision tree are displayed.
                          Number of cars owned is most predictive of a customer's bike buying behavior.
                          9. Click on each node of level 2. The Node Legend window will display
                             detailed information for each node.
                          10. In the Background drop-down list, click Yes.
                          The shade of each node indicates the concentration of the value in the
                          Background drop-down list. Expand and contract nodes in the diagram in order
                          to investigate the predicting factors for each group.
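
As a rough illustration of what "most predictive" means for the first split of a decision tree, the following Python sketch ranks candidate attributes by information gain (reduction in entropy of the Bike Buyer value). Information gain is used here as a stand-in; the scoring method that the Microsoft Decision Trees algorithm actually applies may differ. The attribute names and cases are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(cases, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting."""
    base = entropy([case[target] for case in cases])
    groups = {}
    for case in cases:
        groups.setdefault(case[attribute], []).append(case[target])
    remainder = sum(len(g) / len(cases) * entropy(g) for g in groups.values())
    return base - remainder


def best_split(cases, attributes, target):
    """Pick the attribute whose split yields the highest information gain."""
    return max(attributes, key=lambda a: information_gain(cases, a, target))


# Hypothetical cases; not taken from the lab database.
cases = [
    {"Cars": 0, "Region": "Europe", "BikeBuyer": "Yes"},
    {"Cars": 0, "Region": "Pacific", "BikeBuyer": "Yes"},
    {"Cars": 1, "Region": "Europe", "BikeBuyer": "Yes"},
    {"Cars": 2, "Region": "Pacific", "BikeBuyer": "No"},
    {"Cars": 3, "Region": "Europe", "BikeBuyer": "No"},
    {"Cars": 4, "Region": "Pacific", "BikeBuyer": "No"},
]
root_attribute = best_split(cases, ["Cars", "Region"], "BikeBuyer")
```

In this toy data the Cars attribute separates buyers from non-buyers far better than Region, so it would be chosen for the first level of the tree.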

                           Task 11: View the Customers DT Mining Model Dependency Network
1. Within the designer, click the Dependency Network tab.
The Dependency Network viewer displays the strength of the relationships
between the attributes in a decision tree model.
2. On the Links slider control, drag the pointer to the bottom.
3. In the Dependency Network diagram, click the Bike Buyer node.
The color of each node indicates that attribute's relationship to the Bike Buyer
attribute.
4. On the Links slider control, slowly drag the pointer up to the top. As you
   drag the pointer upward the relationships within the data are displayed, as
   shown in Figure 14.




Figure 14: View Strength of Relationships

 Task 12: View the Customers NB Naïve Bayes Mining Model Attribute Profiles display
1. In the Mining Model drop-down list, click Customers NB to view the
    Naïve Bayes mining model.
2. Select the Attribute Profiles tab.
3. In the Predictable drop-down list, ensure that Bike Buyer is selected.
The Attribute Profiles tab displays the other attributes that impact the state of
the predictable value selected.

 Task 13: View the Attribute Characteristics display
1. Click the Attribute Characteristics tab.
2. In the Attribute drop-down list, ensure that Bike Buyer is selected. In the
    Value drop-down list, select Yes.
The characteristics of bike buyers, ordered by their frequency, are displayed.
3. In the Value drop-down list, select No.
                          Notice that the characteristics of non-bike buyers differ from the
                          characteristics of bike buyers.

                           Task 14: View the Attribute Discrimination display
                          1. Click the Attribute Discrimination tab.
                          2. In the Attribute drop-down list, ensure that Bike Buyer is selected.
                          3. In the Value 1 drop-down list, select Yes.
                          4. In the Value 2 drop-down list, select No.
                          The attribute values that impact a customer's bike buying decision are
                          displayed. The attribute values are ordered by how strongly they favor bike
                          buyers or non-bike buyers.

                           Task 15: View the Dependency Network
                          1. Click the Dependency Network tab.
                          2. On the Links slider control, drag the pointer to the bottom.
                          3. In the Dependency Network diagram, click the Bike Buyer node.
                          The color of each node indicates that attribute's relationship to the Bike Buyer
                          attribute.
                          4. On the Links slider control, slowly drag the pointer up to the top. As you
                             drag the pointer upward, the relationships within the data are displayed.

                           Task 16: Close the Analysis Services Project
                          1. Select File | Close Solution. If prompted to save changes, select Yes.
                          2. Select File | Exit.



Exercise 2:
Viewing Mining Accuracy Charts
Overview       The management team at Adventure Works wants to determine the accuracy of
               their data mining models. Using a validation data set that was held out of the
               training set, they create Mining Accuracy Charts to visually identify which
               model is performing most accurately.
               In this exercise, you will validate the mining models created in Exercise 1 by
               using the Mining Accuracy Chart view of the Mining Structure Designer.
               To view the Mining Accuracy Chart, you will:
                  Create a prediction query by selecting an input table and mapping the
                   columns of the data mining model to the columns in the validation data set.
                  View and interpret a Lift Chart

                Task 1: Open an existing project
               1. From the Windows task bar, select Start | All Programs | Microsoft SQL
                  Server 2005 | Business Intelligence Development Studio.
               2. Select File | Open | Project/Solution.
               3. In the Open Project dialog box, navigate to the C:\SQL Labs\Lab
                  Projects\Data Mining Lab\DM Exercise 2 folder, click DM Exercise
                  2.slnbi, and then click Open.

               Important The solution used in Exercise 2 is different from the solution created
               in Exercise 1.


                Task 2: Deploy the Analysis Services Solution
               1. Select Build | Deploy Solution. The Output window will show you
                   deployment progress, as shown in Figure 1. The deployment progress is
                   also shown in the Deployment Progress pane on the right hand side of the
                   Business Intelligence Development Studio, as in Figure 2. The
                   Deployment Progress pane can give you more detailed information about
                   what happens during deployment than what you see in the Output window.

               Note Analysis Services may take quite a while to process the data mining
               models.




                          Figure 1: Output Window




                          Figure 2: Deployment Progress Pane
                          2. Once deployment is complete, close the Output window and the
                             Deployment Progress pane.

                           Task 3: Create a Prediction Query
                          1. In the Solution Explorer window, in the Mining Models folder,
                              double-click Customers.dmm.

                          Tip If the Solution Explorer window is not visible, select the View | Solution
                          Explorer menu item.

                          2. From the list of tabs above the designer window, select the Mining
                              Accuracy Chart tab. The Mining Accuracy Chart view will open,
                              displaying the Column Mapping page, as shown in Figure 3.




Figure 3: Column Mapping Page
Next, you will use the Column Mapping page to design a prediction query
that compares the mining model's predicted values to the validation data
set's actual values.
3. On the Column Mapping tab, in the Select Input Table(s) window, click
   the Select case table hyperlink.
4. In the Select Table window, make sure Customers is selected in the Data
    Source drop-down list box, click the vDMLabCustomerValidate table,
    and then click OK.
Relationships between the mining structure and the input table are
automatically created between columns with the same name. Relationships can
be added or deleted by the user.
5. In the table at the bottom of the Column Mapping page, verify that the
    Show check box is selected for both the Customers DT and Customers
    NB mining models.
6. In the Predictable Column Name column, verify that Bike Buyer is
    selected for both mining models.
In the Predictable Column Name drop-down lists, the mining model column
names are restricted to columns that have the usage type set to Predict or
Predict Only.
7. In the Predict Value column, in the drop-down list, click Yes for both
    mining models, as shown in Figure 4.




                          Figure 4: Predict Value
                          8. Click the Lift Chart tab.
                          The prediction query designed on the Column Mapping page will be executed.
                          The prediction query will return a prediction for each case in the validation data
                          set.

                          Note Analysis Services may take a while to process the prediction query.

                          The mining model will have greater confidence in its prediction for some cases
                          than for others. For each case, the prediction query will also return the
                          probability that the prediction is correct.
                          The cases are sorted by the probability that the prediction is correct, and then
                          the percentages of correct predictions are displayed on the lift chart, as in
                          Figure 5.




Figure 5: Data Mining Lift Chart
9. Point at any of the lines on the chart. A tool tip will open and display the
    mining model name, the percent of cases (Overall Population), and the
    percent of bike buyers identified (Target Population).
As you move along a line or point to a different line, the values displayed
in the tool tip will change.
10. Point at the locations where each line reaches a Target Population of 90%,
     and wait for a tool tip to open.
The example in Figure 5 shows that the Customers DT model only needs to
select 71% of the customers in order to identify 90% of the bike buyers.
The Ideal Model only needs to select 45% of the customers in order to identify
90% of the bike buyers.
The Customers NB model needs to select 85% of the customers in order to
identify 90% of the bike buyers.
The Random Guess model needs to select 90% of the customers in order to
identify 90% of the bike buyers.
Because the Customers DT mining model needs to select fewer cases in order
to identify a given percentage of bike buyers, it would be deemed more accurate
than the Customers NB model.
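
The way the lift chart is read in steps 9 and 10 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Analysis Services builds the chart: the function names and the ten sample cases are hypothetical, and each case is assumed to carry the model's probability that the customer is a bike buyer together with the actual outcome from the validation data.

```python
def lift_curve(cases):
    """Return (percent of population, percent of target found) points.

    cases is a list of (probability_of_target, is_target) pairs.
    """
    if not cases:
        raise ValueError("at least one case is required")
    # Sort so that the cases the model is most sure about come first.
    ranked = sorted(cases, key=lambda case: case[0], reverse=True)
    total_targets = sum(1 for _, is_target in ranked if is_target)
    if total_targets == 0:
        raise ValueError("the data contains no target cases")
    points = []
    found = 0
    for count, (_, is_target) in enumerate(ranked, start=1):
        if is_target:
            found += 1
        points.append((100.0 * count / len(ranked),
                       100.0 * found / total_targets))
    return points


def population_needed(points, target_percent):
    """Smallest percent of the population that reaches target_percent."""
    for population_percent, found_percent in points:
        if found_percent >= target_percent:
            return population_percent
    return 100.0


# Hypothetical validation cases: (predicted probability, actual bike buyer).
sample_cases = [
    (0.95, True), (0.90, True), (0.80, False), (0.75, True), (0.60, False),
    (0.55, True), (0.40, False), (0.30, False), (0.20, True), (0.10, False),
]
sample_points = lift_curve(sample_cases)
needed_for_60 = population_needed(sample_points, 60)
```

A model whose line reaches a given Target Population value at a smaller Overall Population value ranks the bike buyers nearer the top of the sorted list, which is the comparison made between Customers DT and Customers NB above.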

 Task 4: Close the Analysis Services Project
1. Select File | Close Solution. If prompted to save changes, select Yes.
2. Select File | Exit.



Exercise 3
Creating a Prediction Query
Overview                        The sales and marketing department at Adventure Works received a potential
                                customer list containing demographic data. The department will use the
                                Customers DT data mining model to predict the likelihood that an individual in
                                the list is a bike buyer.
                                In this exercise, you will make predictions by using the Mining Model
                                Prediction view of the Mining Structure Designer.
                                To make predictions, you will:
                                • Select an input table and map the columns of the data mining model to the
                                  columns in the prediction data set.
                                • Create a prediction query in the Prediction Query Builder Design pane.
                                • View the prediction query results.

                                 Task 1: Open an Existing Project
                                1. From the Windows task bar, select Start | All Programs | Microsoft SQL
                                   Server 2005 | Business Intelligence Development Studio.
                                2. Select File | Open | Project/Solution.
                                3. In the Open Project dialog box, navigate to the C:\SQL Labs\Lab
                                   Projects\Data Mining Lab\DM Exercise 3 folder, click DM Exercise
                                   3.slnbi, and then click Open.

                                Important The solution used in Exercise 3 is different from the solutions
                                examined in the previous exercises.


                                 Task 2: Deploy the Analysis Services Solution
                                1. Select Build | Deploy Solution. The Output window shows deployment
                                    progress, as shown in Figure 1. Progress also appears in the
                                    Deployment Progress pane on the right-hand side of Business
                                    Intelligence Development Studio, as in Figure 2. The Deployment
                                    Progress pane gives more detailed information about what happens
                                    during deployment than the Output window does.

                                Note Analysis Services may take quite a while to process the data mining
                                models.




Figure 1: Output Window




Figure 2: Deployment Progress Pane
2. Once deployment is complete, close the Output window and the
   Deployment Progress pane.

 Task 3: Create a Prediction Query using the Decision Tree Mining
   Model
1. In the Solution Explorer window, in the Mining Models folder, double-
    click Customers.dmm.

Tip If the Solution Explorer pane is not visible, select View | Solution
Explorer.

2. From the list of tabs above the designer window, select the Mining Model
    Prediction tab.
3. In the Mining Model window, click the Select model hyperlink.


                          4. In the Select Mining Model dialog box, expand Customers, select
                              Customers DT, as shown in Figure 3, and then click OK.




                          Figure 3: Select Mining Model in Mining Model Prediction View
                          5. In the Select Input Table(s) window, click the Select case table hyperlink.
                          6. In the Select Table window, select the vDMLabCustomerPredict table,
                              and then click OK.
                          Relationships between the mining structure and the input table are
                          automatically created between columns with the same name. Relationships can
                          be added or deleted by the user.
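
The idea of matching by name can be sketched as follows. This is a simplified illustration, not the designer's actual logic: the function name and the column lists are hypothetical, and ignoring letter case when comparing names is an assumption.

```python
def auto_map_columns(model_columns, input_columns):
    """Pair each model column with the input column of the same name."""
    # Assumption: names are compared without regard to letter case.
    input_by_name = {name.lower(): name for name in input_columns}
    return {
        column: input_by_name[column.lower()]
        for column in model_columns
        if column.lower() in input_by_name
    }


# Hypothetical column names, for illustration only.
model_columns = ["Age", "Gender", "Yearly Income", "Bike Buyer"]
input_columns = ["CustomerKey", "Age", "Gender", "YearlyIncome"]

mapping = auto_map_columns(model_columns, input_columns)
unmapped = [column for column in model_columns if column not in mapping]
```

In this sketch, Yearly Income stays unmapped because the input column is spelled differently. That is the kind of relationship you would add by hand.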

                           Task 4: Build the Decision Tree Prediction Query
                          1. Enter values into the first row of the table at the bottom of the designer as
                             shown below:
                               Column                                          Value

                               Source                                          vDMLabCustomerPredict
                               Field                                           CustomerKey
                               Show                                            (Checked)


                          Tip The columns of the table may be re-sized by dragging the dividing line
                          between the column headings.

                          2. Enter values into the second row of the table as shown below:
                               Column                                          Value

                               Source                                          Customers DT mining model
                               Field                                           Bike Buyer
                               Show                                            (Checked)


Note The value in the Source column will change from Customers DT mining
model to Customers DT.

3. Enter values into the third row of the table as shown below:
      Column                                         Value

      Source                                         Prediction Function
      Field                                          PredictProbability
      Alias                                          Confidence
      Show                                           (Checked)
      Criteria/Argument                              [Customers DT].[Bike Buyer]

After you enter the values from the tables above, the designer should look
like Figure 4.




Figure 4: Values for the Decision Tree Prediction Query
4. Select Mining Model | Query to view the SQL view of the query.

Tip You can also do this by clicking the down arrow next to the Switch to
query result view icon in the upper left corner of the Mining Model Prediction
view, as in Figure 5, and then clicking Query.




Figure 5: Switch to SQL Query View
In the bottom pane of the Mining Model Prediction view, the text of the
prediction query is displayed.
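
To give a rough idea of what that text looks like, the sketch below assembles a prediction query from the three rows entered in Task 4. It is an approximation, not the exact text the designer generates: the helper function, the Customers data source name, and the Age and Gender mapped columns are assumptions made for illustration. The query uses the Data Mining Extensions (DMX) PREDICTION JOIN form.

```python
def build_prediction_query(model, data_source, input_table,
                           key_column, target, alias, mapped_columns):
    """Assemble the text of a DMX prediction query as a string."""
    select_list = [
        f"t.[{key_column}]",                                       # row 1
        f"[{model}].[{target}]",                                   # row 2
        f"PredictProbability([{model}].[{target}]) AS [{alias}]",  # row 3
    ]
    source_columns = ", ".join(
        f"[{column}]" for column in [key_column] + mapped_columns)
    join_conditions = [
        f"[{model}].[{column}] = t.[{column}]" for column in mapped_columns]
    lines = [
        "SELECT",
        "  " + ",\n  ".join(select_list),
        f"FROM [{model}]",
        "PREDICTION JOIN",
        f"  OPENQUERY([{data_source}], "
        f"'SELECT {source_columns} FROM [{input_table}]') AS t",
        "ON",
        "  " + " AND\n  ".join(join_conditions),
    ]
    return "\n".join(lines)


query = build_prediction_query(
    model="Customers DT",
    data_source="Customers",          # assumed data source name
    input_table="vDMLabCustomerPredict",
    key_column="CustomerKey",
    target="Bike Buyer",
    alias="Confidence",
    mapped_columns=["Age", "Gender"],  # hypothetical mapped columns
)
print(query)
```

Each of the three rows in the designer grid becomes one item in the SELECT list, and each column relationship becomes one condition in the ON clause.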

 Task 5: Display the Decision Tree Query Result
1. Select Mining Model | Result to view the result of the decision tree
    query.

                          Tip As with the SQL view, you can also view the result by clicking the down
                          arrow next to the Switch to query result view icon in the upper left corner
                          of the Mining Model Prediction view, as in Figure 5, and then clicking Result.

                          The results of the prediction query are displayed.
                          The CustomerKey column identifies each record from the input table.
                          The Bike Buyer column contains the mining model's prediction of the
                          customer's bike buying behavior.
                          Larger values in the Confidence column mean the mining model has more
                          confidence in the prediction contained in the Bike Buyer column.
                          By contacting potential customers predicted to be bike buyers with a high
                          Confidence value, Adventure Works can use the results of the prediction query
                          to promote their bikes to those individuals most likely to be bike buyers.
                          Adventure Works marketing expenses will be reduced because potential
                          customers who are not likely to be bike buyers will not be contacted.
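
Once the results are exported, choosing whom to contact could be done with a short script such as the sketch below. The function name, the sample rows, and the 0.7 threshold are hypothetical. It also assumes that the Bike Buyer column holds Yes or No values.

```python
def select_contacts(rows, min_confidence):
    """Keep predicted bike buyers at or above min_confidence, best first."""
    chosen = [
        row for row in rows
        if row["Bike Buyer"] == "Yes" and row["Confidence"] >= min_confidence
    ]
    return sorted(chosen, key=lambda row: row["Confidence"], reverse=True)


# Hypothetical prediction query results, for illustration only.
results = [
    {"CustomerKey": 101, "Bike Buyer": "Yes", "Confidence": 0.91},
    {"CustomerKey": 102, "Bike Buyer": "No", "Confidence": 0.88},
    {"CustomerKey": 103, "Bike Buyer": "Yes", "Confidence": 0.55},
    {"CustomerKey": 104, "Bike Buyer": "Yes", "Confidence": 0.76},
]

contacts = select_contacts(results, min_confidence=0.7)
contact_keys = [row["CustomerKey"] for row in contacts]
```

Raising the threshold shortens the contact list and lowers marketing cost, at the price of skipping some customers who would have bought.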

                           Task 6: Close the Analysis Services Project
                          1. Select File | Close Solution. If prompted to save changes, select Yes.
                          2. Select File | Exit.

FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 

Hands-On Lab Data Mining - SQL Server

  • 1. Hands-On Lab Lab Manual SQL Server™ 2005: Data Mining
  • 2. Information in this document is subject to change without notice. The example companies, organizations, products, people, and events depicted herein are fictitious. No association with any real company, organization, product, person or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation. Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property. 2009 Microsoft Corporation. All rights reserved. Microsoft, MS-DOS, MS, Windows, Windows NT, MSDN, Active Directory, BizTalk, SQL Server, SharePoint, Outlook, PowerPoint, FrontPage, Visual Basic, Visual C++, Visual J++, Visual InterDev, Visual SourceSafe, Visual C#, Visual J#, and Visual Studio are either registered trademarks or trademarks of Microsoft Corporation in the U.S.A. and/or other countries. Other product and company names herein may be the trademarks of their respective owners.
  • 3. Data Mining. Objectives: After completing this lab, you will be able to: create Decision Tree and Naïve Bayes data mining models; view Mining Accuracy Charts; create a prediction query; model time series. Note: This lab focuses on the concepts in this module and, as a result, may not comply with Microsoft security recommendations. Note: The SQL Server 2005 labs are based on beta builds of the product. The intent of these labs is to provide you with a general feel for some of the planned features for the next release of SQL Server. As with all software development projects, the final version may differ from beta builds in both features and user interface. For the latest details on SQL Server 2005, please visit http://www.microsoft.com/sql/2005/. Estimated time to complete this lab: 60 minutes.
  • 4. Exercise 0: Lab Setup. In this part of the lab you will set up the views you will work with in the rest of the lab. Task 1: Log in. Log on with user name Administrator and password pass@word1. Task 2: Create the Views. 1. From the Windows taskbar, select Start | All Programs | Microsoft SQL Server 2005 | SQL Server Management Studio. 2. In the Connect to Server dialog, make sure that SQL Server is selected in the Server type drop-down list. Enter localhost in the Server name text box and select Windows Authentication in the Authentication drop-down list, as in Figure 1. Click Connect. Figure 1: Connect to Server Dialog. 3. Select File | Open | File. 4. Navigate to the C:\SQL Labs\Lab Projects\Data Mining Lab\DM Setup directory, and select the ViewCreation.sql file. Click Open. 5. Click Connect in the Connect to Server dialog. 6. Execute the script by pressing F5, or by clicking the Execute icon in the toolbar, as shown in Figure 2.
  • 5. Figure 2: Execute Script. 7. When the script has executed successfully, select the File | Exit menu item to close SQL Server Management Studio. Exercise 1: Creating Decision Tree and Naïve Bayes Data Mining Models. Overview: The management at Adventure Works wants to analyze purchasing decisions based on customer demographics. Analysis Services has improved data mining functionality, providing the following data mining techniques: Microsoft Association Rules, Microsoft Clustering, Microsoft Decision Trees, Microsoft Naïve Bayes, Microsoft Neural Network, Microsoft Sequence Clustering, and Microsoft Time Series. In this exercise, you will develop an Analysis Services solution using the Microsoft Business Intelligence Development Studio environment. Business Intelligence Development Studio is based on the Microsoft Visual Studio 2005 environment and provides you with an integrated development environment for designing, testing, editing, and deploying projects to the Analysis Server. You will create and view a data mining structure with Decision Trees and Naïve Bayes data mining models using AdventureWorksDW customer data. To create and view data mining models, you will:
  • 6. Create an Analysis Services project in the Business Intelligence Development Studio environment; create a data source and data source view; create a data mining structure and Decision Trees data mining model using the Mining Model Wizard; create a related mining model (Naïve Bayes) in the Mining Models view; deploy the Analysis Services solution; explore the data mining models using the Mining Model Viewer. Task 1: Create an Analysis Services Project. 1. From the Windows taskbar, select Start | All Programs | Microsoft SQL Server 2005 | Business Intelligence Development Studio. 2. Select File | New | Project. 3. In the New Project dialog box, in the Project Types pane, click the Business Intelligence Projects folder. 4. In the Templates pane, click the Analysis Services Project icon. 5. In the Name text box, type DM Exercise 1. 6. In the Location text box, enter C:\SQL Labs\User Projects. 7. Clear the Create directory for Solution check box. Figure 1 shows how the New Project dialog box should look once you're done. 8. Click OK. Figure 1: New Project Dialog. The project is created in a new solution; the solution is the largest unit of management in the Business Intelligence Development Studio environment. Each solution contains one or more projects. An Analysis Services Project is a group of related files containing the XML code for all of the objects in an Analysis Services database. You can view the solution and its projects in the Solution Explorer pane on the right-hand side of Business Intelligence Development Studio. If the Solution
  • 7. Explorer is not visible, you can view it by selecting the View | Solution Explorer menu item (or the keyboard shortcut Ctrl + Alt + L). Task 2: Set the Deployment Mode Property. 1. In the Solution Explorer window, right-click the DM Exercise 1 project, and select Properties from the context menu. 2. In the DM Exercise 1 Property Pages dialog box, under the Configuration Properties folder, click Deployment. 3. In the right pane, click the Deployment Mode property. In the Deployment Mode drop-down list, click DeployAll, and then click OK. You can configure the build, debugging, and deployment properties of an Analysis Services project. Task 3: Create a Data Source. 1. In the Solution Explorer pane, under the DM Exercise 1 project, right-click the Data Sources folder, and then select New Data Source from the context menu. 2. In the Data Source Wizard dialog box, on the Welcome to the Data Source Wizard page, click Next. Tip: If the Data connections pane already includes (local).AdventureWorksDW, skip to step 14. 3. On the Select how to define the connection page, make sure the Create a data source based on an existing or new connection radio button is selected. Click New…. 4. In the Connection Manager dialog box, verify that Native OLE DB is selected in the Provider drop-down list at the top of the page. 5. Still in the Connection Manager, under the header "1. Select an OLE DB provider", ensure that Microsoft OLE DB Provider for SQL Server is selected. 6. Under the header "2. Specify the following to connect to SQL Server", in the Select or enter a server name drop-down list, type (local). Important: Type (local); do not type localhost. 7. Under Enter information to log on to the server, click Use Windows NT Integrated Security. 8. Click Select the database on the server, and in the Select the database on the server drop-down list, click AdventureWorksDW. 9. Click the Data Links… button. This brings up the Data Link Properties dialog with (local), Use Windows NT Integrated Security, and AdventureWorksDW already selected. 10. Click Test Connection. 11. In the Microsoft Data Link dialog box, click OK. 12. In the Data Link Properties dialog box, click OK. 13. In the Connection Manager dialog box, click OK.
  • 8. 14. In the Data Source Wizard dialog box, on the Select how to define the connection page, verify that (local).AdventureWorksDW is selected, and click Next. 15. On the Completing the Wizard page, leave the default Data source name Adventure Works DW unchanged, and then click Finish. You have now set up the information on how to connect to the database you are working with. It is now time to define the schema information you want to use in the solution. You do this through the Data Source View. Task 4: Create a Data Source View. 1. In the Solution Explorer pane, under the DM Exercise 1 folder, right-click the Data Source Views folder, and then select New Data Source View from the context menu. 2. In the Data Source View Wizard dialog box, on the Welcome to the Data Source View Wizard page, click Next. 3. On the Select Data Source page, in the Relational data sources pane, verify that Adventure Works DW is selected, and then click Next. Note: At this point, Analysis Services may take a few moments to read the database schema. 4. In this project, your Data Source View is not going to be based on a table; instead, it will be based on a view. On the Select Tables and Views page, double-click dbo.vDMLabCustomerTrain to add this view to the Included objects list. Note: You may need to expand the Name column, and/or the entire dialog box, in order to be able to select dbo.vDMLabCustomerTrain. 5. Click Next. 6. On the Completing the Wizard page, in the Name text box, type Customers and then click Finish. The Data Source View Designer will open. The Data Source View Designer is a graphical representation of the data schema you have defined. 7. Right-click the vDMLabCustomerTrain table and then click Explore Data, as in Figure 2.
  • 9. Figure 2: Explore Data. Note: Analysis Services may take a few moments to read the data. 8. This opens a new window in which you can view the data for the table/view. If you like, you can move the window, or even dock it. 9. In the Explore vDMLabCustomerTrain Table window, scroll to view the data, and then click the X in the upper-right corner, as in Figure 3, to close the window. Figure 3: Explore Table Window. A Data Source View contains data source schema information. As shown here, you do not have to base the Data Source View on tables; you can use views as well. Task 5: Create a Data Mining Structure. 1. In the Solution Explorer pane, under the DM Exercise 1 folder, right-click the Mining Models folder, and then select New Mining Model from the context menu. 2. In the Data Mining Wizard, on the Welcome to the Data Mining Wizard page, click Next. The Mining Model Wizard is the starting point for all data mining operations. 3. On the Select the Definition Method page, click From existing relational database or data warehouse and then click Next. 4. On the Select the Data Mining Technique page, in the Which data mining technique do you want to use? drop-down list, verify that Microsoft Decision Trees is selected, and then click Next. 5. On the Select Data Source View page, in the Available data source views pane, verify that the Customers data source view is selected, and then click Next.
  • 10. 6. On the Specify Table Types page, in the Input tables pane, in the vDMLabCustomerTrain row, verify that the Case check box is selected, and then click Next. 7. On the Specify the Training Data page, in the Mining model structure pane, select or clear each check box as shown in Figure 4. Figure 4: Specifying Columns for Analysis. Because CustomerKey is the primary key of the source table, the Data Mining Wizard has automatically selected it as the key. The key identifies the cases in the mining model. Important: The CustomerKey, FirstName, and LastName columns should not be selected as Input or Predictable columns. 8. Click Next. 9. On the Specify Columns' Content and Data Type page, click Next. 10. On the Completing the Wizard page, in the Mining Structure Name text box, type Customers, and then click Finish. The Mining Structure designer will open as in Figure 5.
  • 11. Figure 5: The Mining Structure. A data mining structure may contain multiple data mining models. Each data mining model uses a subset of the data referenced by the data mining structure. When the data mining structure is processed, the source data is queried once and then all of the data mining models are processed in parallel. Task 6: Add and Edit Columns in the Mining Structure. 1. In the Mining Structure tree view on the left side of the designer window, right-click Columns, and then click Add a Column. 2. In the Select a Column dialog box, in the Source column tree view, select the Age column, and then click OK. 3. An alert will appear indicating that you already have an Age column selected. Click Yes to approve and dismiss the dialog box. 4. In the Mining Structure tree view, right-click the Age 1 column, and then select Properties from the context menu. 5. In the Properties window, in the Content property drop-down list, select Discretized. When the Content property is set to Discretized, the server automatically determines discrete ranges for the column. 6. In the Properties window, in the Name property text box, type Age Discretized, and then press <Enter>. 7. An alert will appear confirming that you want to change the name for all related columns. Click Yes to approve and dismiss the dialog box. Task 7: Rename the Mining Model. 1. In the View pane at the top of the designer, select the Mining Models tab to view information about the model, as in Figure 6.
  • 12. Figure 6: The Mining Models View. 2. In the Mining Models grid, right-click the Customers column heading, and then click Properties. 3. In the Properties window, in the Name property text box, type Customers DT to rename the mining model, and then press <Enter>. Note: Step 3 renames the Decision Tree mining model, but does not rename the mining model structure. Task 8: Create a Related Mining Model. 1. Click the Create a Related Mining Model icon on the Mining Models icon bar, as shown in Figure 7. Figure 7: The Create a Related Mining Model Icon. 2. In the New Mining Model dialog box, in the Model Name text box, type Customers NB. 3. In the Algorithm Name drop-down list, click Microsoft Naïve Bayes, and then click OK.
  • 13. 4. When the alert appears confirming that you want to use the Microsoft Naïve Bayes algorithm and that some columns will be ignored, click Yes to approve and dismiss the dialog box. The Naïve Bayes algorithm does not support continuous columns. Therefore, the Age column will be ignored in this mining model. Instead, you will use the Age Discretized column. 5. Click the Age Discretized cell in the Customers NB column (the content is currently Ignore); in the cell drop-down list, select Input, as in Figure 8. Figure 8: Changing Usage of a Mining Model Column. 6. You should now have an end result as shown in Figure 9. Figure 9: The Customers Mining Model. Task 9: Deploy the Analysis Services Solution. 1. Select the Build | Deploy Solution menu item.
  • 14. An Output window will open showing you deployment progress, as shown in Figure 10. The deployment progress is also shown in the Deployment Progress pane on the right-hand side of Business Intelligence Development Studio, as in Figure 11. The Deployment Progress pane gives more detailed information about what happens during deployment than the Output window does. Figure 10: The Output Window during Deployment. Figure 11: The Deployment Progress Pane.
  • 15. Note: Analysis Services may take quite a while to process the data mining models. 2. Once deployment is complete (as indicated by the Output window, as in Figure 12), close the Output and Deployment Progress windows by clicking the X in the upper-right corner of each window. Figure 12: Successful Deployment. Note: The Analysis Services project is automatically saved when you click Deploy Solution. In the above procedures, various wizards and editors have been creating XML code based on your input. Deployment sends the XML code to the Analysis Server and then processes the Analysis Services database. Task 10: View the Customers DT Mining Model Decision Tree. 1. In the View pane at the top of the designer window, click the Mining Model Viewer tab. Note: If an alert appears indicating that changes have been made, click No. 2. In the Mining Model drop-down list, select Customers DT. 3. Press Shift+Alt+Enter to view the designer window full screen. (You can press Shift+Alt+Enter again to return to normal view later.) Tip: If you accidentally close the Mining Model Viewer of the Mining Model Designer, you can re-open it. Select the View | Solution Explorer menu item. In the Solution Explorer window, under the Mining Models folder, right-click Customers.dmm, and then select Browse from the context menu. 4. In the Tree drop-down list, make sure Bike Buyer is selected; Figure 13 shows the result.
  • 16. Figure 13: Browsing the Mining Model. 5. In the lower-right corner of the Mining Model Viewer, click and hold the small + icon. The mouse pointer will change to a cross-arrow icon and the Navigation window will appear. You may drag the mouse to navigate within the Mining Model Viewer. Tip: The Node Legend window on the right side of the display may be relocated and resized to improve the display of the decision tree. If you accidentally close the Node Legend window, select the Mining Model tab and then reselect the Mining Model Viewer tab, and the Node Legend window will re-appear when the viewer is redisplayed. 6. On the Show Level slider control, drag the pointer to the left so that only one level of the decision tree is displayed. 7. Click the All node. The All node contains a histogram with red representing bike buyers and yellow representing non-bike buyers. Information about all customers is displayed in the Node Legend window. Notice that 49.39% of the 18,484 customers are bike buyers. (You may need to widen the Node Legend window in order to see the percentages.) 8. On the Show Level slider control, drag the pointer to the right so that two levels of the decision tree are displayed. The number of cars owned is the attribute most predictive of a customer's bike buying behavior. 9. Click each node of level 2. The Node Legend window will display detailed information for each node. 10. In the Background drop-down list, click Yes. The shade of each node indicates the concentration of the value selected in the Background drop-down list. Expand and contract nodes in the diagram in order to investigate the predicting factors for each group. Task 11: View the Customers DT Mining Model Dependency Network.
  • 17. 1. Within the designer, click the Dependency Network tab. The Dependency Network viewer displays the strength of the relationships between the attributes in a decision tree model. 2. On the Links slider control, drag the pointer to the bottom. 3. In the Dependency Network diagram, click the Bike Buyer node. The color of each node indicates that attribute's relationship to the Bike Buyer attribute. 4. On the Links slider control, slowly drag the pointer up to the top. As you drag the pointer upward, the relationships within the data are displayed, as shown in Figure 14. Figure 14: View Strength of Relationships. Task 12: View the Customers NB Naïve Bayes Mining Model Attribute Profiles Display. 1. In the Mining Model drop-down list, click Customers NB to view the Naïve Bayes mining model. 2. Select the Attribute Profiles tab. 3. In the Predictable drop-down list, ensure that Bike Buyer is selected. The Attribute Profiles tab displays the other attributes that impact the state of the selected predictable value. Task 13: View the Attribute Characteristics Display. 1. Click the Attribute Characteristics tab. 2. In the Attribute drop-down list, ensure that Bike Buyer is selected. In the Value drop-down list, select Yes. The characteristics of bike buyers, ordered by their frequency, are displayed. 3. In the Value drop-down list, select No.
  • 18. Notice that the characteristics of non-bike buyers are different from the characteristics of bike buyers. Task 14: View the Attribute Discrimination Display. 1. Click the Attribute Discrimination tab. 2. In the Attribute drop-down list, ensure that Bike Buyer is selected. 3. In the Value 1 drop-down list, select Yes. 4. In the Value 2 drop-down list, select No. The attribute values that impact a customer's bike buying decision are displayed. The attribute values are ordered by how strongly they favor bike buyers or non-bike buyers. Task 15: View the Dependency Network. 1. Click the Dependency Network tab. 2. On the Links slider control, drag the pointer to the bottom. 3. In the Dependency Network diagram, click the Bike Buyer node. The color of each node indicates that attribute's relationship to the Bike Buyer attribute. 4. On the Links slider control, slowly drag the pointer up to the top. As you drag the pointer upward, the relationships within the data are displayed. Task 16: Close the Analysis Services Project. 1. Select File | Close Solution. If prompted to save changes, select Yes. 2. Select File | Exit.
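Exercise 1 relies on two ideas that are easy to lose behind the designer: a continuous column such as Age has to be turned into discrete ranges before Naïve Bayes can use it, and the Naïve Bayes model then scores each predictable state from per-attribute value counts. The Python sketch below illustrates both ideas on a tiny made-up data set. It is not the Microsoft implementation: equal-frequency bucketing, Laplace smoothing, the column names `AgeBucket` and `BikeBuyer`, and all sample values are assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def equal_frequency_edges(values, buckets):
    """Return bucket boundaries so each bucket holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(b * n) // buckets] for b in range(1, buckets)]


def bucket_of(value, edges):
    """Return the bucket index; a value equal to an edge falls in the higher bucket."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def train_naive_bayes(rows, target):
    """Count class frequencies and, per class, the frequency of each attribute value."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> Counter of values
    domains = defaultdict(set)           # attribute -> set of observed values
    for row in rows:
        for attr, val in row.items():
            if attr == target:
                continue
            value_counts[(attr, row[target])][val] += 1
            domains[attr].add(val)
    return class_counts, value_counts, domains


def predict_probabilities(model, case):
    """Return a normalized probability for each class, using Laplace smoothing."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    log_scores = {}
    for cls, n in class_counts.items():
        score = math.log(n / total)
        for attr, val in case.items():
            count = value_counts[(attr, cls)][val]
            score += math.log((count + 1) / (n + len(domains[attr])))
        log_scores[cls] = score
    top = max(log_scores.values())
    weights = {cls: math.exp(s - top) for cls, s in log_scores.items()}
    norm = sum(weights.values())
    return {cls: w / norm for cls, w in weights.items()}


# Hypothetical sample data: customers under 40 are labelled as bike buyers.
ages = [20, 25, 30, 35, 40, 45, 50, 55, 60]
edges = equal_frequency_edges(ages, 3)
rows = [
    {"AgeBucket": bucket_of(age, edges), "BikeBuyer": "Yes" if age < 40 else "No"}
    for age in ages
]
model = train_naive_bayes(rows, "BikeBuyer")
probabilities = predict_probabilities(model, {"AgeBucket": bucket_of(22, edges)})
```

In this toy data the youngest bucket contains only buyers, so `probabilities["Yes"]` comes out higher than `probabilities["No"]`. The server chooses its own discretization method and bucket count, so the ranges shown in the lab's viewers will not match what this sketch produces.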
  • 19. Exercise 2: Viewing Mining Accuracy Charts. Overview: The management team at Adventure Works wants to determine the accuracy of their data mining models. Using a validation data set that was held out of the training set, they create Mining Accuracy Charts to visually identify which model is performing most accurately. In this exercise, you will validate the mining models created in Exercise 1 by using the Mining Accuracy Chart view of the Mining Structure Designer. To view the Mining Accuracy Chart, you will: create a prediction query by selecting an input table and mapping the columns of the data mining model to the columns in the validation data set; view and interpret a Lift Chart. Task 1: Open an Existing Project. 1. From the Windows taskbar, select Start | All Programs | Microsoft SQL Server 2005 | Business Intelligence Development Studio. 2. Select File | Open | Project/Solution. 3. In the Open Project dialog box, navigate to the C:\SQL Labs\Lab Projects\Data Mining Lab\DM Exercise 2 folder, click DM Exercise 2.slnbi, and then click Open. Important: The solution used in Exercise 2 is different from the solution created in Exercise 1. Task 2: Deploy the Analysis Services Solution. 1. Select Build | Deploy Solution. The Output window will show you deployment progress, as shown in Figure 1. The deployment progress is also shown in the Deployment Progress pane on the right-hand side of Business Intelligence Development Studio, as in Figure 2. The Deployment Progress pane gives more detailed information about what happens during deployment than the Output window does. Note: Analysis Services may take quite a while to process the data mining models.
  • 20. Figure 1: Output Window. Figure 2: Deployment Progress Pane. 2. Once deployment is complete, close the Output window and the Deployment Progress pane. Task 3: Create a Prediction Query. 1. In the Solution Explorer window, in the Mining Models folder, double-click Customers.dmm. Tip: If the Solution Explorer window is not visible, select the View | Solution Explorer menu item. 2. From the list of tabs above the designer window, select the Mining Accuracy Chart tab. The Mining Accuracy Chart view will open, displaying the Column Mapping page, as shown in Figure 3.
  • 21. Figure 3: Column Mapping Page. Next, you will use the Column Mapping page to design a prediction query that will be executed in order to compare the mining model's predicted values to the validation data set's actual values. 3. On the Column Mapping tab, in the Select Input Table(s) window, click the Select case table hyperlink. 4. In the Select Table window, make sure Customers is selected in the Data Source drop-down list, click the vDMLabCustomerValidate table, and then click OK. Relationships between the mining structure and the input table are automatically created between columns with the same name. Relationships can be added or deleted by the user. 5. In the table at the bottom of the Column Mapping page, verify that the Show check box is selected for both the Customers DT and Customers NB mining models. 6. In the Predictable Column Name column, verify that Bike Buyer is selected for both mining models. In the Predictable Column Name drop-down lists, the mining model column names are restricted to columns that have the usage type set to Predict or Predict Only. 7. In the Predict Value column, in the drop-down list, click Yes for both mining models, as shown in Figure 4.
  • 22. Figure 4: Predict Value. 8. Click the Lift Chart tab. The prediction query designed on the Column Mapping page will be executed. The prediction query will return a prediction for each case in the validation data set. Note: Analysis Services may take a while to process the prediction query. The mining model will have greater confidence in its prediction for some cases than for others. For each case, the prediction query will also return the probability that the prediction is correct. The cases are sorted by the probability that the prediction is correct, and then the percentages of correct predictions are displayed on the lift chart, as in Figure 5.
  • 23. Figure 5: Data Mining Lift Chart. 9. Point at any of the lines on the chart. A tool tip will open and display the mining model name, the percent of cases (Overall Population), and the percent of bike buyers identified (Target Population). As you move along different points on a line or point to a different line, the values displayed in the tool tip will change. 10. Point at the locations where each line reaches a Target Population of 90%, and wait for a tool tip to open. The example in Figure 5 shows that the Customers DT model only needs to select 71% of the customers in order to identify 90% of the bike buyers. The Ideal Model only needs to select 45% of the customers in order to identify 90% of the bike buyers. The Customers NB model needs to select 85% of the customers in order to identify 90% of the bike buyers. The Random Guess model needs to select 90% of the customers in order to identify 90% of the bike buyers. Because the Customers DT mining model needs to select fewer cases in order to identify a given percentage of bike buyers, it would be deemed more accurate than the Customers NB model. Task 4: Close the Analysis Services Project. 1. Select File | Close Solution. If prompted to save changes, select Yes. 2. Select File | Exit.
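The lift chart read in Exercise 2 can be reproduced by hand from prediction results. The Python sketch below assumes each validation case comes with the model's predicted probability of the target value Yes and the actual Bike Buyer value; it ranks the cases by that probability and accumulates the share of actual buyers found so far. The function names and the sample cases are hypothetical, and this is an illustration of the general technique rather than the exact computation the Mining Accuracy Chart performs.

```python
def lift_curve(cases, target="Yes"):
    """Build lift chart points from (probability_of_target, actual_value) pairs.

    Returns a list of (overall_population_pct, target_population_pct) tuples,
    one per case, after ranking cases from highest to lowest probability.
    """
    ranked = sorted(cases, key=lambda case: case[0], reverse=True)
    total_targets = sum(1 for _, actual in ranked if actual == target)
    points = []
    found = 0
    for position, (_, actual) in enumerate(ranked, start=1):
        if actual == target:
            found += 1
        points.append(
            (100.0 * position / len(ranked), 100.0 * found / total_targets)
        )
    return points


def population_needed(points, target_pct):
    """Smallest overall population percentage that captures target_pct of the targets."""
    for overall_pct, captured_pct in points:
        if captured_pct >= target_pct:
            return overall_pct
    return None


# Hypothetical validation results: (predicted probability of Yes, actual value).
sample_cases = [
    (0.90, "Yes"), (0.80, "Yes"), (0.70, "No"), (0.60, "Yes"), (0.40, "No"),
    (0.30, "No"), (0.20, "Yes"), (0.10, "No"), (0.05, "No"), (0.01, "No"),
]
sample_points = lift_curve(sample_cases)
needed_for_75 = population_needed(sample_points, 75.0)
```

A model whose curve reaches a given Target Population percentage at a smaller Overall Population percentage ranks buyers more effectively, which is the comparison the lab makes between Customers DT and Customers NB at the 90% mark.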
  • 24. Exercise 3: Creating a Prediction Query. Overview: The sales and marketing department at Adventure Works received a potential customer list containing demographic data. The department will use the Customers DT data mining model to predict the likelihood that an individual in the list is a bike buyer. In this exercise, you will make predictions by using the Mining Model Prediction view of the Mining Structure Designer. To make predictions, you will: select an input table and map the columns of the data mining model to the columns in the prediction data set; create a prediction query in the Prediction Query Builder Design pane; view the prediction query results. Task 1: Open an Existing Project. 1. From the Windows taskbar, select Start | All Programs | Microsoft SQL Server 2005 | Business Intelligence Development Studio. 2. Select File | Open | Project/Solution. 3. In the Open Project dialog box, navigate to the C:\SQL Labs\Lab Projects\Data Mining Lab\DM Exercise 3 folder, click DM Exercise 3.slnbi, and then click Open. Important: The solution used in Exercise 3 is different from the solutions examined in the previous exercises. Task 2: Deploy the Analysis Services Solution. 1. Select Build | Deploy Solution. The Output window will indicate deployment progress, as shown in Figure 1. The deployment progress is also shown in the Deployment Progress pane on the right-hand side of Business Intelligence Development Studio, as in Figure 2. The Deployment Progress pane gives more detailed information about what happens during deployment than the Output window does. Note: Analysis Services may take quite a while to process the data mining models.
  • 25. Figure 1: Output Window Figure 2: Deployment Progress Pane 2. Once deployment is complete, close the Output window and the Deployment Progress pane. Task 3: Create a Prediction Query using the Decision Tree Mining Model 1. In the Solution Explorer window, in the Mining Models folder, double-click Customers.dmm. Tip If the Solution Explorer pane is not visible, select View | Solution Explorer. 2. From the list of tabs above the designer window, select the Mining Model Prediction tab. 3. In the Mining Model window, click the Select model hyperlink.
  • 26. 4. In the Select Mining Model dialog box, expand Customers, select Customers DT, as shown in Figure 3, and then click OK. Figure 3: Select Mining Model in Mining Model Prediction View 5. In the Select Input Table(s) window, click the Select case table hyperlink. 6. In the Select Table window, select the vDMLabCustomerPredict table, and then click OK. Relationships between the mining structure and the input table are created automatically between columns that have the same name. You can add or delete relationships manually. Task 4: Build the Decision Tree Prediction Query 1. Enter values into the first row of the table at the bottom of the designer as shown below (Column: Value): Source: vDMLabCustomerPredict; Field: CustomerKey; Show: (Checked). Tip The columns of the table can be resized by dragging the dividing line between the column headings. 2. Enter values into the second row of the table as shown below (Column: Value): Source: Customers DT mining model; Field: Bike Buyer;
  • 27. Show: (Checked). Note The value in the Source column will change from Customers DT mining model to Customers DT. 3. Enter values into the third row of the table as shown below (Column: Value): Source: Prediction Function; Field: PredictProbability; Alias: Confidence; Show: (Checked); Criteria/Argument: [Customers DT].[Bike Buyer]. Figure 4 shows what you should see after entering the values described in the previous tables. Figure 4: Values for the Decision Tree Prediction Query 4. Select Mining Model | Query to view the SQL view of the query. Tip You can also do this by clicking the down arrow in the Switch to query result view icon in the upper left corner of the Mining Model Prediction view, as in Figure 5, and then clicking Query. Figure 5: Switch to SQL Query View In the bottom pane of the Mining Model Prediction view, the text of the prediction query is displayed. Task 5: Display the Decision Tree query result 1. Select Mining Model | Result to view the result of the Decision Tree query. Tip
  • 28. As with the SQL view, you can also view the result by clicking the down arrow in the Switch to query result view icon in the upper left corner of the Mining Model Prediction view, as in Figure 5, and then clicking Result. The results of the prediction query are displayed. The CustomerKey column identifies each record from the input table. The Bike Buyer column contains the mining model's prediction of the customer's bike buying behavior. Larger values in the Confidence column mean the mining model has more confidence in the prediction contained in the Bike Buyer column. By contacting the potential customers who are predicted to be bike buyers with a high Confidence value, Adventure Works can use the results of the prediction query to promote its bikes to the individuals most likely to buy them. Adventure Works' marketing expenses will be reduced because potential customers who are not likely to be bike buyers will not be contacted. Task 6: Close the Analysis Services Project 1. Select File | Close Solution. If prompted to save changes, select Yes. 2. Select File | Exit.
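The targeting idea described in Task 5 can be sketched in a few lines of Python. This is an illustration only, not part of the lab: it assumes the query result has been exported as rows of CustomerKey, Bike Buyer, and Confidence, that Bike Buyer is encoded as 1 for a predicted buyer and 0 otherwise, and that a cutoff of 0.7 is an acceptable Confidence threshold. The `Prediction` class, the `contact_list` function, the cutoff, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Prediction:
    customer_key: int   # CustomerKey column from the input table
    bike_buyer: int     # assumed encoding: 1 = predicted buyer, 0 = predicted non-buyer
    confidence: float   # Confidence column (alias for PredictProbability)


def contact_list(predictions: Iterable[Prediction], min_confidence: float = 0.7) -> list[Prediction]:
    """Keep predicted buyers at or above the cutoff, most confident first.

    Confidence describes the predicted value, so a high Confidence on a
    row with bike_buyer == 0 means a confident non-buyer. Such rows are
    excluded rather than contacted.
    """
    chosen = [
        p for p in predictions
        if p.bike_buyer == 1 and p.confidence >= min_confidence
    ]
    return sorted(chosen, key=lambda p: p.confidence, reverse=True)


# Made-up result rows for illustration.
rows = [
    Prediction(customer_key=11000, bike_buyer=1, confidence=0.92),
    Prediction(customer_key=11001, bike_buyer=0, confidence=0.88),
    Prediction(customer_key=11002, bike_buyer=1, confidence=0.55),
    Prediction(customer_key=11003, bike_buyer=1, confidence=0.74),
]

selected_keys = [p.customer_key for p in contact_list(rows)]
print(selected_keys)  # [11000, 11003]
```

Raising the cutoff shortens the contact list and lowers marketing cost; lowering it reaches more potential buyers at the price of contacting more people who will not buy.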