IT for Business Intelligence
      Term paper on Weka




                  Submitted by:


                  Saurabh Singh 10BM60082
Introduction
    Weka contains a collection of visualization tools and algorithms for data analysis and predictive
modeling, together with graphical user interfaces for easy access to this functionality. The original
non-Java version of Weka was a Tcl/Tk front end to (mostly third-party) modeling algorithms implemented
in other programming languages, plus data preprocessing utilities in C and a Makefile-based system for
running machine learning experiments. This original version was primarily designed as a tool for
analyzing data from agricultural domains, but the more recent, fully Java-based version (Weka 3), for
which development started in 1997, is now used in many different application areas, in particular for
educational purposes and research. Advantages of Weka include:

     free availability under the GNU General Public License
     portability, since it is fully implemented in the Java programming language and thus runs on
      almost any modern computing platform
     a comprehensive collection of data preprocessing and modeling techniques
     ease of use due to its graphical user interfaces

Weka primarily consists of the following four screens: Explorer, Experimenter, KnowledgeFlow and Simple CLI.
K-means clustering in WEKA

Suppose a company wants to segment its market based on the attributes collected by its research team.
This can be done effectively and efficiently using K-means clustering in Weka.
The attributes used are as follows:
     ID
     AGE
     SEX
     RELIGION
     INCOME
     MARRIED
     CHILDREN
     CAR
     SAVING A/C
     CURRENT A/C
     LOAN
     PENSION PLAN



Weka accepts several input file formats, such as .csv and .arff. We will use a .csv file as the input file in
our example. The data file consists of 1600 instances and the 12 attributes listed above.
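As an illustration of how the two formats relate, the following is a minimal sketch of converting CSV text to ARFF, Weka's native format, which consists of an @relation line, one @attribute declaration per column and an @data section. The function name csv_to_arff, the relation name "bank" and the sample rows are assumptions made for this example and are not taken from the data file used in this paper. Values containing spaces or commas would need quoting, which this sketch does not handle.

```python
import csv
import io


def _is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation="bank"):
    """Convert CSV text (with a header row) to ARFF text.

    A column is declared numeric if every value parses as a float;
    otherwise it is declared nominal with its sorted distinct values.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    lines = [f"@relation {relation}", ""]
    for i, name in enumerate(header):
        column = [row[i] for row in data]
        if all(_is_number(v) for v in column):
            lines.append(f"@attribute {name} numeric")
        else:
            values = ",".join(sorted(set(column)))
            lines.append(f"@attribute {name} {{{values}}}")
    lines += ["", "@data"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


# Hypothetical sample rows, for illustration only.
sample = "age,sex,income\n48,FEMALE,17546.0\n40,MALE,30085.1\n"
arff = csv_to_arff(sample)
print(arff)
```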

Steps in K-means analysis:

Step 1:

Weka startup screen
Step 2:

Choose the Explorer option from the menu. This option is sufficient to perform all the
required operations on the data.




Step 3:

Load the .csv file of bank account data.
Step 4:

Since we intend to create clusters within the data, click on the Cluster tab and choose SimpleKMeans
among the choices that appear. The following screen appears.




Step 5:

Click on the box next to the Choose button and the following menu appears.
Step 6:

Assign the value 4 to the ‘numClusters’ box.

Step 7:

Click on Start to begin the clustering process. The following screen appears.
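To give a feel for what happens when the clustering process runs, here is a minimal sketch of the basic k-means loop on numeric data: assign each point to its nearest centroid, move each centroid to the mean of its members, and repeat until the assignments stop changing. This is not Weka's implementation; SimpleKMeans does more than this (for example, it also handles nominal attributes such as SEX). The function name kmeans, the seed and the six (age, income in thousands) points are assumptions made for this example.

```python
import random


def _sq_dist(a, b):
    """Squared Euclidean distance between two numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Plain k-means on numeric tuples; returns (centroids, assignment)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    assignment = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: _sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # nothing moved, so the clustering has converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, assignment


# Hypothetical (age, income in thousands) pairs, for illustration only.
customers = [(25, 20), (27, 22), (26, 21), (60, 80), (62, 85), (61, 82)]
centroids, labels = kmeans(customers, k=2)
print(labels)
```

With two well-separated groups such as these, the first three customers end up in one cluster and the last three in the other; which cluster is numbered 0 depends on the random starting centroids.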




Step 8:

The result can be viewed in a separate window. The following screen appears.
From the results above, we can interpret the clusters as follows:

    Cluster 0:

         Centers around the male population.
         Mainly lives in town areas.
         Is mostly unmarried.
         Does not own a car or have a previous loan.
         Owns a savings a/c and a current a/c.
         Does not yet have a pension plan.

Hence we can conclude that cluster 0 is the cluster most likely to buy a pension plan. A similar interpretation
can be applied to the other clusters as required.
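An interpretation of this kind amounts to reading off the most common value of each attribute within a cluster. A minimal sketch of that summary is given below; the function name cluster_profiles, the attribute names and the six records are hypothetical and are not the results shown in the Weka output.

```python
from collections import Counter


def cluster_profiles(records, labels):
    """Return, per cluster label, the most common value of each attribute."""
    grouped = {}
    for record, label in zip(records, labels):
        grouped.setdefault(label, []).append(record)
    profiles = {}
    for label, members in grouped.items():
        profiles[label] = {
            attr: Counter(m[attr] for m in members).most_common(1)[0][0]
            for attr in members[0]
        }
    return profiles


# Hypothetical records and cluster labels, for illustration only.
records = [
    {"sex": "MALE", "married": "NO", "pension_plan": "NO"},
    {"sex": "MALE", "married": "NO", "pension_plan": "NO"},
    {"sex": "FEMALE", "married": "NO", "pension_plan": "NO"},
    {"sex": "FEMALE", "married": "YES", "pension_plan": "YES"},
    {"sex": "FEMALE", "married": "YES", "pension_plan": "YES"},
    {"sex": "MALE", "married": "YES", "pension_plan": "YES"},
]
labels = [0, 0, 0, 1, 1, 1]
profiles = cluster_profiles(records, labels)
print(profiles[0])
```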



Step 9:

We can use Visualize All to see the distribution of all the variables in the population.
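The distribution of a numeric variable is typically shown as a histogram. The sketch below counts values in equal-width bins, which is the kind of summary such a plot is built on; the function name histogram, the bin count and the ages are assumptions made for this example, and the binning Weka itself uses may differ.

```python
def histogram(values, bins=5):
    """Count how many values fall into each of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width when all values are equal
    counts = [0] * bins
    for v in values:
        # The maximum value would land one past the end, so clamp it.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical ages, for illustration only.
ages = [18, 22, 25, 31, 38, 44, 52, 67]
counts = histogram(ages, bins=5)
print(counts)  # [3, 1, 2, 1, 1]
```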
Linear Regression using WEKA


Regression

A regression model can answer questions such as how much should be charged for a given model of
car with a certain set of features. It uses past data on car sales, car prices, the features provided and
other attributes to determine the price of future models.



Regression in WEKA


Suppose a company wants to regress the price of a car on the various features associated with it. It can
run the regression in WEKA by choosing the independent variables appropriately and then fitting a
regression equation that relates the independent variables to the dependent
variable. The following example illustrates this procedure:
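For a single independent variable, a regression equation has the form price = intercept + slope * feature, and the two coefficients can be found by ordinary least squares. The sketch below shows that calculation under simplifying assumptions: it uses one predictor only, whereas Weka's LinearRegression fits several attributes at once, and the function name fit_line and the displacement and price figures are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: engine displacement (litres) and price (thousands).
displacement = [1.0, 1.5, 2.0, 2.5, 3.0]
price = [10.0, 14.0, 18.0, 22.0, 26.0]
intercept, slope = fit_line(displacement, price)
print(f"price = {intercept:.1f} + {slope:.1f} * displacement")
```

These made-up points lie exactly on a line, so the fit recovers an intercept of 2 and a slope of 8; real data would leave some error around the fitted line.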



Step 1:

Weka startup screen
Step 2:

Choose the Explorer option from the menu. This option is sufficient to perform all the
required operations on the data.




Step 3:

Load the .csv file of car specification data.
Step 4:

Click the Classify tab, click the Choose button and then select LinearRegression from functions. The following
screen appears.




Step 5:

After clicking the Start button, the following output is generated.
Interpretation of the output: From the above output, we can observe that the selling price is positively
correlated with engine displacement and with none of the other factors.
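A positive correlation between two variables can be checked directly with the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it for made-up data; the function name pearson and the displacement and price values are assumptions for this example and are not taken from the car data file.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical data: engine displacement (litres) and price (thousands).
displacement = [1.2, 1.6, 2.0, 2.4, 3.0]
price = [9.5, 13.0, 17.5, 20.0, 27.0]
r = pearson(displacement, price)
print(round(r, 3))
```

A value of r close to 1 indicates that price rises almost linearly with displacement in this sample.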



Step 6:

Right-click on the result list for options and select Visualize classifier errors to obtain the following screen.
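The errors in such a plot are the differences between predicted and actual prices. As a rough sketch, the per-instance errors and two common summaries of them, the mean absolute error and the root mean squared error, can be computed as follows. The function name error_summary and the actual and predicted values are hypothetical.

```python
import math


def error_summary(actual, predicted):
    """Return per-instance errors, mean absolute error and RMSE."""
    residuals = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(r) for r in residuals) / len(residuals)
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return residuals, mae, rmse


# Hypothetical actual and predicted prices (thousands), for illustration only.
actual = [10.0, 14.0, 18.0, 22.0]
predicted = [11.0, 13.0, 18.0, 24.0]
residuals, mae, rmse = error_summary(actual, predicted)
print(residuals, mae, round(rmse, 4))
```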




Step 7:

If we click on any point in the plot, Weka displays a summary of that data point.
References:

       http://en.wikipedia.org/wiki/Weka_(machine_learning)
       http://www.cs.waikato.ac.nz/ml/weka/
