MACHINE LEARNING
PROJECTS WITH R
Yiou (Leo) Li
Outline


   Classification of glass data

   Clustering of glass data
Classification by ridge regression
Plotting the three classes by four features
[Figure: simple scatterplot matrix of features V2, V3, V4 and V5]
Performance looks good when only the classification error rate is considered
Performance is poor when the ROC curve is considered
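An ROC curve is traced by sweeping the decision threshold over the classifier's scores and recording the false positive rate and true positive rate at each step. The sketch below is a minimal illustration in Python (not the deck's own R code, which is not shown); the function names and the toy scores are assumptions made for the example.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by sweeping a threshold over the scores.

    scores: real-valued classifier outputs (higher means more positive).
    labels: 1 for the positive class, 0 for the negative class.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (hypothetical, not the glass data): two positives among six cases.
scores = [0.9, 0.4, 0.35, 0.3, 0.2, 0.1]
labels = [1, 0, 1, 0, 0, 0]
curve = roc_points(scores, labels)
area = auc(curve)
```

A single error rate corresponds to one point on this curve, which is why a model can look acceptable at one threshold and still have a weak curve overall.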
Using higher-order polynomial terms helps improve the ROC curve
[Figure: ROC curve with the decision point marked]
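The higher-order terms in the deck's model specification are written as products such as V2*V3 and V2*V4. A minimal Python sketch of that expansion follows. It assumes the terms are all pairwise products of distinct features; whether squared terms were also included is not stated in the slides. The function name is hypothetical.

```python
from itertools import combinations


def add_pairwise_products(row):
    """Append every product x_i * x_j (i < j) to a list of feature values."""
    row = list(row)
    return row + [a * b for a, b in combinations(row, 2)]


# Hypothetical three-feature row standing in for (V2, V3, V4).
expanded = add_pairwise_products([2.0, 3.0, 5.0])
# expanded holds the original values, then V2*V3, V2*V4, V3*V4.

# Nine features (V2 to V10) grow to 9 originals plus 36 pairwise products.
n_columns = len(add_pairwise_products(range(9)))
```

The expanded columns are then fed to the same linear fit, so the model stays linear in its coefficients while the decision boundary becomes quadratic in the original features.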
Using higher-order polynomial terms helps improve TPR and FPR!

Y ~ [V2, V3, …, V10, V2*V3, V2*V4, …]
                       Training       Test
True Positive Rate     0.6820833      0.55
False Positive Rate    0.008368031    0.0804762
Error rate             0.03953965     0.1270588

Y ~ [V2, V3, …, V10]
                       Training       Test
True Positive Rate     0              0
False Positive Rate    0.00685288     0.007142857
Error rate             0.1104277      0.1102941
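The three rows of each table come from the confusion counts of a binary classifier. The Python sketch below shows the arithmetic on a hypothetical imbalanced sample (not the glass data). It illustrates how a model that never predicts the positive class can score a true positive rate of 0 and still have an error rate of only about 11%, the pattern seen in the linear-terms-only table. The function name and the sample are assumptions.

```python
def rates(y_true, y_pred):
    """Return TPR, FPR and error rate for 0/1 labels and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "tpr": tp / (tp + fn),           # share of positives that are caught
        "fpr": fp / (fp + tn),           # share of negatives flagged by mistake
        "error": (fp + fn) / len(y_true),
    }


# Hypothetical sample: 11 positives among 100 cases.
y_true = [1] * 11 + [0] * 89
always_negative = [0] * 100
r = rates(y_true, always_negative)
```

Here the error rate equals the share of positives in the sample, so it says nothing about whether the positive class is ever detected. In the tables above, the product terms raise the test TPR from 0 to 0.55, while the test FPR also rises from about 0.007 to about 0.08.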
Notes on ridge regression

    1. The ridge solutions are not invariant under scaling of the inputs, so the
       inputs are usually standardized before fitting. The fit then no longer
       depends on the units in which the inputs were measured.

    2. The intercept β0 should be left out of the penalty term, so that the
       solution is invariant to the choice of origin of the inputs and outputs.
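Both notes can be checked numerically in the one-feature case, where the ridge solution has a closed form. The Python sketch below is an illustration under assumptions: the data, the penalty value and the function names are made up, and the penalty is applied to the slope only.

```python
def ridge_1d(x, y, lam):
    """Ridge fit of y = b0 + b1*x with penalty lam * b1**2 (b0 is unpenalized)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / (sxx + lam)       # the penalty shrinks the slope only
    b0 = y_mean - b1 * x_mean    # the intercept follows from the means
    return b0, b1


def standardize(x):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    n = len(x)
    m = sum(x) / n
    sd = (sum((xi - m) ** 2 for xi in x) / n) ** 0.5
    return [(xi - m) / sd for xi in x]


# Hypothetical data and penalty.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 3.2, 3.8]
lam = 1.0

# Note 2: shifting the origin of y moves only the intercept, not the slope.
b0, b1 = ridge_1d(x, y, lam)
b0_shift, b1_shift = ridge_1d(x, [yi + 10.0 for yi in y], lam)

# Note 1: rescaling x (a change of units) changes the fit on raw inputs...
_, b1_scaled = ridge_1d([100.0 * xi for xi in x], y, lam)
# ...but not after standardizing, since both versions map to the same column.
_, b1_std = ridge_1d(standardize(x), y, lam)
_, b1_std_scaled = ridge_1d(standardize([100.0 * xi for xi in x]), y, lam)
```

On the raw inputs, `100 * b1_scaled` differs from `b1` because the same penalty weighs less against a feature with a larger spread. After standardizing, the two slopes agree.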
Outline


   Classification of glass data

   Clustering of glass data
Multi-Dimensional Scaling of glass data (Labeled as: 1,2,3,5,6,7)
[Figure: metric MDS plot, Coordinate 1 vs. Coordinate 2, points labeled 1, 2, 3, 5, 6, 7]
K-means clustering of glass data
[Figure: K-means cluster, correct rate (0.0 to 1.0) by original label]
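The slides do not say how the correct rate per original label was computed. One common approach, assumed here, is to map each cluster to its majority label and then count, for each original label, the share of its points that land in a cluster carrying that label. The Python sketch below applies this to hypothetical one-dimensional data with a small Lloyd's-algorithm K-means; all names and values are illustrative.

```python
from collections import Counter


def kmeans_1d(xs, centers, n_iter=20):
    """Lloyd's algorithm on scalars; returns the cluster index of each point."""
    centers = list(centers)
    assign = [0] * len(xs)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        assign = [min(range(len(centers)), key=lambda k: (x - centers[k]) ** 2)
                  for x in xs]
        # Update step: each center moves to the mean of its points.
        for k in range(len(centers)):
            members = [x for x, a in zip(xs, assign) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return assign


def correct_rate_by_label(labels, clusters):
    """Map each cluster to its majority label, then score each original label."""
    majority = {}
    for c in set(clusters):
        votes = Counter(l for l, a in zip(labels, clusters) if a == c)
        majority[c] = votes.most_common(1)[0][0]
    result = {}
    for lab in sorted(set(labels)):
        idx = [i for i, l in enumerate(labels) if l == lab]
        result[lab] = sum(majority[clusters[i]] == lab for i in idx) / len(idx)
    return result


# Hypothetical data: the last point carries label 2 but sits near the label-1 group.
xs = [1.0, 1.2, 0.8, 5.0, 5.1, 1.1]
labels = [1, 1, 1, 2, 2, 2]
clusters = kmeans_1d(xs, centers=[0.0, 6.0])
cluster_rates = correct_rate_by_label(labels, clusters)
```

With this mapping, a label whose points are split across clusters gets a rate below 1, which is the kind of per-label bar the figure reports.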
Hierarchical clustering of glass data
[Figure: hierarchical cluster, correct rate (0.0 to 1.0) by original label]
EM clustering of glass data
[Figure: EM, correct rate (0.0 to 1.0) by original label]
