Why does Naïve Bayesian Classification work so well amidst known conditional dependencies in the data structure?
Naïve Bayesian Classification: a form of machine learning that avoids complicated conditional dependency models and the need to specify most of the conditional dependencies in your data. Why does it work so well amidst conditional dependency? Tim Hare
Naïve Bayes (naïvely, hence the name) assumes no conditional dependence among the attributes given the class, but this simplification comes at a potential cost of misclassification.
NB performance is at odds with past theory: there is evidence in the primary literature that Naïve Bayes works better than would be anticipated given known conditional dependence in the data.
Zhang 2004: factoring a general form of Bayes into two parts: [NB] * [“something else”]. Take-home message: the factorization indicates that FB = NB under certain data structures, and not under others.
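One way to see the two-part split is through the chain rule of probability. The sketch below is a generic derivation in that spirit, not necessarily the notation or exact result of Zhang 2004. The symbols $x_i$ (attributes), $c$, $c_1$, $c_2$ (classes), $r_i$, $f_{FB}$ and $f_{NB}$ are assumptions introduced here for illustration, and every conditional probability in a denominator is assumed to be non-zero.

```latex
\begin{align*}
P(x_1,\dots,x_n \mid c)
  &= \prod_{i=1}^{n} P(x_i \mid x_1,\dots,x_{i-1}, c) \\
  &= \underbrace{\prod_{i=1}^{n} P(x_i \mid c)}_{\text{NB}}
     \;\cdot\;
     \underbrace{\prod_{i=1}^{n} r_i(c)}_{\text{``something else''}},
  \qquad
  r_i(c) = \frac{P(x_i \mid x_1,\dots,x_{i-1}, c)}{P(x_i \mid c)} \\[6pt]
f_{FB}(x)
  &= \frac{P(c_1)\,P(x_1,\dots,x_n \mid c_1)}{P(c_2)\,P(x_1,\dots,x_n \mid c_2)}
   = f_{NB}(x) \cdot \prod_{i=1}^{n} \frac{r_i(c_1)}{r_i(c_2)},
  \qquad
  f_{NB}(x) = \frac{P(c_1)\prod_{i} P(x_i \mid c_1)}{P(c_2)\prod_{i} P(x_i \mid c_2)}
\end{align*}
```

Read this way, FB and NB assign the same class whenever the second product does not move the ratio across 1: for example when the dependence terms $r_i(c_1)$ and $r_i(c_2)$ are equal and cancel, or when they push the ratio in the direction NB already favours.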
Full Bayes (FB) and Naïve Bayes (NB) classification carried out by hand on synthetic data for one data vector = <1,0>. When the conditional dependence is of different types (C1: if A then A, C2: if A then B) in the two classes (upper left data grid: you may recognize this as “XOR”), NB fails to classify correctly: the information is “lost” through “cancellation”, because equal probabilities take part in each classification estimate. If the conditional dependence is of the same type (C1=C2: if A then B) in both classes (lower left data grid), NB may still classify the data correctly. FB classifies correctly in BOTH instances. The NB posterior probability may be biased but, in many cases and for a variety of reasons, the bias nets out to a correct classification (the analysis is too complex to present here). Figure labels: “Loss (ratio is just 1) but no Bias”; “Bias but no Loss”.
Naïve Bayes in R, run on the synthetic conditionally dependent data we analyzed in Excel for vector <1,0>, gives the same misclassification for the “mixed” conditional dependence and the correct Democratic classification for the “even” conditional dependence.
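The hand calculation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the deck's spreadsheet: the original data grids are not reproduced in this transcript, so the `MIXED` and `EVEN` tables, the class labels `C1`/`C2`, the equal class priors, and all function names are hypothetical stand-ins chosen to show the two behaviours described.

```python
from collections import Counter
from fractions import Fraction


def full_bayes_likelihood(rows, x):
    """P(x | class) estimated from the joint frequency of the whole vector."""
    return Fraction(Counter(rows)[tuple(x)], len(rows))


def naive_bayes_likelihood(rows, x):
    """P(x | class) estimated as the product of per-attribute frequencies."""
    likelihood = Fraction(1)
    for j, value in enumerate(x):
        matches = sum(1 for row in rows if row[j] == value)
        likelihood *= Fraction(matches, len(rows))
    return likelihood


def classify(likelihood_fn, data, x):
    """Return (winner, scores); winner is None on an exact tie.

    Equal class priors are assumed, so likelihoods are compared directly.
    """
    scores = {label: likelihood_fn(rows, x) for label, rows in data.items()}
    best = max(scores.values())
    winners = [label for label, score in scores.items() if score == best]
    return (winners[0] if len(winners) == 1 else None), scores


# Hypothetical "mixed" (XOR-like) data: A == B in C1, A != B in C2.
MIXED = {"C1": [(0, 0), (1, 1)], "C2": [(0, 1), (1, 0)]}

# Hypothetical "even" data: A and B tend to agree in both classes,
# but with different frequencies per class.
EVEN = {
    "C1": [(1, 1)] * 5 + [(0, 0)] * 2 + [(1, 0)] * 2 + [(0, 1)] * 1,
    "C2": [(1, 1)] * 2 + [(0, 0)] * 6 + [(1, 0)] * 1 + [(0, 1)] * 1,
}

x = (1, 0)
fb_mixed, fb_mixed_scores = classify(full_bayes_likelihood, MIXED, x)
nb_mixed, nb_mixed_scores = classify(naive_bayes_likelihood, MIXED, x)
fb_even, fb_even_scores = classify(full_bayes_likelihood, EVEN, x)
nb_even, nb_even_scores = classify(naive_bayes_likelihood, EVEN, x)

print("mixed:", "FB ->", fb_mixed, "| NB ->", nb_mixed)
print("even: ", "FB ->", fb_even, "| NB ->", nb_even)
print("even likelihood ratios:",
      "FB", fb_even_scores["C1"] / fb_even_scores["C2"],
      "NB", nb_even_scores["C1"] / nb_even_scores["C2"])
```

On this toy data the mixed case leaves NB with identical scores for both classes (a ratio of 1, so it cannot decide), while FB picks C2. In the even case both pick C1, but the NB likelihood ratio differs from the FB one, which is the "bias without loss" pattern.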
Real data: House of Representatives 1984 voting record on 17 congressional bills (columns)
Use R for NB classification on HV84, with and without augmentation by conditionally dependent synthetic data
Control analysis for synthetic augmentation experiments #1 and #2 (to follow): NB analysis of the HV84 real data, unmodified by synthetic data
Augmentation with synthetic data, experiment 1: NB analysis on HV84 augmented by the conditionally dependent synthetic data, with the conditional dependence of different types (“mixed”) in the two classes
Augmentation with synthetic data, experiment 2: NB analysis on HV84 augmented by the conditionally dependent synthetic data, with the conditional dependence of the same type (“even”) in the two classes
Matrices of classification outcomes for control (top matrix), “mixed” (middle matrix) and “even” (bottom matrix): no adverse impact on classification. The same assignment was made in each experiment, indicating that augmenting real data with the two types of conditional dependence does not influence classification, at least with this HV84 data set.
Raw probabilities, however, show that even though class assignments did not change across CONTROL, EXPT#1, and EXPT#2, differences (in this case slight) are imparted to the probability estimates, as expected. Note that we added only 2 attributes (columns) to 17, so the percentage of “contamination” by synthetic data is small. Additional exploration could be done by adding increasing percentages of conditional dependence to the original HV84 data set.
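The suggested follow-up, adding more and more conditionally dependent columns and watching the posterior, could be sketched as below. Everything here is a hypothetical stand-in: `BASE_ROWS` is a tiny made-up binary table (not HV84), the labels `D`/`R`, the `MIXED` and `EVEN` patterns, the add-one smoothing, and the function names are all assumptions for illustration, and the deck's own analysis was done in R rather than Python.

```python
from fractions import Fraction


def nb_posterior(rows, labels, x):
    """Naive Bayes posterior P(class | x) for binary attributes, add-one smoothing."""
    scores = {}
    for c in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == c]
        score = Fraction(len(members), len(rows))  # class prior
        for j, value in enumerate(x):
            hits = sum(1 for r in members if r[j] == value)
            score *= Fraction(hits + 1, len(members) + 2)
        scores[c] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


def augment(rows, labels, pattern, copies):
    """Append `copies` synthetic (A, B) column pairs to every row.

    The i-th row of class c receives pattern[c][i % len(pattern[c])].
    """
    seen = {}
    out = []
    for row, y in zip(rows, labels):
        i = seen.get(y, 0)
        seen[y] = i + 1
        pair = pattern[y][i % len(pattern[y])]
        out.append(tuple(row) + pair * copies)
    return out


# Hypothetical stand-in for real voting data: 3 binary columns, 4 rows per class.
BASE_ROWS = [(1, 1, 0), (1, 0, 0), (1, 1, 1), (0, 1, 0),
             (0, 0, 1), (0, 1, 1), (0, 0, 1), (1, 0, 1)]
LABELS = ["D"] * 4 + ["R"] * 4
QUERY = (1, 1, 0)
SYNTHETIC_QUERY = (1, 0)  # the <1,0> vector on the synthetic columns

# "Mixed": A == B in D, A != B in R. "Even": A and B tend to agree in both classes.
MIXED = {"D": [(0, 0), (1, 1)], "R": [(0, 1), (1, 0)]}
EVEN = {"D": [(1, 1), (1, 1), (0, 0), (1, 0)],
        "R": [(0, 0), (0, 0), (1, 1), (0, 1)]}


def sweep(pattern, max_copies=3):
    """Posterior P(D | x) as 0..max_copies synthetic column pairs are appended."""
    curve = []
    for k in range(max_copies + 1):
        rows_k = augment(BASE_ROWS, LABELS, pattern, k)
        x_k = QUERY + SYNTHETIC_QUERY * k
        curve.append(nb_posterior(rows_k, LABELS, x_k)["D"])
    return curve


mixed_curve = sweep(MIXED)
even_curve = sweep(EVEN)
print("mixed:", [float(p) for p in mixed_curve])
print("even: ", [float(p) for p in even_curve])
```

In this toy setup the class assignment never changes. The mixed pattern leaves the posterior untouched because its per-class marginals are identical and cancel, while the even pattern shifts the posterior further with each added pair. Real data with less balanced synthetic columns would not be expected to cancel so exactly.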
Knowledge check: FB or NB?
References
Q & A

Naïve Bayesian Classification Performance Amidst Conditional Dependencies
