Data Analytics & Machine Learning
Prof. (Dr.) S. Pathak, Ph.D., M.Tech., Senior Data Scientist
&
Er. J.K. Jha (Corporate Trainer, Big Data & Machine Learning)
Reach: info@iispl.co.in
www.iispl.co.in
IISPL ACADEMY
Introduction to Analytics & Data Analysis tools
• What is data analytics?
• Importance of analytics
• Introduction to various analysis techniques
• Applications of data analysis in various industries
• Introduction to SAS/R/Python/SPSS
• Basics of programming in SAS/R/Python/SPSS
• Data handling in SAS/R/Python/SPSS
• BI reporting in SAS/R/Python/SPSS
• Performing statistical analysis in SAS/R/Python/SPSS
• Analyzing the data with simple descriptive statistics
• Variance and standard deviation
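As a small illustration of the last two topics, here is a minimal Python sketch of variance and standard deviation. The function names and the sample data are illustrative assumptions, not part of the course material; the population form divides by n and the sample form by n - 1.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def population_variance(values):
    """Average squared deviation from the mean (divides by n)."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


def standard_deviation(values, sample=True):
    """Square root of the chosen variance."""
    variance = sample_variance(values) if sample else population_variance(values)
    return math.sqrt(variance)


# Hypothetical data, chosen only so the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(data), standard_deviation(data, sample=False))
print(sample_variance(data), standard_deviation(data))
```

Python's standard `statistics` module offers `pvariance`, `variance`, `pstdev` and `stdev` for the same quantities.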
Data Validation & Cleaning
• Introduction to validating and cleaning data
• Examining data errors when reading raw data files
• Validating data with the CONTENTS, PRINT, FREQ, MEANS and UNIVARIATE procedures
• Cleaning invalid data: missing value identification and treatment
• Outlier identification and treatment
• Project Work
Introduction to machine learning:
• What is machine learning?
• Learning system model
• Training and testing
• Performance
• Algorithms
• Machine learning structure
• What are we seeking?
• Learning techniques
Nearest neighbor classification:
• Instance-based classifiers
• Nearest-neighbor classifiers
• Lazy vs. eager learning
• k-NN variations
• How to determine a good value for k
• When to consider nearest neighbors
• Condensing
• Nearest neighbor issues
• Project Work
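The nearest-neighbor idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: Euclidean distance, a simple majority vote, and a tiny hypothetical training set. The name `knn_predict` is illustrative only.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs. Nothing is learned ahead
    of time, which is why k-NN is described as a lazy learner.
    """
    # Sort training instances by Euclidean distance to the query point.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical two-class training data.
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 5.5), "B"), ((6.5, 7.0), "B"),
]

print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.4, 6.1)))
```

An odd k avoids tied votes in two-class problems; in practice k is usually chosen by comparing error on held-out data.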
Naive Bayes classification
• Naive Bayes learning
• Conditional probability
• Bayesian theorem: basics
• The Bayes classifier
• Model parameters
• Naive Bayes training
• Types of errors
• Sensitivity and specificity
• ROC curve
• Holdout estimation
• Cross-validation
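To make the error-type topics concrete, the following Python sketch counts the four confusion-matrix cells for a binary classifier and derives sensitivity and specificity from them. The labels and predictions are hypothetical, and the function names are illustrative assumptions.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification task."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # correctly flagged positive
        elif p == positive:
            fp += 1  # false alarm (Type I error)
        elif a == positive:
            fn += 1  # missed positive (Type II error)
        else:
            tn += 1  # correctly rejected negative
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """True positive rate: share of actual positives that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of actual negatives that were rejected."""
    return tn / (tn + fp)


# Hypothetical ground truth and classifier output.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

tp, fp, tn, fn = confusion_counts(actual, predicted)
print(sensitivity(tp, fn), specificity(tn, fp))
```

An ROC curve plots sensitivity against 1 - specificity as the decision threshold of the classifier is varied.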
Decision Trees - Part I
• Key requirements
• Decision tree as a rule set
• How to create a decision tree
• Choosing attributes
• ID3 heuristic
• Entropy
• Pruning trees - pre and post
• Subtree replacement
• Raising
Decision Trees - Part II
• Tree induction
• Splitting based on ordinal attributes
• How to determine the best split
• Measure of impurity: GINI
• Splitting based on GINI
• Binary attributes - GINI
• Categorical attributes - GINI
• Strengths and weaknesses of decision trees
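The two impurity measures named above, entropy (used by the ID3 heuristic) and the Gini index, can be sketched as follows. This is a minimal Python illustration with hypothetical labels; the helper names are assumptions made for the example.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Relative frequency of each class among the labels."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over the classes."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def gini(labels):
    """Gini impurity: 1 - sum(p^2) over the classes."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def weighted_gini(groups):
    """Gini impurity of a split, weighting each child node by its size."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * gini(group) for group in groups)


# Hypothetical node with an even class mix, and two candidate splits of it.
node = ["yes", "yes", "no", "no"]
print(entropy(node), gini(node))
print(weighted_gini([["yes", "yes"], ["no", "no"]]))   # perfectly pure split
print(weighted_gini([["yes", "yes", "no"], ["no"]]))   # partly mixed split
```

A tree-induction procedure would evaluate such a score for every candidate split and keep the one with the lowest weighted impurity.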
Ensemble Approaches
• Ensemble approaches
• Bagging model
• Boosting
• The AdaBoost algorithm
• Gradient boosting
• Random forests
• RIF
• RIC
• Advantages
• Disadvantages
Artificial Neural Network
• Background of brain and neuron
• Neural networks
• Neuron diagram
• Neuron models: step function, ramp function, etc.
• Perceptrons
• Network architectures
• Single-layer feed-forward
Artificial Neural Network continued
• Multilayer feed-forward NN (FFNN)
• Backpropagation
• NN design issues
• Recurrent network architecture
• Supervised learning NN
• Self-organizing map
• Network structure
• SOM algorithm
Project I
• Mentees can select a project from the predefined set of AcadGild projects, or they can come up with their own project ideas
Support Vector Machine Classifiers
• Support vector machines for classification
• Linear discrimination
• Nonlinear discrimination
• SVM mathematically
• Extensions
• Application in drug design
• Data classification
• Kernel functions
• Project
Linear Models in R
• Introduction to regression
• Why do regression analysis?
• Types of regression analysis
• OLS regression
• Dependent and independent variable(s)
• Steps to implement a regression model
• Simple linear regression
• Understanding the terminology in the output of a linear regression
• Project
Correlation and Regression
• Correlation
• Strength of linear association
• Least-squares or regression line
• Linear regression model
• Correlation coefficient R
• Multiple regression
• Regression diagnostics
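The least-squares line and the correlation coefficient listed above can be computed directly from sums of squares. The Python sketch below is an illustration only (the course itself works in R); the data points and function names are hypothetical.

```python
import math


def sums_of_squares(xs, ys):
    """Return (Sxx, Syy, Sxy) together with the two means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy, mean_x, mean_y


def least_squares_line(xs, ys):
    """Slope and intercept minimizing the sum of squared residuals."""
    sxx, _, sxy, mean_x, mean_y = sums_of_squares(xs, ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def correlation(xs, ys):
    """Pearson correlation: strength of the linear association."""
    sxx, syy, sxy, _, _ = sums_of_squares(xs, ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(xs, ys)
print(slope, intercept, correlation(xs, ys))
```

In simple linear regression the square of this correlation equals the R-squared reported for the fitted model.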
Assumptions in Regression Analysis
• The assumptions
• Assumption 1 and explanation: residuals and non-normality
• Assumption 2 and explanation: heteroscedasticity
• Assumption 3 and explanation: additivity
• Assumption 4 and explanation: linearity; independence assumption; residual plots
• Project
Model Selection in R
• Fitting the model
• Diagnostic plots
• Comparing models
• Cross-validation
• Variable selection
• Relative importance
• AIC
• Dummy variables
• Box-Cox transformations
Creating the model
• Residuals vs fitted
• Residuals vs regression
• Diagnostic plots
Logistic Regression
• Binary response regression model
• Linear regression output of proposed model
• Problems with linear probability model
• Logistic function
• Logistic regression & its interpretation
• Odds ratio
• Goodness of fit measures
• Confusion matrix
• What is cluster analysis?
• Project
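The logistic function and the odds-ratio reading of a coefficient can be shown with a short Python sketch. The intercept and coefficient values below are hypothetical, not fitted to any data, and the function names are illustrative assumptions.

```python
import math


def logistic(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted_probability(intercept, coefficient, x):
    """P(y = 1 | x) under a one-predictor logistic regression model."""
    return logistic(intercept + coefficient * x)


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


# Hypothetical model parameters, for illustration only.
intercept, coefficient = -1.5, 0.8

p_at_2 = predicted_probability(intercept, coefficient, 2.0)
p_at_3 = predicted_probability(intercept, coefficient, 3.0)

# Raising x by one unit multiplies the odds by exp(coefficient).
print(odds(p_at_3) / odds(p_at_2), math.exp(coefficient))
```

Unlike a linear probability model, the predicted value here stays strictly between 0 and 1 for every x.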
Introduction to Cluster Analysis
• Types of data in cluster analysis
• A categorization of major clustering methods
• Partitioning methods
• Hierarchical methods
• Density-based methods
• Grid-based methods
• Model-based clustering methods
• Supervised classification
• Project
Principal Component Analysis (PCA)
• Curse of dimensionality
• Dimension reduction
• Why factor or component analysis?
• Principal component analysis
• PCs variance and least-squares
• Eigenvectors of a correlation matrix
• Factor analysis
• PCA process steps
• Project
Forecasting Principles
• Basic time series and its components
• Moving averages (simple & exponential)
• R's inbuilt function ts()
• Plotting of time series
• Business forecasting using moving average methods
• The ARIMA model
• Application of ARIMA model in business
• Project
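The two moving-average methods mentioned above can be sketched in Python as follows. The series, the window size and the smoothing factor `alpha` are hypothetical choices for illustration; the exponential version assumes the first observation is used as the starting value.

```python
def simple_moving_average(series, window):
    """Mean of each run of `window` consecutive observations."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_moving_average(series, alpha):
    """Exponentially weighted average with smoothing factor alpha in (0, 1].

    Each smoothed value blends the new observation with the previous
    smoothed value, so older observations fade out geometrically.
    """
    smoothed = [series[0]]  # assumption: seed with the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical series, e.g. monthly sales figures.
series = [10, 12, 14, 16, 18]

print(simple_moving_average(series, window=3))
print(exponential_moving_average(series, alpha=0.5))
```

A naive one-step-ahead forecast takes the last smoothed value as the prediction for the next period.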
Final Project
Project Work With Sophisticated Statistical and Mathematical Tools & Techniques
Thanking You !!
With You Until Success & Beyond…
IISPL ACADEMY

IISPL Noida Data Analytics Machine Learning Module
