MACHINE LEARNING USING SPARK
The following topics will be covered in our
Machine Learning Using Spark
Online Training:
Copyright @ 2015 Learntek. All Rights Reserved. 2
What is Machine Learning?
▪ Machine learning is an application of artificial intelligence (AI) that
gives systems the ability to automatically learn and improve from
experience without being explicitly programmed. Machine learning focuses
on the development of computer programs that can access data and use it
to learn for themselves.
Intro to Machine Learning Using Spark
• MLlib is Spark’s machine learning (ML) library. Its goal is to make practical machine learning
scalable and easy. At a high level, it provides tools such as:
• ML Algorithms: common learning algorithms such as classification, regression, clustering,
and collaborative filtering
• Featurization: feature extraction, transformation, dimensionality reduction, and selection
• Pipelines: tools for constructing, evaluating, and tuning ML Pipelines
• Persistence: saving and loading algorithms, models, and Pipelines
• Utilities: linear algebra, statistics, data handling, etc.
Tools
• This course will be delivered using the Scala and Python APIs. R will
also be used to explain statistical concepts. Visualization will be
covered using the Bokeh and ggplot libraries.
Introduction to Apache Spark
▪ Spark Programming model
▪ RDD and DataFrame
▪ Transformation and Action
▪ Broadcast and Accumulator
▪ Running HDP on local machine
▪ Launching Spark Cluster
Basic Statistics
• Mean, Mode, Median, Range, Variance, Standard Deviation, Quartiles, Percentiles
• Sampling
• Sampling Methods
• Sampling Errors
• Probability Distributions
• Normal distribution, t-distribution, Chi-square, F
• Margin of Error, Confidence Interval, Significance level, Degrees of Freedom
• Hypothesis concept, Type I and Type II error
• P-value, t-Test, Chi-square Test
• Correlation Coefficient
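
As a rough illustration of the descriptive statistics listed above, the following sketch computes them with Python's standard library. The sample values and the helper name `pearson_r` are made up for this example and are not part of the course material.

```python
import math
import statistics

# Hypothetical sample: exam scores for nine students.
scores = [62, 70, 70, 75, 78, 81, 85, 90, 94]

mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)
value_range = max(scores) - min(scores)
variance = statistics.variance(scores)          # sample variance (n - 1 denominator)
std_dev = statistics.stdev(scores)              # sample standard deviation
quartiles = statistics.quantiles(scores, n=4)   # three cut points: Q1, Q2, Q3


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical second variable: hours studied by the same nine students.
hours = [2, 3, 4, 4, 5, 6, 7, 8, 9]
correlation = pearson_r(hours, scores)
```

The correlation helper assumes neither input is constant, since a constant sequence would make the denominator zero.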
Machine Learning Using Spark
• Introduction to Spark MLlib
• Data types: Vector, LabeledPoint
• Feature Extraction
• Feature Transformation, Normalization
• Feature Selectors
• Locality-Sensitive Hashing (LSH)
Regression Analysis with Spark
• Types of Regression Models
• Gradient Descent
• Linear Regression, Generalized Linear Regression
• MSE, RMSE, MAE, R-squared Coefficient
• Transforming the target variable
• Tuning Model Parameters
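
The evaluation metrics named above can be sketched in plain Python as follows. The function name `regression_metrics` and the sample values are illustrative assumptions; this is not Spark's evaluator API.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R-squared for paired actual/predicted values.

    Assumes y_true is not constant, otherwise R-squared is undefined.
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n           # mean squared error
    rmse = math.sqrt(mse)                          # root mean squared error
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Hypothetical targets and model predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]
metrics = regression_metrics(y_true, y_pred)
```

RMSE and MAE are in the same units as the target, which usually makes them easier to interpret than MSE.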
Classification Model with Spark
• Linear Models, Naive Bayes Model, Decision Tree
• Logistic Regression
• Linear Support Vector Machine
• Random Forest
• Gradient-Boosted Trees
• Training Classification Models
• Accuracy and prediction error
• Precision and Recall
• ROC curve and AUC
• Cross validation
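
To make accuracy, precision, and recall concrete, here is a minimal sketch for a binary classifier. The function name and the label lists are assumptions made for illustration, with 1 as the positive class.

```python
def binary_classification_metrics(y_true, y_pred):
    """Compute confusion-matrix counts and derived metrics for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical ground truth and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
result = binary_classification_metrics(y_true, y_pred)
```

Precision answers "of the predicted positives, how many were right?", while recall answers "of the actual positives, how many were found?".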
Clustering
• Hierarchical clustering
• K-means clustering
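
A minimal sketch of k-means (Lloyd's algorithm) in plain Python may help fix the idea: assign each point to its nearest centroid, then move each centroid to the mean of its members. The function name, the sample points, and the choice to seed centroids with the first k points are assumptions of this example; real libraries typically use smarter initialization.

```python
import math


def kmeans(points, k, iterations=10):
    """Cluster points (tuples of floats) into k groups with Lloyd's algorithm."""
    # Assumption: use the first k points as initial centroids so the run is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments


# Hypothetical 2-D points forming two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids, assignments = kmeans(points, k=2)
```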
Dimensionality Reduction
• Principal Component Analysis
• Singular Value Decomposition
• Clustering as dimensionality reduction
• Training a dimensionality reduction model
• Evaluating dimensionality reduction models
Recommendation Engine
▪ Content-based filtering
▪ Collaborative filtering
▪ Overview of MovieLens data
▪ Training a recommendation model
▪ Using the recommendation model
▪ Performance Evaluation
Text Processing
• Feature Hashing
• TF-IDF model
• Tokenization
• Stop words
• TF-IDF Weightings
• Training a TF-IDF model
• Usage of TF-IDF model
• Evaluating TF-IDF models
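
The tokenization, stop-word, and TF-IDF weighting steps listed above can be sketched in plain Python. Everything here is an illustrative assumption: the tiny stop-word set, the whitespace tokenizer, the sample documents, and the smoothed IDF form log((N + 1) / (df + 1)).

```python
import math
from collections import Counter

# Assumption: a deliberately tiny stop-word list for the example.
STOP_WORDS = {"the", "a", "is", "on", "and"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in tokenized:
        term_freq = Counter(tokens)  # raw term counts within this document
        weights.append({
            term: count * math.log((n_docs + 1) / (doc_freq[term] + 1))
            for term, count in term_freq.items()
        })
    return weights


# Hypothetical three-document corpus.
documents = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]
weights = tf_idf(documents)
```

Terms that appear in fewer documents (such as "mat") receive a higher weight than terms shared across documents (such as "cat").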
Prerequisites:
▪ Prior understanding of exploratory data analysis and data visualization will
help immensely in learning machine learning concepts and applications.
This includes basic statistical techniques for data analysis. Some
knowledge of R programming or of Python packages such as scikit-learn and
NumPy will be useful. However, we will cover basic statistical techniques as
part of this course before going deep into machine learning, so that
everyone can get the most out of the course.