Machine learning with R
AMIS Day April 3rd 2017
Maarten Smeets
MACHINE LEARNING WITH R
• What is machine learning
• Use cases for machine learning
• Supervised learning
• Unsupervised learning
• Introducing R
• Cool features of R
• R and Oracle
MACHINE LEARNING
• Machine learning is the subfield of computer science that gives
computers the ability to learn without being explicitly programmed.
MACHINE LEARNING
USE CASES
• E-mail categorization
Spam, News, Personal, Orders, …
• Anomaly detection
Fraud detection, behavior which does not fit known classifications well
• Optical Character recognition (OCR)
• Genetics
Will you have a high chance of relapse when you have this cancer type
and these genes?
MACHINE LEARNING
USE CASES
• Log file analysis
Which entries are rare?
Which are the variables in a log line?
Intruder detection
• IoT
Self learning thermostats
• Predict weather
Based on environmental measures like
humidity, air pressure, satellite images
• Detect trends
The number of cases present in the
KEI system at Spir-it and performance
• Image recognition
Self driving cars like Tesla, BMW
• Predict stock prices
Find correlations between stocks and try to
find features which can predict future prices
WHAT IS MACHINE LEARNING
1. Supervised learning
2. Unsupervised learning
SUPERVISED LEARNING
• The computer is presented with input and desired output
• The goal is to derive a general ruleset to map input to output
• This ruleset can be used to do predictions of output based on input
SUPERVISED LEARNING
EXAMPLES
• Linear regression
• Support Vector Regression
• Random forest
• Artificial Neural Networks (ANN)
SUPERVISED LEARNING
LINEAR REGRESSION
Data
Statistics
Plot
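A minimal sketch of linear regression in R, following the data, statistics and plot steps named on this slide. The data is simulated purely for illustration (the slide's own dataset is not in the transcript), and the names `training`, `model` and `new_inputs` are assumptions.

```r
# Data: simulated so that y depends linearly on x, plus noise (illustrative only)
set.seed(42)
x <- 1:50
y <- 3 + 2 * x + rnorm(50, sd = 5)
training <- data.frame(x = x, y = y)

# Fit the model y = intercept + slope * x on input (x) and desired output (y)
model <- lm(y ~ x, data = training)

# Statistics: estimated coefficients, standard errors, R-squared
print(summary(model))

# Use the derived rule to predict output for inputs the model has not seen
new_inputs <- data.frame(x = c(60, 70))
predicted <- predict(model, newdata = new_inputs)
print(predicted)
```

For the plot step, `plot(training$x, training$y)` followed by `abline(model)` draws the points with the fitted line.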
SUPERVISED LEARNING
SUPPORT VECTOR REGRESSION
http://www.svm-tutorial.com/2014/10/support-vector-regression-r/
Prediction with tuned model
SUPERVISED LEARNING
RANDOM FOREST
• Features are used to classify data
• A set of decision trees is generated based on 2 sets of random features
• Every tree sees a subset of the data
• Splits in the tree are determined by training data values:
a split is placed where it adds the most information
• To do predictions, features are put through all decision trees
and the resulting classifications are given a weight
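The idea in the bullets above can be sketched in base R with a toy ensemble. This is not a real random forest and not the code behind the slides: each "tree" is a single split (a stump), each stump sees a bootstrap sample and one randomly chosen feature, and every stump gets an equal vote. The data and all names (`fit_stump`, `predict_forest`, and so on) are assumptions made for illustration.

```r
set.seed(1)

# Toy data: two numeric features; class "A" when x1 + x2 > 10, otherwise "B"
n <- 200
train <- data.frame(x1 = runif(n, 0, 10), x2 = runif(n, 0, 10))
train$class <- ifelse(train$x1 + train$x2 > 10, "A", "B")

# Fit a one-split tree on one feature: try every observed value as threshold
# and keep the split that misclassifies the fewest training rows
fit_stump <- function(data, feature) {
  best <- list(error = Inf)
  for (threshold in sort(unique(data[[feature]]))) {
    above <- data[[feature]] > threshold
    for (label_above in c("A", "B")) {
      label_below <- setdiff(c("A", "B"), label_above)
      predicted <- ifelse(above, label_above, label_below)
      error <- mean(predicted != data$class)
      if (error < best$error) {
        best <- list(feature = feature, threshold = threshold,
                     above = label_above, below = label_below, error = error)
      }
    }
  }
  best
}

predict_stump <- function(stump, data) {
  ifelse(data[[stump$feature]] > stump$threshold, stump$above, stump$below)
}

# Build the ensemble: every stump sees a bootstrap sample and a random feature
n_trees <- 25
forest <- lapply(seq_len(n_trees), function(i) {
  rows <- sample(nrow(train), replace = TRUE)
  feature <- sample(c("x1", "x2"), 1)
  fit_stump(train[rows, ], feature)
})

# Predict by putting the features through all stumps and taking a majority vote
# (expects at least two rows, so that the votes form a rows x stumps matrix)
predict_forest <- function(forest, data) {
  votes <- sapply(forest, predict_stump, data = data)
  apply(votes, 1, function(v) names(which.max(table(v))))
}

test_points <- data.frame(x1 = c(9, 1), x2 = c(9, 1))
predictions <- predict_forest(forest, test_points)
print(predictions)
```

For real work, a dedicated package would be the usual choice; the sketch only shows how random subsets and voting fit together.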
SUPERVISED LEARNING
RANDOM FOREST
Variable importance plot
Mainly Y was used in the decision trees
to determine the outcome
i (a counter) was not important
SUPERVISED LEARNING
RANDOM FOREST
• Why is it very useful?
• Places few requirements on the data
• Can deal with multiple dimensions
• Gives good predictions in many cases
• Fast
• Variable importance can easily be determined
If many features are correlated, a single representative feature can be used
Large black box
performing magic
SUPERVISED LEARNING
ARTIFICIAL NEURAL NETWORKS (ANN)
Diagram: input nodes, hidden nodes, output nodes (signals flow from input to output)
ARTIFICIAL NEURAL NETWORKS (ANN)
EXAMPLE BACKPROPAGATION
• Backpropagation
1. Nodes have connections and each connection has a randomly assigned weight
2. Provide input and let the network generate output
3. Compare generated output with desired output
4. Go from the output nodes back to the input and adjust the weights of the node connections.
Adjusting a little bit at a time increases learning time and accuracy
5. Repeat from step 2 until the desired error rate is reached
• Can be done with weights or with node activation thresholds
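The five steps above can be sketched in base R for a very small network. Everything specific here is an assumption made for illustration: the XOR training set, the 2-3-1 layout, the sigmoid activation, the learning rate and the target error. How far the error drops depends on the random starting weights.

```r
set.seed(7)
sigmoid <- function(z) 1 / (1 + exp(-z))

# XOR training set: one row per example
input   <- matrix(c(0, 0,  0, 1,  1, 0,  1, 1), ncol = 2, byrow = TRUE)
desired <- matrix(c(0, 1, 1, 0), ncol = 1)

# Step 1: random initial weights; the extra row in each matrix is a bias weight
w_hidden <- matrix(runif(3 * 3, -1, 1), nrow = 3)  # (2 inputs + bias) x 3 hidden
w_output <- matrix(runif(4, -1, 1), nrow = 4)      # (3 hidden + bias) x 1 output

# Step 2: provide input and let the network generate output
forward <- function(input, w_hidden, w_output) {
  hidden <- sigmoid(cbind(input, 1) %*% w_hidden)
  output <- sigmoid(cbind(hidden, 1) %*% w_output)
  list(hidden = hidden, output = output)
}

mean_squared_error <- function(output) mean((output - desired)^2)

learning_rate <- 0.5
target_error  <- 0.01
initial_error <- mean_squared_error(forward(input, w_hidden, w_output)$output)

# Step 5: repeat until the target error is reached (or the iteration cap is hit)
for (iteration in 1:20000) {
  state <- forward(input, w_hidden, w_output)

  # Step 3: compare generated output with desired output
  error <- mean_squared_error(state$output)
  if (error < target_error) break

  # Step 4: propagate the error from the output nodes back towards the input
  delta_output <- (state$output - desired) * state$output * (1 - state$output)
  delta_hidden <- (delta_output %*% t(w_output[1:3, , drop = FALSE])) *
    state$hidden * (1 - state$hidden)

  # Adjust each weight a little bit in the direction that reduces the error
  w_output <- w_output - learning_rate * t(cbind(state$hidden, 1)) %*% delta_output
  w_hidden <- w_hidden - learning_rate * t(cbind(input, 1)) %*% delta_hidden
}

final_error <- mean_squared_error(forward(input, w_hidden, w_output)$output)
cat("error before training:", initial_error, "\n")
cat("error after training: ", final_error, "\n")
```

A smaller `learning_rate` makes each adjustment smaller, which matches the remark in step 4: training takes longer but tends to be more stable.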
ARTIFICIAL NEURAL NETWORKS (ANN)
SOME PERSONAL THOUGHTS (AS NEUROBIOLOGIST)
• Most examples of artificial neural networks do not take into account several
properties of biological neural networks
• Signals take time to go from A to B
• Neurons are not arranged in layers
Biological neural networks have a 3D structure with specialized areas
• Once trained, most artificial neural networks are static and don’t learn anymore
• Biological neural networks implement a wide range of signaling mechanisms per node
(neurotransmitters)
• Learning algorithms are not only internal to the neural network.
Natural selection also plays a role
SUPERVISED LEARNING
CHALLENGES
• Requires a training set of inputs and desired outputs
• Training data should be balanced
• Correlated features cause biases
• Outputs should be distributed as evenly as possible
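A quick way to check whether the desired outputs are evenly distributed is to count them. The sketch below uses made-up labels, and the 20% cut-off (`min_share`) is an arbitrary assumption, not a rule from the talk.

```r
# Made-up training labels: class A dominates, class B is rare
labels <- c(rep("A", 95), rep("B", 5))

# Share of each class in the training data
counts <- table(labels)
shares <- prop.table(counts)
print(shares)

# Flag classes that make up less than a chosen share of the data
min_share <- 0.2  # arbitrary threshold for this illustration
underrepresented <- names(shares)[shares < min_share]
print(underrepresented)
```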
SUPERVISED LEARNING
Illustration: training data and test data, each mapping input to output, in which classes A and B are unevenly represented
UNSUPERVISED LEARNING
• Unsupervised machine learning is the machine learning task of
inferring a function to describe hidden structure from "unlabeled"
data: the observations do not include a classification or categorization
• Examples
• Clustering
• Anomaly detection
• Neural networks (Self Organizing Map)
HIERARCHICAL CLUSTERING
Every point starts a cluster
Clusters merge as
they go up the tree
HIERARCHICAL CLUSTERING
A: MEAN 2,2 STDEV 2 B: MEAN 6,6 STDEV 2
HIERARCHICAL CLUSTERING (HCL)
HIERARCHICAL CLUSTERING
A: MEAN 2,2 STDEV 2 B: MEAN 6,6 STDEV 2
Original Prediction
HIERARCHICAL CLUSTERING
A: MEAN 2,2 STDEV 1 B: MEAN 6,6 STDEV 1
Original Prediction
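A minimal sketch of the experiment described on these slides, using base R's `hclust`. The two groups are simulated around (2, 2) and (6, 6) with standard deviation 1, as on the last slide; the sample size, the seed and the choice of complete linkage are assumptions.

```r
set.seed(123)
n <- 50

# Group A around (2, 2) and group B around (6, 6), both with standard deviation 1
points <- rbind(
  cbind(rnorm(n, mean = 2, sd = 1), rnorm(n, mean = 2, sd = 1)),
  cbind(rnorm(n, mean = 6, sd = 1), rnorm(n, mean = 6, sd = 1))
)
original <- rep(c("A", "B"), each = n)

# Every point starts as its own cluster; the closest clusters merge step by step
tree <- hclust(dist(points), method = "complete")

# Cut the tree at the level where two clusters remain
predicted <- cutree(tree, k = 2)

# Compare the original groups with the clusters that were found
comparison <- table(original, predicted)
print(comparison)
```

Raising `sd` to 2 makes the groups overlap more, so more points are likely to end up in the other cluster, which is the contrast the slides draw between STDEV 2 and STDEV 1.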
INTRODUCING R
1. History
2. Installation
3. Basics
R A SHORT HISTORY
• Conceived August 1993
An implementation of the S programming language
S was conceived in 1976
• Open sourced June 1995
• Main competitors: SPSS and SAS
• A lot of (mostly statistical) libraries available
The CRAN package repository featured 10,366 available packages at the time of this talk (April 2017).
R INSTALLATION
• Download and install R
https://www.r-project.org/
RSTUDIO INSTALLATION
• Download and install RStudio
https://www.rstudio.com/
R BASICS
• R is a functional programming (FP) language
• It provides many tools for the creation and manipulation of functions.
• You can do anything with functions that you can do with vectors: you
can assign them to variables, store them in lists, pass them as
arguments to other functions, create them inside functions, and even
return them as the result of a function.
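A short sketch of what treating functions like any other value looks like in practice. The function names (`square`, `make_multiplier`, `times_two`) are made up for illustration.

```r
# Assign a function to a variable
square <- function(x) x^2

# Store functions in a list
operations <- list(square = square, negate = function(x) -x)

# Pass a function as an argument to another function
squares <- sapply(1:5, square)
print(squares)  # 1 4 9 16 25

# Create a function inside a function and return it as the result
make_multiplier <- function(factor) {
  function(x) x * factor
}
times_two <- make_multiplier(2)
print(times_two(21))  # 42

# Call a function that was stored in a list
print(operations$negate(3))  # -3
```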
R BASICS
SOME FEATURES
• Git integration
• Interpreted; does not require compilation
Execute a line in your script and look at the result in the console
• Has its own markdown variant for documentation
Especially useful if you want to have graphs
• R Shiny allows you to generate and host scripts / graphs and make
them available from a browser
R BASICS
SOME FEATURES
• Code completion
• Allows multi-threaded execution
• Can be run remotely on an R-server
• Great at reading / writing datasets
For example web site scraping for data
• Of course great at statistics
• Great at generating plots
Especially when using the ggplot2 library
R BASICS
SOME TIPS TO GET STARTED
• ?ggplot
• help(package="ggplot2")
R DATATYPES
THE VECTOR
• Vector
a <- c(1,2,5.3,6,-2,4) # numeric vector
b <- c("one","two","three") # character vector
c <- c(TRUE,TRUE,TRUE,FALSE,TRUE,FALSE) #logical vector
a <- c(1,2,5.3,6,-2,4)
b <- a * 2 # arithmetic works element-wise on vectors
b
[1] 2.0 4.0 10.6 12.0 -4.0 8.0
R DATATYPES
THE MATRIX. ALL VALUES HAVE THE SAME TYPE AND LENGTH
# generates 5 x 4 numeric matrix
y<-matrix(1:20, nrow=5,ncol=4)
# another example
cells <- c(1,26,24,68)
rnames <- c("R1", "R2")
cnames <- c("C1", "C2")
mymatrix <- matrix(cells, nrow=2, ncol=2,
byrow=TRUE, dimnames=list(rnames, cnames))
# accessing matrix values
y[,4] # 4th column of matrix
y[3,] # 3rd row of matrix
y[2:4,1:3] # rows 2,3,4 of columns 1,2,3
R DATATYPES
THE DATA.FRAME. LIKE A MATRIX BUT TYPES AND LENGTHS CAN VARY
d <- c(1,2,3,4)
e <- c("red", "white", "red", NA)
f <- c(TRUE,TRUE,TRUE,FALSE)
mydata <- data.frame(d,e,f)
names(mydata) <- c("ID","Color","Passed") # variable names
mydata[2:3] # columns 2 and 3 of data frame
mydata[c("ID","Color")] # columns ID and Color from data frame
mydata$Passed # variable Passed in the data frame
R DATATYPES
THE LIST
• An ordered collection of objects (components)
# example of a list with 4 components:
# a string, a numeric vector, a matrix, and a scalar
w <- list(name="Maarten", mynumbers=a, mymatrix=y, age=36)
# example of combining two existing lists (list1, list2) into one list
v <- c(list1,list2)
COOL FEATURES OF R
1. Hosting plots: Shiny, Plot.ly
2. R markdown
3. Web site crawling
COOL FEATURES OF R
SHINY
UI Server
COOL FEATURES OF R
PLOT.LY INTERACTIVE GRAPHS
COOL FEATURES OF R
R MARKDOWN
COOL FEATURES OF R
WEB SITE CRAWLING
• Sector to Industry, Industry to Company
COOL FEATURES OF R
WEB SITE CRAWLING
http://chart.finance.yahoo.com/table.csv?s=ABT.AX&a=1&b=28&c=2017&d=2&e=28&f=2017&g=d&ignore=.csv
ORACLE AND R
1. What does Oracle do with R
2. Using data from an Oracle DB in R
3. Using functions from R in the Oracle DB
ORACLE AND R
ORACLE R ENTERPRISE
USING DATABASE DATA IN R
ORACLE R ENTERPRISE
USING R SCRIPTS DIRECTLY IN SQL STATEMENTS
https://github.com/MaartenSmeets/R

Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 

Machine learning with R

  • 2. Machine learning with R AMIS Day April 3rd 2017 Maarten Smeets
  • 3. MACHINE LEARNING WITH R WHAT IS MACHINE LEARNING USE CASES FOR MACHINE LEARNING SUPERVISED LEARNING UNSUPERVISED LEARNING INTRODUCING R COOL FEATURES OF R R AND ORACLE
  • 4. MACHINE LEARNING • Machine learning is the subfield of computer science that gives computers the ability to learn without being explicitly programmed.
  • 5. MACHINE LEARNING USE CASES • E-mail categorization Spam, News, Personal, Orders, … • Anomaly detection Fraud detection, behavior which does not fit known classifications well • Optical character recognition (OCR) • Genetics Will you have a high chance of relapse when you have this cancer type and these genes?
  • 6. MACHINE LEARNING USE CASES • Log file analysis Which entries are rare? Which are the variables in a log line? Intruder detection • IoT Self-learning thermostats • Predict weather Based on environmental measures like humidity, air pressure, satellite images • Detect trends The number of cases present in the KEI system at Spir-it and performance • Image recognition Self-driving cars like Tesla, BMW • Predict stock prices Find correlations between stocks and try to find features which can predict future prices
  • 7. 1 2 WHAT IS MACHINE LEARNING Supervised learning Unsupervised learning
  • 8. SUPERVISED LEARNING • The computer is presented with input and desired output • The goal is to derive a general ruleset to map input to output • This ruleset can be used to do predictions of output based on input
  • 9. SUPERVISED LEARNING EXAMPLES • Linear regression • Support Vector Regression • Random forest • Artificial Neural Networks (ANN)
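A minimal sketch of the supervised learning idea using linear regression, the first example on the list. The slide's own data is not part of this transcript, so the data below is synthetic and the names `train`, `model` and `new_input` are illustrative assumptions.

```r
set.seed(1)

# Hypothetical training data: input x and desired output y = 2x + 1 plus noise
train <- data.frame(x = 1:50)
train$y <- 2 * train$x + 1 + rnorm(50, sd = 0.5)

# Derive the "ruleset": a fitted line that maps input to output
model <- lm(y ~ x, data = train)
print(coef(model))

# Use the ruleset to predict output for input the model has not seen
new_input <- data.frame(x = c(60, 70))
predicted <- predict(model, newdata = new_input)
print(predicted)
```

The fitted slope and intercept should land close to the 2 and 1 used to generate the data, which is what makes the predictions for unseen input reasonable.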
  • 12. SUPERVISED LEARNING SUPPORT VECTOR REGRESSION http://www.svm-tutorial.com/2014/10/support-vector-regression-r/ Prediction with tuned model
  • 14. SUPERVISED LEARNING RANDOM FOREST • Features are used to classify data • A set of decision trees is generated based on 2 sets of random features • Every tree sees a subset of the data • Splits in a tree are determined by the training data values: where does a split add the most information? • To make predictions, features are put through all decision trees and the resulting classifications are given a weight
  • 17. SUPERVISED LEARNING RANDOM FOREST Variable importance plot: mainly Y was used in the decision trees to determine the outcome; i (a counter) was not important
  • 18. SUPERVISED LEARNING RANDOM FOREST • Why is it very useful? • Places few requirements on the data • Can deal with multiple dimensions • Makes good predictions in a lot of cases • Fast • Variable importance can easily be determined If many features are correlated, a single representative feature can be used
  • 19. Large black box performing magic SUPERVISED LEARNING ARTIFICIAL NEURAL NETWORKS (ANN) Input Output
  • 20. SUPERVISED LEARNING ARTIFICIAL NEURAL NETWORKS (ANN) Input Output Input nodes Output nodes Hidden nodes
  • 21. ARTIFICIAL NEURAL NETWORKS (ANN) EXAMPLE BACKPROPAGATION • Backpropagation 1. Nodes have connections and connections have a randomly assigned weight 2. Provide input and let the network generate output 3. Compare generated output with desired output 4. Go from the output nodes back to the input and adjust the weights of the node connections. Adjusting a little bit at a time increases learning time and accuracy 5. Repeat from step 2 until the desired error rate is reached • Can be done with weights or with node activation thresholds
  • 22. ARTIFICIAL NEURAL NETWORKS (ANN) SOME PERSONAL THOUGHTS (AS NEUROBIOLOGIST) • Most samples of artificial neural networks do not take into account several properties of biological neural networks • Signals take time to go from A to B • Neurons are not arranged in layers Biological neural networks have a 3D structure with specialized areas • Once trained, most artificial neural networks are static and don’t learn anymore • Biological neural networks implement a wide range of signaling mechanisms per node (neurotransmitters) • Learning algorithms are not only internal to the neural network. Natural selection also plays a role
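The backpropagation steps can be sketched in base R as follows. This is an illustration under assumptions, not the code behind the slides: the XOR data, the 2-4-1 network size, the sigmoid activation, the learning rate and the fixed iteration count (used here instead of a target error rate) are all choices made for the example.

```r
set.seed(7)
sigmoid <- function(z) 1 / (1 + exp(-z))

# Hypothetical task: XOR, four input rows and the desired output for each
X <- matrix(c(0, 0,  0, 1,  1, 0,  1, 1), ncol = 2, byrow = TRUE)
target <- matrix(c(0, 1, 1, 0), ncol = 1)

# Step 1: connections start with random weights (2 inputs, 4 hidden nodes, 1 output)
W1 <- matrix(rnorm(2 * 4), nrow = 2); b1 <- rep(0, 4)
W2 <- matrix(rnorm(4 * 1), nrow = 4); b2 <- 0
rate <- 0.5  # size of each adjustment

# Step 2: provide input and let the network generate output
forward <- function() {
  hidden <- sigmoid(sweep(X %*% W1, 2, b1, "+"))
  list(hidden = hidden, output = sigmoid(hidden %*% W2 + b2))
}

initial_error <- mean((forward()$output - target)^2)

for (i in 1:5000) {              # step 5: repeat (fixed count for simplicity)
  net <- forward()
  err <- net$output - target     # step 3: compare generated and desired output
  # Step 4: go from the output back towards the input and adjust the weights
  delta_out <- err * net$output * (1 - net$output)
  delta_hid <- (delta_out %*% t(W2)) * net$hidden * (1 - net$hidden)
  W2 <- W2 - rate * t(net$hidden) %*% delta_out
  b2 <- b2 - rate * sum(delta_out)
  W1 <- W1 - rate * t(X) %*% delta_hid
  b1 <- b1 - rate * colSums(delta_hid)
}

final_error <- mean((forward()$output - target)^2)
print(c(initial = initial_error, final = final_error))
```

Comparing the two printed errors shows whether the repeated small adjustments moved the network towards the desired output; how close it gets depends on the random starting weights.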
  • 23. SUPERVISED LEARNING CHALLENGES • Requires a learning set of inputs and desired outputs • Training data should be balanced • Correlated features cause biases • Outputs should be distributed as evenly as possible
  • 24. SUPERVISED LEARNING [Figure: input and output for training data and test data, each with a different mix of classes A and B]
  • 25. UNSUPERVISED LEARNING • Unsupervised machine learning is the machine learning task of inferring a function to describe hidden structure from "unlabeled" data: a classification or categorization is not included in the observations • Examples • Clustering • Anomaly detection • Neural networks (Self Organizing Map)
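Why balanced training data matters can be shown with a small sketch. The label vectors below are assumptions (a training set dominated by class A and a test set dominated by class B), and the "model" is deliberately the simplest possible one: always predict the most frequent training class.

```r
# Hypothetical labels: training data is mostly A, test data is mostly B
train_labels <- c(rep("A", 8), rep("B", 2))
test_labels  <- c(rep("A", 2), rep("B", 8))

# Share of each class in the training set
train_share <- prop.table(table(train_labels))
print(train_share)

# A naive model that always predicts the majority class seen during training
majority <- names(which.max(train_share))

train_accuracy <- mean(train_labels == majority)
test_accuracy  <- mean(test_labels == majority)
print(c(train = train_accuracy, test = test_accuracy))
```

The naive model looks good on the unbalanced training data and poor on the test data, which is the risk the challenges slide points at.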
  • 26. HIERARCHICAL CLUSTERING Every point starts a cluster Clusters merge as they go up the tree
  • 27. HIERARCHICAL CLUSTERING A: MEAN 2,2 STDEV 2 B: MEAN 6,6 STDEV 2
  • 29. HIERARCHICAL CLUSTERING A: MEAN 2,2 STDEV 2 B: MEAN 6,6 STDEV 2 Original Prediction
  • 30. HIERARCHICAL CLUSTERING A: MEAN 2,2 STDEV 1 B: MEAN 6,6 STDEV 1 Original Prediction
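A sketch of how such an experiment could look in base R, assuming two groups of 50 normally distributed points with the means and standard deviation from the last slide. The group size, the seed and the default complete linkage of `hclust` are assumptions; the transcript does not say which settings were used.

```r
set.seed(123)
n <- 50

# Two hypothetical 2-D groups: A around (2,2) and B around (6,6), both with sd 1
pts <- rbind(
  cbind(rnorm(n, mean = 2, sd = 1), rnorm(n, mean = 2, sd = 1)),
  cbind(rnorm(n, mean = 6, sd = 1), rnorm(n, mean = 6, sd = 1))
)
original <- rep(c("A", "B"), each = n)

# Every point starts as its own cluster; clusters merge going up the tree
tree <- hclust(dist(pts))

# Cut the tree so that two clusters remain
predicted <- cutree(tree, k = 2)

# Cross-tabulate the original groups against the predicted clusters
print(table(original, predicted))
```

Rerunning with `sd = 2` makes the groups overlap more, so more points are expected to end up in the "wrong" cluster.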
  • 31. 1 2 3 History Installation Basics INTRODUCING R
  • 32. R A SHORT HISTORY • Conceived August 1993 An implementation of the S programming language S was conceived in 1976 • Open sourced June 1995 • Main competitors: SPSS and SAS • A lot of (mostly statistical) libraries available CRAN package repository features 10366 available packages.
  • 33. R INSTALLATION • Download and install R https://www.r-project.org/
  • 34. R STUDIO INSTALLATION • Download and install R Studio https://www.rstudio.com/
  • 35. R BASICS • R is a functional programming (FP) language • It provides many tools for the creation and manipulation of functions. • You can do anything with functions that you can do with vectors: you can assign them to variables, store them in lists, pass them as arguments to other functions, create them inside functions, and even return them as the result of a function.
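The things you can do with functions can be sketched in a few lines. The function names below (`square`, `ops`, `make_multiplier`, `triple`) are made up for the illustration.

```r
# Assign a function to a variable
square <- function(x) x^2

# Store functions in a list
ops <- list(square = square, negate = function(x) -x)

# Pass a function as an argument to another function
squares <- sapply(1:4, square)

# Create a function inside a function and return it as the result
make_multiplier <- function(factor) {
  function(x) x * factor
}
triple <- make_multiplier(3)

print(squares)        # 1 4 9 16
print(ops$negate(5))  # -5
print(triple(7))      # 21
```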
  • 36. R BASICS SOME FEATURES • Git integration • Interpreted; does not require compilation Execute a line in your script and look at the result in the console • Has its own markdown variant for documentation Especially useful if you want to have graphs • R Shiny allows you to generate and host scripts / graphs and make them available from a browser
  • 37. R BASICS SOME FEATURES • Code completion • Allows multi-threaded execution • Can be run remotely on an R server • Great at reading / writing datasets For example web site scraping for data • Of course great at statistics • Great at generating plots Especially when using the ggplot2 library
  • 38. R BASICS SOME TIPS TO GET STARTED • ?ggplot • help(package="ggplot2")
  • 39. R DATATYPES THE VECTOR • Vector
      a <- c(1,2,5.3,6,-2,4)                    # numeric vector
      b <- c("one","two","three")               # character vector
      c <- c(TRUE,TRUE,TRUE,FALSE,TRUE,FALSE)   # logical vector
      a <- c(1,2,5.3,6,-2,4)
      b <- a * 2
      [1] 2.0 4.0 10.6 12.0 -4.0 8.0
  • 40. R DATATYPES THE MATRIX. ALL VALUES HAVE THE SAME TYPE AND LENGTH
      # generates 5 x 4 numeric matrix
      y <- matrix(1:20, nrow=5, ncol=4)
      # another example
      cells <- c(1,26,24,68)
      rnames <- c("R1", "R2")
      cnames <- c("C1", "C2")
      mymatrix <- matrix(cells, nrow=2, ncol=2, byrow=TRUE, dimnames=list(rnames, cnames))
      # accessing matrix values
      y[,4]        # 4th column of matrix
      y[3,]        # 3rd row of matrix
      y[2:4,1:3]   # rows 2,3,4 of columns 1,2,3
  • 41. R DATATYPES THE DATA.FRAME. LIKE A MATRIX BUT TYPES AND LENGTHS CAN VARY
      d <- c(1,2,3,4)
      e <- c("red", "white", "red", NA)
      f <- c(TRUE,TRUE,TRUE,FALSE)
      mydata <- data.frame(d,e,f)
      names(mydata) <- c("ID","Color","Passed")   # variable names
      mydata[2:3]               # columns 2 and 3 of data frame
      mydata[c("ID","Color")]   # columns ID and Color from data frame
      mydata$Passed             # variable Passed in the data frame
  • 42. R DATATYPES THE LIST • An ordered collection of objects (components)
      # example of a list with 4 components:
      # a string, a numeric vector, a matrix, and a scalar
      w <- list(name="Maarten", mynumbers=a, mymatrix=y, age=36)
      # example of combining two existing lists (list1, list2) into one list
      v <- c(list1,list2)
  • 43. 1 2 3 Hosting plots Shiny Plot.ly R markdown Web site crawling COOL FEATURES OF R
  • 44. COOL FEATURES OF R SHINY
  • 45. COOL FEATURES OF R SHINY UI Server
  • 46. COOL FEATURES OF R PLOT.LY INTERACTIVE GRAPHS
  • 47. COOL FEATURES OF R PLOT.LY INTERACTIVE GRAPHS
  • 48. COOL FEATURES OF R R MARKDOWN
  • 49. COOL FEATURES OF R R MARKDOWN
  • 50. COOL FEATURES OF R WEB SITE CRAWLING
  • 51. COOL FEATURES OF R WEB SITE CRAWLING • Sector to Industry, Industry to Company
  • 52. COOL FEATURES OF R WEB SITE CRAWLING
  • 53. COOL FEATURES OF R WEB SITE CRAWLING http://chart.finance.yahoo.com/table.csv?s=ABT.AX&a=1&b=28&c=2017&d=2&e=28&f=2017&g=d&ignore=.csv
  • 54. 1 2 3 What does Oracle do with R Using data from an Oracle DB in R Using functions from R in the Oracle DB ORACLE AND R
  • 56. ORACLE R ENTERPRISE USING DATABASE DATA IN R
  • 57. ORACLE R ENTERPRISE USING R SCRIPTS DIRECTLY IN SQL STATEMENTS