Machine Learning in Computer
Security
Presented by :
Kishor Datta Gupta
Computer security
Tasks of cybersecurity:
• Prediction
• Prevention
• Detection
• Response
• Monitoring
Places to do the tasks:
• Network (network traffic analysis and intrusion detection)
• Endpoint (anti-malware)
• Application (WAF or database firewalls)
• User (UBA)
• Process (anti-fraud)
Time to do the tasks:
• In transit, in real time
• At rest
• Historically
What Can Machine Learning Do?
• Regression (or prediction): predicting the next value based on the previous values.
• Classification: separating things into different categories.
• Clustering: similar to classification, but the classes are unknown; things are grouped by their similarity.
• Association rule learning (or recommendation): recommending something based on previous experience.
• Dimensionality reduction (or generalization): finding the common and most important features across multiple examples.
• Generative models: creating something based on prior knowledge of the distribution.
Regression:
Knowledge about the existing data is used to estimate values for new data. Example: house price prediction.
Example in cybersecurity: regression can be applied to fraud detection. The features (e.g., the total amount of suspicious transactions, location, etc.) determine the probability of fraudulent actions.
Regression
Machine learning
• Linear regression
• Polynomial regression
• Ridge regression
• Decision trees
• SVR (Support Vector Regression)
• Random forest
Deep learning
• Artificial Neural Network (ANN)
• Recurrent Neural Network (RNN)
• Neural Turing Machines (NTM)
• Differentiable Neural Computer (DNC)
Linear Regression:
• Linear regression predicts the value of a dependent variable (y) from a given independent variable (x).
• The technique finds a linear relationship between x (input) and y (output), hence the name linear regression.
• y = mx + c, where m is the slope and c is the intercept.
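As a minimal sketch of how m and c can be estimated, the following fits y = mx + c by ordinary least squares for a single input variable. The function name `fit_line` and the sample data are illustrative assumptions, not part of the original slides.

```python
# Ordinary least-squares fit of y = m*x + c for one input variable.
# fit_line and the data below are hypothetical, for illustration only.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    c = mean_y - m * mean_x
    return m, c

# Points chosen to lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
m, c = fit_line(xs, ys)
print(m, c)
```

With noisy real data the recovered m and c would only approximate the underlying relationship.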
Polynomial Regression:
Second-degree polynomial:
y = θ₀ + θ₁x + θ₂x²
General equation of a polynomial regression:
y = θ₀ + θ₁x + θ₂x² + … + θₘxᵐ
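To make the general equation concrete, here is a small sketch that evaluates a polynomial model for given coefficients. The name `predict_poly` and the coefficient values are assumptions for illustration; fitting the coefficients is not shown.

```python
# Evaluate y = θ0 + θ1*x + θ2*x^2 + ... + θm*x^m for known coefficients.
# predict_poly and the coefficients below are hypothetical.
def predict_poly(thetas, x):
    result = 0.0
    # Horner's rule: start from the highest-degree coefficient.
    for theta in reversed(thetas):
        result = result * x + theta
    return result

thetas = [1.0, 2.0, 3.0]  # θ0 = 1, θ1 = 2, θ2 = 3
y = predict_poly(thetas, 2.0)  # 1 + 2*2 + 3*4
print(y)
```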
Decision Tree
• The goal of using a decision tree is to create a model that can be used to predict the class or value of the target variable by learning simple decision rules inferred from prior data (training data).
• To predict a class label for a record, we start from the root of the tree and compare the value of the root attribute with the record's attribute.
• On the basis of the comparison, we follow the branch corresponding to that value and jump to the next node.
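The traversal described above can be sketched as follows. The tree, its attribute names (`protocol`, `failed_logins`), and the labels are hypothetical and hand-written; a real tree would be learned from training data.

```python
# A node is either a leaf label (a string) or a dict that tests one attribute.
# The tree below is a hand-written, hypothetical example.
tree = {
    "attribute": "protocol",
    "branches": {
        "tcp": {
            "attribute": "failed_logins",
            "branches": {"low": "normal", "high": "attack"},
        },
        "udp": "normal",
    },
}

def classify(node, record):
    # Start at the root and follow the branch matching the record's
    # attribute value until a leaf (class label) is reached.
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node

label = classify(tree, {"protocol": "tcp", "failed_logins": "high"})
print(label)
```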
Regression Evaluations
• MAE (Mean Absolute Error) is the average of the absolute differences between the original and predicted values over the data set.
• MSE (Mean Squared Error) is the average of the squared differences between the original and predicted values over the data set.
• RMSE (Root Mean Squared Error) is the square root of MSE.
• R-squared (coefficient of determination) measures how well the predicted values fit the original values. It usually lies between 0 and 1 and is often read as a percentage (it can be negative for a model that fits worse than always predicting the mean). The higher the value, the better the model.
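A minimal sketch of the four metrics, computed directly from their definitions. The function name `regression_metrics` and the sample values are illustrative assumptions.

```python
import math

# Compute MAE, MSE, RMSE and R-squared from their definitions.
# regression_metrics and the sample values are hypothetical.
def regression_metrics(y_true, y_pred):
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}

y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```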
Classification:
Classification refers to a predictive modeling problem where a class label is predicted for a given example of input data.
In cybersecurity, a spam filter that separates spam from other messages can serve as an example.
Classification:
Machine learning
• Logistic Regression (LR)
• K-Nearest Neighbors (K-NN)
• Support Vector Machine (SVM)
• Kernel SVM
• Naive Bayes
• Decision Tree Classification
• Random Forest Classification
Deep learning
• Artificial Neural Network
• Convolutional Neural Networks
Support Vector Machine (SVM):
The objective of the SVM is to find a hyperplane in an N-dimensional space (where N is the number of features) that distinctly classifies the data points.
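Once a hyperplane w·x + b = 0 has been found, a point is classified by the side of the hyperplane it falls on. The sketch below shows only this decision rule, not SVM training; the weights `w`, bias `b`, and the function name are hypothetical.

```python
# Decision rule of a linear classifier: which side of the hyperplane
# w.x + b = 0 a point falls on. Training (finding w and b) is not shown;
# the values below are made up for illustration.
def side_of_hyperplane(w, b, x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

w = [1.0, -1.0]  # hypothetical normal vector of the hyperplane
b = 0.0          # hypothetical offset

print(side_of_hyperplane(w, b, [2.0, 1.0]))
print(side_of_hyperplane(w, b, [1.0, 3.0]))
```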
Naïve Bayes:
It is a probabilistic classifier that
makes classifications using the
Maximum A Posteriori decision rule
in a Bayesian setting.
Naive Bayes classifiers have been
especially popular for text
classification, and are a traditional
solution for problems such as spam
detection.
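A minimal sketch of a Naive Bayes spam filter using the Maximum A Posteriori rule: pick the class with the highest prior times word likelihoods. It assumes a multinomial word model with add-one (Laplace) smoothing; the function names and the four training messages are hypothetical.

```python
import math
from collections import Counter

# Multinomial Naive Bayes with add-one smoothing (an assumed variant).
def train_naive_bayes(messages):
    class_counts = Counter()   # number of messages per class
    word_counts = {}           # per-class word frequencies
    vocab = set()
    for text, label in messages:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict(model, text):
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        # Log prior plus the sum of smoothed log likelihoods.
        score = math.log(count / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        # MAP decision: keep the class with the highest posterior score.
        if score > best_score:
            best_label, best_score = label, score
    return best_label

training = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status and agenda", "ham"),
]
model = train_naive_bayes(training)
print(predict(model, "claim free money"))
print(predict(model, "monday project meeting"))
```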
Artificial Neural
Network:
The core component of ANNs is artificial neurons.
Each neuron receives inputs from several other
neurons, multiplies them by assigned weights, adds
them and passes the sum to one or more neurons.
Some artificial neurons might apply an activation
function to the output before passing it to the next
neuron.
Artificial neural networks are composed of an input
layer, which receives data from outside sources
(data files, images, hardware sensors,
microphone…), one or more hidden layers that
process the data, and an output layer that provides
one or more data points based on the function of the
network.
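The weighted-sum-and-activation behavior of a neuron, and the flow from input layer to hidden layer to output layer, can be sketched as below. The sigmoid activation, the bias terms, the layer sizes, and all weights are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    # A common activation function (assumed here; the slides name none).
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Multiply inputs by weights, add them up, then apply the activation.
    return sigmoid(sum(i * w for i, w in zip(inputs, weights)) + bias)

def forward(inputs, hidden_layer, output_neuron):
    # Input layer -> one hidden layer -> a single output neuron.
    hidden = [neuron(inputs, w, b) for w, b in hidden_layer]
    w_out, b_out = output_neuron
    return neuron(hidden, w_out, b_out)

# Hypothetical weights: two hidden neurons, each given as (weights, bias).
hidden_layer = [([0.5, -0.5], 0.0), ([1.0, 1.0], -1.0)]
output_neuron = ([1.0, -1.0], 0.0)
y = forward([1.0, 0.0], hidden_layer, output_neuron)
print(y)
```

In practice the weights are learned from data rather than set by hand.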
Classification Evaluations
Accuracy
• Accuracy = (TP + TN) / (TP + FP + FN + TN)
• The proportion of true results among the total number of cases examined.
Precision
• What proportion of predicted positives is truly positive?
• Precision = TP / (TP + FP)
Recall
• What proportion of actual positives is correctly classified?
• Recall = TP / (TP + FN)
F1 Score
• Harmonic mean of precision and recall: F1 = 2 × Precision × Recall / (Precision + Recall)
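The four metrics can be computed from confusion-matrix counts as in this short sketch. The function name and the counts are illustrative assumptions.

```python
# Accuracy, precision, recall and F1 from confusion-matrix counts.
# classification_metrics and the counts below are hypothetical.
def classification_metrics(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

scores = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(scores)
```

The sketch assumes non-zero denominators; a detector that never predicts the positive class would need a guard for precision.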
Clustering:
The classes of the data are unknown, and it is not known in advance whether the data can be classified at all. This is unsupervised learning.
Arguably, the best task for clustering is forensic analysis, where the reasons, course, and consequences of an incident are obscure and all activities must be grouped to find anomalies. Malware analysis solutions (i.e., malware protection or secure email gateways) may use it to separate legitimate files from outliers.
Another interesting area where clustering can be applied is user behavior analytics. Here, application users are clustered so that it is possible to see whether they belong to a particular group.
Clustering is usually not applied to a standalone cybersecurity task; it is more often one of the subtasks in a pipeline (e.g., grouping users into separate groups to adjust risk values).
Clustering:
Machine learning
• K-means
• Mixture model (LDA)
• DBSCAN
• Bayesian
• Gaussian Mixture Model
• Agglomerative
• Mean-shift
Deep learning
• Self-organizing Maps (SOM)
• Kohonen Networks
K-Means
Clustering
K-Means finds the best centroids by alternating
between (1) assigning data points to clusters based on
the current centroids (2) choosing centroids (points
which are the center of a cluster) based on the
current assignment of data points to clusters.
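The two alternating steps can be sketched as below. The initial centroids are passed in explicitly to keep the example deterministic (real implementations usually pick them at random); the function name and sample points are assumptions.

```python
# Minimal K-means: alternate between assigning points to the nearest
# centroid and recomputing each centroid as the mean of its cluster.
# Initial centroids are supplied by the caller (an assumption made to
# keep this sketch deterministic).
def kmeans(points, centroids, iterations=10):
    for _ in range(iterations):
        # Step 1: assign each point to the closest current centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c))
                         for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Step 2: move each centroid to the mean of its assigned points.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(dim) / len(cluster) for dim in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, [(1.0, 1.0), (8.0, 8.0)])
print(centroids)
```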
Association Rule Learning
Netflix and SoundCloud recommend films or songs according to your movie or music preferences.
In cybersecurity, this principle can be used primarily for incident response.
If a company faces a wave of incidents and responds to them in various ways, a system can learn which type of response fits a particular incident (e.g., mark it as a false positive, change a risk value, run an investigation).
Risk management solutions can also benefit if they automatically assign risk values to new vulnerabilities or misconfigurations based on their descriptions.
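As a rough illustration of learning a response for an incident type, the sketch below computes the confidence of each rule "incident type → response" from a log of past decisions. The log entries, names, and the choice of confidence as the measure are assumptions; algorithms such as Apriori mine rules over larger item sets.

```python
from collections import Counter

# Confidence of the rule "incident -> response": the share of past
# incidents of that type that received that response.
# The history below is a made-up log for illustration.
def rule_confidence(history):
    pair_counts = Counter(history)
    incident_counts = Counter(incident for incident, _ in history)
    return {
        (incident, response): count / incident_counts[incident]
        for (incident, response), count in pair_counts.items()
    }

history = [
    ("port_scan", "mark_false_positive"),
    ("port_scan", "mark_false_positive"),
    ("port_scan", "investigate"),
    ("malware_alert", "investigate"),
]
confidence = rule_confidence(history)
print(confidence)
```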
Association Rule Learning:
Machine learning
• Apriori
• Eclat
• FP-Growth
Deep learning
• Deep Restricted Boltzmann Machine (RBM)
• Deep Belief Network (DBN)
• Stacked Autoencoder
Generalization:
Dimensionality reduction can help handle data with many features by cutting the unnecessary ones. Like clustering, dimensionality reduction is usually one of the tasks in a more complex model.
As to cybersecurity tasks, dimensionality reduction is common in face detection solutions.
Generalization:
Machine learning
• Principal Component Analysis (PCA)
• Singular Value Decomposition (SVD)
• t-distributed Stochastic Neighbor Embedding (t-SNE)
• Linear Discriminant Analysis (LDA)
• Latent Semantic Analysis (LSA)
• Factor Analysis (FA)
• Independent Component Analysis (ICA)
• Non-negative Matrix Factorization (NMF)
Deep learning
• Autoencoder
Generative models:
Generative models are designed to simulate the actual data (not decisions) based on the previous decisions.
A simple task in offensive cybersecurity is to generate a list of input parameters to test a particular application for injection vulnerabilities.
Alternatively, consider a vulnerability scanning tool for web applications. One of its modules tests files for unauthorized access. These tests can mutate existing filenames to identify new ones.
For example, if a crawler detects a file called login.php, it is worth checking for backups or test copies by trying names like login_1.php, login_backup.php, and login.php.2017. Generative models are good at this.
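The filename example can be illustrated with a simple rule-based mutator. This is a hand-written baseline, not a learned generative model; a generative model would learn such patterns from observed filenames. The function name, suffixes, and extra extensions are assumptions.

```python
# Rule-based filename mutation: a hand-written baseline, NOT a learned
# generative model. The suffix and extension lists are hypothetical.
def mutate_filename(name,
                    suffixes=("_1", "_backup", "_old"),
                    extensions=(".bak", ".2017")):
    stem, dot, ext = name.rpartition(".")
    if not dot:  # the name has no extension
        stem, ext = name, ""
    candidates = []
    # Insert a suffix before the extension: login.php -> login_1.php
    for s in suffixes:
        candidates.append(f"{stem}{s}.{ext}" if ext else f"{stem}{s}")
    # Append an extra extension: login.php -> login.php.2017
    for e in extensions:
        candidates.append(name + e)
    return candidates

candidates = mutate_filename("login.php")
print(candidates)
```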
Generative models:
Machine learning
• Markov Chains
• Genetic Algorithm
Deep learning
• Variational Autoencoders
• Generative Adversarial Networks (GANs)
• Boltzmann Machines
Machine learning for Network Protection
ML in network security means new solutions aimed at in-depth analysis of all the traffic at each layer to detect attacks and anomalies.
How can ML help here?
• Regression to predict the network packet parameters and compare them with the normal ones;
• Classification to identify different classes of network attacks such as scanning and spoofing;
• Clustering for forensic analysis.
Machine learning for Endpoint Protection
The new generation of antivirus is Endpoint Detection and Response. It is better to learn features from executable files or from process behavior. Data may differ depending on the type of endpoint (e.g., workstation, server, container, cloud instance, mobile, PLC, IoT device), but the tasks are common.
How can ML help here?
• Regression to predict the next system call of an executable process and compare it with the real ones;
• Classification to divide programs into categories such as malware, spyware, and ransomware;
• Clustering for malware protection on secure email gateways (e.g., to separate legitimate file attachments from outliers).
Machine learning for Application Security
Application security can differ. There are web applications, databases, ERP systems, SaaS applications, microservices, etc.
How can ML help here?
• Regression to detect anomalies in HTTP requests (for example, XXE and SSRF attacks and auth bypass);
• Classification to detect known types of attacks like injections (SQLi, XSS, RCE, etc.);
• Clustering user activity to detect DDoS attacks and mass exploitation.
Machine learning for User Behavior
There are domain users, application users, SaaS users, social networks, messengers, and other accounts that should be monitored.
User behavior is one of the more complex layers and is an unsupervised learning problem. As a rule, there is no labelled dataset, nor any idea of what to look for.
How can ML help here?
• Regression to detect anomalies in user actions (e.g., a login at an unusual time);
• Classification to group different users for peer-group analysis;
• Clustering to separate groups of users and detect outliers.
Machine learning for Process Behavior
It is necessary to know a business process in order to find something anomalous.
Business processes can differ significantly. You can look for fraud in banking and retail systems, or on a plant floor in manufacturing.
How can ML help here?
• Regression to predict the next user action and detect outliers such as credit card fraud;
• Classification to detect known types of fraud;
• Clustering to compare business processes and detect outliers.
References
• https://towardsdatascience.com/machine-learning-for-cybersecurity-101-7822b802790b
• AI for Cybersecurity, by Cylance (2017): short but good introduction to the basics of ML for cybersecurity, with good practical examples.
• Machine Learning and Security, by O'Reilly (January 2018): best book so far on this topic, but with very few examples of deep learning; mostly general machine learning.
• Machine Learning for Penetration Testers, by Packt (July 2018): less fundamental than the previous one, but has more deep learning approaches.

Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 

Último (20)

(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 

Machine learning in computer security

  • 1. Machine Learning in Computer Security. Presented by: Kishor Datta Gupta
  • 2. Computer security. Tasks of cyber security: prediction, prevention, detection, response, monitoring. Places to do the tasks: network (network traffic analysis and intrusion detection), endpoint (anti-malware), application (WAF or database firewalls), user (UBA), process (anti-fraud). Time to do the tasks: in transit in real time, at rest, historically.
  • 3. What Can Machine Learning Do? Regression (or prediction): predicting the next value based on the previous values. Classification: separating things into different categories. Clustering: similar to classification, but the classes are unknown; things are grouped by their similarity. Association rule learning (or recommendation): recommending something based on previous experience. Dimensionality reduction (or generalization): searching for the common and most important features in multiple examples. Generative models: creating something based on previous knowledge of the distribution.
  • 4. Regression: Knowledge about existing data is used to estimate values for new data. Example: house price prediction. Example in cyber security: regression can be applied to fraud detection, where the features (e.g., the total amount of suspicious transactions, location, etc.) determine a probability of fraudulent actions.
  • 5. Regression. Machine learning: linear regression, polynomial regression, ridge regression, decision trees, SVR (Support Vector Regression), random forest. Deep learning: Artificial Neural Network (ANN), Recurrent Neural Network (RNN), Neural Turing Machines (NTM), Differentiable Neural Computer (DNC).
  • 6. Linear Regression: Linear regression predicts the value of a dependent variable (y) from a given independent variable (x). The technique finds a linear relationship between x (input) and y (output), hence the name. The fitted line has the form y = mx + c, where m is the slope and c is the intercept.
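The line y = mx + c from the slide above can be fitted with ordinary least squares. The following is a minimal sketch, assuming a single input variable; the function name `fit_line` and the sample data are illustrative and do not come from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + c for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    c = mean_y - m * mean_x
    return m, c


# Illustrative data that lies exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
m, c = fit_line(xs, ys)
print(m, c)
```

Once m and c are known, a prediction for a new x is simply `m * x + c`.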
  • 7. Polynomial Regression: A degree-2 polynomial is y = θ₀ + θ₁x + θ₂x². The general equation of a degree-m polynomial regression is y = θ₀ + θ₁x + θ₂x² + … + θₘxᵐ.
  • 8. Decision Tree: The goal of a decision tree is to create a model that can be used to predict the class or value of the target variable by learning simple decision rules inferred from prior data (training data). To predict a class label for a record, we start from the root of the tree and compare the value of the root attribute with the record's attribute. On the basis of that comparison, we follow the branch corresponding to the value and jump to the next node.
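The root-to-leaf traversal described above can be sketched as follows. The tree here is hand-written for illustration (a real tree would be learned from training data), and the attribute names `failed_logins` and `new_device` are hypothetical.

```python
# Hypothetical, hand-written tree: internal nodes test one attribute against
# a threshold; leaves carry a class label.
tree = {
    "attribute": "failed_logins",
    "threshold": 5,
    "left": {"label": "benign"},
    "right": {
        "attribute": "new_device",
        "threshold": 0.5,
        "left": {"label": "benign"},
        "right": {"label": "suspicious"},
    },
}


def predict(node, record):
    """Walk from the root to a leaf, following one branch per comparison."""
    while "label" not in node:
        value = record[node["attribute"]]
        # Values at or below the threshold go left, larger values go right.
        node = node["left"] if value <= node["threshold"] else node["right"]
    return node["label"]


print(predict(tree, {"failed_logins": 9, "new_device": 1}))
```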
  • 9. Regression Evaluations: MAE (Mean Absolute Error) is the average of the absolute differences between the original and predicted values over the data set. MSE (Mean Squared Error) is the average of the squared differences between the original and predicted values over the data set. RMSE (Root Mean Squared Error) is the square root of MSE. R-squared (coefficient of determination) measures how well the predicted values fit the original values. It typically lies between 0 and 1 and can be read as a percentage; the higher the value, the better the model.
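The four regression metrics can be computed directly from their definitions. This is a minimal sketch using only the standard library; the function name `regression_metrics` and the sample values are illustrative.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # R-squared compares the residual error with the variance of the true values.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(metrics)
```

Note that R-squared is undefined when all true values are equal (the denominator is zero), and it can be negative for a model that fits worse than predicting the mean.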
  • 10. Classification: Classification refers to a predictive modeling problem where a class label is predicted for a given example of input data. In cybersecurity, a spam filter that separates spam from other messages is an example.
  • 11. Classification. Machine learning: Logistic Regression (LR), K-Nearest Neighbors (K-NN), Support Vector Machine (SVM), Kernel SVM, Naive Bayes, Decision Tree Classification, Random Forest Classification. Deep learning: Artificial Neural Network, Convolutional Neural Networks.
  • 12. Support Vector Machine (SVM): The objective of the SVM is to find a hyperplane in an N-dimensional space (where N is the number of features) that distinctly classifies the data points.
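Once a separating hyperplane w·x + b = 0 is known, classifying a point reduces to checking which side of it the point falls on. The sketch below shows only that decision step, not SVM training; the weights `w` and bias `b` are assumed values chosen for illustration.

```python
def side_of_hyperplane(w, b, x):
    """Return +1 or -1 depending on which side of w.x + b = 0 the point x lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Assumed hyperplane in a 2-feature space: x1 - x2 = 0.
w, b = [1.0, -1.0], 0.0
print(side_of_hyperplane(w, b, [3.0, 1.0]))
print(side_of_hyperplane(w, b, [1.0, 3.0]))
```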
  • 13. Naïve Bayes: It is a probabilistic classifier that makes classifications using the Maximum A Posteriori decision rule in a Bayesian setting. Naive Bayes classifiers have been especially popular for text classification, and are a traditional solution for problems such as spam detection.
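A toy spam filter illustrates the Maximum A Posteriori rule: pick the label with the highest prior times word likelihoods. This is a minimal sketch assuming a bag-of-words model with add-one (Laplace) smoothing; the training messages and function names are made up for illustration.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (label, text). Returns label counts, word counts, vocabulary."""
    label_counts = Counter()
    word_counts = {}
    vocab = set()
    for label, text in messages:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log-posterior (MAP decision rule)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior plus the sum of smoothed log likelihoods of each known word.
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training are ignored
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


training = [
    ("spam", "win free money now"),
    ("spam", "free prize claim now"),
    ("ham", "meeting agenda for monday"),
    ("ham", "project status and agenda"),
]
model = train(training)
print(classify(model, "free money"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied.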
  • 14. Artificial Neural Network: The core component of ANNs is the artificial neuron. Each neuron receives inputs from several other neurons, multiplies them by assigned weights, adds them, and passes the sum to one or more neurons. Some artificial neurons apply an activation function to the output before passing it to the next neuron. Artificial neural networks are composed of an input layer, which receives data from outside sources (data files, images, hardware sensors, microphone…), one or more hidden layers that process the data, and an output layer that provides one or more data points based on the function of the network.
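The weighted-sum-then-activation behaviour of a neuron, and the input, hidden, and output layers, can be sketched as a forward pass. The weights and biases below are arbitrary assumed values (a trained network would learn them), and the sigmoid is one common choice of activation function.

```python
import math


def sigmoid(z):
    """A common activation function that squashes any value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """Multiply inputs by weights, add them up, then apply the activation."""
    return sigmoid(sum(i * w for i, w in zip(inputs, weights)) + bias)


def layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def forward(inputs):
    # Input layer (2 values) -> hidden layer (2 neurons) -> output layer (1 neuron).
    hidden = layer(inputs, [[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0])
    output = layer(hidden, [[1.0, -1.0]], [0.0])
    return output[0]


print(forward([1.0, 2.0]))
```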
  • 15. Classification Evaluations. Accuracy: the proportion of true results among the total number of cases examined; Accuracy = (TP + TN) / (TP + FP + FN + TN). Precision: what proportion of predicted positives is truly positive? Precision = TP / (TP + FP). Recall: what proportion of actual positives is correctly classified? Recall = TP / (TP + FN). F1 score: the harmonic mean of precision and recall; F1 = 2 · Precision · Recall / (Precision + Recall).
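These formulas translate directly into code from the four confusion-matrix counts. The sketch below is illustrative; the counts are made-up numbers, and returning 0.0 when a denominator is zero is an assumed convention rather than something the slides specify.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Assumed convention: a ratio with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


scores = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(scores)
```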
  • 16. Clustering: The classes of the data are unknown, and it is not even known whether the data can be classified. This is unsupervised learning. Arguably, the best task for clustering is forensic analysis, where the reasons, course, and consequences of an incident are obscure and all activities must be grouped to find anomalies. Malware analysis solutions (i.e., malware protection or secure email gateways) may use it to separate legitimate files from outliers. Another interesting area for clustering is user behavior analytics: application users are clustered so that it is possible to see whether they belong to a particular group. Clustering is usually not applied to a particular cybersecurity task on its own; it is more often one of the subtasks in a pipeline (e.g., grouping users into separate groups to adjust risk values).
  • 17. Clustering. Machine learning: K-means, mixture model (LDA), DBSCAN, Bayesian, Gaussian mixture model, agglomerative, mean-shift. Deep learning: Self-Organizing Maps (SOM), Kohonen networks.
  • 18. K-Means Clustering: K-means finds the best centroids by alternating between (1) assigning data points to clusters based on the current centroids and (2) choosing centroids (points that are the center of a cluster) based on the current assignment of data points to clusters.
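The two alternating steps can be sketched in plain Python. This is a minimal illustration, assuming the caller supplies the initial centroids and that squared Euclidean distance is used; the points are made-up values forming two obvious groups.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate assignment and centroid update until the centroids stop moving."""
    clusters = []
    for _ in range(iterations):
        # Step 1: assign every point to its nearest current centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Step 2: move each centroid to the mean of the points assigned to it.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(tuple(sum(dim) / len(cluster) for dim in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```

The result depends on the initial centroids, which is why practical implementations usually try several initializations.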
  • 19. Association Rule Learning: Netflix and SoundCloud recommend films or songs according to your movie or music preferences. In cybersecurity, this principle can be used primarily for incident response. If a company faces a wave of incidents and offers various types of responses, a system learns a type of response for a particular incident (e.g., mark it as a false positive, change a risk value, run the investigation). Risk management solutions can also benefit if they automatically assign risk values to new vulnerabilities or misconfigurations based on their description.
  • 20. Association Rule Learning. Machine learning: Apriori, Eclat, FP-Growth. Deep learning: Deep Restricted Boltzmann Machine (RBM), Deep Belief Network (DBN), Stacked Autoencoder.
  • 21. Generalization: When data has many potential features, dimensionality reduction can help handle it by cutting unnecessary features. Like clustering, dimensionality reduction is usually one of the tasks in a more complex model. In cybersecurity, dimensionality reduction is common in face detection solutions.
  • 22. Generalization. Machine learning: Principal Component Analysis (PCA), Singular Value Decomposition (SVD), t-distributed Stochastic Neighbor Embedding (t-SNE), Linear Discriminant Analysis (LDA), Latent Semantic Analysis (LSA), Factor Analysis (FA), Independent Component Analysis (ICA), Non-negative Matrix Factorization (NMF). Deep learning: Autoencoder.
  • 23. Generative models: Generative models are designed to simulate the actual data (not decisions) based on the previous decisions. A simple task in offensive cybersecurity is to generate a list of input parameters to test a particular application for injection vulnerabilities. Alternatively, consider a vulnerability scanning tool for web applications in which one module tests files for unauthorized access. These tests can mutate existing filenames to identify new ones. For example, if a crawler detects a file called login.php, it is worth checking for backup or test copies by trying names like login_1.php, login_backup.php, and login.php.2017. Generative models are good at this.
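The filename-mutation idea can be illustrated with a hand-written rule set. Note that this sketch is rule-based mutation, not a learned generative model: a generative model would learn such patterns from observed filenames. The `_old` suffix and `.bak` extension are assumed additions beyond the examples on the slide.

```python
import os


def candidate_names(filename, suffixes=("_1", "_backup", "_old"),
                    extensions=(".bak", ".2017")):
    """Mutate a discovered filename into likely backup or test copies."""
    stem, ext = os.path.splitext(filename)
    # Insert a suffix before the extension, e.g. login.php -> login_backup.php.
    candidates = [stem + suffix + ext for suffix in suffixes]
    # Append an extra extension, e.g. login.php -> login.php.2017.
    candidates += [filename + extra for extra in extensions]
    return candidates


names = candidate_names("login.php")
print(names)
```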
  • 24. Generative models. Machine learning: Markov chains, genetic algorithm. Deep learning: Variational Autoencoders, Generative Adversarial Networks (GANs), Boltzmann Machines.
  • 25. Machine Learning for Network Protection: ML in network security means new solutions aimed at in-depth analysis of all traffic at each layer to detect attacks and anomalies. How can ML help here? Regression to predict network packet parameters and compare them with the normal ones; classification to identify different classes of network attacks such as scanning and spoofing; clustering for forensic analysis.
  • 26. Machine Learning for Endpoint Protection: The new generation of antivirus is Endpoint Detection and Response. It is better to learn features in executable files or in process behavior. Data may differ depending on the type of endpoint (e.g., workstation, server, container, cloud instance, mobile, PLC, IoT device), but the tasks are common. How can ML help here? Regression to predict the next system call of an executable process and compare it with the real ones; classification to divide programs into categories such as malware, spyware, and ransomware; clustering for malware protection on secure email gateways (e.g., to separate legitimate file attachments from outliers).
  • 27. Machine Learning for Application Security: Application security can differ. There are web applications, databases, ERP systems, SaaS applications, microservices, etc. How can ML help here? Regression to detect anomalies in HTTP requests (for example, XXE and SSRF attacks and auth bypass); classification to detect known types of attacks like injections (SQLi, XSS, RCE, etc.); clustering of user activity to detect DDoS attacks and mass exploitation.
  • 28. Machine Learning for User Behavior: There are domain users, application users, SaaS users, social networks, messengers, and other accounts that should be monitored. User behavior is one of the more complex layers and an unsupervised learning problem. As a rule, there is no labelled dataset and no clear idea of what to look for. How can ML help here? Regression to detect anomalies in user actions (e.g., login at an unusual time); classification to group different users for peer-group analysis; clustering to separate groups of users and detect outliers.
  • 29. Machine Learning for Process Behavior: It is necessary to know a business process in order to find something anomalous. Business processes can differ significantly. You can look for fraud in banking and retail systems, or on a plant floor in manufacturing. How can ML help here? Regression to predict the next user action and detect outliers such as credit card fraud; classification to detect known types of fraud; clustering to compare business processes and detect outliers.
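The "login at an unusual time" example can be approximated with a simple statistical baseline. The sketch below uses a z-score against a user's login history rather than a regression model, so it is a swapped-in simplification; it also assumes login hours are plain numbers and ignores the wrap-around at midnight. The threshold of 3 standard deviations and the sample history are illustrative.

```python
import statistics


def unusual_logins(history_hours, new_hours, threshold=3.0):
    """Flag new login hours that lie far from the user's historical mean."""
    mean = statistics.mean(history_hours)
    stdev = statistics.stdev(history_hours)
    # A login is unusual if it is more than `threshold` standard deviations away.
    return [h for h in new_hours if abs(h - mean) / stdev > threshold]


history = [9, 9, 10, 8, 9, 10, 9, 8, 10, 9]  # made-up office-hours logins
flagged = unusual_logins(history, [9, 3, 10])
print(flagged)
```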
  • 30. References: https://towardsdatascience.com/machine-learning-for-cybersecurity-101-7822b802790b; AI for Cybersecurity by Cylance (2017): short but good introduction to the basics of ML for cybersecurity, with good practical examples; Machine Learning and Security by O'Reilly (January 2018): best book so far on this topic, but with very few examples of deep learning and mostly general machine learning; Machine Learning for Penetration Testers by Packt (July 2018): less fundamental than the previous one, but with more deep learning approaches.