Dynamic feature selection for spam detection
Rivika Jain, under the guidance of Dr Faraz Ahmad
19cs26
Introduction
People now widely use social networking sites to communicate with other users and to
share information across the world. Twitter is one of several social networking sites
that are growing daily.
Spam attacks on social media are increasing, and many users are exposed to them.
Spammers aim to collect personal information and to attack the user profiles they
identify. These attackers share spam content containing malware links and expect users
to install the malware on their computers.
Effective systems for detecting spam accounts and spam content are therefore needed, so
that social networks can be cleaned and users can have a better experience.
Proposal
In this study, we group similar Twitter users and introduce a dynamic
feature selection technique that uses a different feature set for each user
group instead of a single static feature set. Machine learning algorithms are
then applied to classify spam users on Twitter.
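The slides do not say how features are chosen for each group, so the following is only a minimal sketch of the idea of per-group selection. It assumes users are already assigned to groups and labelled, and it ranks features with a simple mean-difference score (spam mean minus non-spam mean, divided by the overall standard deviation). The scoring rule, the function names, and the feature names `links_per_tweet` and `avg_hashtags` are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def score_feature(values, labels):
    """Separation score: |mean(spam) - mean(non-spam)| / overall std.

    This scoring rule is an assumption; the slides do not name one.
    """
    spam = [v for v, y in zip(values, labels) if y == 1]
    ham = [v for v, y in zip(values, labels) if y == 0]
    spread = pstdev(values)
    if not spam or not ham or spread == 0:
        return 0.0
    return abs(mean(spam) - mean(ham)) / spread


def select_features_per_group(users, feature_names, top_k=2):
    """Return {group_id: [feature names]} with top_k features chosen per group."""
    by_group = defaultdict(list)
    for user in users:
        by_group[user["group"]].append(user)

    selected = {}
    for group_id, members in by_group.items():
        labels = [u["is_spam"] for u in members]
        scores = {
            name: score_feature([u["features"][name] for u in members], labels)
            for name in feature_names
        }
        ranked = sorted(feature_names, key=lambda n: scores[n], reverse=True)
        selected[group_id] = ranked[:top_k]
    return selected


# Hypothetical toy data: links separate the classes in group A, hashtags in group B.
FEATURES = ["links_per_tweet", "avg_hashtags"]
USERS = [
    {"group": "A", "is_spam": 1, "features": {"links_per_tweet": 0.9, "avg_hashtags": 1.0}},
    {"group": "A", "is_spam": 1, "features": {"links_per_tweet": 0.8, "avg_hashtags": 2.0}},
    {"group": "A", "is_spam": 0, "features": {"links_per_tweet": 0.1, "avg_hashtags": 1.0}},
    {"group": "A", "is_spam": 0, "features": {"links_per_tweet": 0.2, "avg_hashtags": 2.0}},
    {"group": "B", "is_spam": 1, "features": {"links_per_tweet": 0.5, "avg_hashtags": 5.0}},
    {"group": "B", "is_spam": 1, "features": {"links_per_tweet": 0.4, "avg_hashtags": 6.0}},
    {"group": "B", "is_spam": 0, "features": {"links_per_tweet": 0.5, "avg_hashtags": 0.0}},
    {"group": "B", "is_spam": 0, "features": {"links_per_tweet": 0.4, "avg_hashtags": 1.0}},
]

if __name__ == "__main__":
    print(select_features_per_group(USERS, FEATURES, top_k=1))
```

With a static feature set both groups would use the same columns; here each group keeps only the feature that separates spam from non-spam within that group.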
Methodology
1. Architecture
● Data collection
● Feature extraction
● Machine learning
Data extraction
A crawler has been developed using the Twitter REST and Streaming APIs to
collect user information. This software collects user data without being bound
by the restrictions of the API provided by Twitter, which are unsuitable here
because they prevent intensive data collection. Before data collection
started, users were randomly selected from among those who had posted about
Twitter’s agenda items.
2. Feature extraction
A property pool is created by performing calculations on raw user data.
In choosing the features for the property pool, attention was paid to the
features most commonly used in the literature that are not costly to compute.
In this way, we aim to establish a high-accuracy decision mechanism while
keeping the time and resources needed to collect and extract features to a
minimum.
Feature Set
User-based features            Content-based features
User age                       Link count per tweet
Total tweet count              Average hashtag count
Total followers count          Average mention count
Total followings count         Average favorite count
Tweet count per age            Average retweet count
Follower count per age         Retweeted rate
Followings count per age       Followers count per tweet
Total common followers count   Average spam
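As a rough illustration of how a few of these features might be derived from raw user data, the sketch below computes some per-age and per-tweet ratios. It assumes that "user age" means account age in days and that links, hashtags, and mentions can be counted with simple substring matches. The `RawUser` fields and the feature keys are hypothetical and are not taken from the slides or from the Twitter API.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class RawUser:
    # Hypothetical raw record; field names are illustrative only.
    created: date
    tweet_count: int
    followers: int
    followings: int
    tweets: list = field(default_factory=list)  # sampled tweet texts


def extract_features(user, today):
    """Compute a subset of the user-based and content-based features."""
    # Assumption: "age" is the account age in days (at least 1 to avoid dividing by zero).
    age_days = max((today - user.created).days, 1)
    sampled = max(len(user.tweets), 1)

    # Crude substring counts; a real extractor would parse tweet entities.
    links = sum(t.count("http://") + t.count("https://") for t in user.tweets)
    hashtags = sum(t.count("#") for t in user.tweets)
    mentions = sum(t.count("@") for t in user.tweets)

    return {
        "user_age": age_days,
        "tweets_per_age": user.tweet_count / age_days,
        "followers_per_age": user.followers / age_days,
        "followings_per_age": user.followings / age_days,
        "links_per_tweet": links / sampled,
        "avg_hashtags": hashtags / sampled,
        "avg_mentions": mentions / sampled,
    }


EXAMPLE_USER = RawUser(
    created=date(2024, 1, 1),
    tweet_count=50,
    followers=20,
    followings=200,
    tweets=["Win now http://a.example #free #prize", "hello @bob"],
)

if __name__ == "__main__":
    print(extract_features(EXAMPLE_USER, today=date(2024, 1, 11)))
```

All of these values come from counts that are cheap to obtain, which matches the stated aim of keeping feature extraction costs low.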
Machine Learning
The machine learning phase is the last phase, in which a decision is made as to
whether the target user is a spam account. In this phase, users who have
already been grouped are classified with various classification algorithms,
using the features dynamically determined for the group they belong to.
In this study we use the k-NN and SVM classification algorithms.
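The classification step can be sketched with a small k-NN classifier that measures distance only over the features selected for the user's group. This is a plain-Python illustration under assumptions: the group label, the selected feature list, the training data, and the value of k are all hypothetical, features are left unscaled, and the SVM variant is not shown.

```python
import math
from collections import Counter


def knn_predict(train, query, feature_names, k=3):
    """Majority vote among the k nearest training users.

    train is a list of (features_dict, label) pairs; distance is Euclidean
    and uses only the features named in feature_names.
    """
    def distance(a, b):
        return math.sqrt(sum((a[f] - b[f]) ** 2 for f in feature_names))

    nearest = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def classify_user(user, group_features, group_training, k=3):
    """Classify a user with the feature subset and training data of its own group."""
    group = user["group"]
    return knn_predict(group_training[group], user["features"], group_features[group], k)


# Hypothetical per-group configuration: group A uses only links_per_tweet.
GROUP_FEATURES = {"A": ["links_per_tweet"]}
GROUP_TRAINING = {
    "A": [
        ({"links_per_tweet": 0.9, "avg_hashtags": 1.0}, 1),  # 1 = spam
        ({"links_per_tweet": 0.8, "avg_hashtags": 2.0}, 1),
        ({"links_per_tweet": 0.1, "avg_hashtags": 1.0}, 0),  # 0 = not spam
        ({"links_per_tweet": 0.2, "avg_hashtags": 2.0}, 0),
    ]
}

if __name__ == "__main__":
    suspect = {"group": "A", "features": {"links_per_tweet": 0.85, "avg_hashtags": 0.0}}
    print(classify_user(suspect, GROUP_FEATURES, GROUP_TRAINING))
```

Because the distance ignores features outside the group's selected list, the `avg_hashtags` value of the query user has no influence on the decision for group A.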