Dynamic feature selection for spam detection (1).pptx

RivikaJain

Dynamic Feature selection for spam detection

Technology

Introduction
People now routinely use social networking sites to communicate with other users and to
share information across the world. Twitter is one of several social networking sites
that are growing daily.
Spam attacks on social media are increasing, and many social media users are exposed to
them. Spammers aim to collect personal information and to attack the user profiles they
identify. These attackers share spam content containing malware links and expect users
to install the software on their computers.
Effective systems for detecting spam accounts and spam content are therefore needed, so
that social networks can be cleaned up and users can have a better experience.

Proposal
In this study, we group similar Twitter users and introduce a dynamic
feature selection technique that uses a different feature set for each user
group instead of a single static feature set. We then apply machine learning
algorithms to classify spam users on Twitter.
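
The slides do not say how features are scored within a group, so the following is only a minimal sketch of per-group feature selection under stated assumptions: groups are already assigned, and each feature is ranked by a simple separation score (the gap between the spam and non-spam means divided by the overall spread). The function names, field names, and toy data are all hypothetical.

```python
from statistics import mean, pstdev


def score_feature(values, labels):
    """Assumed criterion: |mean(spam) - mean(non-spam)| / overall std deviation."""
    spam = [v for v, y in zip(values, labels) if y == 1]
    ham = [v for v, y in zip(values, labels) if y == 0]
    spread = pstdev(values)
    if not spam or not ham or spread == 0:
        return 0.0
    return abs(mean(spam) - mean(ham)) / spread


def select_features_per_group(users, feature_names, top_k=2):
    """Return {group: [feature names]}, chosen separately for each user group."""
    groups = {}
    for user in users:
        groups.setdefault(user["group"], []).append(user)

    selected = {}
    for group, members in groups.items():
        labels = [u["spam"] for u in members]
        scores = {
            name: score_feature([u[name] for u in members], labels)
            for name in feature_names
        }
        ranked = sorted(feature_names, key=lambda n: scores[n], reverse=True)
        selected[group] = ranked[:top_k]
    return selected


# Toy, made-up data: two hypothetical groups with different telling features.
users = [
    {"group": "new", "spam": 1, "link_rate": 0.9, "hashtags": 1.0, "followers": 10},
    {"group": "new", "spam": 1, "link_rate": 0.8, "hashtags": 1.2, "followers": 12},
    {"group": "new", "spam": 0, "link_rate": 0.1, "hashtags": 1.1, "followers": 11},
    {"group": "new", "spam": 0, "link_rate": 0.2, "hashtags": 0.9, "followers": 9},
    {"group": "old", "spam": 1, "link_rate": 0.5, "hashtags": 6.0, "followers": 500},
    {"group": "old", "spam": 1, "link_rate": 0.4, "hashtags": 7.0, "followers": 480},
    {"group": "old", "spam": 0, "link_rate": 0.5, "hashtags": 1.0, "followers": 510},
    {"group": "old", "spam": 0, "link_rate": 0.4, "hashtags": 2.0, "followers": 490},
]
FEATURES = ["link_rate", "hashtags", "followers"]
selected = select_features_per_group(users, FEATURES, top_k=1)
print(selected)
```

With this toy data the two groups end up with different features, which is the point of a dynamic feature set as opposed to a static one.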

Methodology
1. Architecture
● Data collection
● Feature extraction
● Machine learning

Data extraction
A crawler was developed using the Twitter REST and Streaming APIs to
collect user information. The crawler makes it possible to collect user data
without being tied to the API provided by Twitter, because the restrictions
that API imposes on users prevent intensive data collection. Before data
collection started, users were selected at random from among those who had
posted about Twitter’s agenda items.

2. Feature extraction
A property pool is created by computing features from raw user data.
When choosing the features for the property pool, priority was given to those
most commonly used in the literature that are not costly to compute.
In this way, we aim to build a decision mechanism with high accuracy while
keeping the time and resources needed to collect and extract features to a
minimum.

Feature Set
User-based features            | Content-based features
User age                       | Link count per tweet
Total tweet count              | Average hashtag count
Total followers count          | Average mention count
Total followings count         | Average favorite count
Tweet count per age            | Average retweet count
Follower count per age         | Retweeted rate
Followings count per age       | Followers count per tweet
Total common followers count   | Average spam
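
As an illustration of how a few of these features could be derived from raw user data, here is a minimal sketch. It assumes account age is measured in days and that the raw profile and tweet records carry the hypothetical fields shown; neither detail is stated in the slides.

```python
from datetime import date


def user_features(profile, tweets, today):
    """Compute a subset of the feature set from one user's raw data."""
    # Assumption: "age" is the account age in days; clamp to 1 to avoid dividing by zero.
    age_days = max((today - profile["created"]).days, 1)
    n_tweets = max(len(tweets), 1)
    return {
        # User-based features
        "user_age": age_days,
        "total_tweet_count": profile["tweet_count"],
        "tweet_count_per_age": profile["tweet_count"] / age_days,
        "follower_count_per_age": profile["followers"] / age_days,
        "followings_count_per_age": profile["followings"] / age_days,
        # Content-based features, averaged over the collected tweets
        "link_count_per_tweet": sum(t["links"] for t in tweets) / n_tweets,
        "average_hashtag_count": sum(t["hashtags"] for t in tweets) / n_tweets,
        "average_mention_count": sum(t["mentions"] for t in tweets) / n_tweets,
    }


# Made-up example record (all field names are hypothetical).
profile = {"created": date(2024, 1, 1), "tweet_count": 50, "followers": 20, "followings": 200}
tweets = [
    {"links": 1, "hashtags": 0, "mentions": 2},
    {"links": 2, "hashtags": 4, "mentions": 0},
]
feats = user_features(profile, tweets, today=date(2024, 1, 11))
print(feats)
```

These ratios need only profile counters and a sample of tweets, which fits the stated goal of keeping feature extraction cheap.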

Machine Learning
Machine learning is the last phase, in which a decision is made as to whether
the target user is a spam account. In this phase, users who have already been
grouped are classified with various classification algorithms, using the
properties dynamically determined for the group they belong to.
In this study we use the k-NN and SVM classification algorithms.
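
A minimal sketch of this phase, using a plain k-NN classifier written with the standard library only. It assumes each group already has labeled training users and a list of features chosen for it. All names and data are hypothetical, feature scaling is omitted for brevity, and the SVM variant is not shown.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, features, k=3):
    """Majority vote among the k nearest training users, using only `features`."""
    q = [query[f] for f in features]
    nearest = sorted(train, key=lambda u: dist([u[f] for f in features], q))[:k]
    votes = Counter(u["spam"] for u in nearest)
    return votes.most_common(1)[0][0]


def classify(user, train_by_group, features_by_group, k=3):
    """Classify a user with the training data and feature set of the user's own group."""
    group = user["group"]
    return knn_predict(train_by_group[group], user, features_by_group[group], k)


# Made-up training data; 1 = spam, 0 = not spam.
train_by_group = {
    "new": [
        {"link_rate": 0.90, "hashtags": 1.0, "spam": 1},
        {"link_rate": 0.80, "hashtags": 1.2, "spam": 1},
        {"link_rate": 0.85, "hashtags": 1.0, "spam": 1},
        {"link_rate": 0.10, "hashtags": 1.1, "spam": 0},
        {"link_rate": 0.20, "hashtags": 0.9, "spam": 0},
    ],
    "old": [
        {"link_rate": 0.5, "hashtags": 6.0, "spam": 1},
        {"link_rate": 0.4, "hashtags": 7.0, "spam": 1},
        {"link_rate": 0.5, "hashtags": 1.0, "spam": 0},
        {"link_rate": 0.4, "hashtags": 2.0, "spam": 0},
        {"link_rate": 0.6, "hashtags": 1.5, "spam": 0},
    ],
}
# Assumed output of an earlier per-group selection step.
features_by_group = {"new": ["link_rate"], "old": ["hashtags"]}

new_user = {"group": "new", "link_rate": 0.7, "hashtags": 1.0}
old_user = {"group": "old", "link_rate": 0.9, "hashtags": 1.5}
new_label = classify(new_user, train_by_group, features_by_group)
old_label = classify(old_user, train_by_group, features_by_group)
print(new_label, old_label)
```

The second user has a high link rate but is judged only on hashtag usage, because that is the feature assigned to its group in this toy setup.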



Title slide: Dynamic feature selection for spam detection. Rivika Jain, under the guidance of Dr Faraz Ahmad. 19cs26
