DP1_160430723010_Divya.pptx

MSR Presentation on
RUMOR DETECTION ON REAL-TIME TWITTER DATA USING SUPERVISED LEARNING
Presented By: Patel Divya M.
M.E. (Information Technology)
Enroll. No.: 160430723010
SHANTILAL SHAH ENGINEERING COLLEGE, BHAVNAGAR
Guided By: Dr. Dinesh B. Vaghela
Asst. Prof., Information Technology Dept.
GUJARAT TECHNOLOGICAL UNIVERSITY
Outline
• Introduction
• Research Topic: Rumor Detection
• Research work: Objective
• Literature Review
• Problem Statement
• Implementation Strategy
• Implementation Environment
• Conclusion
• Future Work
• References
Introduction
• Twitter is the most popular micro-blogging service on social media [1].
• Ordinary people have a direct platform to share information and their opinions about news events and other topics [1].
• Not all the information posted on Twitter is correct or useful in informing other people about an event [1].
Introduction: Rumor Detection
• What is a rumor?
An unverified statement that starts from one or more sources and spreads over time [2].
A rumor can end in three ways: it can be resolved as true, resolved as false, or remain unresolved [2].
• So, it is necessary to provide solutions for detecting this kind of activity as it spreads on social media.
Research work: Objective
Phase 1:
• Survey the current methods and models available for detecting rumors.
• Study and analyze different methods of rumor detection on real-time Twitter data.
Phase 2:
• Design a new model/approach for detection of rumors.
Phase 3:
• Implement the proposed model/approach for detection of rumors.
• Evaluate the performance of rumor detection on Twitter with the proposed model.
Literature Review
Paper Name: “Towards Automated Real-Time Detection of Misinformation on Twitter”
Authors: Suchita Jain, Vanya Sharma and Rishabh Kaushal [3]
Publisher / Journal Name: IEEE, 2016.
Proposed Model:
• Addresses the problem by providing an approach to automatically detect misinformation or rumors on Twitter in real time.
• The approach rests on the supposition that verified news channel accounts on Twitter give more credible information than general public user accounts.
Observation:
• They calculate accuracy from the tweets they retrieve from both the news channels and general users.
Limitation:
• The feature selection/extraction step is missing.
Paper Name: “Automatic detection of Rumoured Tweets and finding its Origin”
Authors: Sahana V P, Alwyn R Pias, Richa Shastri, and Shweta Mandloi [4]
Publisher / Journal Name: IEEE, 2015.
Proposed Model:
• Focused on the topic “London Riots in 2011”.
• The methodology has three main sections: data, feature extraction, and classification.
• Used 20 features based on tweet content and user accounts.
• They then trained a classifier to classify the tweets, using the Weka tool.
• Also proposed an algorithm to find the origin of rumored tweets, i.e. to obtain the account information of the user who first started spreading the rumor on Twitter.
Observation:
• Achieved the best accuracy with the J48 decision tree classification algorithm.
• Reported a high recall of 0.877.
Limitation:
• Focused on only one specific rumor topic.
• Real-time Twitter data were not considered.
Paper Name: “Detection and Analysis of 2016 US Presidential Election Related Rumors on Twitter”
Authors: Zhiwei Jin, Juan Cao, Han Guo, Yongdong Zhang, Yu Wang, and Jiebo Luo [5]
Publisher / Journal Name: Springer, 2017.
Proposed Model:
• Focused on the 2016 U.S. presidential election.
• Presented an analysis of rumor tweets from the followers of two presidential candidates: Hillary Clinton and Donald Trump.
• They detected rumor tweets by matching a large number of election-related tweets against verified rumor articles.
• They collected over 8 million tweets from the followers of the two candidates.
• They compared the performance of five matching algorithms on the rumor detection task: TF-IDF, BM25, Word2Vec, Doc2Vec, and a lexicon-based algorithm.
Observation:
• Their detection algorithm reached its highest precision, 94.7%.
Limitation:
• Focused on only one specific topic, i.e. “2016 US Presidential Election related rumors”.
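The matching idea reviewed above (comparing tweets against verified rumor articles) can be illustrated with a small TF-IDF sketch. This is not the implementation from [5]: the function names, the smoothed IDF variant, and the example texts are all assumptions made for illustration, and only the Python standard library is used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (smoothed TF-IDF, an assumed variant)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(tweet, articles):
    """Return (index, similarity) of the article closest to the tweet.

    `articles` must be a non-empty list of strings.
    """
    vectors = tfidf_vectors(articles + [tweet])
    tweet_vec = vectors[-1]
    scores = [cosine(tweet_vec, v) for v in vectors[:-1]]
    idx = max(range(len(scores)), key=scores.__getitem__)
    return idx, scores[idx]


# Invented example texts, not data from the reviewed paper.
articles = [
    "The city bridge collapsed this morning",
    "A famous actor bought the football stadium",
]
idx, score = best_match("is it true that the bridge collapsed", articles)
```

A tweet would then be flagged as a rumor candidate when its best similarity exceeds some cut-off; the cut-off value is a design choice that this sketch leaves open.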
Paper Name: “Automatic Detection of Rumor on Social Network”
Authors: Qiao Zhang, Shuiyuan Zhang, Jian Dong, Jinhua Xiong, and Xueqi Cheng [6]
Publisher / Journal Name: Springer, 2015.
Proposed Model:
• Proposed an automatic rumor detection method that combines newly proposed implicit features with shallow features of the messages.
• It is divided into three main parts: data cleaning, feature extraction, and model training.
• Used user-based implicit features and content-based implicit features.
• Used several supervised models, such as Support Vector Machine and Random Forest.
Observation:
• Results show that the Implicit-Content-Based method improves significantly on the Shallow-Content-Based method, with a 10.5% improvement in precision and 4.7% in recall.
Limitation:
• User credibility.
• Detects rumors only on Chinese micro-blogging services.
Paper Name: “Detecting Rumors on Online Social Networks Using Multi-layer Autoencoder”
Authors: Yan Zhang, Weiling Chen, Chai Kiat Yeo, Chiew Tong Lau, Bu Sung Lee [7]
Publisher / Journal Name: IEEE, 2017.
Proposed Model:
• Proposed an autoencoder-based anomaly detection method to perform rumor detection.
• They used Sina Weibo, the most popular microblog in China.
• Proposed several self-adapting thresholds, which are calculated from the properties of each recent Weibo set.
Observation:
• Results show that the autoencoder model achieves good accuracy (88%) and F1 (82%) with a low false positive rate (7%).
Limitation:
• Detects rumors only on Chinese micro-blogging services.
• The autoencoder performs best with two hidden layers.
Literature Review

Sr. No. 1
Title: Towards Automated Real-Time Detection of Misinformation on Twitter [3]
Algorithm/Technique Used: Sentiment and semantic analysis
Advantages:
• Detects rumors on Twitter using tweets from verified news channels as a base.
• Detects rumors especially in critical times of emergency.
Disadvantages:
• Their result is based only on the semantic and sentiment analysis of the tweets.
• They did not use any features, nor any classification techniques, to detect rumors.

Sr. No. 2
Title: Automatic detection of Rumoured Tweets and finding its Origin [4]
Algorithm/Technique Used: J48 decision tree classifier
Advantages:
• Automatically detects the spread of rumoured tweets.
Disadvantages:
• Focused on a specific rumor topic.
• Real-time Twitter data were not considered.

Sr. No. 3
Title: Detection and Analysis of 2016 US Presidential Election Related Rumors on Twitter [5]
Algorithm/Technique Used: TF-IDF, BM25, Word2Vec, Doc2Vec, and lexicon matching
Advantages:
• Detects rumor tweets from the aspects of people, content and time.
• Their detection algorithm helps to understand rumors during political events.
Disadvantages:
• Focused on only one specific topic, i.e. “2016 US Presidential Election related rumors”.

Sr. No. 4
Title: Automatic Detection of Rumor on Social Network [6]
Algorithm/Technique Used: Support vector machine
Advantages:
• Best result as compared to shallow features.
Disadvantages:
• User credibility.
• Detects rumors only on Chinese micro-blogging services.

Sr. No. 5
Title: Detecting Rumors on Online Social Networks Using Multi-layer Autoencoder [7]
Algorithm/Technique Used: Autoencoder (artificial neural network)
Advantages:
• A multi-layer autoencoder is used.
• Self-adapting thresholds are used to distinguish rumors from non-rumors.
Disadvantages:
• Detects rumors only on Chinese micro-blogging services.
• Used an unsupervised learning method to detect rumors.

Sr. No. 6
Title: Rumor Detection and Classification for Twitter Data [8]
Algorithm/Technique Used: J48 decision tree classifier
Advantages:
• They detect rumors as a type of misinformation propagation.
• Rumor detection and classification (RDC) within the context of microblog social media.
Disadvantages:
• The result is not better than the pre-processing method applied on the algorithm.
• Real-time Twitter data were not considered.

Table 1: Comparative analysis of literature
Problem Statement
• An advantage of social media is that everyone can share information and give their opinions on the platform.
• The downside of such rapid diffusion of information is that false information also spreads.
• Because rumors spread so quickly and easily on Twitter and other social media, we need to provide solutions to detect them.
Proposed Work
Dataset Collection → Pre-processing → Feature Extraction → Classification
Figure 1: Basic steps for the proposed method
Dataset Collection
Input Tweets
Figure 2: Flow for fetching tweets
Data Pre-processing
• Remove all URLs (e.g. www.xyz.com), hashtags (e.g. #topic), and targets (e.g. @username)
• Correct the spelling; handle sequences of repeated characters
• Remove all punctuation, symbols, and numbers
• Remove stop words
• Remove non-English tweets
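The cleaning steps listed above can be sketched with regular expressions from the Python standard library. This is a minimal sketch under stated assumptions, not the deck's actual code: the stop-word list is a tiny hypothetical sample, the ASCII-ratio check is only a crude stand-in for real language detection, and spelling correction is limited to collapsing repeated characters.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real list would be much larger.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of", "in", "on", "at", "and", "it"}


def is_probably_english(text, threshold=0.9):
    """Crude language filter (assumption): keep tweets that are mostly ASCII."""
    if not text:
        return False
    ascii_chars = sum(ord(ch) < 128 for ch in text)
    return ascii_chars / len(text) >= threshold


def preprocess(tweet):
    """Apply the listed cleaning steps and return the remaining tokens."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove URLs
    text = re.sub(r"[@#]\w+", " ", text)                # remove targets and hashtags
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)          # "sooooo" becomes "soo"
    text = re.sub(r"[^a-z\s]", " ", text)               # remove punctuation, symbols, numbers
    return [tok for tok in text.split() if tok not in STOP_WORDS]


# Invented example tweet.
raw = "Sooooo true!!! The bridge collapsed in 2011, see www.xyz.com #topic @username"
tokens = preprocess(raw) if is_probably_english(raw) else []
```

Running the language check on the raw tweet before cleaning avoids spending work on tweets that will be discarded anyway.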
Feature Extraction
• Identification of the attributes (features) to be used for classification.
Classification
• Decision Tree
• Linear Classifier: Support Vector Machine, Neural Network
• Rule-based Classifier
• Probabilistic Classifier: Naïve Bayes, Maximum Entropy
Figure 3: Classification Techniques
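As one concrete instance of the probabilistic family in Figure 3, a multinomial Naïve Bayes classifier can be written in a few lines. The sketch below is illustrative only: the class name, the add-one smoothing choice, and the toy training tweets are assumptions, and the deck itself names Weka as its classification tool.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over token lists with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {t for counts in self.word_counts.values() for t in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data, already tokenized.
train_docs = [
    ["bridge", "collapsed", "unconfirmed"],
    ["heard", "bridge", "collapsed", "maybe"],
    ["official", "report", "bridge", "open"],
    ["police", "confirm", "road", "open"],
]
train_labels = ["rumor", "rumor", "non-rumor", "non-rumor"]
model = NaiveBayes().fit(train_docs, train_labels)
```

Working in log space avoids numeric underflow when tweets contain many tokens.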
Figure 4: Proposed method
Figure 5: Using News website Verification
• Our proposed approach is divided into three steps: 1) Pre-processing, 2) Sentiment Analysis, and 3) Classification.
• In the first step, we preprocess the real-time tweets to determine the topic about which the given input tweet is posted.
• In the second step, we find the sentiment polarity of each tweet using a sentiment score.
• In the final step, we supply this sentiment score as input to different classification algorithms.
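The second and third steps above can be sketched as follows. The deck does not say how the sentiment score is computed, so this sketch assumes a simple lexicon-based polarity score and, as a stand-in for a real classifier such as J48, a one-feature threshold rule (a decision stump). The word lists, function names, and toy tweets are hypothetical.

```python
# Hypothetical mini-lexicons; a real sentiment lexicon would be far larger.
POSITIVE = {"good", "great", "confirmed", "safe", "true"}
NEGATIVE = {"bad", "fake", "hoax", "panic", "false"}


def sentiment_score(tokens):
    """Polarity in [-1, 1]: (positive hits - negative hits) / token count."""
    if not tokens:
        return 0.0
    pos = sum(tok in POSITIVE for tok in tokens)
    neg = sum(tok in NEGATIVE for tok in tokens)
    return (pos - neg) / len(tokens)


def fit_stump(scores, labels):
    """Choose the threshold and label order with the best training accuracy.

    Returns (threshold, label_if_below, label_otherwise).
    """
    classes = sorted(set(labels))
    best = None
    for threshold in sorted(set(scores)):
        for below in classes:
            for above in classes:
                if below == above:
                    continue
                correct = sum(
                    (below if s < threshold else above) == y
                    for s, y in zip(scores, labels)
                )
                if best is None or correct > best[0]:
                    best = (correct, threshold, below, above)
    return best[1:]


def predict_stump(stump, score):
    """Classify a single sentiment score with a fitted stump."""
    threshold, below, above = stump
    return below if score < threshold else above


# Invented, already preprocessed toy tweets.
tweets = [
    ["fake", "hoax", "panic"],
    ["this", "is", "false"],
    ["officials", "confirmed", "safe"],
    ["great", "news", "true"],
]
labels = ["rumor", "rumor", "non-rumor", "non-rumor"]
stump = fit_stump([sentiment_score(t) for t in tweets], labels)
```

The link between negative polarity and the rumor label holds only for this toy data; on real tweets the classifier would have to learn whatever relationship exists.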
• We use the proposed approach alongside the news-website approach to compare results on specific rumor topics.
• If both give the same result, we take this as evidence that our approach is accurate.
• This comparison also provides verification of the rumor topic.
Implementation Strategy
• Tools: Python 3.5.4; Weka tool (for classification)
• Dataset: Tweets from Twitter
• Performance Evaluation: Accuracy
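Accuracy, the evaluation measure named above, can be computed together with the precision, recall, and F1 values that the reviewed papers report. The sketch below is a minimal illustration; the function name, the label strings, and the example predictions are assumptions.

```python
def evaluate(y_true, y_pred, positive="rumor"):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented labels: 2 true positives, 1 false negative, 1 false positive, 1 true negative.
y_true = ["rumor", "rumor", "rumor", "non-rumor", "non-rumor"]
y_pred = ["rumor", "rumor", "non-rumor", "rumor", "non-rumor"]
metrics = evaluate(y_true, y_pred)
```

Reporting precision and recall next to accuracy matters when rumors are rare, because a classifier that never predicts "rumor" can still score a high accuracy.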
Implementation Environment
Figure 6: Collecting tweets using the Python Streaming API
Figure 7: Retrieving tweets for a specific topic
Figure 8: Storing tweets in table format
Figure 9: Sentiment analysis chart for a specific topic
Conclusion
• A study of research papers on rumor detection shows that different methods are used to detect rumors, and that many classifiers are available for the task. This research work can be useful for detecting rumors on the Twitter platform efficiently and accurately.
Future Work
• To implement the remaining work.
References
[1] Anubrata Das, Moumita Roy, Soumi Dutta, Saptarshi Ghosh, and Asit Kumar Das. “Predicting Trends in the Twitter Social Network: A Machine Learning Approach”, Springer International Publishing Switzerland, 2015.
[2] Soroush Vosoughi. “Automatic Detection and Verification of Rumors on Twitter”, PhD Thesis, June 2015.
[3] Suchita Jain, Vanya Sharma, and Rishabh Kaushal. “Towards Automated Real-Time Detection of Misinformation on Twitter”, Intl. Conference on Advances in Computing, Communications and Informatics (ICACCI), IEEE, 2016.
[4] Sahana V P, Alwyn R Pias, Richa Shastri, and Shweta Mandloi. “Automatic detection of Rumoured Tweets and finding its Origin”, Intl. Conference on Computing and Network Communications (CoCoNet'15), IEEE, 2015.
[5] Zhiwei Jin, Juan Cao, Han Guo, Yongdong Zhang, Yu Wang, and Jiebo Luo. “Detection and Analysis of 2016 US Presidential Election Related Rumors on Twitter”, Springer International Publishing AG, 2017.
[6] Qiao Zhang, Shuiyuan Zhang, Jian Dong, Jinhua Xiong, and Xueqi Cheng. “Automatic Detection of Rumor on Social Network”, Springer International Publishing Switzerland, 2015.
[7] Yan Zhang, Weiling Chen, Chai Kiat Yeo, Chiew Tong Lau, and Bu Sung Lee. “Detecting Rumors on Online Social Networks Using Multi-layer Autoencoder”, IEEE Technology & Engineering Management Conference (TEMSCON), IEEE, 2017.
[8] Sardar Hamidian and Mona Diab. “Rumor Detection and Classification for Twitter Data”, The Fifth International Conference on Social Media Technologies, Communication, and Informatics, SOTICS 2015.
DP1_160430723010_Divya.pptx

  • 1. MSR Presentation on RUMOR DETECTION ON REAL-TIME TWITTER DATA USING SUPERVISED LEARNING Presented By:- Patel Divya M. M.E. (Information Technology) Enroll. No. : 160430723010 SHANTILAL SHAH ENGINEERING COLLEGE, BHAVNAGAR Guided By:- Dr. Dinesh B. Vaghela Asst. Prof. of Information Technology Dept. GUJARAT TECHNOLOGICAL UNIVERSITY 1
  • 2. • Introduction • Research Topic: Rumor Detection • Research work: Objective • Literature Review • Problem Statement • Implementation Strategy • Implementation Environment • Conclusion • Future Work • References Outline 1 2
  • 3. Introduction • Twitter is most popular micro-blogging service on social media[1]. • A common people have a direct platform to share information and their opinions about the news events and any other information[1]. • Not all the information posted on twitter is correct or useful in providing information about the event to other people[1]. 1 3
  • 4. Introduction: Rumor Detection • What is Rumor? An unverified statement that starts from one or more sources and spreads over time [2]. A rumor can end in three ways: it can be resolved as either true, false or remain unresolved [2]. • So, its necessary to provide some solutions for detecting such kind of activity spread on social media. 1 4
  • 5. Research work: Objective  Phase 1: • Survey of current methods and models available for Detecting Rumors. • To study and analyze different methods of Rumor Detection on real time Twitter data. Phase 2: • To design a new model/approach for detection of rumors. Phase 3: • To implement a proposed model/approach for detection of rumors. • To evaluate the performance of Rumor detection on Twitter by proposed model. 5
  • 6. Paper Name: “Towards Automated Real-Time Detection of Misinformation on Twitter” Authors: Suchita Jain, Vanya Sharma and Rishabh Kaushal [3] Publisher / Journal Name: IEEE-2016. Literature Review 6 Proposed Model  Focused on the problem by providing an approach to detect misinformation or rumors on Twitter in real-time automatically.  Their approach based on the supposition that verified News Channel accounts on Twitter give more credible information as compared to the public account of user. Observation  They calculate accuracy according the tweet they retrieve from both the News channels and general users. Limitation  Feature selection/extraction part is missing.
  • 7. Paper Name : “Automatic detection of Rumoured Tweets and finding its Origin” Authors: Sahana V P, Alwyn R Pias, Richa Shastri, and Shweta Mandloi [4] Publisher / Journal Name: IEEE-2015. 7 Proposed Model  Focused on the topic “London Riots in 2011”.  The methodology contains mainly three sections: data, feature extraction, classification.  Used 20 features based on tweet content and user accounts.  Then after they trained a classifier to correctly classifies the tweets. For that they used Weka tool for classification.  Also proposed an algorithm to find the origin of the rumored tweets i.e. obtain the account information of the user who first started spreading rumors on Twitter. Observation  Achieved best accuracy for J48 decision tree classification algorithm.  Recall rate is given high accuracy 0.877. Limitation  Focused only on one specific rumor topic.  Real-time twitter data were not considered.
  • 8. Paper Name : “Detection and Analysis of 2016 US Presidential Election Related Rumors on Twitter” Authors: Zhiwei Jin, Juan Cao, Han Guo, Yongdong Zhang, Yu Wang, and Jiebo Luo [5] Publisher / Journal Name: Springer 2017. 8 Proposed Model  Focused on the 2016 U.S. presidential election.  Presented an analysis of rumor tweets from the followers of two presidential candidates: Hillary Clinton and Donald Trump.  They detected rumor tweets by matching large amount of tweets related to president election with verified rumor articles.  They collected over 8 million tweets from the followers of the two candidates.  They compared the performance of five matching algorithms with respect to the rumor detection task: TF-IDF, BM25, Word2Vec and Doc2Vec, lexicon-based algorithm. Observation  Precision gives 94.7% accuracy which is the highest accuracy result according to their detection algorithm. Limitation  Focused only on specific topic i.e. “2016 US President Election related rumors”.
  • 9. Paper Name : “Automatic Detection of Rumor on Social Network” Authors: Qiao Zhang, Shuiyuan Zhang, Jian Dong, Jinhua Xiong, and Xueqi Cheng [6] Publisher / Journal Name: Springer 2015. 9 Proposed Model  Proposed an automatic rumor detection method based on the combination of new proposed implicit features and shallow features of the messages.  It mainly divided into 3 parts: data cleaning, feature extraction and model training.  Used User-based implicit features and Content-based implicit features.  A large amount of supervised model they used such as Support Vector Machine, Random Forest. Observation  Results show that Implicit-Content-Based method have significant improvement compared with Shallow-Content-Based method, with 10.5% improvement in precision and 4.7% in recall rate. Limitation  User credibility.  Detection of rumors on the Chinese micro-blogging services.
  • 10. Paper Name : “Detecting Rumors on Online Social Networks Using Multi-layer Autoencoder” Authors: Yan Zhang, Weiling Chen, Chai Kiat Yeo, Chiew Tong Lau, Bu Sung Lee [7] Publisher / Journal Name: IEEE 2017. 10 Proposed Model  Proposed an anomaly detection method based on autoencoder to perform rumor detection.  They used Sina Weibo which is the most popular microblog in China.  Proposed several self-adapting thresholds which are calculated based on the property of each recent Weibo set. Observation  Results show that the autoencoder model achieves a good accuracy i.e. 88%, F1 i.e. 82% and a low false positive rate i.e. 7%. Limitation  Detection of rumors on the chienese micro-blogging services.  Performance of autoencoder with 2 hidden layer gives best performance.
  • 11. Literature Review Sr. No. Title Algorithm/Technique Used Advantages Disadvantages 1 Towards Automated Real-Time Detection of Misinformation on Twitter [3] Sentiment and semantic analysis  Detect rumors on Twitter using tweets from the verified news channels as base.  Detect rumors especially in the critical times of emergency.  There result is based only on the semantic and sentiment analysis of the tweets.  They weren’t used any features. Also they weren’t used any classification techniques to detect rumors. 2 Automatic detection of Rumored Tweets and finding its Origin [4] J48 decision tree Classifier  Automatically detect the spread of rumoured tweets.  Focused on specific rumor topic.  Real-time twitter data were not considered. 3 Detection and Analysis of 2016 US Presidential Election Related Rumors on Twitter [5] TF-IDF and BM25, Word2Vec and Doc2Vec, Lexicon matching  Detect rumor tweets from the aspects of people, content and time.  Their detection algorithm understand rumors during political events.  Focused only one specific topic i.e. “2016 US President Election related rumors”. 11
  • 12. Literature Review (continued)
    Sr. No. | Title | Algorithm/Technique Used | Advantages | Disadvantages
    4 | Automatic Detection of Rumor on Social Network [6] | Support vector machine | Better results compared with shallow features. | User credibility; detection of rumors only on Chinese micro-blogging services.
    5 | Detecting Rumors on Online Social Networks Using Multi-layer Autoencoder [7] | Autoencoder (artificial neural network) | Multi-layer autoencoder is used; self-adapting thresholds are used to distinguish rumors from non-rumors. | Detection of rumors only on Chinese micro-blogging services; used an unsupervised learning method to detect rumors.
    6 | Rumor Detection and Classification for Twitter Data [8] | J48 decision tree classifier | They detect rumors as a type of misinformation propagation; rumor detection and classification (RDC) within the context of microblog social media. | The result is not better than the pre-processing method applied on the algorithm; real-time Twitter data were not considered.
    Table 1: Comparative analysis of Literature
  • 13. Problem Statement
    • An advantage of social media is that everyone can share information and give their opinions on the platform.
    • The downside of such rapid diffusion of information is that false information also spreads.
    • Because rumors spread so quickly and easily on Twitter and other social media, we need solutions to detect them.
  • 14. Proposed Work
    Figure 1: Basic steps of the proposed method: Dataset Collection -> Pre-processing -> Feature Extraction -> Classification
  • 15. Dataset Collection
    Figure 2: Flow for fetching tweets (input: tweets)
  • 16. Data Pre-processing
    • Remove all URLs (e.g. www.xyz.com), hashtags (e.g. #topic) and targets (@username)
    • Correct spellings and handle sequences of repeated characters
    • Remove all punctuation, symbols and numbers
    • Remove stop words
    • Remove non-English tweets
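The pre-processing steps listed on this slide can be sketched in a few lines of Python. This is only a minimal illustration, not the presenter's implementation: the stop-word list, the regular expressions, the function names and the ASCII-ratio check for non-English tweets are all assumptions, and full spelling correction is left out (only repeated characters are collapsed).

```python
import re
import string

# Assumption: a tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of",
              "in", "on", "and", "this", "that", "it"}

URL_RE = re.compile(r"(https?://\S+|www\.\S+)")   # URLs such as www.xyz.com
TAG_RE = re.compile(r"[#@]\w+")                   # hashtags and @username targets
REPEAT_RE = re.compile(r"(.)\1{2,}")              # three or more repeats of one character
NON_ALPHA_RE = re.compile(r"[^a-z\s]")            # punctuation, symbols, numbers


def preprocess(tweet):
    """Return the cleaned, lower-cased tokens of a tweet."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)
    text = TAG_RE.sub(" ", text)
    text = REPEAT_RE.sub(r"\1\1", text)           # "sooooo" becomes "soo"
    text = NON_ALPHA_RE.sub(" ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def is_probably_english(tweet, threshold=0.9):
    """Crude heuristic (an assumption): most letters must be ASCII letters."""
    letters = [ch for ch in tweet if ch.isalpha()]
    if not letters:
        return False
    ascii_count = sum(1 for ch in letters if ch in string.ascii_letters)
    return ascii_count / len(letters) >= threshold


if __name__ == "__main__":
    sample = "Sooooo scary!!! Is this true? www.xyz.com #topic @username 123"
    if is_probably_english(sample):
        print(preprocess(sample))
```

Collapsing repeated characters to two keeps words such as "good" intact while shortening elongated forms; a dictionary-based spell checker could be applied afterwards.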
  • 17. Feature Extraction
    • Identification of attributes for classification.
  • 18. Classification
    Figure 3: Classification techniques
    • Decision Tree
    • Linear Classifier: Support Vector Machine, Neural Network
    • Rule-based Classifier
    • Probabilistic Classifier: Naïve Bayes, Maximum Entropy
  • 20. Figure 5: Verification using news websites
  • 21. Our proposed approach is divided into three steps: 1) Pre-processing, 2) Sentiment Analysis, and 3) Classification.
    • In the first step, we pre-process the real-time tweets to determine the topic of the given input tweet.
    • In the second step, we find the sentiment polarity of each tweet using a sentiment score.
    • In the final step, we use this sentiment score as input to different classification algorithms.
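The second and third steps can be illustrated with a small sketch. The slides do not say how the sentiment score is computed or how it is passed to the classifier, so everything here is an assumption: the word lists, the score formula, the polarity cut-off at zero, the `rumor`/`non_rumor` labels, and the choice to export the scores as an ARFF file (the text format Weka reads).

```python
# Assumption: tiny hand-made lexicons, for illustration only.
POSITIVE = {"true", "confirmed", "safe", "good", "official"}
NEGATIVE = {"fake", "hoax", "false", "scary", "bad"}


def sentiment_score(tokens):
    """Score in [-1, 1]: (positive words - negative words) / total tokens."""
    if not tokens:
        return 0.0
    pos = sum(1 for tok in tokens if tok in POSITIVE)
    neg = sum(1 for tok in tokens if tok in NEGATIVE)
    return (pos - neg) / len(tokens)


def polarity(score):
    """Map a score to a polarity label (cut-off at zero is an assumption)."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def to_arff(rows, relation="tweets"):
    """Build ARFF text from (score, label) pairs so Weka can load it."""
    lines = [
        "@relation " + relation,
        "",
        "@attribute sentiment_score numeric",
        "@attribute class {rumor,non_rumor}",   # hypothetical class labels
        "",
        "@data",
    ]
    for score, label in rows:
        lines.append("{:.3f},{}".format(score, label))
    return "\n".join(lines)


if __name__ == "__main__":
    tokens = ["fake", "hoax", "news"]
    score = sentiment_score(tokens)
    print(polarity(score))
    print(to_arff([(score, "rumor")]))
```

A single score is a very thin feature set; further attributes could be added as extra `@attribute` lines in the same way.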
  • 22. We use the proposed approach together with the news-website approach to compare specific rumor topics.
    • If both give the same result, we can say that our approach gives better accuracy.
    • This comparison also provides verification of the rumor topic.
  • 23. Implementation Strategy
    • Tools: Python 3.5.4; Weka tool (for classification)
    • Dataset: Tweets from Twitter
    • Performance Evaluation: Accuracy
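Accuracy, the evaluation measure named on this slide, is the fraction of tweets whose predicted label matches the reference label. A minimal sketch follows; the function name and the `rumor`/`non_rumor` labels are illustrative assumptions, not taken from the slides.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the reference labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one label is required")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


if __name__ == "__main__":
    # Hypothetical labels: classifier output against reference labels.
    predicted = ["rumor", "non_rumor", "rumor", "rumor"]
    actual = ["rumor", "non_rumor", "non_rumor", "rumor"]
    print(accuracy(predicted, actual))
```

The same function can measure how often two labelling methods agree, since agreement is computed the same way.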
  • 24. Implementation Environment
    Figure 6: Collecting tweets using the Python Streaming API
  • 25. Figure 7: Retrieving tweets for a specific topic
  • 26. Figure 8: Storing tweets in table format
  • 27. Figure 9: Sentiment analysis chart for a specific topic
  • 28. Conclusion
    • A study of different research papers on rumor detection shows that different methods are used to detect rumors, and that many classifiers are available for the task. This research work can be useful for detecting rumors on the Twitter platform efficiently and accurately.
  • 29. Future Work
    • To implement the remaining work.
  • 30. References
    [1] Anubrata Das, Moumita Roy, Soumi Dutta, Saptarshi Ghosh, Asit Kumar Das. “Predicting Trends in the Twitter Social Network: A Machine Learning Approach”, Springer International Publishing Switzerland, 2015.
    [2] Soroush Vosoughi. “Automatic Detection and Verification of Rumors on Twitter”, PhD Thesis, June 2015.
    [3] Suchita Jain, Vanya Sharma and Rishabh Kaushal. “Towards Automated Real-Time Detection of Misinformation on Twitter”, Intl. Conference on Advances in Computing, Communications and Informatics (ICACCI), IEEE 2016.
    [4] Sahana V P, Alwyn R Pias, Richa Shastri, and Shweta Mandloi. “Automatic detection of Rumoured Tweets and finding its Origin”, Intl. Conference on Computing and Network Communications (CoCoNet'15), IEEE 2015.
    [5] Zhiwei Jin, Juan Cao, Han Guo, Yongdong Zhang, Yu Wang, and Jiebo Luo. “Detection and Analysis of 2016 US Presidential Election Related Rumors on Twitter”, Springer International Publishing AG, Springer 2017.
    [6] Qiao Zhang, Shuiyuan Zhang, Jian Dong, Jinhua Xiong, and Xueqi Cheng. “Automatic Detection of Rumor on Social Network”, Springer International Publishing Switzerland, Springer 2015.
    [7] Yan Zhang, Weiling Chen, Chai Kiat Yeo, Chiew Tong Lau, Bu Sung Lee. “Detecting Rumors on Online Social Networks Using Multi-layer Autoencoder”, IEEE Technology & Engineering Management Conference (TEMSCON), IEEE 2017.
    [8] Sardar Hamidian and Mona Diab. “Rumor Detection and Classification for Twitter Data”, The Fifth International Conference on Social Media Technologies, Communication, and Informatics, SOTICS 2015.

Editor's Notes

  1. Lucrative=profitable