1. MSR Presentation on
RUMOR DETECTION ON REAL-TIME
TWITTER DATA USING SUPERVISED
Patel Divya M.
M.E. (Information Technology)
Enroll. No. : 160430723010
Dr. Dinesh B. Vaghela
Asst. Prof. of Information Technology Dept.
2. • Introduction
• Research Topic: Rumor Detection
• Research work: Objective
• Literature Review
• Problem Statement
• Implementation Strategy
• Implementation Environment
• Future Work
3. Introduction
• Twitter is the most popular micro-blogging service on social media.
• Ordinary people have a direct platform to share information and their opinions about
news events and any other topic.
• Not all the information posted on Twitter is correct or useful in providing information
about the event to other people.
4. Introduction: Rumor Detection
• What is a rumor?
An unverified statement that starts from one or more sources and spreads
over time.
A rumor can end in three ways: it can be resolved as either true, false or remain
• So, it is necessary to provide solutions for detecting this kind of activity spread on
5. Research work: Objective
• To survey current methods and models available for detecting rumors.
• To study and analyze different methods of rumor detection on real-time Twitter data.
• To design a new model/approach for detection of rumors.
• To implement the proposed model/approach for detection of rumors.
• To evaluate the performance of rumor detection on Twitter by the proposed model.
6. Paper Name: “Towards Automated Real-Time Detection of Misinformation on Twitter”
Authors: Suchita Jain, Vanya Sharma and Rishabh Kaushal 
Publisher / Journal Name: IEEE-2016.
Provides an approach to automatically detect
misinformation or rumors on Twitter in real time.
Their approach is based on the assumption that verified news channel
accounts on Twitter give more credible information than the
public accounts of ordinary users.
Observation: They calculate accuracy from the tweets they retrieve from both
news channels and general users.
Limitation: The feature selection/extraction part is missing.
7. Paper Name : “Automatic detection of Rumoured Tweets and finding its Origin”
Authors: Sahana V P, Alwyn R Pias, Richa Shastri, and Shweta Mandloi 
Publisher / Journal Name: IEEE-2015.
Focused on the topic “London Riots in 2011”.
The methodology contains mainly three sections: data, feature extraction,
Used 20 features based on tweet content and user accounts.
They then trained a classifier to classify the tweets, using the
Weka tool for classification.
Also proposed an algorithm to find the origin of the rumored tweets, i.e. to
obtain the account information of the user who first started spreading
rumors on Twitter.
Observation: Achieved the best accuracy with the J48 decision tree classification algorithm.
A high recall of 0.877 is reported.
Limitation: Focused on only one specific rumor topic.
Real-time Twitter data were not considered.
8. Paper Name : “Detection and Analysis of 2016 US Presidential Election Related Rumors on
Twitter”
Authors: Zhiwei Jin, Juan Cao, Han Guo, Yongdong Zhang, Yu Wang, and Jiebo Luo 
Publisher / Journal Name: Springer 2017.
Focused on the 2016 U.S. presidential election.
Presented an analysis of rumor tweets from the followers of two
presidential candidates: Hillary Clinton and Donald Trump.
They detected rumor tweets by matching a large number of tweets related
to the presidential election against verified rumor articles.
They collected over 8 million tweets from the followers of the two
They compared the performance of five matching algorithms with
respect to the rumor detection task: TF-IDF, BM25, Word2Vec,
Doc2Vec, and a lexicon-based algorithm.
Observation: A precision of 94.7% is the highest result achieved by
their detection algorithm.
Limitation: Focused only on a specific topic, i.e. “2016 US President Election related
rumors”.
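The matching idea in this paper (scoring each tweet against a set of verified rumor articles) can be illustrated with the simplest of the listed algorithms, TF-IDF. The sketch below is not the authors' code: the smoothed IDF formula, whitespace tokenization, the `best_match` helper, and the 0.3 threshold are all assumptions chosen for illustration, and the example texts are made-up toy strings.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


def build_idf(documents):
    """Smoothed inverse document frequency over a list of token lists."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector; terms unseen in the article set are ignored."""
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(tweet, articles, threshold=0.3):
    """Return (article index, score), or (None, score) if no article is similar enough."""
    if not articles:
        return None, 0.0
    article_tokens = [tokenize(a) for a in articles]
    idf = build_idf(article_tokens)
    article_vecs = [tfidf_vector(t, idf) for t in article_tokens]
    tweet_vec = tfidf_vector(tokenize(tweet), idf)
    scores = [cosine(tweet_vec, v) for v in article_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:
        return None, scores[best]
    return best, scores[best]


# Toy strings, invented for this example only.
articles = [
    "pope endorses candidate for president",
    "candidate sold weapons to group",
]
index, score = best_match("did the pope really endorse the candidate", articles)
unmatched = best_match("nice weather today", articles)
```

A tweet that shares no vocabulary with any article gets a score of 0.0 and is reported as unmatched. In practice the threshold would have to be tuned on labeled data.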
9. Paper Name : “Automatic Detection of Rumor on Social Network”
Authors: Qiao Zhang, Shuiyuan Zhang, Jian Dong, Jinhua Xiong, and Xueqi Cheng 
Publisher / Journal Name: Springer 2015.
Proposed an automatic rumor detection method based on the combination
of newly proposed implicit features and shallow features of the messages.
It is mainly divided into 3 parts: data cleaning, feature extraction and model
Used user-based implicit features and content-based implicit features.
They used several supervised models, such as Support Vector
Machine and Random Forest.
Observation: Results show that the Implicit-Content-Based method has a significant
improvement compared with the Shallow-Content-Based method, with a
10.5% improvement in precision and 4.7% in recall.
Limitation: User credibility.
Detection of rumors only on Chinese micro-blogging services.
10. Paper Name : “Detecting Rumors on Online Social Networks Using Multi-layer Autoencoder”
Authors: Yan Zhang, Weiling Chen, Chai Kiat Yeo, Chiew Tong Lau, Bu Sung Lee 
Publisher / Journal Name: IEEE 2017.
Proposed an anomaly detection method based on autoencoder to perform
They used Sina Weibo, which is the most popular microblog in China.
Proposed several self-adapting thresholds, which are calculated based on
the property of each recent Weibo set.
Observation: Results show that the autoencoder model achieves good accuracy (88%)
and F1 (82%), and a low false positive rate (7%).
Limitation: Detection of rumors only on Chinese micro-blogging services.
The autoencoder with 2 hidden layers gives the best performance.
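The figures quoted in the observations of the reviewed papers (accuracy, precision, recall, F1, false positive rate) all derive from the four counts of a binary confusion matrix. As a reminder of how they relate, here is a small sketch with "rumor" as the positive class. The function name and the example counts are made up for illustration and do not come from any of the papers.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp: rumors flagged as rumors      fp: non-rumors flagged as rumors
    tn: non-rumors left unflagged     fn: rumors that were missed
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Invented counts: 100 tweets, 45 of them true rumors.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

With these counts the accuracy is 0.85 and the precision is 0.80. Precision and recall answer different questions, so a high value for one says nothing about the other.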
11. Literature Review
Paper 1
Title: Towards Automated Real-Time Detection of Misinformation on Twitter
Algorithm/Technique Used: Sentiment and semantic
Advantages:
• Detect rumors on Twitter using tweets from the verified news channels as
• Detect rumors especially in the critical times of
Disadvantages:
• Their result is based only on the semantic and sentiment analysis of the tweets.
• They did not use any features, nor any classification techniques, to detect rumors.
Paper 2
Title: Automatic detection of Rumored Tweets and finding its Origin
Algorithm/Technique Used: J48 decision tree classifier
Advantages:
• Automatically detect the spread of rumoured tweets.
Disadvantages:
• Focused on specific rumor
• Real-time Twitter data were
Paper 3
Title: Detection and Analysis of 2016 US Presidential Election Related Rumors on Twitter
Algorithm/Technique Used: TF-IDF and BM25, Word2Vec and Doc2Vec,
Advantages:
• Detect rumor tweets from the aspects of people, content
• Their detection algorithm
understand rumors during
Disadvantages:
• Focused on only one specific topic, i.e. “2016 US President Election related rumors”.
Paper 4
Title: Automatic Detection of Rumor on Social Network
Algorithm/Technique Used: Support vector machine
Advantages:
• Best result as compared to
Disadvantages:
• Detection of rumors only on the Chinese micro-blogging
Paper 5
Title: Detecting Rumors on Online Social Networks Using Multi-layer Autoencoder
Advantages:
• Multi-layer autoencoder is
which is used to distinguish rumors from non-rumors.
Disadvantages:
• Detection of rumors only on the Chinese micro-blogging
• Used unsupervised learning method to detect rumors.
Paper 6
Title: Rumor Detection and Classification for Twitter Data
Algorithm/Technique Used: J48 decision tree classifier
Advantages:
• They detect rumors as a type
• Rumor detection and classification (RDC) within the context of microblog
Disadvantages:
• The result is not better than the pre-processing method applied on the algorithm.
• Real-time Twitter data were not
Table 1: Comparative analysis of Literature
13. Problem Statement
• An advantage of social media is that everyone can share information and
also give their opinions on the platform.
• The downside of such rapid diffusion of information is that false information
also spreads.
• Because rumors spread so quickly and easily on Twitter and other social media,
we need to provide solutions to detect them.
16. Data Pre-processing
• Remove all URLs (e.g. www.xyz.com), hash tags (e.g. #topic), targets
• Correct the spellings; sequence of repeated characters is to be
• Remove all punctuation, symbols, numbers
• Remove Stop Words
• Remove Non-English Tweets
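The cleaning steps listed above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that "targets" means @mentions, that a run of three or more repeated characters is squeezed to two (the slide does not say how repeats are handled), and that a tweet counts as English when at least 90% of its letters are ASCII. The stop-word list is a tiny hypothetical sample, and spelling correction is left out because the source does not name a tool for it.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of", "in", "on", "and", "it"}

URL_RE = re.compile(r"(?:https?://|www\.)\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")        # assumption: "targets" means @mentions
REPEAT_RE = re.compile(r"(.)\1{2,}")    # any character repeated three or more times
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def looks_english(text, threshold=0.9):
    """Crude language filter (assumption): share of ASCII letters among all letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    ascii_letters = sum(1 for ch in letters if ch.isascii())
    return ascii_letters / len(letters) >= threshold


def preprocess(tweet):
    """Return a list of cleaned tokens, or None if the tweet is filtered out."""
    text = URL_RE.sub(" ", tweet)
    text = HASHTAG_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    if not looks_english(text):
        return None                      # drop non-English tweets
    text = text.lower()
    text = REPEAT_RE.sub(r"\1\1", text)  # squeeze long runs, e.g. "sooooo" to "soo"
    text = NON_LETTER_RE.sub(" ", text)  # drop punctuation, symbols, numbers
    return [tok for tok in text.split() if tok not in STOP_WORDS]


tokens = preprocess("Breaking!!! The bridge is closed 2day www.xyz.com #topic @user sooooo scary")
```

For the sample tweet this yields `['breaking', 'bridge', 'closed', 'day', 'soo', 'scary']`. Note that removing digits turns "2day" into "day", which is one reason a spelling-correction step would normally come before this one.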
21. • Our proposed approach is divided into three steps: 1) Pre-processing, 2) Sentiment
Analysis, and 3) Classification.
• In the first step, we preprocess the real-time tweets to determine the topic
about which the given input tweet is posted.
• In the second step, we find the sentiment polarity of each tweet by using
• In the final step, we apply this sentiment score as an input to the different
22. • We use the proposed approach alongside a news-websites approach to compare different
specific rumor topics.
• If both give the same result, then we can say that our approach gives better accuracy.
• This comparison also provides verification of the rumor topic.
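The sentiment and classification steps of the proposed approach can be pictured with a toy pipeline: score each tweet's polarity, then train a supervised classifier on that score. The slides do not name the sentiment tool or the classifiers, so everything here is a stand-in: the word lists are a hypothetical mini-lexicon, the classifier is a one-feature decision stump chosen only because it fits in a few lines, and the labeled tweets are invented.

```python
# Hypothetical mini-lexicon; the source does not name the sentiment tool it uses.
POSITIVE = {"good", "great", "safe", "confirmed", "true"}
NEGATIVE = {"bad", "fake", "hoax", "scary", "false"}


def polarity(tokens):
    """Score in [-1, 1]: (positives - negatives) / matched words, 0.0 if none match."""
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    matched = pos + neg
    return (pos - neg) / matched if matched else 0.0


def train_stump(scores, labels):
    """Fit a threshold classifier on polarity scores (1 = rumor, 0 = non-rumor).

    Tries every midpoint between consecutive distinct scores, in both
    directions, and keeps the split with the most correct training labels.
    Returns (threshold, rumor_if_below).
    """
    distinct = sorted(set(scores))
    best = None
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        for rumor_if_below in (True, False):
            preds = [int((s < threshold) == rumor_if_below) for s in scores]
            correct = sum(p == y for p, y in zip(preds, labels))
            if best is None or correct > best[0]:
                best = (correct, threshold, rumor_if_below)
    if best is None:
        raise ValueError("need at least two distinct scores to place a threshold")
    return best[1], best[2]


def predict(score, threshold, rumor_if_below):
    """Apply the trained stump to one polarity score."""
    return int((score < threshold) == rumor_if_below)


# Invented training tweets (already tokenized) with invented labels.
train = [
    (["fake", "hoax", "news"], 1),
    (["scary", "false", "claim"], 1),
    (["confirmed", "true", "report"], 0),
    (["great", "good", "day"], 0),
]
scores = [polarity(tokens) for tokens, _ in train]
labels = [label for _, label in train]
threshold, rumor_if_below = train_stump(scores, labels)
label_a = predict(polarity(["this", "is", "a", "hoax"]), threshold, rumor_if_below)
label_b = predict(polarity(["confirmed", "safe"]), threshold, rumor_if_below)
```

That negative polarity maps to "rumor" here is purely an artifact of the toy training set. A single sentiment score is a weak signal on its own, which is why it would normally be one input among several to the classifier.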
• After studying different research papers on rumor detection, we find that different methods
are used to detect rumors. There are many classifiers available for detecting
rumors. This research work can be useful for detecting rumors on the Twitter platform
efficiently and accurately.
• Anubrata Das, Moumita Roy, Soumi Dutta, Saptarshi Ghosh, Asit Kumar Das. “Predicting Trends in the
Twitter Social Network: A Machine Learning Approach”, Springer International Publishing Switzerland, 2015.
• Soroush Vosoughi. PhD Thesis, “Automatic Detection and Verification of Rumors on Twitter”, June 2015.
• Suchita Jain, Vanya Sharma and Rishabh Kaushal. “Towards Automated Real-Time Detection of
Misinformation on Twitter”, Intl. Conference on Advances in Computing, Communications and Informatics
(ICACCI), IEEE 2016.
• Sahana V P, Alwyn R Pias, Richa Shastri, and Shweta Mandloi. “Automatic detection of Rumoured Tweets
and finding its Origin”, Intl. Conference on Computing and Network Communications (CoCoNet'15), IEEE
• Zhiwei Jin, Juan Cao, Han Guo, Yongdong Zhang, Yu Wang, and Jiebo Luo. “Detection and Analysis of
2016 US Presidential Election Related Rumors on Twitter”, Springer International Publishing AG 2017,
• Qiao Zhang, Shuiyuan Zhang, Jian Dong, Jinhua Xiong, and Xueqi Cheng. “Automatic Detection of Rumor
on Social Network”, Springer International Publishing Switzerland 2015, Springer 2017.
• Yan Zhang, Weiling Chen, Chai Kiat Yeo, Chiew Tong Lau, Bu Sung Lee. “Detecting Rumors on Online
Social Networks Using Multi-layer Autoencoder”, IEEE Technology & Engineering Management Conference
(TEMSCON), IEEE 2017.
• Sardar Hamidian and Mona Diab. “Rumor Detection and Classification for Twitter Data”, The Fifth
International Conference on Social Media Technologies, Communication, and Informatics, SOTICS 2015.