SlideShare a Scribd company logo
Enviar pesquisa
Carregar
IRJET - Socirank Identifying and Ranking Prevalent News Topics using Social Media Factors
Denunciar
Compartilhar
IRJET Journal
Fast Track Publications
Seguir
•
0 gostou
•
24 visualizações
Engenharia
https://irjet.net/archives/V7/i3/IRJET-V7I3171.pdf
Leia mais
IRJET - Socirank Identifying and Ranking Prevalent News Topics using Social Media Factors
•
0 gostou
•
24 visualizações
IRJET Journal
Fast Track Publications
Seguir
Denunciar
Compartilhar
Engenharia
https://irjet.net/archives/V7/i3/IRJET-V7I3171.pdf
Leia mais
IRJET - Socirank Identifying and Ranking Prevalent News Topics using Social Media Factors
1 de 4
Baixar agora
Recomendados
IRJET- Identification of Prevalent News from Twitter and Traditional Media us... por
IRJET- Identification of Prevalent News from Twitter and Traditional Media us...
IRJET Journal
16 visualizações
•
8 slides
[IJET-V2I1P14] Authors:Aditi Verma, Rachana Agarwal, Sameer Bardia, Simran Sh... por
[IJET-V2I1P14] Authors:Aditi Verma, Rachana Agarwal, Sameer Bardia, Simran Sh...
IJET - International Journal of Engineering and Techniques
177 visualizações
•
6 slides
Text mining on Twitter information based on R platform por
Text mining on Twitter information based on R platform
Fayan TAO
286 visualizações
•
13 slides
Fuzzy AndANN Based Mining Approach Testing For Social Network Analysis por
Fuzzy AndANN Based Mining Approach Testing For Social Network Analysis
IJERA Editor
59 visualizações
•
5 slides
nm por
nm
Zainab Nayyar
145 visualizações
•
5 slides
Integrated expert recommendation model for online communitiesst02 por
Integrated expert recommendation model for online communitiesst02
IJwest
399 visualizações
•
11 slides
Mais conteúdo relacionado
Mais procurados
Exploiting Wikipedia and Twitter for Text Mining Applications por
Exploiting Wikipedia and Twitter for Text Mining Applications
IRJET Journal
31 visualizações
•
7 slides
CATEGORIZING 2019-N-COV TWITTER HASHTAG DATA BY CLUSTERING por
CATEGORIZING 2019-N-COV TWITTER HASHTAG DATA BY CLUSTERING
ijaia
74 visualizações
•
12 slides
Data mining on Social Media por
Data mining on Social Media
home
898 visualizações
•
10 slides
Predicting the Brand Popularity from the Brand Metadata por
Predicting the Brand Popularity from the Brand Metadata
IJECEIAES
10 visualizações
•
13 slides
Data mining in social network por
Data mining in social network
akash_mishra
13.7K visualizações
•
28 slides
INFORMATION RETRIEVAL TOPICS IN TWITTER USING WEIGHTED PREDICTION NETWORK por
INFORMATION RETRIEVAL TOPICS IN TWITTER USING WEIGHTED PREDICTION NETWORK
IAEME Publication
113 visualizações
•
8 slides
Mais procurados
(20)
Exploiting Wikipedia and Twitter for Text Mining Applications por IRJET Journal
Exploiting Wikipedia and Twitter for Text Mining Applications
IRJET Journal
•
31 visualizações
CATEGORIZING 2019-N-COV TWITTER HASHTAG DATA BY CLUSTERING por ijaia
CATEGORIZING 2019-N-COV TWITTER HASHTAG DATA BY CLUSTERING
ijaia
•
74 visualizações
Data mining on Social Media por home
Data mining on Social Media
home
•
898 visualizações
Predicting the Brand Popularity from the Brand Metadata por IJECEIAES
Predicting the Brand Popularity from the Brand Metadata
IJECEIAES
•
10 visualizações
Data mining in social network por akash_mishra
Data mining in social network
akash_mishra
•
13.7K visualizações
INFORMATION RETRIEVAL TOPICS IN TWITTER USING WEIGHTED PREDICTION NETWORK por IAEME Publication
INFORMATION RETRIEVAL TOPICS IN TWITTER USING WEIGHTED PREDICTION NETWORK
IAEME Publication
•
113 visualizações
Social Data Mining por Mahesh Meniya
Social Data Mining
Mahesh Meniya
•
60.9K visualizações
Vol 7 No 1 - November 2013 por ijcsbi
Vol 7 No 1 - November 2013
ijcsbi
•
234 visualizações
Data collection thru social media por i4box Anon
Data collection thru social media
i4box Anon
•
529 visualizações
Political prediction analysis using text mining and deep learning por Vishwambhar Deshpande
Political prediction analysis using text mining and deep learning
Vishwambhar Deshpande
•
34 visualizações
1 Crore Projects | ieee 2016 Projects | 2016 ieee Projects in chennai por 1crore projects
1 Crore Projects | ieee 2016 Projects | 2016 ieee Projects in chennai
1crore projects
•
49 visualizações
IRJET - Election Result Prediction using Sentiment Analysis por IRJET Journal
IRJET - Election Result Prediction using Sentiment Analysis
IRJET Journal
•
37 visualizações
CONTEXTUAL MODEL OF RECOMMENDING RESOURCES ON AN ACADEMIC NETWORKING PORTAL por cscpconf
CONTEXTUAL MODEL OF RECOMMENDING RESOURCES ON AN ACADEMIC NETWORKING PORTAL
cscpconf
•
38 visualizações
Contextual model of recommending resources on an academic networking portal por csandit
Contextual model of recommending resources on an academic networking portal
csandit
•
432 visualizações
Measuring information credibility in social media using combination of user p... por IJECEIAES
Measuring information credibility in social media using combination of user p...
IJECEIAES
•
65 visualizações
Prediction of Reaction towards Textual Posts in Social Networks por Mohamed El-Geish
Prediction of Reaction towards Textual Posts in Social Networks
Mohamed El-Geish
•
475 visualizações
Social Media Mining: An Introduction por Ali Abbasi
Social Media Mining: An Introduction
Ali Abbasi
•
2.6K visualizações
Jf2516311637 por IJERA Editor
Jf2516311637
IJERA Editor
•
223 visualizações
Social Network Analysis (Part 1) por Vala Ali Rohani
Social Network Analysis (Part 1)
Vala Ali Rohani
•
972 visualizações
A Review: Text Classification on Social Media Data por IOSR Journals
A Review: Text Classification on Social Media Data
IOSR Journals
•
212 visualizações
Similar a IRJET - Socirank Identifying and Ranking Prevalent News Topics using Social Media Factors
A REVIEW ON MACHINE LEARNING TECHNIQUES ON SOCIAL MEDIA DATA FOR POLICY MAKING por
A REVIEW ON MACHINE LEARNING TECHNIQUES ON SOCIAL MEDIA DATA FOR POLICY MAKING
Deja Lewis
2 visualizações
•
6 slides
Irjet v4 i73A Survey on Student’s Academic Experiences using Social Media Data53 por
Irjet v4 i73A Survey on Student’s Academic Experiences using Social Media Data53
IRJET Journal
71 visualizações
•
3 slides
Researching Social Media – Big Data and Social Media Analysis por
Researching Social Media – Big Data and Social Media Analysis
Farida Vis
6K visualizações
•
43 slides
Survey of data mining techniques for social por
Survey of data mining techniques for social
Firas Husseini
550 visualizações
•
25 slides
Increasing the Investment’s Opportunities in Kingdom of Saudi Arabia By Study... por
Increasing the Investment’s Opportunities in Kingdom of Saudi Arabia By Study...
AIRCC Publishing Corporation
4 visualizações
•
16 slides
INCREASING THE INVESTMENT’S OPPORTUNITIES IN KINGDOM OF SAUDI ARABIA BY STUDY... por
INCREASING THE INVESTMENT’S OPPORTUNITIES IN KINGDOM OF SAUDI ARABIA BY STUDY...
ijcsit
6 visualizações
•
16 slides
Similar a IRJET - Socirank Identifying and Ranking Prevalent News Topics using Social Media Factors
(20)
A REVIEW ON MACHINE LEARNING TECHNIQUES ON SOCIAL MEDIA DATA FOR POLICY MAKING por Deja Lewis
A REVIEW ON MACHINE LEARNING TECHNIQUES ON SOCIAL MEDIA DATA FOR POLICY MAKING
Deja Lewis
•
2 visualizações
Irjet v4 i73A Survey on Student’s Academic Experiences using Social Media Data53 por IRJET Journal
Irjet v4 i73A Survey on Student’s Academic Experiences using Social Media Data53
IRJET Journal
•
71 visualizações
Researching Social Media – Big Data and Social Media Analysis por Farida Vis
Researching Social Media – Big Data and Social Media Analysis
Farida Vis
•
6K visualizações
Survey of data mining techniques for social por Firas Husseini
Survey of data mining techniques for social
Firas Husseini
•
550 visualizações
Increasing the Investment’s Opportunities in Kingdom of Saudi Arabia By Study... por AIRCC Publishing Corporation
Increasing the Investment’s Opportunities in Kingdom of Saudi Arabia By Study...
AIRCC Publishing Corporation
•
4 visualizações
INCREASING THE INVESTMENT’S OPPORTUNITIES IN KINGDOM OF SAUDI ARABIA BY STUDY... por ijcsit
INCREASING THE INVESTMENT’S OPPORTUNITIES IN KINGDOM OF SAUDI ARABIA BY STUDY...
ijcsit
•
6 visualizações
Final_report6 por Priyanshu Kedia
Final_report6
Priyanshu Kedia
•
174 visualizações
Selasturkiye Guide To The Top 16 Social Media Research Questions By Mra Imro por Ziya NISANOGLU
Selasturkiye Guide To The Top 16 Social Media Research Questions By Mra Imro
Ziya NISANOGLU
•
312 visualizações
Sub1557 por International Journal of Science and Research (IJSR)
Sub1557
International Journal of Science and Research (IJSR)
•
335 visualizações
IRJET- Event Detection and Text Summary by Disaster Warning por IRJET Journal
IRJET- Event Detection and Text Summary by Disaster Warning
IRJET Journal
•
24 visualizações
The Implementation of Social Media for Educational Objectives por theijes
The Implementation of Social Media for Educational Objectives
theijes
•
292 visualizações
F017433947 por IOSR Journals
F017433947
IOSR Journals
•
153 visualizações
E017433538 por IOSR Journals
E017433538
IOSR Journals
•
109 visualizações
Twitter_Hashtag_Prediction.pptx por SayaliKawale2
Twitter_Hashtag_Prediction.pptx
SayaliKawale2
•
6 visualizações
Jx2517481755 por IJERA Editor
Jx2517481755
IJERA Editor
•
181 visualizações
Jx2517481755 por IJERA Editor
Jx2517481755
IJERA Editor
•
3.3K visualizações
IRJET- Effective Countering of Communal Hatred During Disaster Events in Soci... por IRJET Journal
IRJET- Effective Countering of Communal Hatred During Disaster Events in Soci...
IRJET Journal
•
21 visualizações
Knime social media_white_paper por Firas Husseini
Knime social media_white_paper
Firas Husseini
•
299 visualizações
Analyzing-Threat-Levels-of-Extremists-using-Tweets por RESHAN FARAZ
Analyzing-Threat-Levels-of-Extremists-using-Tweets
RESHAN FARAZ
•
89 visualizações
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on... por IRJET Journal
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...
IRJET Journal
•
19 visualizações
Mais de IRJET Journal
SOIL STABILIZATION USING WASTE FIBER MATERIAL por
SOIL STABILIZATION USING WASTE FIBER MATERIAL
IRJET Journal
25 visualizações
•
7 slides
Sol-gel auto-combustion produced gamma irradiated Ni1-xCdxFe2O4 nanoparticles... por
Sol-gel auto-combustion produced gamma irradiated Ni1-xCdxFe2O4 nanoparticles...
IRJET Journal
8 visualizações
•
7 slides
Identification, Discrimination and Classification of Cotton Crop by Using Mul... por
Identification, Discrimination and Classification of Cotton Crop by Using Mul...
IRJET Journal
8 visualizações
•
5 slides
“Analysis of GDP, Unemployment and Inflation rates using mathematical formula... por
“Analysis of GDP, Unemployment and Inflation rates using mathematical formula...
IRJET Journal
13 visualizações
•
11 slides
MAXIMUM POWER POINT TRACKING BASED PHOTO VOLTAIC SYSTEM FOR SMART GRID INTEGR... por
MAXIMUM POWER POINT TRACKING BASED PHOTO VOLTAIC SYSTEM FOR SMART GRID INTEGR...
IRJET Journal
14 visualizações
•
6 slides
Performance Analysis of Aerodynamic Design for Wind Turbine Blade por
Performance Analysis of Aerodynamic Design for Wind Turbine Blade
IRJET Journal
7 visualizações
•
5 slides
Mais de IRJET Journal
(20)
SOIL STABILIZATION USING WASTE FIBER MATERIAL por IRJET Journal
SOIL STABILIZATION USING WASTE FIBER MATERIAL
IRJET Journal
•
25 visualizações
Sol-gel auto-combustion produced gamma irradiated Ni1-xCdxFe2O4 nanoparticles... por IRJET Journal
Sol-gel auto-combustion produced gamma irradiated Ni1-xCdxFe2O4 nanoparticles...
IRJET Journal
•
8 visualizações
Identification, Discrimination and Classification of Cotton Crop by Using Mul... por IRJET Journal
Identification, Discrimination and Classification of Cotton Crop by Using Mul...
IRJET Journal
•
8 visualizações
“Analysis of GDP, Unemployment and Inflation rates using mathematical formula... por IRJET Journal
“Analysis of GDP, Unemployment and Inflation rates using mathematical formula...
IRJET Journal
•
13 visualizações
MAXIMUM POWER POINT TRACKING BASED PHOTO VOLTAIC SYSTEM FOR SMART GRID INTEGR... por IRJET Journal
MAXIMUM POWER POINT TRACKING BASED PHOTO VOLTAIC SYSTEM FOR SMART GRID INTEGR...
IRJET Journal
•
14 visualizações
Performance Analysis of Aerodynamic Design for Wind Turbine Blade por IRJET Journal
Performance Analysis of Aerodynamic Design for Wind Turbine Blade
IRJET Journal
•
7 visualizações
Heart Failure Prediction using Different Machine Learning Techniques por IRJET Journal
Heart Failure Prediction using Different Machine Learning Techniques
IRJET Journal
•
7 visualizações
Experimental Investigation of Solar Hot Case Based on Photovoltaic Panel por IRJET Journal
Experimental Investigation of Solar Hot Case Based on Photovoltaic Panel
IRJET Journal
•
3 visualizações
Metro Development and Pedestrian Concerns por IRJET Journal
Metro Development and Pedestrian Concerns
IRJET Journal
•
2 visualizações
Mapping the Crashworthiness Domains: Investigations Based on Scientometric An... por IRJET Journal
Mapping the Crashworthiness Domains: Investigations Based on Scientometric An...
IRJET Journal
•
3 visualizações
Data Analytics and Artificial Intelligence in Healthcare Industry por IRJET Journal
Data Analytics and Artificial Intelligence in Healthcare Industry
IRJET Journal
•
3 visualizações
DESIGN AND SIMULATION OF SOLAR BASED FAST CHARGING STATION FOR ELECTRIC VEHIC... por IRJET Journal
DESIGN AND SIMULATION OF SOLAR BASED FAST CHARGING STATION FOR ELECTRIC VEHIC...
IRJET Journal
•
54 visualizações
Efficient Design for Multi-story Building Using Pre-Fabricated Steel Structur... por IRJET Journal
Efficient Design for Multi-story Building Using Pre-Fabricated Steel Structur...
IRJET Journal
•
12 visualizações
Development of Effective Tomato Package for Post-Harvest Preservation por IRJET Journal
Development of Effective Tomato Package for Post-Harvest Preservation
IRJET Journal
•
4 visualizações
“DYNAMIC ANALYSIS OF GRAVITY RETAINING WALL WITH SOIL STRUCTURE INTERACTION” por IRJET Journal
“DYNAMIC ANALYSIS OF GRAVITY RETAINING WALL WITH SOIL STRUCTURE INTERACTION”
IRJET Journal
•
5 visualizações
Understanding the Nature of Consciousness with AI por IRJET Journal
Understanding the Nature of Consciousness with AI
IRJET Journal
•
12 visualizações
Augmented Reality App for Location based Exploration at JNTUK Kakinada por IRJET Journal
Augmented Reality App for Location based Exploration at JNTUK Kakinada
IRJET Journal
•
6 visualizações
Smart Traffic Congestion Control System: Leveraging Machine Learning for Urba... por IRJET Journal
Smart Traffic Congestion Control System: Leveraging Machine Learning for Urba...
IRJET Journal
•
18 visualizações
Enhancing Real Time Communication and Efficiency With Websocket por IRJET Journal
Enhancing Real Time Communication and Efficiency With Websocket
IRJET Journal
•
5 visualizações
Textile Industrial Wastewater Treatability Studies by Soil Aquifer Treatment ... por IRJET Journal
Textile Industrial Wastewater Treatability Studies by Soil Aquifer Treatment ...
IRJET Journal
•
4 visualizações
Último
DESIGN OF SPRINGS-UNIT4.pptx por
DESIGN OF SPRINGS-UNIT4.pptx
gopinathcreddy
19 visualizações
•
47 slides
Proposal Presentation.pptx por
Proposal Presentation.pptx
keytonallamon
63 visualizações
•
36 slides
Design of machine elements-UNIT 3.pptx por
Design of machine elements-UNIT 3.pptx
gopinathcreddy
34 visualizações
•
31 slides
_MAKRIADI-FOTEINI_diploma thesis.pptx por
_MAKRIADI-FOTEINI_diploma thesis.pptx
fotinimakriadi
10 visualizações
•
32 slides
fakenews_DBDA_Mar23.pptx por
fakenews_DBDA_Mar23.pptx
deepmitra8
16 visualizações
•
34 slides
Update 42 models(Diode/General ) in SPICE PARK(DEC2023) por
Update 42 models(Diode/General ) in SPICE PARK(DEC2023)
Tsuyoshi Horigome
39 visualizações
•
16 slides
Último
(20)
DESIGN OF SPRINGS-UNIT4.pptx por gopinathcreddy
DESIGN OF SPRINGS-UNIT4.pptx
gopinathcreddy
•
19 visualizações
Proposal Presentation.pptx por keytonallamon
Proposal Presentation.pptx
keytonallamon
•
63 visualizações
Design of machine elements-UNIT 3.pptx por gopinathcreddy
Design of machine elements-UNIT 3.pptx
gopinathcreddy
•
34 visualizações
_MAKRIADI-FOTEINI_diploma thesis.pptx por fotinimakriadi
_MAKRIADI-FOTEINI_diploma thesis.pptx
fotinimakriadi
•
10 visualizações
fakenews_DBDA_Mar23.pptx por deepmitra8
fakenews_DBDA_Mar23.pptx
deepmitra8
•
16 visualizações
Update 42 models(Diode/General ) in SPICE PARK(DEC2023) por Tsuyoshi Horigome
Update 42 models(Diode/General ) in SPICE PARK(DEC2023)
Tsuyoshi Horigome
•
39 visualizações
Searching in Data Structure por raghavbirla63
Searching in Data Structure
raghavbirla63
•
14 visualizações
Ansari: Practical experiences with an LLM-based Islamic Assistant por M Waleed Kadous
Ansari: Practical experiences with an LLM-based Islamic Assistant
M Waleed Kadous
•
7 visualizações
SUMIT SQL PROJECT SUPERSTORE 1.pptx por Sumit Jadhav
SUMIT SQL PROJECT SUPERSTORE 1.pptx
Sumit Jadhav
•
22 visualizações
sam_software_eng_cv.pdf por sammyigbinovia
sam_software_eng_cv.pdf
sammyigbinovia
•
9 visualizações
Web Dev Session 1.pptx por VedVekhande
Web Dev Session 1.pptx
VedVekhande
•
13 visualizações
REACTJS.pdf por ArthyR3
REACTJS.pdf
ArthyR3
•
35 visualizações
802.11 Computer Networks por TusharChoudhary72015
802.11 Computer Networks
TusharChoudhary72015
•
13 visualizações
Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc... por csegroupvn
Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc...
csegroupvn
•
6 visualizações
MongoDB.pdf por ArthyR3
MongoDB.pdf
ArthyR3
•
49 visualizações
GDSC Mikroskil Members Onboarding 2023.pdf por gdscmikroskil
GDSC Mikroskil Members Onboarding 2023.pdf
gdscmikroskil
•
59 visualizações
Renewal Projects in Seismic Construction por Engineering & Seismic Construction
Renewal Projects in Seismic Construction
Engineering & Seismic Construction
•
5 visualizações
SPICE PARK DEC2023 (6,625 SPICE Models) por Tsuyoshi Horigome
SPICE PARK DEC2023 (6,625 SPICE Models)
Tsuyoshi Horigome
•
36 visualizações
MK__Cert.pdf por Hassan Khan
MK__Cert.pdf
Hassan Khan
•
16 visualizações
ASSIGNMENTS ON FUZZY LOGIC IN TRAFFIC FLOW.pdf por AlhamduKure
ASSIGNMENTS ON FUZZY LOGIC IN TRAFFIC FLOW.pdf
AlhamduKure
•
6 visualizações
IRJET - Socirank Identifying and Ranking Prevalent News Topics using Social Media Factors
1.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 925 SOCIRANK IDENTIFYING AND RANKING PREVALENT NEWS TOPICS USING SOCIAL MEDIA FACTORS SRIKANTH.M1, RIHANA PARVEEN.SK2, SIVA SAI.B3, SRAVANI.M4, VENU GOPAL REDDY.G5 1B.Tech, M.Tech, Associate Professor Dept. of CSE, Tirumala Engineering College, Narasaraopet, Guntur, A.P, India 2345 B.Tech Students, Dept. of CSE, Tirumala Engineering College, Narasaraopet, Guntur, A.P, India ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - Mass media sources, specifically the news media, have traditionally informed us of daily events. In modern times, social media services such as Twitter provide an enormous amount of user- generated data, which have great potential to contain informative news-related content. For these resources to be useful, we must find a way to filter noise and only capture the content that, based on its similarity to the news media, is considered valuable. However, even after noise is removed, information overload may still exist in the remaining data—hence; it is convenient to prioritize it for consumption. To achieve prioritization, information must be ranked in order of estimated importance considering three factors. First, the temporal prevalence of a particular topic in the news media is a factor of importance, and can be considered the media focus (MF) of a topic. Second, the temporal prevalence of the topic in social media indicates its user attention (UA). Last, the interaction between the social media users who mention this topic indicates the strength of the community discussing it, and can be regarded as the user interaction (UI) toward thetopic. Weproposeanunsupervised framework—SociRank—which identifies news topics prevalent in both social media and the news media, and then ranks them by relevance using their degrees of MF, UA, and UI. Our experiments show that SociRank improvesthequalityand variety of automatically identified news topics. Key Words: Information filtering, social computing, social network analysis, topic identification, topic ranking. 1. INTRODUCTION The mining of valuable information from online sources has become a prominent research area in information technology in recent years. Historically, knowledge that apprises the general public ofdailyeventshasbeenprovided by mass media sources, specifically the newsmedia.Many of these news media sources have either abandoned their hardcopy publications and moved to the World Wide Web, or now produce both hard-copy and Internet versions simultaneously. These news media sources are considered reliable because they are published by professional journalists, who are held accountable for their content. On the other hand, the Internet, being a free and open forum for information exchange, has recently seen a fascinating phenomenon knownassocial media.Insocial media,regular, nonjournalist users are able to publish unverified content and express their interest in certain events. 1.1 Selection of Common News Topics Microblogs have become one of the most popular social media outlets. One microblogging service in particular, Twitter, is used by millions of people around the world, providing enormous amounts of user-generated data. One may assume that this sourcepotentiallycontainsinformation with equal or greater value than the news media, but one mustalso assume that because of the unverifiednatureofthe source, much of this content is useless. For social media data to be of any use fortopic identification, wemust find awayto filter uninformative information and capture only information which, based on its content similarity to the news media, may be considered useful or valuable. The news media presentsprofessionallyverifiedoccurrences or events, while social media presents the interests of the audience in these areas, and may thus provide insight into their popularity. Social media services like Twitter can also provide additional or supporting information to a particular news media topic. In summary, truly valuable information may be thought of as the area in which these two media sources topically intersect. Unfortunately, even after the removal of unimportant content, there is still information overload in the remaining news-related data, which must be prioritized for consumption. 1.2 Ranking them based on Social Media A straightforward approach for identifying topics from different social and news media sources is the application of topic modeling. Many methods have been proposed in this area, such as latent Dirichlet allocation (LDA) and probabilistic latent semantic analysis (PLSA). Topic modeling is, in essence, the discovery of “topics” in text corpora by clustering together frequently co-occurring words. This approach, however, misses out in the temporal component of prevalent topic detection, that is, it does not take into account how topics changewithtime.Furthermore, topic modeling and other topic detection techniques do not rank topics according to their popularity by taking into account their prevalence in both news media and social media. We propose an unsupervised system—SociRank—which effectively identifies news topics that are prevalent in both social media and the news media, and then ranks them by relevance using their degrees of MF, UA, andUI.Eventhough
2.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 926 this paper focuses on news topics, it can be easily adaptedto a wide variety of fields, from science and technology to culture and sports. To the best of our knowledge, no other work attempts to employ the use of either the social media interests of users or their social relationships to aid in the ranking of topics. Moreover, SociRank undergoes an empirical framework, comprising and integrating several techniques, such as keyword extraction, measures of similarity, graph clustering, and social network analysis.The effectiveness of our system is validated by extensive controlled and uncontrolled experiments. 2. Releated work The main research areas applied in this paper include: topic identification, topic ranking social, network analysis, keyword extraction, co-occurrence similarity measures, and graphclustering. Extensive workhasbeenconductedinmost of these areas. 2.1 Topic Identification Much research has been carried out in the field of topic identification—referred to more formally as topic modeling. Two traditional methods for detecting topics are LDA and PLSA. LDA is a generative probabilistic model that can be applied to different tasks, including topic identification. PLSA, similarly, is a statistical technique, which can also be applied to topic modeling. In these approaches, however, temporal information is lost, which is paramount in identifying prevalent topics and is an important characteristic of social media data. Furthermore, LDA and PLSA only discover topics from text corpora; they do not rank based on popularity or prevalence. Wartena and Brusseeimplementeda methodtodetecttopics by clustering keywords. Their method entails the clustering ofkeywords—basedondifferentsimilaritymeasures—using the induced k-bisecting clustering algorithm. Although they do not employ the use of graphs, they do observe that a distance measure based on the Jensen–Shannon divergence (or information radius)ofprobabilitydistributionsperforms well. 2.2 Topic Ranking Another major concept that is incorporatedintothispaperis topic ranking. There are several means by which this task can be accomplished, traditionally being done by estimating how frequently and recently a topic has been reported by mass media. 2.3 Social Network Analysis In the case of UA, Wang, estimated this factor by using anonymous website visitor data. Their method counts the amount of times a site was visited during a particular period of time, which represents the UA of the topic to which the site is related. Our belief, on the other hand, is that, although website usage statistics provide initial proof of attention, additional data are needed to corroborate it. We employ the use of social media, specifically Twitter, as a means to estimate UA. When a user tweets about a particular topic, it signifies that the user is interested in the topic and it has captured her attention more so than visiting a website related to it. 2.4 Keyword Extraction Concerning the field of keyword or informative term extraction, many unsupervised and supervised methods have been proposed. Unsupervised methods for keyword extraction rely solely on implicit information found in individual texts or in a text corpus. Supervised methods, on the other hand, make use of training datasets that have already been classified. 2.5 Co-Occurrence Similarity Matsuo and Ishizuka suggested that the co-occurrence relationship of frequent word pairs from a single document may provide statistical information to aid in the identification of the document’s keywords. They proposed that if the probability distribution of co-occurrencebetween a term x and all other terms in a document is biased to a particular subset of frequent terms, then term x is likely to be a keyword. Even though our intentionisnottoemploy co- occurrence for keyword extraction, this hypothesis emphasizes the importance of co-occurrence relationships. 2.6 Graph Clustering The main purpose of graph clustering in this paper is to identify and separate TCs, as done in Wartena and Brussee’s work. Iwasaka and Tanaka-Ishiialsoproposeda methodthat clusters a co-occurrence graph based on a graph measure known as transitivity. The basic idea of transitivity is that in a relationship between three elements, if the relationship holds between the first and second elements and between the second and third elements, it also holdsbetweenthefirst and third elements. They suggested that each output cluster is expected to have no ambiguity, and that this is only achieved when the edges of a graph (representing co- occurrence relations) are transitive. 3. SOCIRANK FRAMEWORK The goal of our method—SociRank—is to identify, consolidate and rank the most prevalent topics discussed in both news media and social media duringa specificperiodof time. The system framework can be visualized in Fig. 1. To achieve its goal, the system must undergo four main stages.
3.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 927 1) Preprocessing: Key terms are extracted and filtered from news and social data corresponding to a particular periodof time. 2) Key Term Graph Construction: A graphisconstructedfrom the previously extracted key term set, whose vertices represent the key terms and edges represent the co- occurrence similarity between them. The graph, after processing and pruning, contains slightly joint clusters of topics popular in both news media and social media. 3) Graph Clustering: The graph is clustered in order to obtain well-defined and disjoint TCs. 4) Content Selection and Ranking:TheTCsfromthegraph are selected and ranked using the three relevance factors (MF, UA, and UI). Initially, news and tweets data are crawledfromtheInternet and stored in a database. News articles are obtained from specific news websites via their RSS feeds and tweets are crawled from the Twitter public timeline . A user then requests an output of the top k ranked news topics for a specified period of time between date d1 (start) and date d2 (end). 3.1 Preprocessing In the preprocessing stage, the system first queries all news articles and tweets from the database thatfall withindate d1 and date d2. Additionally, two sets of terms are created: one for the news articles and one for the tweets, as explained below. 1) News Term Extraction: The set of terms from the news data source consists of keywords extracted from all the queried articles. Due to its simple implementation and effectiveness, we implement a variant of the popular TextRank algorithm to extract the top k keywordsfromeach news article.2 The selected keywords are then lemmatized using the WordNet lemmatizer in order toconsiderdifferent inflected forms of a word as a single item. After lemmatization, all unique terms are added to set N. It is worth pointing out that, since N is a set, it does not contain duplicate terms. 2) Tweets Term Extraction: For the tweets data source, the set of terms are not the tweets’ keywords, but all uniqueand relevant terms. First, the language of each queried tweet is identified, disregarding any tweet that is not in English. From the remaining tweets, all terms that appear in a stop word list or that are less than three characters in length are eliminated. The part of speech (POS) of each term in the tweets is then identified using a POS tagger. This POS tagger is especially useful because it can identify Twitter-specific POSs, such as hashtags, mentions, and emoticon symbols. 3.2 Key Term Graph Construction In this component, a graph G is constructed,whoseclustered nodes represent the most prevalent news topics in both news and social media. The vertices in G are unique terms selected from N and T, and the edges are represented by a relationship between these terms. In the following sections, we define a method for selecting the terms and establish a relationship between them. After the terms and relationships are identified, the graph is pruned by filtering out unimportant vertices and edges. 3.3 Graph Clustering Once graph G has been constructed and its most significant terms (vertices) and term-pair co-occurrencevalues(edges) have been selected, the next goal is to identify and separate well-defined TCs (subgraphs)inthegraph.Before explaining the graph clustering algorithm, the conceptsof betweenness and transitivity must first be understood.
4.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 928 3.4 Content Selection and Ranking Now that the prevalent news-TCs that fall within dates d1 and d2 have been identified, relevant content from the two media sources that is related tothesetopicsmustbeselected and finally ranked. Related items from the news media will represent the MF of the topic. Similarly, related items from social media (Twitter) will represent the UA—more specifically, the number of unique Twitter users related to the selected tweets. Selecting the appropriate items (i.e., tweets and news articles) related to the key terms of a topic is not an easy task, as many other items unrelated to the desired topic also contain similar key terms. 4. CONCLUSION In this paper, we proposed an unsupervised method— SociRank—which identifies news topics prevalent in both social media and the news media, and then ranks them by taking into account their MF, UA, and UI as relevancefactors. The temporal prevalence of a particular topic in the news media is considered the MF of a topic, which gives us insight into its mass media popularity. The temporal prevalence of the topic in social media, specifically Twitter, indicates user interest, and is considered its UA. Finally, the interaction between the social media users who mention the topic indicates the strength of the community discussing it, and is considered the UI. To the best of our knowledge, no other work has attempted to employ the use of either theinterests of social media users or their social relationships to aid in the ranking of topics. REFERENCES [1] D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet allocation,” J. Mach. Learn. Res.,vol.3,pp.993–1022,Jan. 2003. [2] T. Hofmann, “Probabilistic latent semantic analysis,” in Proc. 15th Conf. Uncertainty Artif. Intell., 1999,pp.289– 296. [3] T. Hofmann, “Probabilistic latent semantic indexing,” in Proc. 22nd Annu. Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, Berkeley, CA, USA, 1999, pp. 50–57. [4] C. Wartena and R. Brussee, “Topic detection by clustering keywords,” in Proc. 19th Int. Workshop Database Expert Syst. Appl. (DEXA), Turin, Italy, 2008, pp. 54–58. [5] C. D. Manning and H. Schütze, Foundations of Statistical Natural Language Processing. Cambridge, MA,USA:MIT Press, 1999. [6] J. Sankaranarayanan, H. Samet, B. E. Teitler, M. D. Lieberman, and J. Sperling, “TwitterStand: News in tweets,” in Proc. 17th ACM SIGSPATIAL Int. Conf. Adv. Geograph. Inf. Syst., Seattle, WA, USA, 2009, pp. 42–51. [7] C. Zhang, “Automatic keyword extraction from documents using conditional random fields,” J. Comput. Inf. Syst., vol. 4, no. 3, pp. 1169–1180, 2008. [8] K. Sarkar, M. Nasipuri, and S. Ghose, “A new approach to keyphrase extraction using neural networks,” Int. J. Comput. Sci. Issues, vol. 7, no. 3, pp. 16–25, Mar. 2010. [9] G. Figueroa and Y.-S. Chen, “Collaborative ranking between supervised and unsupervised approaches for keyphrase extraction,” in Proc. Conf. Comput. Linguist. Speech Process. (ROCLING), 2014, pp. 110–124.
Baixar agora