Topic Lifecycle on Social Networks

Based on research work “Topic lifecycle on social networks: analyzing the effects of semantic continuity and social communities” published in European Conference on Information Retrieval, 29-42, 2018
By K Dey, S Kaushik, K Garg, R Shrivastava

Abstract: Topic lifecycle analysis on Twitter, a branch of study that investigates Twitter topics from their birth through lifecycle to death, has gained immense mainstream research popularity. In the literature, topics are often treated as one of (a) hashtags (independent from other hashtags), (b) a burst of keywords in a short time span or (c) a latent concept space captured by advanced text analysis methodologies, such as Latent Dirichlet Allocation (LDA). The first two approaches are not capable of recognizing topics where different users use different hashtags to express the same concept (semantically related), while the third approach misses out the user’s explicit intent expressed via hashtags.

  1. Topic Lifecycle on Social Networks: Analyzing the Effects of Semantic Continuity and Social Communities. Kuntal Dey, Saroj Kaushik, Kritika Garg, Ritvik Shrivastava
  2. Topic Lifecycle Analysis. Topic lifecycle analysis focuses on:
     1. Identifying the topics of user interest.
     2. Understanding the lifecycle of these topics:
        a. how these topics emerge;
        b. how they spread over the social network and proliferate across several users;
        c. how they eventually subside over time.
  3. Topic Lifecycle Analysis. The literature in this area has seen strong work in the following forms:
     ● Event lifecycle analysis with each hashtag considered as a separate topic [Ardon et al., CIKM 2013].
     ● The K-Spectral Centroid (KSC) clustering approach, which detects similarity in the occurrence patterns of hashtags [Yang and Leskovec, WSDM 2011].
     ● More recently, temporal clustering algorithms based on the hypothesis that semantically related hashtags have similar and synchronous usage patterns.
  4. Our Contributions. Our work is the first to examine the lifecycle of a collection of hashtags as topics in the context of communities. We investigate the effects of:
     1. Semantic/conceptual similarity
     2. Temporal relatedness
     3. Social connections
  5. Our Approach
     1. Topic cluster formation
     2. Creating community-level hashtag and topic timelines
     3. Hashtag and topic lifecycle analysis with respect to the communities
  7. Topic Cluster Formation. We create a separate document for each hashtag in the dataset by collecting all the tweets in the dataset that used that hashtag (one or more times).
     [Pipeline figure, stages: Create Document; Compute Hashtag Embeddings; Create (Initial) Semantically Related Hashtag Clusters; Find Temporal Relationships; Final Clusters. Example documents: #hashtag1 {tweets}, #hashtag2 {all tweets}]
  8. Topic Cluster Formation. We compute a global average embedding for each hashtag document:
     1. Eliminate all the hashtags occurring in the document.
     2. Look up the GloVe Twitter embedding for each word in the document.
     3. Average all of the above word embeddings to obtain the hashtag embedding.
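The three embedding steps on slide 8 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: `TOY_EMBEDDINGS` is a hypothetical two-dimensional stand-in for the GloVe Twitter vectors, tokenization is plain whitespace splitting, and out-of-vocabulary words are simply skipped (all assumptions).

```python
from typing import Dict, List, Optional

# Hypothetical toy vectors standing in for the GloVe Twitter embeddings.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "tennis": [1.0, 0.0],
    "final": [0.0, 1.0],
    "paris": [1.0, 1.0],
}


def hashtag_embedding(
    document: List[str], embeddings: Dict[str, List[float]]
) -> Optional[List[float]]:
    """Average the word vectors of all tweets in one hashtag document."""
    vectors = []
    for tweet in document:
        for token in tweet.lower().split():
            if token.startswith("#"):
                continue  # step 1: eliminate hashtags
            vector = embeddings.get(token)  # step 2: look up the word vector
            if vector is not None:  # assumption: unknown words are skipped
                vectors.append(vector)
    if not vectors:
        return None
    # step 3: component-wise average over all collected word vectors
    return [sum(component) / len(vectors) for component in zip(*vectors)]


# The "document" for #frenchopen: every tweet that used the hashtag.
doc = ["tennis final #frenchopen", "paris tennis #frenchopen"]
embedding = hashtag_embedding(doc, TOY_EMBEDDINGS)
print(embedding)
```

With real GloVe vectors the dictionary would be loaded from the published vector file; the averaging logic stays the same.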
  9. Topic Cluster Formation
     1. Find the cosine similarity between each pair of hashtag embedding vectors.
     2. Apply K-means clustering to obtain the conceptually/semantically related clusters.
  10. K-means Clustering (figure source: https://www.researchgate.net/figure/K-means-clustering-algorithm-An-example-2-cluster-run-is-shown-with-the-clusters_fig3_268880805)
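As a rough sketch of slides 9 and 10, the snippet below computes cosine similarity and runs a plain Lloyd-style K-means. The slides do not say how the two are combined, so this sketch makes an assumption: vectors are L2-normalized first, which makes Euclidean distance rank pairs the same way cosine similarity does. The seeding from the first `k` points, the value of `k`, and the toy vectors are also illustrative choices.

```python
import math
from typing import Dict, List, Set


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def normalize(v: List[float]) -> List[float]:
    """Scale a vector to unit length so Euclidean distance tracks cosine."""
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v] if n else list(v)


def kmeans(points: List[List[float]], k: int, iterations: int = 20) -> List[int]:
    """Plain Lloyd iterations; returns one cluster label per point."""
    centroids = [list(p) for p in points[:k]]  # assumption: seed with first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical hashtag embeddings (toy values).
embeddings: Dict[str, List[float]] = {
    "#federer": [0.9, 0.1],
    "#election": [0.1, 0.9],
    "#rogerfederer": [0.8, 0.2],
    "#vote": [0.2, 0.8],
}
tags = list(embeddings)
labels = kmeans([normalize(embeddings[t]) for t in tags], k=2)

clusters: Dict[int, Set[str]] = {}
for tag, label in zip(tags, labels):
    clusters.setdefault(label, set()).add(tag)
print(clusters)
```

In practice a library implementation with randomized restarts would be preferable; the fixed seeding here only keeps the example deterministic.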
  11. Topic Cluster Formation
     1. We create a time series of the individual hashtags.
     2. We temporally relate a pair of hashtags hi and hj if any of the following holds:
        a. They satisfy the overlaps relationship.
        b. There exist one or more hashtags hk such that hi is temporally related to hk, and hk overlaps hj.
        c. They are disjoint by less than a threshold number of days (2 in our experiments).
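The temporal-relatedness rules on slide 11 can be illustrated as a small fixed-point computation. The sketch below makes several assumptions that the slide does not state: each hashtag's time series is reduced to a single active interval of inclusive day indices, the relation is treated as symmetric, and the hashtag names and intervals are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set, Tuple

Interval = Tuple[int, int]  # (first_day, last_day), inclusive day indices


def overlaps(a: Interval, b: Interval) -> bool:
    """Rule (a): the two active intervals share at least one day."""
    return a[0] <= b[1] and b[0] <= a[1]


def gap_days(a: Interval, b: Interval) -> int:
    """Days from the end of the earlier interval to the start of the later one."""
    return max(a[0], b[0]) - min(a[1], b[1])


def temporally_related(
    intervals: Dict[str, Interval], max_gap_days: int = 2
) -> Set[FrozenSet[str]]:
    tags = sorted(intervals)
    related: Set[FrozenSet[str]] = set()
    # Rules (a) and (c): direct overlap, or disjoint by fewer than max_gap_days.
    for x, y in combinations(tags, 2):
        if overlaps(intervals[x], intervals[y]) or gap_days(intervals[x], intervals[y]) < max_gap_days:
            related.add(frozenset((x, y)))
    # Rule (b): if hi is related to hk and hk overlaps hj, relate hi and hj.
    changed = True
    while changed:
        changed = False
        for pair in list(related):
            first, second = tuple(pair)
            for hi, hk in ((first, second), (second, first)):
                for hj in tags:
                    if hj in (hi, hk):
                        continue
                    candidate = frozenset((hi, hj))
                    if overlaps(intervals[hk], intervals[hj]) and candidate not in related:
                        related.add(candidate)
                        changed = True
    return related


# Hypothetical activity windows.
intervals = {"#a": (1, 3), "#b": (3, 6), "#c": (5, 9), "#d": (20, 22)}
related = temporally_related(intervals)
print(sorted(tuple(sorted(p)) for p in related))
```

Here `#a` and `#c` neither overlap nor fall within the gap threshold, but they become related through `#b` by rule (b), while `#d` stays unrelated to everything.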
  12. Topic Cluster Formation. We finalize our topics, defined as hashtag clusters, such that each consists of hashtags that are both semantically and temporally related.
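One plausible way to combine the two relations is sketched below. The slides do not spell out the exact procedure, so this is an assumption: each semantic cluster is split into connected components under the temporal relation, and every component becomes a final topic. The input clusters and related pairs are hypothetical.

```python
from typing import FrozenSet, List, Set


def final_topics(
    semantic_clusters: List[Set[str]], related_pairs: Set[FrozenSet[str]]
) -> List[Set[str]]:
    """Split each semantic cluster into temporally connected components."""
    topics: List[Set[str]] = []
    for cluster in semantic_clusters:
        unvisited = set(cluster)
        while unvisited:
            seed = min(unvisited)  # deterministic choice of starting hashtag
            unvisited.discard(seed)
            component = {seed}
            frontier = [seed]
            while frontier:
                current = frontier.pop()
                for other in sorted(unvisited):
                    if frozenset((current, other)) in related_pairs:
                        unvisited.discard(other)
                        component.add(other)
                        frontier.append(other)
            topics.append(component)
    return topics


semantic_clusters = [{"#federer", "#rogerfederer", "#wimbledon"}, {"#election"}]
related_pairs = {
    frozenset(("#federer", "#rogerfederer")),
    frozenset(("#rogerfederer", "#election")),  # temporal link across clusters is ignored
}
topics = final_topics(semantic_clusters, related_pairs)
print(topics)
```

Under this reading, `#wimbledon` is semantically close to the Federer hashtags but forms its own topic because it is not temporally related to them.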
  14. Creating Community-Level Hashtag & Topic Timelines
     1. Using the Twitter followership network of the users that posted the tweets, we discover modularity-based communities.
     2. We aggregate the users’ hashtag usage behavior to find the total usage of each hashtag by community members and to build timelines.
     3. This procedure yields two timelines for each community:
        a. Hashtag-level usage timeline of communities
        b. Topic (cluster)-level usage timeline of communities
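To make "modularity-based communities" concrete, the sketch below computes the Newman modularity score Q that such methods try to maximize, for a given partition. It does not perform the community search itself, and it assumes the followership network is treated as undirected and unweighted; the slides do not name the detection algorithm, and the toy graph is hypothetical.

```python
from typing import Dict, List, Tuple


def modularity(edges: List[Tuple[str, str]], community_of: Dict[str, str]) -> float:
    """Q = sum over communities of (intra_edges / m) - (degree_sum / (2m))^2."""
    m = len(edges)
    if m == 0:
        return 0.0
    intra: Dict[str, int] = {}   # community -> edges with both endpoints inside
    degree: Dict[str, int] = {}  # community -> sum of member degrees
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())


# Two triangles of users joined by a single follow edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
split = {"a": "c1", "b": "c1", "c": "c1", "d": "c2", "e": "c2", "f": "c2"}
single = {user: "c1" for user in "abcdef"}

q_split = modularity(edges, split)
q_single = modularity(edges, single)
print(q_split, q_single)
```

Splitting the graph into its two triangles scores higher than lumping all users together, which is the signal a modularity-maximizing algorithm follows.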
  15. Hashtag-level Usage Timeline of Communities. For each given time-slot, all the usages of a given hashtag by all the community members are summed up to find the total number of usages of the hashtag by the community (that is, its members).
     Topic (Cluster)-level Usage Timeline of Communities
     1. For each given time-slot, for each community, we sum up the usage count of all the hashtags belonging to the same topic cluster.
     2. This is useful for identifying the participation of each given community in the topic as a whole, within the given time-slot.
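Both timelines described on slide 15 are simple grouped counts. The sketch below assumes usage records of the form (user, time-slot, hashtag) plus two lookup tables, `community_of` and `topic_of`; these names and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical usage records: (user, time-slot, hashtag).
usages: List[Tuple[str, int, str]] = [
    ("alice", 1, "#federer"),
    ("bob", 1, "#federer"),
    ("bob", 1, "#rogerfederer"),
    ("carol", 1, "#rogerfederer"),
    ("alice", 2, "#federer"),
]
community_of: Dict[str, str] = {"alice": "c1", "bob": "c1", "carol": "c2"}
topic_of: Dict[str, str] = {"#federer": "tennis", "#rogerfederer": "tennis"}


def build_timelines(usages, community_of, topic_of):
    """Return (hashtag-level, topic-level) usage counts per community and slot."""
    hashtag_timeline: Counter = Counter()
    topic_timeline: Counter = Counter()
    for user, slot, tag in usages:
        community = community_of[user]
        # Hashtag-level: sum usages of one hashtag by all community members.
        hashtag_timeline[(community, slot, tag)] += 1
        # Topic-level: sum usages of all hashtags in the same topic cluster.
        topic_timeline[(community, slot, topic_of[tag])] += 1
    return hashtag_timeline, topic_timeline


hashtag_timeline, topic_timeline = build_timelines(usages, community_of, topic_of)
print(hashtag_timeline[("c1", 1, "#federer")], topic_timeline[("c1", 1, "tennis")])
```

Keying both counters by (community, slot, item) keeps the two timelines directly comparable.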
  17. Topic Lifecycle Characteristics for Analysis
     1. Dominant hashtag and topic morphing
        ○ A dominant hashtag is the one that has been used most frequently within a given time-slot, among all the hashtags.
        ○ A topic morphs when its dominant hashtag changes from one to another.
     2. Topic intensity
        ○ The intensity of a topic is the sum of the number of times each of its hashtags is used.
        ○ It is computed to determine the aggregate presence of a topic at any time-slot or within any community.
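The three characteristics on slide 17 follow directly from a per-slot count table for one topic. The sketch below uses a hypothetical timeline and breaks ties between equally frequent hashtags alphabetically, which is an assumption the slides do not address.

```python
from typing import Dict, List, Tuple

# Hypothetical per-slot hashtag counts for a single topic.
timeline: Dict[int, Dict[str, int]] = {
    1: {"#federer": 5, "#rogerfederer": 2},
    2: {"#federer": 3, "#rogerfederer": 4},
    3: {"#rogerfederer": 6},
}


def dominant_hashtags(timeline: Dict[int, Dict[str, int]]) -> Dict[int, str]:
    """Most frequently used hashtag in each slot (alphabetical tie-break)."""
    return {
        slot: max(sorted(counts), key=counts.get)
        for slot, counts in sorted(timeline.items())
    }


def morph_points(dominant: Dict[int, str]) -> List[Tuple[int, str, str]]:
    """Slots where the dominant hashtag differs from the previous slot's."""
    slots = sorted(dominant)
    return [
        (cur, dominant[prev], dominant[cur])
        for prev, cur in zip(slots, slots[1:])
        if dominant[prev] != dominant[cur]
    ]


def intensity(timeline: Dict[int, Dict[str, int]]) -> Dict[int, int]:
    """Total usages of all the topic's hashtags in each slot."""
    return {slot: sum(counts.values()) for slot, counts in sorted(timeline.items())}


dominant = dominant_hashtags(timeline)
morphs = morph_points(dominant)
topic_intensity = intensity(timeline)
print(dominant, morphs, topic_intensity)
```

In this toy timeline the topic morphs once, in slot 2, when `#rogerfederer` overtakes `#federer`.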
  18. Dataset
     ● We use the Twitter dataset by Yang and Leskovec, which is publicly available in the Stanford Large Network Dataset Collection (SNAP) repository.
     ● For our experiments, we use 1 month of Twitter data from this dataset.
  19. Analysis
  20. Proposed Hypotheses
     1. Conceptually related hashtags overlap semantically & temporally.
     2. Hashtags associate with communities at a given time.
     3. Hashtags are independently used across communities.
     4. Hashtags evolve independently (atomically) within communities.
  22. Different users use different hashtags at the same time for the same topic; these hashtags are semantically related and temporally overlapping. For example, User 1 uses #frenchopen while User 2 uses #RollandGarros.
  24. Hashtag usage is a community-level characteristic rather than an individual-level one. Individuals mostly tend to use the same hashtag that their community uses for a given topic at a given time. If two users u1 and u2 belong to the same community, they are both likely to use #federer, rather than one using #federer and the other using #rogerfederer.
  25. [Figure panels: Community 1, Community 2]
  27. For the same topic, at the same time, one community would use one hashtag while another community would use a different hashtag. [Figure panels: Community 1, Community 2]
  29. The evolution and lifecycle of hashtags (and topics) are community-specific.
  30. The global (overall) lifecycle of a given topic can be derived as an aggregation of the lifecycles of that topic within individual communities.
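The aggregation on slide 30 can be read as summing a topic's per-community counts slot by slot. The sketch below assumes that reading (simple summation) and a hypothetical topic-level timeline keyed by (community, time-slot, topic).

```python
from collections import Counter
from typing import Dict, Tuple

# Hypothetical topic-level usage counts per community and time-slot.
community_topic_timeline: Dict[Tuple[str, int, str], int] = {
    ("c1", 1, "tennis"): 3,
    ("c2", 1, "tennis"): 1,
    ("c1", 2, "tennis"): 1,
    ("c2", 2, "politics"): 4,
}


def global_lifecycle(
    community_topic_timeline: Dict[Tuple[str, int, str], int], topic: str
) -> Dict[int, int]:
    """Sum one topic's usage over all communities, slot by slot."""
    totals: Counter = Counter()
    for (community, slot, name), count in community_topic_timeline.items():
        if name == topic:
            totals[slot] += count
    return dict(sorted(totals.items()))


tennis_lifecycle = global_lifecycle(community_topic_timeline, "tennis")
print(tennis_lifecycle)
```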
  32. Summary. We present a novel analysis of topic lifecycles with respect to the social, temporal and semantic aspects of hashtags and user communities in a social network.
