BotNetBM

          A Benchmark for Social Networks


                                       CWI
                            Project Meeting@Innsbruck
                              Feb 28 - Mar 04, 2011




Wednesday, March 02, 2011
Motivation
     —   Highly linked data

     —   No (good) benchmark yet for social
          networks




BotNetBM
     —   A benchmark for social networks

     —   Simulates an RDF OLTP backend

     —   Simulates random activities of a large number of users

     —   Simulates on-site “analyst” ➠ weekly
          “analytic report”

     —   One parameter: scale (#user accounts to
          start BM)


BotNetBM Queries
     —   SPARQL 1.1 + SPARUL

     —   User Actions

          ◦ Interactive queries (80%)

          ◦ Update transactions (20%)

     —   Measurement: successful #clicks/min.

          ◦ Update transactions must commit; penalty if they take > 3 sec.

          ◦ Interactive queries: response time < 3 sec.

     —   Analytic queries (must finish within simulated weekend)
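The measurement rule above can be sketched as follows. This is a minimal illustration, not the benchmark's actual scoring code: the `Click` record, the function name, and in particular the weight given to slow transactions are assumptions, since the slides only state that a penalty applies.

```python
from dataclasses import dataclass

LIMIT_SEC = 3.0  # response-time limit taken from the slide


@dataclass
class Click:
    kind: str               # "query" (interactive) or "update" (transaction)
    seconds: float          # observed response time
    committed: bool = True  # only meaningful for updates


def successful_clicks_per_min(clicks, wall_minutes, slow_update_weight=0.5):
    """Score a run as successful clicks per minute.

    Assumption: a committed but slow update counts as `slow_update_weight`
    of a click; the slides do not say how large the penalty is.
    """
    score = 0.0
    for c in clicks:
        if c.kind == "query":
            # Interactive queries must answer in under 3 seconds.
            if c.seconds < LIMIT_SEC:
                score += 1.0
        elif c.kind == "update" and c.committed:
            # Committed updates count fully if fast, partially if slow.
            score += 1.0 if c.seconds <= LIMIT_SEC else slow_update_weight
    return score / wall_minutes


run = [
    Click("query", 0.4),
    Click("query", 3.5),                    # too slow: not counted
    Click("update", 1.2),
    Click("update", 4.0),                   # committed but slow: penalised
    Click("update", 0.9, committed=False),  # aborted: not counted
]
print(successful_clicks_per_min(run, wall_minutes=1.0))
```

Keeping the penalty as a parameter makes it easy to compare a strict rule (weight 0) with a lenient one without touching the rest of the scoring.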

Limitations
     —   Data generator: too uniform, not realistic for social networks

          ◦ 10 operations / user / simulated day

          ◦ all users are equally active

          ◦ some queries have no “meaningful” relation to each other

          ◦ read/write contention unrealistically frequent
          ◦ ...

     —   Query mix:

          ◦ Does not exploit SPARQL 1.1 advanced features
          ◦ No link to other RDF datasets

     —   Queries do not run with the open source ed. of Virtuoso Server

Our Goals
     —   Exploit SPARQL 1.1 features in queries

          ◦ “Property Path Expressions”

     —   Add links to well-known RDF datasets into the queries

          ◦ DBpedia

     —   Use real-life analysis info (e.g., Twitter)

          ◦ redesign data generator

          ◦ distribution of interactive/update queries

     —   Use real-life social network data

          ◦ Twitter, Facebook, Orkut, MySpace, ...

     —   Migration to MonetDB
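One way to move away from the uniform "10 operations / user / simulated day" model is to keep the same total activity budget but spread it unevenly over users. The sketch below uses a Zipf-like rank weighting; that choice, the exponent `alpha`, and the function names are assumptions for illustration and are not taken from the slides or from the cited Twitter analyses.

```python
def uniform_ops(n_users, ops_per_user=10):
    """Current generator: every user performs the same number of operations."""
    return [ops_per_user] * n_users


def skewed_ops(n_users, ops_per_user=10, alpha=1.0):
    """Spread the same daily budget with Zipf-like weights 1 / rank**alpha.

    Returns operations per user, most active user first. Because of
    rounding, the total may differ slightly from the uniform budget.
    """
    budget = n_users * ops_per_user
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_users + 1)]
    total_weight = sum(weights)
    return [round(budget * w / total_weight) for w in weights]


flat = uniform_ops(5)
skew = skewed_ops(5)
print(flat)
print(skew)
```

A redesigned generator would fit the weights to measured activity data rather than to a fixed formula; the point here is only that a few users account for most operations.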

Done
     —   Loaded into the Virtuoso Server (commercial ed.)

     —   Design of new query mix

     —   Twitter datasets

          ◦ http://infochimps.com/collections/twitter-census

          ◦ http://an.kaist.ac.kr/traces/WWW2010.html

          ◦ http://snap.stanford.edu/data/twitter7.html

          ◦ http://twitter.mpi-sws.org/

     —   Analysis information

          ◦ “The Man Your Man Could Smell Like: Twitter Analytics Report”

          ◦ “Characterizing user behavior in online social networks”

          ◦ “User Interactions in Social Networks and their Implications”


Interactive & Analytic Queries

     Q1 - Q8: Information of Profiles & Friends
     1.   Find all users whose first names contain a particular string, e.g., “Minh”.

     2.   Return the names of people who study at the same school and are the same age as a user. These
          people may be the user's classmates.

     3.   Find people who studied at the same school and are connected to you by a path of friend relationships.
          (Use a SPARQL 1.1 “Property Path Expression” with an arbitrary-length path.)

     4.   Find all friends who like an action movie starring Tom Cruise. (Use the information from DBpedia
          for the movie and the actor Tom Cruise.)

     5.   Find all people living in a specific location, e.g., Amsterdam, who can be reached from a user in at most
          3 friend-relationship steps.

     6.   Show all of your friends who live in Europe. (Use the information from DBpedia. For example,
          Amsterdam is a city in Europe, London is a city in Europe.)

     7.   Find the top-10 suggested friends for a user: people who are not currently your friends but are friends
          of many of your friends. (Get all friends of your friends and order them by the number of people in your
          friends list who are connected to them.)

     8.   Return all users who have not joined a specific group but have more than 5 friends who joined it.
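To make the graph-style queries concrete, the following sketch evaluates the traversal parts of Q5 and Q7 on a small in-memory friendship graph. It is an illustration under assumptions, not the benchmark's query text: the `FRIENDS` data and function names are invented, and the location filter of Q5 is left out. In SPARQL 1.1 the unbounded traversal of Q3 would be written with a one-or-more property path such as `foaf:knows+`.

```python
from collections import Counter, deque

# Hypothetical friendship graph: user -> set of friends (symmetric).
FRIENDS = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan", "eve"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat", "fay"},
    "eve": {"bob"},
    "fay": {"dan", "gus"},
    "gus": {"fay"},
}


def within_steps(graph, start, max_steps=3):
    """Q5-style reachability: users at most `max_steps` friend hops away."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        if dist[user] == max_steps:
            continue  # do not expand beyond the hop limit
        for friend in graph.get(user, ()):
            if friend not in dist:
                dist[friend] = dist[user] + 1
                queue.append(friend)
    del dist[start]  # the start user is not part of the answer
    return dist


def suggest_friends(graph, user, top=10):
    """Q7-style suggestions: non-friends ranked by number of mutual friends."""
    mutual = Counter()
    for friend in graph.get(user, ()):
        for candidate in graph.get(friend, ()):
            if candidate != user and candidate not in graph[user]:
                mutual[candidate] += 1
    # Sort by count (descending), then by name, so ties are deterministic.
    return sorted(mutual.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


print(within_steps(FRIENDS, "ann"))
print(suggest_friends(FRIENDS, "ann"))
```

The breadth-first search stops expanding at the hop limit, which is the main difference from an arbitrary-length path: the work is bounded by the size of the 3-hop neighbourhood rather than by the whole connected component.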



Interactive & Analytic Queries

     Q9 - Q14: Posts or Tweets
     9.   Show the 10 latest posts/tweets from your friends or friends of friends. (Order by posting time.)

     10. Show active posts/tweets: the 10 most recently commented posts/tweets from your friends. (Order
         by the timestamp of the last comment on each post.)

     11. Return the top-10 most interesting posts from your friends: order first by the number of
         “likes” (or, in Twitter, the number of “re-tweets”) on each post, then
         by the number of comments.

     12. Return all posts about an event (e.g., Unrest in Tunisia) from the last 10 days. (Based on the
         hash tags if they are available. If no tag appears in the post, check whether the content
         of the post contains the search terms for the event.)

     13. Show all posts about a specific location, e.g., Egypt, from the last 10 days. (Use the information
         from DBpedia to identify the location of the post. For example, Cairo is the capital of Egypt,
         Tahrir Square is in Cairo.)

     14. Find the number of inactive users: all users whose accounts have been active for at least 30 days
         without any post, or all users who have not posted for 60 days.
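The two inactivity rules of Q14 can be sketched as below. The data layout (a dict from user name to join date and post dates), the function name, and the sample users are assumptions for illustration; treating both thresholds as "greater than or equal" is also an interpretation of the slide.

```python
from datetime import date, timedelta


def count_inactive(users, today):
    """Count users matching either inactivity rule of Q14.

    `users` maps a user name to (join_date, list_of_post_dates).
    """
    inactive = 0
    for joined, posts in users.values():
        if not posts:
            # Rule 1: account at least 30 days old and never posted.
            if today - joined >= timedelta(days=30):
                inactive += 1
        elif today - max(posts) >= timedelta(days=60):
            # Rule 2: has posted before, but not during the last 60 days.
            inactive += 1
    return inactive


today = date(2011, 3, 2)
users = {
    "new_silent": (date(2011, 2, 20), []),                  # too new: active
    "old_silent": (date(2011, 1, 1), []),                   # rule 1
    "went_quiet": (date(2010, 6, 1), [date(2010, 12, 1)]),  # rule 2
    "regular":    (date(2010, 6, 1), [date(2011, 2, 25)]),  # active
}
print(count_inactive(users, today))
```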



Interactive & Analytic Queries

     Q15 - Q17: Hash tags
     15. Show all photos posted by my friends in which I was tagged.

     16. Find the top-10 friends, or friends of friends, who share common
         interests with you. (Based on the similarity between the tags in
         your posts and the tags in their posts.)

     17. What are the current hottest events/problems? (Get the hash tags
         from posts and order them by the number of appearances in the
         last 10 days.)
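The tag-based queries can be sketched as follows. The slides do not say which similarity measure Q16 uses, so Jaccard similarity over tag sets is an assumption here, as are the function names and the sample posts; Q17 is shown as a plain frequency count over a 10-day window.

```python
from collections import Counter
from datetime import date, timedelta


def jaccard(tags_a, tags_b):
    """Q16-style similarity: shared tags divided by all distinct tags."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # two users without tags have nothing in common
    return len(a & b) / len(a | b)


def hottest_tags(posts, today, days=10, top=10):
    """Q17-style ranking: most frequent hash tags in the last `days` days.

    `posts` is an iterable of (post_date, list_of_tags).
    """
    cutoff = today - timedelta(days=days)
    counts = Counter(
        tag for posted, tags in posts if posted >= cutoff for tag in tags
    )
    # Sort by count (descending), then by tag, so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


posts = [
    (date(2011, 3, 1), ["#egypt", "#tahrir"]),
    (date(2011, 2, 27), ["#egypt"]),
    (date(2011, 2, 1), ["#tunisia"]),  # older than 10 days: ignored
]
print(jaccard(["#egypt", "#tahrir"], ["#egypt", "#cairo"]))
print(hottest_tags(posts, today=date(2011, 3, 2)))
```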




Interactive & Analytic Queries

     Q18 - Q19: Other information
     18. Which area is the most active? (Order by the total number of
         posts in each location in the last 5 days.)

     19. Return the top-10 locations that have the fastest growth in the
         number of users. (Count the people who joined more than 10
         days ago and those who joined during the last 10 days, and then
         compute the growth rate.)
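The growth computation in Q19 can be sketched as below. The slide does not define the rate precisely, so the formula used here (new users in the last 10 days divided by users who joined earlier) is an assumption, as are the function name, the input layout, and the decision to skip locations that had no earlier users.

```python
from collections import Counter
from datetime import date, timedelta


def fastest_growing(users, today, days=10, top=10):
    """Rank locations by recent user growth.

    `users` is an iterable of (location, join_date).
    """
    cutoff = today - timedelta(days=days)
    before, recent = Counter(), Counter()
    for location, joined in users:
        # Split each location's users into "recent" and "earlier" joiners.
        (recent if joined >= cutoff else before)[location] += 1

    rates = {}
    for location, new in recent.items():
        base = before[location]
        # Assumption: rate = new / earlier; skip locations with no
        # earlier users to avoid dividing by zero.
        if base:
            rates[location] = new / base
    return sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


users = [
    ("Amsterdam", date(2011, 1, 5)),
    ("Amsterdam", date(2011, 1, 9)),
    ("Amsterdam", date(2011, 2, 25)),
    ("Cairo", date(2011, 1, 3)),
    ("Cairo", date(2011, 2, 22)),
    ("Cairo", date(2011, 2, 28)),
    ("Tunis", date(2011, 2, 26)),  # no earlier users: skipped
]
print(fastest_growing(users, today=date(2011, 3, 2)))
```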




SPARQL/Update Queries
     1. Update user profile

     2. Posts/Tweets:

           2.1. Add a post (Popularity: high)

           2.2. Remove a post (Popularity: low)

           2.3. Add tags for your friends

           2.4. Add/Remove a comment

     3. Friends

           3.1. Add a friend (Popularity: high)

           3.2. Remove a friend (Popularity: low)

SPARQL/Update Queries
     4. Group, Event

           4.1. Join/Leave a group/event

           4.2. Add/Delete post in the group/event

     5. Photos

           5.1. Add/Delete a photo

           5.2. Add/Remove tags in the photo

           5.3. Add/Remove a comment
           5.4. Remove tags of me from all of my friends' pictures
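As an illustration of what some of these update operations might look like, the sketch below renders three of them as SPARQL 1.1 Update strings. Only `foaf:knows` is an existing vocabulary term; the `http://example.org/sn/` namespace, the `tagged` and `postedBy` properties, and the helper names are hypothetical and do not come from the benchmark's schema.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
EX = "http://example.org/sn/"  # hypothetical namespace for this sketch


def add_friend(user, friend):
    """3.1 Add a friend: INSERT DATA with a single triple."""
    return f"INSERT DATA {{ <{EX}{user}> <{FOAF}knows> <{EX}{friend}> }}"


def remove_friend(user, friend):
    """3.2 Remove a friend: DELETE DATA with the same triple."""
    return f"DELETE DATA {{ <{EX}{user}> <{FOAF}knows> <{EX}{friend}> }}"


def remove_my_tags(user):
    """5.4 Remove tags of a user from all photos posted by their friends."""
    me = f"<{EX}{user}>"
    return (
        f"DELETE {{ ?photo <{EX}tagged> {me} }}\n"
        f"WHERE {{ {me} <{FOAF}knows> ?friend .\n"
        f"        ?photo <{EX}postedBy> ?friend ;\n"
        f"               <{EX}tagged> {me} }}"
    )


# Note: names are interpolated without escaping, which is acceptable for
# a sketch but not for untrusted input.
print(add_friend("ann", "bob"))
print(remove_friend("ann", "bob"))
print(remove_my_tags("ann"))
```

The first two operations touch a single known triple, while 5.4 needs a pattern match over the friend graph before deleting, which is why it uses the `DELETE ... WHERE` form.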

