The 2012 Social Event Detection Dataset
Symeon Papadopoulos (1), Emmanouil Schinas (1), Vasileios Mezaris (1),
Raphaël Troncy (2), Yiannis Kompatsiaris (1)

(1) CERTH-ITI, Thessaloniki, Greece
(2) EURECOM, Sophia Antipolis, France


Oslo, 28 Feb - 1 Mar 2013
SED2012 Overview
• Large collection (>160K) of CC-licensed Flickr
  photos and some of their metadata
• Event annotations for 149 target events (of
  specific categories and locations of interest)

• Primary use: Social event detection
  – Used in the context of MediaEval 2012 (SED task)
• Secondary uses: image geotagging,
  distractors in CBIR, city summarization
Dataset Overview
Flickr photo collection
• 167,332 photos
• 4,422 unique contributors
• Creative Commons licenses

Event Annotations
• Challenge 1: Technical events in Germany
• Challenge 2: Soccer events in Hamburg and Madrid
• Challenge 3: Indignados movement events in Madrid

Data Collection Process
• Flickr API: http://www.flickr.com/services/api/
• Used method flickr.photos.search with five
  geographical centres:
   Barcelona, Cologne, Hamburg, Hannover, Madrid
• Time period: Jan 2009 – Dec 2011
• All photos CC licensed
• 403 photos from the EventMedia collection
  R. Troncy, B. Malocha, and A. Fialho. Linking Events with Media. In 6th International
  Conference on Semantic Systems (I-SEMANTICS), Graz, Austria, 2010
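A minimal sketch of how one such geographic query could be assembled is given here. It only builds the request URL for flickr.photos.search and sends nothing. The city coordinates, the search radius, the use of the taken date for the time period, the `extras` list and the license IDs 1-6 (taken here to be the Creative Commons licenses) are all assumptions for illustration; the slides give none of these values. The helper name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode

REST_ENDPOINT = "https://api.flickr.com/services/rest/"

# Approximate city-centre coordinates (assumed; the slides name the
# five cities but not the coordinates or the radius that was used).
CITY_CENTRES = {
    "Barcelona": (41.39, 2.17),
    "Cologne": (50.94, 6.96),
    "Hamburg": (53.55, 9.99),
    "Hannover": (52.37, 9.73),
    "Madrid": (40.42, -3.70),
}


def build_search_url(city, api_key, radius_km=20, page=1):
    """Build a flickr.photos.search URL for one geographic centre."""
    lat, lon = CITY_CENTRES[city]
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "lat": lat,
        "lon": lon,
        "radius": radius_km,             # assumed value, in km
        "min_taken_date": "2009-01-01",  # Jan 2009 (assumed: taken date)
        "max_taken_date": "2011-12-31",  # Dec 2011
        "license": "1,2,3,4,5,6",        # assumed to be the CC license IDs
        "has_geo": 1,                    # geotagged photos only
        "extras": "geo,tags,date_taken,owner_name,license",
        "per_page": 250,
        "page": page,
        "format": "json",
        "nojsoncallback": 1,
    }
    return REST_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    for city in CITY_CENTRES:
        print(build_search_url(city, api_key="YOUR_API_KEY"))
```

The license IDs are worth checking against flickr.photos.licenses.getInfo before relying on them, and large result sets would need paging over `page`.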
Photo Distribution
[Figures: place distribution, yearly distribution, language distribution]
Dataset Collection Motivation
Selection of five cities (three German, two Spanish):
• Include a large amount of non-English text metadata (cf.
   language distribution table)
• Ensure that numerous events of the target types exist
• Include distractor images:
   – Challenge 2: Cologne and Hannover as distractors for Hamburg;
     Barcelona as distractor for Madrid
   – Challenge 3: Barcelona as distractor for Madrid
Selection of only geotagged photos:
• Ease of annotation
Selection of only CC-licensed photos:
• Reuse of collection for research

Tag Statistics (1/2)
[Figure: number of users using the tag]
• 51,611 unique tags
• Prevalence of location-specific tags
• Event-specific tags
Tag Statistics (2/2)
[Figure labels: barcelona, spain, madrid]
• >20K photos have no tags
• 83.9% of photos have 10 or fewer tags
• >57% of tags appear once or twice
• >40K tags appear fewer than 10 times
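As a rough illustration, statistics of this kind could be computed from per-photo tag lists as sketched below. The input layout (one list of tags per photo) and the function name `tag_statistics` are assumptions; the slides do not describe how the figures were actually produced.

```python
from collections import Counter


def tag_statistics(photo_tags):
    """Summarise tag usage; photo_tags holds one list of tags per photo."""
    # Count each tag at most once per photo.
    per_photo = [set(tags) for tags in photo_tags]
    tag_counts = Counter(tag for tags in per_photo for tag in tags)
    n_photos = len(per_photo)
    n_tags = len(tag_counts)
    return {
        "unique_tags": n_tags,
        "untagged_photos": sum(1 for tags in per_photo if not tags),
        "share_photos_with_at_most_10_tags": (
            sum(1 for tags in per_photo if len(tags) <= 10) / n_photos
            if n_photos else 0.0
        ),
        "share_tags_used_once_or_twice": (
            sum(1 for c in tag_counts.values() if c <= 2) / n_tags
            if n_tags else 0.0
        ),
        "tags_used_fewer_than_10_times": sum(
            1 for c in tag_counts.values() if c < 10
        ),
    }


if __name__ == "__main__":
    # Toy data, not taken from the dataset.
    photos = [
        ["madrid", "spain"],
        ["madrid"],
        [],
        ["barcelona", "spain", "madrid"],
    ]
    print(tag_statistics(photos))
```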
User Statistics
• 60% of users contributed fewer than 10 photos
• The 30 most active users contribute ~30% of the dataset
Ground Truth Creation
• Manual annotation using CrEve
  – web-based annotation
  – two-round annotation by five annotators (three in the
    first round, two in the second)
  – interactive annotation (search & annotate)
  – each round terminated as soon as no new event-related
    photos were discovered
  – approximate effort: 100 person-hours
  C. Zigkolis, S. Papadopoulos, G. Filippou, Y. Kompatsiaris, A. Vakali. Collaborative Event
  Annotation in Tagged Photo Collections. Multimedia Tools & Applications, 2012
• Annotations for Challenge 1 enriched by EventMedia
  (403 photos featuring technical events in Germany)
Ground Truth Statistics (1/3)
• 10 events are associated with >100 photos
• ~27% of events are associated with 1 or 2 photos
Ground Truth Statistics (2/3)
• 106 events are captured by single users
• 9 events are captured by more than 10 people
• The majority of events last for less than a day (typical for soccer)
• Erroneous timestamps in photos
Ground Truth Statistics (3/3)
Madrid events
[Figure labels: Santiago Bernabeu stadium, Puerta del Sol, Stadium of Butarque, Vicente Calderon stadium]
Technical Event Examples
• PHP Unconf. 2010
• Gamescom 2009
• CeBIT 2010
• Convention Camp 2011
Soccer Event Examples
• Real Madrid – Milan (2010)
• World Cup 2010
• St. Pauli – HSV (2010)
• Spain – Colombia (2011)
Indignados Event Examples
• Inaugural march, 15 May
• Large gathering, 20 May
• Gathering, 15 Oct
• Demonstration, 17 Nov
Evaluation
• F-measure (macro), Precision, Recall
  – measure the goodness of the retrieved photos, but not how well
    they were clustered into events
• Normalized Mutual Information (NMI)
  – compares the automatically extracted clustering of
    photos into events with the ground truth
• An evaluation script is made available together
  with the dataset.
• Implementation of event detection available:
  http://mklab.iti.gr/project/sed2012_certh
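The two kinds of measure can be illustrated with a small sketch. This is not the evaluation script distributed with the dataset. It assumes precision, recall and F-measure are computed over sets of photo IDs, and that NMI is normalized by the arithmetic mean of the two entropies, which is one of several common choices. How the macro average is formed is not stated in the slides and is not modelled here. All function names are hypothetical.

```python
import math
from collections import Counter


def precision_recall_f1(retrieved, relevant):
    """Set-based scores: which photos were found, ignoring their grouping."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(truth, predicted):
    """NMI between two clusterings given as per-photo event labels."""
    n = len(truth)
    joint = Counter(zip(truth, predicted))
    t_counts, p_counts = Counter(truth), Counter(predicted)
    mi = sum(
        (c / n) * math.log(c * n / (t_counts[t] * p_counts[p]))
        for (t, p), c in joint.items()
    )
    denom = entropy(truth) + entropy(predicted)
    # Assumed normalization: 2 * I(U;V) / (H(U) + H(V)).
    return 2 * mi / denom if denom > 0 else 1.0


if __name__ == "__main__":
    # Toy data, not taken from the dataset.
    print(precision_recall_f1({1, 2, 3, 4}, {3, 4, 5, 6, 7, 8}))
    print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))  # same grouping, relabelled
    print(nmi([0, 0, 1, 1], [0, 1, 0, 1]))  # unrelated grouping
```

The toy calls show the distinction drawn on the slide: the set-based scores depend only on which photos were retrieved, while NMI depends only on how the photos were grouped.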
Questions
 @sympapadopoulos
 www.slideshare.net/sympapadopoulos


Editor's Notes

  1. Events with 1 or 2 photos are much harder to detect, e.g. by methods based on clustering.