SlideShare a Scribd company logo
1 of 18
Download to read offline
Semantic Search in E-Discovery
Research on the application of text mining and information retrieval
for fact finding in regulatory investigations

                                                                       David Graus
Who’s Involved?

     Prof. dr. Maarten de Rijke                   Dr. Hans Henseler
                                                  Lector E-Discovery, CREATE-IT applied
     Director Intelligent Systems Lab, UvA        research



     David Graus, MSc.                            David van Dijk, MSc.
     PhD Candidate, Semantic Search               Researcher E-Discovery, CREATE-IT
                                                  applied research
     in E-Discovery, UvA



     Zhaochun Ren, MSc.                           Menno Israël, MSc.
                                                  Teamleader Knowledge and Expertise
     PhD Candidate, Semantic Search               Centre for Intelligent Data Analysis
     in E-Discovery, UvA                          (Kecida), NFI




                                             Semantic search in e-discovery              2
Introduction

 £   Semantic Search in E-Discovery




                                Semantic search in e-discovery   3
What is

 £   Semantic Search in E-Discovery
      ˜   retrieving and securing digital forensic evidence




                                           Semantic search in e-discovery   4
What is

 £   Semantic Search in E-Discovery




                               Semantic search in e-discovery   5
What is

 £   Semantic Search in E-Discovery
      ˜ retrieving and securing digital forensic evidence
      ˜ from emails, forums, etc...




                                          Semantic search in e-discovery   6
What is

 £   Semantic Search in e-Discovery




                               Semantic search in e-discovery   7
Challenge

¢   Finding out who knew what, from whom, and when




                                Semantic search in e-discovery   8
Challenge

¢   Finding out who knew what, from whom, and when
¢   Generic search is not the answer




                                  Semantic search in e-discovery   9
Finding evidence for E-Discovery

¢   We don’t know what we’re looking for
¢   What we’re looking for might be deliberately hidden
¢   Communication might be very domain-specific,
     contextualized or incomplete




                                   Semantic search in e-discovery   10
Task

¢   Retrieve all relevant traces
¢   Highly iterative search process
¢   Support (re)formulating questions and hypotheses




                                       Semantic search in e-discovery   11
How do we approach this?

¢   Two subprojects:
     £   Information Retrieval
          ˜   Finding material of unstructured nature from large collections
     £   Information Extraction/Text Mining
          ˜   Discovering patterns in data




                                                    Semantic search in e-discovery   12
How do we approach this?

¢   Information Retrieval
     £   Integrating structure/context of data in retrieval models
          ˜   Capturing forum and email context
          ˜   Conversational search




                                                   Semantic search in e-discovery   13
How do we approach this?

¢   Information Extraction/Text Mining
     £   Extracting structured knowledge from user generated
          content
          ˜   Semantic pre-processing
          ˜   Social network inference
          ˜   Information maps




                                          Semantic search in e-discovery   14
How do we approach this?

¢   Information Retrieval <-> Information Extraction




                                   Semantic search in e-discovery   15
Current work (first steps)

¢   Information Retrieval
     £   Twitter Mining (as a form of conversational search)


¢   Information Extraction/Text Mining
     £   Entity linking (for semantic document enrichment)


¢   TREC/TAC benchmarking events
     £   TREC Legal Track 2011 (2013?)



                                            Semantic search in e-discovery   16
Contributions

¢   xTAS: Open source text analysis toolkit
¢   iColumbo: Internet monitoring framework
¢   Used by:
     £   Internet Recherche Netwerk
     £   Koninklijke Bibliotheek
     £   Beeld en Geluid
     £   ... You?




                                       Semantic search in e-discovery   17
Semantic search in E-discovery

¢   David Graus
¢   d.p.graus@uva.nl




                        Semantic search in e-discovery   18

More Related Content

Similar to Semantic Search in E-Discovery

Paul Henning Krogh A New Dawn For E Collaboration In Science
Paul Henning Krogh   A New Dawn For E Collaboration In SciencePaul Henning Krogh   A New Dawn For E Collaboration In Science
Paul Henning Krogh A New Dawn For E Collaboration In ScienceVincenzo Barone
 
Introduction to Data Mining KDD Process OLAP
Introduction to Data Mining KDD Process OLAPIntroduction to Data Mining KDD Process OLAP
Introduction to Data Mining KDD Process OLAPShivarkarSandip
 
Introduction to data mining which covers the basics
Introduction to data mining which covers the basicsIntroduction to data mining which covers the basics
Introduction to data mining which covers the basicsShivarkarSandip
 
20130318 社群網路與人工智慧
20130318 社群網路與人工智慧20130318 社群網路與人工智慧
20130318 社群網路與人工智慧景淳 許
 
Hello Open World - The Web of Data for the Pragmatic Developer
Hello Open World - The Web of Data for the Pragmatic DeveloperHello Open World - The Web of Data for the Pragmatic Developer
Hello Open World - The Web of Data for the Pragmatic DeveloperAlexandre Passant
 
Open Source, Open Science, & Citizen Science
Open Source, Open Science, & Citizen ScienceOpen Source, Open Science, & Citizen Science
Open Source, Open Science, & Citizen ScienceAndrea Wiggins
 
Hello Open World - Semtech 2009
Hello Open World - Semtech 2009Hello Open World - Semtech 2009
Hello Open World - Semtech 2009Alexandre Passant
 
@WebSciDL PhD Student Project Reviews August 5&6, 2015
@WebSciDL PhD Student Project Reviews August 5&6, 2015@WebSciDL PhD Student Project Reviews August 5&6, 2015
@WebSciDL PhD Student Project Reviews August 5&6, 2015Michael Nelson
 
Information Quality Assessment in the WIQ-EI EU Project
Information Quality Assessment in the WIQ-EI EU ProjectInformation Quality Assessment in the WIQ-EI EU Project
Information Quality Assessment in the WIQ-EI EU ProjectElisabeth Lex
 
Information Quality Assessment in the WIQ-EI EU Project
Information Quality Assessment in the WIQ-EI EU ProjectInformation Quality Assessment in the WIQ-EI EU Project
Information Quality Assessment in the WIQ-EI EU Projectwiqei
 
Research Data Management: What is it and why is the Library & Archives Servic...
Research Data Management: What is it and why is the Library & Archives Servic...Research Data Management: What is it and why is the Library & Archives Servic...
Research Data Management: What is it and why is the Library & Archives Servic...GarethKnight
 
Drowning in information – the need of macroscopes for research funding
Drowning in information – the need of macroscopes for research fundingDrowning in information – the need of macroscopes for research funding
Drowning in information – the need of macroscopes for research fundingAndrea Scharnhorst
 
APLIC 2012: Discovering & Dealing with Data
APLIC 2012: Discovering & Dealing with DataAPLIC 2012: Discovering & Dealing with Data
APLIC 2012: Discovering & Dealing with DataHamilton Public Library
 
Parcos_Data Explorer v2.pdf
Parcos_Data Explorer v2.pdfParcos_Data Explorer v2.pdf
Parcos_Data Explorer v2.pdfParCosProject
 
introduction to data mining tutorial
introduction to data mining tutorial introduction to data mining tutorial
introduction to data mining tutorial Salah Amean
 
Search and Browsing Cycle for Knowledge Discovery and Learning
Search and Browsing Cycle for Knowledge Discovery and LearningSearch and Browsing Cycle for Knowledge Discovery and Learning
Search and Browsing Cycle for Knowledge Discovery and LearningSebastian Ryszard Kruk
 
Designing a synergistic relationship between undergraduate Data Science educa...
Designing a synergistic relationship between undergraduate Data Science educa...Designing a synergistic relationship between undergraduate Data Science educa...
Designing a synergistic relationship between undergraduate Data Science educa...Ciera Martinez
 

Similar to Semantic Search in E-Discovery (20)

Paul Henning Krogh A New Dawn For E Collaboration In Science
Paul Henning Krogh   A New Dawn For E Collaboration In SciencePaul Henning Krogh   A New Dawn For E Collaboration In Science
Paul Henning Krogh A New Dawn For E Collaboration In Science
 
01datamining.pdf
01datamining.pdf01datamining.pdf
01datamining.pdf
 
Introduction to Data Mining KDD Process OLAP
Introduction to Data Mining KDD Process OLAPIntroduction to Data Mining KDD Process OLAP
Introduction to Data Mining KDD Process OLAP
 
Introduction to data mining which covers the basics
Introduction to data mining which covers the basicsIntroduction to data mining which covers the basics
Introduction to data mining which covers the basics
 
20130318 社群網路與人工智慧
20130318 社群網路與人工智慧20130318 社群網路與人工智慧
20130318 社群網路與人工智慧
 
Hello Open World - The Web of Data for the Pragmatic Developer
Hello Open World - The Web of Data for the Pragmatic DeveloperHello Open World - The Web of Data for the Pragmatic Developer
Hello Open World - The Web of Data for the Pragmatic Developer
 
Open Source, Open Science, & Citizen Science
Open Source, Open Science, & Citizen ScienceOpen Source, Open Science, & Citizen Science
Open Source, Open Science, & Citizen Science
 
Information entanglement
Information entanglementInformation entanglement
Information entanglement
 
Hello Open World - Semtech 2009
Hello Open World - Semtech 2009Hello Open World - Semtech 2009
Hello Open World - Semtech 2009
 
@WebSciDL PhD Student Project Reviews August 5&6, 2015
@WebSciDL PhD Student Project Reviews August 5&6, 2015@WebSciDL PhD Student Project Reviews August 5&6, 2015
@WebSciDL PhD Student Project Reviews August 5&6, 2015
 
Information Quality Assessment in the WIQ-EI EU Project
Information Quality Assessment in the WIQ-EI EU ProjectInformation Quality Assessment in the WIQ-EI EU Project
Information Quality Assessment in the WIQ-EI EU Project
 
Information Quality Assessment in the WIQ-EI EU Project
Information Quality Assessment in the WIQ-EI EU ProjectInformation Quality Assessment in the WIQ-EI EU Project
Information Quality Assessment in the WIQ-EI EU Project
 
Research Data Management: What is it and why is the Library & Archives Servic...
Research Data Management: What is it and why is the Library & Archives Servic...Research Data Management: What is it and why is the Library & Archives Servic...
Research Data Management: What is it and why is the Library & Archives Servic...
 
Drowning in information – the need of macroscopes for research funding
Drowning in information – the need of macroscopes for research fundingDrowning in information – the need of macroscopes for research funding
Drowning in information – the need of macroscopes for research funding
 
APLIC 2012: Discovering & Dealing with Data
APLIC 2012: Discovering & Dealing with DataAPLIC 2012: Discovering & Dealing with Data
APLIC 2012: Discovering & Dealing with Data
 
Parcos_Data Explorer v2.pdf
Parcos_Data Explorer v2.pdfParcos_Data Explorer v2.pdf
Parcos_Data Explorer v2.pdf
 
Parcos_Data Explorer v2.pdf
Parcos_Data Explorer v2.pdfParcos_Data Explorer v2.pdf
Parcos_Data Explorer v2.pdf
 
introduction to data mining tutorial
introduction to data mining tutorial introduction to data mining tutorial
introduction to data mining tutorial
 
Search and Browsing Cycle for Knowledge Discovery and Learning
Search and Browsing Cycle for Knowledge Discovery and LearningSearch and Browsing Cycle for Knowledge Discovery and Learning
Search and Browsing Cycle for Knowledge Discovery and Learning
 
Designing a synergistic relationship between undergraduate Data Science educa...
Designing a synergistic relationship between undergraduate Data Science educa...Designing a synergistic relationship between undergraduate Data Science educa...
Designing a synergistic relationship between undergraduate Data Science educa...
 

More from David Graus

Pragmatic ethical and fair AI for data scientists
Pragmatic ethical and fair AI for data scientistsPragmatic ethical and fair AI for data scientists
Pragmatic ethical and fair AI for data scientistsDavid Graus
 
Bias in Recommendations
Bias in RecommendationsBias in Recommendations
Bias in RecommendationsDavid Graus
 
RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.
RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.
RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.David Graus
 
CAT/AI: Computer Assisted Translation 
Assessment for Impact
CAT/AI: Computer Assisted Translation 
Assessment for ImpactCAT/AI: Computer Assisted Translation 
Assessment for Impact
CAT/AI: Computer Assisted Translation 
Assessment for ImpactDavid Graus
 
Opening the Black Box of User Profiles in Content-based Recommender Systems
Opening the Black Box of User Profiles in Content-based Recommender SystemsOpening the Black Box of User Profiles in Content-based Recommender Systems
Opening the Black Box of User Profiles in Content-based Recommender SystemsDavid Graus
 
Zoeken, vinden, en aanbevelen: personalisatie vs. privacy
Zoeken, vinden, en aanbevelen: personalisatie vs. privacyZoeken, vinden, en aanbevelen: personalisatie vs. privacy
Zoeken, vinden, en aanbevelen: personalisatie vs. privacyDavid Graus
 
Layman's Talk: Entities of Interest --- Discovery in Digital Traces
Layman's Talk: Entities of Interest --- Discovery in Digital TracesLayman's Talk: Entities of Interest --- Discovery in Digital Traces
Layman's Talk: Entities of Interest --- Discovery in Digital TracesDavid Graus
 
Financial News Mining @ PyData Amsterdam
Financial News Mining @ PyData AmsterdamFinancial News Mining @ PyData Amsterdam
Financial News Mining @ PyData AmsterdamDavid Graus
 
De Macht van Data --- Hoe algoritmen ons leven vormgeven
De Macht van Data --- Hoe algoritmen ons leven vormgevenDe Macht van Data --- Hoe algoritmen ons leven vormgeven
De Macht van Data --- Hoe algoritmen ons leven vormgevenDavid Graus
 
Financial News Mining @ FD Mediagroep/Company.info
Financial News Mining @ FD Mediagroep/Company.infoFinancial News Mining @ FD Mediagroep/Company.info
Financial News Mining @ FD Mediagroep/Company.infoDavid Graus
 
Big Data & Machine Learning - Mogelijkheden & Valkuilen
Big Data & Machine Learning - Mogelijkheden & ValkuilenBig Data & Machine Learning - Mogelijkheden & Valkuilen
Big Data & Machine Learning - Mogelijkheden & ValkuilenDavid Graus
 
Analyzing and Predicting Task Reminders
Analyzing and Predicting Task RemindersAnalyzing and Predicting Task Reminders
Analyzing and Predicting Task RemindersDavid Graus
 
Dynamic Collective Entity Representations for Entity Ranking
Dynamic Collective Entity Representations for Entity RankingDynamic Collective Entity Representations for Entity Ranking
Dynamic Collective Entity Representations for Entity RankingDavid Graus
 
Dynamic Collective Entity Representations for Entity Ranking
Dynamic Collective Entity Representations for Entity RankingDynamic Collective Entity Representations for Entity Ranking
Dynamic Collective Entity Representations for Entity RankingDavid Graus
 
Understanding Email Traffic
Understanding Email TrafficUnderstanding Email Traffic
Understanding Email TrafficDavid Graus
 
David Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27th
David Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27thDavid Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27th
David Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27thDavid Graus
 
Understanding Email Traffic (talk @ E-Discovery NL Symposium)
Understanding Email Traffic (talk @ E-Discovery NL Symposium)Understanding Email Traffic (talk @ E-Discovery NL Symposium)
Understanding Email Traffic (talk @ E-Discovery NL Symposium)David Graus
 
Generating Pseudo-ground Truth for Detecting New Concepts in Social Streams
Generating Pseudo-ground Truth for Detecting New Concepts in Social StreamsGenerating Pseudo-ground Truth for Detecting New Concepts in Social Streams
Generating Pseudo-ground Truth for Detecting New Concepts in Social StreamsDavid Graus
 
yourHistory - entity linking for a personalized timeline of historic events
yourHistory - entity linking for a personalized timeline of historic eventsyourHistory - entity linking for a personalized timeline of historic events
yourHistory - entity linking for a personalized timeline of historic eventsDavid Graus
 
Semantic Annotation of the Cyttron Database
Semantic Annotation of the Cyttron DatabaseSemantic Annotation of the Cyttron Database
Semantic Annotation of the Cyttron DatabaseDavid Graus
 

More from David Graus (20)

Pragmatic ethical and fair AI for data scientists
Pragmatic ethical and fair AI for data scientistsPragmatic ethical and fair AI for data scientists
Pragmatic ethical and fair AI for data scientists
 
Bias in Recommendations
Bias in RecommendationsBias in Recommendations
Bias in Recommendations
 
RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.
RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.
RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.
 
CAT/AI: Computer Assisted Translation 
Assessment for Impact
CAT/AI: Computer Assisted Translation 
Assessment for ImpactCAT/AI: Computer Assisted Translation 
Assessment for Impact
CAT/AI: Computer Assisted Translation 
Assessment for Impact
 
Opening the Black Box of User Profiles in Content-based Recommender Systems
Opening the Black Box of User Profiles in Content-based Recommender SystemsOpening the Black Box of User Profiles in Content-based Recommender Systems
Opening the Black Box of User Profiles in Content-based Recommender Systems
 
Zoeken, vinden, en aanbevelen: personalisatie vs. privacy
Zoeken, vinden, en aanbevelen: personalisatie vs. privacyZoeken, vinden, en aanbevelen: personalisatie vs. privacy
Zoeken, vinden, en aanbevelen: personalisatie vs. privacy
 
Layman's Talk: Entities of Interest --- Discovery in Digital Traces
Layman's Talk: Entities of Interest --- Discovery in Digital TracesLayman's Talk: Entities of Interest --- Discovery in Digital Traces
Layman's Talk: Entities of Interest --- Discovery in Digital Traces
 
Financial News Mining @ PyData Amsterdam
Financial News Mining @ PyData AmsterdamFinancial News Mining @ PyData Amsterdam
Financial News Mining @ PyData Amsterdam
 
De Macht van Data --- Hoe algoritmen ons leven vormgeven
De Macht van Data --- Hoe algoritmen ons leven vormgevenDe Macht van Data --- Hoe algoritmen ons leven vormgeven
De Macht van Data --- Hoe algoritmen ons leven vormgeven
 
Financial News Mining @ FD Mediagroep/Company.info
Financial News Mining @ FD Mediagroep/Company.infoFinancial News Mining @ FD Mediagroep/Company.info
Financial News Mining @ FD Mediagroep/Company.info
 
Big Data & Machine Learning - Mogelijkheden & Valkuilen
Big Data & Machine Learning - Mogelijkheden & ValkuilenBig Data & Machine Learning - Mogelijkheden & Valkuilen
Big Data & Machine Learning - Mogelijkheden & Valkuilen
 
Analyzing and Predicting Task Reminders
Analyzing and Predicting Task RemindersAnalyzing and Predicting Task Reminders
Analyzing and Predicting Task Reminders
 
Dynamic Collective Entity Representations for Entity Ranking
Dynamic Collective Entity Representations for Entity RankingDynamic Collective Entity Representations for Entity Ranking
Dynamic Collective Entity Representations for Entity Ranking
 
Dynamic Collective Entity Representations for Entity Ranking
Dynamic Collective Entity Representations for Entity RankingDynamic Collective Entity Representations for Entity Ranking
Dynamic Collective Entity Representations for Entity Ranking
 
Understanding Email Traffic
Understanding Email TrafficUnderstanding Email Traffic
Understanding Email Traffic
 
David Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27th
David Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27thDavid Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27th
David Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27th
 
Understanding Email Traffic (talk @ E-Discovery NL Symposium)
Understanding Email Traffic (talk @ E-Discovery NL Symposium)Understanding Email Traffic (talk @ E-Discovery NL Symposium)
Understanding Email Traffic (talk @ E-Discovery NL Symposium)
 
Generating Pseudo-ground Truth for Detecting New Concepts in Social Streams
Generating Pseudo-ground Truth for Detecting New Concepts in Social StreamsGenerating Pseudo-ground Truth for Detecting New Concepts in Social Streams
Generating Pseudo-ground Truth for Detecting New Concepts in Social Streams
 
yourHistory - entity linking for a personalized timeline of historic events
yourHistory - entity linking for a personalized timeline of historic eventsyourHistory - entity linking for a personalized timeline of historic events
yourHistory - entity linking for a personalized timeline of historic events
 
Semantic Annotation of the Cyttron Database
Semantic Annotation of the Cyttron DatabaseSemantic Annotation of the Cyttron Database
Semantic Annotation of the Cyttron Database
 

Recently uploaded

Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...PsychoTech Services
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 

Recently uploaded (20)

Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 

Semantic Search in E-Discovery

  • 1. Semantic Search in E-Discovery Research on the application of text mining and information retrieval for fact finding in regulatory investigations David Graus
  • 2. Who’s Involved? Prof. dr. Maarten de Rijke Dr. Hans Henseler Lector E-Discovery, CREATE-IT applied Director Intelligent Systems Lab, UvA research David Graus, MSc. David van Dijk, MSc. PhD Candidate, Semantic Search Researcher E-Discovery, CREATE-IT applied research in E-Discovery, UvA Zhaochun Ren, MSc. Menno Israël, MSc. Teamleader Knowledge and Expertise PhD Candidate, Semantic Search Centre for Intelligent Data Analysis in E-Discovery, UvA (Kecida), NFI Semantic search in e-discovery 2
  • 3. Introduction £ Semantic Search in E-Discovery Semantic search in e-discovery 3
  • 4. What is £ Semantic Search in E-Discovery ˜ retrieving and securing digital forensic evidence Semantic search in e-discovery 4
  • 5. What is £ Semantic Search in E-Discovery Semantic search in e-discovery 5
  • 6. What is £ Semantic Search in E-Discovery ˜ retrieving and securing digital forensic evidence ˜ from emails, forums, etc... Semantic search in e-discovery 6
  • 7. What is £ Semantic Search in e-Discovery Semantic search in e-discovery 7
  • 8. Challenge ¢ Finding out who knew what, from whom, and when Semantic search in e-discovery 8
  • 9. Challenge ¢ Finding out who knew what, from whom, and when ¢ Generic search is not the answer Semantic search in e-discovery 9
  • 10. Finding evidence for E-Discovery ¢ We don’t know what we’re looking for ¢ What we’re looking for might be deliberately hidden ¢ Communication might be very domain-specific, contextualized or incomplete Semantic search in e-discovery 10
  • 11. Task ¢ Retrieve all relevant traces ¢ Highly iterative search process ¢ Support (re)formulating questions and hypotheses Semantic search in e-discovery 11
  • 12. How do we approach this? ¢ Two subprojects: £ Information Retrieval ˜ Finding material of unstructured nature from large collections £ Information Extraction/Text Mining ˜ Discovering patterns in data Semantic search in e-discovery 12
  • 13. How do we approach this? ¢ Information Retrieval £ Integrating structure/context of data in retrieval models ˜ Capturing forum and email context ˜ Conversational search Semantic search in e-discovery 13
  • 14. How do we approach this? ¢ Information Extraction/Text Mining £ Extracting structured knowledge from user generated content ˜ Semantic pre-processing ˜ Social network inference ˜ Information maps Semantic search in e-discovery 14
  • 15. How do we approach this? ¢ Information Retrieval <-> Information Extraction Semantic search in e-discovery 15
  • 16. Current work (first steps) ¢ Information Retrieval £ Twitter Mining (as a form of conversational search) ¢ Information Extraction/Text Mining £ Entity linking (for semantic document enrichment) ¢ TREC/TAC benchmarking events £ TREC Legal Track 2011 (2013?) Semantic search in e-discovery 16
  • 17. Contributions ¢ xTAS: Open source text analysis toolkit ¢ iColumbo: Internet monitoring framework ¢ Used by: £ Internet Recherche Netwerk £ Koninklijke Bibliotheek £ Beeld en Geluid £ ... You? Semantic search in e-discovery 17
  • 18. Semantic search in E-discovery ¢ David Graus ¢ d.p.graus@uva.nl Semantic search in e-discovery 18