SlideShare uma empresa Scribd logo
1 de 27
Database Research Group Search-As-You-Type in Forms: Leveraging the Usability and the Functionalityof Search Paradigm in Relational Databases Hao WuSupervised by Prof. Lizhu ZhouDatabase Research Group, Tsinghua University VLDB PhD Workshop – Sept. 13, Singapore
Motivation Problem Statement Challenges Initial Achievements Conclusions
Motivation Problem Statement Challenges Initial Achievements Conclusions
Motivation Relational databases are widely used. There are many search paradigms: Structured Query Language (SQL) Keyword Search (KS) Query-By-Example (QBE) Different search paradigms are needed by different users. 10/8/2010 4 Hao Wu, DB Group, Tsinghua University
Motivation #1: SQL is complex. SELECT* FROMAuthor A, Autor_Paper AP, Paper P WHERE  title LIKE'keyword' AND        title LIKE'search' AND        authors LIKE'g%'      AND A.id    =    AP.aidAND        P.id    =    AP.pid 10/8/2010 5 Hao Wu, DB Group, Tsinghua University
Motivation #2:  Traditional keyword search is imprecise. Title? Conf. name? Author name? keyword search g 10/8/2010 6 Hao Wu, DB Group, Tsinghua University
Motivation #3: Form is awkward. UCI Directory: http://directory.uci.edu/index.php?form_type=advanced_search 10/8/2010 7 Hao Wu, DB Group, Tsinghua University
Motivation #4:  The "Search" button is not convenient. 10/8/2010 8 Hao Wu, DB Group, Tsinghua University
Motivation +    Keyword Search +    Form-Style Interface +    Search-as-you-type Seaform = 10/8/2010 9 Hao Wu, DB Group, Tsinghua University
Motivation Problem Statement Challenges Initial Achievements Conclusions
Motivation Problem Statement Challenges Initial Achievements Conclusions
Problem Statement Data: Single relational table. Several searchable attributes. 10/8/2010 Hao Wu, DB Group, Tsinghua University 12
Problem Statement Query: A set of keywords (prefixes) split by fields. A focus indicator. 10/8/2010 Hao Wu, DB Group, Tsinghua University 13 Title: Author: al Focus = Author xml
Problem Statement Results: Global results: corresponding tuples. Local results: corresponding attribute values. Aggregations. 10/8/2010 Hao Wu, DB Group, Tsinghua University 14 xml database (albert) xml search (albert) xml security (alice) Title: Author: al albert2 alice1 xml
Motivation Problem Statement Challenges Initial Achievements Conclusions
Motivation Problem Statement Challenges Initial Achievements Conclusions
Challenges: Search-As-You-Type Prefix matching: E.g.al albert, alice, …Trie structure w/ cache. Fast response: Synchronization of local resultsand global results yields heavycomputational cost.On-demand synchronization and dual-list trie. 10/8/2010 Hao Wu, DB Group, Tsinghua University 17
Challenges: Error Tolerance Misplacing of keywords: E.g. input "albert"into the Title input box.Automatic query refinement (given a query, how can we modify it to obtain more results?)Large search space; rely on precise estimation and probabilistic model. Fuzzy matching: E.g. input "albrt" instead of "albert".Edit-distance computation on trie structure.Ranking issue of local results: should local results be sorted by edit-distance, or by aggregation values? 10/8/2010 Hao Wu, DB Group, Tsinghua University 18
Challenges: Scalability Handle large-scale databases: There are large number of tuples.1) Top-k algorithmPrecise aggregation is impossible in this case.2) Using RDBMS itselfIndex structure should be redesigned for DBMS; performance issues. Handle multiple tables: Data are regularized to several tables.Generalize the single-table local-global computation and reduce on-the-fly joins using pre-joined tables.It is hard to determine which tables are the most necessary to pre-join; extra storage cost. 10/8/2010 Hao Wu, DB Group, Tsinghua University 19
Motivation Problem Statement Challenges Initial Achievements Conclusions
Motivation Problem Statement Challenges Initial Achievements Conclusions
Initial Achievements Seaform-DBLP Features: ,[object Object]
Prefix matching.
Average response time is less than 30 ms.Limitations: ,[object Object]
Non-top-k, i.e. it returns all matching results.
Memory-resident.10/8/2010 22 Hao Wu, DB Group, Tsinghua University
Demonstrations: Sept. 14, Tuesday 2 14:00 to 15:30 Sept. 15, Wednesday 5 14:00 to 15:30

Mais conteúdo relacionado

Destaque

No Sql Movement
No Sql MovementNo Sql Movement
No Sql MovementAjit Koti
 
Introducao a gestao_de_projetos_v4.0
Introducao a gestao_de_projetos_v4.0Introducao a gestao_de_projetos_v4.0
Introducao a gestao_de_projetos_v4.0Nicholas Uchoa
 
Caja de herramientas_ideam[1]
Caja de herramientas_ideam[1]Caja de herramientas_ideam[1]
Caja de herramientas_ideam[1]Juan Felipe Rios
 
Gestión de datos e información 2 santamaria sosa luis
Gestión de datos e información 2   santamaria sosa luisGestión de datos e información 2   santamaria sosa luis
Gestión de datos e información 2 santamaria sosa luisLuis Ricardo Santamaria Sosa
 
Negocios empresariales
Negocios empresarialesNegocios empresariales
Negocios empresarialesmego2011
 
Presentacio¦ün poli¦ütica de desarrollo sectorial el caso colombiano
Presentacio¦ün poli¦ütica de desarrollo sectorial el caso colombianoPresentacio¦ün poli¦ütica de desarrollo sectorial el caso colombiano
Presentacio¦ün poli¦ütica de desarrollo sectorial el caso colombianoferiaindustrialasi
 
Eventos y certámenes
Eventos y certámenesEventos y certámenes
Eventos y certámenespalaesteban
 
Nursing knowledge
Nursing knowledgeNursing knowledge
Nursing knowledgeEsther Ying
 
Management Vs Leadership
Management Vs LeadershipManagement Vs Leadership
Management Vs Leadershipexportpat
 
Liam Terblanche, CIO at Accsys - Physical vs Logical Access Control
Liam Terblanche, CIO at Accsys - Physical vs Logical Access ControlLiam Terblanche, CIO at Accsys - Physical vs Logical Access Control
Liam Terblanche, CIO at Accsys - Physical vs Logical Access ControlGlobal Business Events
 
Acetatos De T.L.R. I Nuevo Plan
Acetatos De T.L.R. I Nuevo PlanAcetatos De T.L.R. I Nuevo Plan
Acetatos De T.L.R. I Nuevo PlanOdin Hernandez
 

Destaque (14)

No Sql Movement
No Sql MovementNo Sql Movement
No Sql Movement
 
Introducao a gestao_de_projetos_v4.0
Introducao a gestao_de_projetos_v4.0Introducao a gestao_de_projetos_v4.0
Introducao a gestao_de_projetos_v4.0
 
Caja de herramientas_ideam[1]
Caja de herramientas_ideam[1]Caja de herramientas_ideam[1]
Caja de herramientas_ideam[1]
 
Gestión de datos e información 2 santamaria sosa luis
Gestión de datos e información 2   santamaria sosa luisGestión de datos e información 2   santamaria sosa luis
Gestión de datos e información 2 santamaria sosa luis
 
Negocios empresariales
Negocios empresarialesNegocios empresariales
Negocios empresariales
 
Presentacio¦ün poli¦ütica de desarrollo sectorial el caso colombiano
Presentacio¦ün poli¦ütica de desarrollo sectorial el caso colombianoPresentacio¦ün poli¦ütica de desarrollo sectorial el caso colombiano
Presentacio¦ün poli¦ütica de desarrollo sectorial el caso colombiano
 
Eventos y certámenes
Eventos y certámenesEventos y certámenes
Eventos y certámenes
 
Ritmos de la producción discursiva en análisis político. Un análisis cuantita...
Ritmos de la producción discursiva en análisis político. Un análisis cuantita...Ritmos de la producción discursiva en análisis político. Un análisis cuantita...
Ritmos de la producción discursiva en análisis político. Un análisis cuantita...
 
Eventos y certámenes
Eventos y certámenesEventos y certámenes
Eventos y certámenes
 
Nursing knowledge
Nursing knowledgeNursing knowledge
Nursing knowledge
 
Management Vs Leadership
Management Vs LeadershipManagement Vs Leadership
Management Vs Leadership
 
Liam Terblanche, CIO at Accsys - Physical vs Logical Access Control
Liam Terblanche, CIO at Accsys - Physical vs Logical Access ControlLiam Terblanche, CIO at Accsys - Physical vs Logical Access Control
Liam Terblanche, CIO at Accsys - Physical vs Logical Access Control
 
Gpc parto manejo
Gpc parto manejoGpc parto manejo
Gpc parto manejo
 
Acetatos De T.L.R. I Nuevo Plan
Acetatos De T.L.R. I Nuevo PlanAcetatos De T.L.R. I Nuevo Plan
Acetatos De T.L.R. I Nuevo Plan
 

Semelhante a Seaform Slides in VLDB 2010 PhD Workshop

User friendly pattern search paradigm
User friendly pattern search paradigmUser friendly pattern search paradigm
User friendly pattern search paradigmMigrant Systems
 
Coverage-Criteria-for-Testing-SQL-Queries
Coverage-Criteria-for-Testing-SQL-QueriesCoverage-Criteria-for-Testing-SQL-Queries
Coverage-Criteria-for-Testing-SQL-QueriesMohamed Reda
 
Java supporting search-as-you-type using sql in databases
Java  supporting search-as-you-type using sql in databasesJava  supporting search-as-you-type using sql in databases
Java supporting search-as-you-type using sql in databasesecwayerode
 
Supporting search as-you-type using sql in databases
Supporting search as-you-type using sql in databasesSupporting search as-you-type using sql in databases
Supporting search as-you-type using sql in databasesEcway Technologies
 
AI Beyond Deep Learning
AI Beyond Deep LearningAI Beyond Deep Learning
AI Beyond Deep LearningAndre Freitas
 
Progress Towards Leveraging Natural Language Processing for Collecting Experi...
Progress Towards Leveraging Natural Language Processing for Collecting Experi...Progress Towards Leveraging Natural Language Processing for Collecting Experi...
Progress Towards Leveraging Natural Language Processing for Collecting Experi...Anubhav Jain
 
Context-Based Diversification for Keyword Queries over XML Data
Context-Based Diversification for Keyword Queries over XML DataContext-Based Diversification for Keyword Queries over XML Data
Context-Based Diversification for Keyword Queries over XML Data1crore projects
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...IEEEFINALYEARSTUDENTPROJECT
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...IEEEMEMTECHSTUDENTSPROJECTS
 
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...IEEEFINALYEARSTUDENTPROJECTS
 
27 ijcse-01238-5 sivaranjani
27 ijcse-01238-5 sivaranjani27 ijcse-01238-5 sivaranjani
27 ijcse-01238-5 sivaranjaniShivlal Mewada
 
Expression of Query in XML object-oriented database
Expression of Query in XML object-oriented databaseExpression of Query in XML object-oriented database
Expression of Query in XML object-oriented databaseEditor IJCATR
 
Expression of Query in XML object-oriented database
Expression of Query in XML object-oriented databaseExpression of Query in XML object-oriented database
Expression of Query in XML object-oriented databaseEditor IJCATR
 
Expression of Query in XML object-oriented database
Expression of Query in XML object-oriented databaseExpression of Query in XML object-oriented database
Expression of Query in XML object-oriented databaseEditor IJCATR
 
Efficient Refining Of Why-Not Questions on Top-K Queries
Efficient Refining Of Why-Not Questions on Top-K QueriesEfficient Refining Of Why-Not Questions on Top-K Queries
Efficient Refining Of Why-Not Questions on Top-K Queriesiosrjce
 
Topic detecton by clustering and text mining
Topic detecton by clustering and text miningTopic detecton by clustering and text mining
Topic detecton by clustering and text miningIRJET Journal
 
Open Source Tools for Materials Informatics
Open Source Tools for Materials InformaticsOpen Source Tools for Materials Informatics
Open Source Tools for Materials InformaticsAnubhav Jain
 
clustering_classification.ppt
clustering_classification.pptclustering_classification.ppt
clustering_classification.pptHODECE21
 
Dq2644974501
Dq2644974501Dq2644974501
Dq2644974501IJMER
 

Semelhante a Seaform Slides in VLDB 2010 PhD Workshop (20)

User friendly pattern search paradigm
User friendly pattern search paradigmUser friendly pattern search paradigm
User friendly pattern search paradigm
 
Coverage-Criteria-for-Testing-SQL-Queries
Coverage-Criteria-for-Testing-SQL-QueriesCoverage-Criteria-for-Testing-SQL-Queries
Coverage-Criteria-for-Testing-SQL-Queries
 
Java supporting search-as-you-type using sql in databases
Java  supporting search-as-you-type using sql in databasesJava  supporting search-as-you-type using sql in databases
Java supporting search-as-you-type using sql in databases
 
Supporting search as-you-type using sql in databases
Supporting search as-you-type using sql in databasesSupporting search as-you-type using sql in databases
Supporting search as-you-type using sql in databases
 
AI Beyond Deep Learning
AI Beyond Deep LearningAI Beyond Deep Learning
AI Beyond Deep Learning
 
Progress Towards Leveraging Natural Language Processing for Collecting Experi...
Progress Towards Leveraging Natural Language Processing for Collecting Experi...Progress Towards Leveraging Natural Language Processing for Collecting Experi...
Progress Towards Leveraging Natural Language Processing for Collecting Experi...
 
Context-Based Diversification for Keyword Queries over XML Data
Context-Based Diversification for Keyword Queries over XML DataContext-Based Diversification for Keyword Queries over XML Data
Context-Based Diversification for Keyword Queries over XML Data
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
 
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
2014 IEEE JAVA DATA MINING PROJECT Mining weakly labeled web facial images fo...
 
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
IEEE 2014 JAVA DATA MINING PROJECTS Mining weakly labeled web facial images f...
 
27 ijcse-01238-5 sivaranjani
27 ijcse-01238-5 sivaranjani27 ijcse-01238-5 sivaranjani
27 ijcse-01238-5 sivaranjani
 
Expression of Query in XML object-oriented database
Expression of Query in XML object-oriented databaseExpression of Query in XML object-oriented database
Expression of Query in XML object-oriented database
 
Expression of Query in XML object-oriented database
Expression of Query in XML object-oriented databaseExpression of Query in XML object-oriented database
Expression of Query in XML object-oriented database
 
Expression of Query in XML object-oriented database
Expression of Query in XML object-oriented databaseExpression of Query in XML object-oriented database
Expression of Query in XML object-oriented database
 
B017350710
B017350710B017350710
B017350710
 
Efficient Refining Of Why-Not Questions on Top-K Queries
Efficient Refining Of Why-Not Questions on Top-K QueriesEfficient Refining Of Why-Not Questions on Top-K Queries
Efficient Refining Of Why-Not Questions on Top-K Queries
 
Topic detecton by clustering and text mining
Topic detecton by clustering and text miningTopic detecton by clustering and text mining
Topic detecton by clustering and text mining
 
Open Source Tools for Materials Informatics
Open Source Tools for Materials InformaticsOpen Source Tools for Materials Informatics
Open Source Tools for Materials Informatics
 
clustering_classification.ppt
clustering_classification.pptclustering_classification.ppt
clustering_classification.ppt
 
Dq2644974501
Dq2644974501Dq2644974501
Dq2644974501
 

Último

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 

Seaform Slides in VLDB 2010 PhD Workshop

  • 1. Database Research Group Search-As-You-Type in Forms: Leveraging the Usability and the Functionalityof Search Paradigm in Relational Databases Hao WuSupervised by Prof. Lizhu ZhouDatabase Research Group, Tsinghua University VLDB PhD Workshop – Sept. 13, Singapore
  • 2. Motivation Problem Statement Challenges Initial Achievements Conclusions
  • 3. Motivation Problem Statement Challenges Initial Achievements Conclusions
  • 4. Motivation Relational databases are widely used. There are many search paradigms: Structured Query Language (SQL) Keyword Search (KS) Query-By-Example (QBE) Different search paradigms are needed by different users. 10/8/2010 4 Hao Wu, DB Group, Tsinghua University
  • 5. Motivation #1: SQL is complex. SELECT* FROMAuthor A, Autor_Paper AP, Paper P WHERE title LIKE'keyword' AND title LIKE'search' AND authors LIKE'g%' AND A.id = AP.aidAND P.id = AP.pid 10/8/2010 5 Hao Wu, DB Group, Tsinghua University
  • 6. Motivation #2: Traditional keyword search is imprecise. Title? Conf. name? Author name? keyword search g 10/8/2010 6 Hao Wu, DB Group, Tsinghua University
  • 7. Motivation #3: Form is awkward. UCI Directory: http://directory.uci.edu/index.php?form_type=advanced_search 10/8/2010 7 Hao Wu, DB Group, Tsinghua University
  • 8. Motivation #4: The "Search" button is not convenient. 10/8/2010 8 Hao Wu, DB Group, Tsinghua University
  • 9. Motivation + Keyword Search + Form-Style Interface + Search-as-you-type Seaform = 10/8/2010 9 Hao Wu, DB Group, Tsinghua University
  • 10. Motivation Problem Statement Challenges Initial Achievements Conclusions
  • 11. Motivation Problem Statement Challenges Initial Achievements Conclusions
  • 12. Problem Statement Data: Single relational table. Several searchable attributes. 10/8/2010 Hao Wu, DB Group, Tsinghua University 12
  • 13. Problem Statement Query: A set of keywords (prefixes) split by fields. A focus indicator. 10/8/2010 Hao Wu, DB Group, Tsinghua University 13 Title: Author: al Focus = Author xml
  • 14. Problem Statement Results: Global results: corresponding tuples. Local results: corresponding attribute values. Aggregations. 10/8/2010 Hao Wu, DB Group, Tsinghua University 14 xml database (albert) xml search (albert) xml security (alice) Title: Author: al albert2 alice1 xml
  • 15. Motivation Problem Statement Challenges Initial Achievements Conclusions
  • 16. Motivation Problem Statement Challenges Initial Achievements Conclusions
  • 17. Challenges: Search-As-You-Type Prefix matching: E.g.al albert, alice, …Trie structure w/ cache. Fast response: Synchronization of local resultsand global results yields heavycomputational cost.On-demand synchronization and dual-list trie. 10/8/2010 Hao Wu, DB Group, Tsinghua University 17
  • 18. Challenges: Error Tolerance Misplacing of keywords: E.g. input "albert"into the Title input box.Automatic query refinement (given a query, how can we modify it to obtain more results?)Large search space; rely on precise estimation and probabilistic model. Fuzzy matching: E.g. input "albrt" instead of "albert".Edit-distance computation on trie structure.Ranking issue of local results: should local results be sorted by edit-distance, or by aggregation values? 10/8/2010 Hao Wu, DB Group, Tsinghua University 18
  • 19. Challenges: Scalability Handle large-scale databases: There are large number of tuples.1) Top-k algorithmPrecise aggregation is impossible in this case.2) Using RDBMS itselfIndex structure should be redesigned for DBMS; performance issues. Handle multiple tables: Data are regularized to several tables.Generalize the single-table local-global computation and reduce on-the-fly joins using pre-joined tables.It is hard to determine which tables are the most necessary to pre-join; extra storage cost. 10/8/2010 Hao Wu, DB Group, Tsinghua University 19
  • 20. Motivation Problem Statement Challenges Initial Achievements Conclusions
  • 21. Motivation Problem Statement Challenges Initial Achievements Conclusions
  • 22.
  • 24.
  • 25. Non-top-k, i.e. it returns all matching results.
  • 26. Memory-resident.10/8/2010 22 Hao Wu, DB Group, Tsinghua University
  • 27. Demonstrations: Sept. 14, Tuesday 2 14:00 to 15:30 Sept. 15, Wednesday 5 14:00 to 15:30
  • 28. Motivation Problem Statement Challenges Initial Achievements Conclusions
  • 29. Motivation Problem Statement Challenges Initial Achievements Conclusions
  • 30. Conclusions Search-as-you-type with form is a good choice to balance the usability and functionality. There are still many problems to solve: More effective index other than trie + inverted lists. Support error tolerance. Native DBMS support. Top-k algorithms. Pre-join (materialize) tables. ... 10/8/2010 Hao Wu, DB Group, Tsinghua University 26
  • 31. Thanks http://tastier.cs.thu.edu.cn/seaform/ My homepage: http://dbgroup.cs.thu.edu.cn/wuhao/