An Eclipse Plug-in for Code Search
          using Full-text
  Information Retrieval Engine

          Andrejs Jermakovics
           Francesco Di Cerbo


     Free University of Bolzano-Bozen
          Bolzano-Bozen, Italy
Introduction
● Search is becoming an important aspect of
  software development, driven by growing
  code bases and the availability of
  open-source code.
● Moreover, users (and also developers) are
  getting used to full-text searches for
  emails and file contents.
● Why not create a plug-in to support
  full-text search in Eclipse?



Searches in Java: Lucene
                 ● Lucene is an
                   information
                   retrieval library.
                 ● It is FLOSS and is
                   used in many other
                   software projects:
                   ●   Alfresco
                   ●   Solr
                   ●   Eclipse
                   ●   ...


Lucene in Eclipse
● Lucene is part of Eclipse.
  ● NOT for the Search functions!
● It is mostly used for the Eclipse Help!




● … Why not use it for CODE?




Instasearch
● InstaSearch provides powerful and flexible
  code search with high performance.
● It offers flexibility through the Lucene query
  syntax and defines a set of specific fields for
  file searches.
● The currently available fields are:
    Field      Description
    file       Full path of the file
    name       Name of the file
    ext        Extension of the file
    proj       Name of the project containing the file
    jar        Name of the jar, if the file is stored in a jar
    contents   Contents of the file (default search field)
    ws         Working set containing projects (virtual field)
Instasearch searches
● Supported search types:
  ● wildcard searches to match part of a word:
    ● app* initialize
  ● searches on field values:
    ● proj:MyProject ext:java,xml application init
    ● ws:MyWorkingSet application init
  ● fuzzy searches to find similar matches:
    ● application init~
  ● advanced queries:
    ● index AND (directory OR dir)
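
The query forms above can be illustrated with a small sketch. It is not InstaSearch's or Lucene's actual parser: the function names `split_query` and `term_kind` are hypothetical, and the rules (a comma separates alternative values, `~` marks a fuzzy term, `*` or `?` marks a wildcard) are a simplified reading of the examples on this slide. Boolean operators and parentheses are not handled.

```python
# Field names come from the field table; everything else is an assumption.
KNOWN_FIELDS = {"file", "name", "ext", "proj", "jar", "contents", "ws"}


def split_query(query):
    """Split a query into ({field: [values]}, [free-text terms])."""
    filters, terms = {}, []
    for token in query.split():
        field, sep, value = token.partition(":")
        if sep and field in KNOWN_FIELDS and value:
            # "ext:java,xml" is read as two alternative values for ext
            filters.setdefault(field, []).extend(value.split(","))
        else:
            terms.append(token)
    return filters, terms


def term_kind(term):
    """Classify a free-text term by the Lucene-style marker it carries."""
    if term.endswith("~"):
        return "fuzzy"
    if "*" in term or "?" in term:
        return "wildcard"
    return "plain"


filters, terms = split_query("proj:MyProject ext:java,xml application init")
print(filters)                        # field filters
print(terms)                          # terms searched in the default field
print([term_kind(t) for t in ["app*", "init~", "initialize"]])
```

Terms without a field prefix fall through to the default field, which the field table gives as `contents`.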




Instasearch screenshot




Instasearch: code highlighting




Instasearch: search & results view




Instasearch Architecture




Instasearch components (1/2)
● Analyzer
  ● It reads files from the workspace and splits the
    text into a set of tokens. Both the original
    word and its split parts are indexed, so users
    can search for parts of an identifier as well
    as for the exact identifier.
● Indexer
  ● The Indexer collects files with their tokens and
    writes them to the Lucene index. The metadata
    associated with each file is stored in several
    fields, which can later be used in a Lucene
    search query to filter results.
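
The two components can be sketched in a few lines. This is an illustration only, not the plug-in's code: it uses a plain in-memory dictionary instead of a Lucene index, the `tokens` function and `Index` class are hypothetical names, and the splitting rule (camelCase boundaries, underscores, digits) is an assumption about what "split parts" means.

```python
import re
from collections import defaultdict

# Assumed splitting rule: acronyms, capitalised words, lowercase runs, digits.
_PART = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def tokens(text):
    """Return each identifier plus its split parts, all lowercased."""
    out = []
    for word in re.findall(r"[A-Za-z0-9_]+", text):
        out.append(word.lower())              # the exact identifier
        parts = _PART.findall(word)
        if len(parts) > 1:                    # only add parts if it split
            out.extend(p.lower() for p in parts)
    return out


class Index:
    """Toy inverted index: (field, token) -> set of file paths."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add_file(self, path, project, contents):
        name = path.rsplit("/", 1)[-1]
        ext = name.rsplit(".", 1)[-1] if "." in name else ""
        # Metadata fields, named as in the field table
        meta = {"file": path, "name": name, "ext": ext, "proj": project}
        for field, value in meta.items():
            self.postings[(field, value.lower())].add(path)
        for tok in tokens(contents):
            self.postings[("contents", tok)].add(path)

    def lookup(self, field, token):
        return self.postings.get((field, token.lower()), set())


idx = Index()
idx.add_file("/demo/src/App.java", "demo", "void initApplication() {}")
print(tokens("initApplication"))
print(idx.lookup("contents", "application"))      # matches a split part
print(idx.lookup("contents", "initApplication"))  # matches the exact word
```

Indexing both forms costs extra index entries but lets a partial word and a full identifier hit the same file without wildcard queries.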

Instasearch components (2/2)
● Query analyzer
  ● Parses the search text entered by the user and
    creates a search query, which is used to
    retrieve the files from the index. It also
    processes the query further, e.g. to handle
    virtual fields.
● Instasearch View
  ● Performs all UI interactions for getting the
    search text and displaying a list of matching
    files. The search runs while the text is being
    typed, which lets the user refine the query
    quickly to get more relevant results.
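
One way to picture the handling of a virtual field is shown below. The slides only state that `ws` is virtual and refers to a working set containing projects; rewriting it into `proj` filters is an assumption made for illustration, and `expand_working_sets` and its `working_sets` mapping are hypothetical names.

```python
def expand_working_sets(filters, working_sets):
    """Replace the virtual 'ws' filter with the 'proj' values it stands for.

    filters:      {field: [values]} as produced by a query parser
    working_sets: {working set name: [project names]} (assumed to come
                  from the IDE; supplied by hand here)
    """
    expanded = {f: list(v) for f, v in filters.items() if f != "ws"}
    for ws_name in filters.get("ws", []):
        for project in working_sets.get(ws_name, []):
            projects = expanded.setdefault("proj", [])
            if project not in projects:       # avoid duplicate projects
                projects.append(project)
    return expanded


query_filters = {"ws": ["MyWorkingSet"], "ext": ["java"]}
known_sets = {"MyWorkingSet": ["core", "ui"]}
print(expand_working_sets(query_filters, known_sets))
```

Because `ws` is never stored in the index, it has to be resolved before the query reaches the index; the other fields pass through unchanged.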

Conclusions
● InstaSearch is very fast; thanks to a dynamic
  update mechanism, it does not interfere
  with usual Eclipse tasks.
● It is released under the EPL license, is
  available on the Eclipse Marketplace, and is
  hosted on the FLOSS forge of the Free
  University of Bolzano-Bozen.




Instasearch website




Thanks for your attention!

             Andrejs Jermakovics
         andrejs.jermakovics@unibz.it
              Francesco Di Cerbo
               fdicerbo@unibz.it
 https://code.inf.unibz.it/projects/instasearch

      Mark InstaSearch as a favourite on
          the Eclipse Marketplace!



Instasearch -- Eclipse IT 2010
