Search Across
    Multiple VIVO Instances
    Brian Caruso, Miles Worthington, Nick Cappadona
    Albert R. Mann Library
    Cornell University




1
Building the foundation
    •   VIVO core ontology
    •   Linked Data
    •   Implementation & Adoption
    •   Ingest & Editing




2
VIVO core ontology
    • A hierarchy of classes and properties
    • Incorporates segments of established
      ontologies
       – Bibontology
       – FOAF
       – eagle-i
    • Provides structure for modeled data


3   http://vivoweb.org/ontology/core
Linked Data
“A set of best practices for publishing and connecting
  structured data on the Web”
    • URIs
    • RDF
    • HTTP




4   http://linkeddata.org
Implementation & Adoption
    • VIVO implemented at 7 partner institutions
      – Cornell University
      – University of Florida
      – Indiana University
      – Washington University in St. Louis School of Medicine
      – Ponce School of Medicine
      – Weill Cornell Medical College
      – The Scripps Research Institute



    • Buy-in and support



5
Ingest & Editing
    • Identify local systems of record
      – HR
      – Grants
      – Faculty Activity
    • Load data
      – Harvester
      – Ingest Tools
    • Curation and self-editing

6
vivosearch.org

    A Demonstration


7
vivosearch.org
    • An example of multi-institutional search
    • Includes 7 partner institutions
      – plus Harvard Catalyst Profiles
    • Built using 2 tools developed on the grant
      – Linked Data Index Builder
      – VIVO Search Drupal module
    • Both are open source and available today
      – http://vivosearch.org/tools


8
Preparing Linked Data for search

     Linked Data Index Builder

9   http://vivosearch.org/tools
Linked Data Index Builder
     • A tool to create a Solr index from VIVO sites
     • Linked Data principles
       – URIs
       – RDF
       – HTTP
     • Solr
       – open source enterprise search platform
       – http://lucene.apache.org/solr
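
The notes for this slide describe the core loop as "making HTTP requests to VIVO instances and retrieving RDF using URIs". A minimal sketch of that idea follows. It is not LDIB's actual code: the function names, the example URI, and the document layout are hypothetical, and it assumes the server honors an `Accept` header asking for RDF/XML.

```python
from urllib.request import Request

RDF_XML = "application/rdf+xml"  # standard media type for RDF/XML


def build_rdf_request(uri: str) -> Request:
    """Prepare (but do not send) a GET request that asks for RDF instead of HTML."""
    return Request(uri, headers={"Accept": RDF_XML})


def to_solr_doc(uri: str, triples: list[tuple[str, str, str]]) -> dict:
    """Flatten (subject, predicate, object) triples about `uri` into one document.

    The field layout is hypothetical; a real index would follow its Solr schema.
    """
    doc: dict = {"id": uri}
    for subject, predicate, obj in triples:
        if subject != uri:
            continue  # keep only statements about this resource
        doc.setdefault(predicate, []).append(obj)
    return doc


if __name__ == "__main__":
    # Hypothetical URI, used only to show the shape of the request.
    req = build_rdf_request("http://vivo.cornell.edu/individual/example")
    print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Keeping request construction separate from document building means each half can be checked without network access.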


10
LDIB input
     • URL of VIVO instance
       – or any site serving LD aligned with VIVO core ontology
       – http://vivo.cornell.edu
     • Method/service to retrieve list of URIs
       – provided in VIVO through Index page
       – http://vivo.cornell.edu/browse
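
The two inputs above could be captured in a small record per site. This is only a sketch: `site_name` and `site_url` are the curated fields named later in the deck, while `uri_list_url`, the `SiteInput` class, and the `same_host` check are assumptions made for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class SiteInput:
    site_name: str      # display name, curated by hand per the slides
    site_url: str       # base URL of the VIVO instance
    uri_list_url: str   # hypothetical field: page or service that lists URIs to index


def same_host(site: SiteInput, uri: str) -> bool:
    """Return True when `uri` is served from the same host as the site's base URL."""
    return urlparse(uri).netloc == urlparse(site.site_url).netloc


# Values taken from the slide; the record shape itself is an assumption.
cornell = SiteInput(
    site_name="Cornell University",
    site_url="http://vivo.cornell.edu",
    uri_list_url="http://vivo.cornell.edu/browse",
)
```

A check like `same_host` is one possible sanity filter on a retrieved URI list before any further requests are made.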




11
LDIB process
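
The notes for this slide say the index-building steps share very little state and should be very parallelizable. A minimal sketch of what that could look like, assuming a thread pool and a per-URI `process` function; both names and `fake_process` are hypothetical stand-ins, not part of LDIB.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def index_all(
    uris: Iterable[str],
    process: Callable[[str], dict],
    workers: int = 4,
) -> list[dict]:
    """Apply `process` to every URI using a pool of worker threads.

    `pool.map` yields results in the same order as the input URIs.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, uris))


def fake_process(uri: str) -> dict:
    # Stand-in for "fetch RDF, build a document"; no network access here.
    return {"id": uri, "length": len(uri)}


if __name__ == "__main__":
    print(index_all(["http://example.org/a", "http://example.org/b"], fake_process))
```

Because each URI is handled independently, the worker count can be tuned without changing the per-URI logic.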




12
Map of our Solr system




13
Map of our Solr system




14
LDIB to do
     •   Improve fault tolerance
     •   Automate update/sync
     •   Experiment with scaling
     •   Management tools
         – need governance model to design tools
         – site_name and site_url are manually curated
         – no registration system in place



15
Searching an LDIB index with Drupal

      VIVO search Drupal module

16   http://vivosearch.org/tools
Why Drupal?
     • Need a website as well
     • Can tap into core search features
     • Existing framework for connecting to Solr




17
Apache Solr search integration module
     • Flexible, not limited to Drupal content
     • Active community
     • Commercially backed




18
VIVO search module
     • Built for Drupal 7
     • Works on top of the existing Drupal module
     • Uses Drupal's core search system
     • Packaged with 3 search facets:
       classgroup, type, institution
     • Written specifically for LDIB indexes
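
To make the three facets concrete, here is a sketch of the query string a client might send to a Solr select handler. The parameters `q`, `rows`, `wt`, `facet`, `facet.field`, and `fq` are standard Solr request parameters; treating `classgroup`, `type`, and `institution` as index field names is an assumption based on the slide, and `facet_query` is a hypothetical helper, not part of the module.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode

FACET_FIELDS = ["classgroup", "type", "institution"]  # names taken from the slide


def facet_query(terms: str, filters: Optional[dict] = None, rows: int = 10) -> str:
    """Build a URL-encoded Solr query string with facet counts and optional filters.

    Filter values are wrapped in double quotes; this sketch does not escape
    quotes that appear inside a value.
    """
    params = [("q", terms), ("rows", str(rows)), ("wt", "json"), ("facet", "true")]
    params += [("facet.field", field) for field in FACET_FIELDS]
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    return urlencode(params)


if __name__ == "__main__":
    query = facet_query("soil", {"institution": "Cornell University"})
    print(parse_qs(query))
```

Narrowing with `fq` rather than editing `q` leaves the user's search terms untouched while a facet is applied.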



19
Developing a search site

     VIVOsearch.org interface

20
21
User priorities
     • All users want relevant search results
     • Most users demand quick search results
     • Some users want to manipulate search results




22   http://searchuserinterfaces.com
Development priorities
     1. Relevance
     2. Performance
     3. Controls




23   http://searchuserinterfaces.com
Relevance
     •   Good result ranking
     •   Scannable results
     •   Clear context
     •   Result totals
     •   Handle empty results
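
Two of these points, result totals and empty results, can be sketched against Solr's JSON response, which reports `numFound`, `start`, and `docs` under a `response` key. The `summarize` function and its wording are hypothetical, not taken from vivosearch.org.

```python
def summarize(response: dict, terms: str) -> str:
    """Turn a Solr JSON response into a one-line status message for the user."""
    body = response.get("response", {})
    total = body.get("numFound", 0)
    docs = body.get("docs", [])
    if total == 0:
        # Give the user a next step instead of a blank page.
        return f'No results for "{terms}". Try fewer or more general terms.'
    start = body.get("start", 0)
    first, last = start + 1, start + len(docs)
    return f'Showing {first}-{last} of {total} results for "{terms}"'


if __name__ == "__main__":
    page = {"response": {"numFound": 42, "start": 10, "docs": [{}] * 10}}
    print(summarize(page, "soil"))
```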




24
Performance
     • More critical than usual
     • Don't interrupt user's train of thought
     • Users will quickly abandon your site




25
Performance
     “Web search engines typically show ten results, or “hits,” per page,
      with hyperlinks to additional pages of results .... a Google VP
      reported that despite the fact that users said they wanted more
      hits per page, an experiment in which the number of hits was
      increased to 30 hits per page showed a 20% reduction in traffic
      (Linden, 2006). The reason turned out to be that while the page
      with 10 results took 0.4 seconds to generate, the page with 30
      results took 0.9 seconds on average.”



         http://searchuserinterfaces.com/book/sui_ch5_retrieval_results.html

26
Performance enhancements
     •   Solr
     •   Apache mod_pagespeed
     •   Lots of caching
     •   Data URIs for CSS images
     •   CSS/JS aggregation and compression
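
As an illustration of the "Data URIs for CSS images" item, a small image can be inlined into a stylesheet as a base64 `data:` URI so that it costs no extra HTTP request. This is a generic sketch of the technique, not the site's build process; the function names and the selector are hypothetical.

```python
import base64


def data_uri(data: bytes, mime_type: str = "image/png") -> str:
    """Encode raw bytes as a base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def css_background(selector: str, data: bytes, mime_type: str = "image/png") -> str:
    """Return a CSS rule that embeds the image bytes directly in the stylesheet."""
    return f'{selector} {{ background-image: url("{data_uri(data, mime_type)}"); }}'


if __name__ == "__main__":
    # Placeholder bytes; a real build step would read the image file instead.
    print(css_background(".icon", b"hi", "text/plain"))
```

Base64 makes the payload roughly a third larger, so the trade-off tends to favor small, frequently used images.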




27
Controls
     • Strive for predictability and consistency
     • Facets must be intuitive
     • Offer an escape route




28
Usability testing
     •   5 sessions
     •   Covered tasks for entire site
     •   Results overall positive
     •   Revealed issues with controls




29
30
31
Future enhancements
     •   Improved result ranking
     •   More informative text snippets
     •   Spelling and term suggestions
     •   Configuration for VIVO search module




32
Build a search site using the tools we developed

     Roll Your Own

33
More than meets the eye
     • vivosearch.org > LDIB + Drupal module
       – theme
       – additional utilities




34
35
Look Mom, no Drupal
     • Solr is the key
     • choose your weapon for integration
       – http://wiki.apache.org/solr/IntegratingSolr
     • Drupal is not a requirement




36
Brian Caruso
                          brian.caruso@cornell.edu

                                 Miles Worthington
                     miles.worthington@cornell.edu

                                  Nick Cappadona
                       nick.cappadona@cornell.edu


                  vivo-dev-all@lists.sourceforge.net




     Questions?

     Thank You

37

Editor's Notes

  1. Alternative: Search Across the Seven Partner VIVO Instances
  2. Brief intro and background on how we got to this point
  3. Ontologies
     * Bibontology for publications
     * FOAF for people and organizations
     * eagle-i for scientific and research resources
     * Defines the common thread across institutions (tap into this for search faceting/filtering)
  4. * Uniform Resource Identifier - a string used to identify a resource on the web
     * Resource Description Framework - a generic graph-based data model for describing things, including relationships to other things
     * HyperText Transfer Protocol - simple, universal mechanism for requesting and retrieving resources or descriptions of resources
  5. * more than simply installing the VIVO app
     * efforts of Implementation and Outreach teams are too often overlooked
     * the buy-in and support from administration and faculty are critical
     * VIVO implemented at institutions and organizations beyond the 7 on the grant
       - University of Colorado
       - Stony Brook
       - there are definitely others...ask Elly for latest numbers?
  6. * local SORs are key - you could load all of the data manually but that’s no fun
     * Harvester is available as a subproject on SourceForge - library of ETL tools
       - extract, transform, load
       - initial integration of Harvester in VIVO 1.3
     * Even with automated ingest, you’ll still want to edit/add information on an individual basis
  7. * so we’ve laid down this foundation and we now have the VIVO app running at the 7 partner institutions, but how do we tie all of this data together and start using it to help us discover new collaborations?
     * make connections...the start of a network (probably too loaded of a term)
     * was thinking of listing out the URLs of the seven VIVO partner instances prior to this slide while I spoke about the points above, but felt it wasn’t necessary
     * vivosearch.org
       - an example site that searches the VIVO instances at the 7 partner institutions on the grant
       - also includes Harvard Catalyst Profiles as evidence of interoperability with external apps
     * go right into the search - start with a suggested term
     * provide a scenario or 2 that makes use of faceting
       - need to come up with these
     * follow a result to the source institution
  8. * reiterate that this is an example site :)
     * HCP has aligned itself with the VIVO core ontology, serves profiles data as RDF
     * these 2 tools are works in progress and are free for you to download and use in building a similar search site
     * we’d like to show you a closer look at each of these and provide some details on how you can build a search site of your own
     * need to add the link for the sandbox project once it’s online at Drupal.org
  9. Pass off to Brian Caruso to work his magic
  10. * emphasize that the end result is a Solr index
     * we use Solr because it’s proven and it’s fast
     * revisit Linked Data
       - making HTTP requests to VIVO instances and retrieving RDF using URIs
  11. * alternate title: LDIB Minimum Requirements
     * alternate title: LDIB Ingredients
     * HCP is an example of one such non-VIVO site (although it doesn’t serve Linked Data -- one URI for both HTML and RDF representations)
     * this list of URIs defines what will be retrieved and indexed in subsequent requests
  12. * All steps during index building
       - share very little state
       - should be very parallelizable
  13. Provide service to link individuals from one VIVO to individuals in another VIVO instance
  14. * Solr is highly scalable
       - distributed indexes
       - used by Netflix, Monster.com, Digg
  15. Should we introduce the Solr schema here or anywhere else or is it just not worth getting into that level of detail in this presentation?
     * the fact that we are manually curating the site information should reinforce that we currently have no registration or signup system beyond “email Brian Caruso...”
     * should we mention scaling here? What do we want to say besides “scaling with Solr”?
       - are there any particular example projects/numbers we want to point to?
  21. - Search is a goal-oriented activity. Users are typically not searching for fun. Get out of their way.
     - Google and others have established UI patterns that users are comfortable with. The UI itself is not where we want to experiment.
     - It seems so simple and familiar, but there are many subtleties in a search interface.
     - Usability testing is not negotiable.
  34. * we don’t want to give you the wrong impression that, using only these 2 open source tools, you can build this exact site, pixel for pixel
     * additional utilities for facets, class group taxonomy, institution management
  35. * show barebones D7 site with default theme and VIVO search module to illustrate extra work
     * place screenshot here and then also quickly demo this live
     * we will also need the Apache Solr Search Integration module (anything else?)
     * I will work on this tonight/tomorrow
  36. * focusing on the fact that it’s more than just these 2 tools is not the point
     * instead bring the focus to Solr
     * explain why we chose it
     * illustrate the flexibility it provides
     * Drupal is not a requirement, just one example
     * demo AJAX Solr site connected to Rollins index
     * I will work on the AJAX Solr site tonight/tomorrow as well