A demonstration of vivosearch.org and the open source tools that were developed to build the site.
Presented by Brian Caruso, Miles Worthington and Nick Cappadona on Thursday, August 25 at the 2011 VIVO Conference in National Harbor, MD, USA.
Search Across Multiple VIVO Instances
1. Search Across Multiple VIVO Instances
Brian Caruso, Miles Worthington, Nick Cappadona
Albert R. Mann Library
Cornell University
2. Building the foundation
• VIVO core ontology
• Linked Data
• Implementation & Adoption
• Ingest & Editing
3. VIVO core ontology
• A hierarchy of classes and properties
• Incorporates segments of established ontologies
– Bibontology
– FOAF
– eagle-i
• Provides structure for modeled data
http://vivoweb.org/ontology/core
4. Linked Data
“A set of best practices for publishing and connecting structured data on the Web”
• URIs
• RDF
• HTTP
http://linkeddata.org
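The three building blocks above can be made concrete with a tiny example: a person is identified by a URI, described in RDF as subject–predicate–object triples, and that description is retrievable over HTTP. The sketch below parses a couple of N-Triples lines into triples; the `vivo.example.edu` URIs are invented for illustration, not taken from a real VIVO instance, and the parser handles only this simple case.

```python
# A minimal sketch of the RDF triple model using N-Triples syntax.
# The individual URIs below are hypothetical; FOAF and RDFS are the
# real vocabularies the VIVO core ontology builds on.
NTRIPLES = """\
<http://vivo.example.edu/individual/n123> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://vivo.example.edu/individual/n123> <http://www.w3.org/2000/01/rdf-schema#label> "Jane Doe" .
"""

def parse_ntriples(text):
    """Parse simple N-Triples lines into (subject, predicate, object) tuples."""
    triples = []
    for line in text.strip().splitlines():
        line = line.rstrip(" .")          # drop the trailing " ."
        subj, pred, obj = line.split(None, 2)
        triples.append((subj.strip("<>"), pred.strip("<>"), obj.strip('<>"')))
    return triples

for s, p, o in parse_ntriples(NTRIPLES):
    print(s, p, o)
```

Both statements share the same subject URI, which is exactly what lets independent sites make linkable claims about the same resource.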
5. Implementation & Adoption
• VIVO implemented at 7 partner institutions
– Cornell University
– University of Florida
– Indiana University
– Washington University in St. Louis School of Medicine
– Ponce School of Medicine
– Weill Cornell Medical College
– The Scripps Research Institute
• Buy-in and support
6. Ingest & Editing
• Identify local systems of record
– HR
– Grants
– Faculty Activity
• Load data
– Harvester
– Ingest Tools
• Curation and self-editing
8. vivosearch.org
• An example of multi-institutional search
• Includes 7 partner institutions
– plus Harvard Catalyst Profiles
• Built using 2 tools developed on the grant
– Linked Data Index Builder
– VIVO Search Drupal module
• Both are open source and available today
– http://vivosearch.org/tools
9. Preparing Linked Data for search
Linked Data Index Builder
http://vivosearch.org/tools
10. Linked Data Index Builder
• A tool to create a Solr index from VIVO sites
• Linked Data principles
– URIs
– RDF
– HTTP
• Solr
– open source enterprise search platform
– http://lucene.apache.org/solr
11. LDIB input
• URL of VIVO instance
– or any site serving LD aligned with VIVO core ontology
– http://vivo.cornell.edu
• Method/service to retrieve list of URIs
– provided in VIVO through Index page
– http://vivo.cornell.edu/browse
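The two inputs above suggest the overall shape of the index-building loop: list the URIs a site exposes, fetch the RDF for each one, and flatten each description into a document for Solr. This is only a sketch of that shape, not LDIB's actual API; `fetch_uri_list` and `fetch_rdf` are stand-ins for the real HTTP requests, and the document field names are assumptions.

```python
# Sketch of an LDIB-style pipeline with stubbed network calls.

def fetch_uri_list(site_url):
    """Stand-in for the VIVO Index/browse service that lists individual URIs."""
    return [site_url + "/individual/n1", site_url + "/individual/n2"]

def fetch_rdf(uri):
    """Stand-in for an HTTP GET of the RDF description behind a URI."""
    return {"uri": uri, "label": "Person " + uri.rsplit("/", 1)[-1]}

def build_index_docs(site_url, site_name):
    """Turn each retrieved description into a flat Solr-style document."""
    docs = []
    for uri in fetch_uri_list(site_url):
        rdf = fetch_rdf(uri)
        # site_name / site_url are manually curated in LDIB (see slide 15).
        docs.append({"id": rdf["uri"], "label": rdf["label"],
                     "site_name": site_name, "site_url": site_url})
    return docs

docs = build_index_docs("http://vivo.example.edu", "Example University")
print(docs)
```

Attaching the curated site name to every document is what later powers the per-institution facet on the search site.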
15. LDIB to do
• Improve fault tolerance
• Automate update/sync
• Experiment with scaling
• Management tools
– need governance model to design tools
– site_name and site_url are manually curated
– no registration system in place
17. Why Drupal?
• Need a website as well
• Can tap into core search features
• Existing framework for connecting to Solr
18. Apache Solr search integration module
• Flexible, not limited to Drupal content
• Active community
• Commercially backed
19. VIVO search module
• Built for Drupal 7
• Works on top of the existing Drupal module
• Uses Drupal's core search system
• Packaged with 3 search facets:
classgroup, type, institution
• Written specifically for LDIB indexes
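The three packaged facets boil down to counting field values across the matching documents. In practice Solr computes facet counts server-side, but the idea can be sketched in a few lines; the field names below mirror the module's three facets, while the document values are invented for illustration.

```python
from collections import Counter

# Toy result set; fields mirror the module's facets (classgroup, type,
# institution), but the values are made up for this sketch.
results = [
    {"classgroup": "people",   "type": "Faculty Member", "institution": "Cornell University"},
    {"classgroup": "people",   "type": "Librarian",      "institution": "Cornell University"},
    {"classgroup": "research", "type": "Article",        "institution": "University of Florida"},
]

def facet_counts(docs, field):
    """Count how many documents carry each value of a facet field."""
    return Counter(d[field] for d in docs)

print(facet_counts(results, "institution"))
```

Each facet value shown in the UI is just one of these counts paired with a filter that narrows the result set to the documents behind it.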
24. Relevance
• Good result ranking
• Scannable results
• Clear context
• Result totals
• Handle empty results
25. Performance
• More critical than usual
• Don't interrupt user's train of thought
• Users will quickly abandon your site
26. Performance
“Web search engines typically show ten results, or “hits,” per page, with hyperlinks to additional pages of results .... a Google VP reported that despite the fact that users said they wanted more hits per page, an experiment in which the number of hits was increased to 30 hits per page showed a 20% reduction in traffic (Linden, 2006). The reason turned out to be that while the page with 10 results took 0.4 seconds to generate, the page with 30 results took 0.9 seconds on average.”
http://searchuserinterfaces.com/book/sui_ch5_retrieval_results.html
27. Performance enhancements
• Solr
• Apache mod_pagespeed
• Lots of caching
• Data URIs for CSS images
• CSS/JS aggregation and compression
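One of the techniques listed above, data URIs for CSS images, inlines small images directly into the stylesheet so each icon no longer costs its own HTTP round trip. A minimal sketch of the encoding step; the GIF bytes are a hardcoded 1x1 transparent placeholder so the example needs no image file.

```python
import base64

# 1x1 transparent GIF, hardcoded so the sketch is self-contained.
GIF_BYTES = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7")

def to_data_uri(data, mime="image/gif"):
    """Encode raw bytes as a data: URI usable inside a CSS url(...) value."""
    return "data:%s;base64,%s" % (mime, base64.b64encode(data).decode("ascii"))

css_rule = ".icon { background: url(%s); }" % to_data_uri(GIF_BYTES)
print(css_rule)
```

The trade-off is that base64 inflates the payload by roughly a third, so the technique pays off for small, frequently used images rather than large ones.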
28. Controls
• Strive for predictability and consistency
• Facets must be intuitive
• Offer an escape route
29. Usability testing
• 5 sessions
• Covered tasks for entire site
• Results overall positive
• Revealed issues with controls
32. Future enhancements
• Improved result ranking
• More informative text snippets
• Spelling and term suggestions
• Configuration for VIVO search module
33. Build a search site using the tools we developed
Roll Your Own
34. More than meets the eye
• vivosearch.org > LDIB + Drupal module
– theme
– additional utilities
36. Look Mom, no Drupal
• Solr is the key
• choose your weapon for integration
– http://wiki.apache.org/solr/IntegratingSolr
• Drupal is not a requirement
37. Brian Caruso
brian.caruso@cornell.edu
Miles Worthington
miles.worthington@cornell.edu
Nick Cappadona
nick.cappadona@cornell.edu
vivo-dev-all@lists.sourceforge.net
Questions?
Thank You
Editor’s Notes
Alternative: Search Across the Seven Partner VIVO Instances

Brief intro and background on how we got to this point
Ontologies
* Bibontology for publications
* FOAF for people and organizations
* eagle-i for scientific and research resources

* Defines the common thread across institutions (tap into this for search faceting/filtering)

* Uniform Resource Identifier - a string used to identify a resource on the web
* Resource Description Framework - a generic graph-based data model for describing things, including relationships to other things
* HyperText Transfer Protocol - a simple, universal mechanism for requesting and retrieving resources or descriptions of resources

* more than simply installing the VIVO app
* the efforts of the Implementation and Outreach teams are too often overlooked
* the buy-in and support from administration and faculty are critical

* VIVO implemented at institutions and organizations beyond the 7 on the grant
  - University of Colorado
  - Stony Brook
  - there are definitely others... ask Elly for the latest numbers?

* local systems of record are key - you could load all of the data manually, but that’s no fun
* Harvester is available as a subproject on SourceForge - a library of ETL (extract, transform, load) tools
  - initial integration of Harvester in VIVO 1.3
* even with automated ingest, you’ll still want to edit/add information on an individual basis

* so we’ve laid down this foundation and we now have the VIVO app running at the 7 partner institutions, but how do we tie all of this data together and start using it to help us discover new collaborations?
* make connections... the start of a network (probably too loaded a term)

* was thinking of listing out the URLs of the seven VIVO partner instances prior to this slide while I spoke about the points above, but felt it wasn’t necessary

* vivosearch.org
  - an example site that searches the VIVO instances at the 7 partner institutions on the grant
  - also includes Harvard Catalyst Profiles as evidence of interoperability with external apps
* go right into the search - start with a suggested term
* provide a scenario or 2 that makes use of faceting
  - need to come up with these
* follow a result to the source institution

* reiterate that this is an example site :)
* HCP has aligned itself with the VIVO core ontology, serves profiles data as RDF
* these 2 tools are works in progress and are free for you to download and use in building a similar search site
* we’d like to show you a closer look at each of these and provide some details on how you can build a search site of your own

* need to add the link for the sandbox project once it’s online at Drupal.org
Pass off to Brian Caruso to work his magic
* emphasize that the end result is a Solr index
* we use Solr because it’s proven and it’s fast
* revisit Linked Data
  - making HTTP requests to VIVO instances and retrieving RDF using URIs

* alternate title: LDIB Minimum Requirements
* alternate title: LDIB Ingredients

* HCP is an example of one such non-VIVO site (although it doesn’t serve Linked Data -- one URI for both HTML and RDF representations)
* this list of URIs defines what will be retrieved and indexed in subsequent requests
* all steps during index building share very little state, so the process should be very parallelizable
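The point in the note above, that per-URI work shares almost no state, is what makes it safe to fan the fetches out across workers. A sketch under those assumptions; `fetch_rdf` is a stub standing in for the real HTTP fetch.

```python
from concurrent.futures import ThreadPoolExecutor

# Stub standing in for the per-URI HTTP fetch; each call is independent,
# which is what makes the parallel fan-out below safe.
def fetch_rdf(uri):
    return {"uri": uri, "label": uri.rsplit("/", 1)[-1]}

uris = ["http://vivo.example.edu/individual/n%d" % i for i in range(10)]

# Fan the independent fetches out across a small thread pool;
# pool.map preserves the input order of the URIs.
with ThreadPoolExecutor(max_workers=4) as pool:
    docs = list(pool.map(fetch_rdf, uris))

print(len(docs), "documents fetched")
```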
Provide a service to link individuals in one VIVO instance to individuals in another VIVO instance
* Solr is highly scalable
  - distributed indexes
  - used by Netflix, Monster.com, Digg

Should we introduce the Solr schema here or anywhere else, or is it just not worth getting into that level of detail in this presentation?

* the fact that we are manually curating the site information should reinforce that we currently have no registration or signup system beyond “email Brian Caruso...”

* should we mention scaling here? What do we want to say besides “scaling with Solr”?
  - are there any particular example projects/numbers we want to point to?
- Search is a goal-oriented activity. Users are typically not searching for fun. Get out of their way.
- Google and others have established UI patterns that users are comfortable with. The UI itself is not where we want to experiment.
- It seems so simple and familiar, but there are many subtleties in a search interface.
- Usability testing is not negotiable.
* we don’t want to give you the wrong impression that, using only these 2 open source tools, you can build this exact site, pixel for pixel
* additional utilities for facets, class group taxonomy, institution management
* show a barebones D7 site with the default theme and VIVO search module to illustrate the extra work
* place a screenshot here and then also quickly demo this live

* we will also need the Apache Solr Search Integration module as well (anything else?)
* I will work on this tonight/tomorrow
* focusing on the fact that it’s more than just these 2 tools is not the point
* instead, bring the focus to Solr
* explain why we chose it
* illustrate the flexibility it provides
* Drupal is not a requirement, just one example
* demo the AJAX Solr site connected to the Rollins index

I will work on the AJAX Solr site tonight/tomorrow as well.