HUMANITIES
NETWORKED
INFRASTRUCTURE
(HUNI)
JAILBREAKING AUSTRALIA’S
CULTURAL DATA
CRICOS Provider Code: 00113B
NATIONAL E-RESEARCH
COLLABORATION TOOLS AND
RESOURCES (NECTAR)
NeCTAR is a $47 million Australian Government
project, conducted as part of the Super Science
initiative and financed by the Education Investment
Fund. The University of Melbourne is the lead
agent, chosen by the Commonwealth Government.
VIRTUAL LABORATORY PROGRAM
• Ensure that Australian cultural datasets and the research
associated with them become part of the emerging
international Linked Open Data environment.
• Enable research enquiries to move easily from: what is?
to where is?
• Support the role of annotation and metadata in discovery
of new knowledge or the means to elucidate new
knowledge
• Position the idea of data as both a subject and object of
analysis in humanities
• Contribute to debates around standards for development
and implementation
HuNI BROAD BENEFITS
• Enable humanities researchers to work with cultural datasets
more efficiently and effectively, and on a larger scale;
• Encourage the systematic sharing of research data between
humanities researchers (including the cultural dataset
curators themselves), the community and cultural
institutions;
• Encourage a greater level of cross-disciplinary and
interdisciplinary research, both within the
humanities/creative arts and between the
humanities/creative arts and other disciplines, and the wider
public;
• Support innovative methodologies such as network
analysis, game theory and ‘virtual history’ that rely on large-
scale datasets
HUNI: SPECIFIC BENEFITS
1. Organisational level: the goals and processes of the institutions
involved
2. The semantic level: meaning of the exchanged digital resources
3. Technical level: implementing data interoperability requires
both data integration and data exchange processes as well as
enabling effective use of the data that becomes available
Pasquale Pagano, ‘Data Interoperability’ (GRDI2020)
4. Project level: The advent of more complex ‘big humanities’
projects requires multiple and multi-disciplinary personnel
which in turn entails the organization of different workflows
and expectations: e.g. challenge of developing a
comprehensive or consortial approach, common definition of
project method etc.
INTEROPERABILITY
1. A PARTNERSHIP
… a Deakin led consortium
• Cultural data providers (10) – project co-operators
• Humanities software developer (1) – project co-
developers
• eResearch organisations (2) – lead development
agencies
HUNI PARTNER DATASETS
AMHD
MAP
CAARP
Bonza
AFIRC
Circus Oz
AusStage
Media: film, cinema, theatre, newspapers, magazines, advertising,
music, live performances
DAAO
AustLit
AWR
ADB
DoS
Biographical: artists, designers, writers, significant
people, scientists, Sydney demographics
EOAS
AUSTLANG
Mura
Indigenous languages
AUSTLIT
ADB
DAAO
AUSTLANG
bonza
AUSSTAGE
EOAS
TUGG
Welcome to the Cinema and Audiences Research Project (CAARP) database: An online encyclopaedia of
cinema-going in Australia.
Data
This site contains information on film screenings and venues in Australia.
430,137 screenings
10,256 films
1,978 cinemas
1,649 companies
From 1846 to now
• NeCTAR investment of $1.33M
• Partner contributions of $480,000
• Partner in-kind contributions amounting to >$1M
A FISCAL COLLABORATION
COMMUNITY BUILDING
• Collated user-stories (20)
• Online showcase events – next one is 4th September
2013
• Live link to the latest alpha prototype on huni.net.au;
feedback buttons
• Wider beta launch at eResearch Australasia in October
2013
• Stay up to date through our monthly Newsletter and
blog feed
• Follow us on Twitter - @HuNIVL
Information design challenge: build an ontology, and use linked
data and controlled vocabularies, so that data can be aligned and
related.
• Reading the data. Characteristics of the data determine
the ontological components selected and the major
“entities” (aka “access points”).
• Identified early as:
people, organisations, events, relationships, places, dates,
resources, and subjects.
• Components from ontologies already available and being
reused or kept in our sights: CIDOC-CRM, FOAF, FRBR,
FRBR-OO, BibFrame and PROV-O.
2. INTEGRATING MEANING
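As a rough illustration of what aligning partner data to shared entities could look like, the sketch below maps a partner record to subject-predicate-object triples. It is a minimal sketch, not HuNI's actual mapping: the class names (cidoc:E21Person, cidoc:E74Group, cidoc:E5Event, cidoc:E53Place) appear in the HuNI ontology, but their pairing with the entity kinds here, the record fields, and the names `ENTITY_CLASSES` and `align_record` are assumptions made for the example.

```python
# Hypothetical pairing of entity kinds with ontology classes (an assumption,
# not the project's published alignment).
ENTITY_CLASSES = {
    "person": "cidoc:E21Person",
    "organisation": "cidoc:E74Group",
    "event": "cidoc:E5Event",
    "place": "cidoc:E53Place",
}


def align_record(dataset, record):
    """Return (subject, predicate, object) triples for one partner record."""
    # Prefix the local identifier with the dataset name so that records
    # from different partners cannot collide.
    subject = f"{dataset}:{record['id']}"
    triples = [(subject, "rdf:type", ENTITY_CLASSES[record["kind"]])]
    if "name" in record:
        triples.append((subject, "rdfs:label", record["name"]))
    return triples


# Invented sample record, for illustration only.
example = align_record(
    "ausstage", {"id": "42", "kind": "person", "name": "Example Person"}
)
for triple in example:
    print(triple)
```

Typing every record against a shared class is what lets records from different datasets be related later, whatever each partner calls its own fields.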
PHASE ONE
HUNI ONTOLOGY March 2013
HUNI ONTOLOGY (all classes and
object properties)
[Figure: HuNI ontology diagram. Classes with the prefixes cidoc:, frbr:, foaf: and huni: (for example cidoc:E21Person, cidoc:E39Actor, cidoc:E74Group, cidoc:E5Event, cidoc:E53Place, cidoc:E52Time-Span, frbr:F1Work, frbr:F2Expression, foaf:Person, foaf:Group, huni:SKOS.Occupation, huni:SKOS.Role, huni:SKOS.Collection, huni:SKOS.Item) are linked by "has subclass" relations and by object properties (Domain>Range), including cidoc:P1isIdentifiedBy, cidoc:P2hasType, cidoc:P4hasTimeSpan, cidoc:P7tookPlaceAt, cidoc:P102hasTitle, huni:hasOccupation, huni:hasRole, huni:hasCollection, huni:hasItem, huni:timeIsIdentifiedBy and huni:placeIsIdentifiedBy.]
ALIGNING ONTOLOGIES
3. HuNI DATA ARCHITECTURE
[Figure: HuNI data architecture]
• Partner side: data update and publish (ADB, DAAO, CAARP, AFIRC, AusStage)
• HuNI side, data integration: Corbicula (data harvest, transform
and ingest); data analysis and mapping; Solr Search Server [HuNI Data];
RDF Triple Store [HuNI Linked Data]
• HuNI Virtual Laboratory: scholarly researcher workflow tasks, public
and citizen researcher workflow tasks, and admin tasks
• Data discovery: simple search, advanced search, deep (SPARQL-based)
search, save search results as private collection, refine / expand
collection
• Data analysis: analyse and annotate collection, export collection
• Data sharing: share collection and analysis, share search results
• Admin tasks: registration and login, profile management, history
recording, project management
A total of 28 Australian datasets are being harvested for integration into
HuNI
• Data gateway components, called HuNI Corbicula, are deployed on the
NeCTAR Cloud to harvest the XML feed data and transform it into
forms suitable for ingestion into two HuNI data aggregates: a Solr
search server [HuNI Data], and a Jena RDF Triple Store [HuNI Linked
Data]
DATA INTEGRATION
The harvesting process
requires:
• Live data feeds
deployed at the partner
sites to publish
updated partner data
as XML
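The harvest-and-transform step can be sketched as follows. This is a minimal illustration under stated assumptions, not the Corbicula implementation: the XML layout (`<records>`, `<record>`, `<name>`), the sample data and the `harvest` function are all hypothetical, and the output is held in plain Python structures rather than sent to a Solr server or an RDF store.

```python
import xml.etree.ElementTree as ET

# Invented partner feed; the real feed formats are not described in the slides.
SAMPLE_FEED = """<records>
  <record id="101" type="person"><name>Example Person</name></record>
  <record id="102" type="place"><name>Example Cinema</name></record>
</records>"""


def harvest(feed_xml, dataset):
    """Turn one XML feed into search documents and triples."""
    root = ET.fromstring(feed_xml)
    search_docs, triples = [], []
    for rec in root.findall("record"):
        uri = f"{dataset}:{rec.get('id')}"
        kind = rec.get("type")
        name = rec.findtext("name", default="")
        # Flat document, the shape a search index typically ingests.
        search_docs.append(
            {"id": uri, "type": kind, "name": name, "source": dataset}
        )
        # The same record expressed as subject-predicate-object triples.
        triples.append((uri, "rdf:type", kind))
        triples.append((uri, "rdfs:label", name))
    return search_docs, triples


docs, triples = harvest(SAMPLE_FEED, "caarp")
print(len(docs), "search documents;", len(triples), "triples")
```

Producing both forms from a single pass over the feed reflects the two aggregates named above: flat documents for search, and triples for linked data.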
[Figure: data integration. Partner side: data update and publish (ADB, DAAO, CAARP, AFIRC, AusStage). HuNI side: Corbicula (data harvest, transform and ingest), data analysis and mapping, Solr Search Server [HuNI Data], RDF Triple Store [HuNI Linked Data].]
TWO HUNI DATA AGGREGATES?
[Figure: two charts, "Solr aggregate" and "RDF aggregate", counting partner datasets on an axis from 0 to 28; the values 24 and 6 are marked.]
TECHNOLOGY STACK
• front-end frameworks - AngularJS and Twitter
Bootstrap (single-page web app)
• tools hosting framework - OpenSocial via Apache
Shindig
• back-end framework - Spring MVC via Spring Roo
• layer integration - RESTful web services
RESEARCH ACTIVITIES
A researcher with a HuNI account will be able to:
• Search the HuNI Data
• Save their search results as a
private collection
• Refine their collection through
additional searches
• Analyse and annotate their
collection with their own
assertions and commentary
• Export their collection for
further analysis
• Publish and share their
collection and research
[Figure: HuNI Virtual Laboratory workflow tasks (data discovery, data analysis, data sharing, admin tasks), with simple and advanced search drawing on the Solr Search Server [HuNI Data].]
Scholarly researchers will also be able to perform a “deep
search” of the graphs in the RDF Triple Store.
The large-scale aggregation of Linked Data makes explicit the
relationships and connections between related records across
all the partner datasets, enabling the researcher to construct
more complex semantic queries.
RESEARCH ACTIVITIES 2
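To show the kind of question a deep search makes possible, the sketch below matches triple patterns over a small in-memory list, the same basic operation a SPARQL query performs against a triple store. It is an illustration only: the data is invented, `match` and `roles_across_datasets` are hypothetical helpers, and the property huni:hasRole (which appears in the HuNI ontology) is simplified here to point at a plain string.

```python
# Invented triples standing in for records from three partner datasets.
TRIPLES = [
    ("ausstage:p1", "rdfs:label", "Example Person"),
    ("ausstage:p1", "huni:hasRole", "Actor"),
    ("caarp:p9", "rdfs:label", "Example Person"),
    ("caarp:p9", "huni:hasRole", "Exhibitor"),
    ("daao:p3", "rdfs:label", "Another Person"),
    ("daao:p3", "huni:hasRole", "Designer"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def roles_across_datasets(triples, name):
    """Collect, per dataset, the roles of every record labelled `name`."""
    roles = {}
    for subject, _, _ in match(triples, p="rdfs:label", o=name):
        dataset = subject.split(":", 1)[0]
        roles[dataset] = [
            o for _, _, o in match(triples, s=subject, p="huni:hasRole")
        ]
    return roles


result = roles_across_datasets(TRIPLES, "Example Person")
print(result)
```

Matching on a shared label is the simplest possible join and is used here only to keep the example short; a real aggregate would need a more careful way of deciding that two records describe the same entity.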
[Figure: HuNI Virtual Laboratory workflow tasks (data discovery, data analysis, data sharing, admin tasks), with deep (SPARQL-based) search drawing on the RDF Triple Store [HuNI Linked Data].]
EARLY VLAB PROTOTYPE
VIRTUAL LABORATORY RESEARCHER
WORKFLOW: Discovery (part 1)
VIRTUAL LABORATORY RESEARCHER
WORKFLOW: Discovery (part 2)
VIRTUAL LABORATORY RESEARCHER
WORKFLOW: Discovery (part 3)
VIRTUAL LABORATORY RESEARCHER
WORKFLOW: Analysis (part 1)
VIRTUAL LABORATORY RESEARCHER
WORKFLOW: Analysis (part 2)
VIRTUAL LABORATORY RESEARCHER
WORKFLOW: Sharing
4. THE PROJECT
• project director/community liaison (20%)
• project manager (100%)
• technical coordinator (100%)
• information services coordinator (90%)
• community engagement (30%)
• communication coordinator (20%)
• administrative support (20%)
• software developer(s)
NeCTAR Directorate
HuNI Steering Committee
Team HuNI
Technical Working Group
Expert Advisory Group
Expert Data Group
PROJECT WEBSITE: huni.net.au
PROJECT WIKI: apidictor.huni.net.au
HuNI: a virtual laboratory for the humanities
http://huni.net.au/
@HuNIVL

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Humanities Networked Infrastructure (HuNI)

  • 2. CRICOS Provider Code: 00113B NATIONAL E-RESEARCH COLLABORATION TOOLS AND RESOURCES (NECTAR) NeCTAR is a $47 million Australian Government project, conducted as part of the Super Science initiative and financed by the Education Investment Fund. The University of Melbourne is the lead agent, chosen by the Commonwealth Government.
  • 4. HuNI BROAD BENEFITS
      • Ensure that Australian cultural datasets and the research associated with them become part of the emerging international Linked Open Data environment.
      • Enable research enquiries to move easily from: what is? to where is?
      • Support the role of annotation and metadata in discovery of new knowledge or the means to elucidate new knowledge
      • Position the idea of data as both a subject and object of analysis in humanities
      • Contribute to debates around standards for development and implementation
  • 5. HUNI: SPECIFIC BENEFITS
      • Enable humanities researchers to work with cultural datasets more efficiently and effectively, and on a larger scale;
      • Encourage the systematic sharing of research data between humanities researchers (including the cultural dataset curators themselves), the community and cultural institutions;
      • Encourage a greater level of cross-disciplinary and interdisciplinary research, both within the humanities/creative arts and between the humanities/creative arts and other disciplines, and the wider public;
      • Support innovative methodologies such as network analysis, game theory and ‘virtual history’ that rely on large-scale datasets
  • 6. INTEROPERABILITY
      1. Organisational level: the goals and processes of the institutions involved
      2. The semantic level: meaning of the exchanged digital resources
      3. Technical level: implementing data interoperability requires both data integration and data exchange processes as well as enabling effective use of the data that becomes available
      Pasquale Pagano, ‘Data Interoperability’ (GRDI2020)
      4. Project level: The advent of more complex ‘big humanities’ projects requires multiple and multi-disciplinary personnel, which in turn entails the organization of different workflows and expectations: e.g. the challenge of developing a comprehensive or consortial approach, a common definition of project method etc.
  • 7. 1. A PARTNERSHIP … a Deakin-led consortium
      • Cultural data providers (10) – project co-operators
      • Humanities software developer (1) – project co-developers
      • eResearch organisations (2) – lead development agencies
  • 8. HUNI PARTNER DATASETS AMHD MAP CAARP Bonza AFIRC Circus Oz AusStage Media: film, cinema, theatre, newspapers, magazines, advertising, music, live performances DAAO AustLit AWR ADB DoS Biographical: artists, designers, writers, significant people, scientists, Sydney demographics EOAS AUSTLANG Mura Indigenous languages
  • 10. ADB
  • 11. DAAO
  • 13. bonza
  • 15. EOAS
  • 16. TUGG
  • 17. Welcome to the Cinema and Audiences Research Project (CAARP) database: An online encyclopaedia of cinema-going in Australia. Data This site contains information on film screenings and venues in Australia. 430,137 screenings 10,256 films 1,978 cinemas 1,649 companies From 1846 to now
  • 18. A FISCAL COLLABORATION
      • NeCTAR investment of $1.33M
      • Partner contributions of $480,000
      • Partner in-kind contributions amounting to >$1M
  • 19. COMMUNITY BUILDING
      • Collated user-stories (20)
      • Online showcase events – next one is 4th September 2013
      • Live link to the latest alpha prototype on huni.net.au; feedback buttons
      • Wider beta launch at eResearch Australasia in October 2013
      • Stay up to date through our monthly Newsletter and blog feed
      • Follow us on Twitter - @HuNIVL
  • 20. 2. INTEGRATING MEANING
      Information design challenge to build an ontology and use linked data and controlled vocabularies for data to be aligned and related.
      • Reading the data. Characteristics of the data determine the ontological components selected and the major “entities” (aka “access points”).
      • Identified early as: people, organisations, events, relationships, places, dates, resources, and subjects.
      • Components from ontologies already available and being reused or kept in our sights: CIDOC-CRM, FOAF, FRBR, FRBR-OO, BibFrame and PROV-O.
  • 23. HUNI ONTOLOGY (all classes and object properties) [Diagram: graph of classes joined by "has subclass" edges and by object properties (Domain>Range), using terms prefixed cidoc:, frbr:, foaf: and huni:. Classes shown include cidoc:E21Person, cidoc:E39Actor, cidoc:E74Group, cidoc:E53Place, cidoc:E52Time-Span, cidoc:E5Event, cidoc:E7Activity, frbr:F1Work and frbr:F2Expression; properties shown include cidoc:P1isIdentifiedBy, cidoc:P2hasType, cidoc:P7tookPlaceAt, huni:hasOccupation and huni:hasRole.]
  • 25. 3. HuNI DATA ARCHITECTURE
      [Architecture diagram. Labels:]
      • Data integration, partner side: Data update and publish (ADB, DAAO, CAARP, AFIRC, AusStage)
      • Data integration, HuNI side: Corbicula; Data harvest, transform and ingest; Data analysis and mapping; Solr Search Server [HuNI Data]; RDF Triple Store [HuNI Linked Data]
      • HuNI Virtual Laboratory: Scholarly researcher workflow tasks; Public and citizen researcher workflow tasks; Admin tasks
      • Data discovery: Simple search; Advanced search; Deep (SPARQL-based) search; Save search results as private collection; Refine / expand collection
      • Data analysis: Analyse and annotate collection; Export collection
      • Data sharing: Share collection and analysis; Share search results
      • Admin tasks: Registration and login; Profile management; History recording; Project management
  • 26. DATA INTEGRATION
      A total of 28 Australian datasets are being harvested for integration into HuNI. The harvesting process requires:
      • Live data feeds deployed at the partner sites to publish updated partner data as XML
      • Data gateway components, called HuNI Corbicula, deployed on the NeCTAR Cloud to harvest the XML feed data and transform it into forms suitable for ingestion into two HuNI data aggregates: a Solr search server [HuNI Data] and a Jena RDF Triple Store [HuNI Linked Data]
      [Diagram: the data integration part of the HuNI data architecture. Partner side: Data update and publish (ADB, DAAO, CAARP, AFIRC, AusStage). HuNI side: Corbicula; Data harvest, transform and ingest; Data analysis and mapping; Solr Search Server [HuNI Data]; RDF Triple Store [HuNI Linked Data].]
  • 27. TWO HUNI DATA AGGREGATES? [Bar charts: number of partner datasets in the Solr aggregate and in the RDF aggregate; axes marked in partner datasets.]
  • 28. TECHNOLOGY STACK
      • front-end frameworks - AngularJS and Twitter Bootstrap single page web app
      • tools hosting framework - OpenSocial via Apache Shindig
      • back-end framework - Spring MVC via Roo
      • layer integration - RESTful web services
  • 29. RESEARCH ACTIVITIES
      A researcher with a HuNI account will be able to:
      • Search the HuNI Data
      • Save their search results as a private collection
      • Refine their collection through additional searches
      • Analyse and annotate their collection with their own assertions and commentary
      • Export their collection for further analysis
      • Publish and share their collection and research
      [Diagram: HuNI Virtual Laboratory workflow tasks (data discovery, data analysis, data sharing, admin tasks) drawing on the Solr Search Server [HuNI Data].]
  • 30. RESEARCH ACTIVITIES 2
      Scholarly researchers will also be able to perform a “deep search” of the graphs in the RDF Triple Store. The large-scale aggregation of Linked Data makes explicit the relationships and connections between related records across all the partner datasets, enabling the researcher to construct more complex semantic queries.
      [Diagram: HuNI Virtual Laboratory workflow tasks, with Deep (SPARQL-based) search drawing on the RDF Triple Store [HuNI Linked Data].]
  • 32.
  • 37. VIRTUAL LABORATORY RESEARCHER WORKFLOW – Analysis (part 2)
  • 39. 4. THE PROJECT
      • project director/community liaison (20%)
      • project manager (100%)
      • technical coordinator (100%)
      • information services coordinator (90%)
      • community engagement (30%)
      • communication coordinator (20%)
      • administrative support (20%)
      • software developer(s)
      [Governance diagram labels: NeCTAR Directorate HuNI Steering Committee Team HuNI Technical Working Group Expert Advisory Group Expert Data Group]
  • 42. HuNI: a virtual laboratory for the humanities http://huni.net.au/ @HuNIVL
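
Slides 25 and 26 describe a gateway that harvests partner XML and re-expresses it in two forms, one for the Solr search server and one for the RDF triple store. The following is only a minimal sketch of that idea in Python, using the standard library alone. It is not the Corbicula implementation, which the notes say uses XSLT. The record layout, the element names, the `adb-0001` identifier and the `http://example.org/huni/` namespace are all assumptions made for illustration; only the Solr `<add><doc><field>` update message shape, the N-Triples line syntax and the FOAF terms are standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat XML record for one person entity (element names are assumptions).
PARTNER_XML = """<person id="adb-0001">
  <firstName>Ada</firstName>
  <lastName>Example</lastName>
  <occupation>Actor</occupation>
</person>"""

BASE = "http://example.org/huni/"  # placeholder namespace, not a real HuNI URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"


def parse_person(xml_text):
    """Read the flat record into a plain dict keyed by element name."""
    root = ET.fromstring(xml_text)
    record = {"id": root.get("id")}
    for child in root:
        record[child.tag] = (child.text or "").strip()
    return record


def to_solr_add(record):
    """Build a Solr XML update message: <add><doc><field name="...">...</field></doc></add>."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in record.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(record):
    """Re-express the record as two N-Triples statements about one subject."""
    subject = f"<{BASE}person/{record['id']}>"
    full_name = escape_literal(f"{record['firstName']} {record['lastName']}")
    return [
        f"{subject} <{RDF_TYPE}> <{FOAF}Person> .",
        f'{subject} <{FOAF}name> "{full_name}" .',
    ]


record = parse_person(PARTNER_XML)
solr_message = to_solr_add(record)
triples = to_ntriples(record)
```

The same parsed record feeds both outputs, which mirrors the two-aggregate design: the Solr document keeps the record flat for text search, while the triples give the record a stable subject URI that other datasets could link to.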

Editor's Notes

  1. Components of CIDOC-CRM, FOAF and FRBR-OO ontologies have been reused for the integration of the initial datasets. This is a means to encode people, their existence (birth and death events), their occupations and associations with organisations. More components have been added to record two further events, i.e. creation and production events, and to record works and expressions. Work is underway to plug in SKOS and structure vocabularies, using the data supplied (in EAC type schemas) to manage the range of terminology, e.g. recreational, vocational, professional and occupational. This draft is based on a portion of the data analysed and a "mud map" (based on an assessment of data available through web interfaces). See the draft as a line diagram. A view of the ontology generated in the tool Protégé reveals FRBR-OO as an extension of CIDOC-CRM.
     Draft v0.3 using Initial Datasets
     Limitations with using FOAF to handle personal names (culturally situated) have been found. The CIDOC component E41_Appellation and its subclasses will now be used, collections are being dealt with and further events are being added, e.g. E87_Curation_Activity to reflect actions of selection and collection development. Under discussion are: the inclusion of E90_Symbolic_Object to deal with citations (that are not feasible to strip apart and process but provide useful contextual information for an entity); the creation of "Floruit" as a time-related entity for E21_Person and E74_Group; categorising the datasets and collections as E89_Propositional_Object; and F3_Manifestation_Product_Type to deal with the disambiguation of portable and web formats of works.
  2. This section of the HuNI ontology shows the "joins" and class relationships where the CIDOC-CRM and FRBR-OO ontologies align. The yellow-green bubbles record the CIDOC entities and the red bubbles record the FRBR entities. Bidirectional arrows indicate a "sameAs" relationship; unidirectional arrows indicate a sub-class relationship.
  3. The integration of partner data into HuNI requires two technical components:
     1. Live data feeds (at partner sites). Three technology options are available for the partners to publish their data as XML: jOAI, OAIcat and, for those who are not exposing their data via the OAI-PMH harvesting protocol, a custom-built solution that requires very little work to integrate at a provider’s site. We are not harvesting all the data – we are only harvesting the primary entity classes (and as much of the uniquely identifying information as possible for each class) that are common “touch points” across many of the partner data sites – people, places, events and objects. Therefore, the lowest common denominator for making the partner data harvestable is a flat XML file per class entity, together with the uniquely identifying information. For example, for the person class entity, uniquely identifying information will include first name, last name, date of birth/death, bio, occupation.
     2. A data gateway component called Corbicula. Technology is being deployed to harvest updated content from the partner XML data feeds and transform the data into forms suitable for ingestion into:
        • A Solr search server: this aggregation of harvested XML records is referred to as ‘HuNI Data’
        • A Jena RDF Triple Store: this aggregation of stored RDF Graphs is referred to as ‘HuNI Linked Data’
  4. Based on the data architecture as set out in the original RFP, there is a requirement to harvest, transform and ingest data from each of the partner datasets into some sort of Linked Data store. Very early on in the technical decision-making process, it was agreed that RDF (Resource Description Framework), a metadata modelling specification, would be the lingua franca, and that all the technical components would be developed to work with this Linked Data specification. So we began by:
     • Making some of the partner datasets harvestable to HuNI, by developing a harvest feed for those data providers who were technically able to publish their data in a standard export format/schema (EAC-CPF)
     • Constructing the HuNI ontology and mapping partner data to this common data model. A number of standard cultural heritage ontologies (CIDOC-CRM, FOAF, FRBR-OO, PROV-O) were selected for examination because of their perceived close semantic fit to the nature and types of data in each of the 28 data sources.
     • Deploying a data gateway component, called Corbicula, on the NeCTAR Cloud, which is able to harvest and transform the updated XML data from the partner feeds and ingest it into the RDF Triple Store. Once the mappings for a given data source are known, XSLT scripts are written to interpret the XML records and re-express (transform) them as RDF graphs, essentially capturing the relationships/links between records from all integrated datasets.
     But the integration into RDF has proven to be both semantically and technically complex, because:
     • The publishing format necessary to allow us to do the mappings is too high a technical barrier for most data custodians
     • The data analysis and mapping to a common data model is proving time consuming and complex
     • The gateway component that harvests and transforms the data into RDF using XSLT has performance and memory issues
     • The SPARQL-based search interface development, where people can search and query the graphs, was proving too slow
     As a result, after 10 months of development, only 6 partner data sources have completed their integration journey into the RDF Triple Store, and the search UI isn’t very performant. So back in May it was flagged that there is a real project risk that we will not be able to fully transform all the partner data into Linked Data, and that only a small subset of partner datasets will be barely discoverable through the lab. This was a real problem, given that the main objective of HuNI is to provide a co[…] So the decision was made to ex[…]
     You're probably wondering why we have 2 data aggregates and why we mixed the data architectures. It was purely a project risk management decision: harvesting, mapping, transforming and ingesting into Linked Data is complex and time consuming, and there is a real danger that we won’t have a sufficient Linked Data layer on which to build the lab. So in order to deliver some cross-dataset search capability within the project timeframe, we introduced a new development strand which sees the accelerated harvesting and integration of data into the Solr aggregate.
     So the decision has been made to continue populating the RDF store with partner data for the remainder of 2013, and work on the UI in 2014.
     Populating the Solr search server is easy: HuNI periodically harvests the updated XML records from the partner feeds, processes the XML content via a suitable transform, and submits the transformed XML data into the Solr search server.
     The transformation of partner XML records into HuNI Linked Data is complex and time consuming, and we’ve faced a number of technical issues, which isn’t surprising since we’re using a combination of largely unproven technologies on the scale required for HuNI deployment.
     First, the harvested data had to be cleaned and mapped to a core HuNI ontology. A range of cultural heritage ontologies were examined as the starting point for building this core ontology framework. This has been an iterative process, determined by the nature of each data source and by the main types of data found in each source. The following standard ontologies are being aligned to create the HuNI Ontology:
     • People and Organisations (using the CIDOC-CRM and FOAF ontologies)
     • Items, Collections and Resources (using the PROV-O, CIDOC-CRM, FOAF and FRBR-OO ontologies)
     • Events and Relations (using the PROV-O, CIDOC-CRM, FOAF and FRBR-OO ontologies)
     • Place and Subject (using PROV-O, CIDOC-CRM, FOAF and FRBR-OO ontologies)
     Once the mappings to a common data model are known, the data needs to be technically transformed and ingested. This is made possible through the HuNI gateway component called Corbicula, which performs the following steps:
     1. Periodically harvests updated XML records from the source provider feeds
     2. Uses XSLT to interpret the XML records and re-express (transform) them as RDF graphs
     3. Stores the RDF graphs
     The search feature needs to be based on the linked data, to take advantage of the semantic integration provided by the RDF aggregation.
  5. But of course this is a VL project and not a data integration project
  6. Support the non-linear research methods practiced by humanities researchers. HuNI is about inclusivity and not exclusivity, using 3rd party authentication for login: for a community to form around HuNI, its user base needs to extend beyond scholarly researchers. It is also worth noting that any member of the general public interested in Australian culture, not just scholarly researchers, can run a search across the related databases (the HuNI Data) and share their search results online. There are discovery limitations: whilst the context is given for each record found, what isn’t available are the known relationships between related records across the disparate data sources, so we’re currently working on a ‘Social Linked Data’ feature.
  7. Equipped with a full set of known facets and related data fields for each record type, researchers should be able to interact with, and construct complex queries of, the large-scale aggregation of Linked Data.
  8. Link will be made available on huni.net.au soon
  9. The lab is being designed to support the non-linear research methods practiced in the humanities and creative arts, and will support a workflow centred around discovery, analysis and sharing. As part of the discovery interface a researcher will be able to:
     • Run a free text search across the aggregate and display their results
     • Perform an advanced faceted browse of the aggregate by filtering their results by dataset and entity classes defined in the ontology: people, works, events, organisation, occupation/role, time, place, collections, language, objects.
     • Narrow their search parameters at the start of their search by browsing for information within pre-defined access points. These are likely to be people, works and events since these entity classes are representative across all 28 data sources. Following the initial browse, the user can then filter their search results by dataset and the remaining entity classes.
     • Run a SPARQL query to interrogate the underlying Linked Data
     The discovery interface is also going to enable serendipitous discovery (i.e. the ability to present information to users before they know what they want to search for):
     • You might also be interested in... (based on the semantic relationships captured in the ontology)
     The notion of a generous interface is being included (based on some pre-defined daily query feeds), to give the researcher a sense of what is discoverable:
     • On this day…
     • Most popular searches
     • Most popular records
     The result sets will be displayed in a number of forms, with list being the default and map and timeline being optional. All search results will be displayed with hyperlinks that allow navigation to the source entity and will show the connections between records as per the ontology mappings.
  10. The LORE Tool (developed at UQ) will be made available in the lab, where researchers will be able to:
     • Display existing connections between relevant records held within their virtual collection, and
     • Add further links between particular records, with commentary describing the relationship between them
  11. Researchers will have the option to export their Virtual Collection as a .csv file so they can undertake further computational analysis outside of the HuNI lab and within their preferred tool environment. Whilst the lab will include a Tool Integration Framework specifying how third party tools can integrate within the lab and work with HuNI data, we recognize that tools come and go, and that researchers create their own relationship with their tools of choice. So offering an export function is crucial.
  12. Researchers will have the option to share their virtual collection, and their analysis findings, with other researchers via Facebook, Twitter and email
  13. The development of HuNI is being managed as a project. It has a collaborative governance structure in place so that all key project decisions are only made as part of a consultative process. The PRINCE2 methodology is used to help manage the project. Question of consortial project management… Need to create best practice exemplars at the project management level… Staff in 4 states. Communication by Skype or Google Hangout. Issues around discomfort with these communication technologies. Etc.
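
Notes 9 and 11 describe filtering search results by dataset and entity class, then exporting the resulting virtual collection as a .csv file. The sketch below shows one way those two steps could look in plain Python. It is an illustration under stated assumptions, not the HuNI lab's code: the record fields (`id`, `dataset`, `entity`, `label`), the sample records and the function names are all hypothetical, and only the dataset names are taken from the slides.

```python
import csv
import io
from collections import Counter

# Hypothetical search results; the records themselves are invented for illustration.
RESULTS = [
    {"id": "1", "dataset": "ADB", "entity": "person", "label": "Example Person"},
    {"id": "2", "dataset": "AusStage", "entity": "event", "label": "Example Performance"},
    {"id": "3", "dataset": "AusStage", "entity": "person", "label": "Example Actor"},
]

FIELDS = ["id", "dataset", "entity", "label"]


def facet_counts(records, field):
    """Count how many records fall under each value of one facet."""
    return Counter(record[field] for record in records)


def filter_by(records, **facets):
    """Keep only the records that match every requested facet value."""
    return [
        record
        for record in records
        if all(record.get(name) == value for name, value in facets.items())
    ]


def export_csv(records):
    """Serialise a collection to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


dataset_facet = facet_counts(RESULTS, "dataset")
collection = filter_by(RESULTS, dataset="AusStage", entity="person")
csv_text = export_csv(collection)
```

Keeping the export as plain CSV text fits the reasoning in note 11: the collection can be opened in whatever tool a researcher already prefers, independent of any tool integrated into the lab.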