SlideShare a Scribd company logo
1 of 33
Better together: building services for
public good on top of content from the
global network of open repositories
Petr Knoth
June 21, 2019 – OAI-11, Geneva
Big Scientific Data and Text Analytics Group
Knowledge Media Institute, The Open University
Papers OA sooner and sooner
1/22
https://physicstoday.scitation.org/do/10.1063/PT.6.2.20190418a/full/
JCDL 2019 Best Paper
Award
Papers OA sooner and sooner
The delay between publication and OA availability decreasing
globally
2
Herrmannova, Drahomira; Pontika, Nancy and Knoth, Petr (2019). Do Authors Deposit on Time? Tracking Open Access Policy
Compliance. In: 2019 ACM/IEEE Joint Conference on Digital Libraries, 2-6 Jun 2019, Urbana-Champaign, IL
Faster open access makes repository
infrastructure more important
3
We don’t need just open access
we need fast open access
4
This study was only possible to conduct because
of repositories and aggregators working together
5
Global network of repositories
7
“A single scientific repository is of limited value, real benefits
come from the ability to exchange data within a network …
… interoperability allows us to exploit today's computational
power so that we can aggregate, data mine, create new tools and
services, and generate new knowledge from repository content.”
– Confederation of Open Access Repositories (COAR)
OA Aggregations and BOAI 2002
“To achieve open access to scholarly journal literature, we
recommend two complementary strategies.
• Self-Archiving: First, scholars need the tools and assistance to
deposit their refereed journal articles in open electronic
archives, a practice commonly called, self-archiving. When
these archives conform to standards created by the Open
Archives Initiative, then search engines and other tools can
treat the separate archives as one. Users then need not know
which archives exist or where they are located in order to find
and make use of their contents.
• …”
Budapest Open Access Initiative, 2002
8
CORE’s mission
9
Aggregate all open access research articles worldwide …
… enrich this content and provide seamless access to it through a set
of data services …
Introducing CORE
10
Harvesting data is challenging
11
Harvesting data is challenging
14
Harvesting data is challenging
15
Need for more interoperability across systems
16
World’s largest dataset of Open Access
full texts
• 13,117,488 Hosted full texts
• 135,539,113 Metadata records
• ~78m Links to full texts
• 15TB of raw plain text
• 4,123 Data providers
17/22
CORE Processing pipeline
• Metadata download, extraction and harmonisation
• Full text download
• Text extractions, sections extraction
• Metadata validation and enrichment (DOI, ORCID, etc.)
• Thumbnails generation
• References and citation contexts extraction
• API enrichment (e.g. finding DOIs, linking to other systems)
• Document type classification
• Deduplication
• Indexing
• Exposing (data dumps, API, FastSync)
18
CORE usage
• January 2019 – CORE reached over 10M monthly active
users for the first time
• 571% increase from January 2018
19
Alexa rank
• Within top 6k
global websites:
https://www.alexa.
com/
• core.ac.uk by
usage in the top
0.0009% of
global websites
20
CORE Search
21
• Full text search for OA content
• Faceted searching
• What you find is what you get
• Real change of data providers
wanting to be included
CORE Recommender
• Recommending relevant
content to users from
across all free content
• Recommender plugin for
repositories
• https://core.ac.uk/service
s/recommender/
22
CORE Recommender
• Recommending relevant
content to users from
across all free content
• Recommender plugin for
repositories
• https://core.ac.uk/service
s/recommender/
23
Introducing CORE Discovery
• High coverage of freely
available content
• Free service for
researchers by
researchers. No
company controlling the
pipes.
• Best grip on open
repository content.
• Repository integration
• Discovering documents
without a DOI.
https://core.ac.uk/services/discovery/
CORE Discovery demonstration
CORE Discovery Repository integration
• Majority of articles in
repositories metadata only.
• CORE Discovery
repository plugin:
• turns dead ends of user
journeys into journeys
fulfilling users’ information
needs
• makes repository content
more discoverable.
CORE Search
27
• Full text search for OA content
• Faceted searching
• What you find is what you get
CORE’s raw data services
28
Raw data services – CORE API
29
• Enables the development of new
applications
• Real-time machine access to the
world's largest collection of open
access papers
• Harmonised access to data from
across the network of CORE
providers
• Direct machine access to full texts
of research papers
Raw data services – CORE Dataset
30
• Download millions of research
papers for text and data analysis
• Prototype, analyse and mine your
data in your infrastructure
Raw data services – CORE FastSync
31
• Keeps your data in sync with
research content from around the
world
• Fast and incremental updates as
soon as they become available. No
usage restrictions
• Based on ResourceSync
Use powered by CORE
32
• It is beyond human capacities to read all
scientific literature
• Example use cases in which CORE is applied:
• Improving discovery
• Plagiarism detection
• Question answering in science
• Literature based discovery
• Fact checking and detection of misinformation
• Analysing research trends
• Finding experts in a particular domain
• Research evaluation and scientometrics
• Exploratory and visual search
• …
Working with partners
33
Take home
22/22
• Data providers (repositories, preprint servers, journals, etc.)
and aggregators need to work together to allow text and data
analysis, processing and reuse of large volumes of research
papers.
• CORE provides the tools for programmatically processing
open access data fast, reliably and from across the global
network of repositories.
• If you are a repository manager or a librarian:
• CORE Discovery and Recommender
• If you are a developer or analyst:
• Build your own stuff using CORE‘s data services on top of the
global full text open access courpus
Thank you!
https://core.ac.uk

More Related Content

What's hot

DBpedia - An Interlinking Hub in the Web of Data
DBpedia - An Interlinking Hub in the Web of DataDBpedia - An Interlinking Hub in the Web of Data
DBpedia - An Interlinking Hub in the Web of DataChris Bizer
 
Session 1.2 improving access to digital content by semantic enrichment
Session 1.2   improving access to digital content by semantic enrichmentSession 1.2   improving access to digital content by semantic enrichment
Session 1.2 improving access to digital content by semantic enrichmentsemanticsconference
 
Tufts Spatial Data Rescue: Crawling at-risk Government Data
Tufts Spatial Data Rescue: Crawling at-risk Government DataTufts Spatial Data Rescue: Crawling at-risk Government Data
Tufts Spatial Data Rescue: Crawling at-risk Government DataKyle Monahan
 
Harvesting Using the Open Archives Initiative Protocol: What Your OAI Stream ...
Harvesting Using the Open Archives Initiative Protocol: What Your OAI Stream ...Harvesting Using the Open Archives Initiative Protocol: What Your OAI Stream ...
Harvesting Using the Open Archives Initiative Protocol: What Your OAI Stream ...Sandra McIntyre
 
Using Linked Data Resources to generate web pages based on a BBC case study
Using Linked Data Resources to generate web pages based on a BBC case studyUsing Linked Data Resources to generate web pages based on a BBC case study
Using Linked Data Resources to generate web pages based on a BBC case studyLeila Zemmouchi-Ghomari
 
UKSG webinar: Making Connections - Creating Linked Open Library Data with Nei...
UKSG webinar: Making Connections - Creating Linked Open Library Data with Nei...UKSG webinar: Making Connections - Creating Linked Open Library Data with Nei...
UKSG webinar: Making Connections - Creating Linked Open Library Data with Nei...UKSG: connecting the knowledge community
 
Documents, services, and data on the web
Documents, services, and data on the webDocuments, services, and data on the web
Documents, services, and data on the webChiara Del Vescovo
 
Visualizing Co-authorship Networks for Actionable Insights: Action Design Res...
Visualizing Co-authorship Networks for Actionable Insights: Action Design Res...Visualizing Co-authorship Networks for Actionable Insights: Action Design Res...
Visualizing Co-authorship Networks for Actionable Insights: Action Design Res...Jukka Huhtamäki
 
DataCite and its DOI infrastructure - IASSIST 2013
DataCite and its DOI infrastructure - IASSIST 2013DataCite and its DOI infrastructure - IASSIST 2013
DataCite and its DOI infrastructure - IASSIST 2013Frauke Ziedorn
 
OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...
OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...
OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...OpenAIRE
 
Open Archives Initiatives For Metadata Harvesting
Open Archives Initiatives For Metadata   HarvestingOpen Archives Initiatives For Metadata   Harvesting
Open Archives Initiatives For Metadata HarvestingNikesh Narayanan
 
British Library Linked Open Data Presentation for ALA June 2014
British Library Linked Open Data Presentation for ALA June 2014British Library Linked Open Data Presentation for ALA June 2014
British Library Linked Open Data Presentation for ALA June 2014nw13
 
Going for GOLD - Adventures in Open Linked Geospatial Metadata
Going for GOLD - Adventures in Open Linked Geospatial MetadataGoing for GOLD - Adventures in Open Linked Geospatial Metadata
Going for GOLD - Adventures in Open Linked Geospatial MetadataEDINA, University of Edinburgh
 
TIB's action for research data managament as a national library's strategy in...
TIB's action for research data managament as a national library's strategy in...TIB's action for research data managament as a national library's strategy in...
TIB's action for research data managament as a national library's strategy in...Peter Löwe
 

What's hot (20)

DBpedia - An Interlinking Hub in the Web of Data
DBpedia - An Interlinking Hub in the Web of DataDBpedia - An Interlinking Hub in the Web of Data
DBpedia - An Interlinking Hub in the Web of Data
 
Rusbridge Feb 8 Improving Clarity around Continuing Access
Rusbridge Feb 8 Improving Clarity around Continuing AccessRusbridge Feb 8 Improving Clarity around Continuing Access
Rusbridge Feb 8 Improving Clarity around Continuing Access
 
Session 1.2 improving access to digital content by semantic enrichment
Session 1.2   improving access to digital content by semantic enrichmentSession 1.2   improving access to digital content by semantic enrichment
Session 1.2 improving access to digital content by semantic enrichment
 
Tufts Spatial Data Rescue: Crawling at-risk Government Data
Tufts Spatial Data Rescue: Crawling at-risk Government DataTufts Spatial Data Rescue: Crawling at-risk Government Data
Tufts Spatial Data Rescue: Crawling at-risk Government Data
 
Access to Content via Link Resolvers
Access to Content via Link ResolversAccess to Content via Link Resolvers
Access to Content via Link Resolvers
 
OAI and OAI-PMH
OAI and OAI-PMHOAI and OAI-PMH
OAI and OAI-PMH
 
Harvesting Using the Open Archives Initiative Protocol: What Your OAI Stream ...
Harvesting Using the Open Archives Initiative Protocol: What Your OAI Stream ...Harvesting Using the Open Archives Initiative Protocol: What Your OAI Stream ...
Harvesting Using the Open Archives Initiative Protocol: What Your OAI Stream ...
 
Using Linked Data Resources to generate web pages based on a BBC case study
Using Linked Data Resources to generate web pages based on a BBC case studyUsing Linked Data Resources to generate web pages based on a BBC case study
Using Linked Data Resources to generate web pages based on a BBC case study
 
UKSG webinar: Making Connections - Creating Linked Open Library Data with Nei...
UKSG webinar: Making Connections - Creating Linked Open Library Data with Nei...UKSG webinar: Making Connections - Creating Linked Open Library Data with Nei...
UKSG webinar: Making Connections - Creating Linked Open Library Data with Nei...
 
Documents, services, and data on the web
Documents, services, and data on the webDocuments, services, and data on the web
Documents, services, and data on the web
 
Who is doing what, and how do we know? [PEPRS]
Who is doing what, and how do we know? [PEPRS]Who is doing what, and how do we know? [PEPRS]
Who is doing what, and how do we know? [PEPRS]
 
Deep Impact: Metadata and SUNCAT
Deep Impact: Metadata and SUNCATDeep Impact: Metadata and SUNCAT
Deep Impact: Metadata and SUNCAT
 
Visualizing Co-authorship Networks for Actionable Insights: Action Design Res...
Visualizing Co-authorship Networks for Actionable Insights: Action Design Res...Visualizing Co-authorship Networks for Actionable Insights: Action Design Res...
Visualizing Co-authorship Networks for Actionable Insights: Action Design Res...
 
DataCite and its DOI infrastructure - IASSIST 2013
DataCite and its DOI infrastructure - IASSIST 2013DataCite and its DOI infrastructure - IASSIST 2013
DataCite and its DOI infrastructure - IASSIST 2013
 
OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...
OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...
OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OA...
 
Open Archives Initiatives For Metadata Harvesting
Open Archives Initiatives For Metadata   HarvestingOpen Archives Initiatives For Metadata   Harvesting
Open Archives Initiatives For Metadata Harvesting
 
British Library Linked Open Data Presentation for ALA June 2014
British Library Linked Open Data Presentation for ALA June 2014British Library Linked Open Data Presentation for ALA June 2014
British Library Linked Open Data Presentation for ALA June 2014
 
Going for GOLD - Adventures in Open Linked Geospatial Metadata
Going for GOLD - Adventures in Open Linked Geospatial MetadataGoing for GOLD - Adventures in Open Linked Geospatial Metadata
Going for GOLD - Adventures in Open Linked Geospatial Metadata
 
Open Access Repository Junction
Open Access Repository JunctionOpen Access Repository Junction
Open Access Repository Junction
 
TIB's action for research data managament as a national library's strategy in...
TIB's action for research data managament as a national library's strategy in...TIB's action for research data managament as a national library's strategy in...
TIB's action for research data managament as a national library's strategy in...
 

Similar to Better together: building services for public good on top of content from the global network of open repositories

Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries?Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries? Robin Rice
 
OpenAIRE workshop @ OR2016 - From Repositories, for repositories
OpenAIRE workshop @ OR2016 - From Repositories, for repositoriesOpenAIRE workshop @ OR2016 - From Repositories, for repositories
OpenAIRE workshop @ OR2016 - From Repositories, for repositoriesOpenAIRE
 
Closing the scientific literature access gap with CORE - how to gain free acc...
Closing the scientific literature access gap with CORE - how to gain free acc...Closing the scientific literature access gap with CORE - how to gain free acc...
Closing the scientific literature access gap with CORE - how to gain free acc...Nancy Pontika
 
Authors' and Publications' Citations knowledge base
Authors' and Publications' Citations knowledge base Authors' and Publications' Citations knowledge base
Authors' and Publications' Citations knowledge base Leila Zemmouchi-Ghomari
 
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020OpenAIRE
 
Enabling better science: Results and vision of the OpenAIRE infrastructure an...
Enabling better science: Results and vision of the OpenAIRE infrastructure an...Enabling better science: Results and vision of the OpenAIRE infrastructure an...
Enabling better science: Results and vision of the OpenAIRE infrastructure an...OpenAIRE
 
Enabling better science - Results and vision of the OpenAIRE infrastructure a...
Enabling better science - Results and vision of the OpenAIRE infrastructure a...Enabling better science - Results and vision of the OpenAIRE infrastructure a...
Enabling better science - Results and vision of the OpenAIRE infrastructure a...Paolo Manghi
 
Next Generation Repositories
Next Generation RepositoriesNext Generation Repositories
Next Generation Repositoriesukcorr
 
finde datasets repository.pptx
finde datasets repository.pptxfinde datasets repository.pptx
finde datasets repository.pptxhasanrdhaiwi
 
Instutional repositories and data
Instutional repositories and dataInstutional repositories and data
Instutional repositories and dataAndrew Treloar
 
Research Data Management in GLAM: Managing Data for Cultural Heritage
Research Data Management in GLAM: Managing Data for Cultural HeritageResearch Data Management in GLAM: Managing Data for Cultural Heritage
Research Data Management in GLAM: Managing Data for Cultural HeritageSarah Anna Stewart
 
Libraries Enabling Open Science: LIBER Strategy & Advocacy
Libraries Enabling Open Science: LIBER Strategy & AdvocacyLibraries Enabling Open Science: LIBER Strategy & Advocacy
Libraries Enabling Open Science: LIBER Strategy & AdvocacyLIBER Europe
 
Institutional Repositories
Institutional RepositoriesInstitutional Repositories
Institutional RepositoriesSridhar Gutam
 
2014 ALA MW SPARC-ACRL Forum Talk
2014 ALA MW SPARC-ACRL Forum Talk2014 ALA MW SPARC-ACRL Forum Talk
2014 ALA MW SPARC-ACRL Forum TalkPaul Bracke
 
OpenMinTeD: Making Sense of Large Volumes of Data
OpenMinTeD: Making Sense of Large Volumes of DataOpenMinTeD: Making Sense of Large Volumes of Data
OpenMinTeD: Making Sense of Large Volumes of Dataopenminted_eu
 
Text mining in CORE (OR2012)
Text mining in CORE (OR2012)Text mining in CORE (OR2012)
Text mining in CORE (OR2012)petrknoth
 
Research Data Management at Imperial College London
Research Data Management at Imperial College LondonResearch Data Management at Imperial College London
Research Data Management at Imperial College LondonSarah Anna Stewart
 
Why would a publisher care about open data?
Why would a publisher care about open data?Why would a publisher care about open data?
Why would a publisher care about open data?Anita de Waard
 

Similar to Better together: building services for public good on top of content from the global network of open repositories (20)

Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries?Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries?
 
OpenAIRE workshop @ OR2016 - From Repositories, for repositories
OpenAIRE workshop @ OR2016 - From Repositories, for repositoriesOpenAIRE workshop @ OR2016 - From Repositories, for repositories
OpenAIRE workshop @ OR2016 - From Repositories, for repositories
 
Closing the scientific literature access gap with CORE - how to gain free acc...
Closing the scientific literature access gap with CORE - how to gain free acc...Closing the scientific literature access gap with CORE - how to gain free acc...
Closing the scientific literature access gap with CORE - how to gain free acc...
 
Authors' and Publications' Citations knowledge base
Authors' and Publications' Citations knowledge base Authors' and Publications' Citations knowledge base
Authors' and Publications' Citations knowledge base
 
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
 
Enabling better science: Results and vision of the OpenAIRE infrastructure an...
Enabling better science: Results and vision of the OpenAIRE infrastructure an...Enabling better science: Results and vision of the OpenAIRE infrastructure an...
Enabling better science: Results and vision of the OpenAIRE infrastructure an...
 
Enabling better science - Results and vision of the OpenAIRE infrastructure a...
Enabling better science - Results and vision of the OpenAIRE infrastructure a...Enabling better science - Results and vision of the OpenAIRE infrastructure a...
Enabling better science - Results and vision of the OpenAIRE infrastructure a...
 
Next Generation Repositories
Next Generation RepositoriesNext Generation Repositories
Next Generation Repositories
 
finde datasets repository.pptx
finde datasets repository.pptxfinde datasets repository.pptx
finde datasets repository.pptx
 
Scholze liber 2015-06-25_final
Scholze liber 2015-06-25_finalScholze liber 2015-06-25_final
Scholze liber 2015-06-25_final
 
Instutional repositories and data
Instutional repositories and dataInstutional repositories and data
Instutional repositories and data
 
Research Data Management in GLAM: Managing Data for Cultural Heritage
Research Data Management in GLAM: Managing Data for Cultural HeritageResearch Data Management in GLAM: Managing Data for Cultural Heritage
Research Data Management in GLAM: Managing Data for Cultural Heritage
 
Libraries Enabling Open Science: LIBER Strategy & Advocacy
Libraries Enabling Open Science: LIBER Strategy & AdvocacyLibraries Enabling Open Science: LIBER Strategy & Advocacy
Libraries Enabling Open Science: LIBER Strategy & Advocacy
 
Institutional Repositories
Institutional RepositoriesInstitutional Repositories
Institutional Repositories
 
2014 ALA MW SPARC-ACRL Forum Talk
2014 ALA MW SPARC-ACRL Forum Talk2014 ALA MW SPARC-ACRL Forum Talk
2014 ALA MW SPARC-ACRL Forum Talk
 
OpenMinTeD: Making Sense of Large Volumes of Data
OpenMinTeD: Making Sense of Large Volumes of DataOpenMinTeD: Making Sense of Large Volumes of Data
OpenMinTeD: Making Sense of Large Volumes of Data
 
Text mining in CORE (OR2012)
Text mining in CORE (OR2012)Text mining in CORE (OR2012)
Text mining in CORE (OR2012)
 
Research Data Management at Imperial College London
Research Data Management at Imperial College LondonResearch Data Management at Imperial College London
Research Data Management at Imperial College London
 
Why would a publisher care about open data?
Why would a publisher care about open data?Why would a publisher care about open data?
Why would a publisher care about open data?
 
UKSG 2018 Breakout - Setting your cites to open I4OC - Maccallum
UKSG 2018 Breakout - Setting your cites to open I4OC - MaccallumUKSG 2018 Breakout - Setting your cites to open I4OC - Maccallum
UKSG 2018 Breakout - Setting your cites to open I4OC - Maccallum
 

More from petrknoth

Qui Bono? Cumulative advantage in open access publishing
Qui Bono? Cumulative advantage in open access publishingQui Bono? Cumulative advantage in open access publishing
Qui Bono? Cumulative advantage in open access publishingpetrknoth
 
OAI Identifiers: Decentralised PIDs for Research Outputs in Repositories
OAI Identifiers: Decentralised PIDs for Research Outputs in RepositoriesOAI Identifiers: Decentralised PIDs for Research Outputs in Repositories
OAI Identifiers: Decentralised PIDs for Research Outputs in Repositoriespetrknoth
 
UKRI OA policy requirements for repositories and how to meet them
UKRI OA policy requirements for repositories and how to meet themUKRI OA policy requirements for repositories and how to meet them
UKRI OA policy requirements for repositories and how to meet thempetrknoth
 
Enabling Educators to Locate High-Quality Teaching Resources
Enabling Educators to LocateHigh-Quality Teaching ResourcesEnabling Educators to LocateHigh-Quality Teaching Resources
Enabling Educators to Locate High-Quality Teaching Resourcespetrknoth
 
Tracking compliance of the REF2021 policy with the CORE Repository Dashboard
Tracking compliance of the REF2021 policy with the CORE Repository DashboardTracking compliance of the REF2021 policy with the CORE Repository Dashboard
Tracking compliance of the REF2021 policy with the CORE Repository Dashboardpetrknoth
 
Better together: building services for public good on top of content from the...
Better together: building services for public good on top of content from the...Better together: building services for public good on top of content from the...
Better together: building services for public good on top of content from the...petrknoth
 
CORE Analytics Dashboard
CORE Analytics DashboardCORE Analytics Dashboard
CORE Analytics Dashboardpetrknoth
 
Analysing the performance of open access papers discovery tools
Analysing the performance of open access papers discovery toolsAnalysing the performance of open access papers discovery tools
Analysing the performance of open access papers discovery toolspetrknoth
 
Assessing Compliance with the UK REF 2021 Open Access Policy
Assessing Compliance with the UK REF 2021 Open Access PolicyAssessing Compliance with the UK REF 2021 Open Access Policy
Assessing Compliance with the UK REF 2021 Open Access Policypetrknoth
 
Data interoperability toolkit (OpenMinTeD)
Data interoperability toolkit (OpenMinTeD)Data interoperability toolkit (OpenMinTeD)
Data interoperability toolkit (OpenMinTeD)petrknoth
 
Integrating research indicators for use in the repositories infrastructure
Integrating research indicators for use in the repositories infrastructure Integrating research indicators for use in the repositories infrastructure
Integrating research indicators for use in the repositories infrastructure petrknoth
 
Towards effective research recommender systems for repositories
Towards effective research recommender systems for repositoriesTowards effective research recommender systems for repositories
Towards effective research recommender systems for repositoriespetrknoth
 
COAR Next Generation Repositories WG - Text mining and Recommender system sto...
COAR Next Generation Repositories WG - Text mining and Recommender system sto...COAR Next Generation Repositories WG - Text mining and Recommender system sto...
COAR Next Generation Repositories WG - Text mining and Recommender system sto...petrknoth
 
Seamless access to the world’s open access research papers via ResourceSync
Seamless access to the world’s open access research papers via ResourceSyncSeamless access to the world’s open access research papers via ResourceSync
Seamless access to the world’s open access research papers via ResourceSyncpetrknoth
 
Semantometrics: Towards Fulltext-based Research Evaluation
Semantometrics: Towards Fulltext-based Research EvaluationSemantometrics: Towards Fulltext-based Research Evaluation
Semantometrics: Towards Fulltext-based Research Evaluationpetrknoth
 
Aggregating Research papers from Publishers' Systems to Support Text and Data...
Aggregating Research papers from Publishers' Systems to Support Text and Data...Aggregating Research papers from Publishers' Systems to Support Text and Data...
Aggregating Research papers from Publishers' Systems to Support Text and Data...petrknoth
 
My repository is being aggregated: a blessing or a curse?
My repository is being aggregated: a blessing or a curse?My repository is being aggregated: a blessing or a curse?
My repository is being aggregated: a blessing or a curse?petrknoth
 
FOSTER - Content Delivery (WP3)
FOSTER - Content Delivery (WP3)FOSTER - Content Delivery (WP3)
FOSTER - Content Delivery (WP3)petrknoth
 
Towards an Infrastructure for Mining Scientific Publications
Towards an Infrastructure for Mining Scientific PublicationsTowards an Infrastructure for Mining Scientific Publications
Towards an Infrastructure for Mining Scientific Publicationspetrknoth
 

More from petrknoth (20)

Qui Bono? Cumulative advantage in open access publishing
Qui Bono? Cumulative advantage in open access publishingQui Bono? Cumulative advantage in open access publishing
Qui Bono? Cumulative advantage in open access publishing
 
CORE APIv3
CORE APIv3CORE APIv3
CORE APIv3
 
OAI Identifiers: Decentralised PIDs for Research Outputs in Repositories
OAI Identifiers: Decentralised PIDs for Research Outputs in RepositoriesOAI Identifiers: Decentralised PIDs for Research Outputs in Repositories
OAI Identifiers: Decentralised PIDs for Research Outputs in Repositories
 
UKRI OA policy requirements for repositories and how to meet them
UKRI OA policy requirements for repositories and how to meet themUKRI OA policy requirements for repositories and how to meet them
UKRI OA policy requirements for repositories and how to meet them
 
Enabling Educators to Locate High-Quality Teaching Resources
Enabling Educators to LocateHigh-Quality Teaching ResourcesEnabling Educators to LocateHigh-Quality Teaching Resources
Enabling Educators to Locate High-Quality Teaching Resources
 
Tracking compliance of the REF2021 policy with the CORE Repository Dashboard
Tracking compliance of the REF2021 policy with the CORE Repository DashboardTracking compliance of the REF2021 policy with the CORE Repository Dashboard
Tracking compliance of the REF2021 policy with the CORE Repository Dashboard
 
Better together: building services for public good on top of content from the...
Better together: building services for public good on top of content from the...Better together: building services for public good on top of content from the...
Better together: building services for public good on top of content from the...
 
CORE Analytics Dashboard
CORE Analytics DashboardCORE Analytics Dashboard
CORE Analytics Dashboard
 
Analysing the performance of open access papers discovery tools
Analysing the performance of open access papers discovery toolsAnalysing the performance of open access papers discovery tools
Analysing the performance of open access papers discovery tools
 
Assessing Compliance with the UK REF 2021 Open Access Policy
Assessing Compliance with the UK REF 2021 Open Access PolicyAssessing Compliance with the UK REF 2021 Open Access Policy
Assessing Compliance with the UK REF 2021 Open Access Policy
 
Data interoperability toolkit (OpenMinTeD)
Data interoperability toolkit (OpenMinTeD)Data interoperability toolkit (OpenMinTeD)
Data interoperability toolkit (OpenMinTeD)
 
Integrating research indicators for use in the repositories infrastructure
Integrating research indicators for use in the repositories infrastructure Integrating research indicators for use in the repositories infrastructure
Integrating research indicators for use in the repositories infrastructure
 
Towards effective research recommender systems for repositories
Towards effective research recommender systems for repositoriesTowards effective research recommender systems for repositories
Towards effective research recommender systems for repositories
 
COAR Next Generation Repositories WG - Text mining and Recommender system sto...
COAR Next Generation Repositories WG - Text mining and Recommender system sto...COAR Next Generation Repositories WG - Text mining and Recommender system sto...
COAR Next Generation Repositories WG - Text mining and Recommender system sto...
 
Seamless access to the world’s open access research papers via ResourceSync
Seamless access to the world’s open access research papers via ResourceSyncSeamless access to the world’s open access research papers via ResourceSync
Seamless access to the world’s open access research papers via ResourceSync
 
Semantometrics: Towards Fulltext-based Research Evaluation
Semantometrics: Towards Fulltext-based Research EvaluationSemantometrics: Towards Fulltext-based Research Evaluation
Semantometrics: Towards Fulltext-based Research Evaluation
 
Aggregating Research papers from Publishers' Systems to Support Text and Data...
Aggregating Research papers from Publishers' Systems to Support Text and Data...Aggregating Research papers from Publishers' Systems to Support Text and Data...
Aggregating Research papers from Publishers' Systems to Support Text and Data...
 
My repository is being aggregated: a blessing or a curse?
My repository is being aggregated: a blessing or a curse?My repository is being aggregated: a blessing or a curse?
My repository is being aggregated: a blessing or a curse?
 
FOSTER - Content Delivery (WP3)
FOSTER - Content Delivery (WP3)FOSTER - Content Delivery (WP3)
FOSTER - Content Delivery (WP3)
 
Towards an Infrastructure for Mining Scientific Publications
Towards an Infrastructure for Mining Scientific PublicationsTowards an Infrastructure for Mining Scientific Publications
Towards an Infrastructure for Mining Scientific Publications
 

Recently uploaded

Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 

Recently uploaded (20)

Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 

Better together: building services for public good on top of content from the global network of open repositories

  • 1. Better together: building services for public good on top of content from the global network of open repositories Petr Knoth June 21, 2019 – OAI-11, Geneva Big Scientific Data and Text Analytics Group Knowledge Media Institute, The Open University
  • 2. Papers OA sooner and sooner 1/22 https://physicstoday.scitation.org/do/10.1063/PT.6.2.20190418a/full/ JCDL 2019 Best Paper Award
  • 3. Papers OA sooner and sooner The delay between publication and OA availability decreasing globally 2 Herrmannova, Drahomira; Pontika, Nancy and Knoth, Petr (2019). Do Authors Deposit on Time? Tracking Open Access Policy Compliance. In: 2019 ACM/IEEE Joint Conference on Digital Libraries, 2-6 Jun 2019, Urbana-Champaign, IL
  • 4. Faster open access makes repository infrastructure more important 3
  • 5. We don’t need just open access we need fast open access 4
  • 6. This study was only possible to conduct because of repositories and aggregators working together 5
  • 7. Global network of repositories 7 “A single scientific repository is of limited value, real benefits come from the ability to exchange data within a network … … interoperability allows us to exploit today's computational power so that we can aggregate, data mine, create new tools and services, and generate new knowledge from repository content.” – Confederation of Open Access Repositories (COAR)
  • 8. OA Aggregations and BOAI 2002 “To achieve open access to scholarly journal literature, we recommend two complementary strategies. • Self-Archiving: First, scholars need the tools and assistance to deposit their refereed journal articles in open electronic archives, a practice commonly called, self-archiving. When these archives conform to standards created by the Open Archives Initiative, then search engines and other tools can treat the separate archives as one. Users then need not know which archives exist or where they are located in order to find and make use of their contents. • …” Budapest Open Access Initiative, 2002 8
  • 9. CORE’s mission 9 Aggregate all open access research articles worldwide … … enrich this content and provide seamless access to it through a set of data services …
  • 11. Harvesting data is challenging 11
  • 12. Harvesting data is challenging 14
  • 13. Harvesting data is challenging 15
  • 14. Need for more interoperability across systems 16
  • 15. World’s largest dataset of Open Access full texts • 13,117,488 Hosted full texts • 135,539,113 Metadata records • ~78m Links to full texts • 15TB of raw plain text • 4,123 Data providers 17/22
  • 16. CORE Processing pipeline • Metadata download, extraction and harmonisation • Full text download • Text extractions, sections extraction • Metadata validation and enrichment (DOI, ORCID, etc.) • Thumbnails generation • References and citation contexts extraction • API enrichment (e.g. finding DOIs, linking to other systems) • Document type classification • Deduplication • Indexing • Exposing (data dumps, API, FastSync) 18
  • 17. CORE usage • January 2019 – CORE reached over 10M monthly active users for the first time • 571% increase from January 2018 19
  • 18. Alexa rank • Within top 6k global websites: https://www.alexa. com/ • core.ac.uk by usage in the top 0.0009% of global websites 20
  • 19. CORE Search 21 • Full text search for OA content • Faceted searching • What you find is what you get • Real change of data providers wanting to be included
  • 20. CORE Recommender • Recommending relevant content to users from across all free content • Recommender plugin for repositories • https://core.ac.uk/service s/recommender/ 22
  • 21. CORE Recommender • Recommending relevant content to users from across all free content • Recommender plugin for repositories • https://core.ac.uk/service s/recommender/ 23
  • 22. Introducing CORE Discovery • High coverage of freely available content • Free service for researchers by researchers. No company controlling the pipes. • Best grip on open repository content. • Repository integration • Discovering documents without a DOI. https://core.ac.uk/services/discovery/
  • 24. CORE Discovery Repository integration • Majority of articles in repositories metadata only. • CORE Discovery repository plugin: • turns dead ends of user journeys into journeys fulfilling users’ information needs • makes repository content more discoverable.
  • 25. CORE Search 27 • Full text search for OA content • Faceted searching • What you find is what you get
  • 26. CORE’s raw data services 28
  • 27. Raw data services – CORE API 29 • Enables the development of new applications • Real-time machine access to the world's largest collection of open access papers • Harmonised access to data from across the network of CORE providers • Direct machine access to full texts of research papers
  • 28. Raw data services – CORE Dataset 30 • Download millions of research papers for text and data analysis • Prototype, analyse and mine your data in your infrastructure
  • 29. Raw data services – CORE FastSync 31 • Keeps your data in sync with research content from around the world • Fast and incremental updates as soon as they become available. No usage restrictions • Based on ResourceSync
  • 30. Use powered by CORE 32 • It is beyond human capacities to read all scientific literature • Example use cases in which CORE is applied: • Improving discovery • Plagiarism detection • Question answering in science • Literature based discovery • Fact checking and detection of misinformation • Analysing research trends • Finding experts in a particular domain • Research evaluation and scientometrics • Exploratory and visual search • …
  • 32. Take home 22/22 • Data providers (repositories, preprint servers, journals, etc.) and aggregators need to work together to allow text and data analysis, processing and reuse of large volumes of research papers. • CORE provides the tools for programmatically processing open access data fast, reliably and from across the global network of repositories. • If you are a repository manager or a librarian: • CORE Discovery and Recommender • If you are a developer or analyst: • Build your own stuff using CORE‘s data services on top of the global full text open access courpus

Editor's Notes

  1. Paper are being made deposited into open repositories faster and faster after acceptance
  2. Paper are being made deposited into open repositories faster and faster after acceptance It is a global trend
  3. The amazing thing about this is that the sooner the content becomes available in a repository the more important the repository infrastructure becomes. The vast majority of transactions/downloads of OA literature is for recently published content. Many researchers care about getting access to content soon after it gets released If contnt becomes available in repositories before it gets published, repositories have the potential top become the primary places from which we will access research.
  4. For the OA movement to succeed soon we don’t need just OA but fast OA
  5. If we want the ability to run global analyses on open access data, if we want to be able to monitor our progress, we need repositories and services that have a global view on repositories to work together.
  6. The vision that many data providers can be treated as one
  7. CORE offers seamless, unrestricted access to millions of research papers. CORE hosts the world’s largest collection of open access full texts, which are used by researchers, libraries, software developers, funders and many more. CORE's aggregated content comes from thousands of institutional and subject repositories as well as journals and covers all research disciplines. In January 2019, CORE has hit the mark of 10 million monthly active users (10.41 million users) making core.ac.uk a top 6k website globally by usage according to the independent Alexa Rank and one of the most world's most widely used Open Access platforms. In this talk, Petr will present the CORE service, discuss current challenges in harvesting content from open repositories and share his experience of building value-added services for the society on top of open content.
  8. Mention preprints and large repositories No sufficient support for full text harvesting Previously questioned, not any more
  9. We started aggregating from repositories, we then added aggregating OA from publishers, we are now working to aggregate OA content from anywhere on the Web.
  10. We started aggregating from repositories, we then added aggregating OA from publishers, we are now working to aggregate OA content from anywhere on the Web.
  11. Hard job to do this because more interoperability needed.
  12. Non-trivial pipeline Machine learning involved The idea that one person can do this on their own doesn’t really hold Data providers and repositories must collaborate
  13. When all the processing is done. People appreciate the product.
  14. Highest coverage of freely available content. Our tests have shown CORE Discovery finding more free content than any other discovery system. Free service for researchers by researchers. CORE Discovery is the only free content discovery extension developed by researchers for researchers. There is no major publisher or enterprise controlling and profiting from your usage data. Best grip on open repository content. Due to CORE being a leader in harvesting open access literature, CORE Discovery has the best grip on open content from open repositories as opposed to other services that disproportionately focus only on content indexed in major commercial databases. Repository integration and discovering documents without a DOI. The only service offering seamless and free integration into repositories. CORE Discovery is also the only discovery system that can locate scientific content even for items with an unknown DOI or which do not have a DOI.
  15. Open access discovery tools locate freely available copies of research papers which might be behind the paywall K[.*]io
  16. Enable others to programmatically interact with CORE data. Essential for developing applications that make use of information about research papers Supporting a variety of use cases >500 registered users, including companies for CORE API Use: CORE API to interact with CORE in real time CORE Data Dumps to run batch process, develop prototype applications and do research CORE SDKs to gain easy access to CORE API from several programming languages CORE FastSync
  17. Enable others to programmatically interact with CORE data. Essential for developing applications that make use of information about research papers Supporting a variety of use cases >500 registered users, including companies for CORE API Use: CORE API to interact with CORE in real time CORE Data Dumps to run batch process, develop prototype applications and do research CORE SDKs to gain easy access to CORE API from several programming languages CORE FastSync
  18. Enable others to programmatically interact with CORE data. Essential for developing applications that make use of information about research papers Supporting a variety of use cases >500 registered users, including companies for CORE API Use: CORE API to interact with CORE in real time CORE Data Dumps to run batch process, develop prototype applications and do research CORE SDKs to gain easy access to CORE API from several programming languages CORE FastSync
  19. Enable others to programmatically interact with CORE data. Essential for developing applications that make use of information about research papers Supporting a variety of use cases >500 registered users, including companies for CORE API Use: CORE API to interact with CORE in real time CORE Data Dumps to run batch process, develop prototype applications and do research CORE SDKs to gain easy access to CORE API from several programming languages CORE FastSync
  20. Put some quotes here/logos