Introducing
CrossRef
Prospect
Cambridge, MA
2013
Geoffrey Bilder
Director of Strategic Initiatives
Taking the tedium out of TDM….

Text & Data
Mining
Gold
Diamond

Text & Data

?
The Problem
Institute * Channel View Publications, Ltd * Chartered Institution Of
Building Service Engineers * Chattagram Maa-O-Shishu Hospital
Medical College * Chelonian Conservation And Biology Journal *
Chelonian Research Foundation * Chem-Bio Informatics Society *
Chemical Engineering Diponegoro University * Chemical Science
Transactions * Chemical Society Of Japan * Chiang Mai University *
Children, Youth And Environments Center * Chimera Innova Group
* China Agricultural University * China Communications Magazine,
Co., Ltd. * China Journal Of Chinese Materia Medica * China
Petroleum Industry Press * China Science Publishing & Media Ltd. *
Chinese Astronomical Society * Chinese Birds * Chinese Birds (Press)
* Chinese Civilisation Centre * Chinese Geoscience Union * Chinese
Institute Of Automation Engineers (Ciae) * Chinese Journal Of
Mechanical Engineering * Chinese Mathematical Society * Chinese
Physical Society * Chinese Physiological Society * Chinese Society
Of Theoretical And Applied Mechanics * Chonnam National
University Medical School (Kamje) * Christ University Bangalore *
• All parties would benefit from support of standard APIs
and data representations in order to enable TDM across
both open access and subscription-based publishers.

• Subscription-based publishers find it impractical to
negotiate multiple bilateral agreements with thousands
of researchers and institutions in order to authorize TDM
of subscribed content.

• Researchers find it impractical to negotiate multiple
bilateral agreements with hundreds of subscription-based
publishers in order to authorize TDM of subscribed
content.
Common API
DOI
Content
Negotiation
http://dx.doi.org/10.5555/12345678
(Accept: text/html)
http://dx.doi.org/10.5555/12345678
(Accept: application/bibjson+json)
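As a rough illustration of the content negotiation shown on this slide, the sketch below builds two requests for the same DOI URL that differ only in their Accept header. The helper name build_cn_request is hypothetical, the DOI is the test DOI used later in the deck, and nothing is sent over the network.

```python
import urllib.request

DOI_RESOLVER = "http://dx.doi.org/"  # resolver address as shown on the slide


def build_cn_request(doi, media_type):
    """Build (but do not send) a content-negotiated request for a DOI.

    Hypothetical helper for illustration; not part of any CrossRef library.
    """
    return urllib.request.Request(DOI_RESOLVER + doi, headers={"Accept": media_type})


# Same URL, two different representations requested.
html_req = build_cn_request("10.5555/515151", "text/html")
meta_req = build_cn_request("10.5555/515151", "application/bibjson+json")

print(html_req.full_url, "->", html_req.get_header("Accept"))
print(meta_req.full_url, "->", meta_req.get_header("Accept"))
```

Passing either request to urllib.request.urlopen would send it and follow redirects, which is roughly what the -L flag does in the curl examples.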
New Metadata
Full Text Link
License Information
Rate Limiting
(Optional)
Prospect HTTP
Headers
CR-Prospect-Rate-Limit: 1500
(the rate limit ceiling per window on Prospect requests)

CR-Prospect-Rate-Limit-Remaining: 1387
(number of requests left for the current window)

CR-Prospect-Rate-Limit-Reset: 1378072800
(the time, in UTC epoch seconds, at which the rate limit
resets and a new window starts)

*This is a technique used by many APIs, including Twitter’s.
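A client could use these headers to pace itself. The sketch below is one possible reading, not a reference implementation: it assumes the Reset value is an absolute Unix timestamp (which the sample value 1378072800 suggests) and that the headers arrive as a plain dict with exactly the names shown on the slide. The function name is hypothetical.

```python
import time


def seconds_until_next_request(headers, now=None):
    """Return how long a client should wait before its next Prospect request.

    Assumption: CR-Prospect-Rate-Limit-Reset is an absolute UTC epoch timestamp.
    Missing headers are treated as "no limit known", so the wait is zero.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("CR-Prospect-Rate-Limit-Remaining", 1))
    reset_at = float(headers.get("CR-Prospect-Rate-Limit-Reset", 0))
    if remaining > 0:
        return 0.0  # requests are still available in this window
    return max(0.0, reset_at - now)  # wait until the window resets


# Header values taken from the slide.
slide_headers = {
    "CR-Prospect-Rate-Limit": "1500",
    "CR-Prospect-Rate-Limit-Remaining": "1387",
    "CR-Prospect-Rate-Limit-Reset": "1378072800",
}
# Same window, but with the allowance used up (illustrative value).
exhausted_headers = dict(slide_headers, **{"CR-Prospect-Rate-Limit-Remaining": "0"})

print(seconds_until_next_request(slide_headers, now=1378072500))
print(seconds_until_next_request(exhausted_headers, now=1378072500))
```

Real HTTP header names are case-insensitive, so a production client would normalise them before the lookup.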
Common API Summary
• Content Negotiation (Required)
• New Metadata (Required)
  • Full text URIs
  • License URIs
• Rate Limiting Headers (Optional)
Stop here if
• You are an open access publisher
• You include TDM as a part of your
subscription license/T&Cs.
Click-Through
License
Service
(Optional)
Researcher queries DOI using CN (content negotiation) + API token
Publisher verifies API token with Prospect
(frequency at publisher discretion)
If token verified AND access control allows,
publisher returns full text
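The publisher-side decision above can be sketched as follows. Everything here is hypothetical: TokenCache, may_return_full_text, and the verify callable (which stands in for the real call to Prospect) are illustrative names, and the cache lifetime models the "frequency at publisher discretion" note.

```python
import time


class TokenCache:
    """Remember token verification results for a publisher-chosen period."""

    def __init__(self, verify, max_age_seconds):
        self._verify = verify            # callable(token) -> bool; stands in for Prospect
        self._max_age = max_age_seconds  # how often the publisher re-checks a token
        self._seen = {}                  # token -> (verified, checked_at)

    def is_verified(self, token, now=None):
        now = time.time() if now is None else now
        cached = self._seen.get(token)
        if cached is not None and now - cached[1] < self._max_age:
            return cached[0]             # recent enough: reuse the earlier answer
        verified = self._verify(token)   # otherwise ask the verifier again
        self._seen[token] = (verified, now)
        return verified


def may_return_full_text(token, cache, access_allowed, now=None):
    """Full text is returned only if the token verifies AND access control allows."""
    return cache.is_verified(token, now=now) and access_allowed


if __name__ == "__main__":
    # Stand-in verifier that accepts a single made-up token.
    demo_cache = TokenCache(lambda token: token == "demo-token", max_age_seconds=3600)
    print(may_return_full_text("demo-token", demo_cache, access_allowed=True))
    print(may_return_full_text("demo-token", demo_cache, access_allowed=False))
```

A longer cache lifetime means fewer calls to Prospect but a slower reaction when a researcher changes a license decision; the slide leaves that trade-off to the publisher.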
Researcher queries DOI
using CN + API token
curl -H "Accept: text/turtle" "http://dx.doi.org/10.5555/515151" -D - -L
Link: <http://data.crossref.org/full-text/10.5555/515151>;
rel="http://id.crossref.org/schema/full-text";
anchor="http://annalsofpsychoceramics.labs.crossref.org/fulltext/515151/515151.pdf"
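A TDM tool has to pull the full-text location out of the Link header shown above. The sketch below is a deliberately simple parser, assuming a single link per header value and double-quoted parameters as in the slide; it is not a complete implementation of the Link header grammar, and parse_link_header is a hypothetical name.

```python
import re


def parse_link_header(value):
    """Split one Link header value into its target URI and its parameters.

    Simplifying assumptions: exactly one <target>, and every parameter is
    written as name="value" with double quotes.
    """
    target = re.match(r'\s*<([^>]*)>', value)
    if target is None:
        raise ValueError("no <target> found in Link header")
    params = dict(re.findall(r';\s*([\w-]+)="([^"]*)"', value))
    return target.group(1), params


# The header value from the slide, joined onto one line.
link_value = (
    '<http://data.crossref.org/full-text/10.5555/515151>; '
    'rel="http://id.crossref.org/schema/full-text"; '
    'anchor="http://annalsofpsychoceramics.labs.crossref.org/fulltext/515151/515151.pdf"'
)

link_target, link_params = parse_link_header(link_value)
print(link_target)
print(link_params["rel"])
print(link_params["anchor"])
```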
Publisher verifies API
token with Prospect
curl -H "CR-Prospect-Publisher-Token: MdvA59fGn8ukykYlSxJL6g" "https://prospect.crossref.org/licenses/hZqJDbcbKSSRgRG_PJxSBA" -D - -L
{
  "result": "ok",
  "message": "licenses",
  "orcid": "0000-0002-1825-0097",
  "given_names": "Josiah",
  "family_name": "Carberry",
  "licenses": [
    {
      "uri": "http://www.crossref.org/tdm_license",
      "status": "rejected",
      "reviewed_at": "2013-05-28T17:09:36+00:00"
    },
    {
      "uri": "http://www.oxygenxml.com/",
      "status": "read",
      "reviewed_at": "2013-05-29T12:08:59+00:00"
    }
  ]
}
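A publisher platform would typically reduce this response to a per-license status lookup. The sketch below works on the sample response as shown; the function names are hypothetical, and the "accepted" status value is an assumption, since the sample only shows "rejected" and "read".

```python
import json

# The sample response from the slide, in compact form.
SAMPLE_RESPONSE = """
{"result": "ok", "message": "licenses",
 "orcid": "0000-0002-1825-0097",
 "given_names": "Josiah", "family_name": "Carberry",
 "licenses": [
   {"uri": "http://www.crossref.org/tdm_license", "status": "rejected",
    "reviewed_at": "2013-05-28T17:09:36+00:00"},
   {"uri": "http://www.oxygenxml.com/", "status": "read",
    "reviewed_at": "2013-05-29T12:08:59+00:00"}
 ]}
"""


def license_statuses(body):
    """Map each license URI in a Prospect-style response to its status."""
    data = json.loads(body)
    if data.get("result") != "ok":
        raise ValueError("verification response was not ok")
    return {lic["uri"]: lic["status"] for lic in data.get("licenses", [])}


def has_accepted(body, license_uri, accepted_status="accepted"):
    """True if the researcher's status for license_uri equals accepted_status.

    Assumption: "accepted" is the status for an agreed license; the slide
    does not show that value.
    """
    return license_statuses(body).get(license_uri) == accepted_status


statuses = license_statuses(SAMPLE_RESPONSE)
print(statuses)
print(has_accepted(SAMPLE_RESPONSE, "http://www.crossref.org/tdm_license"))
```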
Sustainability
Model
• New initiatives are always optional to our members. Members who do not
participate in our new initiatives will not be charged for them.
• We do not charge end-users (e.g. researchers, librarians) for access to metadata
and APIs.
• We sometimes charge intermediaries for access to our services (to cover the cost
of administration, maintaining SLAs, etc.).
• We do not charge our members for depositing extra metadata into our services.
• We sometimes charge our members for the cost of administering our services,
maintaining SLAs, development, etc.
• We eschew charging mechanisms that involve complex administrative overhead.
The cost of developing and running them generally negates the revenue raised
by implementing them.
• We try to tie any charges as directly as possible to where costs are incurred.
Current State
Prospect Working Group
• AAAS: Walter Jones, Stewart Wills, Deborah Rivera-Wienhold
• American Institute of Physics: Evan Owens
• American Physical Society: Mark Doyle
• Elsevier: Chris Shillum, Ale de Vries
• HighWire: John Sack, Craig Jurney
• Institute of Physics Publishing: Graham McCann, James Walker
• Springer: Chinchu Ann Belarmin, Michiel van der Heyden
• Taylor & Francis: Gillian Howcroft
• Walter de Gruyter: Bettina de Keijzer
• Wiley: Edward Wates, Alan Bacon
• CrossRef: Geoffrey Bilder, Chuck Koscher, Ed Pentz, Carol Meyer, Kirsty Meddings
CrossRef
• DOI Content Negotiation: Exists
• CrossRef support for recording links to full text: Exists
• CrossRef metadata search for discovery: Exists
• CrossRef metadata support for license URIs: Exists
• Click-through TDM license registry: Exists
• Prospect publisher API for verifying, managing tokens: Exists
• Sample publisher code: Exists
• Sample researcher code: Exists

✻ being extended to support mime-types
We are using CrossRef's Prospect text mining API in the context
of the Hiberlink project, which investigates reference rot in
scholarly papers at a very large scale. The API is really
straightforward and based on common technical approaches; it
can easily be integrated in a broader workflow. In our case, we
have a work bench that monitors newly published papers,
obtains their XML version via the API, extracts all HTTP URIs, and
then crawls and archives the referenced content. Currently, we
can only access Elsevier papers via the API but as more
publishers join Prospect, it will become a powerful, uniform
one-stop-shop for text mining scholarly literature.
--Martin Klein and Herbert Van de Sompel, Los Alamos National
Laboratory
I think this is a big step in the right direction and makes
retrieving full text file a lot easier, I hope that publishers support
it.
--Maximilian Haeussler, UCSD
What do I
need to do?
Publishers (required)

• Register full-text URLs with CrossRef
• Register <lic_ref> well-known license URIs with
CrossRef
Publishers (optional)

• Register click-through proprietary licenses with
Prospect click-through service
• Adapt platform APIs to handle Prospect API tokens
Researchers
• Register with Prospect and accept/decline licenses
• Modify TDM tools to look for <lic_ref> elements
• Modify TDM tools to make use of Prospect API token
kmeddings@crossref.org

gbilder@crossref.org
Thank You
gbilder@crossref.org

2013 CrossRef Workshops Text Data Mining Geoffrey Bilder
