Infographic
Access the connector: http://publisher-connector.core.ac.uk/resourcesync
Diagram labels: Discovery services; Proprietary APIs; Connector layer; Frontiers; Crossref; CORE Publisher Connector
Resource counts shown in the diagram: 492,462; 59,512; 172,812; 1,107,091
1,831,877 Open Access articles seamlessly accessible by everyone
7% of the total content available from the above publishers is Open Access
Every record contains metadata and full text (pdf, Title, Authors, Publisher, DOI, ...)
All resources are accessible via ResourceSync, and more publishers are on the way...
Every resource is automatically synchronised across all clients: the master copy is exposed through ResourceSync sitemaps, and each synchronised copy is kept current by automated synchronisation and immediate propagation of deletion
The largest datasets for text mining: Gold Open Access (chart labels: Dataset; Number of resources)
- arXiv: 1,261,533
- PubMed Central (OA subset): 1,582,188
- CORE Publisher Connector: 1,660,625
For the largest collection of Green & Gold Open Access content, look at https://core.ac.uk/services#dataset
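The synchronisation behaviour named in the infographic (a master copy exposed through ResourceSync sitemaps, automated synchronisation, propagation of deletion) can be illustrated with a minimal sketch. It assumes the master copy is published as a sitemap-style resource list with `loc` and `lastmod` entries; the sample XML, the example.org URLs, and the helper names `parse_resource_list` and `plan_sync` are hypothetical and are not taken from the CORE Publisher Connector itself.

```python
import xml.etree.ElementTree as ET

# Sitemap namespace used by ResourceSync resource lists.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_resource_list(xml_text):
    """Map each resource URL in a resource list to its lastmod timestamp."""
    root = ET.fromstring(xml_text)
    resources = {}
    for url in root.iter(SITEMAP_NS + "url"):
        loc = url.findtext(SITEMAP_NS + "loc")
        lastmod = url.findtext(SITEMAP_NS + "lastmod", default="")
        if loc:
            resources[loc.strip()] = lastmod.strip()
    return resources


def plan_sync(master, local):
    """Return (urls to fetch, urls to delete) so that local matches master."""
    # Fetch anything that is new or whose timestamp differs from the local copy.
    to_fetch = sorted(u for u, ts in master.items() if local.get(u) != ts)
    # Propagate deletions: drop anything the master no longer lists.
    to_delete = sorted(u for u in local if u not in master)
    return to_fetch, to_delete


# Hypothetical master resource list (illustrative data only).
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcelist" at="2018-01-03T00:00:00Z"/>
  <url><loc>https://example.org/articles/1.pdf</loc><lastmod>2018-01-01T00:00:00Z</lastmod></url>
  <url><loc>https://example.org/articles/2.pdf</loc><lastmod>2018-01-02T00:00:00Z</lastmod></url>
</urlset>"""

master = parse_resource_list(SAMPLE)
# Hypothetical state of a client's synchronised copy.
local = {
    "https://example.org/articles/1.pdf": "2018-01-01T00:00:00Z",
    "https://example.org/articles/3.pdf": "2017-12-01T00:00:00Z",
}
to_fetch, to_delete = plan_sync(master, local)
print("fetch:", to_fetch)
print("delete:", to_delete)
```

Comparing full resource lists is the simplest strategy; a production client would more likely consume incremental change lists, which this sketch does not cover.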
Presentation of the expertise directory
Knoth, P., Anastasiou, L., Pearce, S. and Pontika, N. (2018) Towards a Global Comprehensive Dataset of Open Access Papers for Text Analytics, Open Repositories 2018, Bozeman, Montana
FORCE2017 Conference – workshop on “Improve interoperability across publisher platforms to support text and data mining” – 33 publishers attended
Dataset statistics
Source type | Details | Number of open access articles
Repositories and full OA publishers (OpenAIRE and CORE) | 3,667 data sources globally harvested using OAI-PMH | 9,033,808
CORE Publisher Connector | Elsevier | 1,191,785
CORE Publisher Connector | Springer | 540,889
CORE Publisher Connector | Frontiers | 65,927
CORE Publisher Connector | PLoS | 179,571
Total publisher connector | | 1,978,172
Total Dataset | | 11,011,980
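As a quick consistency check on the dataset statistics, the sketch below recomputes both totals from the per-source figures. The numbers are copied from the slide; the variable names are illustrative only.

```python
# Per-publisher counts for the CORE Publisher Connector, as given on the slide.
publisher_counts = {
    "Elsevier": 1_191_785,
    "Springer": 540_889,
    "Frontiers": 65_927,
    "PLoS": 179_571,
}
# Articles from repositories and full OA publishers (OpenAIRE and CORE).
repository_articles = 9_033_808

connector_total = sum(publisher_counts.values())
dataset_total = repository_articles + connector_total
# Fraction of the whole dataset contributed by the publisher connector.
connector_share = connector_total / dataset_total

print(f"Total publisher connector: {connector_total:,}")
print(f"Total dataset: {dataset_total:,}")
print(f"Connector share of dataset: {connector_share:.1%}")
```

Both recomputed totals match the figures reported on the slide.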
Knoth, P., Anastasiou, L., Pearce, S. and Pontika, N. (2018) Towards a Global Comprehensive Dataset of Open Access Papers for Text Analytics, Open Repositories 2018, Bozeman, Montana
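The dataset statistics state that the repository sources are harvested using OAI-PMH. The following is a minimal sketch of how a harvester might page through `ListRecords` responses. The base URL and the sample response are hypothetical, `list_records_url` and `next_token` are illustrative helpers, and nothing here is claimed to be how CORE or OpenAIRE implement their harvesting.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespace of OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for the first or a follow-up page."""
    if resumption_token:
        # resumptionToken is an exclusive argument: send it with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def next_token(response_xml):
    """Return the resumption token of a response, or None on the last page."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()


# Hypothetical, heavily trimmed response (records omitted).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("https://repo.example.org/oai")
token = next_token(SAMPLE_RESPONSE)
follow_up_url = list_records_url("https://repo.example.org/oai", resumption_token=token)
print(first_url)
print(follow_up_url)
```

A harvester would repeat the request with each returned token until `next_token` yields `None`.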
Promotion of the expertise directory
Knoth, P., Pontika, N., Anastasiou, L. (2018) Releasing 1.8 million open access publications from publisher systems for text and data mining, LSE Blog http://blogs.lse.ac.uk/impactofsocialsciences/2018/03/22/releasing-1-8-million-open-access-publications-from-publisher-systems-for-text-and-data-mining/
TDM & Research Support Staff
• Establish and maintain close collaboration with researchers
• Extensive experience in advocacy, e.g. for open access
• Knowledgeable about the repository’s collection
• Participate in the Academic Institution’s Research Committees
• Familiarity with Copyright issues and Creative Commons Licenses
Where to find TDM related material - I
3 TDM taxonomies developed by the project:
• Text and Data Mining
• TDM Methods
• TDM workflows
OMTD tutorials and courses
URL: https://www.fosteropenscience.eu/openminted
Where to find TDM related material - II
Educational training videos introducing TDM concepts
Other TDM training materials
TDM taxonomy
URL: https://www.fosteropenscience.eu/openminted
Introduction to TDM course - I
Created by OU and LIBER in collaboration with Cambridge University.
• First technical TDM course addressed to research support staff.
• Presents OMTD and explains how to use it.
• Hands-on examples of basic TDM processes.
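As an illustration of what a basic TDM process can look like, here is a minimal term-frequency sketch. It is not taken from the course; the sample sentence, the small stopword list, and the `term_frequencies` helper are assumptions chosen for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; real pipelines use much larger ones.
STOPWORDS = frozenset({"the", "of", "and", "a", "to", "in", "is", "for"})


def term_frequencies(text, stopwords=STOPWORDS):
    """Lowercase the text, split it into alphabetic tokens, and count them."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


sample = "Text and data mining is the automated analysis of text and data."
freqs = term_frequencies(sample)
print(freqs.most_common(3))
```

Tokenisation and counting of this kind is typically a first step before methods such as classification or entity extraction are applied.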
Introduction to TDM course - II
Suggested readings
Introductory videos
Introduction to TDM course - III
Quizzes
Claim course
Introduction to TDM course - IV
Thank you!

What is Text and Data Mining (TDM)?
