The Future is All Mine

Mining Cultural Heritage text and data, presented at DISH2015

Published in: Data & Analytics

  1. The Future is All Mine: Text and Data Mining Projects in Europe. @openminted_eu @futuretdm. Funded by:
  2. Projects funded by @openminted_eu @futuretdm
  3. Text and data mining is the future. “Text and data mining (TDM) is the process of deriving information from machine-read material. It works by copying large quantities of material, extracting the data, and recombining it to identify patterns.” (JISC)
  4. Text and data mining helps us understand the past. Mining historical books: the evolution of language. Source: http://www.sciencemag.org/content/331/6014/176 (Baylor College of Medicine, Houston)
  5. Text and data mining predicts the future. Mining newspapers: predicts revolutions. Source: http://journals.uic.edu/ojs/index.php/fm/article/view/3663/3040 (University of Illinois)
  6. Text and data mining saves the future. Mining scientific publications about diseases: save lives. Source: http://dl.acm.org/citation.cfm?id=2623667 (Baylor College of Medicine, Houston)
  7. Text mining: it seems so easy. Stage 1: Information Retrieval. Stage 2: Linguistic Analysis (Entity Recognition). Stage 3: Information Extraction. Stage 4: Data Mining (Knowledge Discovery).
  8. But it actually poses many challenges… Where do I find data? How do I make my texts readable by machines? Which mining method to use?
  9. Current Barriers in Europe. Awareness across Institutions & Stakeholders: lack of awareness among research communities; lack of guidance to uncover TDM potential. Skills and Tools: availability and accessibility across disciplines; gap in skills across various sectors. Licensing & Open Access: license proliferation and interoperability issues; license barriers to transparent open access. Copyright and Data Protection: TDM activities infringing current copyright laws; legal and policy limitations and barriers for TDM.
  10. EU projects on TDM. FutureTDM: identify TDM barriers and policy solutions. OpenMinTeD: build a TDM e-Infrastructure.
  11. Main Objectives of FutureTDM. INVOLVE all key stakeholders to identify practices, requirements, and specific challenges. ASSESS existing studies, legal regulations and policies on TDM. ANALYSE current application areas and best practices in TDM. INCREASE awareness of TDM to attract new target groups and science domains. BUILD a website: a Collaborative Knowledge Base and an Open Information Hub combined. ELABORATE a legal and policy framework for future TDM and specify a research agenda to foster the spread of TDM. This project has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement No 665940.
  12. FutureTDM bottom-up approach: stakeholder workshops and knowledge cafes throughout Europe.
  13. Objective of OpenMinTeD (architecture diagram). Layer 1: Interoperability of text mining services (platforms or components). Layer 2: Interoperability of language resources & corpora. Layer 3: Interoperability to shared storage and computing resources. Diagram elements: mining platforms (including proprietary architectures); platform services (Registry, Workflow Management, Auth2 & Policy management, Annotator, Accounting); language resources and corpora registry service; language resources; text corpora (publisher, OpenAIRE/CORE, PMC, other text corpora, other types of text corpora); data centres (including a data centre in a public cloud).
  14. OpenMinTeD brings together: ACCESSIBLE CONTENT: via standardised programmatic interfaces and access rules. DISCOVERABLE SERVICES: easily discoverable text mining services and workflows which process, analyse and annotate text. EFFICIENT PROCESSING: operate on public e-Infrastructures via standardised APIs. TDM COMMUNITIES: different scientific communities have different challenges. VALUE ADDED APPS: community-driven applications to illustrate the value of the infrastructure; engage with industry. OPENMINTED = The Open Mining Infrastructure for Text and Data.
  15. Become involved. Follow us on Twitter for the latest updates and blogs: @openminted_eu @futuretdm. Follow our websites: www.openminted.eu, www.futuretdm.eu.
  16. THANK YOU. OpenMinTeD partners: Athena RIC; Univ. of Manchester (NaCTeM); Univ. of Darmstadt; INRA; EMBL-EBI; Agro-Know; LIBER; Univ. of Amsterdam; Open University UK; EPFL; CNIO; Univ. of Sheffield (GATE); GESIS; GRNET; Frontiers; Univ. of Stirling. FutureTDM partners: SYNYO GmbH (SYNYO); LIBER Europe; Open Knowledge Foundation LBG (OK/CM); Radboud Univ. Nijmegen; The British Library Board; Univ. of Amsterdam; Athena RIC; Ubiquity Press; Fundacja Projekt: Polska (FPP).
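
The four stages named on slide 7 (information retrieval, entity recognition, information extraction, data mining) can be illustrated with a minimal Python sketch. This is not the OpenMinTeD or FutureTDM implementation. The toy corpus, the function names, the capitalised-word "entity recogniser" and the co-occurrence counting are all assumptions chosen for illustration, and the stage order is inferred from the slide.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus (an assumption); a real project would pull
# documents from a repository or publisher corpus instead.
CORPUS = {
    "doc1": "Researchers mine Science articles about Cancer genes.",
    "doc2": "Newspapers in Europe report on Science and Cancer funding.",
    "doc3": "Mining historical books shows how language evolves.",
}


def retrieve(corpus, query):
    """Stage 1, information retrieval: keep documents that mention the query."""
    q = query.lower()
    return {doc_id: text for doc_id, text in corpus.items() if q in text.lower()}


def recognise_entities(text):
    """Stage 2, entity recognition: naive stand-in that treats every
    capitalised word as an entity (real systems use trained models)."""
    return re.findall(r"\b[A-Z][a-z]+\b", text)


def extract_pairs(entities):
    """Stage 3, information extraction: list entity pairs that co-occur
    in the same document."""
    return list(combinations(sorted(set(entities)), 2))


def mine(pairs_per_doc):
    """Stage 4, data mining: count how often each pair co-occurs
    across the retrieved documents."""
    counts = Counter()
    for pairs in pairs_per_doc:
        counts.update(pairs)
    return counts


def run_pipeline(corpus, query):
    """Chain the four stages for one query term."""
    docs = retrieve(corpus, query)
    pairs_per_doc = [extract_pairs(recognise_entities(t)) for t in docs.values()]
    return mine(pairs_per_doc)


if __name__ == "__main__":
    for pair, n in run_pipeline(CORPUS, "science").most_common(3):
        print(pair, n)
```

Each stage is a plain function so that any one of them could be swapped for a real component (a search index, a trained entity recogniser, a statistical miner) without touching the others, which is the kind of interoperability between services that slide 13 describes.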
