SlideShare uma empresa Scribd logo
1 de 27
Baixar para ler offline
Treparel
Delftechpark 26
2628 XH Delft
The Netherlands
www.treparel.com
Text Analytics
in the
EU Fusepool
project
II-SDV ’13 - Nice
Anton Heijs
CTO
anton@treparel.com
April 15, 2013
Agenda	
  
•  Introduc.on	
  of	
  the	
  EU	
  Fusepool	
  project,	
  Treparel	
  and	
  the	
  
consor.um	
  partners	
  
•  Background	
  on	
  objec.ves	
  of	
  the	
  Fusepool	
  project	
  
•  User	
  adap.ve	
  system	
  
•  Data	
  pooling	
  and	
  linking	
  
•  Large	
  scale	
  machine	
  learning	
  and	
  text	
  analy.cs	
  
•  Summary	
  
Treparel KMX – All rights reserved 2013 2www.treparel.com
The	
  EU	
  Fusepool	
  Consor.um	
  
Treparel (Delft, The Netherlands) is a global software provider in Big Data Text Analytics and
Visualization.
Global companies, government agencies, software vendors or data publishers are using Treparel
KMX text analysis software to gain faster, reliable, precise insights in large complex unstructured data
sets (like application notes, blogs, email and patents) allowing them to make better informed
decisions.
As part of the Fusepool consortium Treparel integrates the KMX Text Analytics technology for
advanced classification, clustering and visualization of large complex document collections.
Treparel KMX – All rights reserved 2013 3www.treparel.com
Each	
  partner	
  provides	
  cri.cal	
  building	
  blocks:	
  
  BUAS:	
  Dynamic	
  user	
  interfaces	
  
  ENOLL:	
  End	
  users	
  with	
  specific	
  needs	
  
  GEOX:	
  NER,	
  UI	
  design	
  
  SEARCHBOX:	
  text	
  search	
  and	
  similarity	
  seman.c	
  
matching	
  
  TREPAREL:	
  text/patent	
  classifica.on	
  engines	
  
  XEROX:	
  user-­‐adap.ve	
  learning	
  know-­‐how	
  
	
  	
  	
  	
  	
  	
  BUT	
  BREAK-­‐THROUGH	
  BY	
  INTEGRATION	
  
The	
  Big	
  Picture	
  
Treparel KMX – All rights reserved 2013 4www.treparel.com
WP4
Find &
match
WP5
GUI &
Visuals
WP2
Platform,
sourcing
WP3
Extract &
enrich
WP1
Reqs &
Testing
WP6 User Involvement
Learning Learning Learning LearningFeedback
Work	
  packages	
  
The EC Fusepool project is a 2 year EU project and started in July 2012
Treparel KMX – All rights reserved 2013 5www.treparel.com
Fusing	
  and	
  pooling	
  informa.on	
  	
  
for	
  product	
  development	
  
Vision: User-adaptive system
Living Lab: Rapid app development
Data processing: Sourcing & interlinking
Machine learning: Matching & optimizing
Integrated Use Cases
Treparel KMX – All rights reserved 2013 6www.treparel.com
Background	
  
•  SMEs	
  have	
  a	
  need	
  for	
  technology	
  intelligence	
  for	
  detec.ng	
  
and	
  responding	
  to	
  opportuni.es	
  and	
  threats	
  
•  This	
  a	
  partly	
  driven	
  by	
  growth	
  and	
  complexity	
  of	
  
patents	
  and	
  lawsuits	
  
•  Consumer	
  intelligence	
  to	
  detect	
  opinions	
  and	
  needs	
  of	
  
consumers	
  for	
  product	
  development	
  
•  Open	
  innova.on	
  requiring	
  coopera.on	
  (links	
  between	
  data,	
  
e.g.	
  finding	
  business	
  partners)	
  
•  Focus:	
  Machine	
  Learning	
  algorithms	
  to	
  improve	
  matching	
  	
  
Treparel KMX – All rights reserved 2013 7www.treparel.com
User-­‐adap.ve	
  system	
  
•  Focus:	
  monitor	
  and	
  learn	
  specific	
  needs	
  and	
  preferences	
  
of	
  a	
  user	
  to	
  align	
  features,	
  func.onali.es,	
  and	
  graphical	
  
interfaces	
  
•  AdapDve:	
  machine	
  learning	
  from	
  crowd-­‐sourcing	
  (rather	
  
than	
  rule-­‐based)	
  
•  User-­‐aligned	
  prioriDzaDon:	
  more	
  usable	
  and	
  customized	
  
interfaces,	
  sugges.ons	
  based	
  on	
  ac.vity	
  &	
  user	
  feedback	
  
Treparel KMX – All rights reserved 2013 8www.treparel.com
User-­‐adap.ve	
  matching	
  
•  Main	
  goal:	
  automated	
  user-­‐adap.ve	
  matching	
  of	
  users	
  
to:	
  
–  Patent	
  analysis	
  
–  Finding	
  funding	
  opportuni.es	
  
–  Partner	
  matching	
  
•  Key	
  asset:	
  informa.on	
  provided	
  by	
  the	
  user	
  	
  
•  User	
  Data	
  Credo:	
  	
  accuracy	
  improves	
  with	
  quan.ty	
  and	
  
quality	
  of	
  user	
  data	
  while	
  variety	
  (breadth)	
  increases	
  
with	
  number	
  of	
  users	
  
•  Living	
  lab:	
  Co-­‐crea.on	
  between	
  creators	
  and	
  consumers	
  
of	
  the	
  Fusepool	
  plaWorm	
  
Treparel KMX – All rights reserved 2013 9www.treparel.com
Data	
  sourcing	
  
•  Sources:	
  internal	
  &	
  external	
  content	
  from	
  web	
  harves.ng	
  
and	
  structured	
  data	
  sources	
  
–  using	
  content	
  databases	
  and	
  linked	
  open	
  data	
  
•  Scope:	
  ini.al	
  data	
  corpus	
  includes	
  all	
  explicitly	
  in-­‐	
  and	
  
excluded	
  sources	
  
•  Gained	
  value	
  from	
  InformaDon:	
  recommenda.ons	
  
based	
  on	
  machine	
  learning	
  from	
  feedback	
  
Treparel KMX – All rights reserved 2013 10www.treparel.com
Data	
  handling	
  
1.  Text	
  analysis	
  and	
  feature	
  extracDon:	
  ML	
  &	
  NLP	
  methods	
  
for	
  categorizing,	
  named	
  en.ty	
  extrac.on,	
  etc.	
  
2.  Shared	
  metadata	
  models:	
  mapping	
  text	
  features	
  to	
  
exis.ng/custom	
  ontologies	
  and	
  genera.on	
  of	
  seman.c	
  
triplets	
  
→  High-­‐level	
  abstrac.on	
  &	
  persistence	
  for	
  reuse	
  
→  Lightweight	
  storage:	
  mostly	
  metadata	
  only,	
  text	
  
indexing	
  and	
  abstrac.on	
  uses	
  schema-­‐free	
  key-­‐
value	
  (enabling	
  ac.onable	
  facets)	
  
Treparel KMX – All rights reserved 2013 11www.treparel.com
Data	
  interlinking	
  
•  Contextualize:	
  terms	
  are	
  interlinked	
  with	
  same	
  and	
  
similar	
  terms	
  across	
  sources:	
  
–  Enrich	
  the	
  extracted	
  content	
  with	
  exis.ng	
  informa.on	
  available	
  
in	
  the	
  Internet	
  
–  Interlink	
  as	
  much	
  informa.on	
  as	
  possible	
  to	
  increase	
  the	
  value	
  of	
  
knowledge	
  extrac.on	
  
–  Use	
  available	
  public	
  sector	
  resources	
  in	
  Seman.c	
  Web	
  and	
  LOD	
  
format	
  
•  Metadata:	
  when	
  a	
  user	
  uploads	
  texts	
  to	
  be	
  matched	
  with	
  
other	
  content,	
  only	
  the	
  metadata	
  descriptors	
  are	
  
transmiYed	
  
•  Data	
  privacy:	
  data	
  fusion	
  from	
  diverse	
  sources	
  without	
  
endangering	
  user	
  privacy	
  
Treparel KMX – All rights reserved 2013 12www.treparel.com
Searching	
  &	
  finding	
  
•  Key	
  search-­‐oriented	
  features:	
  
–  Search	
  through	
  all	
  content	
  in	
  the	
  data	
  pool	
  
–  Faceted	
  search	
  (categories,	
  metadata,	
  en..es)	
  
–  Integra.on	
  of	
  Linked	
  Open	
  Data	
  (LOD)	
  results	
  
–  Cross-­‐lingual	
  indexing	
  and	
  cross-­‐referencing	
  
–  “Did	
  you	
  mean?”-­‐func.onality	
  in	
  case	
  of	
  typos	
  and	
  auto-­‐
comple.on	
  of	
  search	
  queries	
  
•  User-­‐adapDve:	
  indexing	
  and	
  integra.on	
  based	
  on	
  users	
  
needs	
  (e.g.	
  user	
  profiling)	
  
Treparel KMX – All rights reserved 2013 13www.treparel.com
Adapta.on	
  &	
  refinement	
  
•  AdapDve	
  search:	
  results	
  are	
  aligned	
  to	
  user	
  preferences	
  
based	
  on	
  analysis	
  of	
  user	
  implicit	
  and	
  explicit	
  feedback	
  	
  
•  MulD-­‐task	
  ranking:	
  good	
  trade-­‐off	
  between	
  user-­‐
independent	
  search	
  (high	
  coverage	
  but	
  lower	
  precision)	
  
and	
  a	
  very	
  customized	
  approach	
  
•  Query	
  intent	
  discovery:	
  analysis	
  of	
  the	
  query	
  structure	
  
and	
  interlinking	
  of	
  queries	
  
Treparel KMX – All rights reserved 2013 14www.treparel.com
Correla.ng	
  &	
  matching	
  
•  Search	
  guided	
  navigaDon:	
  seman.c	
  matching	
  extracts	
  
contextual	
  rela.onships	
  to	
  list	
  related	
  content	
  
–  sugges.ons	
  organized	
  by	
  categories	
  
–  exposing	
  facets	
  within	
  related	
  content	
  
•  Distributed	
  rule	
  and	
  event	
  model:	
  defines	
  states,	
  
ac.ons,	
  and	
  consequences	
  (e.g.	
  no.fica.ons,	
  
visualiza.ons)	
  for	
  reasoning	
  based	
  on	
  light-­‐weight	
  
ontologies	
  
Treparel KMX – All rights reserved 2013 15www.treparel.com
Crowd	
  sourcing	
  &	
  	
  
supervised	
  automa.on	
  
•  RelaDonal	
  learning:	
  related	
  instances	
  are	
  used	
  to	
  reason	
  
about	
  the	
  focal	
  instance	
  
–  Ra.onality	
  of	
  content	
  (links	
  to	
  other	
  content,	
  people,	
  etc.)	
  
provide	
  rich	
  informa.on	
  
–  Similari.es/dissimilari.es	
  to	
  other	
  content	
  is	
  established	
  purely	
  
on	
  rela.onal	
  proper.es	
  
Treparel KMX – All rights reserved 2013 16www.treparel.com
Visualiza.on	
  
Clustering	
   Classifica.on	
  
Text	
  Preprocessing	
  and	
  Indexing	
  
Acquire	
  documents	
  
Present	
  Results	
  
Taxonomies,	
  
Ontologies	
  
Seman.c	
  Analysis	
  
Document	
  level	
  analysis	
  
using	
  the	
  KMX	
  technology	
  	
  
KMX	
  unique	
  func.ons:	
  
•  Extract	
  concepts	
  in	
  context	
  
using	
  clustering	
  and	
  
classifica.on	
  of	
  documents	
  
•  Use	
  classifica.on	
  to	
  create	
  
ranked	
  lists	
  and	
  to	
  tag	
  subsets	
  
•  Support	
  of	
  binary	
  and	
  mul.-­‐
class	
  Classifica.on	
  
•  Integra.on	
  with	
  other	
  
applica.ons	
  through	
  KMX	
  API	
  
Treparel KMX – All rights reserved 2012 www.treparel.com 17
Query &
Search Tools
Treparel KMX – All rights reserved 2013 18www.treparel.com
NER	
  in	
  the	
  landscaping	
  	
  
Sentence	
  level	
  analysis	
  using	
  	
  
Named	
  En.ty	
  Recogni.on	
  
•  The	
  aim	
  of	
  NER	
  is	
  to	
  iden.fy	
  en..es	
  in	
  unstructured	
  text	
  
documents.	
  
o  To	
  locate	
  :	
  mark-­‐up	
  the	
  en..es	
  
o  To	
  classify	
  :	
  into	
  predefined	
  categories/domains	
  
•  Aim	
  of	
  usage	
  
o  To	
  recognise	
  trends	
  (trained	
  and	
  new	
  high	
  frequency)	
  
o  To	
  find	
  all	
  „trained”	
  en..es	
  
o  To	
  „discover”	
  new	
  en.tes	
  
•  NER	
  approaches	
  
o  Sta.s.cs	
  based	
  (supervised	
  machine	
  learning)	
  
o  Rule	
  based	
  (regular	
  expressions)	
  
Treparel KMX – All rights reserved 2013 19www.treparel.com
Building	
  a	
  NER	
  model	
  
Treparel KMX – All rights reserved 2013 20www.treparel.com
3	
  NER	
  model	
  examples	
  
  Training	
  text:	
  500	
  patents	
  from	
  EPO	
  
  Training:	
  model	
  building	
  by	
  Stanford	
  NER	
  
•  LTE	
  (long	
  term	
  evoluDon)	
  
•  F1:	
  88%	
  
•  New	
  en..es:	
  „GSM	
  BSS”,	
  „LTE	
  TDD”	
  
•  False	
  pos:	
  „Loca.on	
  Area	
  LA”	
  
•  Elements	
  
•  F1:	
  98%	
  
•  False	
  pos:	
  „argon/hydrogen”	
  
•  Cancer	
  
•  F1:	
  87%	
  
•  New	
  en..es:	
  „myeloma	
  cell”,	
  „tumor	
  .ssue”	
  
•  False	
  pos:	
  „cell	
  mortality”,	
  „the	
  test	
  compound”	
  
Treparel KMX – All rights reserved 2013 21www.treparel.com
Using	
  NER	
  in	
  a	
  GUI	
  
Treparel KMX – All rights reserved 2013 22www.treparel.com
Raw tex
Extract	
  terms	
  indica.ng	
  
domains	
  using	
  patent	
  
classifica.on	
  codes	
  
Content	
  /	
  En.ty	
  
Hub	
  
Generate	
  Vector	
  Space	
  Model	
  
Document Vectors
Generate	
  Classifiers	
  to	
  es.mate	
  domains	
  
Annotated tex with
domain labels
Domain
labels
Training Vectors
DocId : 1122
Title : Pesticide device
Classification Code :
A61B
Text: Hihaho
Etc etc
Table
A61 : pesticide
A61B : pesticide solvent
Etc etc
Many Vectors
Doc-Id : 1122
750 terms + weights
Classifier	
  on	
  pes.cide	
  solvents	
  
25 Positive vectors
Doc-Id’s
Label
750 terms + weights
25 negative vectors
Doc-Id’s
Label
750 terms + weights
Doc-Id : 1122
Labels + scores
Title : Pesticide device
Classification Code : A61B
Text: Hihaho …Etc etc
1
2
3
4
5
6
7
8
9
10	
  
11	
  
Combining	
  text	
  analysis	
  
approaches	
  
Treparel KMX – All rights reserved 2013 23www.treparel.com
•  Data	
  sourcing:	
  retrieves	
  data	
  from	
  data	
  sources	
  via	
  data	
  
integrator	
  
•  Storage:	
  raw	
  (e.g.	
  text)	
  or	
  processed	
  data	
  stored	
  in	
  
database	
  or	
  triple	
  store	
  
•  Using	
  ML	
  to	
  enable	
  learning	
  from	
  the	
  crowd	
  
•  Push	
  &	
  Pull:	
  	
  
  interface	
  to	
  consumers	
  (web	
  and	
  mobile	
  apps)	
  
  suppor.ng	
  quality	
  and	
  access	
  control	
  from	
  portal	
  
•  Portal:	
  business	
  logic,	
  storage	
  of	
  registered	
  data	
  sources,	
  
access	
  control	
  using	
  web	
  frontend	
  
Somware	
  architecture	
  
Treparel KMX – All rights reserved 2013 24www.treparel.com
Open	
  Call	
  for	
  Users	
  
•  ApplicaDons	
  received	
  from	
  Finland,	
  Germany,	
  Spain,	
  Greece,	
  
Bulgaria,	
  Hungary,	
  Italy,	
  UK,	
  Belgium,	
  China,	
  Switzerland,	
  France,	
  
Denmark,	
  Portugal,	
  Ireland	
  and	
  The	
  Netherlands	
  	
  
•  Business	
  areas	
  covered:	
  	
  
–  Bio-­‐medical,	
  
–  Pharma	
  and	
  biotech,	
  
–  ICT/Telecommunica.ons,	
  	
  
–  Digital	
  media,	
  	
  
–  Renewable	
  energies,	
  	
  
–  Educa.on,	
  	
  
–  Innova.on/consultancy	
  services	
  for	
  SMEs.	
  	
  
•  Mix	
  of	
  profiles	
  from	
  SMEs,	
  Research,	
  SME	
  intermediaries	
  
(incubators,	
  Science	
  parks,	
  Living	
  Labs,	
  etc),	
  developers	
  and	
  more	
  
•  MulDple	
  areas	
  of	
  research,	
  development	
  and	
  innova.on	
  areas	
  are	
  
covered.	
  
25Treparel KMX – All rights reserved 2013 25www.treparel.com
•  Data	
  as	
  a	
  service:	
  scale	
  economies	
  of	
  scale	
  in	
  management	
  of	
  data	
  	
  
•  Data	
  pooling:	
  processes	
  need	
  .mely	
  aggrega.on	
  and	
  redistribu.on	
  
of	
  diverse	
  data	
  but	
  building	
  own	
  is	
  redundant	
  and	
  prohibi.ve	
  for	
  
SMEs	
  
  provide	
  services	
  on	
  top	
  of	
  pool	
  with	
  high	
  quality	
  data	
  
  provide	
  access	
  to	
  services	
  on	
  demand	
  
•  Success	
  criteria:	
  
  Early	
  provision	
  of	
  scalable	
  basic	
  Fusepool	
  services	
  
  SME	
  involvement	
  and	
  uptake	
  of	
  Fusepool	
  services	
  
  Machine	
  learning	
  for	
  data	
  cura.on	
  &	
  user	
  adapta.on	
  
•  Required	
  steps:	
  
  Stepwise	
  integra.on	
  of	
  exis.ng	
  &	
  new	
  technologies	
  
  Early	
  and	
  ongoing	
  feedback	
  from	
  end	
  users	
  
EC	
  perspec.ve	
  
Treparel KMX – All rights reserved 2013 26www.treparel.com
The	
  EC	
  FP	
  7	
  program	
  Fusepool	
  is	
  all	
  about:	
  	
  
•  Building	
  a	
  plaWorm	
  with	
  web	
  enabled	
  services	
  for	
  
•  Data	
  pooling	
  
•  Large	
  scale	
  text	
  analysis	
  
•  Large	
  scale	
  machine	
  learning	
  of	
  user	
  input	
  
•  Enable	
  SME’s	
  with	
  analy.cs	
  for	
  improving	
  their	
  
innova.on	
  and	
  compe..ve	
  strengths	
  using	
  
•  SME’s	
  involvement	
  and	
  feedback	
  to	
  the	
  Fusepool	
  services	
  
•  Machine	
  learning	
  for	
  data	
  cura.on	
  &	
  user	
  adapta.on	
  
Summary	
  
Treparel KMX – All rights reserved 2013 27www.treparel.com
Welcome to visit Fusepool at:
www.fusepool.net

Mais conteúdo relacionado

Mais procurados

WEBINAR: "How to manage your data to make them open and fair"
WEBINAR:  "How to manage your data to make them open and fair"  WEBINAR:  "How to manage your data to make them open and fair"
WEBINAR: "How to manage your data to make them open and fair" OpenAIRE
 
WEBINAR: Open Access to publications in Horizon 2020
WEBINAR: Open Access to publications in Horizon 2020WEBINAR: Open Access to publications in Horizon 2020
WEBINAR: Open Access to publications in Horizon 2020OpenAIRE
 
Exposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch CatalogueExposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch CatalogueRaul Palma
 
WEBINAR: Open Research Data in Horizon 2020
WEBINAR: Open Research Data in Horizon 2020WEBINAR: Open Research Data in Horizon 2020
WEBINAR: Open Research Data in Horizon 2020OpenAIRE
 
An introduction to Linked (Open) Data
An introduction to Linked (Open) DataAn introduction to Linked (Open) Data
An introduction to Linked (Open) DataAli Khalili
 
Linked Data for Abbreviations and Segmentation
Linked Data for Abbreviations and SegmentationLinked Data for Abbreviations and Segmentation
Linked Data for Abbreviations and SegmentationSebastian Hellmann
 
Data management plans – EUDAT Best practices and case study | www.eudat.eu
Data management plans – EUDAT Best practices and case study | www.eudat.euData management plans – EUDAT Best practices and case study | www.eudat.eu
Data management plans – EUDAT Best practices and case study | www.eudat.euEUDAT
 
Imcs review 2013_04_v7
Imcs review 2013_04_v7Imcs review 2013_04_v7
Imcs review 2013_04_v7Karel Charvat
 
CORE - Petr Knoth, Research Associate
CORE - Petr Knoth, Research AssociateCORE - Petr Knoth, Research Associate
CORE - Petr Knoth, Research AssociateThe European Library
 
ENGAGE Workshop at OpenDataWeek2013
ENGAGE Workshop at OpenDataWeek2013ENGAGE Workshop at OpenDataWeek2013
ENGAGE Workshop at OpenDataWeek2013Valerie BRASSE
 

Mais procurados (13)

WEBINAR: "How to manage your data to make them open and fair"
WEBINAR:  "How to manage your data to make them open and fair"  WEBINAR:  "How to manage your data to make them open and fair"
WEBINAR: "How to manage your data to make them open and fair"
 
WEBINAR: Open Access to publications in Horizon 2020
WEBINAR: Open Access to publications in Horizon 2020WEBINAR: Open Access to publications in Horizon 2020
WEBINAR: Open Access to publications in Horizon 2020
 
Exposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch CatalogueExposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch Catalogue
 
WEBINAR: Open Research Data in Horizon 2020
WEBINAR: Open Research Data in Horizon 2020WEBINAR: Open Research Data in Horizon 2020
WEBINAR: Open Research Data in Horizon 2020
 
An introduction to Linked (Open) Data
An introduction to Linked (Open) DataAn introduction to Linked (Open) Data
An introduction to Linked (Open) Data
 
Linked Data for Abbreviations and Segmentation
Linked Data for Abbreviations and SegmentationLinked Data for Abbreviations and Segmentation
Linked Data for Abbreviations and Segmentation
 
Data management plans – EUDAT Best practices and case study | www.eudat.eu
Data management plans – EUDAT Best practices and case study | www.eudat.euData management plans – EUDAT Best practices and case study | www.eudat.eu
Data management plans – EUDAT Best practices and case study | www.eudat.eu
 
Imcs review 2013_04_v7
Imcs review 2013_04_v7Imcs review 2013_04_v7
Imcs review 2013_04_v7
 
CORE - Petr Knoth, Research Associate
CORE - Petr Knoth, Research AssociateCORE - Petr Knoth, Research Associate
CORE - Petr Knoth, Research Associate
 
LOD2 Webinar Series: CubeViz
LOD2 Webinar Series: CubeViz LOD2 Webinar Series: CubeViz
LOD2 Webinar Series: CubeViz
 
Geoportal4everybody
Geoportal4everybodyGeoportal4everybody
Geoportal4everybody
 
LOD2 Webinar Series Classification and Quality Analysis with DL Learner and ORE
LOD2 Webinar Series Classification and Quality Analysis with DL Learner and ORELOD2 Webinar Series Classification and Quality Analysis with DL Learner and ORE
LOD2 Webinar Series Classification and Quality Analysis with DL Learner and ORE
 
ENGAGE Workshop at OpenDataWeek2013
ENGAGE Workshop at OpenDataWeek2013ENGAGE Workshop at OpenDataWeek2013
ENGAGE Workshop at OpenDataWeek2013
 

Destaque

II-SDV 2013 Searching within the Enterprise: Making the Best of Poor User Que...
II-SDV 2013 Searching within the Enterprise: Making the Best of Poor User Que...II-SDV 2013 Searching within the Enterprise: Making the Best of Poor User Que...
II-SDV 2013 Searching within the Enterprise: Making the Best of Poor User Que...Dr. Haxel Consult
 
II-SDV 2013 Big Data Triage with Text Analytics
II-SDV 2013 Big Data Triage with Text AnalyticsII-SDV 2013 Big Data Triage with Text Analytics
II-SDV 2013 Big Data Triage with Text AnalyticsDr. Haxel Consult
 
II-SDV 2013 Open Source Platforms to deploy Search and Maps Visualization on ...
II-SDV 2013 Open Source Platforms to deploy Search and Maps Visualization on ...II-SDV 2013 Open Source Platforms to deploy Search and Maps Visualization on ...
II-SDV 2013 Open Source Platforms to deploy Search and Maps Visualization on ...Dr. Haxel Consult
 
II-SDV 2013 Challenges in Building a Future Search Centre: our Observations a...
II-SDV 2013 Challenges in Building a Future Search Centre: our Observations a...II-SDV 2013 Challenges in Building a Future Search Centre: our Observations a...
II-SDV 2013 Challenges in Building a Future Search Centre: our Observations a...Dr. Haxel Consult
 
II-SDV 2013 Automating Web Research through Customised Search Tools
II-SDV 2013 Automating Web Research through Customised Search ToolsII-SDV 2013 Automating Web Research through Customised Search Tools
II-SDV 2013 Automating Web Research through Customised Search ToolsDr. Haxel Consult
 
II-SDV 2013 Customising Statistics for Sharper Analysis
II-SDV 2013 Customising Statistics for Sharper AnalysisII-SDV 2013 Customising Statistics for Sharper Analysis
II-SDV 2013 Customising Statistics for Sharper AnalysisDr. Haxel Consult
 
II-SDV 2013 The Analytics Challenges Posed by Big Data
II-SDV 2013 The Analytics Challenges Posed by Big DataII-SDV 2013 The Analytics Challenges Posed by Big Data
II-SDV 2013 The Analytics Challenges Posed by Big DataDr. Haxel Consult
 
Roche Diabetes Care Incubator - Sebastiaan la Bastide Roche Diabetes Stanford...
Roche Diabetes Care Incubator - Sebastiaan la Bastide Roche Diabetes Stanford...Roche Diabetes Care Incubator - Sebastiaan la Bastide Roche Diabetes Stanford...
Roche Diabetes Care Incubator - Sebastiaan la Bastide Roche Diabetes Stanford...Burton Lee
 
II-SDV 2015 The International Information Conference on Search, Data Mining a...
II-SDV 2015 The International Information Conference on Search, Data Mining a...II-SDV 2015 The International Information Conference on Search, Data Mining a...
II-SDV 2015 The International Information Conference on Search, Data Mining a...Dr. Haxel Consult
 
Roche's Acquisition of Genentech
Roche's Acquisition of GenentechRoche's Acquisition of Genentech
Roche's Acquisition of GenentechYu Cao
 
PatSeer Introduction
PatSeer IntroductionPatSeer Introduction
PatSeer IntroductionGridlogics
 
II-SDV 2017 in Nice - The International Information Conference on Search, Dat...
II-SDV 2017 in Nice - The International Information Conference on Search, Dat...II-SDV 2017 in Nice - The International Information Conference on Search, Dat...
II-SDV 2017 in Nice - The International Information Conference on Search, Dat...Dr. Haxel Consult
 

Destaque (13)

II-SDV 2013 Searching within the Enterprise: Making the Best of Poor User Que...
II-SDV 2013 Searching within the Enterprise: Making the Best of Poor User Que...II-SDV 2013 Searching within the Enterprise: Making the Best of Poor User Que...
II-SDV 2013 Searching within the Enterprise: Making the Best of Poor User Que...
 
II-SDV 2013 Big Data Triage with Text Analytics
II-SDV 2013 Big Data Triage with Text AnalyticsII-SDV 2013 Big Data Triage with Text Analytics
II-SDV 2013 Big Data Triage with Text Analytics
 
II-SDV 2013 Open Source Platforms to deploy Search and Maps Visualization on ...
II-SDV 2013 Open Source Platforms to deploy Search and Maps Visualization on ...II-SDV 2013 Open Source Platforms to deploy Search and Maps Visualization on ...
II-SDV 2013 Open Source Platforms to deploy Search and Maps Visualization on ...
 
II-SDV 2013 Challenges in Building a Future Search Centre: our Observations a...
II-SDV 2013 Challenges in Building a Future Search Centre: our Observations a...II-SDV 2013 Challenges in Building a Future Search Centre: our Observations a...
II-SDV 2013 Challenges in Building a Future Search Centre: our Observations a...
 
II-SDV 2013 Automating Web Research through Customised Search Tools
II-SDV 2013 Automating Web Research through Customised Search ToolsII-SDV 2013 Automating Web Research through Customised Search Tools
II-SDV 2013 Automating Web Research through Customised Search Tools
 
II-SDV 2013 Customising Statistics for Sharper Analysis
II-SDV 2013 Customising Statistics for Sharper AnalysisII-SDV 2013 Customising Statistics for Sharper Analysis
II-SDV 2013 Customising Statistics for Sharper Analysis
 
II-SDV 2013 The Analytics Challenges Posed by Big Data
II-SDV 2013 The Analytics Challenges Posed by Big DataII-SDV 2013 The Analytics Challenges Posed by Big Data
II-SDV 2013 The Analytics Challenges Posed by Big Data
 
Roche Diabetes Care Incubator - Sebastiaan la Bastide Roche Diabetes Stanford...
Roche Diabetes Care Incubator - Sebastiaan la Bastide Roche Diabetes Stanford...Roche Diabetes Care Incubator - Sebastiaan la Bastide Roche Diabetes Stanford...
Roche Diabetes Care Incubator - Sebastiaan la Bastide Roche Diabetes Stanford...
 
II-SDV 2015 The International Information Conference on Search, Data Mining a...
II-SDV 2015 The International Information Conference on Search, Data Mining a...II-SDV 2015 The International Information Conference on Search, Data Mining a...
II-SDV 2015 The International Information Conference on Search, Data Mining a...
 
Roche Genentech Acquisition Analysis
Roche   Genentech Acquisition AnalysisRoche   Genentech Acquisition Analysis
Roche Genentech Acquisition Analysis
 
Roche's Acquisition of Genentech
Roche's Acquisition of GenentechRoche's Acquisition of Genentech
Roche's Acquisition of Genentech
 
PatSeer Introduction
PatSeer IntroductionPatSeer Introduction
PatSeer Introduction
 
II-SDV 2017 in Nice - The International Information Conference on Search, Dat...
II-SDV 2017 in Nice - The International Information Conference on Search, Dat...II-SDV 2017 in Nice - The International Information Conference on Search, Dat...
II-SDV 2017 in Nice - The International Information Conference on Search, Dat...
 

Semelhante a II-SDV 2013 Large scale Application of Text Mining and Visualization in the EU Fusepool Project

Jeroen Kleinhoven (Treparel), Turn Big Content into Business Insights - Data ...
Jeroen Kleinhoven (Treparel), Turn Big Content into Business Insights - Data ...Jeroen Kleinhoven (Treparel), Turn Big Content into Business Insights - Data ...
Jeroen Kleinhoven (Treparel), Turn Big Content into Business Insights - Data ...Cre-Aid
 
Text and Data Visualization Introduction 2012
Text and Data Visualization Introduction 2012Text and Data Visualization Introduction 2012
Text and Data Visualization Introduction 2012Treparel
 
MetadataTheory: Metadata Standards (6th of 10)
MetadataTheory: Metadata Standards (6th of 10)MetadataTheory: Metadata Standards (6th of 10)
MetadataTheory: Metadata Standards (6th of 10)Nikos Palavitsinis, PhD
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Geoffrey Fox
 
Service integration to Enhance RDM: RSpace electronic lab notebook at the Uni...
Service integration to Enhance RDM: RSpace electronic lab notebook at the Uni...Service integration to Enhance RDM: RSpace electronic lab notebook at the Uni...
Service integration to Enhance RDM: RSpace electronic lab notebook at the Uni...ResearchSpace
 
II-SDV 2014 Insights into a Next Generation IP and R&D Dashboard (Jeroen Klei...
II-SDV 2014 Insights into a Next Generation IP and R&D Dashboard (Jeroen Klei...II-SDV 2014 Insights into a Next Generation IP and R&D Dashboard (Jeroen Klei...
II-SDV 2014 Insights into a Next Generation IP and R&D Dashboard (Jeroen Klei...Dr. Haxel Consult
 
2014: Treparel Big Data Text Analytics & Visualization
2014: Treparel Big Data Text Analytics & Visualization2014: Treparel Big Data Text Analytics & Visualization
2014: Treparel Big Data Text Analytics & VisualizationTreparel
 
Building Data Ecosystems for Accelerated Discovery
Building Data Ecosystems for Accelerated DiscoveryBuilding Data Ecosystems for Accelerated Discovery
Building Data Ecosystems for Accelerated Discoveryadamkraut
 
Session 2.1 ontological representation of the telecom domain for advanced a...
Session 2.1   ontological representation of the telecom domain for advanced a...Session 2.1   ontological representation of the telecom domain for advanced a...
Session 2.1 ontological representation of the telecom domain for advanced a...semanticsconference
 
Treparel - KMX Patent Analytics 2014
Treparel - KMX Patent Analytics 2014Treparel - KMX Patent Analytics 2014
Treparel - KMX Patent Analytics 2014Treparel
 
Crossing the Analytics Chasm and Getting the Models You Developed Deployed
Crossing the Analytics Chasm and Getting the Models You Developed DeployedCrossing the Analytics Chasm and Getting the Models You Developed Deployed
Crossing the Analytics Chasm and Getting the Models You Developed DeployedRobert Grossman
 
Intake 38 data access 4
Intake 38 data access 4Intake 38 data access 4
Intake 38 data access 4Mahmoud Ouf
 
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )SBGC
 
Jisc Research Data Shared Service - Spring Update
Jisc Research Data Shared Service - Spring UpdateJisc Research Data Shared Service - Spring Update
Jisc Research Data Shared Service - Spring UpdateJisc RDM
 

Semelhante a II-SDV 2013 Large scale Application of Text Mining and Visualization in the EU Fusepool Project (20)

Jeroen Kleinhoven (Treparel), Turn Big Content into Business Insights - Data ...
Jeroen Kleinhoven (Treparel), Turn Big Content into Business Insights - Data ...Jeroen Kleinhoven (Treparel), Turn Big Content into Business Insights - Data ...
Jeroen Kleinhoven (Treparel), Turn Big Content into Business Insights - Data ...
 
RDM@Edinburgh_interoperation_IDCC2015
RDM@Edinburgh_interoperation_IDCC2015RDM@Edinburgh_interoperation_IDCC2015
RDM@Edinburgh_interoperation_IDCC2015
 
Service Integration to Enhance RDM
Service Integration to Enhance RDMService Integration to Enhance RDM
Service Integration to Enhance RDM
 
MIDESS
MIDESSMIDESS
MIDESS
 
Text and Data Visualization Introduction 2012
Text and Data Visualization Introduction 2012Text and Data Visualization Introduction 2012
Text and Data Visualization Introduction 2012
 
MetadataTheory: Metadata Standards (6th of 10)
MetadataTheory: Metadata Standards (6th of 10)MetadataTheory: Metadata Standards (6th of 10)
MetadataTheory: Metadata Standards (6th of 10)
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
 
Service integration to Enhance RDM: RSpace electronic lab notebook at the Uni...
Service integration to Enhance RDM: RSpace electronic lab notebook at the Uni...Service integration to Enhance RDM: RSpace electronic lab notebook at the Uni...
Service integration to Enhance RDM: RSpace electronic lab notebook at the Uni...
 
II-SDV 2014 Insights into a Next Generation IP and R&D Dashboard (Jeroen Klei...
II-SDV 2014 Insights into a Next Generation IP and R&D Dashboard (Jeroen Klei...II-SDV 2014 Insights into a Next Generation IP and R&D Dashboard (Jeroen Klei...
II-SDV 2014 Insights into a Next Generation IP and R&D Dashboard (Jeroen Klei...
 
2014: Treparel Big Data Text Analytics & Visualization
2014: Treparel Big Data Text Analytics & Visualization2014: Treparel Big Data Text Analytics & Visualization
2014: Treparel Big Data Text Analytics & Visualization
 
Building Data Ecosystems for Accelerated Discovery
Building Data Ecosystems for Accelerated DiscoveryBuilding Data Ecosystems for Accelerated Discovery
Building Data Ecosystems for Accelerated Discovery
 
NISO/DCMI Webinar: Metadata for Public Sector Administration
NISO/DCMI Webinar: Metadata for Public Sector AdministrationNISO/DCMI Webinar: Metadata for Public Sector Administration
NISO/DCMI Webinar: Metadata for Public Sector Administration
 
Session 2.1 ontological representation of the telecom domain for advanced a...
Session 2.1   ontological representation of the telecom domain for advanced a...Session 2.1   ontological representation of the telecom domain for advanced a...
Session 2.1 ontological representation of the telecom domain for advanced a...
 
OpenKM commercial
OpenKM commercialOpenKM commercial
OpenKM commercial
 
Treparel - KMX Patent Analytics 2014
Treparel - KMX Patent Analytics 2014Treparel - KMX Patent Analytics 2014
Treparel - KMX Patent Analytics 2014
 
Crossing the Analytics Chasm and Getting the Models You Developed Deployed
Crossing the Analytics Chasm and Getting the Models You Developed DeployedCrossing the Analytics Chasm and Getting the Models You Developed Deployed
Crossing the Analytics Chasm and Getting the Models You Developed Deployed
 
Text Mining
Text MiningText Mining
Text Mining
 
Intake 38 data access 4
Intake 38 data access 4Intake 38 data access 4
Intake 38 data access 4
 
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
 
Jisc Research Data Shared Service - Spring Update
Jisc Research Data Shared Service - Spring UpdateJisc Research Data Shared Service - Spring Update
Jisc Research Data Shared Service - Spring Update
 

Mais de Dr. Haxel Consult

AI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering ManagementAI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering ManagementDr. Haxel Consult
 
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...Dr. Haxel Consult
 
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...Dr. Haxel Consult
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...Dr. Haxel Consult
 
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...Dr. Haxel Consult
 
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...Dr. Haxel Consult
 
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...Dr. Haxel Consult
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...Dr. Haxel Consult
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...Dr. Haxel Consult
 
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...Dr. Haxel Consult
 
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...Dr. Haxel Consult
 
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...Dr. Haxel Consult
 
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...Dr. Haxel Consult
 
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...Dr. Haxel Consult
 
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...Dr. Haxel Consult
 
AI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance CenterAI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance CenterDr. Haxel Consult
 
AI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOCAI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOCDr. Haxel Consult
 
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...Dr. Haxel Consult
 
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...Dr. Haxel Consult
 

Mais de Dr. Haxel Consult (20)

AI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering ManagementAI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
 
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
 
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
 
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
 
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
 
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...
 
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
 
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
 
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
 
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
 
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
 
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
 
AI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance CenterAI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance Center
 
AI-SDV 2022: Lighthouse IP
AI-SDV 2022: Lighthouse IPAI-SDV 2022: Lighthouse IP
AI-SDV 2022: Lighthouse IP
 
AI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOCAI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOC
 
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
 
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
 

Último

Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa494f574xmv
 
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一Fs
 
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一Fs
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Sonam Pathan
 
Magic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMagic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMartaLoveguard
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作ys8omjxb
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxDyna Gilbert
 
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书rnrncn29
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书rnrncn29
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书zdzoqco
 
NSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentationNSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentationMarko4394
 
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一Fs
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)Christopher H Felton
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITMgdsc13
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhimiss dipika
 
Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Excelmac1
 
Elevate Your Business with Our IT Expertise in New Orleans
Elevate Your Business with Our IT Expertise in New OrleansElevate Your Business with Our IT Expertise in New Orleans
Elevate Your Business with Our IT Expertise in New Orleanscorenetworkseo
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predieusebiomeyer
 

Último (20)

Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa
 
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
 
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
 
Magic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMagic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptx
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptx
 
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
 
NSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentationNSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentation
 
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
 
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITM
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhi
 
Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...
 
Elevate Your Business with Our IT Expertise in New Orleans
Elevate Your Business with Our IT Expertise in New OrleansElevate Your Business with Our IT Expertise in New Orleans
Elevate Your Business with Our IT Expertise in New Orleans
 
Hot Sexy call girls in Rk Puram 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in  Rk Puram 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in  Rk Puram 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Rk Puram 🔝 9953056974 🔝 Delhi escort Service
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predi
 

II-SDV 2013 Large scale Application of Text Mining and Visualization in the EU Fusepool Project

  • 1. Treparel Delftechpark 26 2628 XH Delft The Netherlands www.treparel.com Text Analytics in the EU Fusepool project II-SDV ’13 - Nice Anton Heijs CTO anton@treparel.com April 15, 2013
  • 2. Agenda   •  Introduc.on  of  the  EU  Fusepool  project,  Treparel  and  the   consor.um  partners   •  Background  on  objec.ves  of  the  Fusepool  project   •  User  adap.ve  system   •  Data  pooling  and  linking   •  Large  scale  machine  learning  and  text  analy.cs   •  Summary   Treparel KMX – All rights reserved 2013 2www.treparel.com
  • 3. The  EU  Fusepool  Consor.um   Treparel (Delft, The Netherlands) is a global software provider in Big Data Text Analytics and Visualization. Global companies, government agencies, software vendors or data publishers are using Treparel KMX text analysis software to gain faster, reliable, precise insights in large complex unstructured data sets (like application notes, blogs, email and patents) allowing them to make better informed decisions. As part of the Fusepool consortium Treparel integrates the KMX Text Analytics technology for advanced classification, clustering and visualization of large complex document collections. Treparel KMX – All rights reserved 2013 3www.treparel.com
  • 4. Each  partner  provides  cri.cal  building  blocks:     BUAS:  Dynamic  user  interfaces     ENOLL:  End  users  with  specific  needs     GEOX:  NER,  UI  design     SEARCHBOX:  text  search  and  similarity  seman.c   matching     TREPAREL:  text/patent  classifica.on  engines     XEROX:  user-­‐adap.ve  learning  know-­‐how              BUT  BREAK-­‐THROUGH  BY  INTEGRATION   The  Big  Picture   Treparel KMX – All rights reserved 2013 4www.treparel.com
  • 5. WP4 Find & match WP5 GUI & Visuals WP2 Platform, sourcing WP3 Extract & enrich WP1 Reqs & Testing WP6 User Involvement Learning Learning Learning LearningFeedback Work  packages   The EC Fusepool project is a 2 year EU project and started in July 2012 Treparel KMX – All rights reserved 2013 5www.treparel.com
  • 6. Fusing  and  pooling  informa.on     for  product  development   Vision: User-adaptive system Living Lab: Rapid app development Data processing: Sourcing & interlinking Machine learning: Matching & optimizing Integrated Use Cases Treparel KMX – All rights reserved 2013 6www.treparel.com
  • 7. Background   •  SMEs  have  a  need  for  technology  intelligence  for  detec.ng   and  responding  to  opportuni.es  and  threats   •  This  a  partly  driven  by  growth  and  complexity  of   patents  and  lawsuits   •  Consumer  intelligence  to  detect  opinions  and  needs  of   consumers  for  product  development   •  Open  innova.on  requiring  coopera.on  (links  between  data,   e.g.  finding  business  partners)   •  Focus:  Machine  Learning  algorithms  to  improve  matching     Treparel KMX – All rights reserved 2013 7www.treparel.com
  • 8. User-­‐adap.ve  system   •  Focus:  monitor  and  learn  specific  needs  and  preferences   of  a  user  to  align  features,  func.onali.es,  and  graphical   interfaces   •  AdapDve:  machine  learning  from  crowd-­‐sourcing  (rather   than  rule-­‐based)   •  User-­‐aligned  prioriDzaDon:  more  usable  and  customized   interfaces,  sugges.ons  based  on  ac.vity  &  user  feedback   Treparel KMX – All rights reserved 2013 8www.treparel.com
  • 9. User-­‐adap.ve  matching   •  Main  goal:  automated  user-­‐adap.ve  matching  of  users   to:   –  Patent  analysis   –  Finding  funding  opportuni.es   –  Partner  matching   •  Key  asset:  informa.on  provided  by  the  user     •  User  Data  Credo:    accuracy  improves  with  quan.ty  and   quality  of  user  data  while  variety  (breadth)  increases   with  number  of  users   •  Living  lab:  Co-­‐crea.on  between  creators  and  consumers   of  the  Fusepool  plaWorm   Treparel KMX – All rights reserved 2013 9www.treparel.com
  • 10. Data  sourcing   •  Sources:  internal  &  external  content  from  web  harves.ng   and  structured  data  sources   –  using  content  databases  and  linked  open  data   •  Scope:  ini.al  data  corpus  includes  all  explicitly  in-­‐  and   excluded  sources   •  Gained  value  from  InformaDon:  recommenda.ons   based  on  machine  learning  from  feedback   Treparel KMX – All rights reserved 2013 10www.treparel.com
  • 11. Data  handling   1.  Text  analysis  and  feature  extracDon:  ML  &  NLP  methods   for  categorizing,  named  en.ty  extrac.on,  etc.   2.  Shared  metadata  models:  mapping  text  features  to   exis.ng/custom  ontologies  and  genera.on  of  seman.c   triplets   →  High-­‐level  abstrac.on  &  persistence  for  reuse   →  Lightweight  storage:  mostly  metadata  only,  text   indexing  and  abstrac.on  uses  schema-­‐free  key-­‐ value  (enabling  ac.onable  facets)   Treparel KMX – All rights reserved 2013 11www.treparel.com
  • 12. Data  interlinking   •  Contextualize:  terms  are  interlinked  with  same  and   similar  terms  across  sources:   –  Enrich  the  extracted  content  with  exis.ng  informa.on  available   in  the  Internet   –  Interlink  as  much  informa.on  as  possible  to  increase  the  value  of   knowledge  extrac.on   –  Use  available  public  sector  resources  in  Seman.c  Web  and  LOD   format   •  Metadata:  when  a  user  uploads  texts  to  be  matched  with   other  content,  only  the  metadata  descriptors  are   transmiYed   •  Data  privacy:  data  fusion  from  diverse  sources  without   endangering  user  privacy   Treparel KMX – All rights reserved 2013 12www.treparel.com
  • 13. Searching  &  finding   •  Key  search-­‐oriented  features:   –  Search  through  all  content  in  the  data  pool   –  Faceted  search  (categories,  metadata,  en..es)   –  Integra.on  of  Linked  Open  Data  (LOD)  results   –  Cross-­‐lingual  indexing  and  cross-­‐referencing   –  “Did  you  mean?”-­‐func.onality  in  case  of  typos  and  auto-­‐ comple.on  of  search  queries   •  User-­‐adapDve:  indexing  and  integra.on  based  on  users   needs  (e.g.  user  profiling)   Treparel KMX – All rights reserved 2013 13www.treparel.com
  • 14. Adapta.on  &  refinement   •  AdapDve  search:  results  are  aligned  to  user  preferences   based  on  analysis  of  user  implicit  and  explicit  feedback     •  MulD-­‐task  ranking:  good  trade-­‐off  between  user-­‐ independent  search  (high  coverage  but  lower  precision)   and  a  very  customized  approach   •  Query  intent  discovery:  analysis  of  the  query  structure   and  interlinking  of  queries   Treparel KMX – All rights reserved 2013 14www.treparel.com
  • 15. Correla.ng  &  matching   •  Search  guided  navigaDon:  seman.c  matching  extracts   contextual  rela.onships  to  list  related  content   –  sugges.ons  organized  by  categories   –  exposing  facets  within  related  content   •  Distributed  rule  and  event  model:  defines  states,   ac.ons,  and  consequences  (e.g.  no.fica.ons,   visualiza.ons)  for  reasoning  based  on  light-­‐weight   ontologies   Treparel KMX – All rights reserved 2013 15www.treparel.com
  • 16. Crowd  sourcing  &     supervised  automa.on   •  RelaDonal  learning:  related  instances  are  used  to  reason   about  the  focal  instance   –  Ra.onality  of  content  (links  to  other  content,  people,  etc.)   provide  rich  informa.on   –  Similari.es/dissimilari.es  to  other  content  is  established  purely   on  rela.onal  proper.es   Treparel KMX – All rights reserved 2013 16www.treparel.com
  • 17. Visualiza.on   Clustering   Classifica.on   Text  Preprocessing  and  Indexing   Acquire  documents   Present  Results   Taxonomies,   Ontologies   Seman.c  Analysis   Document  level  analysis   using  the  KMX  technology     KMX  unique  func.ons:   •  Extract  concepts  in  context   using  clustering  and   classifica.on  of  documents   •  Use  classifica.on  to  create   ranked  lists  and  to  tag  subsets   •  Support  of  binary  and  mul.-­‐ class  Classifica.on   •  Integra.on  with  other   applica.ons  through  KMX  API   Treparel KMX – All rights reserved 2012 www.treparel.com 17 Query & Search Tools
  • 18. Treparel KMX – All rights reserved 2013 18www.treparel.com NER  in  the  landscaping    
  • 19. Sentence  level  analysis  using     Named  En.ty  Recogni.on   •  The  aim  of  NER  is  to  iden.fy  en..es  in  unstructured  text   documents.   o  To  locate  :  mark-­‐up  the  en..es   o  To  classify  :  into  predefined  categories/domains   •  Aim  of  usage   o  To  recognise  trends  (trained  and  new  high  frequency)   o  To  find  all  „trained”  en..es   o  To  „discover”  new  en.tes   •  NER  approaches   o  Sta.s.cs  based  (supervised  machine  learning)   o  Rule  based  (regular  expressions)   Treparel KMX – All rights reserved 2013 19www.treparel.com
  • 20. Building  a  NER  model   Treparel KMX – All rights reserved 2013 20www.treparel.com
  • 21. 3  NER  model  examples     Training  text:  500  patents  from  EPO     Training:  model  building  by  Stanford  NER   •  LTE  (long  term  evoluDon)   •  F1:  88%   •  New  en..es:  „GSM  BSS”,  „LTE  TDD”   •  False  pos:  „Loca.on  Area  LA”   •  Elements   •  F1:  98%   •  False  pos:  „argon/hydrogen”   •  Cancer   •  F1:  87%   •  New  en..es:  „myeloma  cell”,  „tumor  .ssue”   •  False  pos:  „cell  mortality”,  „the  test  compound”   Treparel KMX – All rights reserved 2013 21www.treparel.com
  • 22. Using  NER  in  a  GUI   Treparel KMX – All rights reserved 2013 22www.treparel.com
  • 23. Raw tex Extract  terms  indica.ng   domains  using  patent   classifica.on  codes   Content  /  En.ty   Hub   Generate  Vector  Space  Model   Document Vectors Generate  Classifiers  to  es.mate  domains   Annotated tex with domain labels Domain labels Training Vectors DocId : 1122 Title : Pesticide device Classification Code : A61B Text: Hihaho Etc etc Table A61 : pesticide A61B : pesticide solvent Etc etc Many Vectors Doc-Id : 1122 750 terms + weights Classifier  on  pes.cide  solvents   25 Positive vectors Doc-Id’s Label 750 terms + weights 25 negative vectors Doc-Id’s Label 750 terms + weights Doc-Id : 1122 Labels + scores Title : Pesticide device Classification Code : A61B Text: Hihaho …Etc etc 1 2 3 4 5 6 7 8 9 10   11   Combining  text  analysis   approaches   Treparel KMX – All rights reserved 2013 23www.treparel.com
  • 24. •  Data  sourcing:  retrieves  data  from  data  sources  via  data   integrator   •  Storage:  raw  (e.g.  text)  or  processed  data  stored  in   database  or  triple  store   •  Using  ML  to  enable  learning  from  the  crowd   •  Push  &  Pull:       interface  to  consumers  (web  and  mobile  apps)     suppor.ng  quality  and  access  control  from  portal   •  Portal:  business  logic,  storage  of  registered  data  sources,   access  control  using  web  frontend   Somware  architecture   Treparel KMX – All rights reserved 2013 24www.treparel.com
  • 25. Open  Call  for  Users   •  ApplicaDons  received  from  Finland,  Germany,  Spain,  Greece,   Bulgaria,  Hungary,  Italy,  UK,  Belgium,  China,  Switzerland,  France,   Denmark,  Portugal,  Ireland  and  The  Netherlands     •  Business  areas  covered:     –  Bio-­‐medical,   –  Pharma  and  biotech,   –  ICT/Telecommunica.ons,     –  Digital  media,     –  Renewable  energies,     –  Educa.on,     –  Innova.on/consultancy  services  for  SMEs.     •  Mix  of  profiles  from  SMEs,  Research,  SME  intermediaries   (incubators,  Science  parks,  Living  Labs,  etc),  developers  and  more   •  MulDple  areas  of  research,  development  and  innova.on  areas  are   covered.   25Treparel KMX – All rights reserved 2013 25www.treparel.com
  • 26. •  Data  as  a  service:  scale  economies  of  scale  in  management  of  data     •  Data  pooling:  processes  need  .mely  aggrega.on  and  redistribu.on   of  diverse  data  but  building  own  is  redundant  and  prohibi.ve  for   SMEs     provide  services  on  top  of  pool  with  high  quality  data     provide  access  to  services  on  demand   •  Success  criteria:     Early  provision  of  scalable  basic  Fusepool  services     SME  involvement  and  uptake  of  Fusepool  services     Machine  learning  for  data  cura.on  &  user  adapta.on   •  Required  steps:     Stepwise  integra.on  of  exis.ng  &  new  technologies     Early  and  ongoing  feedback  from  end  users   EC  perspec.ve   Treparel KMX – All rights reserved 2013 26www.treparel.com
  • 27. The  EC  FP  7  program  Fusepool  is  all  about:     •  Building  a  plaWorm  with  web  enabled  services  for   •  Data  pooling   •  Large  scale  text  analysis   •  Large  scale  machine  learning  of  user  input   •  Enable  SME’s  with  analy.cs  for  improving  their   innova.on  and  compe..ve  strengths  using   •  SME’s  involvement  and  feedback  to  the  Fusepool  services   •  Machine  learning  for  data  cura.on  &  user  adapta.on   Summary   Treparel KMX – All rights reserved 2013 27www.treparel.com Welcome to visit Fusepool at: www.fusepool.net