The slide deck from the October 29, 2015 webinar "Metadata Enrichment in Publishing: Boosting Productivity and Increasing User Engagement" presented by Ilian Uzunov and Georgi Georgiev.
Boost Productivity and User Engagement with Metadata Enrichment
1. Metadata Enrichment in Publishing: Boosting Productivity and Increasing User Engagement
Ontotext webinar series – 29th of October 2015
11am ET | 10am CT | 8am PT | 1500 UTC
2. Agenda
• Company info
• Metadata challenges
• Ontotext approach to metadata enrichment – Dr. Georgiev
• Live demos
• Wrap up + QnA
Oct 2015Metadata Enrichment in Publishing 2
3. Company essentials
• One-stop shop for semantic technology
− Text Analytics + Content Enrichment + Search + Graph Database Engine
− Over 400 person-years in R&D
• Started in year 2000
− As R&D lab within Sirma – the biggest Bulgarian software company
• Spun off and took VC investment in 2008
• 70 staff, growing revenue, profitable
− HQ in Sofia, Bulgaria, offices in London and NYC
4. Company essentials
Why? enable better search, analytics and content delivery
What? data and content management technology
How? semantic analysis of text, NoSQL graph database
Best for: content publishing and information discovery
6. What others are saying about metadata…
• "Metadata Matters" Bloomberg white paper, 2014
"Metadata was once purely the domain of the tech team, today it affects everyone in
media and publishing organizations, from editorial team members to business leaders."
• John O'Donovan, Former CTO of Financial Times
"Everyone forgets about metadata. All your assets are useless to you unless you have
metadata – your archive is full of stuff that is of no value because you can’t find it and
don’t know what it’s about."
• A manifesto on metadata, Thad McIlroy
The only way to make publishing's great content discoverable is "via rich metadata linked
into smart search systems."
7. Types of metadata
• Structural metadata
• Technical metadata
• Descriptive metadata
• Administrative metadata
• Rights metadata
• Commercial metadata
8. Technical metadata
• Examples of enrichment of technical
metadata
“Tagasauris saw the potential for a
semantic engine like GraphDB to add
value and intelligence to their tagging
services.”
9. Descriptive metadata
• Examples of enrichment of descriptive
metadata
“With semantic metadata you can
describe your content in terms of the
locations, people, organizations, brands,
etc. that the content is about.”
10. The risk of neglecting your metadata
“The failure to adequately account for each type of
metadata can affect a publishing company’s ability to
efficiently create, store, find, access and publish content
of all types!”
"Metadata Matters" Bloomberg white paper, 2014
11. Why do we need to care more about metadata?
• Metadata drives content organization,
workflow optimization and automation
• It’s economically smarter to tie assets
together throughout the supply chain with
metadata
14. Why do we need to care more about metadata?
• A strong metadata system can power
centralized searches, helping find assets
across data silos
• If an asset can be easily found, it can be
reused and repurposed more easily
15. Easily discover all your content assets
16. Metadata can empower automation
• Many publishers still rely on big teams of editors to
manage their digital offerings
• Yet a single editor could aggregate all content that relates
to a particular topic and, with appropriate metadata
tagging, that could happen automatically
17. Use case: BBC
• Goals
− Create a dynamic semantic publishing
platform that assembles web pages
on-the-fly using a variety of data sources
− Deliver highly relevant data to web site
visitors with sub-second response
• Challenges
− BBC journalists author and publish
content which is then statically
rendered. The costs and time to do this
were high
− Diverse content was difficult to
navigate, content re-use was not
flexible
"The goal is to be able to more easily and accurately
aggregate content, find it and share it across many
sources. From these simple relationships and
building blocks you can dynamically build up
incredibly rich sites and navigation on any platform."
John O’Donovan, Chief Technical Architect, BBC
18. Metadata can greatly enhance discoverability
• Unified metadata across multiple data silos can provide
a universal search solution for editors looking for
specific content – internal usage and search
• Publishers need to take full advantage of traffic from
search to better expose and potentially monetize their
content
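As a minimal sketch of the cross-silo search idea on this slide, the Python below builds one tag index over assets held in separate systems. The silo names, asset IDs and tags are all hypothetical, and a production system would use a real search engine rather than an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical asset records from three separate silos; all names are illustrative.
SILOS = {
    "cms": [{"id": "cms-1", "title": "Budget vote", "tags": ["politics", "economy"]}],
    "photo_archive": [{"id": "img-7", "title": "Parliament at night", "tags": ["politics"]}],
    "video_library": [{"id": "vid-3", "title": "Market report", "tags": ["economy"]}],
}


def build_index(silos):
    """Map each metadata tag to the (silo, asset id) pairs that carry it."""
    index = defaultdict(list)
    for silo_name, assets in silos.items():
        for asset in assets:
            for tag in asset["tags"]:
                index[tag.lower()].append((silo_name, asset["id"]))
    return index


def search(index, tag):
    """Return every asset tagged with `tag`, regardless of which silo holds it."""
    return index.get(tag.lower(), [])


index = build_index(SILOS)
politics_hits = search(index, "Politics")
print(politics_hits)
```

The point of the sketch is that one shared tag vocabulary is what lets a single lookup reach the article and the photo at once, even though they live in different systems.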
19. Use case: EuroMoney
• Goals
− Create a horizontal platform to
serve 100 different publications
− Platform which would include
the latest authoring, storing, and
delivery technologies, including
semantic annotation, search and
a triple store repository
• Challenges
− Multiple domains covered
− Sophisticated content analytics
including relation, template and
scenario extraction
20. Metadata can drive user engagement
• Recommendation engines also rely on
metadata to suggest content to users
• So, ultimately, the better structured and
more accurate the metadata, the more likely
recommended content is to be highly relevant
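To illustrate how metadata can feed a recommender, here is a small sketch that ranks articles by the overlap of their tag sets (Jaccard similarity). This is one simple technique chosen for illustration, not a description of any engine mentioned in the deck; the article IDs and tags are made up.

```python
def jaccard(a, b):
    """Overlap between two tag sets: |a & b| / |a | b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(current_id, catalog, top_n=2):
    """Rank other articles by tag overlap with the article being read."""
    current_tags = catalog[current_id]
    scored = [
        (jaccard(current_tags, tags), article_id)
        for article_id, tags in catalog.items()
        if article_id != current_id
    ]
    # Highest score first; ties broken by article id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [article_id for score, article_id in scored[:top_n] if score > 0]


# Hypothetical catalog: article id -> set of descriptive metadata tags.
CATALOG = {
    "a1": {"ECB", "interest rates", "eurozone"},
    "a2": {"ECB", "eurozone", "Greece"},
    "a3": {"football", "Premier League"},
    "a4": {"interest rates", "Federal Reserve"},
}

related = recommend("a1", CATALOG)
print(related)
```

With sparse or inconsistent tags the overlap scores collapse toward zero, which is the practical reason the slide ties recommendation quality to metadata quality.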
21. Use case: Financial Times
• Goals
− Create a horizontal platform for
both data and content based on
semantics and serve all functionality
through this platform
• Challenges
− Critical part of FT.COM
− Personalized recommendation
based on user behavior and
semantic context (Related Reads)
22. Ontotext value proposition
• Make Text and Data Tango Together!
− Today there is an artificial divide between text and data
− Semantic technology removes the divide and brings them together
• We interlink text and data to unveil their meaning
− Large knowledge graphs help text-mining!
− Interlinking text and data allows us to add context and meaning to both
• We deliver unmatched search and exploration
− Across all sorts of data and at a fraction of the cost of alternative approaches
23. ONTOTEXT Technology: Analyzing Text
• Full spectrum of NLP
capabilities
• Semantic indexing
− Tag references with entity IDs
− Generate semantic metadata
descriptions of documents
− Store metadata in GraphDB
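The semantic indexing steps above can be sketched in a few lines of Python. This is a toy dictionary (gazetteer) lookup, assumed purely for illustration: the entity IDs are invented, real pipelines add disambiguation and machine-learned models, and the final step of storing the record in a graph database is left out because it depends on the store's own API.

```python
import re

# Hypothetical gazetteer: surface form -> entity ID. The IDs are made up.
GAZETTEER = {
    "financial times": "org:FT",
    "bbc": "org:BBC",
    "london": "loc:London",
}


def tag_entities(text, gazetteer):
    """Return annotations (start, end, surface text, entity id) sorted by offset."""
    annotations = []
    for surface, entity_id in gazetteer.items():
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append((match.start(), match.end(), match.group(0), entity_id))
    return sorted(annotations)


def describe(doc_id, text, gazetteer):
    """Build a document-level metadata record listing the distinct entities mentioned."""
    annotations = tag_entities(text, gazetteer)
    return {
        "document": doc_id,
        "mentions": sorted({entity_id for _, _, _, entity_id in annotations}),
        "annotations": annotations,
    }


TEXT = "The BBC and the Financial Times are based in London."
record = describe("doc-1", TEXT, GAZETTEER)
print(record["mentions"])
```

Keeping character offsets alongside the entity IDs means the same record can drive both inline highlighting in the text and document-level search.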
24. ONTOTEXT Technology: Interlinking Text and Data
• Use large knowledge
graphs for text analysis
• Semantic annotation and
search
− Combine structured database queries
with full-text search and inference
We make sense of text and data by
linking and interpreting them together
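A rough sketch of what combining a structured query, full-text search and inference can look like, using plain Python sets instead of a real graph database. The triples, predicate names and documents are hypothetical; in practice this kind of question would be expressed as a query against the triple store.

```python
# Hypothetical knowledge graph as (subject, predicate, object) triples.
TRIPLES = {
    ("org:Ontotext", "type", "Company"),
    ("org:Ontotext", "locatedIn", "loc:Sofia"),
    ("org:Sirma", "type", "Company"),
    ("org:Sirma", "locatedIn", "loc:Sofia"),
    ("loc:Sofia", "partOf", "loc:Bulgaria"),
    ("org:BBC", "type", "Company"),
    ("org:BBC", "locatedIn", "loc:London"),
}

# Hypothetical documents: raw text plus the entity IDs tagged in them.
DOCUMENTS = {
    "d1": {"text": "Ontotext took venture investment in 2008.", "entities": {"org:Ontotext"}},
    "d2": {"text": "The BBC rebuilt its sport site.", "entities": {"org:BBC"}},
    "d3": {"text": "Sirma opened a new office.", "entities": {"org:Sirma"}},
}


def located_in(entity, region, triples):
    """Inference step: follow locatedIn, then partOf transitively, up to `region`."""
    frontier = {o for s, p, o in triples if s == entity and p == "locatedIn"}
    seen = set()
    while frontier:
        place = frontier.pop()
        if place == region:
            return True
        if place in seen:
            continue
        seen.add(place)
        frontier |= {o for s, p, o in triples if s == place and p == "partOf"}
    return False


def hybrid_search(keyword, region, documents, triples):
    """Documents containing `keyword` that mention a company located in `region`."""
    companies = {s for s, p, o in triples if p == "type" and o == "Company"}
    in_region = {c for c in companies if located_in(c, region, triples)}
    return sorted(
        doc_id
        for doc_id, doc in documents.items()
        if keyword.lower() in doc["text"].lower() and doc["entities"] & in_region
    )


hits = hybrid_search("investment", "loc:Bulgaria", DOCUMENTS, TRIPLES)
print(hits)
```

Note that no document mentions Bulgaria in its text: the match comes from the graph (Sofia is part of Bulgaria), which is the kind of result neither keyword search nor a database query alone would return.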
36. Proven in Publishing and Other Sectors
• Application: Content production and delivery
− Helping with: authoring, enrichment, presentation, re-purposing, personalized recommendation
• Application: Information discovery
− Powerful semantic enterprise search for applications like regulation compliance and drug safety
• Valuable collection of use cases: 10+ high-profile projects
− Business news: FT, Bloomberg, Euromoney
− Scientific publishing: John Wiley & Sons, Oxford University Press, IET
− Media & content publishers: BBC, DK, Getty, Disney, ...
37. Semantic News Publishing Solution
• They have tons of great content
− That is expensive to manage and reuse
• But struggle to engage readers
− Hard to compete with social networks and
other online and mobile channels
• Solved by Ontotext
− Dynamic topic aggregation & feed generation
− Personalized recommendations
38. Scientific Publishing Solution
• They have tons of legacy static
content
− That is expensive to manage and reuse
• And struggle to monetize their
content
− Hard to compete with platform
providers and open access resources
• Solved by Ontotext
− Smarter search and recommendations
− Taxonomy, Thesauri, Vocabulary
enrichment
− Dynamic content aggregation
39. Personalized Learning Solution
• They have tons of static content
− That is expensive to manage and reuse
• And struggle to engage learners
and educators
− Hard to compete with the ed-tech companies and
free e-learning resources
• Solved by Ontotext
− Mapping of learning resources to curricula
− Dynamic content aggregation
− Personalized learning
40. Wrap up
• Unique Technology Portfolio
− Top notch RDF graph database and text-mining
− One-stop shop for content enrichment and metadata management
• End-to-end solution for Media and Publishing
− Authoring, curation and publishing through adaptive text-mining
• Proven to Deliver – we run FT.COM and BBC.CO.UK/SPORT
• Stable, Sustainable and Growing Company
41. Thank you!
Experience the technology with NOW: Semantic News Portal
http://now.ontotext.com
Try out our Semantic tagging service
http://tag.ontotext.com
Learn more at our website or simply get in touch
info@ontotext.com, @ontotext
Editor's Notes
Zoom in
Domain model design – Information architecture, a combination of the work of an architect and the knowledge of a subject matter expert.
Example KB Newz
Initial Concept Extraction Pipeline
Iterative implementation of ML-driven extraction
Continuous adaptation of the text analytics modules / concept extraction services.
Initial creation of gold standards and then adapting the ML models according to new examples and editorial feedback.