Generating Metadata by
Machine
BEA 2015
Friday, May 29, 11:30-12:20
Room 1E10
Presenters
Moderator
• Pat Payton, Senior Manager Publisher Relations, Bowker
Speakers
• Randi Park, Publishing Officer, The World Bank
• Hassan Zaidi, Digital Publishing Officer, International Monetary Fund
• Jim Bryant, CEO, Trajectory Inc.
Terminology
• Automated or Machine Indexing
– Process of assigning index terms against a set
vocabulary or taxonomy without human intervention
– Full text or bibliographic records
– Multiple vocabularies/rule sets allow for complex text
analysis
• Optical Character Recognition (OCR)
– Machine conversion of an image to text
– PDF of book content
• Extensible Markup Language (XML)
– Set of rules for encoding documents
– Both machine readable and human readable
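As an illustration of the automated indexing definition above, here is a minimal sketch that assigns index terms from a fixed vocabulary with no human intervention. The `TAXONOMY` entries, the topic names, and the function `assign_index_terms` are hypothetical and are not taken from any presenter's system.

```python
import re

# Hypothetical taxonomy: index term -> broader topic (illustrative only).
TAXONOMY = {
    "economic growth": "Macroeconomics",
    "poverty": "Poverty Reduction",
    "technical assistance": "Development Finance",
}

def assign_index_terms(text, taxonomy=TAXONOMY):
    """Return {term: topic} for every taxonomy term found in the text."""
    lowered = text.lower()
    matches = {}
    for term, topic in taxonomy.items():
        # Word boundaries keep a term from matching inside a longer word.
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            matches[term] = topic
    return matches

sample = "Partners reduce poverty and increase economic growth."
print(assign_index_terms(sample))
```

A production indexer would likely add rule sets for synonyms and context, which is where multiple vocabularies come in.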
Experience with semantic
metadata creation
Randi Park
Rpark@worldbankgroup.org
WORLD BANK PUBLICATIONS
ABOUT THE WORLD BANK
• The World Bank Group is the world’s largest
source of funding and technical assistance for
developing countries.
• Through its five institutions, the Bank Group
partners with developing countries to reduce
poverty, increase economic growth, and
improve the quality of life.
• Comprised of 188 member countries with
offices in 120 countries around the world.
Our Twin Goals
End Extreme Poverty within a Generation &
Boost Shared Prosperity
Like other publishers in some respects, but...
• Publishing arm of a larger institution, with institutional
imperatives
• Open access
– Dissemination trumps revenue
• Research is performed by in-house economists and experts in
other fields, by development practitioners working on the ground,
and by external contributors.
• Our publishing outputs are meant to enrich the development
debate, inform policies, and support the development goals of our
client countries.
We are a “Knowledge Bank”
The World Bank is the largest source of development knowledge
Popular Annuals and Flagships
Two platforms: The World Bank eLibrary and the Open Knowledge Repository (OKR)
Mobile applications
Topics we cover = 29
• Plus 5 Regions, Countries and Keywords
Metadata strategy
Primary Purpose
• Supports user-centered
discovery in WB electronic
products
• Semantic fields often exposed
and browseable
• Complemented by full text
search and filtering
• Book, chapter and article level
abstracts, topics, regions,
countries, keywords
• Books do not inherit chapter
semantics
Secondary Re-purpose
• Search and discovery services
• Aggregators
• Retail sales channels, both print
and electronic
Our experience with machine-generated metadata
Set up
• Customized our enterprise system as much as was practical
Pros
• Reasonable solution when
there is a huge corpus
• Fast throughput
• Inexpensive to run after labor-
intensive set up
• PDF source for extraction of
topics, subtopics, countries,
regions, keywords
• XML output easily
transformed
Cons
• Set up effort/cost
• Inconsistent use of keyword
terms, depending on how
they were used in the text
anti-corruption/anticorruption
decision-making/decision making
policy-making/policy making
• Abstracts must be written by
humans
• False hits due to footnotes,
references, names, etc.
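The inconsistent keyword spellings listed above (anti-corruption/anticorruption and so on) could be reconciled after extraction. The sketch below is one possible approach, not the workflow the World Bank used: it groups hyphen and spacing variants under a shared key and keeps the most frequent spelling. The function names are hypothetical.

```python
from collections import Counter

def normalize_keyword(keyword):
    """Collapse hyphen and spacing variants into one comparison key."""
    return keyword.lower().replace("-", "").replace(" ", "")

def merge_variants(keywords):
    """Group extracted keywords by key and keep the most frequent spelling."""
    groups = {}
    for kw in keywords:
        groups.setdefault(normalize_keyword(kw), Counter())[kw] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in groups.items()}

extracted = [
    "anti-corruption", "anticorruption", "anticorruption",
    "decision making", "decision-making", "decision-making",
]
print(merge_variants(extracted))
```

Stripping spaces can merge terms that are genuinely different, so a managed pick-list of preferred spellings would be a safer source of truth than frequency alone.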
Present workflow – human generated
Pros
• Book and chapter level
including abstracts
• Able to manage keyword
vocabulary using pick-lists
with additions as needed
• More accurate: author
provides book-level draft, EP
team does sense check
• New rules and terms can be
added any time with little set-
up
Cons
• Cost per book/chapter
• Capacity
• Inconsistencies between
legacy (edited machine-
generated) and newer content
to be addressed
• Single version of keywords
may not be ideal for all
channels (i.e., more keywords
for discovery services)
Future
• Interested in using technology to improve
discovery for direct users and in discovery
services
• Full text XML and ePub available for indexing
• Institutional need to implement new taxonomy
and full text search for over 200k documents
Randi Park
Rpark@worldbankgroup.org
WORLD BANK PUBLICATIONS
Introduction: IMF Publications
Objectives: Establish digital publishing program 2010-2011
• New IMF eLibrary
• Digital distribution
• Digital production
• New metadata management system
• Create metadata to a granular level (chapters and articles) ***
Digitization and Metadata Challenges
2010-2011
New Challenges – New Solutions
Manual vs. Machine
• Metadata quality
• Time factor
• Cost of labor comparison
Challenge: Cataloging to a granular level (keywords,
countries, topics and sub-topics)
New challenges – New solutions
Do the Math
IMF example:
• 12,000 titles containing 60,000 chapters/articles (assumes an
average of 5 per title),
• 15 minutes to catalog each chapter/article with keywords etc. = 15,000 hours,
• 15,000 hours / 40 hours per week = 375 weeks,
• 375 weeks / 52 = about 7 years of work for one cataloger.
If you pay just $30 per hour to a cataloger, the overall cost would be
$450,000. Not to mention new content is being created daily.
Automation allows us to slash the time it takes to catalog our
content, saving us time and money.
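The arithmetic above can be checked with a short script. All inputs come from the slide; the function name `catalog_effort` and the returned keys are made up for this sketch.

```python
def catalog_effort(titles=12_000, items_per_title=5, minutes_per_item=15,
                   hours_per_week=40, weeks_per_year=52, hourly_rate_usd=30):
    """Estimate manual cataloging effort using the slide's assumptions."""
    items = titles * items_per_title              # chapters and articles
    total_hours = items * minutes_per_item / 60   # minutes converted to hours
    weeks = total_hours / hours_per_week
    years = weeks / weeks_per_year
    cost_usd = total_hours * hourly_rate_usd
    return {
        "items": items,
        "hours": total_hours,
        "weeks": weeks,
        "years": years,
        "cost_usd": cost_usd,
    }

for name, value in catalog_effort().items():
    print(f"{name}: {value:,.1f}")
```

Changing `minutes_per_item` or `hourly_rate_usd` shows how sensitive the estimate is to the per-item cataloging time and labor rate.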
Machine in Action
Results on eLibrary
Super keywords or
specific subjects
Browsing the IMF eLibrary
Browse by Topics
Simple Search - Type a word or phrase into the
search bar at the top of every page…
…or Advanced Search allows
multiple concepts and filters
Search within results to search
within publications using a single
word or phrase.
Select Content Type (Books and
Journals/Chapters and Articles),
Countries/Region, Topics,
Languages, or Date.
Type a word in the Starts with box
to go to the first title that begins
with the word.
Sort by Title, Date, Source or
Author.
Change the number of Items per
page.
Keywords
Read on screen
in HTML
Read on a
variety of
devices
Citation
tools
Click on a title from the results page to go to the publication
landing page.
Related documents
• New IMF eLibrary was delivered in March 2011
• Digital distribution: Distribute IMF content to 35 channels
in various digital formats
• Digital production: Have an established workflow to
generate XML-based content, ePubs, Mobi and PDF ebooks
• New metadata management system: MetaLogic is a fully
functioning metadata management system
• Create metadata to a granular level
THIS INFORMATION IS PROVIDED IN CONFIDENCE AND MAY NOT BE DISCLOSED TO ANY
THIRD PARTY OR USED FOR ANY OTHER PURPOSE WITHOUT THE EXPRESS WRITTEN PERMISSION OF TRAJECTORY, INC.
Generating Metadata By Machine
BEA May 29, 2015 11:30 – 12:20
Attributes/Entities that Characterize A Book
Sentiment: Analyzing the Words Within the Book
• “Outstanding” words (5): breathtaking, thrilled, superb
• “Wow” words (4): winning, stunning
• “Happy” words (3): praise, marvelous, impressive
• “Welcome” words (2): strengthen, rich, funky
• “Yes” words (1): validate, safe, adequate
• “No” words (-1): numb, provoke, pushy
• “Upset” words (-2): worthless, travesty, threaten
• “Terrible” words (-3): woeful, worsen, kill
• “Damned” words (-4): torture, fraud, (unmentionables)
• “Catastrophic” words (-5): hell, rape, (more unmentionables)
Each word is given a numeric value based on its subjective meaning.
“Positive” words range on a positive scale; “Negative” words range on a
negative scale.
Trajectory’s Analytics Engine uses these values to compute the book’s
sentiment curve across sentence, paragraph, chapter and entire book.
This sentiment “fingerprint” at an aggregate level yields a unique
picture of the book.
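To make the idea of a sentiment curve concrete, here is a minimal sketch that scores each sentence by summing word values. The word scores are the examples shown on the slide. The summing rule, the sentence splitter, and the function names are assumptions for illustration and do not describe Trajectory's Analytics Engine.

```python
import re

# Word scores copied from the slide's examples; a real lexicon would be far larger.
LEXICON = {
    "breathtaking": 5, "thrilled": 5, "superb": 5,
    "winning": 4, "stunning": 4,
    "praise": 3, "marvelous": 3, "impressive": 3,
    "strengthen": 2, "rich": 2, "funky": 2,
    "validate": 1, "safe": 1, "adequate": 1,
    "numb": -1, "provoke": -1, "pushy": -1,
    "worthless": -2, "travesty": -2, "threaten": -2,
    "woeful": -3, "worsen": -3, "kill": -3,
    "torture": -4, "fraud": -4,
}

def sentence_score(sentence):
    """Sum the lexicon values of the words in one sentence (assumed rule)."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(LEXICON.get(word, 0) for word in words)

def sentiment_curve(text):
    """Return one score per sentence, in reading order."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [sentence_score(s) for s in sentences]

text = ("The view was breathtaking and the hotel was safe. "
        "Then the fraud began to worsen. A superb, stunning finale.")
curve = sentiment_curve(text)
print(curve)        # per-sentence scores
print(sum(curve))   # aggregate score for the whole text
```

The same per-sentence scores could be rolled up by paragraph or chapter to get coarser versions of the curve.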
Trajectory Index
Keyword Analysis and Comparison
Keyword Translation into Local Languages
Recommendations
Thank You
BEA 2015 – Booth 1347
United States:
50 Doaks Lane
Marblehead, Massachusetts
01945 United States
info@trajectory.com
www.trajectory.com
China:
No. 3, 8 ChuangYe Road
Haidian District,
Beijing, China 100085
Q & A

Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Lake Town / Independent Kolkata Call Girls Phone No 8005736733 Elite Escort S...
Lake Town / Independent Kolkata Call Girls Phone No 8005736733 Elite Escort S...Lake Town / Independent Kolkata Call Girls Phone No 8005736733 Elite Escort S...
Lake Town / Independent Kolkata Call Girls Phone No 8005736733 Elite Escort S...
 
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 

BEA 2015 Generating Metadata by Machine

  • 1. Generating Metadata by Machine BEA 2015 Friday, May 29, 11:30-12:20 Room 1E10
  • 2. Presenters Moderator • Pat Payton, Senior Manager Publisher Relations, Bowker Speakers • Randi Park, Publishing Officer, The World Bank • Hassan Zaidi, Digital Publishing Officer, International Monetary Fund • Jim Bryant, CEO, Trajectory Inc.
  • 3. Terminology • Automated or Machine Indexing – Process of assigning index terms against a set vocabulary or taxonomy without human intervention – Full text or bibliographic records – Multiple vocabularies/rule sets allow for complex text analysis • Optical Character Recognition (OCR) – Machine conversion of an image to text – PDF of book content • Extensible Markup Language (XML) – Set of rules for encoding documents – Both machine readable and human readable
  • 4. Experience with semantic metadata creation Randi Park Rpark@worldbankgroup.org WORLD BANK PUBLICATIONS
  • 5. About the World Bank • The World Bank Group is the world’s largest source of funding and technical assistance for developing countries. • Through its five institutions, the Bank Group partners with developing countries to reduce poverty, increase economic growth, and improve the quality of life. • Comprises 188 member countries, with offices in 120 countries around the world. Our Twin Goals: End Extreme Poverty within a Generation & Boost Shared Prosperity
  • 6. Like other publishers in some respects, but... • Publishing arm of a larger institution, with institutional imperatives • Open access – Dissemination trumps revenue • Research is performed by in-house economists and experts in other fields, by development practitioners working on the ground, and by external contributors. • Our publishing outputs are meant to enrich the development debate, inform policies, and support the development goals of our client countries. We are a “Knowledge Bank”: The World Bank is the largest source of development knowledge
  • 9. Two platforms: The World Bank eLibrary and the Open Knowledge Repository (OKR)
  • 11. Topics we cover = 29 • Plus 5 regions, countries, and keywords
  • 12. Metadata strategy Primary Purpose • Supports user-centered discovery in WB electronic products • Semantic fields often exposed and browsable • Complemented by full text search and filtering • Book, chapter and article level abstracts, topics, regions, countries, keywords • Books do not inherit chapter semantics Secondary Re-purpose • Search and discovery services • Aggregators • Retail sales channels, both print and electronic
  • 13. Our experience with machine-generated metadata Set up • Customized our enterprise system as much as was practical Pros • Reasonable solution when there is a huge corpus • Fast throughput • Inexpensive to run after labor-intensive setup • PDF source for extraction of topics, subtopics, countries, regions, keywords • XML output easily transformed Cons • Setup effort/cost • Inconsistent keyword terms, depending on how they were used in the text (anti-corruption/anticorruption, decision-making/decision making, policy-making/policy making) • Abstracts must be written by humans • False hits due to footnotes, references, names, etc.
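The keyword inconsistency listed under Cons (anti-corruption/anticorruption and similar pairs) can be illustrated with a minimal sketch. It assumes that variants differ only by case, hyphenation, or spacing; the function names `normalize_keyword` and `group_variants` are hypothetical and are not part of the World Bank's system.

```python
import re
from collections import defaultdict


def normalize_keyword(term: str) -> str:
    """Collapse case, hyphens, and whitespace so spelling variants share one key."""
    return re.sub(r"[\s\-]+", "", term.lower())


def group_variants(terms):
    """Group extracted keywords that differ only by hyphenation or spacing."""
    groups = defaultdict(set)
    for term in terms:
        groups[normalize_keyword(term)].add(term)
    return dict(groups)


# Variant pairs taken from the slide.
extracted = [
    "anti-corruption", "anticorruption",
    "decision-making", "decision making",
    "policy-making", "policy making",
]
variants = group_variants(extracted)
for key, forms in sorted(variants.items()):
    print(key, "->", sorted(forms))
```

Grouping by a normalized key only flags candidate duplicates; choosing the preferred form for each group would still be an editorial decision, for example via a pick-list.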
  • 15. Present workflow: human-generated Pros • Book and chapter level including abstracts • Able to manage keyword vocabulary using pick-lists with additions as needed • More accurate: author provides book-level draft, EP team does sense check • New rules and terms can be added any time with little setup Cons • Cost per book/chapter • Capacity • Inconsistencies between legacy (edited machine-generated) and newer content to be addressed • Single version of keywords may not be ideal for all channels (i.e., more keywords for discovery services)
  • 16. Future • Interested in using technology to improve discovery for direct users and in discovery services • Full text XML and ePub available for indexing • Institutional need to implement new taxonomy and full text search for over 200k documents
  • 18. Introduction: IMF Publications Objectives: Establish digital publishing program 2010-2011 • New IMF eLibrary • Digital distribution • Digital production • New metadata management system • Create metadata to a granular level (chapters and articles) ***
  • 19. Digitization and Metadata Challenges 2010-2011
  • 20. Digitization and Metadata Challenges: 2010-2011
  • 21. New Challenges – New Solutions Manual vs. Machine • Metadata quality • Time factor • Cost of labor comparison Challenge: Cataloging to a granular level (keywords, countries, topics and sub-topics)
  • 22. New Challenges – New Solutions Do the Math IMF example: • 12,000 titles containing 60,000 chapters/articles (assumes an average of 5 per title) • 15 minutes to catalog each chapter/article with keywords, etc. • 60,000 × 15 minutes = 15,000 hours • 15,000 hours / 40 hours per week = 375 weeks • 375 weeks / 52 ≈ 7 years of work for one cataloger. At just $30 per hour for a cataloger, the overall cost would be $450,000, and new content is being created daily. Automation lets us slash the time it takes to catalog our content, saving both time and money.
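The arithmetic on this slide can be checked with a short sketch. The default values are the figures quoted on the slide; the function name `cataloging_estimate` and its parameter names are illustrative assumptions.

```python
def cataloging_estimate(
    titles=12_000,
    items_per_title=5,      # average chapters/articles per title
    minutes_per_item=15,    # manual cataloging time per chapter/article
    hours_per_week=40,
    weeks_per_year=52,
    hourly_rate_usd=30,
):
    """Estimate the manual cataloging effort for one cataloger."""
    items = titles * items_per_title
    hours = items * minutes_per_item / 60
    weeks = hours / hours_per_week
    years = weeks / weeks_per_year
    cost_usd = hours * hourly_rate_usd
    return {
        "items": items,
        "hours": hours,
        "weeks": weeks,
        "years": years,
        "cost_usd": cost_usd,
    }


est = cataloging_estimate()
print(f"{est['items']:,} items, {est['hours']:,.0f} hours, "
      f"{est['weeks']:,.0f} weeks, {est['years']:.1f} years, "
      f"${est['cost_usd']:,.0f}")
```

Changing `minutes_per_item` or `hourly_rate_usd` shows how sensitive the total is to the per-item assumptions.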
  • 26. Results on eLibrary Super keywords or specific subjects
  • 27. Browsing the IMF eLibrary
  • 30. Simple Search - Type a word or phrase into the search bar at the top of every page… …or Advanced Search allows multiple concepts and filters
  • 31. Search within results to search within publications using a single word or phrase. Select Content Type (Books and Journals/Chapters and Articles), Countries/Region, Topics, Languages, or Date. Type a word in the Starts with box to go to the first title that begins with the word. Sort by Title, Date, Source or Author. Change the number of Items per page. Keywords
  • 32. Read on screen in HTML Read on a variety of devices Citation tools Click on a title from the results page to go to the publication landing page.
  • 37. • New IMF eLibrary was delivered in March 2011 • Digital distribution: IMF content is distributed to 35 channels in various digital formats • Digital production: an established workflow generates XML-based content, ePub, Mobi, and PDF ebooks • New metadata management system: MetaLogic is a fully functioning metadata management system • Create metadata to a granular level
  • 38. ™ THIS INFORMATION IS PROVIDED IN CONFIDENCE AND MAY NOT BE DISCLOSED TO ANY THIRD PARTY OR USED FOR ANY OTHER PURPOSE WITHOUT THE EXPRESS WRITTEN PERMISSION OF TRAJECTORY, INC. Generating Metadata By Machine BEA May 29, 2015 11:30 – 12:20
  • 39. Attributes/Entities that Characterize a Book
  • 40. Sentiment: Analyzing the Words Within the Book • “Outstanding” words (5): breathtaking, thrilled, superb • “Wow” words (4): winning, stunning • “Happy” words (3): praise, marvelous, impressive • “Welcome” words (2): strengthen, rich, funky • “Yes” words (1): validate, safe, adequate • “No” words (-1): numb, provoke, pushy • “Upset” words (-2): worthless, travesty, threaten • “Terrible” words (-3): woeful, worsen, kill • “Damned” words (-4): torture, fraud, (unmentionables) • “Catastrophic” words (-5): hell, rape, (more unmentionables) Each word is given a numeric value based on its subjective meaning. “Positive” words range on a positive scale; “negative” words range on a negative scale. Trajectory’s Analytics Engine uses these values to compute the book’s sentiment curve across sentence, paragraph, chapter, and entire book. This sentiment “fingerprint” at an aggregate level yields a unique picture of the book.
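The word-value scale on this slide can be illustrated with a small sketch. The slide does not describe how Trajectory's Analytics Engine aggregates the values, so the approach here (sum word values per sentence, then average the sentence scores per chapter) is an assumption, as are the names `LEXICON`, `sentence_scores`, and `sentiment_curve`. The lexicon is a subset of the words shown on the slide.

```python
import re
from statistics import mean

# Subset of the word values shown on the slide; a real lexicon is far larger.
LEXICON = {
    "breathtaking": 5, "thrilled": 5, "superb": 5,
    "winning": 4, "stunning": 4,
    "praise": 3, "marvelous": 3, "impressive": 3,
    "strengthen": 2, "rich": 2, "funky": 2,
    "validate": 1, "safe": 1, "adequate": 1,
    "numb": -1, "provoke": -1, "pushy": -1,
    "worthless": -2, "travesty": -2, "threaten": -2,
    "woeful": -3, "worsen": -3, "kill": -3,
    "torture": -4, "fraud": -4,
    "hell": -5,
}


def sentence_scores(text):
    """Return one score per sentence: the sum of its known word values."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    scores = []
    for sentence in sentences:
        words = re.findall(r"[a-z']+", sentence.lower())
        scores.append(sum(LEXICON.get(word, 0) for word in words))
    return scores


def sentiment_curve(chapters):
    """Return the mean sentence score for each chapter (0 for empty text)."""
    curve = []
    for chapter in chapters:
        scores = sentence_scores(chapter)
        curve.append(mean(scores) if scores else 0)
    return curve


chapters = [
    "The view was breathtaking. She felt safe.",
    "The fraud would worsen. They threaten everyone.",
]
print(sentiment_curve(chapters))
```

Plotting the per-chapter values in order gives a simple version of the sentiment curve the slide describes; the same scores could be aggregated at paragraph or whole-book level.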
  • 41. Sentiment: Analyzing the Words Within the Book
  • 42. Sentiment: Analyzing the Words Within the Book
  • 43. Trajectory Index
  • 44. Keyword Analysis and Comparison
  • 45. Keyword Translation into Local Languages
  • 46. Recommendations
  • 47. Thank You BEA 2015 – Booth 1347 United States: 50 Doaks Lane, Marblehead, Massachusetts 01945, United States info@trajectory.com www.trajectory.com China: No. 3, 8 ChuangYe Road, Haidian District, Beijing, China 100085
  • 48. Q & A Generating Metadata by Machine BEA 2015 Friday, May 29, 11:30-12:20 Room 1E10