9. Semedica, a division of Silverchair
Tagmaster: semantic autotagging with expert review
Totem: taxonomy/ontology manager
Cortex: biomedical taxonomy and thesaurus
Swiss: semantic web services
Silverchair | www.silverchair.com
10. My Brother is a
Rocket Scientist
11. His Test Answers
A-C-B-A-E-D-C-A-B-E-A-C-B-B-E-D-
C-A-B-C-B-C-A-D-B-E-A-D-C-A-B-
A-E-A-C-A-E-C-B-A-E-B-D-D-E-A-B-
C-A-B-A-C-A-A-C-E-C-B-D-D-A-B-
C-E-A-C-C-D-E-B-A-B-B-C-D-E-A-
D-A-E-B-E-C-A-E-D-A-C-E-B-C-A-B-
E-A-A-D-E-A-B
12. Semantic Enrichment Raison d’Être
To put thousands and thousands of
tiny meaningful hooks in your data
so that your software applications
can create richer outcomes for
your users and your organization.
14. Semantics in 3 Minutes
15. Semantics are About Meaning
Semantics describe the meaning of your content,
on top of the physical structure. Meaning is
generally conveyed in topics and concepts.
Semantic metadata formally answers the most
important question of all for content producers and
users:
What is this content about?
16. “Atomizing” Information
The semantic approach requires us to go
beyond documents and think of our content
as data.
For example:
1 textbook chapter = 1 document
OR
1 textbook chapter = 712 distinct pieces of data
(sections, paragraphs, lists, tables, figures, equations, etc.)
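To make the "content as data" idea concrete, here is a minimal sketch that counts the addressable pieces inside one chapter. The chapter markup and element names are hypothetical (they do not come from any real DTD), and the tiny sample yields far fewer than 712 pieces.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical chapter markup; element names are illustrative only.
CHAPTER_XML = """
<chapter id="ch01">
  <title>Hypertension</title>
  <section id="s1">
    <title>Diagnosis</title>
    <para>Blood pressure is measured on two visits.</para>
    <list><item>Confirm readings</item><item>Assess risk</item></list>
  </section>
  <section id="s2">
    <title>Treatment</title>
    <para>Lifestyle changes come first.</para>
    <table><row><cell>Drug</cell><cell>Dose</cell></row></table>
    <figure src="fig1.png"/>
  </section>
</chapter>
"""

# Element types we choose to treat as separately addressable pieces of data.
ATOMIC_TYPES = {"section", "para", "list", "table", "figure", "equation"}

def count_atoms(xml_text):
    """Count the addressable pieces of a chapter, grouped by element type."""
    root = ET.fromstring(xml_text)
    return Counter(el.tag for el in root.iter() if el.tag in ATOMIC_TYPES)

atoms = count_atoms(CHAPTER_XML)
print(dict(atoms))
print("1 chapter =", sum(atoms.values()), "pieces of data")
```

Which element types count as "atomic" is a design choice; the set above simply mirrors the list on this slide.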
17. But breaking down content into its smallest parts is not
an end unto itself…
18. Taxonomy as Semantic Foundation
• The taxonomy is the framework for the semantic
layer and semantic tagging—crucial for concept
grouping and hierarchical relationships
• Also serves to normalize terminology and
language variances when combined with a
robust thesaurus
• Industry-standard taxonomies facilitate
integration
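As a rough illustration of how a taxonomy and a thesaurus work together, the sketch below maps variant terms to a preferred concept and then walks up the hierarchy. The concepts, the parent links, and the function names are all assumptions made for the example, not part of any real product.

```python
# Hypothetical mini-taxonomy: each concept points to its parent (None = top level).
TAXONOMY = {
    "cardiovascular disease": None,
    "hypertension": "cardiovascular disease",
    "myocardial infarction": "cardiovascular disease",
}

# Hypothetical thesaurus: variant term -> preferred taxonomy concept.
THESAURUS = {
    "high blood pressure": "hypertension",
    "heart attack": "myocardial infarction",
}

def normalize(term):
    """Map a term to its preferred concept, or None if the term is unknown."""
    key = term.strip().lower()
    key = THESAURUS.get(key, key)
    return key if key in TAXONOMY else None

def ancestors(concept):
    """Return the chain of broader concepts, nearest parent first."""
    chain = []
    parent = TAXONOMY.get(concept)
    while parent is not None:
        chain.append(parent)
        parent = TAXONOMY.get(parent)
    return chain

concept = normalize("Heart Attack")
print(concept, ancestors(concept))
```

Because both variants resolve to one concept, content tagged under either wording ends up grouped in the same place in the hierarchy.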
19. Use taxonomy axes to organize your atomized content on
key traits and prepare it for recombination…
20. Nuts & Bolts: Semantic Tagging
• Tagging is the insertion of semantic (meaning)
information in the XML, whose smallest unit is
called a tag
• Tags can also be stored in database tables
and header files when the content itself cannot
carry inline markup (such as images and videos)
• Tagging should be done at the smallest “atomic”
level of data possible
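A minimal sketch of paragraph-level tagging follows. It assumes a made-up term list and made-up concept IDs, and it records the tags in a hypothetical `concepts` attribute; a production autotagger would do far more than substring matching.

```python
import xml.etree.ElementTree as ET

# Hypothetical term list: surface term -> concept identifier (IDs are made up).
CONCEPTS = {"hypertension": "C001", "interferon therapy": "C002"}

def tag_paragraphs(xml_text):
    """Attach concept IDs to each <para> whose text mentions a known term."""
    root = ET.fromstring(xml_text)
    for para in root.iter("para"):
        text = "".join(para.itertext()).lower()
        found = sorted(cid for term, cid in CONCEPTS.items() if term in text)
        if found:
            # The attribute name "concepts" is an assumption for this sketch.
            para.set("concepts", " ".join(found))
    return ET.tostring(root, encoding="unicode")

doc = "<section><para>Hypertension is common.</para><para>No match here.</para></section>"
print(tag_paragraphs(doc))
```

Tagging each paragraph rather than the whole section is what lets an application later pull out just the matching piece.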
24. Know Your Users!
Focus your metadata creation on how your users
want to use your content:
• How do they search? Browse? At what
point in their workflow is your product used?
Almost all information sites have multiple use
cases. You need to know what those use cases
are for your products.
Start with what is the most important to the most
users and work your way down a priority list.
25. The Semantic Use Test
I am specifically identifying __________
because ____________ is very
important to my ____________ users
when they are _____________.
26. Semantic Metadata: Focus on Use
Example: I am specifically identifying
concise disease treatment
content because immediate access
to treatment options is very
important to my emergency
physician users when they have 8
seconds to look up an answer.
28. Semantic Metadata: Focus on Use
Example: I am specifically identifying
skin disorder images on all body
locations and all types of skin
because visual diagnosis is very
important to my family physician
users when they are trying to
identify a rash.
29. Derm101: images show up immediately in
the diagnosis results for searches
30. Semantic Metadata: Focus on Use
Example: I am specifically identifying
manufacturer names because the
source of medical devices is very
important to my surgical resident
users when they are prepping for a
procedure.
Not Likely!
33. Use Semantics to Know Your Users
34. Use Semantics to “Know Thyself!”
35. Thank you!
For more information:
Jake Zarnegar
CTO, Silverchair
President, Silverchair Information Systems
jakez@silverchair.com
(434) 296-6333 x236
www.silverchair.com
www.semedica.com
36. Jabin White
Director of Strategic Content
Wolters Kluwer Health – Professional &
Education
Really Strategies/Silverchair Webinar –
September 29, 2010
37. Agenda
• A little background (framing the problem)
• Our goals
• When we’re done, we’ll be able to…
38. Who we are
• We are Wolters Kluwer Health – Professional
& Education
• Wolters Kluwer Health includes:
▫ Lippincott Williams & Wilkins titles
▫ Ovid
▫ UpToDate
▫ Provation Order Sets
▫ Drug Facts & Comparisons
▫ Medi-Span
▫ Clin-eguide
39. A Little History
• Joined WK Health in May 2009
▫ Responsible for making sure content flows
through company more efficiently (DTDs,
Content Management, Authoring Tools,
Semantic Enrichment, Product Information
Management, etc.)
• The reasons are not important, but we hadn’t
spent a lot of time modernizing our digital
production methods
40. Today – Our typical workflow
• Book is “signed”
• Instructions for authors are sent, and ignored
• Chapters, etc., are submitted in MS Word
• Word files are sent “over the wall” (outsourced),
coded, and put into pagination software (still some
Quark, moving to Adobe InDesign)
• Final pages are approved
• High-resolution PDFs are sent to printer
• After final pages are approved, vendors convert the
files into XML (if the title was composed after May
2009). If before, we roll the dice…
• Delivered back to P&E archive, along with printer
PDFs, application files, and images
41. So what’s your problem?
• We pay at every step of the previous workflow
and, we believe, unnecessarily near the end
• If we need ePub, we have to go back into the
archive to a “mixed bag” of content (some Quark,
some PDF, some XML)
• There is no central repository – or common
format – in which to apply semantic tagging
▫ And the frustrating thing is we have GOOD DTDs!
• If we believe in semantic markup, which we do,
we must essentially throw content over the wall
again just as in composition (shampoo, rinse,
repeat)
42. Enter RSuiteCMS
• RSuiteCMS gives us the ability to control the
workflow and use good content management
practices (it does a LOT more, but we’re starting
slow)
• Very importantly, we get to have authors write in
XML without their knowing (or, quite frankly,
caring)
• We put a LOT of work into the authoring
environment, trying to keep authors away from
angle brackets
• “It takes a lot of hard work to make things
simple”
43. When we’re done, we’ll be able to…
• …Produce structured content with lower
effort/cost
• Working on SECOND RSuiteCMS
implementation as we speak
▫ Will scale in latter part of 2010 and 2011
• We are moving cautiously and ensuring “buy
in” from stakeholders at each step
• Ideally, we will grow our ability to produce
clean, structured XML to check into our
repository
• But Rome wasn’t tagged in a day...
44. Enter Semedica
• Gives us the ability to add semantic tagging to
our content, either when it is finished (in the
repository) or while it is being worked on
(within RSuiteCMS)
• Semedica gives us the ability to:
▫ Leverage a standard taxonomy (Cortex)
▫ Add to the taxonomy and manage
equivalencies – perhaps mined from our search
logs – “Wenckenback = Wenckebach” (Totem)
▫ Apply the tags to our content (Tagmaster)
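One way the "mined from our search logs" step might work is fuzzy matching of logged search terms against preferred taxonomy terms, with a person reviewing every suggestion. The sketch below uses Python's standard `difflib`; the term list, the cutoff, and the function name are assumptions for illustration and say nothing about how Totem actually does it.

```python
from difflib import get_close_matches

# Hypothetical preferred terms already in the taxonomy.
PREFERRED = ["Wenckebach", "hypertension", "leukemia"]

def suggest_equivalencies(search_terms, cutoff=0.8):
    """Propose variant -> preferred pairs for a human reviewer to accept or reject."""
    lowered = {p.lower(): p for p in PREFERRED}
    suggestions = {}
    for term in search_terms:
        key = term.lower()
        if key in lowered:
            continue  # already a preferred term, nothing to suggest
        match = get_close_matches(key, list(lowered), n=1, cutoff=cutoff)
        if match:
            suggestions[term] = lowered[match[0]]
    return suggestions

log = ["Wenckenback", "leukemia", "rash"]
print(suggest_equivalencies(log))
```

String similarity only catches misspellings; true synonyms such as "heart attack" and "myocardial infarction" still have to come from a thesaurus or an editor.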
45. Why Semantic Tagging?
• It adds extra power to our content to drive:
▫ More precise searching
▫ Contextually-based connections
▫ Lowering of “two terms meaning the same thing”
syndrome (hypertension vs. high blood pressure; heart
attack vs. myocardial infarction)
▫ Filling in of content gaps
▫ Asking questions of data (aka querying): “How many
chapters do we publish that are tagged with the term
‘pediatric oncology’ or ‘leukemia’ and that also contain
the treatment ‘interferon therapy’?”
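Once tags exist, the question quoted on this slide becomes a simple set operation. The sketch below assumes a hypothetical in-memory index of chapter IDs and their tags; a real repository would answer it with a database or XML query instead.

```python
# Hypothetical index: chapter ID -> set of concept tags applied to that chapter.
CHAPTER_TAGS = {
    "ch01": {"pediatric oncology", "interferon therapy"},
    "ch02": {"leukemia"},
    "ch03": {"leukemia", "interferon therapy"},
    "ch04": {"hypertension", "interferon therapy"},
}

def chapters_matching(any_of, all_of):
    """Return chapters tagged with at least one of any_of and every tag in all_of."""
    return sorted(
        chapter
        for chapter, tags in CHAPTER_TAGS.items()
        if tags & any_of and all_of <= tags
    )

hits = chapters_matching({"pediatric oncology", "leukemia"}, {"interferon therapy"})
print(len(hits), hits)
```

The same question is very hard to ask of untagged Word or PDF files, which is the point of the slide.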
46. How RSuiteCMS & Semedica need each
other
• I wouldn’t think of using Semedica to enrich
Word files (and not just because Jake would
laugh)
• I couldn’t make the business case for
RSuiteCMS to help produce structural XML
without dangling the prospect of semantic
enrichment
• Which came first, the chicken or the egg?
48. Jabin White
Director of Strategic Content
Wolters Kluwer Health
Jabin.white@wolterskluwer.com
215.521.8911
Twitter: @jabinwhite
Blog: Technically Speaking at
http://www.bookbusinessmag.com/channel/technically-speaking