Taxonomy Now! Building a stress-resistant knowledge architecture in your current tools

This presentation covers:
~ Principles for using taxonomy (and content
tagged with it) effectively in any tool or system
~ Futureproof ways to use taxonomy in DITA
right now (+ possibly other architectures)
– Light on delivery specifics (focus is how to work
with taxonomy in XML source), but questions very
welcome!
– Every approach involves compromises
~ Possible next steps after barebones taxonomy
– For delivery
– For managing and doing more with taxonomy

To put these tips into practice,
you need:
~ A starter hierarchical taxonomy; could be lists
or an Excel sheet
– How to design a taxonomy is not covered here…
~ A strong stomach for bits of XML (at least as
the architect / implementor, though
authors’ tasks should be simpler)

Taxonomy is:
~ A way to keep track of things that are
important to your organization
~ A way to keep track of names for those things
~ A way to indicate some broad relationships
between those things

We tag content with taxonomy
“concepts”, so that we can:
~ Find it in “containers”:
site nav; doc folders
~ Filter it on various
facets
~ Create “See also”
links
~ Let people search using
their own preferred
terms for things
~ And more…

A “concept” is:
“an idea or notion; a unit of thought”
— SKOS Simple Knowledge Organization System
Reference

A concept has a unique ID:
~ In SKOS, the ID is always a URI
~ The end of the URI (or the whole URI) can be
human-readable, e.g.:
https://mekon.poolparty.biz/mekonchef3/ShaveIce
~ However, human-readable IDs can cause
problems for authors

The ID’s all we tag content with
(of course, authors need
to see the label)
https://mekon.poolparty.biz/mekonchef3/164

Each platform reads the taxonomy
https://mekon.poolparty.biz/mekonchef3/164
Filter results by:
~ Preparation method
– Chop (23)
– Combine (2)
– Mince (3)
– Shaved ice (1)
– Shred (8)
~ Dietary suitability
– Gluten-free
– Halal
– ▸ More…
~ Type of dish
– Main meal
– Side dish
– ▸ More…

Label changes are picked up
https://mekon.poolparty.biz/mekonchef3/164
Filter results by:
~ Preparation method
– Chop (23)
– Combine (2)
– Mince (3)
– Shave (ice) (1)
– Shred (8)
~ Dietary suitability
– Gluten-free
– Halal
– ▸ More…
~ Type of dish
– Main meal
– Side dish
– ▸ More…
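
The label change works because content stores only the concept ID, and each platform looks the label up when it renders the filter. A minimal Python sketch of that lookup follows. The `labels` dictionary, the `recipes` records, the concept ID "165", and the `facet_counts` helper are hypothetical illustrations; only the ID 164 and its two labels come from the slides.

```python
from collections import Counter

# Hypothetical taxonomy extract: concept ID -> preferred label.
labels = {
    "164": "Shaved ice",
    "165": "Chop",
}

# Hypothetical content records, each tagged with concept IDs only.
recipes = [
    {"title": "Hawaiian shave ice", "concepts": ["164"]},
    {"title": "Salsa", "concepts": ["165"]},
    {"title": "Coleslaw", "concepts": ["165"]},
]


def facet_counts(items, labels):
    """Count items per concept ID, then resolve labels at display time."""
    counts = Counter(cid for item in items for cid in item["concepts"])
    return {labels[cid]: n for cid, n in counts.items()}


before = facet_counts(recipes, labels)

# Rename the concept in the taxonomy; the tagged content is not touched.
labels["164"] = "Shave (ice)"
after = facet_counts(recipes, labels)
```

Because the records never held the label, the rename needs no re-tagging.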

Hierarchy changes work too
https://mekon.poolparty.biz/mekonchef3/164
Filter results by:
~ Preparation method
– Flavoring / tenderizing: Marinate (5), Dry rub (3)
– Food processing: Chop (23), Combine (2), Mince (3), Shave (ice) (1), Shred (8)
~ Dietary suitability
– Gluten-free
– Halal
– ▸ More…
~ Type of dish
– Main meal
– Side dish
– ▸ More…
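
A hierarchy change can be handled the same way: the position of a concept lives in the taxonomy, not in the content. The sketch below is a rough Python illustration under assumed names (`taxonomy`, `tagged`, `path_to_root`, `rolled_up_count`, and the IDs "prep", "proc", and "165" are all hypothetical). It inserts a "Food processing" level without editing any tags.

```python
# Hypothetical taxonomy: concept ID -> (preferred label, broader concept ID).
taxonomy = {
    "prep": ("Preparation method", None),
    "164": ("Shave (ice)", "prep"),
    "165": ("Chop", "prep"),
}

# Content stays tagged with leaf concept IDs only: ID -> number of items.
tagged = {"164": 1, "165": 23}


def path_to_root(concept_id, taxonomy):
    """Return the labels from the top concept down to the given concept."""
    path = []
    while concept_id is not None:
        label, broader = taxonomy[concept_id]
        path.append(label)
        concept_id = broader
    return list(reversed(path))


def rolled_up_count(concept_id, taxonomy, tagged):
    """Count items tagged with the concept or with any concept beneath it."""
    total = tagged.get(concept_id, 0)
    for child, (_, broader) in taxonomy.items():
        if broader == concept_id:
            total += rolled_up_count(child, taxonomy, tagged)
    return total


before_path = path_to_root("164", taxonomy)

# Restructure the taxonomy: add an intermediate level and re-parent the leaves.
taxonomy["proc"] = ("Food processing", "prep")
taxonomy["164"] = ("Shave (ice)", "proc")
taxonomy["165"] = ("Chop", "proc")

after_path = path_to_root("164", taxonomy)
```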

Content marketing example to show an advanced application of this principle

Users can select a preparation method or ingredient

…to see the attachment that makes the method easier

To learn more about the attachment, select the button

From here, you can buy the product directly

The doc automatically links back
to the recipe, and the recipe
relates to “shave” and “blend”

Starting with a recipe taxonomy

Connecting key inline terms…

With your DITA tools (or
other tech comm tools),
how can you tag content in
a reliable, futureproof way?

(Examples from docs for a
fictitious productivity tool)

Three major approaches…
~ Subject schemes + classification maps
(indirect classification)
~ Subject schemes + direct classification in the
content
~ Hierarchical <data> elements, conreffed into
appropriate elements in the content
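
To make the three approaches concrete, here is a small Python sketch that holds one simplified DITA fragment per approach and pulls the concept ID out of each. The fragments are illustrative assumptions, not the markup from the original slides: the file names, the key `c164`, the use of `otherprops` as the controlled attribute, and the `concept_ids` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Approach 1: the tag lives in a separate classification map, not in the topic.
CLASSIFICATION_MAP = """
<map>
  <topicref href="export-report.dita">
    <topicsubject keyref="c164"/>
  </topicref>
</map>
"""

# Approach 2: the tag is an attribute value in the content; a subject scheme
# (not shown) would restrict which values the attribute accepts.
DIRECT_ATTRIBUTE = """
<topic id="export-report">
  <title>Exporting a report</title>
  <body><p otherprops="c164">Select Export.</p></body>
</topic>
"""

# Approach 3: the tag is a <data> element pulled in by conref from a shared
# library topic that holds the controlled values.
CONREFFED_DATA = """
<topic id="export-report">
  <title>Exporting a report</title>
  <prolog><data conref="taxonomy.dita#taxonomy/c164"/></prolog>
  <body><p>Select Export.</p></body>
</topic>
"""


def concept_ids(xml_text):
    """Collect concept IDs from the three hypothetical markup patterns."""
    ids = []
    for el in ET.fromstring(xml_text).iter():
        if el.tag == "topicsubject" and "keyref" in el.attrib:
            ids.append(el.get("keyref"))
        if "otherprops" in el.attrib:
            ids.extend(el.get("otherprops").split())
        if el.tag == "data" and "conref" in el.attrib:
            # The element ID is the last step of the conref address.
            ids.append(el.get("conref").rsplit("/", 1)[-1])
    return ids
```

In the first fragment the ID sits outside the topic file, which is why tagging does not travel with the content in that approach.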

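…evaluated on 9 criteria…
Essential criteria:
~ Concepts are addressed with unique IDs
– Classification maps: Yes
– Subject Scheme / attributes: Yes
– Conreffed <data>: Yes
~ UI supports / constrains authors appropriately
– Classification maps: A little (lists of keys)
– Subject Scheme / attributes: Depends on tool
– Conreffed <data>: Yes, though authors must follow simple business rules
~ Tagging travels with content
– Classification maps: No
– Subject Scheme / attributes: Yes
– Conreffed <data>: Yes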

Often-needed criteria:
~ Apply to any object (map / topic / block / inline element)
– Classification maps: Maps & topics only
– Subject Scheme / attributes: Yes
– Conreffed <data>: Nearly all elements
~ Each object accepts multiple metadata fields
– Classification maps: No
– Subject Scheme / attributes: Yes
– Conreffed <data>: Can nest in semantic elements
~ Multiple values per field
– Classification maps: Yes
– Subject Scheme / attributes: Yes
– Conreffed <data>: Yes*
~ Authors / editors see preferred labels
– Classification maps: No
– Subject Scheme / attributes: Depends on tool
– Conreffed <data>: Takes setup (conref push)
~ IDs are URIs
– Classification maps: Hard
– Subject Scheme / attributes: Hard
– Conreffed <data>: Use conref push
~ Create & maintain thesaurus structures
– Classification maps: Some structures only, and those with difficulty
– Subject Scheme / attributes: Some structures only, and those with difficulty
– Conreffed <data>: No

…on a 4-point scale.
~ Works easily
~ Works OK with some setup / planning
~ Hard to set up / maintain, or doesn’t completely satisfy criterion
~ Doesn’t work

Essential criteria
~ Concepts are addressed with unique IDs
~ UI supports / constrains authors appropriately
– No copying/pasting of IDs!
– Picklist at least, but preferably hierarchical browse
and/or search
~ Tagging travels with content
– Tag is associated with source object (either right in
the XML, or in a DB field attached to the
topic/map/element)

Approach 1: Classification maps

Classification maps
~ Concepts are addressed with unique IDs: Yes
~ UI supports / constrains authors appropriately: A little (lists of keys)
~ Tagging travels with content: No

Approach 2: Subject-Scheme-
controlled attributes

Subject Scheme / attributes
~ Concepts are addressed with unique IDs: Yes
~ UI supports / constrains authors appropriately: Depends on tool
~ Tagging travels with content: Yes

Approach 3: conreffed <data>
Why do we need another approach? Why might
Subject Scheme not fit sometimes?
~ When you need metadata in elements, not
attributes
~ When your tools (or the DITA version you’re
using) don’t support Subject Scheme
~ When you find the usability of Subject Scheme
features (in your tools) still lacking

Every help authoring tool has some concept of
reusable snippets.
This approach would be the only possible way to
control taxonomy values in most HATs.

Conref controls the values (if your business rules
mandate using it)
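
Since conref only controls the values when authors actually use it, a team might check the business rule automatically. The Python sketch below is one hedged way to do that: it flags any `<data>` element that does not conref into the taxonomy library. The library file name `taxonomy.dita`, the sample topic, and the helpers `targets_library` and `uncontrolled_data` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical library file name; the deck does not name one.
LIBRARY = "taxonomy.dita"

TOPIC = """
<topic id="export-report">
  <title>Exporting a report</title>
  <prolog>
    <data conref="taxonomy.dita#taxonomy/c164"/>
    <data name="feature" value="reporting"/>
  </prolog>
  <body><p>Select Export.</p></body>
</topic>
"""


def targets_library(conref, library=LIBRARY):
    """True if the conref address points into the taxonomy library file."""
    file_part = conref.split("#", 1)[0]
    # Compare the file name only, so relative paths such as ../lib/ also match.
    return file_part.rsplit("/", 1)[-1] == library


def uncontrolled_data(xml_text, library=LIBRARY):
    """Return <data> elements that do not conref into the taxonomy library."""
    root = ET.fromstring(xml_text)
    return [
        el for el in root.iter("data")
        if not targets_library(el.get("conref", ""), library)
    ]


violations = uncontrolled_data(TOPIC)
```

Here the second `<data>` element carries a hand-typed value, so it is the one reported.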

Conreffed <data>
~ Concepts are addressed with unique IDs: Yes
~ UI supports / constrains authors appropriately: Yes, though authors must follow simple business rules
~ Tagging travels with content: Yes

First 3 often-needed criteria
~ Tag any object (map / topic / block / inline
element)
– Bookmap may apply to market/product
– Topic may apply to task or product component
– Blocks & inlines have specific subject matter
~ Each object accepts multiple metadata fields
– E.g. market, product
~ Multiple values per field
– Multiple markets / products

Often needed

Classification map
~ Apply to any object (map / topic / block / inline element): Maps & topics only
~ Each object accepts multiple metadata fields: No
~ Multiple values per field: Yes

Subject Scheme / attributes
~ Apply to any object (map / topic / block / inline element): Yes
~ Each object accepts multiple metadata fields: Yes
~ Multiple values per field: Yes

Conreffed <data>
~ Apply to any object (map / topic / block / inline element): Nearly all elements
~ Each object accepts multiple metadata fields: Can nest in semantic elements
~ Multiple values per field: Yes*

Remaining criteria
~ Authors / editors see preferred labels
– The ID is still embedded / attached, but the preferred label’s what
authors/editors see
– Whenever the label’s updated in the taxonomy, that update’s what
authors & editors see (even for previously tagged content)
~ IDs map to URIs
– Can’t be URLs directly, since “/” is not allowed in DITA key names or XML ID values
– Clear mapping from DITA side makes for easier integrations:
taxonomy management, SEO markup, graph search
~ Create & maintain thesaurus structures
– Alternate labels
– Scope notes / descriptions
– Related concepts
– Matches from other taxonomies

Remaining often-needed features
Preferred labels?
~ Classification maps: No
~ Subject Scheme / attributes: Depends on tool
~ Conreffed <data>: Takes setup (conref push)

IDs map to URIs
~ Classification maps: Hard
~ Subject Scheme / attributes: Hard
~ Conreffed <data>: Use conref push
One approach to mapping, but it doesn’t do much

Create & maintain thesaurus
~ Classification maps: Some structures only, and those with difficulty
~ Subject Scheme / attributes: Some structures only, and those with difficulty
~ Conreffed <data>: Basically, no
There’s a limit to what you can feasibly author and manage with lists and tables

Verdict: Subject Scheme with attributes good; <data> conrefs not bad (and sometimes the only option); thesauruses tricky in DITA!

With well-tagged content,
where to go next?

Possible local search option
(would include Git repos)

Get more from a CCMS
~ The markup options presented will already
make search easier
– Some systems could provide a nice taxonomy UI
based solely on this metadata
~ Systems’ own metadata capabilities can be
very useful
– Plan how to use them (or evaluate the system if
you’re still considering one) against the 9 criteria

Your web devs / CMS people
could start to use the metadata
Light DITA-OT tweaks allow taxonomy tags
through in basic XHTML output

Dynamic delivery
~ Quickest, and often cheapest, way to do
sophisticated faceted browse, synonym
search, and other stuff to really improve UX
and get more value from your content
~ Again, if evaluating tools, look at the 9 criteria
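
Synonym search is one of the simpler things a delivery layer can do once content carries concept IDs. The Python sketch below is a rough illustration only: the `concepts` and `content` data, the alternate labels, and the `build_label_index` and `search` helpers are hypothetical and do not describe any particular delivery platform.

```python
# Hypothetical thesaurus extract: concept ID -> preferred and alternate labels.
concepts = {
    "c164": {"pref": "Shave (ice)", "alt": ["Shaved ice", "Snow cone"]},
    "c165": {"pref": "Chop", "alt": ["Cut up"]},
}

# Hypothetical content index: title -> concept IDs the item is tagged with.
content = {
    "Hawaiian shave ice": ["c164"],
    "Salsa": ["c165"],
}


def build_label_index(concepts):
    """Map every label, preferred or alternate, to its concept ID."""
    index = {}
    for cid, entry in concepts.items():
        for label in [entry["pref"], *entry["alt"]]:
            index[label.lower()] = cid
    return index


def search(term, concepts, content):
    """Return titles tagged with the concept that the term names, if any."""
    cid = build_label_index(concepts).get(term.strip().lower())
    if cid is None:
        return []
    return [title for title, ids in content.items() if cid in ids]
```

A user who types an alternate label reaches the same content as one who types the preferred label, because both resolve to the same concept ID.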

Proper taxonomy management
~ Even for simple thesaurus management:
– Drag & drop, much easier visualization, all standard
thesaurus relationships, workflow too
– Only real way to handle enterprise-wide taxonomy
~ More advanced semantic tech stuff:
– Ontology
– Easy linking / using bits of external taxonomies
– Corpus analysis, auto-tagging (could consider integrated
friendly DITA editors too, so casual authors can also tag stuff)
~ Criteria
– Tool that uses SKOS (preferably natively) is the safest bet
– Look for extensibility, good documentation,
good support

Thoughts? Questions?
Get in touch:
joe.pairman@mekon.com
@joepairman
