Realizing the Full Potential of Taxonomies by Branka Kosovac
1. Realizing the Full Potential of Taxonomies
Content Strategy Workshops
Vancouver, BC, July 12, 2013
Branka Kosovac, dotWit Consulting
Branka.kosovac@dotwit.com
6.
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:definition>A feature type category for places such as the Erie Canal</skos:definition>
    <skos:prefLabel>canals</skos:prefLabel>
    <skos:altLabel>canal bends</skos:altLabel>
    <skos:altLabel>canalized streams</skos:altLabel>
    <skos:altLabel>ditch mouths</skos:altLabel>
    <skos:altLabel>ditches</skos:altLabel>
    <skos:altLabel>drainage canals</skos:altLabel>
    <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
    <skos:related rdf:resource="http://www.my.com/#channels"/>
    <skos:related rdf:resource="http://www.my.com/#transportation%20features"/>
    <skos:related rdf:resource="http://www.my.com/#tunnels"/>
    <skos:scopeNote>Manmade waterway used by watercraft or for drainage, irrigation, mining, or water power</skos:scopeNote>
  </skos:Concept>
</rdf:RDF>
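The SKOS concept above can be read with a few lines of code. The following is a minimal sketch using only Python's standard library; the function name `read_concepts` and the abbreviated XML string are assumptions made for illustration, and a production system would more likely use a dedicated RDF library.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the RDF/XML sample.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}

# Abbreviated copy of the "canals" concept (fewer altLabels, no notes).
SKOS_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:prefLabel>canals</skos:prefLabel>
    <skos:altLabel>canal bends</skos:altLabel>
    <skos:altLabel>ditches</skos:altLabel>
    <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
    <skos:related rdf:resource="http://www.my.com/#channels"/>
    <skos:related rdf:resource="http://www.my.com/#tunnels"/>
  </skos:Concept>
</rdf:RDF>"""


def read_concepts(xml_text):
    """Return one dict per skos:Concept element found in the RDF/XML text."""
    root = ET.fromstring(xml_text)
    rdf = "{" + NS["rdf"] + "}"  # ElementTree spells namespaced attributes as {uri}name
    concepts = []
    for node in root.findall("skos:Concept", NS):
        concepts.append({
            "uri": node.get(rdf + "about"),
            "pref_label": node.findtext("skos:prefLabel", namespaces=NS),
            "alt_labels": [e.text for e in node.findall("skos:altLabel", NS)],
            "broader": [e.get(rdf + "resource") for e in node.findall("skos:broader", NS)],
            "related": [e.get(rdf + "resource") for e in node.findall("skos:related", NS)],
        })
    return concepts


concept = read_concepts(SKOS_XML)[0]
print(concept["pref_label"], concept["alt_labels"])
```

Keeping labels and relationships as separate fields mirrors the SKOS split between what a concept is called and how it connects to other concepts.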
11. Objects
• Documents
• Webpages
• Content components
• Digital assets
• Knowledge assets
• Marketing assets/resources
• Records
• Social content
• Products
• People profiles
• …
Scopes
• Subject domain
• Enterprise
• Intranet
• Website
• World Wide Web
• Catalogue
– Single channel
– Multi-channel
• Application
• …
12. Elements
Categories
• Descriptions
• Scope notes
• Formal definitions (for computer inference)
Labels
• Codes (language independent)
• Preferred
• Alternative
• Synonym rings
• Multilingual
• Associated vocabulary (for auto-classification)
• User-added keywords, hashtags (for social content)
Relationships
• Hierarchy: designed or organic
• Typed, named, formally defined
• Equivalence relationships
• Generic (is a kind of)
• Partitive (is a part of)
• Instance of (is an instance of)
• Typed associative
• Transitivity, reflexivity, symmetry
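Two of the elements listed above, alternative labels and symmetric associative relationships, can be sketched in code. This is an illustrative sketch only: the vocabulary, the label "underpasses", and the function names are hypothetical, not taken from a real taxonomy.

```python
# Hypothetical mini-vocabulary: preferred label -> alternative labels.
LABELS = {
    "canals": ["canal bends", "ditches", "drainage canals"],
    "tunnels": ["underpasses"],  # invented alternative label
}


def build_entry_index(labels):
    """Map every label (preferred or alternative), lowercased, to its preferred label."""
    index = {}
    for preferred, alternatives in labels.items():
        for label in [preferred, *alternatives]:
            index[label.lower()] = preferred
    return index


def add_related(pairs):
    """Store each associative relationship in both directions (symmetry)."""
    related = {}
    for a, b in pairs:
        related.setdefault(a, set()).add(b)
        related.setdefault(b, set()).add(a)
    return related


INDEX = build_entry_index(LABELS)
RELATED = add_related([("canals", "tunnels"), ("canals", "channels")])

print(INDEX["ditches"])    # preferred label for an alternative label
print(RELATED["tunnels"])  # relationship is visible from both ends
```

A user who searches for "ditches" is routed to "canals", and the link between canals and tunnels can be followed from either concept.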
13. Taxonomy of Animals in the Celestial Emporium of Benevolent Knowledge
• Those that belong to the emperor
• Embalmed ones
• Those that are trained
• Suckling pigs
• Mermaids (or Sirens)
• Fabulous ones
• Stray dogs
• Those that are included in this classification
• Those that tremble as if they were mad
• Innumerable ones
• Those drawn with a very fine camel hair brush
• Et cetera
• Those that have just broken the flower vase
• Those that, at a distance, resemble flies
From Jorge Luis Borges's essay "The Analytical Language of John Wilkins", 1942
14. Definitions of Kingdom categories in the Linnaean Classification of Living Things
• Monera
– Structural organization: small, simple single prokaryotic cell (nucleus is not enclosed by a membrane); some form chains or mats
– Method of nutrition: absorb food and/or photosynthesize
• Protista
– Structural organization: large, single eukaryotic cell (nucleus is enclosed by a membrane); some form chains or colonies
– Method of nutrition: absorb, ingest, and/or photosynthesize food
• Fungi
– Structural organization: multicellular filamentous form with specialized eukaryotic cells
– Method of nutrition: absorb food
• Plantae
– Structural organization: multicellular form with specialized eukaryotic cells; do not have their own means of locomotion
– Method of nutrition: photosynthesize food
• Animalia
– Structural organization: multicellular form with specialized eukaryotic cells; have their own means of locomotion
– Method of nutrition: ingest food
15. Linnaean Classification of Living Things: hierarchy for Homo sapiens
Images taken from: Encyclopaedia Britannica
• KINGDOM: ANIMALIA
– eukaryotic cells having cell membrane but lacking a cell wall, multicellular, heterotrophic
• PHYLUM: CHORDATA
– animals with a notochord, dorsal nerve cord, and pharyngeal gill slits, which may be vestigial
• CLASS: MAMMALIA
– warm-blooded vertebrates with hair and mammary glands which, in females, secrete milk to feed young
• ORDER: PRIMATES
– collar bone, eyes face forward, grasping hands with fingers, and two types of teeth: incisors and molars
• FAMILY: HOMINIDAE
– upright posture, large brain, stereoscopic vision, flat face, hands and feet have different specializations
• GENUS: HOMO
– s-curved spine
• SPECIES: SAPIENS (other species shown: HABILIS, ERECTUS)
– high forehead, well-developed chin, skull bones thin
16. Classification theories
Aristotle’s categories
• Class definitions
• Membership based on shared characteristics (necessary and sufficient conditions)
• Strong influence on Western thinking
• Not how the real world works, but what Western audiences expect
Prototype theory
• Categories based on prototypes
• Membership decided based on family resemblances
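The difference between the two theories above can be made concrete with a small sketch. The feature sets below are hypothetical and chosen only for illustration, and measuring resemblance as the share of prototype features an item has is one simple assumption among many possible measures.

```python
# Hypothetical feature sets, chosen only for illustration.
BIRD_DEFINITION = {"has_feathers", "lays_eggs"}  # classical: defining features
BIRD_PROTOTYPE = {"has_feathers", "lays_eggs", "flies", "sings", "small"}


def is_member_classical(features, definition):
    """Classical view: an item is a member only if it has every defining feature."""
    return definition <= features


def resemblance(features, prototype):
    """Prototype view: share of prototype features the item has, from 0.0 to 1.0."""
    return len(features & prototype) / len(prototype)


robin = {"has_feathers", "lays_eggs", "flies", "sings", "small"}
penguin = {"has_feathers", "lays_eggs", "swims"}

for name, item in [("robin", robin), ("penguin", penguin)]:
    print(name, is_member_classical(item, BIRD_DEFINITION), resemblance(item, BIRD_PROTOTYPE))
```

Under the classical test both items are simply birds; under the prototype measure the robin is a more central member than the penguin, which is closer to how people tend to judge categories.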
18. Sometimes it’s easy
• when there is a single clear distinguishing feature
• when there are well-established categories (someone of authority created them, e.g. state/province, zodiac sign, blood type, …)
• when you work at a “basic category” level
• when the collection is not too large and diverse
• when it’s single use
• when the audience is homogeneous
Example drop-down (Select): circle, square, triangle
20. Sometimes a bit less easy
• Color: Blue, Red, Yellow
• Shape: Circle, Square, Triangle
• Size: Small, Medium, Big
But what if…
• Your technology does not support a faceted approach or polyhierarchy?
• These are physical objects:
– Table linen you have to put into your drawer?
– Earrings?
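The Color, Shape, and Size facets above can be sketched as independent filters over tagged items. The items and function names below are hypothetical. The `single_path` helper shows one possible fallback, assumed here for illustration, when a tool supports only a single hierarchy: fix an order of facets and place each item on exactly one path.

```python
from collections import Counter

# Hypothetical catalogue items tagged with the three facets from the slide.
ITEMS = [
    {"name": "item-1", "Color": "Blue", "Shape": "Circle", "Size": "Small"},
    {"name": "item-2", "Color": "Blue", "Shape": "Square", "Size": "Big"},
    {"name": "item-3", "Color": "Red", "Shape": "Circle", "Size": "Small"},
]


def filter_by_facets(items, selected):
    """Keep items that match every selected facet value."""
    return [i for i in items if all(i.get(f) == v for f, v in selected.items())]


def facet_counts(items, facet):
    """Count how many items carry each value of one facet."""
    return Counter(i[facet] for i in items)


def single_path(item, facet_order):
    """Fallback without facet support: one fixed hierarchy path per item."""
    return " > ".join(item[f] for f in facet_order)


blue_items = filter_by_facets(ITEMS, {"Color": "Blue"})
print([i["name"] for i in blue_items])
print(facet_counts(blue_items, "Shape"))
print(single_path(ITEMS[0], ["Color", "Shape", "Size"]))
```

With facets, a user can start from any of the three dimensions; with a single path, whoever chooses the facet order decides which dimension everyone must start from.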
22. When it gets complicated
• large and diverse collections
• multiple uses
• diverse user groups
• cultural differences
• cultural/political sensitivities
• no formal agreement/authoritative source
• emerging and volatile domains
• far from “basic categories”
• …
23. What to do then?
• There are some general (but not universal) rules
• and some tricks of the trade
• but above all: context, context, context…
– external users vs. internal audience
– human use vs. computer inference
– impact of error
– use scenarios
– display constraints
– supporting technology
– costs…
26. Hierarchy
• consistent or varied depth?
• defined levels, typed relationships, or organic?
• polyhierarchy?
• lots of top-level categories or deep hierarchy?
• transitive or not transitive?
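Two of the hierarchy questions above, polyhierarchy and transitivity, can be illustrated with a short sketch. The concept names and the second parent given to "canals" are hypothetical, and the code assumes the hierarchy has no cycles.

```python
# Hypothetical polyhierarchy: each concept maps to its direct broader concepts.
BROADER = {
    "canals": ["hydrographic structures", "transportation features"],
    "hydrographic structures": ["structures"],
    "transportation features": ["structures"],
    "structures": [],
}


def all_broader(concept, broader):
    """Every ancestor reachable by following broader links (transitive closure)."""
    seen = set()
    stack = list(broader.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(broader.get(current, []))
    return seen


def depth(concept, broader):
    """Longest path from the concept up to a top-level category (assumes no cycles)."""
    parents = broader.get(concept, [])
    return 0 if not parents else 1 + max(depth(p, broader) for p in parents)


print(sorted(all_broader("canals", BROADER)))
print(depth("canals", BROADER))
```

If the hierarchy is treated as transitive, content tagged "canals" is also retrievable under "structures". That inference is only safe when every broader link means the same thing, which is why the choice between typed and organic hierarchies matters.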
27. Overall structure
• logical
• consistent
• well-balanced
• extensible
• fit for purpose (scenarios, business goals…)
• ordering logical and consistent
• top levels convey the scope
• no single-child categories
• no Other/Miscellaneous/General
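Two of the rules above, no single-child categories and no Other/Miscellaneous/General, lend themselves to an automated check. The sketch below is illustrative: the nested-dict representation, the sample taxonomy, and the `lint` function are assumptions, not a standard tool.

```python
# Hypothetical taxonomy as nested dicts: category name -> child categories.
TAXONOMY = {
    "Shapes": {"Circle": {}, "Square": {}, "Other": {}},
    "Colors": {"Primary": {"Red": {}}},
}

# Catch-all names taken from the rule "no Other/Miscellaneous/General".
CATCH_ALL = {"other", "miscellaneous", "general"}


def lint(tree, path=()):
    """Return (path, problem) pairs for catch-all and single-child categories."""
    problems = []
    for name, children in tree.items():
        here = path + (name,)
        if name.lower() in CATCH_ALL:
            problems.append((" > ".join(here), "catch-all category"))
        if len(children) == 1:
            problems.append((" > ".join(here), "single-child category"))
        problems.extend(lint(children, here))
    return problems


for location, problem in lint(TAXONOMY):
    print(location, "-", problem)
```

The remaining rules, such as being well-balanced or fit for purpose, depend on context and still need human review.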
28. Some techniques
• Standardize, but not more than necessary
• Consensus vs. mapping vs. standardized core
and general rules
• Derivative local taxonomies—mix & match
• Scoped labels and/or relationships
• If future use is not known, follow general rules and define and document as much as possible
29. How to begin
• make sure you know what your taxonomy needs to do, now and in the future
– user research, business requirements, vision, scenarios
• make sure you know all the constraints
– tools, costs (including long-term maintenance), available expertise,
organizational culture…
• promote and obtain high-level management support
• gather sources:
– user warrant (search logs, social content, user research/feedback logs)
– content warrant (your content, global content, your competitors’…)
– existing metadata, folksonomies, glossaries, formal or informal
taxonomies…
– publicly available taxonomies—reuse, adapt, start from scratch
(e.g. Linked Data, Taxonomy Warehouse)
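Gathering user warrant from search logs, as listed above, can start with a simple frequency count. This is a minimal sketch under stated assumptions: the log entries are invented, and `candidate_terms` and its `min_count` threshold are hypothetical names chosen for illustration.

```python
from collections import Counter

# Hypothetical search-log entries; a real log would come from your search tool.
SEARCH_LOG = ["Canals", "canals ", "drainage canal", "tunnels", "canals", "Tunnels"]


def candidate_terms(queries, min_count=2):
    """Count normalized queries and keep those seen at least min_count times."""
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    return [(term, n) for term, n in counts.most_common() if n >= min_count]


print(candidate_terms(SEARCH_LOG))
```

Frequent queries are candidates for preferred or alternative labels, while rare ones such as "drainage canal" may still be worth capturing as alternative labels after review.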
30. How to develop
• Combination of:
– Top down (domain modelling)
– Bottom up (terminology clustering, open card sort)
• Design & Strategy
– Metadata element set, associated facets/branches
– Category/term properties, relationship types, hierarchy levels…
– Sustainable maintenance strategy
– Metrics
– Roadmap
• Development
– Know where to stop
• Validation & Testing
– Throughout development and beyond
31. How to complete
• Documentation
– Scope
– Design
– Maintenance guidelines
– Implementation guidance
– Use guidelines
• Deployment
– Work with developers, UX designers, and taggers, and don’t give up until the taxonomy is properly implemented
• Governance
– Roles and responsibilities
– Procedures
32. Exercises
• Exercise groups/topics
• Exercise tasks
– Describe vision (add context details as needed)
– Develop domain model
– High-level taxonomy design and strategy
– Develop key facet
– Record your considerations, sources, thought process