Taxonomy Fundamentals
Why build a taxonomy?
SLA – Vancouver – June 7, 2013
www.accessinn.com
www.dataharmony.com
505-998-0800
Marjorie M.K. Hlava
President and Chief Scientist
Bob Kasenchak
Project Coordinator
Access Innovations, Inc.
Copyright © 2013 Access Innovations, Inc.
A fast moving and powerful introduction to both the theoretical
and practical aspects of building a taxonomy, thesaurus, and
ontology. A well-built taxonomy is part of the foundation of the
information architecture underlying web sites, corporate
Intranets, search/retrieval, and access to relevant content in
databases. After defining controlled vocabularies and identifying
core standards, you will explore key concepts of taxonomy,
thesaurus, indexing, classification, and filtering. Discussion will
include the basics of taxonomy records and fundamental term
relationships. Attendees will put concepts into practice through
multiple exercises, and taxonomy, indexing, and related software
tools will be demonstrated.
Introduction To
Taxonomy Concepts
Copyright © 2013 Access Innovations, Inc.
About Access Innovations
Access Innovations are experts in content creation, enrichment, and
conversion services. We provide services to semantically enrich and tag raw
text into highly structured data. We deliver clean, well-formed, metadata-
enriched content so our clients can reuse, repurpose, store, and find their
knowledge assets. We go beyond the standards to build taxonomies and
other data control structures as a solid foundation for your information.
Our services and software allow organizations to use and present their
information to both internal and external constituents by leveraging search,
presentation, and e-commerce. We change search to found!
Quick Facts
• Founded in 1978
• Headquartered in Albuquerque, NM
• Privately held
• Delivered more than 2000 engagements
Copyright © 2013 Access Innovations, Inc.
What we do
 Access Innovations
 Ensure clean, well formed content
 Create Knowledge Organization Systems (KOS)
 Data Harmony Tools
 To automatically index content
 To manage KOS and more
 To semantically enrich the content
 To organize the content
 Visualization tools to portray the data
Copyright © 2013 Access Innovations, Inc.
Outline of the Day
 Why the excitement
 What is a Taxonomy
 Card Sort
 How to build a taxonomy
 Term relationships
 Thesaurus Examples
 Pre and Post Coordination
 What are we controlling
 Vocabulary Options
 TaxoMatch
 Term Forms
 Facets / Notation / Roles / Treatment / Weighting
 Auto Indexing
 A Taxing Situation
 Search
 Where do I use it?
 Standards and references
Why The Excitement?
 Makes information findable!
 Cut search time by 50%! (The Weather Channel)
 Leverages information in new ways
 User satisfaction
 Organizes topical areas and web sites
 Provides better online help
 Customer support is 30x more costly than web self-service*
*(Forrester Research "Tier Zero Customer Support" 1999)
Copyright © 2013 Access Innovations, Inc.
Taxonomies are found…
• In “indexing”, tagging, categorizing, subject metadata
• In search - precision, recall
• In content management systems, web sites
• In SharePoint to replace term tree, tag uploads
• In mashups, repackaging, repurposing data
• In social networking sites
• In author tagging - peer reviewer selection
• In filtering data – e.g., spam filters and RSS feeds
• In web crawlers
• In text analytics – trend analysis
• … and much more
Copyright © 2013 Access Innovations, Inc.
Because taxonomies make them work
Where Does
Implementation Happen?
 At the backend
 When the records / articles are added to
the production system
 When the search software’s “inverted file”
is created
 When the HTML for the web page is
created
Copyright © 2013 Access Innovations, Inc.
Heart Of The “Big Data”
Production Process
Copyright © 2013 Access Innovations, Inc.
From the production side to the website display, carry the
taxonomy descriptors for use in precision search
Copyright © 2013 Access Innovations, Inc.
Taxonomy
Copyright © 2013 Access Innovations, Inc.
Authors at a place
MASHUP locations to a
GPS grid of an area
Two data points
GPS Coordinates
Taxonomy description of the place
Copyright © 2013 Access Innovations, Inc.
Watch Crime In Action
Copyright © 2013 Access Innovations, Inc.
Two data points
GPS Coordinates
Taxonomy description of the crime
Copyright © 2013 Access Innovations, Inc.
Visualization Strategies
Matrix
Visualization
Software
Copyright © 2013 Access Innovations, Inc.
All Data Up-posted
To The Top Level
Copyright © 2013 Access Innovations, Inc.
Pattern Analysis
Indexing Clusters
Copyright © 2013 Access Innovations, Inc.
Pattern Analysis
Domain Associations
Copyright © 2013 Access Innovations, Inc.
Pattern Analysis
Domain Correlations
Copyright © 2013 Access Innovations, Inc.
Pattern Analysis
Gap Analyses
Copyright © 2013 Access Innovations, Inc.
Pattern Analysis
Component Gaps
Copyright © 2013 Access Innovations, Inc.
More Like This - Recommender
Cancer Epidemiology Biomarkers & Prevention
Vol. 12, 161-164,
February 2003
© 2003 American Association for Cancer Research
Short Communications
Alcohol, Folate, Methionine, and Risk of Incident Breast Cancer in the
American Cancer Society Cancer Prevention Study II Nutrition Cohort
Heather Spencer Feigelson, Carolyn R. Jonas, Andreas S. Robertson,
Marjorie L. McCullough, Michael J. Thun and Eugenia E. Calle Department
of Epidemiology and Surveillance Research, American Cancer Society,
National Home Office, Atlanta, Georgia 30329-4251
Recent studies suggest that the increased risk of breast cancer associated
with alcohol consumption may be reduced by adequate folate intake. We
examined this question among 66,561 postmenopausal women in the
American Cancer Society Cancer Prevention Study II Nutrition Cohort.
Related Press Releases
•How What and How Much We Eat (And Drink) Affects Our
Risk of Cancer
•Novel COX-2 Combination Treatment May Reduce Colon
Cancer Risk Combination Regimen of COX-2 Inhibitor and
Fish Oil Causes Cell Death
•COX-2 Levels Are Elevated in Smokers
Related AACR Workshops and Conferences
•Frontiers in Cancer Prevention Research
•Continuing Medical Education (CME)
•Molecular Targets and Cancer Therapeutics
Related Meeting Abstracts
•Association between dietary folate intake, alcohol intake, and
methylenetetrahydrofolate reductase C677T and A1298C
polymorphisms and subsequent breast
•Folate, folate cofactor, and alcohol intakes and risk for
colorectal adenoma
•Dietary folate intake and risk of prostate cancer in a large
prospective cohort study
Related Working Groups
•Finance
•Charter
•Molecular Epidemiology
Related Education Book Content
Oral Contraceptives, Postmenopausal Hormones,
and Breast Cancer
Physical Activity and Cancer
Hormonal Interventions: From Adjuvant Therapy to
Breast Cancer Prevention
Related Awards
•AACR-GlaxoSmithKline Clinical Cancer Research Scholar Awards
•ACS Award
•Weinstein Distinguished Lecture
Webcasts
Related Webcasts
Think Tank Report
Related Think Tank Report
Content
Copyright © 2013 Access Innovations, Inc.
Link to Society Resources
Journal
Article on
Topic A
Other
Journal
Articles on
Topic A
Upcoming
Conference
on Topic A
Podcast Interview
with Researcher
Working on Topic A
Grant Available
for Researchers
Working on
Topic A
CME
Activity on
Topic A
Job Posting
for Expert
on Topic A
Copyright © 2013 Access Innovations, Inc.
Author Connections
Copyright © 2013 Access Innovations, Inc.
What is a taxonomy?
Albuquerque, NM 87110
www.accessinn.com
www.dataharmony.com
505-998-0800
Marjorie M.K. Hlava
President and Chief Scientist
Access Innovations, Inc.
Copyright © 2013 Access Innovations, Inc.
Vocabulary Control - Options
 Classification
systems*
 Authority files
 Controlled term lists
 Uncontrolled term
lists
 Thesauri
Copyright © 2013 Access Innovations, Inc.
[*We will concentrate on taxonomies and
thesauri, first, and then cover the others as
time permits.]
Taxonomy Standards
 Z39.19 (2005) Controlled Vocabularies
 BS 8723 Parts 1 – 5
 ISO 25964 Parts 1 – 2
 TAG 37 and 46 standards
 SKOS - Simple Knowledge Organization
System
 OWL - Web Ontology Language
 AND more!
Copyright © 2013 Access Innovations, Inc.
A Taxonomy is a
Knowledge Organization System (KOS)
 Uncontrolled list
 Name authority file
 Synonym set/ring
 Controlled vocabulary
 Taxonomy
 Thesaurus
 Ontology
 Semantic network
Not complex
Highly complex
Copyright © 2013 Access Innovations, Inc.
Structure Of
Controlled Vocabularies
Lists Synonyms Taxonomy Thesaurus Ontology
Ambiguity Ambiguity Ambiguity Specifies a KOS
Synonym Synonym Additional kinds of
Hierarchy Hierarchy Relationships
Relationships
relationships
INCREASING COMPLEXITY and CONTROL
Copyright © 2013 Access Innovations, Inc.
What is a Taxonomy?
ANSI/NISO Z39.19-2005
“A collection of controlled vocabulary terms
organized into a
hierarchical structure.”
controlled
Missing:
equivalence, homographic, and associative relationships
and notes
Yes!
Copyright © 2013 Access Innovations, Inc.
Taxonomy? Thesaurus?
 Often used interchangeably
 Thesaurus is a taxonomy with extras
 Related Terms
 Non-preferred Terms (USE/Used for)
 Scope Notes
 More
 Taxonomies often have the actual
information object at the final node.
 CMS and SharePoint tend to support only the
hierarchical view, definitions, and USE references
Copyright © 2013 Access Innovations, Inc.
Taxonomy? Thesaurus?
 Main Term (MT)
 Top Term (TT)
 Broader Terms (BT)
 Narrower Terms (NT)
 Related Terms (RT)
 See also (SA)
 Non-Preferred Term (NP)
 Used for (UF), See (S)
 Scope Note (SN)
 History (H)
= subject term, heading, node,
category, descriptor, class
TAXONOMY
THESAURUS
OWL can specify
Copyright © 2013 Access Innovations, Inc.
The Semantic Roadmap:
Knowledge Organization Systems
 Semantic network
 Ontology
 Thesaurus
 Taxonomy
 Controlled vocabulary
 Synonym set/ring
 Name authority file
 Uncontrolled list
•Unrelated Entities
•Ambiguity
•Linked Entities
•Contextual Specificity
•Simple
•Low Value
•Complex
•High value
Uncontrolled
list has the
Highest Cost
over Time!
Copyright © 2013 Access Innovations, Inc.
Taxonomy
view
Thesaurus
Term Record
view
Copyright © 2013 Access Innovations, Inc.
Taxonomy 101
How do you build a taxonomy?
Copyright © 2013 Access Innovations, Inc.
How Do You Build a
Taxonomy ?
• Define subject field
• Collect terms
• Organize terms
• Fill in gaps
• Flesh out and interrelate terms
• Apply to your data
You’re done!
Copyright © 2013 Access Innovations, Inc.
Foundations
 Start with what is known
 Build from there
 Use the literature, your data
 Use the lists you already have internally
 Build in continuous review throughout the
process, and beyond
 Who is involved?
 Taxonomists
 Subject matter experts (SME)
 Project management
 Users
Copyright © 2013 Access Innovations, Inc.
Define Subject Field
 Review representative collection of content
 Determine:
 Core areas
 Peripheral topics
Psychology
Education
Sociology
Law
 Scope can be modified later
Copyright © 2013 Access Innovations, Inc.
Where Do I Get the
Terms?
 Your documents and databases
 Departmental terminology
 Text books and their indexes
 Book tables of contents and indexes
 Journal quarterly indexes
 Encyclopedias
 Lexicons, glossaries on the topic
 Web resources
 Users and experts
 Search logs
Copyright © 2013 Access Innovations, Inc.
How Do You Choose
Terms?
 Importance in the subject area
 Use in the literature, by the organization
or community
 Necessary degree of specificity or detail
 Relationship with other controlled
vocabularies
 Single concept = single term
Copyright © 2013 Access Innovations, Inc.
Build, Buy, Augment?
 Survey existing thesaurus/taxonomy resources for your
domain
 Test for
• Scope
• Depth
• Make-or-break terms
• Cost
 Adoption of existing taxonomies
 Term registries
 Taxobank
 Taxonomy Warehouse
 Other resources
Don’t reinvent the wheel!
Copyright © 2013 Access Innovations, Inc.
Gather Terms From
Search Logs
 Top ~100 search terms from search logs
 Terms used more than 50 times
 Match to web site with appropriate
answer
 Basis for favorites or best bets, presented
at the top of results list
 Behavior-based taxonomy
Copyright © 2013 Access Innovations, Inc.
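The search-log step above can be sketched in a few lines of Python. This is a minimal illustration, assuming the log is available as a plain list of query strings; the function name `candidate_terms`, its parameters, and the sample log are hypothetical, and the thresholds simply mirror the "top ~100" and "more than 50 times" guidelines.

```python
from collections import Counter


def candidate_terms(queries, top_n=100, min_count=50):
    """Return (query, count) pairs for the most frequent searches.

    Keeps at most `top_n` queries and only those seen more than
    `min_count` times, mirroring the two thresholds on the slide.
    """
    # Light normalization so "Folate" and " folate" count as one query.
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    return [(q, n) for q, n in counts.most_common(top_n) if n > min_count]


# Hypothetical log: one entry per search submitted by a user.
log = ["folate"] * 60 + ["Breast cancer"] * 55 + ["cox-2"] * 12
for query, count in candidate_terms(log):
    print(f"{count:4d}  {query}")
```

The surviving queries are only candidates: each still has to be matched to the page with the appropriate answer before it becomes a best bet or a taxonomy term.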
Vocabulary Control –
How?
 Use unambiguous terms, clear to the user
group
 Distinguish between terms that appear
similar
 Use Scope Notes when necessary
 Use terms as elements that can be
coordinated in a flexible manner
 Create compound terms, if necessary
Copyright © 2013 Access Innovations, Inc.
Term Format
 KISS – Keep it short and simple
• 1-2-3 words
• Effect on search
• Pre and Post Coordination
 Establish a policy
• follow Chicago Manual of Style
 Grammatical issues
• Nouns and noun phrases
• Verbs → gerunds
• Adjectives - no
• Adverbs - no
• Initial articles – no
Copyright © 2013 Access Innovations, Inc.
Thesaurus - Format
 Main Entries
 Top Terms - TT
 Broader Terms - BT
 Narrower Terms - NT
 Related Terms - RT
 Scope Notes - SN
 History - HI
 Date term added/changed - DA
Copyright © 2013 Access Innovations, Inc.
Thesaurus - Format
 Related terms - RT
 See - S
 See also - SA
 Use - U
 Preferred Term PT
 Use for - UF
 Non Preferred Term NP
 ..
Copyright © 2013 Access Innovations, Inc.
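One way to picture a term record with the fields listed above is as a small data structure. The sketch below is an assumption for illustration only, not the Data Harmony record layout; the class name `TermRecord`, its field names, and the display order are hypothetical, and the sample terms are borrowed from examples used elsewhere in this deck.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TermRecord:
    term: str                                           # main entry
    top_term: str = ""                                  # TT
    broader: List[str] = field(default_factory=list)    # BT
    narrower: List[str] = field(default_factory=list)   # NT
    related: List[str] = field(default_factory=list)    # RT
    used_for: List[str] = field(default_factory=list)   # UF
    scope_note: str = ""                                # SN
    history: str = ""                                   # HI
    date_added: str = ""                                # DA

    def display(self):
        """Render the record in a conventional tagged layout."""
        lines = [self.term]
        if self.scope_note:
            lines.append(f"  SN {self.scope_note}")
        for tag, values in (("UF", self.used_for), ("BT", self.broader),
                            ("NT", self.narrower), ("RT", self.related)):
            lines.extend(f"  {tag} {value}" for value in values)
        return "\n".join(lines)


record = TermRecord("Rodents", narrower=["Squirrels"], related=["Pests"])
print(record.display())
```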
Definitions
 Index term
 the representation of a concept
 Preferred term (International)
 a term used consistently to index a concept
 descriptor (USE)
 what the “USE” reference points to
Copyright © 2013 Access Innovations, Inc.
Definitions
 Non preferred term (International)
 synonym or quasi synonym of a preferred term
 non-descriptor (USE)
 the “USE” reference
 the “SEE” reference
 Related term
 the “SEE ALSO”
Copyright © 2013 Access Innovations, Inc.
Indexing Terms
 Three main categories
 concrete entities
 abstract concepts
 proper nouns
Copyright © 2013 Access Innovations, Inc.
One Term / One Concept
 Importance in the subject area
 Use in the literature, by the organization
or community
 Necessary degree of specificity or detail
 Relationship with other controlled
vocabularies
Copyright © 2013 Access Innovations, Inc.
One Term / One Concept
 Terms represent simple or unitary concept
 A unit of thought
 Can be a single-word term
 Can be a multiword term, if required to
represent the concept
 Three main categories
– Concrete entities
– Abstract concepts
– Proper nouns
“A unit of thought, formed by
mentally combining some or all of
the characteristics of a concrete or
abstract, real or imaginary object.
Concepts exist in the mind as
abstract entities independent of
terms used to express them.”
Copyright © 2013 Access Innovations, Inc.
Concrete Entities
 Things and their physical parts
 primates
 head
 buildings
 floors
 islands
Copyright © 2013 Access Innovations, Inc.
Concrete Entities as Terms
• Things and their physical parts
– Birds
• Feathers
• Buildings
• Floors
• Materials
– Cement
– Wood
– Lead
– Cards and Chips
Copyright © 2013 Access Innovations, Inc.
Concrete Entities
 Materials
 cement
 wood
 lead
 cars
 refrigerators
Copyright © 2013 Access Innovations, Inc.
Abstract Concepts
 Actions and events
 evolution
 respiration
 skating
 management
 wars
 ceremonies
Copyright © 2013 Access Innovations, Inc.
Abstract Concepts
 Abstract entities, properties of things,
materials and actions
 law
 theory
 strength
 efficiency
 lead (management)
Copyright © 2013 Access Innovations, Inc.
Abstract Concepts
 Disciplines and sciences
 physics
 meteorology
 mathematics
 psychology
Copyright © 2013 Access Innovations, Inc.
Abstract Concepts
 Units of measurement
 kilograms
 pounds
 meters
 miles
Copyright © 2013 Access Innovations, Inc.
Abstract Concepts as Terms
• Actions and events
– evolution, skating, management, ceremonies
• Abstract entities
– law, theory
• Properties of things, materials, and
actions
– strength, efficiency
• Disciplines and sciences
– physics, meteorology, mathematics
• Units of measurement
– pounds, kilograms, miles, meters, nanoseconds
Copyright © 2013 Access Innovations, Inc.
Proper Nouns*
 Individual entities, or “classes of one”,
expressed as proper nouns
 San Francisco
 United States of America
 Lake Michigan
* Proper names – of persons – are not
included
Copyright © 2013 Access Innovations, Inc.
Proper Nouns as Terms
 Individual entities – “classes of one” –
expressed as proper nouns
 San Francisco, Lake Michigan
Thesaurus standards exclude proper names, persons, and trade names; these go in
authority files. Taxonomies include them as final nodes.
Copyright © 2013 Access Innovations, Inc.
Most Terms Are Nouns
 Nouns or simple noun phrases
 Adj + Noun – Art history (ANSI/NISO standard)
 Noun + Prep + Noun – History of art (ISO standard)
 Exceptions – Burden of proof, Coats of arms,
Prisoners of war, Birds of prey, etc.
Copyright © 2013 Access Innovations, Inc.
About “and”
 Avoid “and” in terms – not a single concept
Instead of: Children and television
Factor and postcoordinate
USE Media influence + Television + Children
“And” is not in the standard
In real life—need for granularity may dictate your choice
Copyright © 2013 Access Innovations, Inc.
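The term-format guidelines so far (keep terms to one to three words, no initial articles, avoid "and") lend themselves to a simple automated check. The following is a rough sketch under those assumptions; `format_problems` and its messages are hypothetical names, and a real policy would also need an exception list for accepted phrases.

```python
INITIAL_ARTICLES = {"a", "an", "the"}


def format_problems(term, max_words=3):
    """Return a list of term-format issues; an empty list means none found."""
    words = term.split()
    if not words:
        return ["empty term"]
    problems = []
    if len(words) > max_words:
        problems.append(f"more than {max_words} words")
    if words[0].lower() in INITIAL_ARTICLES:
        problems.append("initial article")
    if any(w.lower() == "and" for w in words):
        problems.append('contains "and"; consider factoring')
    return problems


for candidate in ("Art history", "Children and television", "Prisoners of war"):
    print(candidate, "->", format_problems(candidate) or "ok")
```

The check only flags candidates for review. As the slide notes, the need for granularity may still justify keeping a flagged term.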
Compound Terms – Nope!
 “Terms in a thesaurus should represent
simple or unitary concepts…” (ISO standard)
 “Compound terms should be factored
(split) into simple elements…” (ANSI/NISO
standard)
 Term phrases are okay (bigrams)
 Adjective Noun
 American history
 Two concepts combined are not
 Aromatherapy for bloating
Copyright © 2013 Access Innovations, Inc.
Organize Terms –
Roughly
 Sort terms into several major categories –
logical groups of similar concepts as Top
Terms
 Identify core areas and peripheral topics
 10 – 20 to start
 Consider moving proper names to authority files
 Result: loose collection of terms under
several main headings
 Rough and tentative – see how it fits as you go
 Initial gap analysis
 Add / modify / delete as needed
Copyright © 2013 Access Innovations, Inc.
Term Relationships
How Do Terms Relate?
 Hierarchical relationships
-- Parents and their children
 Equivalence relationships
-- Aliases
 Associative relationships
-- Cousins
TAXONOMY
THESAURUS
Copyright © 2013 Access Innovations, Inc.
Hierarchical
Relationships
 Broader Term (BT) represents the class,
whole, or genus
 Narrower Term (NT) is a member, part, or
species
 Generic relationship
 Whole-part relationship
 Instance relationship
 NTs inherit all the characteristics of their BTs
 BTs/NTs have a reciprocal relationship
Copyright © 2013 Access Innovations, Inc.
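The reciprocal nature of BT/NT can be enforced in software by never storing one direction without the other. The sketch below assumes a simple in-memory store; the `Hierarchy` class and `add_bt` method are hypothetical names, and the sample terms come from examples in this deck.

```python
from collections import defaultdict


class Hierarchy:
    """Stores BT and NT links so that each one always has its reciprocal."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its broader terms (BT)
        self.narrower = defaultdict(set)  # term -> its narrower terms (NT)

    def add_bt(self, term, broader_term):
        """Record 'term BT broader_term' and post the reciprocal NT."""
        if term == broader_term:
            raise ValueError("a term cannot be its own broader term")
        self.broader[term].add(broader_term)
        self.narrower[broader_term].add(term)


h = Hierarchy()
h.add_bt("Squirrels", "Rodents")
h.add_bt("Middle ear", "Ear")
print(sorted(h.narrower["Rodents"]))
print(sorted(h.broader["Middle ear"]))
```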
Hierarchical
Relationships
 Class as a whole
 superordination
 broader term (BT)
 sometimes top term (TT)
 Members or parts of the class
 subordination
 narrower term (NT)
 Reciprocal
Copyright © 2013 Access Innovations, Inc.
Hierarchical
Relationships
 BT/NT based on being part of same class
 Same fundamental category
 entities
 activities
 agents
 properties
Copyright © 2013 Access Innovations, Inc.
Hierarchical
Relationships
 Museums
 Archaeological museum (type of entity): NT
 Ethnological museum (type of entity): NT
 Curators (agents): RT
 Museum techniques (action): RT
 Scientific museum (type of entity): NT
Copyright © 2013 Access Innovations, Inc.
Hierarchy –
Whole-Part Relationships
 Four general types
1. Body systems and organs
 Ear → Middle ear
2. Geographical locations
 Bernalillo County → Albuquerque
3. Fields of study
 Geology → Physical geology
4. Hierarchical social structures
 Ontario → Manitoulin District
Copyright © 2013 Access Innovations, Inc.
Hierarchy –
Instance Relationships
 General category (common noun) as BT,
with individual example (proper noun) as
Narrower Term Instance (NTI)
Seas
  Baltic Sea
  Caspian Sea
  Mediterranean Sea
French cathedrals
  Chartres Cathedral
  Rheims Cathedral
  Rouen Cathedral
Essentially identical to “final node” in taxonomies
Copyright © 2013 Access Innovations, Inc.
Hierarchical Types
of Display
 Systematic
 Alphabetic
 other, but less common views
Copyright © 2013 Access Innovations, Inc.
DTIC
 Hierarchy
Copyright © 2013 Access Innovations, Inc.
Polyhierarchical
Relationship
• Term can logically fit under more than one
Broader Term – can have Multiple Broader
Terms (MBT)
• Part of ISO standards, new to ANSI/NISO
Nurses
  Nurse administrators
Health administrators
  Nurse administrators
Finance
  Accounting
Careers
  Accounting
Copyright © 2013 Access Innovations, Inc.
Polyhierarchical
Relationships
 Great for the web click environment
 Terms occur in multiple categories
 Can be generic as well as hierarchical
Engineering
  NT Nanotechnology
Physics
  NT Nanotechnology
Nanotechnology
  BT Engineering
  BT Physics
Copyright © 2013 Access Innovations, Inc.
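In a polyhierarchy a term simply carries more than one BT, so any code that walks upward must follow every parent. The sketch below assumes the BT links are held in a plain dictionary; the function names `ancestors` and `multiple_bt` are hypothetical, and the sample data is taken from the slides above.

```python
def ancestors(term, broader):
    """Return every term reachable by following BT links upward."""
    seen = set()
    stack = [term]
    while stack:
        for bt in broader.get(stack.pop(), ()):
            if bt not in seen:  # a shared parent is visited only once
                seen.add(bt)
                stack.append(bt)
    return seen


def multiple_bt(broader):
    """List the terms that have Multiple Broader Terms (MBT)."""
    return sorted(t for t, bts in broader.items() if len(bts) > 1)


broader = {
    "Nanotechnology": ["Engineering", "Physics"],
    "Nurse administrators": ["Nurses", "Health administrators"],
    "Accounting": ["Finance", "Careers"],
}
print(sorted(ancestors("Nanotechnology", broader)))
print(multiple_bt(broader))
```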
DTIC
 Alpha
Copyright © 2013 Access Innovations, Inc.
Generic Relationship Tests
Rodents
Squirrels
Pests
 ALL squirrels are rodents
x NOT ALL squirrels are pests
x NOT ALL pests are rodents
Copyright © 2013 Access Innovations, Inc.
Generic Relationship
Tests
• Both terms in same fundamental category
• “All-and-some” test
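Rodents / Squirrels: SOME rodents are squirrels; ALL squirrels are rodents (passes)
Pests / Squirrels: SOME pests are squirrels; NOT ALL squirrels are pests (fails)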
Consider concepts of marketing and advertising
Copyright © 2013 Access Innovations, Inc.
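The all-and-some test can be read as a proper-subset check: every member of the narrower class belongs to the broader class, and the broader class has other members as well. The sketch below models each class as a set of example members; the function name `all_and_some` and the member lists are illustrative assumptions, not data from the standards.

```python
def all_and_some(narrower_members, broader_members):
    """True when ALL narrower members are in the broader class,
    while only SOME broader members are in the narrower class."""
    return narrower_members < broader_members  # proper subset


# Illustrative members only; a real test is applied to the concepts themselves.
rodents = {"grey squirrel", "red squirrel", "house mouse", "brown rat"}
squirrels = {"grey squirrel", "red squirrel"}
pests = {"grey squirrel", "house mouse", "brown rat", "cockroach"}

print(all_and_some(squirrels, rodents))  # Squirrels under Rodents
print(all_and_some(squirrels, pests))    # Squirrels under Pests
print(all_and_some(pests, rodents))      # Pests under Rodents
```

Pairs that fail the test but are still clearly connected, such as Squirrels and Pests, are candidates for an associative (RT) link instead.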
Generic Relationships
 “Identifies the link between a class or
category and its members or species.”
 Easy in biology
 Rodents
 NT Squirrels
 All and some rule
Copyright © 2013 Access Innovations, Inc.
All and Some Rule
 Rodents
 NT Squirrels
 RT Pests
 Q. Is this an example of polyhierarchy?
 Q. Do you need to make RT relationships
for “Pests” to all of the NTs under
“Rodents”?
Copyright © 2013 Access Innovations, Inc.
Instance Relationships
 Seas ISO
 NT Baltic Sea
 NT Caspian Sea
 NT Mediterranean Sea
 French Cathedrals NISO / ANSI
 NTI Chartres Cathedral
 NTI Rheims Cathedral
 NTI Rouen Cathedral
 RT Gothic cathedrals
Copyright © 2013 Access Innovations, Inc.
Instance Relationships
 French Cathedrals NISO / ANSI
 NTI Chartres Cathedral
 NTI Rheims Cathedral
 NTI Rouen Cathedral
 RT Gothic cathedrals
 French Gothic Cathedral
 NTI Chartres Cathedral
 NTI Rheims Cathedral
 NTI Rouen Cathedral
 BT Gothic cathedrals
 Q. Why/how do these differ?
Copyright © 2013 Access Innovations, Inc.
CABI Pages
Copyright © 2013 Access Innovations, Inc.
Instance Relationships
 “…a general category of things and
events expressed by a common
noun, and an individual instance of
that category, the instance then
forming a class of one which is
represented by a proper name.”
 A way of adding the proper names
and items from the Authority files to
the thesaurus
Copyright © 2013 Access Innovations, Inc.
Questions before moving on to
Associative Relationships?
Associative Relationships
 Related Terms (RTs) – cousins
 “…terms related conceptually, but not
hierarchically, and are not part of an equivalence
set” (i.e. not synonyms)
 Both terms are valid thesaurus terms for indexing
and have reciprocal relationship
 Expands user’s awareness and reflects thesaurus
coverage of unanticipated areas
 Standards describe specific types
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
Related terms
Physicians Medicine
(“Reciprocal posting” done
automatically is highly desirable.)
Copyright © 2013 Access Innovations, Inc.
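Where reciprocal posting is not automatic, the gaps can at least be detected. The sketch below assumes RT links are stored as a dictionary of term to related terms; `missing_reciprocals` is a hypothetical helper, and the sample links reuse terms from this deck.

```python
def missing_reciprocals(related):
    """Return (term, should_list) pairs where an RT link lacks its reverse."""
    missing = []
    for term, rts in related.items():
        for rt in rts:
            if term not in related.get(rt, ()):
                missing.append((rt, term))
    return sorted(missing)


related = {
    "Physicians": ["Medicine"],
    "Medicine": ["Physicians"],
    "Rodents": ["Pests"],  # reverse link not yet posted
}
for term, should_list in missing_reciprocals(related):
    print(f"{term} should also list RT {should_list}")
```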
Associative Relationships
 Sibling relationships
 Examples:
 Brother : Sister
 Desk : Chair
 Easier to create within well defined
facets (e.g. AAT)
 Usual step in building process
 Can be identified automatically
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
 RT relationships
 Braking systems
 RT Trains
 RT Bicycle
 RT Motor vehicle
 Office furniture
 RT Office buildings
 RT Ergonomics
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
 Field of study and objects studied
 Seismology
 RT Earthquakes
 Meteorology
 RT Weather patterns
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
 Operation or process and the agent or
instrument
 Hairdressing
 RT Hair dryers
 Word processing
 RT Typing skills
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
 Occupation and person in occupation
 Social work
 RT Social workers
 Information science
 RT Special librarians
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
 Action and the product of the action
 Publishing
 RT Music scores
 Landscaping
 RT Lawn mowers
 RT Irrigation systems
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
 Action and its patient
 Teaching
 RT Students
 Conducting
 RT Musicians
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
 Concepts related to their properties
 Women
 RT Femininity
 Automobiles
 RT Automotive safety
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
 Concepts related to their origins
 Water
 RT Water wells
 Carpet
 RT Thread
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
 Concepts linked by causal dependence
 Injuries
 RT Accidents
 Cultural stress
 RT Culture shock
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
 Action and counter action
 Pests
 RT Pesticides
 Log on
 RT Log off
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
 Raw material and its product
 Hides
 RT Leather
 Clothing
 RT Fabric
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
 Action and associated property
 Precision instrument
 RT Accuracy
 Production processes
 RT Quality control
Copyright © 2013 Access Innovations, Inc.
Associative Relationships
 Concept and its opposite
 Single People
 RT Married people
 Height
 RT Depth
 RT Weight
 If not hierarchical, probably
associative
Copyright © 2013 Access Innovations, Inc.
Questions before moving on to
Equivalence Relationships?
Equivalence Relationships
 Refer to the same concept
 (Use for)
 Prefix for non-preferred terms
 (Use)
 Prefix for preferred terms
 Automobiles
 used for Cars
 Cars
 use Automobiles
Copyright © 2013 Access Innovations, Inc.
Equivalence Relationships
Use
Use for
Physicians
Doctors
Copyright © 2013 Access Innovations, Inc.
Equivalence Relationships
 Synonyms
 popular and scientific
 spiders - arachnida
 scientific and trade names
 Motrin (TM) - ibuprofen
 standard names and slang
 hi fi - high fidelity
 different linguistic origin
 home care - domiciliary care
Copyright © 2013 Access Innovations, Inc.
Equivalence Relationships
 Synonyms cont’d
 different cultures
 aerials - antenna
 trunk - boot
 hire - rent
 emerging concepts
 telecommuting - distance working
 outdated
 refrigerators - iceboxes
Copyright © 2013 Access Innovations, Inc.
A “Term” Synonym Ring
Term
Node
Subject heading
Category
Descriptor
Copyright © 2013 Access Innovations, Inc.
Equivalence Relationships
 Lexical variants
 variant spellings
 Muslim - Moslem
 center - centre
 direct and indirect forms
 electric power plants
 power plants, electric
 abbreviations
 ECG - electrocardiograph
Copyright © 2013 Access Innovations, Inc.
Equivalence Relationships
 Quasi synonyms
 urban areas - cities
 gifted people - geniuses
 Antonyms
 height - depth
 literacy - illiteracy
Copyright © 2013 Access Innovations, Inc.
Equivalence Relationships
 Up posting (generic posting)
 useful for web interfaces
 NT equivalent to their BT
 not sub species of BT
Copyright © 2013 Access Innovations, Inc.
Equivalence Relationships
PsycINFO Rotated
Copyright © 2013 Access Innovations, Inc.
Equivalence Relationships
 Factored terms
 express terms in their combinations
 Milk hygiene
 use milk and hygiene
Copyright © 2013 Access Innovations, Inc.
Equivalence Relationship
• Preferred Term
– Thesaurus term and valid for indexing
– Thesaurus notation: USE
• Non-Preferred Term
– Not valid for indexing
– An alias or imposter
– Entry point, directs user to Preferred Term
– Thesaurus notation: UF or NPT
Spiders
  UF Arachnids
Plant pathology
  USE Phytopathology
Copyright © 2013 Access Innovations, Inc.
Equivalence – When to Use
 Synonyms, slang, quasi-synonyms
 Scientific and trade names
 Ibuprofen UF Motrin™
 Lexical variants
 Fiber optics UF Fibre optics
 Mouse UF Mice
 Upward posting of narrow concepts not specified
in taxonomy or thesaurus
 Social class UF Elite, Middle class, Working class
Get equivalent terms from search logs, brainstorming…
Copyright © 2013 Access Innovations, Inc.
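In practice, USE/UF references amount to a lookup table that sends any entry term to its preferred term. The sketch below assumes the UF lists are held in a dictionary keyed by preferred term; the function names `build_use_index` and `resolve` are hypothetical, and the sample entries come from the examples above.

```python
def build_use_index(used_for):
    """Map each non-preferred term (lowercased) to its preferred term."""
    use = {}
    for preferred, variants in used_for.items():
        for variant in variants:
            key = variant.lower()
            if key in use and use[key] != preferred:
                # One entry term must not point at two preferred terms.
                raise ValueError(f"{variant!r} points to two preferred terms")
            use[key] = preferred
    return use


def resolve(term, use):
    """Follow the USE reference if there is one; otherwise keep the term."""
    return use.get(term.lower(), term)


used_for = {
    "Automobiles": ["Cars"],
    "Ibuprofen": ["Motrin"],
    "Fiber optics": ["Fibre optics"],
    "Social class": ["Elite", "Middle class", "Working class"],
}
use = build_use_index(used_for)
print(resolve("cars", use))
print(resolve("Working class", use))
```

The last entry shows upward posting: narrow concepts that have no term of their own resolve to the broader preferred term.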
Scope Notes (SN)
 Indicate meaning of the term in the context of
this thesaurus, for this audience
 Stress – Metal, Psychological, Physiological
 Indicate any restriction in meaning
 Indicate range of topics covered
 Provide direction for indexers; for terms often
confused, may suggest an alternative term
 Use only as needed – not for every term
 Establish and stick with consistent format
 Be concise
Copyright © 2013 Access Innovations, Inc.
Scope Notes (SN)
 Restrictions on meaning
 Range of topics covered
 Instructions to indexers
 Term histories
 Reciprocal scope notes
Copyright © 2013 Access Innovations, Inc.
Questions before moving on
to more thesaurus examples?
Thesaurus - Examples
 Roget's 1852
 synonyms
 COSATI - 1964
 concept linking
 NASA
 AEC - ERDA - DOE - ESA
 National Library of Medicine
 outline of a field
 Medical Subject Headings - MeSH
Copyright © 2013 Access Innovations, Inc.
NASA
 Alphabetic
Copyright © 2013 Access Innovations, Inc.
NASA
 Hierarchical
Copyright © 2013 Access Innovations, Inc.
Thesaurus - Examples
 INSPEC - multifaceted
 Thesaurus
 Classification system
 Free text terms
 Variant spellings
 NICEM
 27 Top Terms
Copyright © 2013 Access Innovations, Inc.
INSPEC
Copyright © 2013 Access Innovations, Inc.
INSPEC
 Hierarchy
Copyright © 2013 Access Innovations, Inc.
Merged Vocabularies
 Yahoo!
 Subject headings
 Authority files
 In a single list
Copyright © 2013 Access Innovations, Inc.
Yahoo!
 Hierarchy
Copyright © 2013 Access Innovations, Inc.
Merged Vocabularies -
continued
 Office.com
 Multiple broader terms
 Concept mapping
Copyright © 2013 Access Innovations, Inc.
Eurovoc
Thesaurus
Pages
Copyright © 2013 Access Innovations, Inc.
Eurovoc Thesaurus Hierarchy
Copyright © 2013 Access Innovations, Inc.
Eurovoc Terms
Copyright © 2013 Access Innovations, Inc.
So far you’ve got…
 Hierarchy
– Broader and Narrower Terms
• Polyhierarchies when needed
– Preferred/Non-Preferred Terms
– Equivalence relationships
– Related Terms
– Associative relationships
– Scope Notes
– Complete term records
– Correct term format
Copyright © 2013 Access Innovations, Inc.
So far you’ve got…
 Hierarchical relationships
-- Parents and their children
 Equivalence relationships
-- Aliases
 Associative relationships
-- Cousins
-- See Also’s
TAXONOMY
THESAURUS
Copyright © 2013 Access Innovations, Inc.
So far you’ve got…
• Term format
• Grammatical issues
• Singular and plural forms
• Spelling
• Abbreviations and acronyms
• Capitalization
• Other punctuation
• Consistency
Copyright © 2013 Access Innovations, Inc.
Pre and Post Coordination
Pre and Post Coordinate
Terms
 Pre coordinates – two concepts
 Subject headings – Library of Congress
 American history – Civil War
 Back of the book
 Put together in advance by the publisher
 Post Coordinate
 Taxonomy terms
 Single concept
 Put together by the user / searcher
Copyright © 2013 Access Innovations, Inc.
Pre-coordination
 Card catalogs - printed indexes
 Links and roles defined
 Controlled vocabularies
 High input costs
 Precise recall - easier searching
Copyright © 2013 Access Innovations, Inc.
Post-coordination
 Starting with punch cards
 Machine readable
 Frequently natural language
 Currency and specificity
 Exhaustive coverage - loss of precision
 Low input costs
 False drops
Copyright © 2013 Access Innovations, Inc.
Subject Matter Experts (SME)
 Work first from the literature
 Establish literary warrant for terms
 Let someone else do the clerical work
 Differentiate the lexicography work
 from the subject matter expert work
 Let SMEs do the review and tailoring
 Expert review ensures proper term use and
application
 Advisory Board…advisable!
Copyright © 2013 Access Innovations, Inc.
Again, why do we index?
 Improve precision
 define scope of terms
 Improve recall
 different terms for same concept
 Guide to a field of expertise
 Learning tool
 Richer expression
Copyright © 2013 Access Innovations, Inc.
Uses?
 Indexing
 …process by which subject terms or
classification symbols are assigned to concepts in
documents
 A thesaurus is also known as an indexing
language
 M.A.I.™ is an automated indexing system
Copyright © 2013 Access Innovations, Inc.
What are We Controlling?
What are We Controlling?
 Synonyms
 different terms same concept
 Polysemes or Homonyms
 same word different meanings
 lead or mercury
Copyright © 2013 Access Innovations, Inc.
How?
 Meaning
 delineation of scope of a term
 Term equivalence
 linking of synonyms
 Disambiguation of homonyms
 lead (metal)
 lead (element)
 lead (management)
Copyright © 2013 Access Innovations, Inc.
Disambiguation
Bridge Structure
Bridge Dentistry
Bridge Game
Bridge Concept
Copyright © 2013 Access Innovations, Inc.
Disambiguation
 Restriction and clarification of meaning
 Cells
 biological microsystems
 electrical equipment
 prison housing
 Reading
 town in England
 communication process
Copyright © 2013 Access Innovations, Inc.
Disambiguation
Bill Invoice
Bill Legislative
Bill Sport
Bill Person
Copyright © 2013 Access Innovations, Inc.
Disambiguation: Pre-Coordinate vs.
Post-Coordinate Forms
 Cells (biology)
 Cells (electric)
 Cells (prison)
 Reading (place)
 Reading (process)
 Biological cells
 Electric cells
Copyright © 2013 Access Innovations, Inc.
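Parenthetical qualifiers such as "Cells (biology)" are easy to handle mechanically. The sketch below, with hypothetical function names, splits a label into its base term and qualifier and groups the homographs from the slide.

```python
import re
from collections import defaultdict

# A label of the form "Term (qualifier)"; the qualifier must end the label.
QUALIFIED = re.compile(r"^(?P<term>.+?)\s*\((?P<qualifier>[^()]+)\)$")


def split_qualifier(label):
    """Return (term, qualifier); qualifier is None when the label has none."""
    match = QUALIFIED.match(label.strip())
    if match:
        return match.group("term"), match.group("qualifier")
    return label.strip(), None


def group_homographs(labels):
    """Map each base term to the qualifiers that disambiguate it."""
    groups = defaultdict(list)
    for label in labels:
        term, qualifier = split_qualifier(label)
        groups[term].append(qualifier)
    return dict(groups)


LABELS = ["Cells (biology)", "Cells (electric)", "Cells (prison)",
          "Reading (place)", "Reading (process)"]
HOMOGRAPHS = group_homographs(LABELS)
```

A compound form such as "Biological cells" carries no qualifier, so `split_qualifier` returns it unchanged with `None`.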
Precision Options
 Language specificity
 Coordination
 Compound terms - level of
precoordination
 Homographs and scope notes
 Word distance indication
Copyright © 2013 Access Innovations, Inc.
Precision Options
 Structural relationships
 Links and roles
 Treatment and aspect codes
 Weighting
Copyright © 2013 Access Innovations, Inc.
Maintenance of a
Controlled Vocabulary
 Allow for new jargon to be added
 Any living field will have new terms
 Identifier field
 Candidate terms
 Consider multiple broader terms
Copyright © 2013 Access Innovations, Inc.
Review, edit, test, edit,
use, edit, and maintain, i.e., edit
 Review
 Users
 Expert reviewers
 Test
 Index 500+ documents
(more for variable writing
style; fewer for strict
style)
 Monitor search log
 Edit and maintain
 Add term
 Change existing term
 Change term status
 Delete term
 Add term relationship
 Delete term relationship
 Add/modify Scope Note
 Change overall structure
Consider automated / assisted indexing software
Copyright © 2013 Access Innovations, Inc.
When Do You Add
More Terms?
 On demand
 When usage changes
 Stewardess – flight attendant
 As the field evolves
 8 changes to 64 colors
 In Use
 Don’t freeze waiting for perfection
Copyright © 2013 Access Innovations, Inc.
Vocabulary Control - Options
 Classification
systems
 Authority files
 Controlled term lists
 Uncontrolled term
lists
 Thesauri
Copyright © 2013 Access Innovations, Inc.
Classification Systems - Defined
 Are used to put an object in a specific
place. In the traditional classification
system each item has a single spot to go.
 Follow an outline of knowledge
 Used to shelve books in a library
Copyright © 2013 Access Innovations, Inc.
Catalog Systems - Defined
 Used to catalog the object to identify its
contents
 Based on perception
 Multiple terms are used to identify a
single object
 Not natural language
 Pre-coordinated - subheadings
Copyright © 2013 Access Innovations, Inc.
Classification Systems - Examples
 Classification of actual collections
 New York State Library - Dewey
 810.01
 Cutter - Universities 1800 - 1960’s
 Z34
 Lan
 Thomas Jefferson - Library of Congress
 z34.18
 la
 Government Documents Numbers
 based on government structure
Copyright © 2013 Access Innovations, Inc.
Catalog Systems - Examples
 Library of Congress Subject Headings
 Sears Subject Headings
 (used with Dewey)
Copyright © 2013 Access Innovations, Inc.
King of Catalogers
 Charles Ammi Cutter
 rules for alphabetical subject indexing
 most specific heading
 put two topics under two headings
 use English if possible
 cross-reference antonyms
 careful with homographs
 1895 ALA Subject Headings following
Cutter
Copyright © 2013 Access Innovations, Inc.
Politics in Libraries
 In 1905 Dewey was president of ALA
(American Library Association)
 LC adopted DDC
 Threw out Cutter
 The two never spoke again.
Copyright © 2013 Access Innovations, Inc.
Types of Headings
 Single word
 Botany or Ethics
 Adjective noun
 Capital punishment
 Noun - noun
 Death penalty
 American Standard
 Noun preposition noun
 Penalty of death
 International Standard
 Noun conjunction noun
 Nurses and nursing
Copyright © 2013 Access Innovations, Inc.
Cutter Guidelines
 File under the phrase “as it reads”
 Use the most significant words
 Reduce adjective nouns to noun
phrases
 Use singular rather than plural
 File compound words under the first
word
 No subheadings
Copyright © 2013 Access Innovations, Inc.
Cross References
 Cross reference synonyms
 main heading should be what the class uses
 use the common term
 use the unambiguous heading
 prefer the one which brings relations
 “…with a well defined network of cross
references the mob becomes an army…”
 C.A. Cutter
Copyright © 2013 Access Innovations, Inc.
Library of Congress (LC)
Subject Headings
 1911 - List of Subject Headings
 extensive use of sub-headings
 invert phrases for main subject
 file under the noun not the adjective
 see references not cross filing
 place holder terms
 homographs defined parenthetically
Copyright © 2013 Access Innovations, Inc.
Classification vs.
Subject Headings
 Classification
 single spot or placement
 browse physical list
 often a numbering system
 clear hierarchy
 no or few cross references
 Like Yahoo!
Copyright © 2013 Access Innovations, Inc.
Classification vs.
Subject Headings
 Subject headings
 generic search
 hidden classification system
 related terms and cross references in heavy use
 usually the inverted form
 cells, electric
Copyright © 2013 Access Innovations, Inc.
Vocabulary Control - Options
 Classification
systems
 Authority files
 Controlled term lists
 Uncontrolled term
lists
 Thesauri
Copyright © 2013 Access Innovations, Inc.
Authority Systems - Defined
 Frequently have cross references
 Widely available
 Frequently coded lists
 Brand names
 …lists of terms in the preferred format for
use.
Copyright © 2013 Access Innovations, Inc.
Authority Files - Defined
 People
 Places
 Things
 ……..NOT
 Concepts
 Methods
 Processes
Copyright © 2013 Access Innovations, Inc.
Authority Files - Examples
 ISO Country Name and Code
 International Organization for Standardization
 ISO Language list
 NAICS (SIC)
 Standard Industrial Classification Code (SIC)
Replaced by
 North American Industry Classification
System (NAICS)
Copyright © 2013 Access Innovations, Inc.
Authority Lists - Format
 Belgian Congo
 use Congo
 Bill Gates
 use William H. Gates III (computer scientist)
 see also
 William Gates (basketball player)
Copyright © 2013 Access Innovations, Inc.
Authority Lists -
Need Style Sheets
 Names
 AACR2
 Anglo-American Cataloguing Rules
 AAP
 Association of American Publishers
 Chicago Manual of Style
 Dun & Bradstreet
 Style Sheet
Copyright © 2013 Access Innovations, Inc.
Vocabulary Control - Options
 Classification
systems
 Authority files
 Controlled term
lists
 Uncontrolled term
lists
 Thesauri
Copyright © 2013 Access Innovations, Inc.
Controlled Term Lists -
Defined
 State the preferred terms
 Provide allowed term entry
 Heavily cross referenced
 Not generally hierarchical
 Popular
 Easy to create
Copyright © 2013 Access Innovations, Inc.
Controlled Term Lists -
Examples
 ABI/Inform
 Predicasts
 RDS - Responsive Data Services
 Back of book indexes
 Art and Architecture Thesaurus
 …....These are not FULL thesauri
Copyright © 2013 Access Innovations, Inc.
Controlled Term List -
Format
 Cars
 use Automobiles
 Personal Computer
 use Microcomputer
Copyright © 2013 Access Innovations, Inc.
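A "use" reference is a lookup from an entry term to the preferred term. A minimal sketch, assuming case-insensitive matching (an assumption, not something the slides specify) and a hypothetical function name:

```python
# Non-preferred entry term (lowercased) -> preferred term.
# The pairs come from the list formats shown on the slides.
USE_REFERENCES = {
    "cars": "Automobiles",
    "personal computer": "Microcomputer",
    "belgian congo": "Congo",
}


def preferred_term(entry):
    """Resolve an entry term to its preferred form; unknown terms pass through."""
    cleaned = entry.strip()
    return USE_REFERENCES.get(cleaned.lower(), cleaned)
```

For example, `preferred_term("Cars")` returns "Automobiles", while a term with no reference, such as "Trucks", is returned as typed.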
Vocabulary Control - Options
 Classification
systems
 Authority files
 Controlled term lists
 Uncontrolled term
lists
 Thesauri
Copyright © 2013 Access Innovations, Inc.
Uncontrolled List - Defined
 Add terms as they occur
 No cross reference
 Simple flat structure
Copyright © 2013 Access Innovations, Inc.
Uncontrolled List - Example
 List of names
 Grocery list
 Candidate term list
Copyright © 2013 Access Innovations, Inc.
Uncontrolled List - Format
 Laundry
 Trim bushes
 Cat box needs cleaning
 Tommy’s birthday (bake cake)
 Iron
 Water plants
 ….other natural language lists
Copyright © 2013 Access Innovations, Inc.
Trying to Impose Control...
 Do laundry
 Trim bushes
 Clean cat box
 Bake birthday cake
 Iron shirts
 Water plants
Copyright © 2013 Access Innovations, Inc.
 Designed to enhance understanding and retention of the
vocabulary concepts necessary for creating a taxonomy,
ontology, thesaurus, or controlled vocabulary.
 Game supplies:
 1 Deck of Orange Question and Challenge Cards
 1 Deck of Green Answer Cards
 Game setup:
 Shuffle the deck of Green Answer cards,
 Deal the entire deck to the players.
 Shuffle the deck of Orange Question and Challenge cards
 Place them facedown in a pile in the middle of the table so that all
players can reach the pile.
 Reinforce what you just heard!
 Have fun!
Copyright © 2013 Access Innovations, Inc.
1. Play moves to the left of the dealer
2. Draw a card from the top of the Orange cards.
Read it aloud to all of the players.
3. The player who read the card says out loud
what they think the answer is.
4. Each player looks at the Green Answer cards
in their hand.
   a. If they have the correct answer to the
      Question or Challenge, they show their
      card to everyone at the table.
   b. If everyone agrees that the answer is
      correct, the player holding the correct
      answer card gives it to the player who
      read the Question or Challenge card.
5. The player places their associated pair of
cards – one Orange Question and Challenge
card and one Green Answer card – face up on
the table in front of them.
6. Play passes to the person who held the correct
Green Answer card in their hand. Play
continues as in step 2 above.
7. Discussion among the players to arrive at the
correct answer is permissible and encouraged!
8. If players do not arrive at a consensus
regarding the correct answer, the Orange
Question and Challenge card may be returned
to the bottom of the pile, and play passes to
the person to the left of the player who drew
the previous card.
9. When all of the Orange Question and
Challenge cards have been drawn, read aloud,
and matched with their Green Answer cards,
the game ends.
10. If there are any Orange Question and
Challenge cards remaining to which players
cannot agree on an answer, players may
consult their notes or ask the session speaker.
Copyright © 2013 Access Innovations, Inc.
Term Forms
Term Forms
 Nouns
 Prepositional forms
 Adjectives
 Adverbs
 Initial Articles
 Singular and plural
Copyright © 2013 Access Innovations, Inc.
Term Forms - Noun and
Noun Phrases
 Nouns and noun phrases
 print media
 carpet
Copyright © 2013 Access Innovations, Inc.
Term Forms - Prepositional
Forms
 Prepositional forms are seldom used
 okay in International Standard ISO
 Philosophy of Education
 ANSI / NISO
 Educational philosophy
Copyright © 2013 Access Innovations, Inc.
Term Forms – Adjectives
 Adjectives
 not used in isolation
 may be used for coordination
 Miniature paintings
 USE PAINTINGS AND MINIATURE
 Portable typewriters
 USE TYPEWRITERS AND PORTABLE
Copyright © 2013 Access Innovations, Inc.
Term Forms – Adjectives
 Adjectives
 may convert to noun forms
 MINIATURE SIZE
 PORTABLE DEVICES
 TRIANGULAR SHAPE
Copyright © 2013 Access Innovations, Inc.
Term Forms - Adverbs
 Adverbs
 not used unless part of a compound term
 VERY LARGE ARRAY RADIO TELESCOPE
 Used for VLA
Copyright © 2013 Access Innovations, Inc.
Term Forms - Verbs
 Verbs
 no infinitive or participle forms
 for actions that can be expressed as nouns and retain
clear meaning, use noun form or gerunds
 Examples
 Speaking (not Speech)
 Walking (not Ambulation)
 Communication (not Communicate)
 Administration (not Administer)
Copyright © 2013 Access Innovations, Inc.
Term Forms - Initial Articles
 AVOID THEM
 Example
 Theater not The theater
 State (political entity) not The state
 Use if part of a proper name
 Le Mans
 El Salvador
Copyright © 2013 Access Innovations, Inc.
Term Forms - Singular and
Plural
 Concrete entities
 count nouns are plurals - how many?
 planets
 children
 non count nouns - how much?
 nickel
 snow
 lace
Copyright © 2013 Access Innovations, Inc.
Term Forms - Singular
and Plural
 fully formed organism
 eyes
 mouth
 objects are singular
 lamp
 classes of things
 fruits
Copyright © 2013 Access Innovations, Inc.
Term Forms - Singular and
Plural
 Abstract concepts
 Show in the singular form
 authority
 socialism
 packaging
 biochemistry
Copyright © 2013 Access Innovations, Inc.
Term Forms - Singular and
Plural
 Unique entities
 Show in the singular
 Big Ben
 Grand Canyon
Copyright © 2013 Access Innovations, Inc.
Other Formatting
 Spelling
 Punctuation
 Capitalization
 Abbreviations
 ...
Copyright © 2013 Access Innovations, Inc.
Spelling
 Use what the users will use and cross
post for multilingual
 fiber - fibre
 center - centre
 organization - organisation
 hemo - haemo
 Pediatrics - paediatrics
Copyright © 2013 Access Innovations, Inc.
Punctuation
 Parentheses only for qualifiers
 Apostrophes are retained
 Hyphens - avoid
Copyright © 2013 Access Innovations, Inc.
Capitalization
 NISO = initial only
 AACR2 format
 Practice is to follow a manual of style
 Chicago Manual of Style
 Associated Press
 Association of American Publishers
Copyright © 2013 Access Innovations, Inc.
Abbreviations
 Use only when well known
 Always include the full meaning
 LASER
 Scope Note: Light Amplification by Stimulated
Emission of Radiation
 WHO
 World Health Organization
Copyright © 2013 Access Innovations, Inc.
Other Ways of Adding Value
 Cross references
 Facets
 Notation
 Roles
 Treatment
 Term weighting
Copyright © 2013 Access Innovations, Inc.
Cross References
 See - S
 See also - SA
 Not related or associated
 Not opposite
 Just helpful guides
Copyright © 2013 Access Innovations, Inc.
Synthesis in Classification
 S.R.Ranganathan 1933
 Colon Classification
 analytico-synthetic classification
 analyze subject into component parts
(facets)
 arrange facets into schedules
 combine facets to express subject
complexity
Copyright © 2013 Access Innovations, Inc.
Ranganathan
 A General Properties
 Ab Configuration
 Ac Tubular
 B Materials
 Bc Metals
 Bcc ferrous
 Bcd steels
 Bcf Chromium steels
 Bcfi Chromium-nickel steels
 K Modes of failure
 Kg Creep
 Kgb Creep rupture
 L Stresses and loads
 Lb Tensile
Copyright © 2013 Access Innovations, Inc.
Ranganathan
 Tubular Chromium Nickel steel creep
rupture Tensile strength
 Ac Bcfi Kgb Lb
 Chain indexing
 Tubular
 Chromium Nickel steel
 creep rupture
 Tensile strength
Copyright © 2013 Access Innovations, Inc.
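Faceted synthesis can be sketched as a lookup plus a join. The codes below are taken from the schedule on the preceding slide (where Tensile is Lb); the function names, and the choice to order facets alphabetically by notation, are assumptions made for illustration.

```python
# Notation -> term, copied from the facet schedule on the preceding slide.
SCHEDULE = {
    "Ac": "Tubular",
    "Bcfi": "Chromium-nickel steels",
    "Kgb": "Creep rupture",
    "Lb": "Tensile",
}
NOTATION_FOR = {label: code for code, label in SCHEDULE.items()}


def synthesize(labels):
    """Build a class mark: look up each facet term, then order by notation."""
    return " ".join(sorted(NOTATION_FOR[label] for label in labels))


def decode(class_mark):
    """Expand a class mark back into its facet terms, in notation order."""
    return [SCHEDULE[code] for code in class_mark.split()]


CLASS_MARK = synthesize(
    ["Tensile", "Tubular", "Creep rupture", "Chromium-nickel steels"]
)
```

Whatever order the facet terms arrive in, the class mark comes out in schedule order, which is what makes the combined notation predictable.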
Other Ways of Adding Value
 Cross references
 Facets
 Notation
 Roles
 Treatment
 Term weighting
Copyright © 2013 Access Innovations, Inc.
Facets
 Additional ways to add meaning
 Divide terms into categories using a
single characteristic
 Limited number of categories
Copyright © 2013 Access Innovations, Inc.
Facets and Roles
 PRECIS - Austin 1984
 order of terms
 post-coordinate indexing system
 role of the term is important
 tomato
 living plant?
 marketable product?
 Facet role indicator
 organism
 end product
Copyright © 2013 Access Innovations, Inc.
Many Faceted Vocabularies
 UMLS Semantic Network
 Unified Medical Language System - 49
 BLISS Classification Association
 British Library Information Science System
 Dewey Decimal Classification System
 Universal Decimal Classification
System
 Art and Architecture Thesaurus
Copyright © 2013 Access Innovations, Inc.
MeSH and Tree Pages
Copyright © 2013 Access Innovations, Inc.
MeSH Alpha
Copyright © 2013 Access Innovations, Inc.
Order of Facets
 Post-coordinate
 Means before order
 Notation becomes important
 Breaks down for large classes
 (more than 5,000 terms)
Copyright © 2013 Access Innovations, Inc.
Other Ways of Adding Value
 Cross references
 Facets
 Notation
 Roles
 Treatment
 Term weighting
Copyright © 2013 Access Innovations, Inc.
Notation Options
 Expressive
 Ordinal
 Synthetic
 Enumeration
 Many style options
Copyright © 2013 Access Innovations, Inc.
Expressive Notation
 83 Hazards
 831 Fire
 831.5 Fire fighting
 831.53 Fire fighting equipment
 831.532 Fire extinguishers
 831.532.5 Carbon dioxide fire extinguishers
 832 Explosions
Copyright © 2013 Access Innovations, Inc.
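With expressive notation the hierarchy can be read off the codes themselves: a term's parent is the term whose code is its longest proper prefix. A minimal sketch using the codes from the slide; the function names are hypothetical.

```python
NOTATION = {
    "83": "Hazards",
    "831": "Fire",
    "831.5": "Fire fighting",
    "831.53": "Fire fighting equipment",
    "831.532": "Fire extinguishers",
    "831.532.5": "Carbon dioxide fire extinguishers",
    "832": "Explosions",
}


def digits(code):
    # The dots are only a reading aid; compare on the digits alone.
    return code.replace(".", "")


def parent(code):
    """Return the code with the longest prefix of this one, or None at the top."""
    candidates = [
        other for other in NOTATION
        if other != code and digits(code).startswith(digits(other))
    ]
    return max(candidates, key=lambda other: len(digits(other)), default=None)


def depth(code):
    """Count the steps from a code up to the top of its hierarchy."""
    level = 0
    current = parent(code)
    while current is not None:
        level += 1
        current = parent(current)
    return level
```

Here `parent("831.532.5")` is "831.532" and `parent("832")` is "83". Ordinal notation, by contrast, would not support this: the codes carry order but not structure.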
Ordinal and Semi-ordinal
Notation
 HK Hazards
 HL Fire
 HM Fire fighting
 HN Fire fighting equipment
 HNB Fire extinguishers
 HNE Carbon dioxide fire extinguishers
 HO Explosions
 Indention is the sole indication of hierarchy
Copyright © 2013 Access Innovations, Inc.
Synthetic and Enumeration
Notation
 Need to allow the classification system to
grow
 Synthetic example
 P Architecture
 PAT Architectural information
 PAT.M Architectural information services
Copyright © 2013 Access Innovations, Inc.
Notation Examples
- AAT Facets
Copyright © 2013 Access Innovations, Inc.
Systematic Display
 Paints
 (By composition)
 Oil paints
 Water paints
 Cement paints
 (By use)
 Primers
 Undercoats
 Top coats
Copyright © 2013 Access Innovations, Inc.
AAT Pages
 Notice faceted
 indentions
Copyright © 2013 Access Innovations, Inc.
AAT Term
Copyright © 2013 Access Innovations, Inc.
Alphabetical Display
 Paints
 NT
 Cement paints
 Oil paints
 Primers
 Top coats
 Undercoats
 Water paints
Copyright © 2013 Access Innovations, Inc.
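The alphabetical display can be generated from the systematic one by dropping the facet labels and sorting the narrower terms. A minimal sketch; the nested-dictionary layout and the function name are assumptions.

```python
# Systematic display: term -> facet label -> narrower terms, as on the slide.
SYSTEMATIC = {
    "Paints": {
        "(By composition)": ["Oil paints", "Water paints", "Cement paints"],
        "(By use)": ["Primers", "Undercoats", "Top coats"],
    }
}


def alphabetical_display(systematic):
    """Flatten the facets and list each term's narrower terms in A-Z order."""
    lines = []
    for term in sorted(systematic):
        lines.append(term)
        lines.append("  NT")
        narrower = [nt for group in systematic[term].values() for nt in group]
        lines.extend("    " + nt for nt in sorted(narrower))
    return lines


ALPHABETICAL = alphabetical_display(SYSTEMATIC)
```

Keeping the systematic form as the single source and deriving the alphabetical view from it avoids maintaining two lists by hand.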
Other Ways of Adding Value
 Cross references
 Facets
 Notation
 Roles
 Treatment
 Term weighting
Copyright © 2013 Access Innovations, Inc.
Roles
 ERIC Thesaurus - role indicators
 Adjectives - bibliographic terms
 Input or raw material
 Output or product
 Undesirables
 Indicated uses
 Materials “In which”
 Affects
 Primary topics of discussion
 Passive recipients, possessors, location
 Means used
Copyright © 2013 Access Innovations, Inc.
Roles
 CAS - Super roles
 Analytical study
 Biological study
 Formation, nonpreparative
 Occurrence
 Preparation
 Process
 Uses
 CAS Specific roles
 Miscellaneous
 Properties
 Reactant
Copyright © 2013 Access Innovations, Inc.
Subheadings as Roles
 MeSH
 Therapeutic use
 Drug treatment (disease)
 Adverse effect (drug treatment)
 Diagnosis
Copyright © 2013 Access Innovations, Inc.
Other Ways of Adding Value
 Cross references
 Facets
 Notation
 Roles
 Treatment
 Term weighting
Copyright © 2013 Access Innovations, Inc.
Treatment and
Aspect Codes
 Apply codes or types at article level
 Theoretical
 New development
 Experimental
 Practical
Copyright © 2013 Access Innovations, Inc.
Other Ways of Adding Value
 Cross references
 Facets
 Notation
 Roles
 Treatment
 Term weighting
Copyright © 2013 Access Innovations, Inc.
Cranfield Project -
Cleverdon 1966
 Concepts in the main theme 9/10
 Major subsidiary theme 7/8
 Minor subsidiary theme 5/6
Copyright © 2013 Access Innovations, Inc.
Internet Engines
 Complex weighting of terms
 Use term frequency
 Rank output wholly automatic
 Output based on input term weights
 Can also use “well formed” data -
 like a thesaurus hierarchy
 field formatted data
 XML files
Copyright © 2013 Access Innovations, Inc.
Automatic and Semi-automatic
Classification?
 Data Harmony® M.A.I.™
 Semio
 Autonomy - Muscat
 Net Owl - Names
 n-Stein
 Quiver
 Smart Logic
Copyright © 2013 Access Innovations, Inc.
Machine Aided Indexing Goals
 Improve
 Indexing efficiency
 Indexing consistency
 Reduce editorial drift
 Depth of Indexing
 Reduce
 Over and under indexing
 Term over use and under use
Copyright © 2013 Access Innovations, Inc.
Machine Aided Indexing Goals
 Improve productivity
 Indexer
 Information worker
 Disambiguate terms
 Increase clarity
Copyright © 2013 Access Innovations, Inc.
Machine Aided Indexing -
Intellectual Components
 Word List or Thesaurus
 Knowledge base
 Rules based
 Natural Language (Semantic)
 Editorial evaluation
Copyright © 2013 Access Innovations, Inc.
Example:
M.A.I.™ Software Components
 Rule Builder
 Concept Extractor
 Statistics Collector
Copyright © 2013 Access Innovations, Inc.
Taxonomies in
Search
Copyright © 2013 Access Innovations, Inc.
Do the Data FIRST
 What do you have?
 What does it need?
 How would you LIKE to access it?
 Look at the data BEFORE you create the
specifications
 DTD built without data is not going to work
 Then choose the system that will support
your data
Copyright © 2013 Access Innovations, Inc.
My Main Frustration
1. Select hardware
2. Select software
3. Design system
4. Try to load the data
5. Add the taxonomy, if at all
 That’s BACKWARDS
Copyright © 2013 Access Innovations, Inc.
Why Does Search Fail?
 Most large organizations have 5 different
search 7
 All disappointing and sitting on the shelf
 Inconsistent results
 Unclear path to results
 Lack of single unified clear consistent
vocabulary
 Not tied to data governance
 Taxonomy
 Other metadata
Copyright © 2013 Access Innovations, Inc.
SEARCH
 How search works
 Measuring accuracy in search
 Precision
 Recall
 Relevance
 Search theoretical basis
 Bayes, Boole, and the rest of the guys
 The taxonomy effect
Copyright © 2013 Access Innovations, Inc.
Parts of Search
 Search software
 Inverted Index
 Search algorithms
 Presentation layer
 Search box
 Autocompletion
 Related and narrower terms
 Hierarchical display
Copyright © 2013 Access Innovations, Inc.
Hierarchical Display
Inverted
File
Index
Searchable Index
Taxonomy
Thesaurus
Inverted Files and Boolean
are Basic to ALL Search
Copyright © 2013 Access Innovations, Inc.
Note: not available in all systems!
Creating an Inverted File Index
Sample DOCUMENT
“Outline of Presentation”
1 Define key terminology
2 Thesaurus tools
 Features
 Functions
3 Costs
 Thesaurus construction
 Thesaurus tools
4 Why & when?
Copyright © 2013 Access Innovations, Inc.
Simple Inverted File Index of
the Terms from the “Outline”
&
1
2
3
4
construction
costs
define
features
functions
key
of
outline
presentation
terminology
thesaurus
tools
when
why
Copyright © 2013 Access Innovations, Inc.
Complex Inverted File Index - Placement, Location added
& - Stop
1 - Stop
2 - Stop
3 - Stop
4 - Stop
construction - L7, P2, SH
costs - L6, P1, H
define - L2, P1, H
features - L4, P1, SH
functions - L5, P1, SH
key - L2, P2, H
of - Stop
outline - L1, P1, T
presentation - L1, P3, T
terminology - L2, P3, H
thesaurus - (1) - L3, P1, H
            (2) - L7, P1, SH
            (3) - L8, P1, SH
tools - (1) - L3, P2, H
        (2) - L8, P2, SH
when - L9, P3, H
why - L9, P1, H
Copyright © 2013 Access Innovations, Inc.
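The complex inverted file index above can be rebuilt in a few lines. This is a minimal sketch, not how any particular search engine does it: it records line (L), word position (P), and field (T = title, H = heading, SH = subheading) for every non-stop word in the sample document. One simplification is assumed: the leading outline numbers are dropped before parsing rather than listed as stop entries.

```python
import re
from collections import defaultdict

STOP_WORDS = {"&", "of"}

# (field, text) for each line of the sample document.
SAMPLE_DOCUMENT = [
    ("T", "Outline of Presentation"),
    ("H", "1 Define key terminology"),
    ("H", "2 Thesaurus tools"),
    ("SH", "Features"),
    ("SH", "Functions"),
    ("H", "3 Costs"),
    ("SH", "Thesaurus construction"),
    ("SH", "Thesaurus tools"),
    ("H", "4 Why & when?"),
]


def build_inverted_index(lines):
    """Map each word to a list of (line, position, field) postings."""
    index = defaultdict(list)
    for line_no, (field, text) in enumerate(lines, start=1):
        text = re.sub(r"^\d+\s+", "", text)          # drop the outline number
        tokens = re.findall(r"[a-z]+|&", text.lower())
        for position, token in enumerate(tokens, start=1):
            if token in STOP_WORDS:
                continue  # stop words still occupy a position, but get no posting
            index[token].append((line_no, position, field))
    return dict(index)


INDEX = build_inverted_index(SAMPLE_DOCUMENT)
```

`INDEX["thesaurus"]` holds three postings, one per occurrence, which is the information a search engine needs for proximity and field-restricted queries.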
Search Presentation Layer
Automatic completion
And type ahead
from Thesaurus
Copyright © 2013 Access Innovations, Inc.
Search Presentation Layer
Related
Narrower
Copyright © 2013 Access Innovations, Inc.
Search Presentation Layer
The hierarchical view of the thesaurus is
also a browsable view of the content.
The numbers include the number of hits
1. For the term
2. For the branch
Copyright © 2013 Access Innovations, Inc.
Many parts
 Search software – of course
 Computer network
 Parsing of text – the “inverted file”
 Well formed or structured text
 CLEAN DATA
 Computer software – network
 Computer hardware
 Telecommunications connection
 Training sets for statistical systems
How Does Search Work?
Copyright © 2013 Access Innovations, Inc.
Technical Parts of Search
 Search technology
 Ranking algorithms
 Query language
 Federators
 Cache
 Inverted index – as discussed above
 Other enhancements
 Presentation Layer
Copyright © 2013 Access Innovations, Inc.
Access Innovations – Complex
Farm With Perfect Search
Source
Data
Query
Search
Harmony
Presentation
Layer
Repository XIS
(cache)
Cleanup, etc.
Federators
Query Servers
Index
Builders
Deploy
Hub
Cache
Builders
Copyright © 2013 Access Innovations, Inc.
FAST Search Example
Core Architectural Components
[Architecture diagram; labels only, grouped by apparent function]
Content sources: Web Content; Files, Documents; Databases; Custom Applications; Email, Groupware
Connectors: WEB CRAWLER; FILE TRAVERSER; DATABASE CONNECTOR; EMAIL CONNECTOR; CUSTOM CONNECTOR
Content side: Content Push; CONTENT API; DOCUMENT PROCESSOR; Pipeline
Servers: SEARCH SERVER; Index DB; FILTER SERVER; Agent DB; Alerts
Query side: QUERY API; QUERY PROCESSOR; Pipeline; Query; Results
Front ends: Vertical Applications; Portals; Custom Front-Ends; Mobile Devices
Administration: MANAGEMENT API; Administrator’s Dashboard
Data Harmony: Data Harmony Governance API; MAIstro; Searchharmony
Copyright © 2013 Access Innovations, Inc.
Measuring Accuracy in Search
 Relevance
 Recall
 Precision
 Accuracy – hits, misses, noise
 Ranking
 Linguistics
 Query Processing
 Results Processing
 Display
 Search refinement
 Usability
 Business Rules
Copyright © 2013 Access Innovations, Inc.
Relevance
 How well a set of returned documents answers
the information need
 “Accuracy”
 Related to objective of search
 Different user communities
 Information resources
 Tension of user needs and context available
 A confidence “guesstimate”
Copyright © 2013 Access Innovations, Inc.
The Formulas
Recall = Number of relevant items retrieved / Number of relevant items in the collection
Precision = Number of relevant items retrieved / Number of items retrieved
Relevance = Germane (Precision), Pertinent (Recall)
Copyright © 2013 Access Innovations, Inc.
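The two formulas translate directly into code. The sketch below uses made-up document IDs; deciding which items count as relevant is still a human judgment, as the slides stress.

```python
def precision_and_recall(retrieved, relevant):
    """Return (precision, recall) for one search result set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant items that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 items come back, 5 relevant items exist, 2 overlap.
RETRIEVED = {"doc1", "doc2", "doc3", "doc4"}
RELEVANT = {"doc2", "doc4", "doc5", "doc6", "doc7"}
PRECISION, RECALL = precision_and_recall(RETRIEVED, RELEVANT)
```

In this example precision is 2/4 and recall is 2/5: half of what came back was useful, and the search found two of the five relevant items in the collection.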
Measuring Relevance
 Concepts
 Context
 Age of documents
 Completeness (recall)
 Quality
 Statistically determined ?
 Nope, it is subjective
 Someone has to determine the rightness of the item
 A confidence factor = canard!
Copyright © 2013 Access Innovations, Inc.
Kinds of Search
 Bayesian –
 FAST
 Lucene
 Autonomy / Verity
 Boolean
 Dialog
 Endeca
 Perfect Search
 Ranking algorithms
 Google
Copyright © 2013 Access Innovations, Inc.
George Boole
and Boolean Algebra
 George Boole
 Mathematician
 1815-1864
 Boolean algebra
 An algebraic system of logic
 AND, OR, NOT, ANDNOT,
 Dialog, BRS, Stairs
Copyright © 2013 Access Innovations, Inc.
Boolean Representation
 Venn diagram showing
the intersection of sets A
AND B (in violet),
 The union of sets A OR
B (all the colored
regions),
 And set A XOR B (all the
colored regions except
the violet).
 The "universe" is
represented by the
rectangular frame.
Copyright © 2013 Access Innovations, Inc.
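Boolean search over an inverted index is set arithmetic on the lists of document numbers. A minimal sketch with made-up postings; the variable and function names are hypothetical.

```python
# Hypothetical postings: term -> set of document numbers containing it.
POSTINGS = {
    "thesaurus": {1, 2, 4},
    "taxonomy": {2, 3, 4},
    "ontology": {4, 5},
}
ALL_DOCUMENTS = {1, 2, 3, 4, 5}  # the "universe" (the rectangular frame)


def docs(term):
    """Documents containing the term; an unknown term matches nothing."""
    return POSTINGS.get(term, set())


a, b = docs("thesaurus"), docs("taxonomy")
A_AND_B = a & b             # intersection: both terms present
A_OR_B = a | b              # union: either term present
A_XOR_B = a ^ b             # one term or the other, but not both
A_ANDNOT_B = a - b          # first term present, second absent
NOT_A = ALL_DOCUMENTS - a   # NOT needs the universe to subtract from
```

Each operator corresponds to one region of the Venn diagram described above, which is why inverted files and Boolean logic fit together so naturally.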
Bayes and Bayes’ Theorem
 Thomas Bayes
 Mathematician
 1702 - 1761
 Bayesian theorem
 Uses probability inductively
 Established a mathematical basis for probability inference
 WHAT?
 A means of calculating,
 from the number of times an event has not occurred,
 the probability that it will occur in future trials
Copyright © 2013 Access Innovations, Inc.
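For reference, the theorem itself in its standard form. Reading A as "this document is relevant" and B as "this term appears in the document" is an illustrative interpretation for search, not something taken from the slides.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}, \qquad P(B) > 0
```

P(A) is the prior belief and P(A | B) is the belief after seeing the evidence B, so the quality of the result depends directly on the quality of the prior.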
Bayesian Methods –
Cautions
 A user might wish to change the distribution of
probabilities.
 A user will make a novel request for information
in a previously unanticipated way.
 The computational difficulty of exploring a
previously unknown network.
 The quality and extent of the prior beliefs used
in Bayesian inference processing.
Copyright © 2013 Access Innovations, Inc.
Bayesian Methods -
Cautions (continued)
 A Bayesian network is only as useful as the
prior knowledge is reliable.
 An optimistic or pessimistic expectation of the
quality of these prior beliefs will distort the
entire network and invalidate the results.
 Must ensure the selection of the statistical
distribution induced in modeling the data.
 Must have the proper distribution model to
describe the data.
 That is… you have to constantly train and
retrain the data
Copyright © 2013 Access Innovations, Inc.
Basic Areas of Natural
Language Processing (NLP)
 Syntactic
 Semantic
 Morphological
 Phraseological
 Lemmatization (stemming)
 Statistical
 Grammatical
 Common Sense
Copyright © 2013 Access Innovations, Inc.
Basic Areas of Automatic
Language Processing (ALP)
 Auto Translation
 Auto Indexing
 Auto Abstracting
 Artificial Intelligence
 Searching
 Spell Checking
 Semantic Web
 Natural Language Processing (NLP)
 Computational Linguistics
Copyright © 2013 Access Innovations, Inc.
Statistical Search
 Cluster analysis
 Neural networks
 Co-occurrence
 Bayesian inference
 Latent Semantic
 Etc.
Copyright © 2013 Access Innovations, Inc.
Word and Term Parsing
 Stemming
 -ing, -ed, -es, -’s, -s’, etc.
 Depluralization
 Truncation
 Left and right
 Wild cards
 Organi*ation
 Variant Spellings
 Centre, Center
 Hyphens
Copyright © 2013 Access Innovations, Inc.
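The parsing steps above can be sketched with the Python standard library. This is a deliberately naive illustration: the suffix list comes from the slide, the function names are hypothetical, and a production system would use a tested stemmer instead.

```python
from fnmatch import fnmatchcase

# Suffixes from the slide, longest first so "es" is tried before "s".
SUFFIXES = ("ing", "ed", "es", "'s", "s'", "s")

# Variant spelling -> form used in the index (an illustrative choice).
VARIANTS = {"centre": "center", "fibre": "fiber"}


def naive_stem(word):
    """Strip one known suffix, keeping at least three characters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def wildcard_match(pattern, vocabulary):
    """Return the vocabulary words matching a pattern such as 'organi*ation'."""
    return [w for w in vocabulary if fnmatchcase(w.lower(), pattern.lower())]


def normalize(word):
    """Fold a variant spelling to the indexed form."""
    return VARIANTS.get(word.lower(), word.lower())


MATCHES = wildcard_match(
    "Organi*ation", ["organization", "organisation", "organism"]
)
```

Suffix stripping this crude will over-stem some words, which is one reason a taxonomy with explicit variant forms often gives cleaner results than stemming alone.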
The Taxonomy Effect
 Where do the terms go?
 How are they used in search
 What other ways can I use the taxonomy
in search?
Copyright © 2013 Access Innovations, Inc.
For search all publications
Search database
for Journals and
pubs
Bookstore search
Search of 53 crawled
sites including
journals, books, web
site, conference
sites, etc.
Site search
Navigation
Copyright © 2013 Access Innovations, Inc.
Taxonomy Driven
Search Presentation
Navigate
the full
taxonomy
“tree”
BROWSE
Auto-completion using the
taxonomy
Guide the user
Copyright © 2013 Access Innovations, Inc.
Subject Browsing
Copyright © 2013 Access Innovations, Inc.
Targeted Resources Based
on Subject or User Role
CONFIDENTIAL
Copyright © 2013 Access Innovations, Inc.
Member Profile Tagging
User pastes or
uploads CV
Button to auto-
extract taxonomy
attributes
Copyright © 2013 Access Innovations, Inc.
TaxoTerm Server
DataHarmony
(M.A.I.)
Returns subject
metadata
Microsoft
SharePoint
Server 2010
User uploads a document
to SharePoint space
Before uploading to
SharePoint server, the
EventHandler sends the
document to Data
Harmony.
Data Harmony
automatically attaches
indexing terms before
uploading to MOSS
Adding Terms
to SharePoint
Copyright © 2013 Access Innovations, Inc.
SharePoint 2010 Only Shows
10 Lines of the Taxonomy
This add-on makes it all viewable
Copyright © 2013 Access Innovations, Inc.
QUERYAPI
CUSTOM
CONNECTOR
EMAIL
CONNECTOR
Core Architectural Components
Pipeline
SEARCH
SERVER
QUERY
PROCESSOR
Query
Results
Vertical
Applications
Portals
Custom
Front-Ends
Mobile
Devices
Content
Push
DOCUMENT
PROCESSOR
Web
Content
Files,
Documents
Databases
Custom
Applications
CONTENT API
FAST MANAGEMENT API
Index DB
DATABASE
CONNECTOR
FILE
TRAVERSER
WEB
CRAWLER
Pipeline
Email,
Groupware
Administrator’s
Dashboard
FILTER
SERVER
Agent DB
Alerts
Use taxonomy terms here
Data Harmony Governance API
MAIstro
Search Harmony
Taxonomies Added in
Search Example
Auto suggestion of
Taxonomy Terms
Populate
Keywords,
Descriptors,
Indexing terms,
etc.
Allow for manual
review of auto-
tagging for
quality
assurance.
Where do I use a taxonomy?
Thesaurus
Master
Machine Aided
Indexer
(M.A.I.™)
Database
Repository
Search
Presentation
Layer
Increases
accuracy
Browse by
Subject
Auto-completion
Broader Terms
Narrower Terms
Related Terms
Client Taxonomy
Inline Tagging
Metadata and
Entity
Extractor
Automatic
Summarization
Search
Software
Client Data
Full Text
HTML, PDF,
Data Feeds,
etc.
Client
taxonomy
The Workflow
Gather source data
Tag and create metadata
Put in database with tags
Build search inverted index
Create user interface
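The "build search inverted index" step of the workflow can be sketched as follows. The tagged records, the function names, and the Boolean AND search are assumptions for illustration; real search software builds far richer indexes, but the idea of mapping each term to the records that carry it is the same.

```python
from collections import defaultdict

# Hypothetical tagged records: ID -> taxonomy terms applied during indexing.
TAGGED = {
    "doc1": ["Taxonomies", "Indexing"],
    "doc2": ["Indexing", "Search engines"],
    "doc3": ["Taxonomies"],
}


def build_inverted_index(tagged: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each taxonomy term to the set of record IDs tagged with it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, terms in tagged.items():
        for term in terms:
            index[term.casefold()].add(doc_id)
    return dict(index)


def search(index: dict[str, set[str]], *terms: str) -> set[str]:
    """Return the IDs tagged with every query term (Boolean AND)."""
    postings = [index.get(term.casefold(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


INDEX = build_inverted_index(TAGGED)
```

A call such as search(INDEX, "Taxonomies", "Indexing") then looks up two posting sets and intersects them, with no need to rescan the documents.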
Thesaurus
Master
Machine Aided
Indexer
(M.A.I.™)
Repository
Search
Presentation:
90%
accuracy
Browse by
Subject
Auto-
completion
Broader Terms
Narrower Terms
Related Terms
Client Taxonomy
Inline Tagging
Metadata and
Entity
Extractor
Automatic
Summarization
Search
Software
Client Data
Full Text
HTML, PDF,
Data Feeds, etc.
Client
taxonomy
Taxonomy in SharePoint
[Data Harmony fully integrated with MOSS.]
Adding Terms to
Information Objects
 Part of the record
 XML
 MARC
 A relational table pointing the terms to a
record ID number (Secondary key)
 Adding data to the HTML
 META NAME KEYWORD Element
 Many other options
Part of the Record - XML
 Added as an element in the XML record
 Need an element to put the data in
 <TaxonomyTerm>
 Capture the terms when creating the
records
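As a minimal sketch of adding terms as elements in the XML record, the example below writes one element per indexing term. The element names Record, Title, and TaxonomyTerm and the build_record function are assumptions for illustration (XML element names cannot contain spaces, so the term element is written as one word); an actual schema would define its own names.

```python
import xml.etree.ElementTree as ET


def build_record(record_id: str, title: str, terms: list[str]) -> str:
    """Serialize a record with one TaxonomyTerm element per indexing term."""
    record = ET.Element("Record", id=record_id)
    ET.SubElement(record, "Title").text = title
    for term in terms:
        ET.SubElement(record, "TaxonomyTerm").text = term
    return ET.tostring(record, encoding="unicode")


xml_record = build_record("1001", "Taxonomy Fundamentals", ["Taxonomies", "Indexing"])
```

Keeping each term in its own element, rather than one delimited string, lets the search software index and facet on the terms individually.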
The author pastes
the data into the
document
template,
attaching images
and graphs as
necessary:
Author
Submission
Module
Editorial Workflow Integration
Author Submission Module
The author enters the data into the document template, attaching images
and graphs as necessary.
An API calls Data Harmony and generates a list of indexing terms
based on the content.
Authors review the
indexing and may
change it.
Content is stored
in a data
repository as
HTML, XML, etc.
Editorial Workflow Integration
Author Submission Module
In the HTML Record
 Makes it crawlable for the internet
 Used in CMS applications
 Content Management Systems
 Add to the HTML
 Manually
 In Dreamweaver
 In your CMS, like Ektron
 Author Submissions Example
 Do the same with SharePoint
META NAME “KEYWORDS”
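A small sketch of writing taxonomy terms into the HTML header follows. The keywords_meta function name and the sample terms are assumptions for the example; the meta element with name="keywords" and a comma-separated content attribute is standard HTML.

```python
from html import escape


def keywords_meta(terms: list[str]) -> str:
    """Build a META NAME="keywords" tag from taxonomy terms."""
    content = ", ".join(terms)
    # Escape &, <, > and quotes so a term cannot break the attribute.
    return f'<meta name="keywords" content="{escape(content, quote=True)}">'


tag = keywords_meta(["Taxonomies", "Indexing", "Research & development"])
```

The resulting tag goes inside the head element of the page, where crawlers and site search tools can read it.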
In Relational Database
Table
 Primary Key – the record
 Secondary key – all the metadata
 Like taxonomy terms
 Like author
 Like publication date
 Used in Oracle, SQL, etc.
 Need a field to put the taxonomy data in
 Supports “Faceted Search”
 each item in a separate field or element or table
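The relational approach can be illustrated with an in-memory SQLite database: one table for the records, and a term table that points each taxonomy term at a record ID. The table names, column names, and sample rows are hypothetical; the same layout would carry over to Oracle or another SQL database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE record (record_id INTEGER PRIMARY KEY, title TEXT, pub_year INTEGER);
    CREATE TABLE record_term (
        record_id INTEGER REFERENCES record(record_id),
        term TEXT NOT NULL
    );
    CREATE INDEX idx_term ON record_term(term);
""")
conn.executemany("INSERT INTO record VALUES (?, ?, ?)",
                 [(1, "Building a thesaurus", 2012), (2, "Search basics", 2013)])
conn.executemany("INSERT INTO record_term VALUES (?, ?)",
                 [(1, "Taxonomies"), (1, "Indexing"), (2, "Indexing")])


def records_for_term(term: str) -> list[str]:
    """Return titles of records tagged with the term, ordered by record ID."""
    rows = conn.execute(
        "SELECT r.title FROM record r JOIN record_term t USING (record_id) "
        "WHERE t.term = ? ORDER BY r.record_id", (term,))
    return [title for (title,) in rows]


def facet_counts() -> dict[str, int]:
    """Count records per taxonomy term, as a faceted display would."""
    rows = conn.execute("SELECT term, COUNT(*) FROM record_term GROUP BY term")
    return dict(rows.fetchall())
```

Because each term sits in its own row, the counts shown next to facet values fall out of a simple GROUP BY.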
RDBMS Connection
Taxonomy
term table
Using Taxonomies
in Applications
• Improve search
• Subject browsing
• Mobile intelligence
• Targeted resources based
on subject or user role
• Link to society resources
• Author submission module
• Author authority database
• Expert reviewer identification
• Member profiles
• Data visualization
• More like this
• In “indexing” or categorizing,
as subject metadata
• In content management
systems
• In SharePoint
• In mashups
• In social networking sites
• In author tagging
• In filtering data – e.g., spam
filters and RSS feeds
• In web crawlers
• Social media - community
A Quick Look
Behind the Scenes
Database
Management
System
Thesaurus
tool
Indexing
tool
• Validate terms
• Add terms and rules
• Change terms and rules
• Delete terms and rules
• Search thesaurus
• Validate term entry
• Block invalid terms
• Record candidates
• Establish rules for term use
• Suggest indexing terms
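Validating term entry, blocking invalid terms, and recording candidates might look like the sketch below. The term sets and the validate_term function are hypothetical and are not the Data Harmony implementation; they only illustrate the behavior the bullets describe.

```python
from typing import Optional

# Hypothetical thesaurus: preferred terms plus non-preferred (USE) entries.
PREFERRED = {"taxonomies", "indexing", "thesauri"}
USE_FOR = {"taxonomy": "taxonomies", "subject indexing": "indexing"}

candidates: list[str] = []  # blocked entries kept for later editorial review


def validate_term(entry: str) -> Optional[str]:
    """Return the valid preferred term, or None after recording a candidate."""
    key = entry.strip().casefold()
    if key in PREFERRED:
        return key
    if key in USE_FOR:
        return USE_FOR[key]  # redirect a non-preferred entry to its preferred term
    candidates.append(entry.strip())  # block the entry but keep it as a candidate
    return None
```

The candidate list gives the taxonomy editor a record of what indexers tried to use, which is one input for deciding which terms to add.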
Taxonomy
view
Thesaurus
Term Record
view
Where Does the
Subject Metadata Go?
 Apply to content itself
 Use meta name field in HTML header
 Connect search to the keywords in the SQL or
other database tables
HTML Header
Suggested taxonomy descriptors
Integrate Taxonomy
to Enhance Find-ability
 Browsable categories of a directory
 Browsable faceted navigation
 Smart search for term equivalents
 Taxonomy terms (original or modified) as labels
 Navigation aids incorporate taxonomy terms
and relationships
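"Smart search for term equivalents" can be sketched as query expansion against the term records. The thesaurus entries and the expand_query function below are invented for illustration; the point is that a search on a non-preferred term is redirected to the preferred term and widened with its synonyms and, optionally, its narrower terms.

```python
# Hypothetical term records: a USE redirect, synonyms, and narrower terms.
THESAURUS = {
    "cars": {"use": "automobiles"},
    "automobiles": {"synonyms": ["cars", "motor cars"], "narrower": ["sports cars"]},
}


def expand_query(query: str, include_narrower: bool = False) -> list[str]:
    """Expand a query with the preferred term and its equivalents."""
    key = query.strip().casefold()
    record = THESAURUS.get(key)
    if record is None:
        return [key]  # not in the thesaurus: search the words as typed
    if "use" in record:  # non-preferred entry: switch to the preferred term
        key = record["use"]
        record = THESAURUS[key]
    expanded = [key, *record.get("synonyms", [])]
    if include_narrower:
        expanded += record.get("narrower", [])
    return expanded
```

The expanded list would then be sent to the search engine as an OR query, so the user finds the same content whichever equivalent they typed.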
More Taxonomy
Enrichment
 Spelling alternatives and correction
 Related concepts
 Statistical information about the metadata
 Navigation or drill downs
 Search refinement
 Recursive sets
 Concept linking
 Dictionary lookup (in taxonomy glossary)
Brand is repeated in several spots and
tied to search as well
Raw Full
text data
feeds
XIS™
Creation
Taxonomy
Thesaurus
Master®
Printed
source
materials
Taxonomy
terms
M.A.I.™
Concept
Extractor
M.A.I.™
Rule Base
Load to
Perfect Search
Search
Harmony™
Display
Search
Database Plus Search Workflow
Data
Crawls on
53+
sources
Add
metadata
XIS™
repository
SQL for
ecommerce
Save data to search and
repositories at the same time
Raw Full
text data
feeds XIS
Creation
Taxonomy
Thesaurus
Master
Printed
source
materials
Taxonomy
terms
MAI Rule
Base
Load to
Search
Search
Harmony
Display
Search
Database Plus Search Workflow
Data
Crawls on
data
sources
Add
metadata
XIS
repository
SQL for
ecommerce
MAI Concept
Extractor
Source data
Clean and enhance data
Search data
Use Case: Inline Tagging
Show the exact point where the
concept is mentioned
Mouse-over to view the term record
Statistical summary, showing the
number of times each term is
mentioned in the article
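The inline tagging use case can be approximated with a regular expression that wraps each term mention in a span and keeps a running count per term. The term list, the span markup, and the tag_inline function are assumptions for this sketch; it also assumes plain text input that has already been HTML-escaped.

```python
import re
from collections import Counter

# Hypothetical taxonomy terms to look for in the text.
TAG_TERMS = ["taxonomy", "thesaurus", "indexing"]
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, TAG_TERMS)) + r")\b", re.IGNORECASE
)


def tag_inline(text: str) -> tuple[str, Counter]:
    """Wrap each term mention in a span and count mentions per term."""
    counts: Counter = Counter()

    def wrap(match: re.Match) -> str:
        term = match.group(1).lower()
        counts[term] += 1
        # data-term carries the taxonomy term a mouse-over could look up.
        return f'<span class="term" data-term="{term}">{match.group(0)}</span>'

    return PATTERN.sub(wrap, text), counts


tagged, counts = tag_inline("A taxonomy supports indexing. Indexing uses the taxonomy.")
```

The returned counts are the statistical summary, and the span positions mark the exact points where each concept is mentioned.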
Inline Tagging HTML View
XML View for
Inline Tagging
 The New Board Game
 Applications
 Implementation
 The taxonomy
The Changing Faces of
Web Taxonomies
 …and how the information is delivered
 From current site
 To new version
 Depends on TAXONOMY
 Personalization
 Feeding ads
 Consistent information
HTML Headers
META NAME KEYWORD
Use the taxonomy here
More Innovations!
 Link topic to article to author to event
 Make visual links within domain
 Enable authors to submit and categorize conference
submissions
 Create author authority database linking to co-authors,
topics, locations, etc.
 Create expert reviewer database
 Create member profiles with alternate names, publications,
tagged by topic
 Visualize data and domain distribution
 Display interest connections in social network
 Deliver accurate targeted information through mobile
applications
 Etc.
Change to Ready, Aim, Fire!
 Follow the data
 Look at the data, format and content
 Design taxonomy for data
 Leverage the standards
 Use taxonomy to tag data
 Choose search and repository software for data
 Load the data into the system
 Keep your eye on the target
Standards for
Monolingual Thesauri
 TEST - Thesaurus of Engineering and
Scientific Terms - COSATI 1967
 AFNOR NF Z 47-100 - 1981 - French
 DIN 1463 German 1987-1993
 NISO Z39.19 - 1993 - American
Where Can I Get
Taxonomy Standards?
 www.niso.org
 Z39.19 (2010) Controlled Vocabularies
 www.iso.org
 ISO 25964 parts 1 and 2 (2012 and 2013)
 www.bsigroup.com
 www.w3c.org SKOS and OWL
 www.accessinn.com/library
Suggested Reading
 F.W. Lancaster
 Vocabulary Control for Information Retrieval, 1986
 Aitchison, Gilchrist and Bawden
 Thesaurus construction and use: a practical manual
4th edition
 The Accidental Taxonomist
 Heather Hedden
 TaxoDiary.com Blog site
Suggested Reading
 Introduction to any thesaurus
 INSPEC
 NICEM
 Psychological Abstracts
 etc.
It Just Takes
a Little
Imagination
Thank you
Marjorie M.K. Hlava, President
Bob Kasenchak, Project Coordinator
Access Innovations
505-998-0800
mhlava@accessinn.com
Bob_kasenchak@accessinn.com
Tagging overview  - Why Keywords Don't Cut ItTagging overview  - Why Keywords Don't Cut It
Tagging overview - Why Keywords Don't Cut ItAccess Innovations, Inc.
 
DHUG 2018: Towards Web-Centric Repository Interoperability
DHUG 2018: Towards Web-Centric Repository InteroperabilityDHUG 2018: Towards Web-Centric Repository Interoperability
DHUG 2018: Towards Web-Centric Repository InteroperabilityAccess Innovations, Inc.
 
DHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
DHUG 2017 - Understanding ROI Just Enough to Get Your Project FundedDHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
DHUG 2017 - Understanding ROI Just Enough to Get Your Project FundedAccess Innovations, Inc.
 

Mais de Access Innovations, Inc. (20)

ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
 
Smart submit
Smart submitSmart submit
Smart submit
 
Plos taxonomy beyond search dhug 2021
Plos taxonomy beyond search   dhug 2021Plos taxonomy beyond search   dhug 2021
Plos taxonomy beyond search dhug 2021
 
Hindawi taxonomy and personalization 27.10 (1)
Hindawi taxonomy and personalization 27.10 (1)Hindawi taxonomy and personalization 27.10 (1)
Hindawi taxonomy and personalization 27.10 (1)
 
Data harmonycloudpowerpointclientfacing
Data harmonycloudpowerpointclientfacingData harmonycloudpowerpointclientfacing
Data harmonycloudpowerpointclientfacing
 
Data harmony update 2021
Data harmony update 2021 Data harmony update 2021
Data harmony update 2021
 
Atypon dhug2021
Atypon dhug2021Atypon dhug2021
Atypon dhug2021
 
Asco using ai-taxos-for meta-titles-february-2021
Asco using ai-taxos-for meta-titles-february-2021Asco using ai-taxos-for meta-titles-february-2021
Asco using ai-taxos-for meta-titles-february-2021
 
Asce more than just topic taxonomies
Asce more than just topic taxonomiesAsce more than just topic taxonomies
Asce more than just topic taxonomies
 
Acs discoverability-dhug2021
Acs discoverability-dhug2021Acs discoverability-dhug2021
Acs discoverability-dhug2021
 
Ai webinar 2 -what's in a name (consolidated pdf)
Ai webinar 2 -what's in a name (consolidated pdf)Ai webinar 2 -what's in a name (consolidated pdf)
Ai webinar 2 -what's in a name (consolidated pdf)
 
Tagging overview - Why Keywords Don't Cut It
Tagging overview  - Why Keywords Don't Cut ItTagging overview  - Why Keywords Don't Cut It
Tagging overview - Why Keywords Don't Cut It
 
Health Affairs - Why Keywords Don't Cut It
Health Affairs - Why Keywords Don't Cut ItHealth Affairs - Why Keywords Don't Cut It
Health Affairs - Why Keywords Don't Cut It
 
Why Keywords Don't Cut It
Why Keywords Don't Cut ItWhy Keywords Don't Cut It
Why Keywords Don't Cut It
 
Data Harmony update 2020 final
Data Harmony update 2020 finalData Harmony update 2020 final
Data Harmony update 2020 final
 
Data Harmony Update 2020 final
Data Harmony Update 2020 finalData Harmony Update 2020 final
Data Harmony Update 2020 final
 
DHUG 2018: Towards Web-Centric Repository Interoperability
DHUG 2018: Towards Web-Centric Repository InteroperabilityDHUG 2018: Towards Web-Centric Repository Interoperability
DHUG 2018: Towards Web-Centric Repository Interoperability
 
DHUG 2018 - Florida Thesis OCR
DHUG 2018 - Florida Thesis OCRDHUG 2018 - Florida Thesis OCR
DHUG 2018 - Florida Thesis OCR
 
DHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
DHUG 2017 - Understanding ROI Just Enough to Get Your Project FundedDHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
DHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
 
DHUG 2017 - Thesaurus Construction Training
DHUG 2017 - Thesaurus Construction TrainingDHUG 2017 - Thesaurus Construction Training
DHUG 2017 - Thesaurus Construction Training
 

Taxonomy Fundamentals - SLA 2014

  • 1. Taxonomy Fundamentals Why build a taxonomy? SLA – Vancouver – June 7, 2013 www.accessinn.com www.dataharmony.com 505-998-0800 Marjorie M.K. Hlava President and Chief Scientist Bob Kasenchak Project Coordinator Access Innovations, Inc. Copyright © 2013 Access Innovations, Inc.
  • 2. A fast-moving and powerful introduction to both the theoretical and practical aspects of building a taxonomy, thesaurus, and ontology. A well-built taxonomy is part of the foundation of the information architecture underlying web sites, corporate intranets, search/retrieval, and access to relevant content in databases. After defining controlled vocabularies and identifying core standards, you will explore key concepts of taxonomy, thesaurus, indexing, classification, and filtering. Discussion will include the basics of taxonomy records and fundamental term relationships. Attendees will put concepts into practice through multiple exercises, and taxonomy, indexing, and related software tools will be demonstrated. Introduction To Taxonomy Concepts
  • 3. About Access Innovations Access Innovations are experts in content creation, enrichment, and conversion services. We provide services to semantically enrich and tag raw text into highly structured data. We deliver clean, well-formed, metadata- enriched content so our clients can reuse, repurpose, store, and find their knowledge assets. We go beyond the standards to build taxonomies and other data control structures as a solid foundation for your information. Our services and software allow organizations to use and present their information to both internal and external constituents by leveraging search, presentation, and e-commerce. We change search to found! Quick Facts • Founded in 1978 • Headquartered in Albuquerque, NM • Privately held • Delivered more than 2000 engagements Copyright © 2013 Access Innovations, Inc.
  • 4. What we do  Access Innovations  Ensure clean, well-formed content  Create Knowledge Organization Systems (KOS)  Data Harmony Tools  To automatically index content  To manage KOS and more  To semantically enrich the content  To organize the content  Visualization tools to portray the data
  • 5. Outline of the Day  Why the excitement  What is a Taxonomy  Card Sort – Slide 39  How to build a taxonomy  Term relationships  Thesaurus Examples  Pre and Post Coordination  What are we controlling  Vocabulary Options  TaxoMatch - Slide 189  Term Forms  Facets / Notation / Roles / Treatment/ Weighting  Auto Indexing  A Taxing Situation - Slide 315  Search  Where do I use it?  Standards and references
  • 6. Why The Excitement?  Makes information findable!  Cut search time by 50%! (The Weather Channel)  Leverages information in new ways  User satisfaction  Organizes topical areas and web sites  Provides better online help  Customer support 30x more costly than web self- service* *(Forrester Research "Tier Zero Customer Support" 1999) Copyright © 2013 Access Innovations, Inc.
  • 7. Taxonomies are found… • In “indexing”, tagging, categorizing, subject metadata • In search - precision, recall • In content management systems, web sites • In SharePoint to replace term tree, tag uploads • In mashups, repackaging, repurposing data • In social networking sites • In author tagging - peer reviewer selection • In filtering data – e.g., spam filters and RSS feeds • In web crawlers • In text analytics – trend analysis • … and much more Copyright © 2013 Access Innovations, Inc. Because taxonomies make them work
  • 8. Where Does Implementation Happen?  At the backend  When the records / articles are added to the production system  When the search software’s “inverted file” is created  When the HTML for the web page is created Copyright © 2013 Access Innovations, Inc.
  • 9. Heart Of The “Big Data” Production Process Copyright © 2013 Access Innovations, Inc.
  • 10. From the production side to the website display, carry the taxonomy descriptors for use in precision search Copyright © 2013 Access Innovations, Inc.
  • 11. Taxonomy Copyright © 2013 Access Innovations, Inc.
  • 12. Authors at a place MASHUP locations to a GPS grid of an area Two data points GPS Coordinates Taxonomy description of the place Copyright © 2013 Access Innovations, Inc.
  • 13. Watch Crime In Action Copyright © 2013 Access Innovations, Inc.
  • 16. Two data points GPS Coordinates Taxonomy description of the crime Copyright © 2013 Access Innovations, Inc.
  • 20. All Data Up-posted To The Top Level Copyright © 2013 Access Innovations, Inc.
  • 21. Pattern Analysis Indexing Clusters Copyright © 2013 Access Innovations, Inc.
  • 22. Pattern Analysis Domain Associations Copyright © 2013 Access Innovations, Inc.
  • 23. Pattern Analysis Domain Correlations Copyright © 2013 Access Innovations, Inc.
  • 24. Pattern Analysis Gap Analyses Copyright © 2013 Access Innovations, Inc.
  • 25. Pattern Analysis Component Gaps Copyright © 2013 Access Innovations, Inc.
  • 26. More Like This - Recommender. Cancer Epidemiology Biomarkers & Prevention Vol. 12, 161-164, February 2003. © 2003 American Association for Cancer Research. Short Communications. Alcohol, Folate, Methionine, and Risk of Incident Breast Cancer in the American Cancer Society Cancer Prevention Study II Nutrition Cohort. Heather Spencer Feigelson, Carolyn R. Jonas, Andreas S. Robertson, Marjorie L. McCullough, Michael J. Thun and Eugenia E. Calle. Department of Epidemiology and Surveillance Research, American Cancer Society, National Home Office, Atlanta, Georgia 30329-4251. Recent studies suggest that the increased risk of breast cancer associated with alcohol consumption may be reduced by adequate folate intake. We examined this question among 66,561 postmenopausal women in the American Cancer Society Cancer Prevention Study II Nutrition Cohort. Related Press Releases • How What and How Much We Eat (And Drink) Affects Our Risk of Cancer • Novel COX-2 Combination Treatment May Reduce Colon Cancer Risk Combination Regimen of COX-2 Inhibitor and Fish Oil Causes Cell Death • COX-2 Levels Are Elevated in Smokers Related AACR Workshops and Conferences • Frontiers in Cancer Prevention Research • Continuing Medical Education (CME) • Molecular Targets and Cancer Therapeutics Related Meeting Abstracts • Association between dietary folate intake, alcohol intake, and methylenetetrahydrofolate reductase C677T and A1298C polymorphisms and subsequent breast • Folate, folate cofactor, and alcohol intakes and risk for colorectal adenoma • Dietary folate intake and risk of prostate cancer in a large prospective cohort study Related Working Groups • Finance • Charter • Molecular Epidemiology Related Education Book Content Oral Contraceptives, Postmenopausal Hormones, and Breast Cancer Physical Activity and Cancer Hormonal Interventions: From Adjuvant Therapy to Breast Cancer Prevention Related Awards • AACR-GlaxoSmithKline Clinical Cancer Research Scholar Awards • ACS Award • Weinstein Distinguished Lecture Webcasts Related Webcasts Think Tank Report Related Think Tank Report Content
  • 27. Link to Society Resources Journal Article on Topic A Other Journal Articles on Topic A Upcoming Conference on Topic A Podcast Interview with Researcher Working on Topic A Grant Available for Researchers Working on Topic A CME Activity on Topic A Job Posting for Expert on Topic A Copyright © 2013 Access Innovations, Inc.
  • 28. Author Connections Copyright © 2013 Access Innovations, Inc.
  • 29. What is a taxonomy? Albuquerque, NM 87110 www.accessinn.com www.dataharmony.com 505-998-0800 Marjorie M.K. Hlava President and Chief Scientist Access Innovations, Inc. Copyright © 2013 Access Innovations, Inc.
  • 30. Vocabulary Control - Options  Classification systems*  Authority files  Controlled term lists  Uncontrolled term lists  Thesauri Copyright © 2013 Access Innovations, Inc. [*We will concentrate on taxonomies and thesauri, first, and then cover the others as time permits.]
  • 31. Taxonomy Standards  Z39.19 (2005) Controlled Vocabularies  BS 8723 Parts 1 – 5  ISO25964 Parts 1 - 4  TAG 37 and 46 standards  SKOS - Simple Knowledge Organization System  OWL - Web Ontology Language  AND more! Copyright © 2013 Access Innovations, Inc.
  • 32. A Taxonomy is a Knowledge Organization System (KOS)  Uncontrolled list  Name authority file  Synonym set/ring  Controlled vocabulary  Taxonomy  Thesaurus  Ontology  Semantic network Not complex Highly complex Copyright © 2013 Access Innovations, Inc.
  • 33. Structure Of Controlled Vocabularies Lists Synonyms Taxonomy Thesaurus Ontology Ambiguity Ambiguity Ambiguity Specifies a KOS Synonym Synonym Additional kinds of Hierarchy Hierarchy Relationships Relationships relationships INCREASING COMPLEXITY and CONTROL Copyright © 2013 Access Innovations, Inc.
  • 34. What is a Taxonomy? ANSI/NISO Z39.19-2005 “A collection of controlled vocabulary terms organized into a hierarchical structure.” controlled Missing: equivalence, homographic, and associative relationships and notes Yes! Copyright © 2013 Access Innovations, Inc.
  • 35. Taxonomy? Thesaurus?  Often used interchangeably  Thesaurus is a taxonomy with extras  Related Terms  Non-preferred Terms (USE/Used for)  Scope Notes  More  Taxonomies often have the actual information object at the final node.  CMS and SharePoint tend to support only the hierarchical view, definitions, and USE references
  • 36. Taxonomy? Thesaurus?  Main Term (MT) = subject term, heading, node, category, descriptor, class  Top Term (TT)  Broader Terms (BT)  Narrower Terms (NT)  Related Terms (RT)  See also (SA)  Non-Preferred Term (NP)  Used for (UF), See (S)  Scope Note (SN)  History (H)  TAXONOMY THESAURUS OWL can specify
  • 37. The Semantic Roadmap: Knowledge Organization Systems  Semantic network  Ontology  Thesaurus  Taxonomy  Controlled vocabulary  Synonym set/ring  Name authority file  Uncontrolled list •Unrelated Entities •Ambiguity •Linked Entities •Contextual Specificity •Simple •Low Value •Complex •High value Uncontrolled list has the Highest Cost over Time! Copyright © 2013 Access Innovations, Inc.
  • 38. Copyright © 2005 - Access Innovations, Inc. Taxonomy view Thesaurus Term Record view Copyright © 2013 Access Innovations, Inc.
  • 39. Copyright © 2013 Access Innovations, Inc.
  • 40. Taxonomy 101 How do you build a taxonomy? Albuquerque, NM 87110 www.accessinn.com www.dataharmony.com 505-998-0800 Marjorie M.K. Hlava President and Chief Scientist Access Innovations, Inc. Copyright © 2013 Access Innovations, Inc.
  • 41. How Do You Build a Taxonomy ? • Define subject field • Collect terms • Organize terms • Fill in gaps • Flesh out and interrelate terms • Apply to your data You’re done! Copyright © 2013 Access Innovations, Inc.
  • 42. Foundations  Start with what is known  Build from there  Use the literature, your data  Use the lists you already have internally  Built-in continuous review throughout the process, and beyond  Who is involved?  Taxonomists  Subject matter experts (SME)  Project management  Users Copyright © 2013 Access Innovations, Inc.
  • 43. Define Subject Field  Review representative collection of content  Determine:  Core areas  Peripheral topics Psychology Education Sociology Law  Scope can be modified later Copyright © 2013 Access Innovations, Inc.
  • 44. Where Do I Get the Terms?  Your documents and databases  Departmental terminology  Text books and their indexes  Book tables of contents and indexes  Journal quarterly indexes  Encyclopedias  Lexicons, glossaries on the topic  Web resources  Users and experts  Search logs Copyright © 2013 Access Innovations, Inc.
  • 45. How Do You Choose Terms?  Importance in the subject area  Use in the literature, by the organization or community  Necessary degree of specificity or detail  Relationship with other controlled vocabularies  Single concept = single term Copyright © 2013 Access Innovations, Inc.
  • 46. Build, Buy, Augment?  Survey existing thesaurus/taxonomy resources for your domain  Test for • Scope • Depth • Make-or-break terms • Cost  Adoption of existing taxonomies  Term registries  Taxobank  Taxonomy Warehouse  Other resources Don’t reinvent the wheel! Copyright © 2013 Access Innovations, Inc.
  • 47. Gather Terms From Search Logs  Top ~100 search terms from search logs  Terms used more than 50 times  Match to web site with appropriate answer  Basis for favorites or best bets, presented at the top of results list  Behavior-based taxonomy Copyright © 2013 Access Innovations, Inc.
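The search-log step can be sketched in a few lines of Python. This is a minimal sketch, assuming the log has already been reduced to a plain list of query strings; the function name `candidate_terms` and the sample log are hypothetical, and the thresholds simply mirror the "top ~100" and "more than 50 times" figures on the slide.

```python
from collections import Counter

def candidate_terms(queries, min_uses=50, top_n=100):
    """Return (term, count) pairs for the most frequent search queries."""
    # Normalize lightly so "Thesaurus " and "thesaurus" count as one query.
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    # Keep the top `top_n` queries, then only those used more than `min_uses` times.
    return [(term, n) for term, n in counts.most_common(top_n) if n > min_uses]

# Hypothetical log: each entry is one query typed by a user.
log = ["taxonomy"] * 60 + ["Thesaurus "] * 51 + ["ontology"] * 12
print(candidate_terms(log))  # [('taxonomy', 60), ('thesaurus', 51)]
```

The surviving queries are candidates for best bets and for new taxonomy terms; a person still has to decide which of them become preferred terms.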
  • 48. Vocabulary Control – How?  Use unambiguous terms, clear to the user group  Distinguish between terms that appear similar  Use Scope Notes when necessary  Use terms as elements that can be coordinated in a flexible manner  Create compound terms, if necessary Copyright © 2013 Access Innovations, Inc.
  • 49. Term Format  KISS – Keep it short and simple • 1-2-3 words • Effect on search • Pre and Post Coordination  Establish a policy • follow Chicago Manual of Style  Grammatical issues • Nouns and noun phrases • Verbs: use gerunds • Adjectives - no • Adverbs - no • Initial articles – no
  • 50. Thesaurus - Format  Main Entries  Top Terms - TT  Broader Terms - BT  Narrower Terms - NT  Related Terms - RT  Scope Notes - SN  History - HI  Date term added/changed - DA Copyright © 2013 Access Innovations, Inc.
  • 51. Thesaurus - Format  Related terms - RT  See - S  See also - SA  Use - U  Preferred Term PT  Use for - UF  Non Preferred Term NP  .. Copyright © 2013 Access Innovations, Inc.
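The fields listed on the two format slides map naturally onto a record structure. The following is a minimal sketch, assuming a simple in-memory representation; the class `TermRecord`, its attribute names, and the broader term "Motor vehicles" are illustrative assumptions and are not taken from any particular thesaurus tool.

```python
from dataclasses import dataclass, field

@dataclass
class TermRecord:
    term: str                                       # main entry (preferred term)
    broader: list = field(default_factory=list)     # BT
    narrower: list = field(default_factory=list)    # NT
    related: list = field(default_factory=list)     # RT
    used_for: list = field(default_factory=list)    # UF (non-preferred terms)
    scope_note: str = ""                            # SN
    history: str = ""                               # HI
    date_added: str = ""                            # DA

    def display(self):
        """Render the record in the tagged layout common in thesaurus displays."""
        lines = [self.term]
        for tag, values in (("BT", self.broader), ("NT", self.narrower),
                            ("RT", self.related), ("UF", self.used_for)):
            lines += [f"  {tag} {value}" for value in values]
        if self.scope_note:
            lines.append(f"  SN {self.scope_note}")
        return "\n".join(lines)

record = TermRecord("Automobiles",
                    broader=["Motor vehicles"],      # hypothetical BT
                    related=["Automotive safety"],
                    used_for=["Cars"])
print(record.display())
```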
  • 52. Definitions  Index term  the representation of a concept  Preferred term (International)  a term used consistently to index a concept  descriptor (USE)  what the “USED FOR” reference points to Copyright © 2013 Access Innovations, Inc.
  • 53. Definitions  Non preferred term (International)  synonym or quasi synonym of a preferred term  non-descriptor (USE)  the “USE” reference  the “SEE” reference  Related term  the “SEE ALSO” Copyright © 2013 Access Innovations, Inc.
  • 54. Indexing Terms  Three main categories  concrete entities  abstract concepts  proper nouns Copyright © 2013 Access Innovations, Inc.
  • 55. One Term / One Concept  Importance in the subject area  Use in the literature, by the organization or community  Necessary degree of specificity or detail  Relationship with other controlled vocabularies Copyright © 2013 Access Innovations, Inc.
  • 56. One Term / One Concept  Terms represent simple or unitary concept  A unit of thought  Can be a single-word term  Can be a multiword term, if required to represent the concept  Three main categories – Concrete entities – Abstract concepts – Proper nouns “A unit of thought, formed by mentally combining some or all of the characteristics of a concrete or abstract, real or imaginary object. Concepts exist in the mind as abstract entities independent of terms used to express them.” Copyright © 2013 Access Innovations, Inc.
  • 57. Concrete Entities  Things and their physical parts  primates  head  buildings  floors  islands Copyright © 2013 Access Innovations, Inc.
  • 58. Concrete Entities as Terms • Things and their physical parts – Birds • Feathers • Buildings • Floors • Materials – Cement – Wood – Lead – Cards and Chips Copyright © 2013 Access Innovations, Inc.
  • 59. Concrete Entities  Materials  cement  wood  lead  cars  refrigerators Copyright © 2013 Access Innovations, Inc.
  • 60. Abstract Concepts  Actions and events  evolution  respiration  skating  management  wars  ceremonies Copyright © 2013 Access Innovations, Inc.
  • 61. Abstract Concepts  Abstract entities, properties of things, materials and actions  law  theory  strength  efficiency  lead (management) Copyright © 2013 Access Innovations, Inc.
  • 62. Abstract Concepts  Disciplines and sciences  physics  meteorology  mathematics  psychology Copyright © 2013 Access Innovations, Inc.
  • 63. Abstract Concepts  Units of measurement  kilograms  pounds  meters  miles Copyright © 2013 Access Innovations, Inc.
  • 64. Abstract Concepts as Terms • Actions and events – evolution, skating, management, ceremonies • Abstract entities – law, theory • Properties of things, materials, and actions – strength, efficiency • Disciplines and sciences – physics, meteorology, mathematics • Units of measurement – pounds, kilograms, miles, meters, nanoseconds Copyright © 2013 Access Innovations, Inc.
  • 65. Proper Nouns*  Individual entities, or “classes of one”, expressed as proper nouns  San Francisco  United States of America  Lake Michigan * Proper names – of persons – are not included Copyright © 2013 Access Innovations, Inc.
  • 66. Proper Nouns as Terms  Individual entities – “classes of one” – expressed as proper nouns  San Francisco, Lake Michigan  Thesaurus standards exclude proper names, persons, and trade names; these go in authority files. Taxonomies include them as final nodes.
  • 67. Most Terms Are Nouns  Nouns or simple noun phrases  Adj + Noun – Art history (ANSI/NISO standard)  Noun + Prep + Noun – History of art (ISO standard)  Exceptions – Burden of proof, Coats of arms, Prisoners of war, Birds of prey, etc. Copyright © 2013 Access Innovations, Inc.
  • 68. About “and”  Avoid “and” in terms – not a single concept Instead of: Children and television Factor and postcoordinate USE Media influence + Television + Children “And” is not in the standard In real life—need for granularity may dictate your choice Copyright © 2013 Access Innovations, Inc.
  • 69. Compound Terms – Nope!  “Terms in a thesaurus should represent simple or unitary concepts…” (ISO standard)  “Compound terms should be factored (split) into simple elements…” (ANSI/NISO standard)  Term phrases are okay (bigrams)  Adjective Noun  American history  Two concepts combined are not  Aromatherapy for bloating Copyright © 2013 Access Innovations, Inc.
  • 70. Organize Terms – Roughly  Sort terms into several major categories – logical groups of similar concepts as Top Terms  Identify core areas and peripheral topics  10 – 20 to start  Consider moving proper names to authority files  Result: loose collection of terms under several main headings  Rough and tentative – see how it fits as you go  Initial gap analysis  Add / modify / delete as needed Copyright © 2013 Access Innovations, Inc.
  • 72. How Do Terms Relate?  Hierarchical relationships -- Parents and their children  Equivalence relationships -- Aliases  Associative relationships -- Cousins TAXONOMY THESAURUS Copyright © 2013 Access Innovations, Inc.
  • 73. Hierarchical Relationships  Broader Term (BT) represents the class, whole, or genus  Narrower Term (NT) is a member, part, or species  Generic relationship  Whole-part relationship  Instance relationship  NTs inherit all the BT characteristics  BTs/NTs have a reciprocal relationship
  • 74. Hierarchical Relationships  Class as a whole  superordination  broader term (BT)  sometimes top term (TT)  Members or parts of the class  subordination  narrower term (NT)  Reciprocal Copyright © 2013 Access Innovations, Inc.
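The reciprocity of BT and NT can be enforced by recording both directions in one step. This is a minimal sketch, assuming an in-memory store with no cycles; the class `Hierarchy` and its method names are hypothetical, and the sample terms are whole-part pairs used elsewhere in this deck.

```python
from collections import defaultdict

class Hierarchy:
    def __init__(self):
        self.broader = defaultdict(set)    # term -> its broader terms (BT)
        self.narrower = defaultdict(set)   # term -> its narrower terms (NT)

    def add_bt(self, term, bt):
        """Record 'term BT bt' and the reciprocal 'bt NT term' together."""
        self.broader[term].add(bt)
        self.narrower[bt].add(term)

    def top_terms(self, term):
        """Walk upward until a term with no BT is reached (the top term, TT)."""
        parents = self.broader.get(term)
        if not parents:
            return {term}
        tops = set()
        for parent in parents:
            tops |= self.top_terms(parent)
        return tops

h = Hierarchy()
h.add_bt("Middle ear", "Ear")
h.add_bt("Albuquerque", "Bernalillo County")
print(h.narrower["Ear"])          # {'Middle ear'}
print(h.top_terms("Middle ear"))  # {'Ear'}
```

Writing both directions in `add_bt` means the two views can never drift apart, which is the property the slides call reciprocal.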
  • 75. Hierarchical Relationships  BT/NT based on being part of same class  Same fundamental category  entities  activities  agents  properties Copyright © 2013 Access Innovations, Inc.
  • 76. Hierarchical Relationships  Museums  NT Archaeological museum (type of entity)  NT Ethnological museum (type of entity)  RT Curators (agents)  RT Museum techniques (action)  NT Scientific museum (type of entity)
  • 77. Hierarchy – Whole-Part Relationships  Four general types 1. Body systems and organs  Ear  Middle ear 2. Geographical locations  Bernalillo County  Albuquerque 3. Fields of study  Geology  Physical geology 4. Hierarchical social structures  Ontario  Manitoulin District Copyright © 2013 Access Innovations, Inc.
  • 78. Hierarchy – Instance Relationships  General category (common noun) as BT, with individual example (proper noun) as Narrower Term Instance (NTI)  Seas: Baltic Sea, Caspian Sea, Mediterranean Sea  French cathedrals: Chartres Cathedral, Rheims Cathedral, Rouen Cathedral  Essentially identical to “final node” in taxonomies
  • 79. Hierarchical Types of Display  Systematic  Alphabetic  other, but less common views Copyright © 2013 Access Innovations, Inc.
  • 80. DTIC  Hierarchy
  • 81. Polyhierarchical Relationship • Term can logically fit under more than one Broader Term – can have Multiple Broader Terms (MBT) • Part of ISO standards, new to ANSI/NISO • Examples: Nurse administrators under both Nurses and Health administrators; Accounting under both Finance and Careers
  • 82. Polyhierarchical Relationships  Great for the web click environment  Terms occur in multiple categories  Can be generic as well as hierarchical  Engineering: NT Nanotechnology  Physics: NT Nanotechnology  Nanotechnology: BT Engineering, BT Physics
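A polyhierarchy only requires that a term be allowed more than one broader term. The sketch below is a minimal illustration, assuming the BT links are held in a plain dictionary; the helper `paths_to_top` is hypothetical and simply lists every click path by which a user could reach the term.

```python
# Each term maps to a list of broader terms; more than one entry means polyhierarchy.
broader = {
    "Nanotechnology": ["Engineering", "Physics"],
}

def paths_to_top(term, broader):
    """Return every path from a top term down to `term`."""
    parents = broader.get(term, [])
    if not parents:
        return [[term]]
    return [path + [term]
            for parent in parents
            for path in paths_to_top(parent, broader)]

for path in paths_to_top("Nanotechnology", broader):
    print(" > ".join(path))
# Engineering > Nanotechnology
# Physics > Nanotechnology
```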
  • 83. DTIC  Alpha
  • 84. Generic Relationship Tests  Terms: Pests, Squirrels, Rodents  ALL squirrels are rodents (passes)  NOT ALL squirrels are pests (fails)  NOT ALL pests are rodents (fails)
  • 85. Generic Relationship Tests • Both terms in same fundamental category • “All-and-some” test: SOME rodents are squirrels and ALL squirrels are rodents, so Squirrels is a valid NT of Rodents; SOME pests are squirrels but NOT ALL squirrels are pests, so Squirrels is not a valid NT of Pests • Consider concepts of marketing and advertising
  • 86. Generic Relationships  “Identifies the link between a class or category and its members or species.”  Easy in biology  Rodents  NT Squirrels  All and some rule Copyright © 2013 Access Innovations, Inc.
  • 87. All and Some Rule  Rodents  NT Squirrels  RT Pests  Q. Is this an example of polyhierarchy?  Q. Do you need to make RT relationships for “Pests” to all of the NTs under “Rodents”? Copyright © 2013 Access Innovations, Inc.
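The all-and-some rule can be expressed as a set comparison. This is a minimal sketch, assuming each concept can be represented by a sample of its members; the function name `all_and_some` and the member lists are invented for illustration, since in practice the test is applied by judgment rather than by enumeration.

```python
def all_and_some(narrower_members, broader_members):
    """True when ALL narrower members fall in the broader class and SOME broader members are narrower ones."""
    narrower, broader = set(narrower_members), set(broader_members)
    all_ok = narrower <= broader           # ALL squirrels are rodents
    some_ok = bool(narrower & broader)     # SOME rodents are squirrels
    return all_ok and some_ok

# Hypothetical sample members for each concept.
rodents = {"gray squirrel", "red squirrel", "house mouse", "brown rat"}
squirrels = {"gray squirrel", "red squirrel"}
pests = {"house mouse", "brown rat", "gray squirrel", "cockroach"}

print(all_and_some(squirrels, rodents))  # True:  Squirrels BT Rodents is valid
print(all_and_some(squirrels, pests))    # False: not all squirrels are pests
print(all_and_some(pests, rodents))      # False: not all pests are rodents
```

When the test fails in both directions but the concepts are still connected, as with Rodents and Pests, an RT link is the usual choice.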
  • 88. Instance Relationships  Seas ISO  NT Baltic Sea  NT Caspian Sea  NT Mediterranean Sea  French Cathedrals NISO / ANSI  NTI Chartres Cathedral  NTI Rheims Cathedral  NTI Rouen Cathedral  RT Gothic cathedrals Copyright © 2013 Access Innovations, Inc.
  • 89. Instance Relationships  French Cathedrals NISO / ANSI  NTI Chartres Cathedral  NTI Rheims Cathedral  NTI Rouen Cathedral  RT Gothic cathedrals  French Gothic Cathedral  NTI Chartres Cathedral  NTI Rheims Cathedral  NTI Rouen Cathedral  BT Gothic cathedrals  Q. Why/how do these differ? Copyright © 2013 Access Innovations, Inc.
  • 90. CABI Pages
  • 91. Instance Relationships  “…a general category of things and events expressed by a common noun, and an individual instance of that category, the instance then forming a class of one which is represented by a proper name.”  A way of adding the proper names and items from the Authority files to the thesaurus Copyright © 2013 Access Innovations, Inc.
  • 92. Questions before moving on to Associative Relationships?
  • 93. Associative Relationships  Related Terms (RTs) – cousins  “…terms related conceptually, but not hierarchically, and are not part of an equivalence set” (i.e. not synonyms)  Both terms are valid thesaurus terms for indexing and have reciprocal relationship  Expands user’s awareness and reflects thesaurus coverage of unanticipated areas  Standards describe specific types Copyright © 2013 Access Innovations, Inc.
  • 94. Associative Relationships  Related terms: Physicians and Medicine  (“Reciprocal posting” done automatically is highly desirable.)
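Automatic reciprocal posting amounts to adding the reverse link whenever an RT is entered. The following is a minimal sketch, assuming RT links are stored as a dictionary of sets; `post_reciprocals` is a hypothetical helper, not a function of any named product, and the term pairs are examples drawn from this deck.

```python
def post_reciprocals(related):
    """Return a copy of `related` in which every RT link points both ways."""
    fixed = {term: set(targets) for term, targets in related.items()}
    for term, targets in related.items():
        for target in targets:
            # Add the reverse link, creating the target's entry if needed.
            fixed.setdefault(target, set()).add(term)
    return fixed

# RT links as an editor might enter them, in one direction only.
one_sided = {"Physicians": {"Medicine"}, "Hides": {"Leather"}}
two_sided = post_reciprocals(one_sided)
print(sorted(two_sided["Medicine"]))  # ['Physicians']
print(sorted(two_sided["Leather"]))   # ['Hides']
```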
  • 95. Associative Relationships  Sibling relationships  Examples:  Brother : Sister  Desk : Chair  Easier to create within well defined facets (e.g. AAT)  Usual step in building process  Can be identified automatically Copyright © 2013 Access Innovations, Inc.
  • 96. Associative Relationships  RT relationships  Braking systems  RT Trains  RT Bicycle  RT Motor vehicle  Office furniture  RT Office buildings  RT Ergonomics Copyright © 2013 Access Innovations, Inc.
  • 97. Associative Relationships  Field of study and objects studied  Seismology  RT Earthquakes  Meteorology  RT Weather patterns Copyright © 2013 Access Innovations, Inc.
  • 98. Associative Relationships  Operation or process and the agent or instrument  Hairdressing  RT Hair dryers  Word processing  RT Typing skills Copyright © 2013 Access Innovations, Inc.
  • 99. Associative Relationships  Occupation and person in occupation  Social work  RT Social workers  Information science  RT Special librarians Copyright © 2013 Access Innovations, Inc.
  • 100. Associative Relationships  Action and the product of the action  Publishing  RT Music scores  Landscaping  RT Lawn mowers  RT Irrigation systems Copyright © 2013 Access Innovations, Inc.
  • 101. Associative Relationships  Action and its patient  Teaching  RT Students  Conducting  RT Musicians Copyright © 2013 Access Innovations, Inc.
  • 102. Associative Relationships  Concepts related to their properties  Women  RT Femininity  Automobiles  RT Automotive safety Copyright © 2013 Access Innovations, Inc.
  • 103. Associative Relationships  Concepts related to their origins  Water  RT Water wells  Carpet  RT Thread Copyright © 2013 Access Innovations, Inc.
  • 104. Associative Relationships  Concepts linked by causal dependence  Injuries  RT Accidents  Cultural stress  RT Culture shock Copyright © 2013 Access Innovations, Inc.
  • 105. Associative Relationships  Action and counter action  Pests  RT Pesticides  Log on  RT Log off Copyright © 2013 Access Innovations, Inc.
  • 106. Associative Relationships  Raw material and its product  Hides  RT Leather  Clothing  RT Fabric Copyright © 2013 Access Innovations, Inc.
  • 107. Associative Relationships  Action and associated property  Precision instrument  RT Accuracy  Production processes  RT Quality control Copyright © 2013 Access Innovations, Inc.
  • 108. Associative Relationships  Concept and its opposite  Single People  RT Married people  Height  RT Depth  RT Weight  If not hierarchical, probably associative Copyright © 2013 Access Innovations, Inc.
  • 109. Questions before moving on to Equivalence Relationships?
  • 110. Equivalence Relationships  Refer to the same concept  (Use for)  Prefix for non-preferred terms  (Use)  Prefix for preferred terms  Automobiles  used for Cars  Cars  use Automobiles Copyright © 2013 Access Innovations, Inc.
  • 112. Equivalence Relationships  Synonyms  popular and scientific  spiders - arachnida  scientific and trade names  Motrin (TM) - ibuprofen  standard names and slang  hi fi - high fidelity  different linguistic origin  home care - domiciliary care Copyright © 2013 Access Innovations, Inc.
  • 113. Equivalence Relationships  Synonyms cont’d  different cultures  aerials - antenna  trunk - boot  hire - rent  emerging concepts  telecommuting - distance working  outdated  refrigerators - iceboxes Copyright © 2013 Access Innovations, Inc.
  • 114. A “Term” Synonym Ring Term Node Subject heading Category Descriptor Copyright © 2013 Access Innovations, Inc.
  • 115. Equivalence Relationships  Lexical variants  variant spellings  Muslim - Moslem  center - centre  direct and indirect forms  electric power plants  power plants, electric  abbreviations  ECG - electrocardiograph Copyright © 2013 Access Innovations, Inc.
  • 116. Equivalence Relationships  Quasi synonyms  urban areas - cities  gifted people - geniuses  Antonyms  height - depth  literacy - illiteracy Copyright © 2013 Access Innovations, Inc.
  • 117. Equivalence Relationships  Up posting (generic posting)  useful for web interfaces  NT equivalent to their BT  not sub species of BT Copyright © 2013 Access Innovations, Inc.
  • 118. Equivalence Relationships PsycINFO Rotated Copyright © 2013 Access Innovations, Inc.
  • 119. Equivalence Relationships  Factored terms  express terms in their combinations  Milk hygiene  use milk and hygiene Copyright © 2013 Access Innovations, Inc.
  • 120. Equivalence Relationship • Preferred Term – Thesaurus term and valid for indexing – Thesaurus notation: USE • Non-Preferred Term – Not valid for indexing – An alias or imposter – Entry point, directs user to Preferred Term – Thesaurus notation: UF or NPT Examples: Spiders UF Arachnids; Plant pathology USE Phytopathology Copyright © 2013 Access Innovations, Inc.
  • 121. Equivalence – When to Use  Synonyms, slang, quasi-synonyms  Scientific and trade names  Ibuprofen UF Motrin™  Lexical variants  Fiber optics UF Fibre optics  Mouse UF Mice  Upward posting of narrow concepts not specified in taxonomy or thesaurus  Social class UF Elite, Middle class, Working class Get equivalent terms from search logs, brainstorming… Copyright © 2013 Access Innovations, Inc.
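The USE / UF pairing can be illustrated with a small lookup. This is a sketch under simple assumptions (case-insensitive exact matching, one preferred term per entry term); the names `USED_FOR`, `USE`, and `resolve` are hypothetical, and the sample terms come from the slides.

```python
# Preferred term -> non-preferred entry terms (UF), taken from the slides.
USED_FOR = {
    "Automobiles": ["Cars"],
    "Ibuprofen": ["Motrin"],
    "Fiber optics": ["Fibre optics"],
    "Social class": ["Elite", "Middle class", "Working class"],
}

# Invert the UF lists into a USE lookup keyed by lower-cased entry term.
USE = {}
for preferred, variants in USED_FOR.items():
    USE[preferred.lower()] = preferred
    for variant in variants:
        USE[variant.lower()] = preferred


def resolve(entry_term):
    """Return the preferred term for an entry term, or None if unknown."""
    return USE.get(entry_term.strip().lower())


print(resolve("cars"))           # Automobiles
print(resolve("Working class"))  # Social class
```

Note that the last entry is upward posting: the narrower concepts are not terms in their own right, so they lead the user to the broader term.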
  • 122. Scope Notes (SN)  Indicate meaning of the term in the context of this thesaurus, for this audience  Stress – Metal, Psychological, Physiological  Indicate any restriction in meaning  Indicate range of topics covered  Provide direction for indexers; for terms often confused, may suggest an alternative term  Use only as needed – not for every term  Establish and stick with consistent format  Be concise Copyright © 2013 Access Innovations, Inc.
  • 123. Scope Notes (SN)  Restrictions on meaning  Range of topics covered  Instructions to indexers  Term histories  Reciprocal scope notes Copyright © 2013 Access Innovations, Inc.
  • 124. Questions before moving on to more thesaurus examples?
  • 125. Thesaurus - Examples  Roget's 1852  synonyms  COSATI - 1964  concept linking  NASA  AEC - ERDA - DOE - ESA  National Library of Medicine  outline of a field  Medical Subject Headings - MeSH Copyright © 2013 Access Innovations, Inc.
  • 126. NASA  Alphabetic Copyright © 2013 Access Innovations, Inc.
  • 127. NASA  Hierarchical Copyright © 2013 Access Innovations, Inc.
  • 128. Thesaurus - Examples  INSPEC - multifaceted  Thesaurus  Classification system  Free text terms  Variant spellings  NICEM  27 Top Terms Copyright © 2013 Access Innovations, Inc.
  • 129. INSPEC Copyright © 2013 Access Innovations, Inc.
  • 130. INSPEC  Hierarchy Copyright © 2013 Access Innovations, Inc.
  • 131. Merged Vocabularies  Yahoo!  Subject headings  Authority files  In a single list Copyright © 2013 Access Innovations, Inc.
  • 132. Copyright © 2013 Access Innovations, Inc.
  • 133. Yahoo!  Hierarchy Copyright © 2013 Access Innovations, Inc.
  • 134. Merged Vocabularies - continued  Office.com  Multiple broader terms  Concept mapping Copyright © 2013 Access Innovations, Inc.
  • 135. Copyright © 2013 Access Innovations, Inc.
  • 136. Eurovoc Thesaurus Pages Copyright © 2013 Access Innovations, Inc.
  • 137. Eurovoc Thesaurus Hierarchy Copyright © 2013 Access Innovations, Inc.
  • 138. Eurovoc Terms Copyright © 2013 Access Innovations, Inc.
  • 139. So far you’ve got…  Hierarchy – Broader and Narrower Terms • Polyhierarchies when needed – Preferred/Non-Preferred Terms – Equivalence relationships – Related Terms – Associative relationships – Scope Notes – Complete term records – Correct term format Copyright © 2013 Access Innovations, Inc.
  • 140. So far you’ve got…  Hierarchical relationships -- Parents and their children  Equivalence relationships -- Aliases  Associative relationships -- Cousins -- See Also’s TAXONOMY THESAURUS Copyright © 2013 Access Innovations, Inc.
  • 141. So far you’ve got… • Term format • Grammatical issues • Singular and plural forms • Spelling • Abbreviations and acronyms • Capitalization • Other punctuation • Consistency Copyright © 2013 Access Innovations, Inc.
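The pieces recapped above (hierarchy, equivalence, associative links, scope notes) fit together in one term record. The following is a hypothetical sketch of such a record; the class `TermRecord`, its field names, and the broader term "Motor vehicles" are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    term: str
    broader: list = field(default_factory=list)   # BT
    narrower: list = field(default_factory=list)  # NT
    related: list = field(default_factory=list)   # RT
    used_for: list = field(default_factory=list)  # UF
    scope_note: str = ""                          # SN

    def display(self):
        """Render the record in a conventional tagged layout."""
        lines = [self.term]
        if self.scope_note:
            lines.append(f"  SN {self.scope_note}")
        for tag, values in (("UF", self.used_for), ("BT", self.broader),
                            ("NT", self.narrower), ("RT", self.related)):
            lines.extend(f"  {tag} {value}" for value in values)
        return "\n".join(lines)


record = TermRecord(
    term="Automobiles",
    broader=["Motor vehicles"],      # hypothetical BT for illustration
    related=["Automotive safety"],
    used_for=["Cars"],
)
print(record.display())
```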
  • 142. Pre and Post Coordination
  • 143. Pre and Post Coordinate Terms  Pre coordinates – two concepts  Subject headings – Library of Congress  American history – Civil War  Back of the book  Put together in advance by the publisher  Post Coordinate  Taxonomy terms  Single concept  Put together by the user / searcher Copyright © 2013 Access Innovations, Inc.
  • 144. Pre-coordination  Card catalogs - printed indexes  Links and roles defined  Controlled vocabularies  High input costs  Precise recall - easier searching Copyright © 2013 Access Innovations, Inc.
  • 145. Post-coordination  Starting with punch cards  Machine readable  Frequently natural language  Currency and specificity  Exhaustive coverage - loss of precision  Low input costs  False drops Copyright © 2013 Access Innovations, Inc.
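Post-coordination amounts to combining single-concept terms at search time. The sketch below uses made-up postings (term to document ids) and a hypothetical `post_coordinate` function to show both the mechanism and how a false drop can arise.

```python
# Hypothetical postings: single-concept term -> set of document ids.
postings = {
    "History": {1, 2, 3},
    "Civil war": {2, 3, 4},
    "United States": {2, 5},
    "Spain": {3},
}


def post_coordinate(*terms):
    """AND together the postings of each term; unknown terms match nothing."""
    sets = [postings.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


hits = post_coordinate("History", "Civil war")
print(sorted(hits))  # [2, 3]
```

A searcher who wanted American Civil War history also retrieves document 3, which in this sample data is about Spain: a false drop. Adding "United States" to the query narrows the result to document 2, at the cost of the searcher doing the coordination.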
  • 146. Subject Matter Experts (SME)  Work first from the literature  Establish literary warrant for terms  Have someone else do the clerical work  Differentiate the lexicography work from the subject matter expert work  Let SMEs do the review and tailoring  Expert review ensures the proper term use and application  Advisory Board…advisable! Copyright © 2013 Access Innovations, Inc.
  • 147. Again, why do we index?  Improve precision  define scope of terms  Improve recall  different terms for same concept  Guide to a field of expertise  Learning tool  Richer expression Copyright © 2013 Access Innovations, Inc.
  • 148. Uses?  Indexing  …process by which subject terms or classification symbols are assigned to concepts in documents  A thesaurus is also known as an indexing language  M.A.I.™ is an automated indexing system Copyright © 2013 Access Innovations, Inc.
  • 149. What are We Controlling?
  • 150. What are We Controlling?  Synonyms  different terms same concept  Polysemes or Homonyms  same word different meanings  lead or mercury Copyright © 2013 Access Innovations, Inc.
  • 151. How?  Meaning  delineation of scope of a term  Term equivalence  linking of synonyms  Disambiguation of homonyms  lead (metal)  lead (element)  lead (management) Copyright © 2013 Access Innovations, Inc.
  • 152. Disambiguation Bridge Structure Bridge Dentistry Bridge Game Bridge Concept Copyright © 2013 Access Innovations, Inc.
  • 153. Disambiguation  Restriction and clarification of meaning  Cells  biological microsystems  electrical equipment  prison housing  Reading  town in England  communication process Copyright © 2013 Access Innovations, Inc.
  • 154. Disambiguation Bill Invoice Bill Legislative Bill Sport Bill Person Copyright © 2013 Access Innovations, Inc.
  • 155. Disambiguation: Pre-Coordinate vs. Post-Coordinate Forms  Cells (biology)  Cells (electric)  Cells (prison)  Reading (place)  Reading (process)  Biological cells  Electric cells Copyright © 2013 Access Innovations, Inc.
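Parenthetical qualifiers are easy to process mechanically. As a rough sketch (the function names and the regular expression are assumptions, not part of any standard), the following splits a qualified term and groups homographs by their base word.

```python
import re

# Base word, optional whitespace, then one parenthetical qualifier at the end.
QUALIFIED = re.compile(r"^(?P<word>.+?)\s*\((?P<qualifier>[^()]+)\)$")


def split_qualifier(term):
    """Return (base word, qualifier); qualifier is None when absent."""
    match = QUALIFIED.match(term.strip())
    if match:
        return match.group("word"), match.group("qualifier")
    return term.strip(), None


def group_homographs(terms):
    """Map each base word to the list of qualifiers seen for it."""
    groups = {}
    for term in terms:
        word, qualifier = split_qualifier(term)
        groups.setdefault(word, []).append(qualifier)
    return groups


print(group_homographs(["Cells (biology)", "Cells (prison)", "Reading (place)"]))
```

Forms such as "Biological cells" carry no qualifier, so they pass through unchanged and do not group with "Cells".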
  • 156. Precision Options  Language specificity  Coordination  Compound terms - level of precoordination  Homographs and scope notes  Word distance indication Copyright © 2013 Access Innovations, Inc.
  • 157. Precision Options  Structural relationships  Links and roles  Treatment and aspect codes  Weighting Copyright © 2013 Access Innovations, Inc.
  • 158. Maintenance of a Controlled Vocabulary  Allow for new jargon to be added  Any living field will have new terms  Identifier field  Candidate terms  Consider multiple broader terms Copyright © 2013 Access Innovations, Inc.
  • 159. Review, edit, test, edit, use, edit, and maintain, i.e., edit  Review  Users  Expert reviewers  Test  Index 500+ documents (more for variable writing style; fewer for strict style)  Monitor search log  Edit and maintain  Add term  Change existing term  Change term status  Delete term  Add term relationship  Delete term relationship  Add/modify Scope Note  Change overall structure Consider automated / assisted indexing software Copyright © 2013 Access Innovations, Inc.
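Monitoring the search log is one practical source of candidate terms. The sketch below counts queries that match nothing in the vocabulary; the function name, the `min_count` threshold, and the sample log are all assumptions made for illustration.

```python
from collections import Counter


def candidate_terms(search_log, vocabulary, min_count=2):
    """Return (query, count) pairs for repeated queries not in the vocabulary."""
    known = {term.lower() for term in vocabulary}
    misses = Counter(
        query.strip().lower()
        for query in search_log
        if query.strip().lower() not in known
    )
    return [(query, n) for query, n in misses.most_common() if n >= min_count]


log = ["Cars", "drones", "Drones", "cars", "e-bikes", "drones"]
print(candidate_terms(log, vocabulary=["Cars", "Automobiles"]))
```

The output is only a list of candidates; an editor still decides whether each one becomes a new term, a non-preferred entry term, or nothing at all.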
  • 160. When Do You Add More Terms?  On demand  When usage changes  Stewardess – flight attendant  As the field evolves  8 changes to 64 colors  In Use  Don’t freeze waiting for perfection Copyright © 2013 Access Innovations, Inc.
  • 161. Vocabulary Control - Options  Classification systems  Authority files  Controlled term lists  Uncontrolled term lists  Thesauri Copyright © 2013 Access Innovations, Inc.
  • 162. Classification Systems - Defined  Are used to put an object in a specific place. In the traditional classification system each item has a single spot to go.  Follow an outline of knowledge  Used to shelve books in a library Copyright © 2013 Access Innovations, Inc.
  • 163. Catalog Systems - Defined  Used to catalog the object to identify its contents  Based on perception  Multiple terms are used to identify a single object  Not natural language  Pre-coordinated - subheadings Copyright © 2013 Access Innovations, Inc.
  • 164. Classification Systems - Examples  Classification of actual collections  New York State Library - Dewey  810.01  Cutter - Universities 1800 - 1960’s  Z34  Lan  Thomas Jefferson - Library of Congress  z34.18  la  Government Documents Numbers  based on government structure Copyright © 2013 Access Innovations, Inc.
  • 165. Catalog Systems - Examples  Library of Congress Subject Headings  Sears Subject Headings  (used with Dewey) Copyright © 2013 Access Innovations, Inc.
  • 166. King of Catalogers  Charles Ammi Cutter  rules for alphabetical subject indexing  most specific heading  put two topics under two headings  use English if possible  x ref antonyms  careful with homographs  1895 ALA Subject Headings following Cutter Copyright © 2013 Access Innovations, Inc.
  • 167. Politics in Libraries  In 1905 Dewey was president of ALA (American Library Association)  LC adopted DDC  Threw out Cutter  The two never spoke again. Copyright © 2013 Access Innovations, Inc.
  • 168. Types of Headings  Single word  Botany or Ethics  Adjective noun  Capital punishment  Noun - noun  Death penalty  American Standard  Noun preposition noun  Penalty of death  International Standard  Noun conjunction noun  Nurses and nursing Copyright © 2013 Access Innovations, Inc.
  • 169. Cutter Guidelines  File under the phrase “as it reads”  Use the most significant words  Reduce adjective nouns to noun phrases  Use singular rather than plural  File compound words under the first word  No subheadings Copyright © 2013 Access Innovations, Inc.
  • 170. Cross References  Cross reference synonyms  main heading should be what the class uses  use the common term  use the unambiguous heading  prefer the one which brings relations  “…with a well defined network of cross references the mob becomes an army.. “  C.A. Cutter Copyright © 2013 Access Innovations, Inc.
  • 171. Library of Congress (LC) Subject Headings  1911 - List of Subject Headings  extensive use of sub-headings  invert phrases for main subject  file under the noun not the adjective  see references not cross filing  place holder terms  homographs defined parenthetically Copyright © 2013 Access Innovations, Inc.
  • 172. Classification vs. Subject Headings  Classification  single spot or placement  browse physical list  often a numbering system  clear hierarchy  no or few cross references  Like Yahoo! Copyright © 2013 Access Innovations, Inc.
  • 173. Classification vs. Subject Headings  Subject headings  generic search  hidden classification system  related terms and cross references in heavy use  usually the inverted form  cells, electric Copyright © 2013 Access Innovations, Inc.
  • 174. Vocabulary Control - Options  Classification systems  Authority files  Controlled term lists  Uncontrolled term lists  Thesauri Copyright © 2013 Access Innovations, Inc.
  • 175. Authority Systems - Defined  Frequently have cross references  Widely available  Frequently coded lists  Brand names  .. Lists of terms in the preferred format for use. Copyright © 2013 Access Innovations, Inc.
  • 176. Authority Files - Defined  People  Places  Things  ……..NOT  Concepts  Methods  Processes Copyright © 2013 Access Innovations, Inc.
  • 177. Authority Files - Examples  ISO Country Name and Code  International Organization for Standardization  ISO Language list  NAICS (SIC)  Standard Industrial Classification Code (SIC) Replaced by  North American Industry Classification System (NAICS) Copyright © 2013 Access Innovations, Inc.
  • 178. Authority Lists - Format  Belgian Congo  use Congo  Bill Gates  use William F. Gates, III (computer scientist)  see also  William Gates (basketball player) Copyright © 2013 Access Innovations, Inc.
  • 179. Authority Lists - Need Style Sheets  Names  AACR2  Anglo-American Cataloguing Rules  AAP  Association of American Publishers  Chicago Manual of Style  Dun & Bradstreet  Style Sheet Copyright © 2013 Access Innovations, Inc.
  • 180. Vocabulary Control - Options  Classification systems  Authority files  Controlled term lists  Uncontrolled term lists  Thesauri Copyright © 2013 Access Innovations, Inc.
  • 181. Controlled Term Lists - Defined  State the preferred terms  Provide allowed term entry  Heavily cross referenced  Not generally hierarchical  Popular  Easy to create Copyright © 2013 Access Innovations, Inc.
  • 182. Controlled Term Lists - Examples  ABI/Inform  Predicasts  RDS - Responsive Data Services  Back of book indexes  Art and Architecture Thesaurus  …....These are not FULL thesauri Copyright © 2013 Access Innovations, Inc.
  • 183. Controlled Term List - Format  Cars  use Automobiles  Personal Computer  use Microcomputer Copyright © 2013 Access Innovations, Inc.
  • 184. Vocabulary Control - Options  Classification systems  Authority files  Controlled term lists  Uncontrolled term lists  Thesauri Copyright © 2013 Access Innovations, Inc.
  • 185. Uncontrolled List - Define  Add terms as they occur  No cross reference  Simple flat structure Copyright © 2013 Access Innovations, Inc.
  • 186. Uncontrolled List - Example  List of names  Grocery list  Candidate term list Copyright © 2013 Access Innovations, Inc.
  • 187. Uncontrolled List - Format  Laundry  Trim bushes  Cat box needs cleaning  Tommy’s birthday (bake cake)  Iron  Water plants  ….other natural language lists Copyright © 2013 Access Innovations, Inc.
  • 188. Trying to Impose Control...  Do laundry  Trim bushes  Clean cat box  Bake birthday cake  Iron shirts  Water plants Copyright © 2013 Access Innovations, Inc.
  • 189.  Designed to enhance understanding and retention of the vocabulary concepts necessary for creating a taxonomy, ontology, thesaurus, or controlled vocabulary.  Game supplies:  1 Deck of Orange Question and Challenge Cards  1 Deck of Green Answer Cards  Game setup:  Shuffle the deck of Green Answer cards,  Deal the entire deck to the players.  Shuffle the deck of Orange Question and Challenge cards  Place them facedown in a pile in the middle of the table so that all players can reach the pile.  Reinforce what you just heard!  Have fun! Copyright © 2013 Access Innovations, Inc.
  • 190. 1. Play moves to the left of the dealer 2. Draw a card from the top of the Orange cards. Read it aloud to all of the players. 3. The player who read the card says out loud what they think the answer is. 4. Each player looks at the Green Answer cards in their hand. 1. If they have the correct answer to the Question or Challenge, they show their card to everyone at the table. 2. If everyone agrees that the answer is correct, the player holding the correct answer card gives it to the player who read the Question or Challenge card. 5. The player places their associated pair of cards – one Orange Question and Challenge card and one Green Answer card – face up on the table in front of them. 6. Play passes to the person who held the correct Green Answer card in their hand. Play continues as in step 2 above. 7. Discussion among the players to arrive at the correct answer is permissible and encouraged! 8. If players do not arrive at a consensus regarding the correct answer, the Orange Question and Challenge card may be returned to the bottom of the pile, and play passes to the person to the left of the player who drew the previous card. 9. When all of the Orange Question and Challenge cards have been drawn, read aloud, and matched with their Green Answer cards, the game ends. 10. If there are any Orange Question and Challenge cards remaining to which players cannot agree on an answer, players may consult their notes or ask the session speaker. Copyright © 2013 Access Innovations, Inc.
  • 192. Term Forms  Nouns  Prepositional forms  Adjectives  Adverbs  Initial Articles  Singular and plural Copyright © 2013 Access Innovations, Inc.
  • 193. Term Forms - Noun and Noun Phrases  Nouns and noun phrases  print media  carpet Copyright © 2013 Access Innovations, Inc.
  • 194. Term Forms - Prepositional Forms  Prepositional forms are seldom used  okay in International Standard ISO  Philosophy of Education  ANSI / NISO  Educational philosophy Copyright © 2013 Access Innovations, Inc.
  • 195. Term Forms – Adjectives  Adjectives  not used in isolation  may be used for coordination  Miniature paintings  USE PAINTINGS AND MINIATURE  Portable typewriters  USE TYPEWRITERS AND PORTABLE Copyright © 2013 Access Innovations, Inc.
  • 196. Term Forms – Adjectives  Adjectives  may convert to noun forms  MINIATURE SIZE  PORTABLE DEVICES  TRIANGULAR SHAPE Copyright © 2013 Access Innovations, Inc.
  • 197. Term Forms - Adverbs  Adverbs  not used unless part of a compound term  VERY LARGE ARRAY RADIO TELESCOPE  Used for VLA Copyright © 2013 Access Innovations, Inc.
  • 198. Term Forms - Verbs  Verbs  no infinitive or participle forms  for actions that can be expressed as nouns and retain clear meaning, use noun form or gerunds  Examples  Speaking (not Speech)  Walking (not Ambulation)  Communication (not Communicate)  Administration (not Administer) Copyright © 2013 Access Innovations, Inc.
  • 199. Term Forms - Initial Articles  AVOID THEM  Example  Theater not The theater  State (political entity) not The state  Use if part of a proper name  Le Mans  El Salvador Copyright © 2013 Access Innovations, Inc.
  • 200. Term Forms - Singular and Plural  Concrete entities  count nouns are plurals - how many?  planets  children  non count nouns - how much?  nickel  snow  lace Copyright © 2013 Access Innovations, Inc.
  • 201. Term Forms - Singular and Plural  fully formed organism  eyes  mouth  objects are singular  lamp  classes of things  fruits Copyright © 2013 Access Innovations, Inc.
  • 202. Term Forms - Singular and Plural  Abstract concepts  Show in the singular form  authority  socialism  packaging  biochemistry Copyright © 2013 Access Innovations, Inc.
  • 203. Term Forms - Singular and Plural  Unique entities  Show in the singular  Big Ben  Grand Canyon Copyright © 2013 Access Innovations, Inc.
  • 204. Other Formatting  Spelling  Punctuation  Capitalization  Abbreviations  ... Copyright © 2013 Access Innovations, Inc.
  • 205. Spelling  Use what the users will use and cross post for multilingual  fiber - fibre  center - centre  organization - organisation  hemo - haemo  Pediatrics - paediatrics Copyright © 2013 Access Innovations, Inc.
  • 206. Punctuation  Parentheses only for qualifiers  Apostrophes are retained  Hyphens - avoid  avoid  avoid  avoid  avoid Copyright © 2013 Access Innovations, Inc.
  • 207. Capitalization  NISO = initial only  AACR2 format  Practice is to follow a manual of style  Chicago Manual of Style  Associated Press  Association of American Publishers Copyright © 2013 Access Innovations, Inc.
  • 208. Abbreviations  Use only when well known  Always include the full meaning  LASER  Scope Note Light Amplification by Stimulated Emission of Radiation  WHO  World Health Organization Copyright © 2013 Access Innovations, Inc.
  • 209. Other Ways of Adding Value  Cross references  Facets  Notation  Roles  Treatment  Term weighting Copyright © 2013 Access Innovations, Inc.
  • 210. Cross References  See - S  See also - SA  Not related or associated  Not opposite  Just helpful guides Copyright © 2013 Access Innovations, Inc.
  • 211. Synthesis in Classification  S. R. Ranganathan 1933  Colon Classification  analytico-synthetic classification  analyze subject into component parts (facets)  arrange facets into schedules  combine facets to express subject complexity Copyright © 2013 Access Innovations, Inc.
  • 212. Ranganathan  A General Properties  Ab Configuration  Ac Tubular  B Materials  Bc Metals  Bcc ferrous  Bcd steels  Bcf Chromium steels  Bcfi Chromium-nickel steels  K Modes of failure  Kg Creep  Kgb Creep rupture  L Stresses and loads  Lb Tensile Copyright © 2013 Access Innovations, Inc.
  • 213. Ranganathan  Tubular Chromium Nickel steel creep rupture Tensile strength  Ac Bcfi Kgb Lb  Chain indexing  Tubular  Chromium Nickel steel  creep rupture  Tensile strength Copyright © 2013 Access Innovations, Inc.
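The synthesis step can be sketched with the codes from the facet schedule slide (Ac Tubular, Bcfi Chromium-nickel steels, Kgb Creep rupture, Lb Tensile). This is a simplified illustration: the function names are hypothetical, citation order is assumed to be plain alphabetical order of the codes, and `index_entries` is only a loose reading of chain indexing, not Ranganathan's full procedure.

```python
# Schedule entries copied from the facet schedule slide.
SCHEDULE = {
    "Tubular": "Ac",
    "Chromium-nickel steels": "Bcfi",
    "Creep rupture": "Kgb",
    "Tensile": "Lb",
}


def synthesize(concepts):
    """Build a class mark by citing facet codes in schedule order."""
    codes = [SCHEDULE[concept] for concept in concepts]
    return " ".join(sorted(codes))


def index_entries(concepts):
    """One index entry per concept, each pointing at the full class mark."""
    mark = synthesize(concepts)
    return [(concept, mark) for concept in concepts]


subject = ["Creep rupture", "Tubular", "Tensile", "Chromium-nickel steels"]
print(synthesize(subject))  # Ac Bcfi Kgb Lb
```

Whatever order the indexer names the concepts in, the class mark comes out the same, which is the point of a fixed facet order.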
  • 214. Other Ways of Adding Value  Cross references  Facets  Notation  Roles  Treatment  Term weighting Copyright © 2013 Access Innovations, Inc.
  • 215. Facets  Additional ways to add meaning  Divide terms into categories using a single characteristic  Limited number of categories Copyright © 2013 Access Innovations, Inc.
  • 216. Facets and Roles  PRECIS - Austin 1984  order of terms  pre-coordinate indexing system  role of the term is important  tomato  living plant?  marketable product?  Facet role indicator  organism  end product Copyright © 2013 Access Innovations, Inc.
  • 217. Many Faceted Vocabularies  UMLS Semantic Network  Unified Medical Language System - 49  BLISS Classification Association  British Library Information Science System  Dewey Decimal Classification System  Universal Decimal Classification System  Art and Architecture Thesaurus Copyright © 2013 Access Innovations, Inc.
  • 218. MeSH and Tree Pages Copyright © 2013 Access Innovations, Inc.
  • 219. MeSH Alpha Copyright © 2013 Access Innovations, Inc.
  • 220. Order of Facets  Post-coordinate  Means before order  Notation becomes important  Breaks down for large classes  (more than 5,000 terms) Copyright © 2013 Access Innovations, Inc.
  • 221. Other Ways of Adding Value  Cross references  Facets  Notation  Roles  Treatment  Term weighting Copyright © 2013 Access Innovations, Inc.
  • 222. Notation Options  Expressive  Ordinal  Synthetic  Enumeration  Many style options Copyright © 2013 Access Innovations, Inc.
  • 223. Expressive Notation  83 Hazards  831 Fire  831.5 Fire fighting  831.53 Fire fighting equipment  831.532 Fire extinguishers  831.532.5 Carbon dioxide fire extinguishers  832 Explosions Copyright © 2013 Access Innovations, Inc.
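With expressive notation the hierarchy can be read off the codes themselves. The sketch below assumes that dots are only visual separators and that a parent is the longest existing code whose digits prefix the child's digits; the function names are hypothetical.

```python
NOTATION = {
    "83": "Hazards",
    "831": "Fire",
    "831.5": "Fire fighting",
    "831.53": "Fire fighting equipment",
    "831.532": "Fire extinguishers",
    "831.532.5": "Carbon dioxide fire extinguishers",
    "832": "Explosions",
}


def digits(code):
    """Strip the separator dots, which carry no hierarchical meaning here."""
    return code.replace(".", "")


def parent(code):
    """Longest other code whose digits are a prefix of this code's digits."""
    own = digits(code)
    candidates = [c for c in NOTATION if c != code and own.startswith(digits(c))]
    return max(candidates, key=lambda c: len(digits(c)), default=None)


def depth(code):
    """Number of ancestors above the code (top terms have depth 0)."""
    up = parent(code)
    return 0 if up is None else 1 + depth(up)


for code, label in NOTATION.items():
    print("  " * depth(code) + f"{code} {label}")
```

An ordinal notation offers no such shortcut, since the codes only fix the filing order and the hierarchy has to be stored separately.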
  • 224. Ordinal and Semi-ordinal Notation  HK Hazards  HL Fire  HM Fire fighting  HN Fire fighting equipment  HNB Fire extinguishers  HNE Carbon dioxide fire extinguishers  HO Explosions  Indention is the sole indication of hierarchy Copyright © 2013 Access Innovations, Inc.
  • 225. Synthetic and Enumeration Notation  Need to allow the classification system to grow  Synthetic example  P Architecture  PAT Architectural information  PAT.M Architectural information services Copyright © 2013 Access Innovations, Inc.
  • 226. Notation Examples - AAT Facets Copyright © 2013 Access Innovations, Inc.
  • 227. Systematic Display  Paints  (By composition)  Oil paints  Water paints  Cement paints  (By use)  Primers  Undercoats  Top coats Copyright © 2013 Access Innovations, Inc.
  • 228. AAT Pages  Notice faceted  indentions Copyright © 2013 Access Innovations, Inc.
  • 229. AAT Term Copyright © 2013 Access Innovations, Inc.
  • 230. Alphabetical Display  Paints  NT  Cement paints  Oil paints  Primers  Top coats  Undercoats  Water paints Copyright © 2013 Access Innovations, Inc.
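The systematic and alphabetical displays are two views of the same data. A minimal sketch, assuming the facet groupings from the Paints example and hypothetical function names:

```python
# Narrower terms of "Paints", grouped by facet as in the systematic display.
FACETS = {
    "By composition": ["Oil paints", "Water paints", "Cement paints"],
    "By use": ["Primers", "Undercoats", "Top coats"],
}


def systematic(top, facets):
    """Keep the facet groupings and the order given within each group."""
    lines = [top]
    for label, terms in facets.items():
        lines.append(f"  ({label})")
        lines.extend(f"    {term}" for term in terms)
    return lines


def alphabetical(top, facets):
    """Flatten all groups into one alphabetized NT list."""
    terms = sorted(term for group in facets.values() for term in group)
    return [top] + [f"  NT {term}" for term in terms]


print("\n".join(systematic("Paints", FACETS)))
print("\n".join(alphabetical("Paints", FACETS)))
```

The alphabetical view loses the "by composition" and "by use" groupings, which is why both displays are usually offered.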
  • 231. Other Ways of Adding Value  Cross references  Facets  Notation  Roles  Treatment  Term weighting Copyright © 2013 Access Innovations, Inc.
  • 232. Roles  ERIC Thesaurus - role indicators  Adjectives - bibliographic terms  Input or raw material  Output or product  Undesirables  Indicated uses  Materials “In which”  Affects  Primary topics of discussion  Passive recipients, possessors, location  Means used Copyright © 2013 Access Innovations, Inc.
  • 233. Roles  CAS - Super roles  Analytical study  Biological study  Formation, nonpreparative  Occurrence  Preparation  Process  Uses  CAS Specific roles  Miscellaneous  Properties  Reactant Copyright © 2013 Access Innovations, Inc.
  • 234. Subheadings as Roles  MeSH  Therapeutic use  Drug treatment (disease)  Adverse effect (drug treatment)  Diagnosis Copyright © 2013 Access Innovations, Inc.
  • 235. Other Ways of Adding Value  Cross references  Facets  Notation  Roles  Treatment  Term weighting Copyright © 2013 Access Innovations, Inc.
  • 236. Treatment and Aspect Codes  Apply codes or types at article level  Theoretical  New development  Experimental  Practical Copyright © 2013 Access Innovations, Inc.
  • 237. Other Ways of Adding Value  Cross references  Facets  Notation  Roles  Treatment  Term weighting Copyright © 2013 Access Innovations, Inc.
  • 238. Cranfield Project - Cleverdon 1966  Concepts in the main theme 9/10  Major subsidiary theme 7/8  Minor subsidiary theme 5/6 Copyright © 2013 Access Innovations, Inc.
  • 239. Internet Engines  Complex weighting of terms  Use term frequency  Rank output wholly automatic  Output based on input term weights  Can also use “well formed” data -  like a thesaurus hierarchy  field formatted data  XML files Copyright © 2013 Access Innovations, Inc.
  • 240. Automatic and Semi-automatic Classification?  Data Harmony® M.A.I.™  Semio  Autonomy - Muscat  Net Owl - Names  n-Stein  Quiver  Smart Logic Copyright © 2013 Access Innovations, Inc.
  • 241. Machine Aided Indexing Goals  Improve  Indexing efficiency  Indexing consistency  Reduce editorial drift  Depth of Indexing  Reduce  Over and under indexing  Term over use and under use Copyright © 2013 Access Innovations, Inc.
  • 242. Machine Aided Indexing Goals  Improve productivity  Indexer  Information worker  Disambiguate terms  Increase clarity Copyright © 2013 Access Innovations, Inc.
  • 243. Machine Aided Indexing - Intellectual Components  Word List or Thesaurus  Knowledge base  Rules based  Natural Language (Semantic)  Editorial evaluation Copyright © 2013 Access Innovations, Inc.
  • 244. Example: M.A.I.™ Software Components  Rule Builder  Concept Extractor  Statistics Collector Copyright © 2013 Access Innovations, Inc.
  • 245. Copyright © 2013 Access Innovations, Inc.
  • 246. Taxonomies in Search Copyright © 2013 Access Innovations, Inc.
  • 247. Do the Data FIRST  What do you have?  What does it need?  How would you LIKE to access it?  Look at the data BEFORE you create the specifications  DTD built without data is not going to work  Then choose the system that will support your data Copyright © 2013 Access Innovations, Inc.
  • 248. My Main Frustration 1. Select hardware 2. Select software 3. Design system 4. Try to load the data 5. Add the taxonomy, if at all  That’s BACKWARDS Copyright © 2013 Access Innovations, Inc.
  • 249. Why Does Search Fail?  Most large organizations have 5 to 7 different search systems  All disappointing and sitting on the shelf  Inconsistent results  Unclear path to results  Lack of single unified clear consistent vocabulary  Not tied to data governance  Taxonomy  Other metadata Copyright © 2013 Access Innovations, Inc.
  • 250. SEARCH  How search works  Measuring accuracy in search  Precision  Recall  Relevance  Search theoretical basis  Bayes, Boole, and the rest of the guys  The taxonomy effect Copyright © 2013 Access Innovations, Inc.
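Precision and recall have simple definitions: precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. The sketch below uses made-up document ids, and the function name is hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


p, r = precision_recall(retrieved={1, 2, 3, 4}, relevant={2, 4, 6, 8, 10})
print(p, r)  # 0.5 0.4
```

Here two of the four retrieved documents are relevant (precision 0.5), and two of the five relevant documents were found (recall 0.4).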
  • 251. Parts of Search  Search software  Inverted Index  Search algorithms  Presentation layer  Search box  Autocompletion  Related and narrower terms  Hierarchical display Copyright © 2013 Access Innovations, Inc.
  • 252. Hierarchical Display Inverted File Index Searchable Index Taxonomy Thesaurus Inverted Files and Boolean are Basic to ALL Search Copyright © 2013 Access Innovations, Inc. Note: not available in all systems!
  • 253. Creating an Inverted File Index  Sample DOCUMENT: “Outline of Presentation” 1 Define key terminology 2 Thesaurus tools  Features  Functions 3 Costs  Thesaurus construction  Thesaurus tools 4 Why & when? Copyright © 2013 Access Innovations, Inc.
  • 254. Simple Inverted File Index of the Terms from the “Outline” & 1 2 3 4 construction costs define features functions key of outline presentation terminology thesaurus tools when why Copyright © 2013 Access Innovations, Inc.
  • 255. Complex Inverted File Index - Placement, Location added: & - Stop; 1 - Stop; 2 - Stop; 3 - Stop; 4 - Stop; construction - L7, P2, SH; costs - L6, P1, H; define - L2, P1, H; features - L4, P1, SH; functions - L5, P1, SH; key - L2, P2, H; of - Stop; outline - L1, P1, T; presentation - L1, P3, T; terminology - L2, P3, H; thesaurus - (1) L3, P1, H, (2) L7, P1, SH, (3) L8, P1, SH; tools - (1) L3, P2, H, (2) L8, P2, SH; when - L9, P3, H; why - L9, P1, H Copyright © 2013 Access Innovations, Inc.
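As a rough illustration of slides 253 to 255, the sketch below builds a positional inverted index from the sample outline. It assumes the codes on the slide mean L = line number, P = word position within the line, and T / H / SH = title, heading, subheading. It also assumes the leading outline numbers are dropped before positions are counted, which is what the slide's postings imply. `DOCUMENT` and `build_inverted_index` are hypothetical names, not part of any product.

```python
import re
from collections import defaultdict

# Words marked "Stop" on the slide: they occupy a position but get no posting.
STOP_WORDS = {"&", "of"}

# The sample document, one tuple per line: (assumed zone label, text).
DOCUMENT = [
    ("T", "Outline of Presentation"),
    ("H", "1 Define key terminology"),
    ("H", "2 Thesaurus tools"),
    ("SH", "Features"),
    ("SH", "Functions"),
    ("H", "3 Costs"),
    ("SH", "Thesaurus construction"),
    ("SH", "Thesaurus tools"),
    ("H", "4 Why & when?"),
]


def build_inverted_index(lines):
    """Map each term to a list of (line, position, zone) postings."""
    index = defaultdict(list)
    for line_no, (zone, text) in enumerate(lines, start=1):
        # Keep words and "&"; digits (the outline numbers) are ignored.
        tokens = re.findall(r"[a-z]+|&", text.lower())
        for position, token in enumerate(tokens, start=1):
            if token in STOP_WORDS:
                continue
            index[token].append((line_no, position, zone))
    return dict(sorted(index.items()))


index = build_inverted_index(DOCUMENT)
for term, postings in index.items():
    print(term, postings)
```

A search engine answers a query by looking terms up in this structure rather than scanning the documents themselves.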
  • 256. Search Presentation Layer Automatic completion And type ahead from Thesaurus Copyright © 2013 Access Innovations, Inc.
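Type-ahead from a thesaurus can be sketched as a prefix lookup over a sorted term list. This is a minimal illustration only: the term list and the `complete` function are invented for the example, and a production system would also handle synonyms and ranking.

```python
from bisect import bisect_left

# Hypothetical thesaurus terms, lowercased and kept sorted for binary search.
TERMS = sorted([
    "indexing",
    "inverted files",
    "taxonomies",
    "taxonomy construction",
    "thesauri",
    "thesaurus tools",
])


def complete(prefix, terms=TERMS, limit=5):
    """Return up to `limit` terms that start with `prefix`."""
    prefix = prefix.lower()
    start = bisect_left(terms, prefix)  # first term >= prefix
    matches = []
    for term in terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match
        matches.append(term)
        if len(matches) == limit:
            break
    return matches


print(complete("tax"))
print(complete("in"))
```

Because suggestions come only from the controlled vocabulary, the user is steered toward terms that were actually used to index the content.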
  • 257. Search Presentation Layer Related Narrower Copyright © 2013 Access Innovations, Inc.
  • 258. Search Presentation Layer The hierarchical view of the thesaurus is also a browsable view of the content. The numbers show the number of hits 1. For the term 2. For the branch Copyright © 2013 Access Innovations, Inc.
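The two numbers in the hierarchical view (hits for the term, hits for the branch) can be computed with a simple recursive roll-up. The taxonomy and hit counts below are invented for illustration, and the sketch assumes each hit is counted once per term, so a document tagged with two terms in the same branch would be counted twice.

```python
# Hypothetical taxonomy: each term maps to its narrower terms.
TAXONOMY = {
    "Science": ["Physics", "Chemistry"],
    "Physics": ["Optics"],
    "Chemistry": [],
    "Optics": [],
}

# Hypothetical number of documents indexed directly with each term.
HITS = {"Science": 4, "Physics": 10, "Chemistry": 7, "Optics": 3}


def branch_hits(term):
    """Hits for the term plus all of its narrower terms."""
    return HITS.get(term, 0) + sum(branch_hits(c) for c in TAXONOMY.get(term, []))


def render(term, depth=0):
    """Return indented display lines: term (term hits / branch hits)."""
    lines = [f"{'  ' * depth}{term} ({HITS.get(term, 0)} / {branch_hits(term)})"]
    for child in TAXONOMY.get(term, []):
        lines.extend(render(child, depth + 1))
    return lines


print("\n".join(render("Science")))
```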
  • 259. How Does Search Work? Many parts  Search software – of course  Computer network  Parsing of text – the “inverted file”  Well formed or structured text  CLEAN DATA  Computer software – network  Computer hardware  Telecommunications connection  Training sets for statistical systems Copyright © 2013 Access Innovations, Inc.
  • 260. Technical Parts of Search  Search technology  Ranking algorithms  Query language  Federators  Cache  Inverted index – as discussed above  Other enhancements  Presentation Layer Copyright © 2013 Access Innovations, Inc.
  • 261. Access Innovations – Complex Farm With Perfect Search Source Data Query Search Harmony Presentation Layer Repository XIS (cache) Cleanup, etc. Federators Query Servers Index Builders Deploy Hub Cache Builders Copyright © 2013 Access Innovations, Inc.
  • 262. QUERY API CUSTOM CONNECTOR EMAIL CONNECTOR Core Architectural Components Pipeline SEARCH SERVER QUERY PROCESSOR Query Results Vertical Applications Portals Custom Front-Ends Mobile Devices Content Push DOCUMENT PROCESSOR Web Content Files, Documents Databases Custom Applications CONTENT API MANAGEMENT API Index DB DATABASE CONNECTOR FILE TRAVERSER WEB CRAWLER Pipeline Email, Groupware Administrator’s Dashboard FILTER SERVER Agent DB Alerts Data Harmony Governance API MAIstro Searchharmony FAST Search Example Copyright © 2013 Access Innovations, Inc.
  • 263. Measuring Accuracy in Search  Relevance  Recall  Precision  Accuracy – hits, misses, noise  Ranking  Linguistics  Query Processing  Results Processing  Display  Search refinement  Usability  Business Rules Copyright © 2013 Access Innovations, Inc.
  • 264. Relevance  How well a set of returned documents answers the information need  “Accuracy”  Related to objective of search  Different user communities  Information resources  Tension of user needs and context available  A confidence “guesstimate” Copyright © 2013 Access Innovations, Inc.
  • 265. The Formulas Recall = (Number of relevant items retrieved) / (Number of relevant items in the collection) Precision = (Number of relevant items retrieved) / (Number of items retrieved) Relevance = Germane (Precision), Pertinent (Recall) Copyright © 2013 Access Innovations, Inc.
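The two ratios on this slide are easy to check by hand on a small result set. The sketch below assumes someone has already judged which documents are relevant (the slides stress that this judgment is subjective). The document IDs and the function name `precision_recall` are made up for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: IDs the search returned
    relevant:  IDs a human judged relevant in the whole collection
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant items that were retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 documents returned, 5 relevant in the collection.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d4", "d7", "d8", "d9"}
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this example two of the four retrieved documents are relevant, and two of the five relevant documents were found. The unretrieved relevant items are the "misses" and the retrieved irrelevant items are the "noise".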
  • 266. Measuring Relevance  Concepts  Context  Age of documents  Completeness (recall)  Quality  Statistically determined ?  Nope, it is subjective  Someone has to determine the rightness of the item  A confidence factor = canard! Copyright © 2013 Access Innovations, Inc.
  • 267. Kinds of Search  Bayesian –  FAST  Lucene  Autonomy / Verity  Boolean  Dialog  Endeca  Perfect Search  Ranking algorithms  Google Copyright © 2013 Access Innovations, Inc.
  • 268. George Boole and Boolean Algebra  George Boole  Mathematician  1815-1864  Boolean algebra  An algebraic system of logic  AND, OR, NOT, ANDNOT  Dialog, BRS, Stairs Copyright © 2013 Access Innovations, Inc.
  • 269. Boolean Representation  Venn diagram showing the intersection of sets A AND B (in violet),  The union of sets A OR B (all the colored regions),  And set A XOR B (all the colored regions except the violet).  The "universe" is represented by the rectangular frame. Copyright © 2013 Access Innovations, Inc.
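The Boolean operators map directly onto set operations over the posting lists of an inverted index. The sketch below uses invented document IDs. `postings` and `all_docs` are hypothetical names, with `all_docs` playing the role of the "universe" in the Venn diagram.

```python
# Hypothetical postings: term -> set of IDs of documents containing it.
postings = {
    "thesaurus": {1, 2, 5},
    "taxonomy": {2, 3, 5, 8},
}
all_docs = set(range(1, 10))  # the "universe": documents 1 through 9

a = postings["thesaurus"]
b = postings["taxonomy"]

a_and_b = a & b          # intersection: both terms present
a_or_b = a | b           # union: either term present
a_xor_b = a ^ b          # exactly one of the two terms present
a_andnot_b = a - b       # first term present, second absent
not_a = all_docs - a     # complement within the universe

print(sorted(a_and_b), sorted(a_or_b), sorted(a_xor_b))
```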
  • 270. Bayes and Bayes’ Theorem  Thomas Bayes  Mathematician  1702 - 1761  Bayesian theorem  Uses probability inductively  Established a mathematical basis for probability inference  WHAT?  A means of calculating,  from the number of times an event has and has not occurred in past trials,  the probability that it will occur in future trials Copyright © 2013 Access Innovations, Inc.
  • 271. Bayesian Methods – Cautions  A user might wish to change the distribution of probabilities.  A user will make a novel request for information in a previously unanticipated way.  The computational difficulty of exploring a previously unknown network.  The quality and extent of the prior beliefs used in Bayesian inference processing. Copyright © 2013 Access Innovations, Inc.
  • 272. Bayesian Methods - Cautions (continued)  A Bayesian network is only as useful as the prior knowledge is reliable.  An optimistic or pessimistic expectation of the quality of these prior beliefs will distort the entire network and invalidate the results.  Must ensure the selection of the statistical distribution induced in modeling the data.  Must have the proper distribution model to describe the data.  That is… you have to constantly train and retrain the data Copyright © 2013 Access Innovations, Inc.
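The caution about prior beliefs can be made concrete with Bayes' theorem itself, P(H|E) = P(E|H)·P(H) / [P(E|H)·P(H) + P(E|not H)·P(not H)]. In the sketch below the hypothesis H is "this document is relevant" and the evidence E is "the query term appears in it". All probabilities are invented for illustration, and `posterior` is a hypothetical helper.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' theorem for a single hypothesis H and evidence E."""
    numerator = p_evidence_given_h * prior
    evidence = numerator + p_evidence_given_not_h * (1.0 - prior)
    return numerator / evidence


# Same evidence, three different prior beliefs about relevance.
results = {}
for prior in (0.01, 0.10, 0.50):
    results[prior] = posterior(prior, p_evidence_given_h=0.8,
                               p_evidence_given_not_h=0.1)
    print(f"prior={prior:.2f} -> posterior={results[prior]:.3f}")
```

With identical evidence, the posterior moves from under 10% to nearly 90% as the prior changes, which is why an optimistic or pessimistic prior distorts the results.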
  • 273. Basic Areas of Natural Language Processing (NLP)  Syntactic  Semantic  Morphological  Phraseological  Lemmatization (stemming)  Statistical  Grammatical  Common Sense Copyright © 2013 Access Innovations, Inc.
  • 274. Basic Areas of Automatic Language Processing (ALP)  Auto Translation  Auto Indexing  Auto Abstracting  Artificial Intelligence  Searching  Spell Checking  Semantic Web  Natural Language Processing (NLP)  Computational Linguistics Copyright © 2013 Access Innovations, Inc.
  • 275. Statistical Search  Cluster analysis  Neural networks  Co-occurrence  Bayesian inference  Latent Semantic  Etc. Copyright © 2013 Access Innovations, Inc.
  • 276. Word and Term Parsing  Stemming  -ing, -ed, -es, -’s, -s’, etc.  Depluralization  Truncation  Left and right  Wild cards  Organi*ation  Variant Spellings  Centre, Center  Hyphens Copyright © 2013 Access Innovations, Inc.
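The parsing techniques listed here can be sketched in a few lines. This is deliberately naive: the suffix list comes from the slide, the variant-spelling table and vocabulary are invented, and dropping hyphens is only one possible policy. Real systems use proper stemmers or lemmatizers.

```python
from fnmatch import fnmatchcase

# Hypothetical variant-spelling table: variant -> preferred form.
VARIANTS = {"centre": "center", "organisation": "organization"}

# Suffixes from the slide, longest first so "es" is tried before "s".
SUFFIXES = ("ing", "ed", "es", "'s", "s'", "s")


def normalize(word):
    """Lowercase, drop hyphens, and map variant spellings."""
    word = word.lower().replace("-", "")
    return VARIANTS.get(word, word)


def naive_stem(word):
    """Strip one suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def wildcard_match(pattern, vocabulary):
    """Match a pattern with * wildcards (truncation) against terms."""
    return [w for w in vocabulary if fnmatchcase(w, pattern)]


vocab = ["organization", "organisation", "organic", "index", "indexing"]
print(wildcard_match("organi*ation", vocab))  # internal wild card
print(wildcard_match("index*", vocab))        # right truncation
print(naive_stem("indexing"), naive_stem("tools"), normalize("Centre"))
```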
  • 277. The Taxonomy Effect  Where do the terms go?  How are they used in search  What other ways can I use the taxonomy in search? Copyright © 2013 Access Innovations, Inc.
  • 278. For search all publications Search database for Journals and pubs Bookstore search Search of 53 crawled sites including journals, books, web site, conference sites, etc. Site search Navigation Copyright © 2013 Access Innovations, Inc.
  • 279. Taxonomy Driven Search Presentation Navigate the full taxonomy “tree” BROWSE Auto-completion using the taxonomy Guide the user Copyright © 2013 Access Innovations, Inc.
  • 280. Subject Browsing Copyright © 2013 Access Innovations, Inc.
  • 281. Targeted Resources Based on Subject or User Role CONFIDENTIAL Copyright © 2013 Access Innovations, Inc.
  • 282. Member Profile Tagging User pastes or uploads CV Button to auto-extract taxonomy attributes Copyright © 2013 Access Innovations, Inc.
  • 283. Adding Terms to SharePoint TaxoTerm Server DataHarmony (M.A.I.) Returns subject metadata Microsoft SharePoint Server 2010 User uploads a document to SharePoint space Before uploading to SharePoint server, the EventHandler sends the document to Data Harmony. Data Harmony automatically attaches indexing terms before uploading to MOSS Copyright © 2013 Access Innovations, Inc.
  • 284. SharePoint 2010 Only Shows 10 Lines of the Taxonomy This add-on makes it all viewable Copyright © 2013 Access Innovations, Inc.
  • 285. QUERY API CUSTOM CONNECTOR EMAIL CONNECTOR Core Architectural Components Pipeline SEARCH SERVER QUERY PROCESSOR Query Results Vertical Applications Portals Custom Front-Ends Mobile Devices Content Push DOCUMENT PROCESSOR Web Content Files, Documents Databases Custom Applications CONTENT API FAST MANAGEMENT API Index DB DATABASE CONNECTOR FILE TRAVERSER WEB CRAWLER Pipeline Email, Groupware Administrator’s Dashboard FILTER SERVER Agent DB Alerts Use taxonomy terms here Data Harmony Governance API MAIstro Searchharmony Taxonomies Added in Search Example Copyright © 2013 Access Innovations, Inc.
  • 286. Auto suggestion of Taxonomy Terms Populate Keywords, Descriptors, Indexing terms, etc. Allow for manual review of auto-tagging for quality assurance. Copyright © 2013 Access Innovations, Inc.
  • 287. Where do I use a taxonomy? Copyright © 2013 Access Innovations, Inc.
  • 288. Thesaurus Master Machine Aided Indexer (M.A.I.™) Database Repository Search Presentation Layer Increases accuracy Browse by Subject Auto-completion Broader Terms Narrower Terms Related Terms Client Taxonomy Inline Tagging Metadata and Entity Extractor Automatic Summarization Search Software Client Data Full Text HTML, PDF, Data Feeds, etc. Client taxonomy The Workflow Tag and Create metadata Put in database with tags Build Search inverted index Create user interface Gather source data Copyright © 2013 Access Innovations, Inc.
  • 289. Thesaurus Master Machine Aided Indexer (M.A.I.™) Repository Search Presentation: 90% accuracy Browse by Subject Auto-completion Broader Terms Narrower Terms Related Terms Client Taxonomy Inline Tagging Metadata and Entity Extractor Automatic Summarization Search Software Client Data Full Text HTML, PDF, Data Feeds, etc. Client taxonomy Taxonomy In SharePoint Copyright © 2013 Access Innovations, Inc. [Data Harmony fully integrated with MOSS.]
  • 290. Adding Terms to Information Objects  Part of the record  XML  MARC  A relational table pointing the terms to a record ID number (Secondary key)  Adding data to the HTML  META NAME KEYWORD Element  Many other options Copyright © 2013 Access Innovations, Inc.
  • 291. Part of the Record - XML  Added as an element in the XML record  Need an element to put the data in  <TaxonomyTerm>  Capture the terms when creating the records Copyright © 2013 Access Innovations, Inc.
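Adding terms as an element of the XML record might look like the sketch below, which uses Python's standard `xml.etree.ElementTree`. XML element names cannot contain spaces, so the sketch uses `TaxonomyTerm` inside a `TaxonomyTerms` wrapper. Those names, the `Record` structure, and `add_taxonomy_terms` are all assumptions made for illustration and do not reflect any particular schema or DTD.

```python
import xml.etree.ElementTree as ET


def add_taxonomy_terms(record_xml, terms):
    """Return the record with a <TaxonomyTerms> block appended."""
    record = ET.fromstring(record_xml)
    container = ET.SubElement(record, "TaxonomyTerms")
    for term in terms:
        # One element per term keeps each term separately addressable.
        ET.SubElement(container, "TaxonomyTerm").text = term
    return ET.tostring(record, encoding="unicode")


# Hypothetical record and indexing terms.
record = "<Record><Title>Creating an Inverted File Index</Title></Record>"
tagged = add_taxonomy_terms(record, ["Inverted files", "Indexing"])
print(tagged)
```

If the record format is governed by a DTD or schema, the new element has to be declared there first, which is one reason to look at the data before writing the specification.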
  • 292. The author pastes the data to the document template, attaching images, graphs, as necessary: Author Submission Module Copyright © 2013 Access Innovations, Inc.
  • 293. Editorial Workflow Integration Author Submission Module The author fills in the data to the document template, attaching images and graphs as necessary. An API calls Data Harmony and generates a list of indexing terms based on the content. Copyright © 2013 Access Innovations, Inc.
  • 294. Authors review the indexing and may change it. Content is stored into a data repository as HTML, XML, etc. Editorial Workflow Integration Author Submission Module Copyright © 2013 Access Innovations, Inc.
  • 295. In the HTML Record  Makes it crawlable for the internet  Used in CMS applications  Content Management Systems  Add to the HTML  Manually  In Dreamweaver  In your CMS like Ektron  Author Submissions Example  Do the same with SharePoint Copyright © 2013 Access Innovations, Inc.
  • 296. META NAME “KEYWORDS” Copyright © 2013 Access Innovations, Inc.
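A keywords META element built from taxonomy terms could be generated as in this sketch. The term list and the helper name `keywords_meta` are invented. `html.escape` from the standard library is used so that characters such as `&` or quotes inside a term do not break the attribute.

```python
from html import escape


def keywords_meta(terms):
    """Build a <meta name="keywords"> tag from a list of taxonomy terms."""
    content = ", ".join(terms)
    return f'<meta name="keywords" content="{escape(content, quote=True)}">'


# Hypothetical terms assigned to a page.
tag = keywords_meta(["Taxonomies", "Indexing & abstracting"])
print(tag)
```

The resulting tag goes inside the page's `<head>` element, where crawlers and site search tools that honor it can pick it up.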
  • 297. In Relational Database Table  Primary key – the record  Secondary key – all the metadata  Like taxonomy terms  Like author  Like publication date  Used in Oracle, SQL, etc.  Need a field to put the taxonomy data in  Supports “Faceted Search”  each item in a separate field or element or table Copyright © 2013 Access Innovations, Inc.
  • 298. RDBMS Connection Taxonomy term table Copyright © 2013 Access Innovations, Inc.
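A taxonomy term table that points terms at record IDs can be sketched with Python's built-in `sqlite3`. The table and column names (`record`, `record_term`) and the sample rows are assumptions for illustration, and an actual Oracle or SQL Server schema would differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE record (
    record_id INTEGER PRIMARY KEY,
    title     TEXT NOT NULL
);
-- One row per (record, term) pair: the link between content and taxonomy.
CREATE TABLE record_term (
    record_id INTEGER NOT NULL REFERENCES record(record_id),
    term      TEXT NOT NULL,
    PRIMARY KEY (record_id, term)
);
""")
conn.executemany("INSERT INTO record VALUES (?, ?)", [
    (1, "Inverted files explained"),
    (2, "Building a thesaurus"),
])
conn.executemany("INSERT INTO record_term VALUES (?, ?)", [
    (1, "Indexing"), (1, "Search"), (2, "Thesauri"), (2, "Indexing"),
])


def records_for_term(term):
    """Titles of all records tagged with the given taxonomy term."""
    rows = conn.execute(
        "SELECT r.title FROM record r "
        "JOIN record_term t ON t.record_id = r.record_id "
        "WHERE t.term = ? ORDER BY r.record_id",
        (term,),
    )
    return [title for (title,) in rows]


def facet_counts():
    """Number of records per term, as used for faceted navigation."""
    return conn.execute(
        "SELECT term, COUNT(*) FROM record_term GROUP BY term ORDER BY term"
    ).fetchall()


print(records_for_term("Indexing"))
print(facet_counts())
```

Keeping terms in their own table, rather than packed into one text field, is what makes per-term counts and faceted filtering a simple query.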
  • 299. Using Taxonomies in Applications • Improve search • Subject browsing • Mobile intelligence • Targeted resources based on subject or user role • Link to society resources • Author submission module • Author authority database • Expert reviewer identification • Member profiles • Data visualization • More like this • In “indexing” or categorizing, as subject metadata • In content management systems • In SharePoint • In mashups • In social networking sites • In author tagging • In filtering data – e.g., spam filters and RSS feeds • In web crawlers • Social media - community Copyright © 2013 Access Innovations, Inc.
  • 300. A Quick Look Behind the Scenes Database Management System Thesaurus tool Indexing tool • Validate terms • Add terms and rules • Change terms and rules • Delete terms and rules • Search thesaurus • Validate term entry • Block invalid terms • Record candidates • Establish rules for term use • Suggest indexing terms Copyright © 2013 Access Innovations, Inc.
  • 302. Where Does the Subject Metadata Go?  Apply to content itself  Use meta name field in HTML header  Connect search to the keywords in the SQL or other database tables Copyright © 2013 Access Innovations, Inc.
  • 303. HTML Header Copyright © 2013 Access Innovations, Inc.
  • 304. Suggested taxonomy descriptors Copyright © 2013 Access Innovations, Inc.
  • 306. Integrate Taxonomy to Enhance Find-ability  Browsable categories of a directory  Browsable faceted navigation  Smart search for term equivalents  Taxonomy terms (original or modified) as labels  Navigation aids incorporate taxonomy terms and relationships Copyright © 2013 Access Innovations, Inc.
  • 307. More Taxonomy Enrichment  Spelling alternatives and correction  Related concepts  Statistical information about the metadata  Navigation or drill downs  Search refinement  Recursive sets  Concept linking  Dictionary lookup (in taxonomy glossary) Copyright © 2013 Access Innovations, Inc.
  • 308. Brand is repeated in several spots and tied to search as well Copyright © 2013 Access Innovations, Inc.
  • 309. Raw Full text data feeds XIS™ Creation Taxonomy Thesaurus Master® Printed source materials Taxonomy terms M.A.I.™ Concept Extractor M.A.I.™ Rule Base Load to Perfect Search Search Harmony™ Display Search Database Plus Search Workflow Data Crawls on 53+ sources Add metadata XIS™ repository SQL for ecommerce Save data to search and repositories at the same time Copyright © 2013 Access Innovations, Inc.
  • 310. Raw Full text data feeds XIS Creation Taxonomy Thesaurus Master Printed source materials Taxonomy terms MAI Rule Base Load to Search Search Harmony Display Search Data Base Plus Search Workflow Data Crawls on data sources Add metadata XIS repository SQL for ecommerce MAI Concept Extractor Source data Clean and enhance data Search data Copyright © 2013 Access Innovations, Inc.
  • 311. Use Case: Inline Tagging Show the exact point where the concept is mentioned Mouse-over to view the term record Statistical summary, showing the number of times each term is mentioned in the article Copyright © 2013 Access Innovations, Inc.
  • 312. Inline Tagging HTML View Copyright © 2013 Access Innovations, Inc.
  • 313. XML View for Inline Tagging Copyright © 2013 Access Innovations, Inc.
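Inline tagging as described in this use case (mark the exact point where a concept is mentioned, and report how often each term occurs) can be approximated with a regular expression. This is only a sketch: the term list, the `span` markup, and `tag_inline` are invented, matching is by exact phrase rather than by indexing rules, and the input text is assumed to be plain text that needs no HTML escaping.

```python
import re
from collections import Counter

# Hypothetical taxonomy terms to look for.
TERMS = ["inverted index", "taxonomy", "thesaurus"]


def tag_inline(text, terms=TERMS):
    """Wrap each term mention in a span and count mentions per term."""
    counts = Counter()
    # Longest terms first so multi-word terms win over shorter overlaps.
    alternatives = "|".join(
        re.escape(t) for t in sorted(terms, key=len, reverse=True)
    )
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)

    def wrap(match):
        term = match.group(1).lower()
        counts[term] += 1
        # The title attribute gives a simple mouse-over label.
        return f'<span class="term" title="{term}">{match.group(1)}</span>'

    return pattern.sub(wrap, text), counts


text = "A taxonomy feeds the inverted index. The Taxonomy also drives browsing."
tagged, counts = tag_inline(text)
print(tagged)
print(dict(counts))
```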
  • 315.  The New Board Game  Applications  Implementation  The taxonomy Copyright © 2013 Access Innovations, Inc.
  • 316. The Changing Faces of Web Taxonomies  ….and how the information is delivered  From current site  To new version  Depends on TAXONOMY  Personalization  Feeding ads  Consistent information Copyright © 2013 Access Innovations, Inc.
  • 319. HTML Headers META NAME KEYWORD Use the taxonomy here Copyright © 2013 Access Innovations, Inc.
  • 320. Copyright © 2005 - Access Innovations, Inc. Copyright © 2013 Access Innovations, Inc.
  • 321. More Innovations!  Link topic to article to author to event  Make visual links within domain  Enable authors to submit and categorize conference submissions  Create author authority database linking to co-authors, topics, locations, etc.  Create expert reviewer database  Create member profiles with alternate names, publications, tagged by topic  Visualize data and domain distribution  Display interest connections in social network  Deliver accurate targeted information through mobile applications  Etc. Copyright © 2013 Access Innovations, Inc.
  • 322. Change to Ready, Aim, Fire!  Follow the data  Look at the data, format and content  Design taxonomy for data  Leverage the standards  Use taxonomy to tag data  Choose search and repository software for data  Load the data into the system  Keep your eye on the target Copyright © 2013 Access Innovations, Inc.
  • 323. Standards for Monolingual Thesauri  TEST - Thesaurus of Engineering and Scientific Terms - COSATI 1967  AFNOR NF Z 47-100 1981 French  DIN 1463 German 1987-1993  NISO Z39.19 - 1993 - American Copyright © 2013 Access Innovations, Inc.
  • 324. Where Can I Get Taxonomy Standards?  www.niso.org  Z39.19 (2010) Controlled Vocabularies  www.ISO.ce  ISO 25964 parts 1 and 2 (2012 and 2013)  www.bsi.uk.co  www.w3c.org SKOS and OWL  www.accessinn.com/library Copyright © 2013 Access Innovations, Inc.
  • 325. Suggested Reading  F.W. Lancaster  Vocabulary Control for Information Retrieval, 1986  Aitchison, Gilchrist and Bawden  Thesaurus Construction and Use: A Practical Manual, 4th edition  Heather Hedden  The Accidental Taxonomist  TaxoDiary.com blog site Copyright © 2013 Access Innovations, Inc.
  • 326. Suggested Reading  Introduction to any thesaurus  INSPEC  NICEM  Psychological Abstracts  etc. Copyright © 2013 Access Innovations, Inc.
  • 327. It Just Takes a Little Imagination Thank you Marjorie M.K. Hlava, President Bob Kasenchak, Project Coordinator Access Innovations 505-998-0800 mhlava@accessinn.com Bob_kasenchak@accessinn.com Copyright © 2013 Access Innovations, Inc.